Attachment 'HW_SW_production_example.txt'
Here is our running setup for the current snapshot harvest at KB/SB in Denmark:

We expect to download about 22 TB in 12 weeks with a mix of old and new bitarchive and harvester servers at KB and SB
(preparation of the harvest takes about 3 weeks).
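As a rough sanity check (our own arithmetic, not a figure from the setup notes themselves), the planned 22 TB over 12 weeks implies a fairly modest sustained download rate:

```python
# Back-of-the-envelope estimate of the sustained throughput implied by
# downloading ~22 TB in ~12 weeks. Assumes the full 12 weeks are spent
# downloading; if preparation overlaps, the real rate must be higher.

TB = 10**12                        # decimal terabyte, in bytes
total_bytes = 22 * TB              # planned snapshot volume
seconds = 12 * 7 * 24 * 3600       # 12 weeks in seconds

avg_mb_per_sec = total_bytes / seconds / 10**6
avg_mbit_per_sec = avg_mb_per_sec * 8

print(f"average rate: {avg_mb_per_sec:.1f} MB/s ({avg_mbit_per_sec:.1f} Mbit/s)")
```

About 3 MB/s on average, i.e. roughly 24 Mbit/s, which is consistent with the 30-90 Mbit/s firewall requirement mentioned further down.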
Today, 2 June 2009, we have
16 running harvester instances at KB (6 are actually downloading):
6 harvesters on each of 2 DL380 G4 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)
2 harvesters on each of 2 DL380 G5 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)

30 running harvester instances at SB (11 are actually downloading):
6 harvesters on each of 2 Dell 2850 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)
6 harvesters on 1 DL380 G3 machine (1 snapshot harvest instance; the rest are ready for or running selective harvest jobs)
6 harvesters on each of 2 DL380 G5 machines (3 snapshot harvest instances on each machine; the rest are ready for selective harvest jobs)

In total we have 13 harvest instances running snapshot jobs (4 at KB and 9 at SB, plus one extra DL380 G5 harvest server in reserve at KB).
The rest are ready for or running selective/event harvest jobs (12 at KB and 21 at SB).

Total archive storage: ca. 183 TB (90 TB currently used)

The new bitarchive servers, for example:
1 DL360 server with 6 bitapps running 24 hours stores on average 240 GB per day (measured over 4 days: 90 GB - 360 GB).
Our server stress test shows that the new bitarchive servers can store 16.3 TB within 24 hours, running 6 write processes in parallel to 6 RAID sets!
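The stress-test figure can be broken down into an approximate per-writer rate. This is our own back-of-the-envelope arithmetic, assuming the load was spread evenly over the 6 write processes:

```python
# Per-writer write rate implied by the stress test: 16.3 TB stored in
# 24 hours across 6 parallel write processes (one per RAID set).
# An estimate of ours, not a measurement from the original report.

TB = 10**12
total_bytes = 16.3 * TB
seconds = 24 * 3600
writers = 6

total_mb_s = total_bytes / seconds / 10**6
per_writer_mb_s = total_mb_s / writers

print(f"aggregate: {total_mb_s:.0f} MB/s, per RAID set: {per_writer_mb_s:.0f} MB/s")
```

Roughly 190 MB/s aggregate, or about 31 MB/s sustained per RAID set.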
Each new harvester currently has an average capacity of 24 MB/sec per connection and can manage 5 snapshot harvest instances per machine (the old harvest servers can only manage 1 snapshot instance per machine).
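These two figures together suggest why 5 instances is a practical ceiling per machine: at the quoted 24 MB/sec each, five instances would nearly saturate a 1 Gbit/s network interface. This is a rough estimate of ours, assuming all instances download at full rate simultaneously:

```python
# Peak network demand of one new harvester machine running 5 snapshot
# instances at the quoted 24 MB/s each, versus a 1 Gbit/s NIC.

per_instance_mb_s = 24
instances = 5

peak_mb_s = per_instance_mb_s * instances   # 120 MB/s
peak_mbit_s = peak_mb_s * 8                 # 960 Mbit/s
nic_mbit_s = 1000                           # 1 Gbit/s interface

print(f"peak demand: {peak_mbit_s} Mbit/s of a {nic_mbit_s} Mbit/s NIC "
      f"({100 * peak_mbit_s / nic_mbit_s:.0f}%)")
```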
The download capacity also depends on how the Heritrix order.xml files are configured!

There are 15 viewerproxy access instances for QA running at SB, plus 1 Tomcat and 1 Apache (for wayback).
At KB there are 10 viewerproxy access instances for QA and 1 Lucene index server.

Your network should run at a minimum of 1 Gbit/s.
You should have a firewall setup which can handle a minimum of 30-90 Mbit/sec in parallel.
At SB/KB we have 3 firewalls! The 2 firewalls at KB are currently our main bottleneck.
The central admin machine (see the attached drawings) with the JMS broker, ADMGui, ArcRepository, BitarchiveMonitors, Derby database and Apache servers for secure login is also
a bottleneck and a single point of failure, and should be mirrored or run in a cluster failover setup.

Here is our HW setup:

Bitarchive storage servers at SB:

number of machines: 2
model: Dell PowerEdge 2850 and 2950
processors: 2 x Intel Xeon 2.8 GHz and 2 x Intel Xeon 2.0 GHz, both hyperthreaded
RAM: 4 GB
local hard disk: 73 GB mirrored local + 32 TB in SAN (RAID 5 and RAID 6), and 73 GB mirrored local + 73 TB in SAN (RAID 5 and RAID 6)
network interface:
operating system: Red Hat Enterprise Linux (RHEL)

Harvester servers at SB:

number of machines: 2
model: Dell PowerEdge 2850
processors: 2 x Intel Xeon 3.20 GHz, hyperthreaded
disk: 600 GB (3 x 300 GB in RAID 5)
RAM: 4 GB
network interface: 1 Gbit/s
OS: CentOS Linux

number of machines: 1
model: HP ProLiant DL380 G4
processors: 2 x Intel Xeon 2.8 GHz, hyperthreaded
disk: 340 GB (6 x 73 GB in RAID 5)
RAM: 2.5 GB
network interface: 1 Gbit/s
OS: CentOS Linux

number of machines: 2
model: HP ProLiant DL380 G5
processors: 2 x Intel Xeon 2.0 GHz, 4 cores each
disk: 956 GB (8 x 146 GB in RAID 5)
RAM: 10 GB
network interface: 1 Gbit/s
OS: CentOS Linux

Access machines at SB:

number of machines: 1
model: Dell PowerEdge 2850
processors: 2 CPUs x 3 GHz
RAM: 2 GB
local hard disk: 1.5 TB local + 4 TB SAN (for wayback)
network interface: 1 Gbit/s
OS: Linux

Bitarchive storage servers at KB (new architecture):

number of machines: 12
model: HP DL360 G5
processors: 2 x quad-core CPU, 2 GHz
RAM: 3 GB
controllers: internal P400, external P800
storage: 2 x MSA60 enclosures, one with 3 x RAID 5 (3 TB) and the other with 2 x RAID 5 (3 TB), 1 x RAID 5 (2 TB) and 1 TB without RAID for temp data
local hard disk: 2 x 72 GB in RAID 1 for OS/software
network interface: 1 Gbit/s
operating system: Windows Web Server 2008
temp storage for batch jobs: 5%

Harvester servers at KB:

number of machines: 2
model: HP DL380 G4
processors: 2 CPUs x 3 GHz
RAM: 4 GB
local hard disk: 6 x 72 GB
network interface:
OS: Linux

number of machines: 2
model: HP ProLiant DL380 G5
processors: 2 x Intel Xeon 2.0 GHz, 4 cores each
disk: 956 GB (8 x 146 GB in RAID 5)
RAM: 10 GB
network interface: 1 Gbit/s
OS: CentOS Linux

Access machines at KB:

number of machines: 1
model: HP DL380 G4
processors: 1 CPU x 3 GHz
RAM: 2 GB
local hard disk: 2 x 72 GB + 4 x 300 GB
network interface: 1 Gbit/s
OS: Linux

For a similar deploy installation, see the first deploy example in chapter 10.1 of the Installation Manual
( https://netarchive.dk/suite/Installation_Manual_devel/AppendixC?action=AttachFile&do=get&target=deploy_example.xml )