Attachment 'HW_SW_production_example.txt'

   1 Here is our running setup for the current snapshot harvest at KB/SB in Denmark:
   2 
   3 We expect to download about 22 TB in 12 weeks with a mix of old and new bitarchive and harvester servers at KB and SB
   4 (preparation of the harvest takes about 3 weeks).
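
As a rough sanity check on the figures above, the average download rate implied by 22 TB in 12 weeks can be worked out as follows. This is only a sketch: it assumes decimal units (1 TB = 10**12 bytes) and a perfectly even download rate, neither of which is stated in the text, and the names used are illustrative.

```python
# Average download rate implied by "22 TB in 12 weeks".
# Assumptions (not from the source): 1 TB = 10**12 bytes, and the
# harvest downloads at a constant rate for the whole 12 weeks.
TOTAL_BYTES = 22 * 10**12
HARVEST_SECONDS = 12 * 7 * 24 * 60 * 60  # 12 weeks in seconds


def average_rate_mbit_per_s(total_bytes, seconds):
    """Return the average rate in Mbit/s (1 Mbit = 10**6 bits)."""
    return total_bytes * 8 / seconds / 10**6


required_mbit = average_rate_mbit_per_s(TOTAL_BYTES, HARVEST_SECONDS)
print(f"Average rate needed: {required_mbit:.1f} Mbit/s")
```

Under these assumptions the average comes to roughly 24 Mbit/s. Real harvest traffic is bursty, so peak load will be higher than this average.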
   5 
   6 Today, 2 June 2009, we have
   7 16 running harvester instances at KB (6 are actually downloading):
   8 6 harvesters on each of the 2 HP DL380 G4 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)
   9 2 harvesters on each of the 2 HP DL380 G5 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)
  10 
  11 30 running SW harvester instances at SB (11 are actually downloading):
  12 6 harvesters on each of the 2 Dell 2850 machines (1 snapshot harvest instance on each machine; the rest are ready for or running selective harvest jobs)
  13 6 harvesters on the 1 HP DL380 G3 machine (1 snapshot harvest instance; the rest are ready for or running selective harvest jobs)
  14 6 harvesters on each of the 2 HP DL380 G5 machines (3 snapshot harvest instances on each machine; the rest are ready for selective harvest jobs)
  15 
  16 In total we have 13 harvest instances running snapshot harvests (4 at KB and 9 at SB, plus one extra DL380 G5 harvest server in reserve at KB).
  17 The rest are ready for or running selective/event harvest jobs (12 at KB and 21 at SB).
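
The instance counts above can be cross-checked with a small tally. This is a sketch only: the `Group` record and `tally` function are illustrative names, and the SB HP DL380 G5 group is assumed to run 3 snapshot instances on each of its 2 machines, since that is the reading that reproduces the stated totals of 9 snapshot and 21 selective instances at SB.

```python
from collections import namedtuple

# One record per group of identical harvester machines.
Group = namedtuple(
    "Group", "site model machines harvesters_per_machine snapshot_per_machine"
)

GROUPS = [
    Group("KB", "HP DL380 G4", 2, 6, 1),
    Group("KB", "HP DL380 G5", 2, 2, 1),
    Group("SB", "Dell 2850", 2, 6, 1),
    Group("SB", "HP DL380 G3", 1, 6, 1),
    # Assumption: 3 snapshot instances on EACH G5 machine at SB.
    Group("SB", "HP DL380 G5", 2, 6, 3),
]


def tally(groups, site):
    """Return (total, snapshot, selective) instance counts for one site."""
    own = [g for g in groups if g.site == site]
    total = sum(g.machines * g.harvesters_per_machine for g in own)
    snapshot = sum(g.machines * g.snapshot_per_machine for g in own)
    return total, snapshot, total - snapshot


for site in ("KB", "SB"):
    total, snapshot, selective = tally(GROUPS, site)
    print(f"{site}: {total} instances, {snapshot} snapshot, {selective} selective")
```

With these inputs the tally matches the totals quoted in the text: 16 instances at KB (4 snapshot, 12 selective) and 30 at SB (9 snapshot, 21 selective).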
  18 
  19 Total archive storage: ca. 183 TB (as of August 2009, 112 TB used)
  20 
  21 The new bitarchive servers, for example:
  22 1 x DL360 server with 6 bitapps running for 24 hours stores 240 GB on average (measured over 4 days: 90 GB - 360 GB).
  23 Our server stress test shows that the new bitarchive servers can store 16.3 TB within 24 hours when running 6 write processes in parallel to 6 RAIDs.
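
To put the two storage figures side by side, they can be converted to sustained write rates. This is a rough sketch assuming decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes) and an even load across the 6 write processes; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the text; the decimal-unit interpretation is an assumption.
STRESS_BYTES_PER_DAY = 16.3 * 10**12      # stress test: 16.3 TB in 24 hours
PRODUCTION_BYTES_PER_DAY = 240 * 10**9    # production average: 240 GB in 24 hours
WRITE_PROCESSES = 6


def mb_per_s(bytes_per_day):
    """Convert a daily byte volume to a sustained rate in MB/s."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


stress_total = mb_per_s(STRESS_BYTES_PER_DAY)
stress_per_process = stress_total / WRITE_PROCESSES
production_total = mb_per_s(PRODUCTION_BYTES_PER_DAY)
headroom = STRESS_BYTES_PER_DAY / PRODUCTION_BYTES_PER_DAY

print(f"Stress test: {stress_total:.1f} MB/s total, "
      f"{stress_per_process:.1f} MB/s per write process")
print(f"Production average: {production_total:.1f} MB/s")
print(f"Stress test is about {headroom:.0f}x the production average")
```

On this reading the stress test corresponds to roughly 189 MB/s per server, far above the production average of under 3 MB/s, which suggests that the bitarchive servers themselves are not the limiting factor.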
  24  
  25 Each new harvester currently has an average capacity of 24 MB/sec per connection and can manage 5 snapshot harvest instances per new machine (the old harvest servers can only manage 1
  26 snapshot instance per machine).
  27 
  28 The download capacity also depends on how the Heritrix order.xml files are configured.
  29 
  30 There are 15 viewerproxy access instances for QA running at SB plus 1 tomcat and 1 apache (for wayback).
  31 At KB there are 10 viewerproxy access instances for QA and 1 Lucene index server.
  32 
  33 Your network should run at a minimum of 1 Gbit/s.
  34 
  35 You should have a firewall setup that can handle at least 30 - 90 Mbit/sec in parallel.
  36 At SB/KB we have 3 firewalls. The 2 firewalls at KB are currently our main bottleneck.
  37 The central admin machine with the JMS-broker, ADMGui, ArcRepository, BitarchiveMonitors, Derby database and Apache servers for secure login is also a bottleneck and a single point of failure, and should be mirrored or be in a cluster failover setup.
  38 
  39 Here is our HW setup:
  40 
  41 Bitarchive storage servers at SB:
  42 
  43                              number of machines: 2
  44                              model: Dell PowerEdge 2850 and 2950
  45                              processors : 2 * Intel Xeon 2.8 GHz and Intel Xeon 2.0 GHz both hyperthreaded
  46                              RAM: 4GB
   47                              local hard disk: 73 GB mirrored local + 32 TB in SAN (RAID 5 and RAID 6), and 73 GB mirrored local + 73 TB in SAN (RAID 5 and RAID 6)
  48                              network interface: 
   49                              operating system: Red Hat Enterprise Linux (RHEL)
  50 
  51 Harvester servers at SB:
  52 
  53                              number of machines: 2
  54                              model: Dell PowerEdge 2850
  55                              processors: 2 * Intel Xeon CPU 3.20GHz hyperthreaded
   56                              disk: 600 GB (3 x 300 GB in RAID 5)
  57                              RAM: 4GB
  58                              network interface: 1 Gbit/s
  59                              OS: Linux Centos
  60 
  61                              number of machines: 1
  62                              model: HP ProLiant DL380 G4
  63                              processors: 2 * Intel Xeon 2.8 GHz hyperthreaded
   64                              disk: 340 GB (6 x 73 GB in RAID 5)
   65                              RAM: 2.5 GB
  66                              network interface:  1 Gbit/s
  67                              OS: Linux Centos
  68 
  69                              number of machines: 2
  70                              model: HP ProLiant DL380 G5
   71                              processors: 2 * Intel Xeon 2.0 GHz 4 cores
   72                              disk: 956 GB (8 x 143 GB in RAID 5)
  73                              RAM: 10GB
  74                              network interface: 1 Gbit/s
  75                              OS: Linux Centos
  76 
  77 Access machines at SB:
  78 
  79                              number of machines: 1
  80                              model: Dell PowerEdge 2850
  81                              processors : 2 CPU x 3GHZ
  82                              RAM: 2 GB
   83                              local hard disk: 1.5 TB local + 4 TB SAN (for wayback)
  84                              network interface: 1 Gbit/s
  85                              OS: Linux
  86 
  87 Bitarchive storage servers at KB  new architecture:
  88 
  89                              number of machines: 12
  90                              model: HP DL360 G5 
  91                              processors : 2 x QC CPU 2 GHZ
  92                              RAM: 3 GB
  93                              Controllers: Internal P400, External p800
   94                              Storage: 2 x MSA60: one with 3 x RAID 5 (3 TB), the other with 2 x RAID 5 (3 TB), 1 x RAID 5 (2 TB) and 1 TB without RAID for temp data
  95                              local hard disk: 2 x 72 GB RAID 1 for OS/Software
  96                              network interface: Gigabit
  97                              operating system: Windows Web 2008
   98                              temp storage for batch jobs: 5%
  99 
 100 Harvester servers at KB:
 101 
 102                              number of machines: 2
 103                              model: HP DL380 G4
 104                              processors : 2 CPU x 3GHZ
 105                              RAM: 4 GB
 106                              local hard disk: 6 x 72 GB
 107                              network interface: 
 108                              OS: Linux
 109 
 110                              number of machines: 2
 111                              model: HP ProLiant DL380 G5
 112                              processors: 2 * Intel Xeon 2.0 GHZ 4 cores
 113                              disk: 956 GB (8 x 146 GB in raid-5)
 114                              RAM: 10GB
 115                              network interface:  1 Gigabit
 116                              OS: Linux Centos
 117 
 118 Access machines at KB:
 119 
 120                              number of machines: 1
 121                              model: HP DL380 G4
 122                              processors : 1 CPU x 3GHZ
 123                              RAM: 2 GB
 124                              local hard disk: 2 x 72 GB + 4 x 300 GB
 125                              network interface: 1 Gbit/s
 126                              OS: Linux
 127 
  128 For a similar deployment, see the first deploy example in chapter 10.1 of the Installation Manual
 129 ( https://netarchive.dk/suite/Installation_Manual_devel/AppendixC?action=AttachFile&do=get&target=deploy_example.xml )
