Differences

This shows you the differences between two versions of the page.

cluster:83 [2010/07/13 16:15]
hmeij
cluster:83 [2010/09/14 10:40] (current)
hmeij
Line 55: Line 55:
 ==== Comparison of Quotes ====
  
-^  Item  ^  Advanced Clustering  ^  Dell  ^  Sun  ^
 +<html>
 +<!--
 |  quote  |  $ 246,237  |  $ 202,175  |  $ 199,528  |
 +-->
 +</html>
 +
 +^  Item  ^  Advanced Clustering  ^  Dell  ^  Sun  ^
 |  job slots  |  512 (reduce to 352 for $200K)  |  160  |  272  |
 |  compute nodes  |  blades  |  rack server  |  rack server  |
Line 92: Line 97:
 Also, as the size increases, we probably cannot back up to the VTL anymore.  In that case, the snapshot capability is important (point-in-time restore).  We could also design an "archive" area on these devices (RAID 1, for example), or perform disk2disk backup copies on the same device.
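
 One hypothetical sketch of the disk2disk idea: rsync's ''--link-dest'' option gives point-in-time copies on the same device by hard-linking files that have not changed since the previous snapshot, so each snapshot costs little extra space. All paths below are made up for illustration.

<code python>
# Sketch of a disk-to-disk, point-in-time backup kept on the same
# device: rsync --link-dest hard-links unchanged files against the
# newest existing snapshot (all paths are hypothetical).
import subprocess
from datetime import date
from pathlib import Path

SRC = "/home/"                      # data to protect
DST = Path("/archive/snapshots")    # "archive" area on the same device

snap = DST / date.today().isoformat()
prev = sorted(d for d in DST.iterdir() if d.is_dir() and d != snap)

cmd = ["rsync", "-a", "--delete"]
if prev:
    cmd += ["--link-dest", str(prev[-1])]   # newest prior snapshot
cmd += [SRC, str(snap)]
subprocess.run(cmd, check=True)
</code>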
  
-^  Item  ^  Nexsan  ^  RAID Inc  ^  Nexsan  ^
-|  quote  |  $ 48,862  |  $ 81,750  |  $ 49,892  |
 +<html>
 +<!--
 +|  quote  |  $ 48,862  |  $ 81,750  |  $ 49,892  |
 +-->
 +</html>
 +
 +^  Item  ^  Nexsan  ^  RAID Inc  ^  Nexsan  ^
 |  name  |  SataBeast  |  Xanadu  |  SataBeast  |
 |  controllers  |  dual  |  dual  |  dual  |
Line 110: Line 120:
 ==== Round 2 of Quotes ====
  
-^  Item  ^  Advanced Clustering //updated//  ^  Dell  ^  Dell #2 //updated//  ^  HP //updated//  ^
-|  quote  |  $ 152,283.77 (+$2K one day, on site)  |  $ 149,380.49  |  $ 176,856.91 (on site included, PCM config?) (DROP E-CIFS, E-SCSI?)  |  $ 177,800  |
 +<html><!--
 +|  quote  |  $ 152,283.77 (+$2K one day, on site)  |  $ 149,380.49  |  $ 176,856.91 (on site included? check, perhaps +2.5K)  |  $ 149,996.49 (one week on site included)  |
 +-->
 +</html>
 +
 +^  Item  ^  Advanced Clustering //updated//  ^  Dell  ^  Dell #2 //updated//  ^  HP //updated!//  ^
 |  job slots  |  240  |  80  |  240  |  256  |
-|  overall cost/job slot (minus head node & storage)  |  $ 550  |  $ 892  |  $ 569  |  $ 415  |
 +|  overall cost/job slot (minus head node & storage)  |  $ 550  |  $ 892  |  $ 569  |  $ <del>415</del>  |
 |  compute nodes  |  blades  |  blades  |  blades  |  blades  |
 |  node count  |  30  |  10  |  30  |  32  |
Line 120: Line 135:
 |  memory  |  6x2=12 gb  |  6x2=12 gb  |  6x2=12 gb  |  12x1=12 gb  |
 |  hdd  |  1x250 gb  |  1x146 gb (15k SAS)  |  1x146 gb (15k SAS)  |  1x160 gb  |
-|  gigE  |  2 x Netgear GSM7352 v2 - 48-port, **10Gb switch to storage**  |  two PowerConnect 6248 - 48 port  |  two PowerConnect 6248 - 48, **10Gb switch to storage**  |  ProCurve Switch 2610-48, HP ProCurve 2910-48  |
 +|  ethernet  |  2 x Netgear GSM7352 v2 - 48-port, **10Gb switch to storage**  |  two PowerConnect 6248 - 48 port  |  two PowerConnect 6248 - 48, **10Gb switch to storage**  |  ProCurve Switch 2610-48 ($750), HP ProCurve 2910-48 ($3,500), both **1Gb**  |
-|  infiniband  |  72-Port 4X Configuration Modular DDR InfiniBand Switch, all nodes, includes head node and storage devices for IPoIB (see next section)  |  (3?) 12PT 4X DDR-INFINIBAND, all nodes  |  (3?) 12PT 4X DDR-INFINIBAND, all nodes  |  Voltaire IB 4X **QDR** 36P, all nodes, plus head node for IPoIB  |
 +|  infiniband  |  72-Port 4X Configuration Modular DDR InfiniBand Switch, all nodes, <del>includes head node and storage devices for IPoIB</del> (see next section)  |  (3?) 12PT 4X DDR-INFINIBAND, all nodes  |  (3?) 12PT 4X DDR-INFINIBAND, all nodes  |  Voltaire IB 4X **QDR** 36P, all nodes, plus head node for IPoIB  |
 |  head node  |  dual quad core Xeon E5620 2.40GHz w/ 12MB, 6x2=12 gb, 2x500 gb hdd (rack)  |  Xeon E5620 2.4GHz, 12M Cache, 6x2=12 gb, 2(?)x146 gb hdd (blade)  |  Xeon E5620 2.4GHz, 12M Cache, 6x2=12 gb, 2(?)x146 gb hdd (blade)  |  HP E5620 dual-quad DL380G7, 6x2=12 gb, 2x250 gb (rack)  |
 |  storage  |  Pinnacle Flexible Storage System, single Intel Xeon Quad Core X3360 2.83GHz w/ 12MB cache, 8 gb ram  |  PV MD3200i, 6x PV MD1200, direct attach  |  NX4 10Gb NAS (EMC2)  |  HP StorageWorks MSA60 Array, **direct attach**  |
Line 128: Line 143:
 |  storage costs  |  $15K, $0.31/GB  |  $72.5K, $1.42/GB  |  $52K, $1.06/GB  |  $45K, $0.92/GB  |
 |  storage functions  |  multiple NICs, optional 10gb and Infiniband support, LVs > 2 TB, **snapshots imaging enabled**, quota, antivirus scanning, CIFS/NFS/AFP, multiple raids (incl 5&6), cluster expandable  |  dual controller, optional snapshots, direct attach iSCSI, head node performs NFS  |  CIFS/NFS/FTP + iSCSI and Fiber, expandable to 96 TB, **snapview licensed**, deduplication capable  |  multiple NICs (on head node), direct attach, head node performs NFS duties via IPoIB, not expandable, **no snapshotting**  |
 +|  storage mgt software  |  yes  |  ?  |  yes  |  yes  |
 |  OS  |  CentOS 5.x  |  RHEL 5.3  |  RHEL 5.3  |  RHEL 5.3  |
 |  software  |  Intel C++, Fortran, MPI, MKL  |  none  |  Dell  |  none  |
 |  scheduler  |  gridengine  |  Platform LSF  |  Platform LSF  |  gridengine  |
 |  management  |  Breakin, Cloner, Beo Utils, Act Dir, Ganglia  |  Platform OCS5.x  |  Platform OCS5.x  |  HP Cluster Mgt Utility Lic and Media  |
-|  UPS  |  3000VA UPS  |  5600W, 4U, 208V  |  none, yes?  |  none  |
 +|  UPS  |  3000VA UPS  |  5600W, 4U, 208V  |  5600W, 4U, 208V  |  **none**  |
 |  iKVM  |  yes  |  yes  |  no  |  yes  |
 |  support  |  3 years, NBD  |  3 years, 4-Hour 7x24 On-site Service  |  3 years, NBD  |  3 years, NBD  |
-|  L6-30  |  3  |    |  2  |    |
 +|  L6-30  |  3  |    |  2(?)  |  4  |
-|  Watts  |  12,891  |    |    |  13,943  |
 +|  Watts  |  12,891  |    |  13,943  |  10,602  |
-|  BTU/hr  |  43,984  |    |    |  47,616  |
 +|  BTU/hr  |  43,984  |    |  47,616  |  36,175  |
-|  A/C Tons  |  3.67  |    |    |  ~4  |
 +|  A/C Tons  |  3.67  |    |  4  |  3  |
-|  weight  |  1829.9 lbs  |    |    |    |
 +|  weight lbs  |  1,829.9  |    |    |  1,006  |
-|  cost to run /wo A/C  |  $10,897.28 (9.36c/KWH)  |    |    |    |
 +|  cost to run (9.36c/KWH) /wo A/C  |  $10,897.28  |    |  $11,797.12  |  $8,952.57 (Watts+AC: 24% greener than Dell - saves $5,700/year, 18% greener than ACT - saves $4,000/year)  |
 |  Us used per rack  |  40/42U  |  ??/48  |    |  33/42U  |
 |  Note1  |  unlimited lifetime technical support for all hardware and software we supply  |  SAS drives 600 gb @15K is $750 vs 500 gb @7.2K SATA is $275, estimate is that we could save $40K  |  NX4 has more protocols than we need but so what, $1/GB is good  |  arrives fully integrated, with knowledge transfer/training for a week on all parts of the cluster  |
 |  Note2  |  this solution could do IPoIB, presumably yielding an NFS boost in performance, however we desire native NFS, see next section  |  Or it could be that 15K SAS drives in the array is a good idea?  |  UPS and iKVM are gone, no big deal  |  IPoIB is the unknown, we could do NFS on the head node, but then a bottleneck?  |
-|  Note3  |  all MPI flavors precompiled with Gnu, Portland (for AMD), Intel (for Xeon)  |  Platform OCS and LSF add costs, support is nice though  |  Total cost for OCS/LSF/RHEL is about $20K, that sets up a differential with ACT of $4.5K now  |  <del>Unsure about the lack of management/OS software preinstalled/preconfigured, what if we hit a driver problem with HP hardware?</del> unit will be fully integrated to our specs with whatever we want (CentOS/Gridengine)  |
 +|  Note3  |  all MPI flavors precompiled with Gnu, Portland (for AMD), Intel (for Xeon)  |  Platform OCS and LSF add costs, support is nice though  |  Total cost for OCS/LSF/RHEL is about $20K, that sets up a differential with ACT of $4.5K now  |  <del>Unsure about the lack of management/OS software preinstalled/preconfigured, what if we hit a driver problem with HP hardware?</del> unit will be fully integrated to our specs with whatever we want (CentOS or RHEL/Gridengine)  |
 |  Note4  |  $75K buys another 135 job slots, 17 nodes  |  $75K buys another 80 job slots, 10 nodes  |  there is a second NX4 expansion shelf quoted which is not needed, could reduce quote by $5K  |  $75K buys another 180 job slots, 22 nodes  |
 |  Note5  |  with these large hard disks on compute nodes, investigate whether /localscratch partitions can be presented as one scratch file system via Lustre, i.e. 200x30 = 6 TB, definitely worth the effort  ||||
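
 The power/cooling rows above follow from standard conversions: BTU/hr = Watts x 3.412, one ton of A/C = 12,000 BTU/hr, and annual cost = kW x 8,760 hours x the 9.36c/KWH rate from the table. A minimal sketch that reproduces the quoted figures; the quoted annual costs run roughly 3% above the bare product, so the vendor estimates presumably fold in some overhead.

<code python>
# Derive the BTU/hr, A/C tons, and cost-to-run rows from the quoted
# Watts. Conversions: 1 W = 3.412 BTU/hr; 1 A/C ton = 12,000 BTU/hr;
# annual cost = kW x hours/year x $/kWh (9.36c/KWH from the table).
RATE = 0.0936        # $/kWh
HOURS = 24 * 365     # hours per year

watts = {            # quoted Watts, from the table
    "Advanced Clustering": 12891,
    "Dell #2": 13943,
    "HP": 10602,
}

for vendor, w in watts.items():
    btu_hr = w * 3.412        # e.g. 12,891 W -> 43,984 BTU/hr
    tons = btu_hr / 12000     # 43,984 BTU/hr -> 3.67 tons
    cost = w / 1000 * HOURS * RATE
    print(f"{vendor}: {btu_hr:,.0f} BTU/hr, {tons:.2f} tons, ~${cost:,.2f}/yr")
</code>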
Line 231: Line 247:
     - FSU NA
     - KTZT storage is slow, probably NFS related, get more memory for cpu
 +
 +==== HP Questions ====
 +
 +**UMICH**
 +  * Dell complacent, HP proactive (buys $1M in 3 years, 2852 cores)
 +  * No experience with software
 +  * Good HP support experience
 +  * HP more efficient than Dell, supports that
 +  * Torque/PBS shop
 +  * No IPoIB experience (yet)
 +  * Also a Dell shop
 +
  
  