cluster:110

  
===== Round 3 =====


==== Specs: MW - GPU ====

This is what we ended up buying in May 2013.

^  Topic^Description  ^
|  General| 10 CPUs (80 cores), 20 GPUs (45,000 CUDA cores), 256 gb ram/node (1,280 gb total), plus head node (128 gb)|
|  Head Node|1x42U Rackmount System (36 drive bays), 2xXeon E5-2660 2.0 Ghz 20MB Cache 8 cores (total 16 cores)|
|  |16x16GB 240-Pin DDR3 1600 MHz ECC (total 256gb, max 512gb), ?x10/100/1000 NIC (3 cables), 3x PCIe x16 Full, 3x PCIe x8|
|  |2x1TB 7200RPM (Raid 1) + 16x3TB (Raid 6), Areca Raid Controller|
|  |Low profile graphics card, ConnectX-3 VPI adapter card, Single-Port, FDR 56Gb/s (1 cable)|
|  |1400w Power Supply 1+1 redundant|
|  Nodes|5x 2U Rackmountable Chassis, 5x 2 Xeon E5-2660 2.0 Ghz 20MB Cache 8 cores (16 cores/node), Sandy Bridge series|
|  |5x 16x16GB 240-Pin DDR3 1600 MHz (256gb/node memory, max 256gb)|
|  |5x 1x120GB SSD, 5x 4xNVIDIA Tesla K20 5 GB GPUs (4/node), 1 CPU : 2 GPU ratio|
|  |?x10/100/1000 NIC (1 cable), Dedicated IPMI Port, 5x 4 PCIe 3.0 x16 Slots, 5x 8 PCIe 3.0 x8 Slots|
|  |5xConnectX-3 VPI adapter card, Single-Port, QDR/FDR 40/56 Gb/s (1 cable)|
|  |5x1620W 1+1 Redundant Power Supplies|
|  Network|1x 1U Mellanox InfiniBand QDR Switch (18 ports) & HCAs (single port) + 3m cable QDR to existing Voltaire switch|
|  |1x 1U 24 Port Rackmount Switch, 10/100/1000, Unmanaged (cables)|
|  Rack|1x42U rack with power distribution units (14U used)|
|  Power|2xPDU, Basic rack, 30A, 208V, Requires 1x L6-30 Power Outlet Per PDU (NEMA L6-30P)|
|  Software| CentOS, Bright Cluster Management (1 year support), MVAPICH, OpenMPI, CUDA|
|  | scheduler and GNU compilers installed and configured|
|  | Amber12 (customer provides license), LAMMPS, NAMD, CUDA 4.2 (for apps) & 5|
|  Warranty|3 Year Parts and Labor (lifetime technical support)|
|  GPU Teraflops|23.40 double, 70.40 single (see the sketch below the table)|
|  Quote|<html><!-- estimated at $124,845 --></html>Arrived, includes S&H and Insurance|
|  Includes|Cluster pre-installation service|
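
As a quick sanity check on the aggregate GPU and memory rows above, here is a minimal sketch. The per-card peak figures (Tesla K20: 1.17 TFLOPS double, 3.52 TFLOPS single) are NVIDIA's published specs and are not part of the quote itself.

<code python>
# Back-of-the-envelope check of the "GPU Teraflops" and memory figures above.
# Assumed per-card peaks: Tesla K20 = 1.17 TFLOPS double, 3.52 TFLOPS single.
K20_DOUBLE = 1.17
K20_SINGLE = 3.52

gpus = 5 * 4                          # 5 nodes with 4 K20s each = 20 GPUs
print(f"{gpus * K20_DOUBLE:.2f} double, {gpus * K20_SINGLE:.2f} single")
# -> 23.40 double, 70.40 single

nodes, gb_per_node, cores_per_node = 5, 256, 16
print(nodes * gb_per_node)            # 1280 gb of compute-node ram in total
print(gb_per_node // cores_per_node)  # 16 gb per core
</code>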

  * 16U - estimated draw 6,900 Watts and 23,713 BTUs cooling - $30K/year (see the sketch after this list)
  * 5 GPU shelves
  * 2 PDUs
  * 42 TB raw
  * FDR interconnects
  * 120GB SSD drives on nodes
  * 256 gb ram on nodes, 16gb/core
  * Areca hardware raid
  * Lifetime technical support
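
The power and cooling figures in the first bullet can be reproduced roughly as follows. This is a sketch only; the $/kWh rate is a placeholder assumption (the quote does not state one), chosen so that electricity plus cooling lands near the quoted $30K/year.

<code python>
# Rough reproduction of the 6,900 W line item above.
watts = 6900.0

# Standard conversion: 1 W ~= 3.412 BTU/h. This gives ~23,543 BTU/h,
# slightly under the quoted 23,713 BTUs (vendor likely used a different factor).
btu_per_hr = watts * 3.412

# Energy per year if the rack draws full load around the clock.
kwh_per_year = watts / 1000 * 24 * 365        # ~60,444 kWh/year

# Placeholder all-in rate covering electricity plus cooling overhead (assumption).
assumed_rate = 0.50                           # $/kWh
annual_cost = kwh_per_year * assumed_rate     # ~$30K/year

print(f"{btu_per_hr:,.0f} BTU/h, {kwh_per_year:,.0f} kWh/yr, ~${annual_cost:,.0f}/yr")
</code>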
  
==== Specs: EC GPU ====
  
^  Topic^Description  ^
|  General| 12 CPUs (96 cores), 20 GPUs (45,000 CUDA cores), 128 gb ram/node (640 gb total), plus head node (128gb)|
|  Head Node|1x2U Rackmount System, 2xXeon E5-2660 2.20 Ghz 20MB Cache 8 cores|
|  |8x16GB 240-Pin DDR3 1600 MHz ECC (128gb, max 512gb), 2x10/100/1000 NIC, 1x PCIe x16 Full, 6x PCIe x8 Full|
  
  
  * 20U - estimated draw 7,400 Watts - $30K/year for cooling and power
  * 5 GPU shelves
  * 1 CPU shelf
  * 4 PDUs - this could be a problem
  * 56TB raw
  * QDR interconnects
  * 1 TB disk on node, makes for a large /localscratch
  * LSI hardware raid card
  