cluster:112
All campus utilities are within the physical plant chart of accounts. Departments are not charged.
Overview
| Cluster | Blue Sky | Dell | HP | GPU 1) | CPU 2) | Comment |
|---|---|---|---|---|---|---|
| | | 12/2006 | 11/2010 | 04/2013 | 04/2013 | |
| Age (yrs) | 11 | 5.5 | 1.5 | 0 | 0 | |
| Nodes (Nr) | 45 | 36 | 32 | 4 | 13 | |
| Cores (Nr) | 90 | 256 | 256 | 16 gpu 3), 64 cpu | 208 | |
| Ram (GB) | 1,080 | 384 | 340 | 80 gpu, 512 cpu | 1,664 | |
| Teraflops | 0.5-0.7 4) | 0.66 5) | 1.5 6) | 18.71 7) + 1.1 8) | 3.6 9) | |
| L6-30 (Nr) | 4 | 8 | 4 | 2 | 2 | |
| Capacity (Watts) 10) | 24,960 | 49,920 | 24,960 | 12,480 | 12,480 | |
| Power (Watts) 11) | N/A 12) | 11,500 13) | 10,602 14) | 5,900 15) | 5,250 16) | |
| Cooling (Btus) 17) | | 39,240 18) | 36,175 19) | 20,171 20) | 17,913 21) | |
| Annual Costs ($) 22) | | 46,091 | 42,930 | 25,864 | 23,506 | using 2009 data 23) |
| $1,000/Tflop | | 69.8 | 28.6 | 1.3 | 6.5 | using 2009 data 24) |
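The $1,000/Tflop row appears to be the annual power and cooling cost divided by the teraflops figure, expressed in thousands of dollars. A minimal Python sketch under that assumption, with the GPU figure taken as the sum 18.71 + 1.1; the function name `cost_per_tflop_k` is hypothetical and only for illustration:

```python
def cost_per_tflop_k(annual_cost_dollars, teraflops):
    """Annual power + cooling cost per teraflop, in thousands of dollars."""
    return annual_cost_dollars / teraflops / 1000.0

# (annual costs in $, teraflops) copied from the table above
clusters = {
    "Dell": (46091, 0.66),
    "HP": (42930, 1.5),
    "GPU": (25864, 18.71 + 1.1),  # assumption: gpu + cpu teraflops are summed
    "CPU": (23506, 3.6),
}

for name, (cost, tflops) in clusters.items():
    print(f"{name}: {cost_per_tflop_k(cost, tflops):.1f}")
```

With the table values this gives 69.8, 28.6, 1.3 and 6.5, matching the row.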
Details
All dollar figures are based on a spreadsheet using March 2009 values
- This overestimates actual costs because we now have cogeneration
- 64% off peak, 36% on peak consumption
- includes distribution and transmission charges
Notes
- one L6-30 connector can sustain 7,500 watts at 250 volts
- we run at 208V which can sustain 6,240 watts per connector
- wattometer: for example, 7 kWh consumed in 19 hours = 368 watts on average
- 1,000 Btus per hour equal 293 watts
- assume cost of $0.10 per kWh (kilowatt hour)
- annual kWh = watts/1000 (from wattometer) * 24 * 365 * 30 nodes (6 servers died)
- power costs per year = $0.10/Kwh * watts/1,000 * 24 hours * 365 days
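The conversions in these notes can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the 30 amp rating is the standard rating of an L6-30 connector rather than something stated on this page, and the flat $0.10 rate is the assumption from the notes. The dollar figures elsewhere on this page come from the 2009 spreadsheet, so a flat rate is not expected to reproduce them.

```python
L6_30_AMPS = 30                # assumption: standard L6-30 rating
BTU_PER_HOUR_TO_WATTS = 0.293  # from the notes: 1,000 Btus = 293 watts

def connector_capacity_watts(volts, amps=L6_30_AMPS):
    """Watts one connector can sustain at the given voltage."""
    return volts * amps

def average_watts(kwh, hours):
    """Average draw from a wattometer reading of kWh over elapsed hours."""
    return kwh * 1000.0 / hours

def btu_to_watts(btu_per_hour):
    """Convert a cooling load in Btus per hour to watts."""
    return btu_per_hour * BTU_PER_HOUR_TO_WATTS

def annual_kwh(watts):
    """kWh consumed in a year of continuous draw."""
    return watts / 1000.0 * 24 * 365

def annual_power_cost(watts, dollars_per_kwh=0.10):
    """Yearly power cost at a flat rate (assumed $0.10 per kWh)."""
    return annual_kwh(watts) * dollars_per_kwh

print(connector_capacity_watts(250))  # 7500
print(connector_capacity_watts(208))  # 6240
print(round(average_watts(7, 19)))    # 368
```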
HP Nov-2010
- 1.5 years old
- 32 blade servers in 8 2U-shelves
- 4 L6-30
- 10,602 watts = 92,856 kwh annual = $21,465
- 36,175 btus = 10,600 watts = 92,856 kwh annual = $21,465
- total costs: $42,930/year
- “roughly” $4,770 per 2U-shelves (8 2U-shelves + 1 head node of 2U)
Dell Jan-2007
- 5.5 years old
- 36 1U rack servers in 18 2U-shelves
- 8 L6-30
- 11,500 watts = 100,740 kwh annual = $23,098
- 39,240 btus = 11,497 watts = 100,713 kwh annual = $23,093
- total costs: $46,091/year
- “roughly” $2,310 per 2U-shelves (18 2U-shelves + 2 head nodes of 2U)
CPU HPC 2012
- new acquisition
- 13 1U rack servers in 6.5 2U-shelves
- 2 L6-30
- 5,250 watts = 45,990 kwh annual = $11,754
- 17,913 btus = 5,249 watts = 45,981 kwh annual = $11,752
- total costs: $23,506/year
- “roughly” $3,134 per 2U-shelves (6.5 2U-shelves + 1 head node of 2U)
GPU HPC 2012
- new acquisition
- 4 rack servers in 4 2U-shelves
- 2 L6-30
- 5,900 watts = 51,684 kwh annual = $12,934
- 20,171 btus = 5,898 watts = 51,667 kwh annual = $12,930
- total costs: $25,864/year
- “roughly” $2,874 per 2U-shelves (8 2U-shelves + 1 head node of 2U)
Replace Dell
This is what the picture would look like using the 2009 spreadsheet data
What we should look at is teraflops replacement …
- 0.66 teraflops, measured with Linpack (actual)
- E5-2660 is rated at 140 Gflops so one node provides 280 Gflops (theoretical)
- about 2 1/3 1U nodes would then replace the cluster (0.66 / 0.28 Tflops)
- acquisition costs: 3 x $5,782 = $17,346 + PDUs, ethernet switch, cables
- 48 cores, 840 Gflops
- power/AC costs: 3 nodes x ($23,506 / 13 nodes) x 3 years = $16,273
- add both costs of the new hardware and divide by the Dell total annual costs (power + cooling): ($17,346 + $16,273) / $46,091 = 0.73
- ROI: 3/4 year, but this is on a teraflop by teraflop basis
And if we double the teraflops…1.6 Tflops (actually + 1 Tflops), 1.5 years ROI for 96 cores.
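The payback arithmetic above can be sketched in Python. This is a rough illustration only: `payback_years` is a hypothetical name, the per-node power/AC cost is taken as $23,506 / 13 from the CPU HPC figures, and PDUs, ethernet switch and cables are left out because no prices are given for them.

```python
def payback_years(nodes, node_price, annual_cost_per_node, years,
                  replaced_annual_cost):
    """(acquisition + power/AC over `years`) / annual cost of the old cluster."""
    acquisition = nodes * node_price
    running = nodes * annual_cost_per_node * years
    return (acquisition + running) / replaced_annual_cost

per_node = 23506 / 13   # power + cooling per new 1U node, per year
dell_annual = 46091     # Dell total annual costs (power + cooling)

# 3 nodes (48 cores) replace the Dell teraflops one for one
print(round(payback_years(3, 5782, per_node, 3, dell_annual), 2))  # 0.73
# 6 nodes (96 cores) roughly double the teraflops
print(round(payback_years(6, 5782, per_node, 3, dell_annual), 2))  # 1.46
```

The two results correspond to the 3/4 year and 1.5 years quoted above.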
L6-30 Connections
- Dell
  - left rack (from rear) LCBR 01, 09, 08, 02 and UPS on 01
  - right rack (from rear) LCBR 03, 04 and UPS on 07
- BSS
  - LCBR CIR 25/27, 21/23, 22/24, 26/28
- HP
  - LCBR CIR 29/31, 28/30
  - LCBR 06
  - (#3 on Enterprise UPS) PDI1 CIR 31,33
Dell April-2013
Actual data obtained April - May, 2013
Kill-A-Watt meter stats; pulled one power supply … mean/1000 * 24 * 365 * 30 nodes
| Node | jobs | kWh | time (hh:mm) | mean (watts) | comment | annual kWh | power annual | +cooling annual (x2) | |
|---|---|---|---|---|---|---|---|---|---|
| c04 | 8 | 2.29 | 06:30 | 352 | idle, no jobs | 92,509 | $21,393 | $42,786 | wow |
| c32 | 8 | 6.94 | 19:35 | 356 | jobs finished | 93,556 | $21,610 | $43,220 | |
| c27 | 8 | 7.42 | 19:30 | 381 | jobs finished | 100,127 | $22,971 | $45,942 | |
| c06 | 8 | 27.46 | 65:11 | 395 | 4 running | 103,806 | $23,734 | $47,468 | |
| c00 | 8 | 8.26 | 21:00 | 393 | still running | 103,280 | $23,625 | $47,250 | * |
| c03 | 8 | 10.36 | 24:03 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c04 | 8 | 12.02 | 28:08 | 429 | still running | 112,741 | $25,585 | $51,170 | * |
| c09 | 8 | 10.36 | 24:00 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c10 | 8 | 10.21 | 23:55 | 425 | still running | 111,690 | $25,367 | $50,734 | * mean |
| c11 | 4 | | | | still running | | | | * |
| c13 | 3 | | | | still running | | | | * |
| c17 | 8 | 27.07 | 71:00 | 381 | still running | | | | * |
| c18 | 8 | 11.23 | 24:41 | 455 | still running | | | | * |
| c20 | 8 | 50.23 | 143: | 351 | still running | | | | * |
| c23 | 8 | 8.71 | 24:00 | 363 | still running | | | | * |
| c25 | 8 | 9.80 | 25:02 | 392 | still running | | | | * |
| c29 | 8 | 30.22 | 66:22 | 455 | still running | 119,574 | $27,001 | $54,002 | * |
| c31 | 8 | 10.30 | 25:10 | 409 | still running | 107,485 | $24,496 | $48,992 | * |
| c32 | 8 | 11.42 | 24:00 | 476 | still running | 123,379 | $27,789 | $55,578 | * |
| c33 | 8 | 46.23 | 113: | 409 | still running | | | | * |
| c35 | 8 | 13.69 | 28:53 | 474 | still running | | | | * |
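As a rough check on the table, the annual kWh column can be recomputed from the per-node mean with the formula quoted above, and the last dollar column is the power column doubled. The Python sketch below assumes 30 running nodes; the function names are hypothetical, and the power annual dollar figures themselves are taken from the table rather than derived.

```python
def cluster_annual_kwh(mean_watts_per_node, nodes=30):
    """Scale one node's mean draw to a year for the whole cluster."""
    return mean_watts_per_node / 1000.0 * 24 * 365 * nodes

def with_cooling(power_annual_dollars):
    """The table doubles the power cost to account for cooling."""
    return power_annual_dollars * 2

# (node, mean watts, power annual $) copied from three rows of the table
rows = [("c27", 381, 22971), ("c06", 395, 23734), ("c29", 455, 27001)]

for node, mean_watts, power_annual in rows:
    kwh = round(cluster_annual_kwh(mean_watts))
    print(node, kwh, with_cooling(power_annual))
```

For these three rows the recomputed values agree with the annual kWh and +cooling columns; a few other rows differ slightly, presumably from rounding of the displayed mean.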
1) graphical processing unit, learn more about GPU History
2) central processing unit
3) provides 40,000 total cuda-cores
4) cpu estimated
7) gpu published by Nvidia
10) at 208 Volts
11) per hour, KwH=Watts/1,000
12) only turned on when needed, perhaps 8 weeks per year
13) 2009 Wesleyan calculation
17) per hour, 1,000 Btus=293 Watts per hour
18) 2009 Wesleyan estimate
22) power and cooling
cluster/112.1368455684.txt.gz · Last modified: by hmeij
