All campus utilities are within the physical plant chart of accounts. Departments are not charged.
Overview
| Cluster | Blue Sky | Dell | HP | GPU 1) | CPU 2) | Comment |
|---|---|---|---|---|---|---|
| | | 12/2006 | 11/2010 | 04/2013 | 04/2013 | |
| Age (yrs) | 11 | 5.5 | 1.5 | 0 | 0 | |
| Nodes (Nr) | 45 | 36 | 32 | 4 | 13 | |
| Cores (Nr) | 90 | 256 | 256 | 3) 16 gpu, 64 cpu | 208 | |
| Ram (GB) | 1,080 | 384 | 340 | 80 gpu, 512 cpu | 1,664 | |
| Teraflops | 4) 0.5-0.7 | 5) 0.66 | 6) 1.5 | 7) 18.71 + 8) 1.1 | 9) 3.6 | |
| L6-30 (Nr) | 4 | 8 | 4 | 2 | 2 | |
| Capacity 10) (Watts) | 24,960 | 49,920 | 24,960 | 12,480 | 12,480 | |
| Power (Watts 11)) | N/A 12) | 13) 11,500 | 14) 10,602 | 15) 5,900 | 16) 5,250 | |
| Cooling (Btus 17)) | | 18) 39,240 | 19) 36,175 | 20) 20,171 | 21) 17,913 | |
| Annual Costs 22) ($) | | 46,091 | 42,930 | 25,864 | 23,506 | 23) using 2009 data |
| $1,000/Tflop | | 69.8 | 28.6 | 1.3 | 6.5 | 24) using 2009 data |
- 2009 Dell measurements put power consumption at 100,740 KwH per year (how?)
- 2013 Dell measurements (see bottom table) put power consumption at 109,956 KwH per year (for 30 servers)
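The two derived rows of the overview table (capacity and $1,000/Tflop) can be reproduced with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, the inputs are copied from the table, the 6,240 watts per connector comes from the Notes section, and the GPU teraflops are assumed to be the sum of the two listed values (18.71 + 1.1).

```python
# Sustained watts per L6-30 connector at 208 V (figure from the Notes section).
WATTS_PER_L6_30 = 6_240

# Inputs copied from the overview table. Blue Sky is left out because
# the table lists no annual cost for it.
clusters = {
    "Dell": {"l6_30": 8, "annual_cost": 46_091, "tflops": 0.66},
    "HP": {"l6_30": 4, "annual_cost": 42_930, "tflops": 1.5},
    "GPU": {"l6_30": 2, "annual_cost": 25_864, "tflops": 18.71 + 1.1},  # assumed sum
    "CPU": {"l6_30": 2, "annual_cost": 23_506, "tflops": 3.6},
}


def capacity_watts(l6_30_count):
    """Electrical capacity of a cluster given its number of L6-30 connectors."""
    return l6_30_count * WATTS_PER_L6_30


def cost_per_tflop_thousands(annual_cost, tflops):
    """Annual power + cooling cost per teraflop, in thousands of dollars."""
    return annual_cost / tflops / 1_000


for name, c in clusters.items():
    print(
        f"{name}: capacity {capacity_watts(c['l6_30']):,} W, "
        f"{cost_per_tflop_thousands(c['annual_cost'], c['tflops']):.1f} k$/Tflop"
    )
```

Read this way, the last row of the table is the annual cost divided by the teraflops, which is why the GPU cluster looks so cheap per teraflop.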
Details
All dollar figures based on spreadsheet using March 2009 values
- This will overestimate actual costs because now we have cogen generation
- 64% off peak, 36% on peak consumption
- includes distribution and transmission charges
Notes
- one L6-30 connector can sustain 7,500 watts at 250 volts
- we run at 208V which can sustain 6,240 watts per connector
- wattometer: example 7 Kwh consumed in 19 hours = 368 watts on average
- 1,000 Btus per hour equal 293 watts
- assume cost of $0.10 per Kwh (kilowatt hour)
- annual Kwh = watts/1000 (from wattometer) * 24 * 365 * 30 servers (6 servers died)
- power costs per year = $0.10/Kwh * watts/1,000 * 24 hours * 365 days
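The conversions in these notes can be written down as small helper functions. This is a sketch under the notes' own assumptions: the function names are invented for illustration, the 30 amp rating is what the 7,500 watts at 250 volts figure implies for an L6-30, and the flat $0.10 per Kwh rate is the simplification stated above. The dollar figures elsewhere on this page come from the March 2009 spreadsheet and may not match a flat rate.

```python
HOURS_PER_YEAR = 24 * 365        # 8,760 hours
WATTS_PER_BTU_PER_HOUR = 0.293   # from the note: 1,000 Btus equal 293 watts
RATE_PER_KWH = 0.10              # assumed flat rate from the note, in dollars
L6_30_AMPS = 30                  # implied by 7,500 watts at 250 volts


def mean_watts(kwh, hours):
    """Average draw in watts from a meter reading of kwh over a number of hours."""
    return kwh * 1000 / hours


def btus_to_watts(btus_per_hour):
    """Convert a cooling load in Btus per hour to watts."""
    return btus_per_hour * WATTS_PER_BTU_PER_HOUR


def connector_capacity_watts(volts, amps=L6_30_AMPS):
    """Watts one connector can sustain at the given voltage."""
    return volts * amps


def annual_kwh(watts, servers=1):
    """Kwh consumed per year by `servers` machines drawing `watts` each."""
    return watts / 1000 * HOURS_PER_YEAR * servers


def annual_power_cost(watts, servers=1, rate=RATE_PER_KWH):
    """Yearly power cost in dollars at a flat rate per Kwh."""
    return annual_kwh(watts, servers) * rate


print(round(mean_watts(7, 19)))        # wattometer example
print(connector_capacity_watts(208))   # per-connector capacity at 208 V
print(round(annual_kwh(5250)))         # CPU cluster draw of 5,250 watts
```

Keeping the rate as a parameter makes it easy to rerun the numbers with a different price per Kwh.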
HP Nov-2010
- 1.5 years old
- 32 blade servers in 8 2U-shelves
- 4 L6-30
- 10,602 watts = 92,856 kwh annual = $21,465
- 36,175 btus = 10,600 watts = 92,856 kwh annual = $21,465
- total costs: $42,930/year
- “roughly” $4,770 per 2U-shelves (8 2U-shelves + 1 head node of 2U)
Dell Jan-2007
- 5.5 years old
- 36 1U rack servers in 18 2U-shelves
- 8 L6-30
- 11,500 watts = 100,740 kwh annual = $23,098
- 39,240 btus = 11,497 watts = 100,713 kwh annual = $23,093
- total costs: $46,091/year
- “roughly” $2,310 per 2U-shelves (18 2U-shelves + 2 head nodes of 2U)
CPU HPC 2012
- new acquisition
- 13 1U rack servers in 6.5 2U-shelves
- 2 L6-30
- 5,250 watts = 45,990 kwh annual = $11,754
- 17,913 btus = 5,249 watts = 45,981 kwh annual = $11,752
- total costs: $23,506/year
- “roughly” $3,134 per 2U-shelves (6.5 2U-shelves + 1 head node of 2U)
GPU HPC 2012
- new acquisition
- 4 rack servers in 4 2U-shelves
- 2 L6-30
- 5,900 watts = 51,684 kwh annual = $12,934
- 20,171 btus = 5,898 watts = 51,667 kwh annual = $12,930
- total costs: $25,864/year
- “roughly” $2,874 per 2U-shelves (8 2U-shelves + 1 head node of 2U)
Replace Dell
This is what the picture would look like using the 2009 spreadsheet data
What we should look at is teraflops replacement …
- 0.66 teraflops, measured with Linpack (actual)
- E5-2660 is rated at 140 Gflops so one node provides 280 Gflops (theoretical)
- 2 and 1/3rd 1U nodes then replace the cluster
- acquisition costs
  - 3 x $5,782 = $17,346 + PDUs, ethernet switch, cables
  - 48 cores, 840 Gflops
- power/AC costs: 3 nodes x ($23,506 / 13 nodes) x 3 years = $16,273
- add both costs of the new hardware and divide by the Dell total annual costs (power + cooling): ($17,346 + $16,273) / $46,091 = 0.73
- ROI: 3/4 year, but this is on a teraflop by teraflop basis
And if we double the teraflops…1.6 Tflops (actually + 1 Tflops), 1.5 years ROI for 96 cores.
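The replacement arithmetic above can be checked with a short script. It is a sketch, not the spreadsheet itself: the names are illustrative, two E5-2660 chips per node and 16 cores per node are inferred from the 280 Gflops and 48 cores figures, and the acquisition cost leaves out the PDUs, ethernet switch, and cables because no prices are given for them.

```python
import math

DELL_TFLOPS = 0.66            # measured with Linpack
NODE_GFLOPS = 2 * 140         # assumed two E5-2660 chips per node (theoretical)
NODE_PRICE = 5_782            # dollars per 1U node
CORES_PER_NODE = 16           # inferred from 48 cores across 3 nodes
CPU_CLUSTER_ANNUAL = 23_506   # yearly power + cooling of the 13-node CPU cluster
CPU_CLUSTER_NODES = 13
DELL_ANNUAL = 46_091          # yearly power + cooling of the Dell cluster
YEARS = 3                     # period over which the new nodes are run


def payback_years(nodes):
    """(hardware + 3 years of power/AC) divided by the Dell annual cost."""
    acquisition = nodes * NODE_PRICE
    power = nodes * (CPU_CLUSTER_ANNUAL / CPU_CLUSTER_NODES) * YEARS
    return (acquisition + power) / DELL_ANNUAL


# Whole nodes needed to match the Dell cluster teraflop for teraflop.
nodes = math.ceil(DELL_TFLOPS * 1000 / NODE_GFLOPS)

for n in (nodes, 2 * nodes):
    print(
        f"{n} nodes: {n * CORES_PER_NODE} cores, "
        f"{n * NODE_GFLOPS / 1000:.2f} Tflops, "
        f"payback {payback_years(n):.2f} years"
    )
```

Doubling the node count doubles both cost terms, so the payback period doubles as well.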
L6-30 Connections
- Dell
  - left rack (from rear) LCBR 01, 09, 08, 02 and UPS on 01
  - right rack (from rear) LCBR 03, 04 and UPS on 07
- BSS
  - LCBR CIR 25/27, 21/23, 22/24, 26/28
- HP
  - LCBR CIR 29/31, 28/30
  - LCBR 06
  - (#3 on Enterprise UPS) PDI1 CIR 31,33
Dell April-2013
Actual data obtained April - May, 2013
Kill-A-Watt meter stats; pulled one power supply … annual kwh = mean/1000 * 24 * 365 * 30 servers
| Node | jobs | kwh | time hh:mm | mean (watts) | comment | annual kwh | power annual | +cooling annual (x2) | |
|---|---|---|---|---|---|---|---|---|---|
| c04 | 8 | 2.29 | 06:30 | 352 | idle, no jobs | 92,509 | $21,393 | $42,786 | wow |
| c32 | 8 | 6.94 | 19:35 | 356 | jobs finished | 93,556 | $21,610 | $43,220 | |
| c27 | 8 | 7.42 | 19:30 | 381 | jobs finished | 100,127 | $22,971 | $45,942 | |
| c06 | 8 | 27.46 | 65:11 | 395 | 4 running | 103,806 | $23,734 | $47,468 | |
| c00 | 8 | 8.26 | 21:00 | 393 | still running | 103,280 | $23,625 | $47,250 | * |
| c03 | 8 | 10.36 | 24:03 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c04 | 8 | 12.02 | 28:08 | 429 | still running | 112,741 | $25,585 | $51,170 | * |
| c09 | 8 | 10.36 | 24:00 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c10 | 8 | 10.21 | 23:55 | 425 | still running | 111,690 | $25,367 | $50,734 | * |
| c17 | 8 | 27.07 | 71:00 | 381 | still running | | | | * |
| c18 | 8 | 11.23 | 24:41 | 455 | still running | | | | * |
| c20 | 8 | 50.23 | 143: | 351 | still running | | | | * |
| c23 | 8 | 8.71 | 24:00 | 363 | still running | | | | * |
| c25 | 8 | 9.80 | 25:02 | 392 | still running | | | | * |
| c29 | 8 | 30.22 | 66:22 | 455 | still running | 119,574 | $27,001 | $54,002 | * |
| c31 | 8 | 10.30 | 25:10 | 409 | still running | 107,485 | $24,496 | $48,992 | * |
| c32 | 8 | 11.42 | 24:00 | 476 | still running | 123,379 | $27,789 | $55,578 | * |
| c33 | 8 | 46.23 | 113: | 409 | still running | | | | * |
| c35 | 8 | 13.69 | 28:53 | 474 | still running | | | | * |
- Average for “still running” nodes is 418.4 watts or 109,956 KwH/year
- (watts/1000 Kw per hour) * 24 hours * 365 days * 30 servers
- In 2009 dollars that is $25,000 in power (plus that in cooling) or $50,000 per year
- What is that in 2013 dollars (with cogen online in Pine Street)?
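The 418.4 watt average and the 109,956 KwH/year figure follow directly from the starred rows of the table. The sketch below copies the mean readings of the "still running" nodes by hand; the variable and function names are illustrative only.

```python
# Mean watts of the nodes marked "still running" in the table, in table order
# (c00, c03, c04, c09, c10, c17, c18, c20, c23, c25, c29, c31, c32, c33, c35).
STILL_RUNNING_WATTS = [
    393, 432, 429, 432, 425, 381, 455, 351, 363, 392, 455, 409, 476, 409, 474,
]
SERVERS = 30  # servers still alive in the Dell cluster


def cluster_annual_kwh(mean_watts, servers=SERVERS):
    """(watts / 1000) * 24 hours * 365 days * number of servers."""
    return mean_watts / 1000 * 24 * 365 * servers


average_watts = sum(STILL_RUNNING_WATTS) / len(STILL_RUNNING_WATTS)

print(f"average draw: {average_watts:.1f} watts")
print(f"annual consumption: {cluster_annual_kwh(average_watts):,.0f} KwH")
```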
We measured the power consumption of 19 nodes (with one power supply pulled) with a Kill-A-Watt meter over 775+ hours and arrived at a mean consumption rate of 418.4 watts. That totals 109,956 KwH/year in power consumption ((watts/1000) * 24 hours * 365 days * 30 servers), which is a low water mark because the racks in question also contain switches, two UPS units, and a disk array. (Note: Peter did a side calculation using the 4 watt measurement on panel directly and came up with 126,000 KwH/year, which can be considered the high water mark estimate.)
Based on 12.5 cents per Kwh (an all-inclusive cost covering natural gas, heat recoup, distribution, maintenance, etc.), the hardware burns away $13,744.50 per year. Our best guess is that cooling costs at least as much (another possible low water mark), so the total cost for all power consumption is $27,489 per year. If we run the hardware for another 3 years, that total cost is $82,467.
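The cost projection in the paragraph above is simple enough to script, which also makes it easy to rerun with the high water mark of 126,000 KwH/year. This is a sketch with illustrative names; it assumes, as the paragraph does, that cooling costs the same as power.

```python
RATE_PER_KWH = 0.125   # all-inclusive cost in dollars per Kwh
COOLING_FACTOR = 2     # assumption: cooling costs as much as power
YEARS = 3


def projection(annual_kwh, rate=RATE_PER_KWH, years=YEARS):
    """Return (power per year, power + cooling per year, total over `years`)."""
    power = annual_kwh * rate
    total_per_year = power * COOLING_FACTOR
    return power, total_per_year, total_per_year * years


for label, kwh in (("low water mark", 109_956), ("high water mark", 126_000)):
    power, per_year, total = projection(kwh)
    print(
        f"{label}: power ${power:,.2f}/year, "
        f"with cooling ${per_year:,.2f}/year, "
        f"{YEARS} years ${total:,.2f}"
    )
```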
If we could replace, or approximate, the 30 compute nodes' computational power (0.6 teraflops) and job slots (240 cores) with new hardware that consumes 50% less in power, our ROI is 6 years based on the low water mark numbers. When using the high water mark the ROI is 5 years.
| vendor | nodes | chip | ghz | cache | cores | Tcores | disk | ram | ram/core | pdu | eswitch | warranty | watts | cost | tflops |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| dell | 30 | X5355 | 2.66 | 4 | 8 | 240 | 80 | 4-16 | 0.5-2 | Y | Y | N | na | | 0.67 |
