
All campus utilities are within the physical plant chart of accounts. Departments are not charged.

Overview

| Cluster | Blue Sky | Dell | HP | GPU 1) | CPU 2) | Comment |
| | | 12/2006 | 11/2010 | 04/2013 | 04/2013 | |
| Age (yrs) | 11 | 5.5 | 1.5 | 0 | 0 | |
| Nodes (Nr) | 45 | 36 | 32 | 4 | 13 | |
| Cores (Nr) | 90 | 256 | 256 | 3) 16 gpu, 64 cpu | 208 | |
| Ram (GB) | 1,080 | 384 | 340 | 80 gpu, 512 cpu | 1,664 | |
| Teraflops | 4) 0.5-0.7 | 5) 0.66 | 6) 1.5 | 7) 18.71 + 8) 1.1 | 9) 3.6 | |
| L6-30 (Nr) | 4 | 8 | 4 | 2 | 2 | |
| Capacity 10) (Watts) | 24,960 | 49,920 | 24,960 | 12,480 | 12,480 | |
| Power (Watts 11)) | N/A 12) | 13) 11,500 | 14) 10,602 | 15) 5,900 | 16) 5,250 | |
| Cooling (Btus 17)) | | 18) 39,240 | 19) 36,175 | 20) 20,171 | 21) 17,913 | |
| Annual Costs 22) ($) | | 46,091 | 42,930 | 25,864 | 23,506 | 23) using 2009 data |
| $1,000/Tflop | | 69.8 | 28.6 | 1.3 | 6.5 | 24) using 2009 data |
  • 2009 Dell measurements put power consumption at 100,740 kWh per year (how?)
  • 2013 Dell measurements (see bottom table) put power consumption at 109,956 kWh per year (for 30 servers)
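
The $1,000/Tflop row appears to be the annual power and cooling cost divided by the teraflops figure, expressed in thousands of dollars. A minimal sketch of that arithmetic follows; the function and dictionary names are illustrative, and summing the GPU and CPU teraflops for the GPU cluster is an assumption that happens to reproduce the published 1.3.

```python
# Annual power + cooling cost ($) and teraflops per cluster, copied from the
# overview table. For the GPU cluster the 18.71 (gpu) and 1.1 (cpu) teraflops
# are added together; that is an assumption, not something the page states.
clusters = {
    "Dell": (46_091, 0.66),
    "HP": (42_930, 1.5),
    "GPU": (25_864, 18.71 + 1.1),
    "CPU": (23_506, 3.6),
}


def cost_per_tflop_k(annual_cost, tflops):
    """Annual cost per teraflop, in thousands of dollars."""
    return annual_cost / tflops / 1000


for name, (cost, tflops) in clusters.items():
    print(f"{name}: {cost_per_tflop_k(cost, tflops):.1f}")
```

Blue Sky is left out because the table lists no annual cost for it.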

Details

All dollar figures based on spreadsheet using March 2009 values

  • This will overestimate actual costs because we now have cogeneration
    • 64% off peak, 36% on peak consumption
    • includes distribution and transmission charges

Notes

  • one L6-30 connector (30 amps) can sustain 7,500 watts at 250 volts
    • we run at 208V which can sustain 6,240 watts per connector
  • wattometer example: 7 kWh consumed in 19 hours = 368 watts on average
  • 1,000 Btus per hour equal 293 watts
  • assume cost of $0.10 per kWh (kilowatt-hour)
  • annual kWh = watts/1,000 (from wattometer) * 24 hours * 365 days * 30 servers (30, not 36: 6 servers died)
  • power costs per year = $0.10/kWh * watts/1,000 * 24 hours * 365 days
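
The conversions in these notes can be sketched in a few lines. This is only an illustration of the arithmetic; the constant and function names are made up for the example, and the flat $0.10 rate is the assumption stated in the notes. The per-cluster dollar figures elsewhere on this page come from the March 2009 spreadsheet, so this flat rate will not reproduce them.

```python
VOLTS = 208                # supply voltage used in the notes
AMPS = 30                  # an L6-30 connector is rated for 30 amps
WATTS_PER_BTU_HR = 0.293   # 1,000 Btus per hour = 293 watts
DOLLARS_PER_KWH = 0.10     # flat rate assumed in the notes


def connector_capacity_watts(volts=VOLTS, amps=AMPS):
    """Watts one L6-30 connector can sustain at the given voltage."""
    return volts * amps


def mean_watts(kwh, hours):
    """Average draw in watts from a wattometer reading (kWh over hours)."""
    return kwh * 1000 / hours


def btu_to_watts(btu_per_hour):
    """Convert a cooling load in Btus per hour to watts."""
    return btu_per_hour * WATTS_PER_BTU_HR


def annual_kwh(watts, servers=1):
    """kWh per year for a constant draw, optionally scaled to N servers."""
    return watts / 1000 * 24 * 365 * servers


def annual_power_cost(watts, rate=DOLLARS_PER_KWH):
    """Yearly power cost in dollars at a flat rate per kWh."""
    return annual_kwh(watts) * rate


print(connector_capacity_watts())   # capacity per connector at 208V
print(round(mean_watts(7, 19)))     # the wattometer example
print(round(btu_to_watts(39_240)))  # Dell cooling load in watts
print(round(annual_kwh(11_500)))    # Dell annual kWh
```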

HP Nov-2010

  • 1.5 years old
  • 32 blade servers in 8 2U-shelves
  • 4 L6-30
    • 10,602 watts = 92,856 kwh annual = $21,465
    • 36,175 btus = 10,600 watts = 92,856 kwh annual = $21,465
  • total costs: $42,930/year
  • “roughly” $4,770 per 2U shelf (8 2U-shelves + 1 head node of 2U)

Dell Jan-2007

  • 5.5 years old
  • 36 1U rack servers in 18 2U-shelves
  • 8 L6-30
    • 11,500 watts = 100,740 kwh annual = $23,098
    • 39,240 btus = 11,497 watts = 100,713 kwh annual = $23,093
  • total costs: $46,091/year
  • “roughly” $2,310 per 2U shelf (18 2U-shelves + 2 head nodes of 2U)

CPU HPC 2012

  • new acquisition
  • 13 1U rack servers in 6.5 2U-shelves
  • 2 L6-30
    • 5,250 watts = 45,990 kwh annual = $11,754
    • 17,913 btus = 5,249 watts = 45,981 kwh annual = $11,752
  • total costs: $23,506/year
  • “roughly” $3,134 per 2U shelf (6.5 2U-shelves + 1 head node of 2U)

GPU HPC 2012

  • new acquisition
  • 4 rack servers in 4 2U-shelves
  • 2 L6-30
    • 5,900 watts = 51,684 kwh annual = $12,934
    • 20,171 btus = 5,898 watts = 51,667 kwh annual = $12,930
  • total costs: $25,864/year
  • “roughly” $2,874 per 2U shelf (8 2U-shelves + 1 head node of 2U)

Replace Dell

This is what the picture would look like using the 2009 spreadsheet data

What we should look at is teraflops replacement …

  • the Dell cluster delivers 0.66 teraflops, measured with Linpack (actual)
  • E5-2660 is rated at 140 Gflops so one dual-chip node provides 280 Gflops (theoretical)
  • about 2 1/3 1U nodes (0.66 / 0.28) would then replace the cluster, so round up to 3
  • acquisition costs
    • 3 x $5,782 = $17,346 + PDUs, ethernet switch, cables
    • 48 cores, 840 Gflops
    • power/AC costs: 3 nodes x ($23,506 / 13 nodes) x 3 years = $16,273
  • add both costs of the new hardware and divide by the Dell total annual power + cooling costs ($46,091)
    • ROI: 3/4 year, but this is on a teraflop by teraflop basis

And if we double the teraflops to roughly 1.6 Tflops (a net gain of about 1 Tflops over the Dell), the ROI is 1.5 years for 96 cores.
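
The ROI figures above can be reproduced with a short calculation. This is a sketch of the arithmetic as described, not the original spreadsheet: the names are illustrative, PDUs, switch and cables are left out because no prices are given, and treating 96 cores as 6 nodes (16 cores per node, from 48 cores in 3 nodes) is an assumption.

```python
NODE_PRICE = 5_782            # quoted price per 1U node ($)
CPU_CLUSTER_ANNUAL = 23_506   # annual power + cooling, 13-node CPU cluster ($)
CPU_CLUSTER_NODES = 13
DELL_ANNUAL = 46_091          # Dell annual power + cooling ($)
YEARS = 3                     # period over which the new nodes are powered


def payback_years(nodes):
    """Years of Dell power + cooling needed to cover the new nodes.

    Adds the hardware cost to YEARS of power/AC for the new nodes, then
    divides by the Dell's total annual power + cooling cost.
    """
    hardware = nodes * NODE_PRICE
    power_ac = nodes * (CPU_CLUSTER_ANNUAL / CPU_CLUSTER_NODES) * YEARS
    return (hardware + power_ac) / DELL_ANNUAL


print(f"3 nodes: {payback_years(3):.2f} years")
print(f"6 nodes: {payback_years(6):.2f} years")
```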

L6-30 Connections

  • Dell
    • left rack (from rear) LCBR 01, 09, 08, 02 and UPS on 01
    • right rack (from rear) LCBR 03, 04 and UPS on 07
  • BSS
    • LCBR CIR 25/27, 21/23, 22/24, 26/28
  • HP
    • LCBR CIR 29/31, 28/30
    • LCBR 06
    • (#3 on Enterprise UPS) PDI1 CIR 31,33

Dell April-2013

Actual data obtained April - May, 2013

Kill-A-Watt Meter Stats; pulled one power supply … mean/1000 * 24 * 365 * 30!

| Node | jobs | kwh | time hh:mm | mean (watts) | comment | annual kwh | power annual | +cooling annual (x2) | |
| c04 | 8 | 2.29 | 06:30 | 352 | idle, no jobs | 92,509 | $21,393 | $42,786 | wow |
| c32 | 8 | 6.94 | 19:35 | 356 | jobs finished | 93,556 | $21,610 | $43,220 | |
| c27 | 8 | 7.42 | 19:30 | 381 | jobs finished | 100,127 | $22,971 | $45,942 | |
| c06 | 8 | 27.46 | 65:11 | 395 | 4 running | 103,806 | $23,734 | $47,468 | |
| c00 | 8 | 8.26 | 21:00 | 393 | still running | 103,280 | $23,625 | $47,250 | * |
| c03 | 8 | 10.36 | 24:03 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c04 | 8 | 12.02 | 28:08 | 429 | still running | 112,741 | $25,585 | $51,170 | * |
| c09 | 8 | 10.36 | 24:00 | 432 | still running | 113,530 | $25,748 | $51,496 | * |
| c10 | 8 | 10.21 | 23:55 | 425 | still running | 111,690 | $25,367 | $50,734 | * |
| c17 | 8 | 27.07 | 71:00 | 381 | still running | | | | * |
| c18 | 8 | 11.23 | 24:41 | 455 | still running | | | | * |
| c20 | 8 | 50.23 | 143: | 351 | still running | | | | * |
| c23 | 8 | 8.71 | 24:00 | 363 | still running | | | | * |
| c25 | 8 | 9.80 | 25:02 | 392 | still running | | | | * |
| c29 | 8 | 30.22 | 66:22 | 455 | still running | 119,574 | $27,001 | $54,002 | * |
| c31 | 8 | 10.30 | 25:10 | 409 | still running | 107,485 | $24,496 | $48,992 | * |
| c32 | 8 | 11.42 | 24:00 | 476 | still running | 123,379 | $27,789 | $55,578 | * |
| c33 | 8 | 46.23 | 113: | 409 | still running | | | | * |
| c35 | 8 | 13.69 | 28:53 | 474 | still running | | | | * |
  • Average for “still running” nodes (marked *) is 418.4 watts or 109,956 kWh/year
  • (watts/1,000 = kW) * 24 hours * 365 days * 30 servers
  • In 2009 dollars that is $25,000 in power (plus that in cooling) or $50,000 per year
  • What is that in 2013 dollars (with cogen online in Pine Street)?
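
The 418.4 watt average and the 109,956 kWh/year figure can be checked against the table. The sketch below copies the mean column of the 15 rows commented “still running” and applies the formula from the list above; the variable and function names are illustrative only.

```python
# Mean watts for the 15 rows commented "still running" in the table above.
still_running_watts = [
    393, 432, 429, 432, 425, 381, 455, 351,
    363, 392, 455, 409, 476, 409, 474,
]
SERVERS = 30  # number of Dell compute nodes still in service


def average_watts(readings):
    """Plain arithmetic mean of the per-node readings."""
    return sum(readings) / len(readings)


def annual_kwh(watts, servers=SERVERS):
    """(watts / 1,000) * 24 hours * 365 days * number of servers."""
    return watts / 1000 * 24 * 365 * servers


avg = average_watts(still_running_watts)
print(f"average: {avg:.1f} watts")
print(f"annual:  {annual_kwh(avg):,.0f} kWh")
```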

We measured the power consumption of 19 nodes (pulling one power supply out of each) with a Kill-A-Watt meter over 775+ hours and arrived at a mean consumption rate of 418.4 watts. That totals 109,956 kWh/year in power consumption ((watts/1,000 = kW) * 24 hours * 365 days * 30 servers). This is a low water mark because the racks in question also contain switches, two UPS units, and a disk array. (Note: Peter did a side calculation using the 4 watt measurement on panel directly and came up with 126,000 kWh/year, which can be considered the high water mark estimate.)

Based on 12.5 cents per kWh (an all-inclusive cost covering natural gas, heat recoup, distribution, maintenance, etc.) the hardware burns away $13,744.50 per year. Best guess is that cooling costs are at least that much (another possible low water mark), so the total cost for all power consumption is $27,489 per year. If we run the hardware for another 3 years, that total cost is $82,467.

If we could replace, or approximate, the 30 compute nodes' computational power (0.6 teraflops) and job slots (240 cores) with new hardware that consumes 50% less in power, our ROI is 6 years based on the low water mark numbers. When using the high water mark the ROI is 5 years.
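
A rough sketch of how the 6 and 5 year figures can be reached. It assumes the replacement hardware costs about $82K (the figure used in the quotes section below, close to the 3 year running cost), that cooling costs equal power costs, and that ROI here means purchase price divided by annual savings. None of this is confirmed as the original calculation, and the names are illustrative.

```python
RATE = 0.125             # dollars per kWh, all-inclusive
LOW_KWH = 109_956        # low water mark, from the Kill-A-Watt mean
HIGH_KWH = 126_000       # high water mark, from the panel side calculation
HARDWARE = 82_000        # assumed purchase price of replacement hardware ($)
SAVINGS_FRACTION = 0.5   # new hardware consumes 50% less power


def annual_total_cost(kwh, rate=RATE):
    """Power cost plus an equal amount for cooling (the best guess above)."""
    return 2 * kwh * rate


def payback_years(kwh, hardware=HARDWARE, savings=SAVINGS_FRACTION):
    """Years until the yearly power + cooling savings cover the hardware."""
    return hardware / (annual_total_cost(kwh) * savings)


print(f"annual cost, low water mark: ${annual_total_cost(LOW_KWH):,.0f}")
print(f"payback, low water mark:  {payback_years(LOW_KWH):.1f} years")
print(f"payback, high water mark: {payback_years(HIGH_KWH):.1f} years")
```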

Quotes Oct-2013

For $82K one could get CPU-HPC wise …

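| vendor | Tnodes | chip | ghz | cache | cores | Tcores | disk | ram | ram/core | Tram | pdu | eswitch | warr | watt | cost | gflops |
| dell | 30 | X5355 | 2.66 | 4mb | 8 | 240 | 80gb | 4-16gb | 0.5-2gb | 340 | Y | Y | N | 12,552 | na | 670 |
| micro | 14 | 2650v2 | 2.60 | 20mb | 16 | 224 | 1tb | 128gb | 8gb | 1,792gb | Y | Y | 2 | 5,400 | $77K | 4,659 |
| advcl | 10 | 2680v2 | 2.80 | 25mb | 20 | 200 | 60gb | 256gb | 12.8gb | 2,560gb | Y | Y | 3 | 4,597 | $84K | 4,480 |
| exxct | 12 | 2660v2 | 2.20 | 25mb | 20 | 240 | 1tb | 128gb | 6.4gb | 1,536gb | N | N | ? | 4,860 | $60K | 4,224 |
| hp | | | | | | | | | | | | | | | | |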
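
The gflops values for the three quoted systems appear consistent with a theoretical peak of total cores x GHz x 8 double-precision flops per cycle. That factor of 8 is an assumption based on AVX-capable Xeon E5 v2 chips; the page does not say how the column was computed. The Dell value of 670 does not follow this formula and is left out. The sketch below uses illustrative names and also derives dollars per Gflop from the quoted prices.

```python
FLOPS_PER_CYCLE = 8  # assumed double-precision flops per core per cycle (AVX)

# vendor: (total cores, GHz, quoted gflops, quoted cost in $), from the table
quotes = {
    "micro": (224, 2.60, 4_659, 77_000),
    "advcl": (200, 2.80, 4_480, 84_000),
    "exxct": (240, 2.20, 4_224, 60_000),
}


def theoretical_gflops(total_cores, ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Peak Gflops under the assumed flops-per-cycle figure."""
    return total_cores * ghz * flops_per_cycle


def dollars_per_gflop(cost, gflops):
    """Quoted purchase price divided by quoted Gflops."""
    return cost / gflops


for vendor, (cores, ghz, quoted, cost) in quotes.items():
    peak = theoretical_gflops(cores, ghz)
    print(f"{vendor}: {peak:,.0f} Gflops peak (quoted {quoted:,}), "
          f"${dollars_per_gflop(cost, quoted):.2f} per Gflop")
```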



1)
graphics processing unit
2)
central processing unit
3)
provides 40,000 total cuda-cores
4)
cpu estimated
5) , 6)
cpu measured with Linpack
7)
gpu published by Nvidia
8) , 9)
cpu theoretical
10)
at 208 Volts
11)
per hour, KwH=Watts/1,000
12)
only turned on when needed, perhaps 8 weeks per year
13)
2009 Wesleyan calculation
14) , 15) , 16) , 19) , 20) , 21)
vendor published
17)
per hour, 1,000 Btus=293 Watts per hour
18)
2009 Wesleyan estimate
22)
power and cooling
23) , 24)
for Wesleyan/vendors data
cluster/112.txt · Last modified: 2013/10/08 19:04 by hmeij