\\ **[[cluster:0|Back]]** **All campus utilities are within the physical plant chart of accounts. Departments are not charged.** ====== Overview ====== ^ Cluster^ Blue Sky^ Dell^ HP^ GPU((graphical processing unit, learn more about [[cluster:107|GPU History]]))^ CPU((central processing unit))^ Comment^ ^ ^ ^ 12/2006^ 11/2010^ 04/2013^ 04/2013^ ^ | Age (yrs)| 11| 5.5| 1.5| 0| 0| | | Nodes (Nr)| 45| 36| 32| 4| 13| | | Cores (Nr)| 90| 256| 256| ((provides 40,000 total cuda-cores))16 gpu, 64 cpu| 208| | | Ram (GB)| 1,080| 384| 340| 80 gpu, 512 cpu| 1,664| | | Teraflops| ((cpu estimated))0.5-0.7| ((cpu measured with Linpack))0.66| ((cpu measured with Linpack))1.5| ((gpu published by Nvidia))18.71 + ((cpu theoretical))1.1| ((cpu theoretical))3.6| | | L6-30 (Nr)| 4| 8| 4| 2| 2| | | Capacity((at 208 Volts)) (Watts)| 24,960| 49,920| 24,960| 12,480| 12,480| | | Power (Watts((per hour, KwH=Watts/1,000)))| N/A((only turned on when needed, perhaps 8 weeks per year))| ((2009 Wesleyan calculation))11,500| ((vendor published))10,602| ((vendor published))5,900| ((vendor published))5,250| | | Cooling (Btus((per hour, 1,000 Btus=293 Watts per hour)))| | ((2009 Wesleyan estimate))39,240| ((vendor published))36,175| ((vendor published))20,171| ((vendor published))17,913| | | Annual Costs((power and cooling)) ($)| | 46,091| 42,930| 25,864| 23,506| ((for Wesleyan/vendors data))using 2009 data| | $1,000/Tflop| | 69.8| 28.6| 1.3| 6.5| ((for Wesleyan/vendors data))using 2009 data| * 2009 Dell measurements put power consumption at 100,740 KwH per year (how?) * 2013 Dell measurements (see bottom table) put power consumption at 109,956 KwH per year (for 30 servers) ====== Details ====== ** All dollar figures based on spreadsheet using March 2009 values ** * This will overestimate actual costs because now we have cogen generation * 64% off peak, 36% on peak consumption * includes distribution and transmission charges ===== Notes ===== * one L6-30 connector can sustain 7,500 watts at 250 volts * we run at 208V which can sustain 6,240 watts per connector * wattometer: example 7 Kwh consumed in 19 hours = 368 watts per hour on average * 1,000 Btus equal 293 watts * assume cost of $ 0.10 per Kwh (kilo watt hour) * annual Kwh = watts/1000 (from wattometer) * 24 * 365 * 30! (6 servers died) * power costs per year = $0.10/Kwh * watts/1,000 * 24 hours * 365 days ===== HP Nov-2010 ===== * 1.5 years old * 32 blade servers in 8 2U-shelves * 4 L6-30 * 10,602 watts = 92,856 kwh annual = $21,465 * 36,175 btus = 10,600 watts = 92,856 kwh annual = $21,465 * total costs: $42,930/year * "roughly" $4,770 per 2U-shelves (8 2U-shelves + 1 head node of 2U) ===== Dell Jan-2007 ===== * 5.5 years old * 36 1U rack servers in 18 2U-shelves * 8 L6-30 * 11,500 watts = 100,740 kwh annual = $23,098 * 39,240 btus = 11,497 watts = 100,713 kwh annual = $23,093 * total costs: $46,091/year * "roughly" $2,310 per 2U-shelves (18 2U-shelves + 2 head nodes of 2U) ===== CPU HPC 2012 ===== * new acquisition * 13 1U rack servers in 6.5 2U-shelves * 2 L6-30 * 5,250 watts = 45,990 kwh annual = $11,754 * 17,913 btus = 5,249 watts = 45,981 kwh annual= $11,752 * total costs: $23,506/year * "roughly" $3,134 per 2U-shelves (6.5 2U-shelves + 1 head node of 2U) ===== GPU HPC 2012 ===== * new acquisition * 4 rack servers in 4 2U-shelves * 2 L6-30 * 5,900 watts = 51,684 kwh annual = $12,934 * 20,171 btus = 5,898 watts = 51,667 kwn annual = $12,930 * total costs: $25,864 * "roughly" $2,874 per 2U-shelves (8 2U-shelves + 1 head node of 2U) * ===== Replace Dell ===== ** This is what the picture would look like using the 2009 spreadsheet data** What we should look at is teraflops replacement ... * 0.66 teraflops, measured with Linpack (actual) * E5-2660 is rated at 140 Gflops so one node provides 280 Gflops (theoretical) * 2 and 1/3rd 1U nodes then replace the cluster * acquisition costs * 3 x $5,782 = $17,346 + PDUs, ethernet switch, cables * 48 cores, 840 Gflops * power/AC costs: 3($23,506/13) x3 years = $16,273 * add both costs of new hardware and divide by dell total annual costs power+cooling * ROI: 3/4 year - but this is on a teraflop by teraflop basis And if we double the teraflops...1.6 Tflops (actually + 1 Tflops), 1.5 years ROI for 96 cores. ===== L6-30 Connections ===== * Dell * left rack (from rear) LCBR 01, 09, 08, 02 and UPS on 01 * right rack (from rear) LCBR 03, 04 and UPS on 07 * BSS * LCBR CIR 25/27, 21/23, 22/24, 26/28 * HP * LCBR CIR 29/31, 28/30 * LCBR 06 * (#3 on Enterprise UPS) PDI1 CIR 31,33 ====== Dell April-2013 ====== ** Actual data obtained April - May, 2013 ** Kill-A-Watt Meter Stats; pulled one power supply ... mean/1000 * 24 * 365 * 30! ^ Node^ jobs^ kwh^ time hh:mm^ mean^ comment^ annual kwh^ power annual^ +cooling annual (x2)^ ^ | c04| 8| 2.29| 06:30| 352| idle, no jobs| 92,509| $21,393| $42,786| wow| | c32| 8| 6.94| 19:35| 356| jobs finished| 93,556| $21,610| $43,220| | | c27| 8| 7.42| 19:30| 381| jobs finished| 100,127| $22,971| $45,942| | | c06| 8| 27.46| 65:11| 395| 4 running| 103,806| $23,734| $47,468| | | c00| 8| 8.26| 21:00| 393| still running| 103,280| $23,625| $47,250| *| | c03| 8| 10.36| 24:03| 432| still running| 113,530| $25,748| $51,496| *| | c04| 8| 12.02| 28:08| 429| still running| 112,741| $25,585| $51,170| *| | c09| 8| 10.36| 24:00| 432| still running| 113,530| $25,748| $51,496| *| | c10| 8| 10.21| 23:55| 425| still running| 111,690| $25,367| $50,734| *| | c17| 8| 27.07| 71:00| 381| still running| | | | *| | c18| 8| 11.23| 24:41| 455| still running| | | | *| | c20| 8| 50.23| 143:| 351| still running| | | | *| | c23| 8| 8.71| 24:00| 363| still running| | | | *| | c25| 8| 9.80| 25:02| 392| still running| | | | *| | c29| 8| 30.22| 66:22| 455| still running| 119,574| $27,001| $54,002| *| | c31| 8| 10.30| 25:10| 409| still running| 107,485| $24,496| $48,992| *| | c32| 8| 11.42| 24:00| 476| still running| 123,379| $27,789| $55,578| *| | c33| 8| 46.23| 113:| 409| still running| | | | *| | c35| 8| 13.69| 28:53| 474| still running| | | | *| * Average for "still running" nodes is 418.4 watts or 109,956 KwH/year * (watts/1000 Kw per hour) * 24 hours * 365 days * 30 servers * In 2009 dollars that is $25,000 in power (plus that in cooling) or $50,000 per year * What is that in 2013 dollars (with cogen online in Pine Street)? We have measured 19 nodes power consumption (pulling one unit out) with a Kill-A-Watt meter over 775+ hours to arrive at a mean consumption rate of 418.4 watts. That totals to 109,956 KwH/year in power consumption ((watts/1000 Kw per hour) * 24 hours * 365 days * 30 servers), which is a low water mark as the racks in question also contain switches, two UPS units, and a disk array. (Note: Peter did a side calculation using the 4 watt measurement on panel directly and came up with 126,000 KwH/year which can be considered the high water mark estimate). Based on 12.5 cents per Kwh (this is an all inclusive cost including natural gas cost, heat recoup costs, distribution, maintenance etc) the hardware burns away $13,744.50 per year. Best guess is cooling costs are at least that (another possible low water mark) so the total cost for all power consumption is $27,489 per year. If we run the hardware for another 3 years that total cost is $82,467. If we could replace, or approximate, the 30 compute nodes' computational power (0.6 teraflops) and job slots (240 cores) with new hardware that consumes 50% less in power, our ROI is 6 years based on the low water mark numbers. When using the high water mark the ROI is 5 years. ====== Quotes Oct-2013 ====== For $82K one could get CPU-HPC wise ... *[[http://download.intel.com/support/processors/xeon/sb/xeon_E5-2600.pdf|Intel Xeon Metrics]] * E5-2650v2 "ivy" 128 gflops * E5-2680v2 172.8 gflops ^vendor^Tnodes^chip^ghz^cache^cores^Tcores^disk^ram^ram/core^Tram^pdu^eswitch^warr^watt^cost^gflops^ |dell|30|X5355|2.66|4mb|8|240|80gb|4-16gb|0.5-2gb|340|Y|Y|N|12,552|na|670| |micro|14|2650v2|2.60|20mb|16|224|1tb|128gb|8gb|1,792gb|Y|Y|2|5,400|$77K|4,659| |advcl|10|2680v2|2.80|25mb|20|200|60gb|256gb|12.8gb|2,560gb|Y|Y|3|4,597|$84K|4,480| |exxct|12|2660v2|2.20|25mb|20|240|1tb|128gb|6.4gb|1,536gb|N|N|?|4,860|$60K|4,224| |hp| | | | | | | | | | | | | | | | \\ **[[cluster:0|Back]]**