cluster:123 [2013/10/18 13:55] hmeij [Update]
Subtitle: A win-win solution proposed by Physical Plant and ITS

Once upon a time, back in 2013, two Dell racks full of compute nodes sat noisily chewing away energy on the 5th floor of Science Tower.
The Dell racks contain 30 compute nodes, two UPS units, two disk arrays and two switches. We measured the power consumption of 19 nodes (pulling one of the dual power supplies out) with a Kill-A-Watt meter for over 775 total hours. The mean power consumption is 418.4 watts per node. That totals 109,956 kWh/year ((418.4 watts × 30 nodes × 24 hours × 365 days / 1,000 ≈ 109,956 kWh)).

Next we need to convert that to a dollar value; our electricity rate is 12.5 cents per kWh.

Based on 12.5 cents per kWh, the Dell compute nodes consume $13,744.50 per year in power. Best guess is that cooling costs at least as much (another possible low water mark). So the total cost for both power and cooling is $27,489 per year.
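The cost arithmetic above can be sketched in a few lines. Only the measured numbers appear here; the 2x cooling multiplier is the low-water-mark guess from the text:

```python
# Energy-cost arithmetic for the Dell racks, using the measurements above.
WATTS_PER_NODE = 418.4      # mean draw per node, Kill-A-Watt measurement
NODES = 30
RATE = 0.125                # dollars per kWh

kwh_per_year = round(WATTS_PER_NODE * NODES * 24 * 365 / 1000)
power_cost = kwh_per_year * RATE     # power only
total_cost = power_cost * 2          # cooling assumed to cost at least as much

print(kwh_per_year, power_cost, total_cost)   # → 109956 13744.5 27489.0
```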
+ | |||
+ | Next step was to collect vendor quotes for a target budget of $82K, 3 years of Dell energy consumption, | ||
+ | |||
+ | Old hardware: 109,956 KwH/year for power\\ | ||
+ | 30 nodes, 2.66 ghz, 4 mb L-cache (for cpu), 240 cores (job slots),\\ | ||
+ | 80 gb local drive, 340 gb total ram, 12,555 watts (power no cooling), 670 gigaflops (actual measure) | ||
+ | |||
+ | New hardware v1: 47,304 KwH/year for power or 43% of old hardware\\ | ||
+ | 14 nodes, 2.60 ghz, 20 mb L-cache (for cpu), 224 cores (job slots),\\ | ||
+ | 1TB local drive, 1,792 gb total ram, 5,400 watts (power no cooling), 4,659 gigaflops (theoretical) | ||
+ | |||
+ | New hardware v2 (half of v1): 23,652 KwH/year for cooling or 22% of Old hardware\\ | ||
+ | 7 nodes, 2.60 ghz, 20 mb L-cache (for cpu), 112 cores (job slots),\\ | ||
+ | 1TB local drive, 1,792 gb total ram, 2,700 watts (power no cooling), 2,329 gigaflops (theoretical) | ||
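Comparing the spec blocks by performance per watt puts the upgrade in perspective (a sketch using only the figures above; note the old number is measured while the new one is theoretical):

```python
# Gigaflops per watt, old Dell racks vs. new hardware v1 (power only, no cooling).
old_eff = 670 / 12555     # measured gigaflops / measured watts
new_eff = 4659 / 5400     # theoretical gigaflops / quoted watts

print(round(old_eff, 3), round(new_eff, 3), round(new_eff / old_eff))
# → 0.053 0.863 16  (roughly 16x better performance per watt)
```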
+ | |||
+ | If we reduced the node count to 7 (the minimum configuration to meet the job slot count of the Dell hardware), the total energy consumption (power plus cooling) would be 5,400 watts. | ||
+ | |||
+ | In two years, the new hardware would have saved $43,152 on energy costs based on the low water mark (Dell' | ||
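The same arithmetic gives the two-year savings for either configuration (a sketch; both assume cooling doubles the power bill, per the low water mark above):

```python
RATE = 0.125                  # dollars per kWh
dell = 109956 * RATE * 2      # $27,489/year, power plus cooling
v1 = 47304 * RATE * 2         # $11,826/year, 14 nodes
v2 = 23652 * RATE * 2         # $5,913/year, 7 nodes

print(2 * (dell - v1), 2 * (dell - v2))   # two-year savings
# → 31326.0 43152.0
```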
+ | |||
+ | * There are enough Infiniband ports available to put all new hardware nodes on such a switch (add cards and cables cost for each node) | ||
+ | * The internal disks on each node need to be of a high speed (10K or better) and of a certain size (300 GB or larger) mimicking the Dell disk arrays (adds costs) | ||
+ | * we maybe able to add two more nodes by switching to a more exapansive lower wattage CPU (and remain within budget as well as below the 50% energy consumption threshold as compared with Dell's consumption. | ||
+ | * accomplished by switching from 8 core 2650v2 (130 watt) 2.6 ghz CPU to 10 core 2660v2 (95 watt) 2.2 ghz CPU | ||
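A rough check of the CPU-swap idea in the last bullet. This is a sketch: the scaling assumption, that per-node draw drops by exactly the dual-socket TDP difference, is mine; only the TDP, wattage and node figures come from the text:

```python
DELL_WATTS = 12555                  # old racks, power only
threshold = DELL_WATTS / 2          # stay below 50% of Dell's draw

v1_node = 5400 / 14                 # ~386 W/node with two 130 W 2650v2 CPUs
swap_saving = 2 * (130 - 95)        # two sockets, 2650v2 -> 2660v2
new_node = v1_node - swap_saving    # assumed per-node draw after the swap

print(round(16 * new_node))         # 16 nodes after adding two → 5051
```

Roughly 5,051 watts for 16 nodes, still well under the 6,278-watt threshold.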
+ | |||
+ | But it is all very doable within a budget of $45-$50K. And it can be the solution for: | ||
+ | |||
+ | * replace Dell's racks functions and match or exceed its performance | ||
+ | * seriously reduce energy consumption benefiting Physical Plant' | ||
+ | * allow ITS to treat the third Liebert cooling tower as backup/ | ||
+ | * being way green | ||
+ | |||
+ | The Libert family rejoices. | ||
+ | |||
+ | ==== Update ==== | ||
+ | |||
+ | The table below contains data for a cluster whose nodes are all on the Infiniband switch (and also ethernet switch for provision and data). | ||
+ | |||
+ | |||
+ | ^Tnodes^Tcores^THcores^Tmem gb^Watts^%of Dell^TEnergy^TEnergy $/ | ||
+ | |10|160|320|2, | ||
\\
**[[cluster: