This shows you the differences between two versions of the page.
cluster:93 [2011/01/07 20:13] hmeij |
cluster:93 [2011/01/09 20:30] hmeij |
||
---|---|---|---|
Line 3: | Line 3: | ||
====== Greentail ====== | ====== Greentail ====== | ||
+ | |||
+ | Time to introduce our new high performance cluster '' | ||
+ | |||
+ | In order to accommodate the new cluster, we have reduced the Blue Sky Studios cluster from 3 racks in production to a single rack. That rack contains nothing but 24 GB memory nodes, offering just over 1.1 TB of memory across 46 nodes. | ||
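As a quick sanity check on the memory figure quoted above, here is a minimal Python sketch. It assumes decimal units (1 TB = 1000 GB), and the variable names are illustrative only.

```python
# Sanity check of the memory total for the remaining Blue Sky Studios rack.
# Assumption: decimal units (1 TB = 1000 GB); names are illustrative only.
NODES = 46          # nodes left in the single rack
GB_PER_NODE = 24    # memory per node, in GB

total_gb = NODES * GB_PER_NODE
total_tb = total_gb / 1000

print(f"{total_gb} GB = {total_tb:.3f} TB")
```

This works out to 1104 GB, which matches the "just over 1.1 TB" stated above.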
+ | |||
+ | There are no changes to the Dell cluster (petaltail/ | ||
+ | |||
+ | If we like the HP management tools, in the future we may ingest cluster petaltail/ | ||
+ | |||
+ | As always, suggestions welcome. | ||
===== Design ===== | ===== Design ===== | ||
+ | |||
+ | The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to be met. | ||
+ | |||
+ | * We continually run out of disk space for our home directories. | ||
+ | * We wanted more nodes, in fewer queues, with a decent memory footprint. | ||
+ | * All nodes should be on an Infiniband switch. | ||
+ | * A single queue is preferred. | ||
+ | * Data (NFS) was to be served up via a secondary gigabit ethernet switch, so that it would not compete with administrative traffic. | ||
+ | * (With the HP solution we will actually route data (NFS) traffic over the infiniband switch using OFED/MPI, a practice called [[http:// | ||
+ | * Linux (Redhat or CentOS) as operating system. | ||
+ | * Flexible on scheduler (options: Lava, LSF, Sun Grid Engine) | ||
+ | * The disk array, switches and login node should be backed by some form of UPS (not the compute nodes) | ||
+ | * (We have actually moved those to our enterprise data center UPS, which is backed by the building generator) | ||
===== Performance ===== | ===== Performance ===== | ||
+ | |||
+ | During the scheduled power outage of December 28th, 2010, some benchmarks were performed on old and new clusters. | ||
+ | |||
+ | In short using linpack (More about [[http:// | ||
+ | |||
+ | * greentail' | ||
+ | * petaltail/ | ||
+ | * petaltail/ | ||
+ | * so the total is 570 gigaflops, but you'd never want to run across both switches simultaneously | ||
+ | * sharptail' | ||
+ | * never got quite all the nodes working together, not sure why | ||
+ | |||
===== Home Dirs ===== | ===== Home Dirs ===== |