  
===== Design =====

The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to be met.

  * We continually run out of disk space for our home directories, so the new cluster had to have a large disk array on board.
  * We wanted more nodes, in fewer queues, with a decent memory footprint.
  * All nodes should be on an Infiniband switch.
  * A single queue is preferred.
  * Data (NFS) was to be served up via a secondary gigabit ethernet switch, so it would not compete with administrative traffic.
  * (With the HP solution we will actually route data (NFS) traffic over the Infiniband switch using OFED/MPI, a practice called [[http://en.wikipedia.org/wiki/OpenFabrics_Alliance|IPoIB]]; see the sketch after this list.)
  * Linux or CentOS as the operating system.
  * Flexible on the scheduler (options: Lava, LSF, Sun Grid Engine).
  * The disk array, switches, and login node should be backed by some form of UPS (not the compute nodes).
  * (We have actually moved those to our enterprise data center UPS, which is backed by the building generator.)
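
Below is a minimal sketch of how one might check that home directory (NFS) traffic really rides the IPoIB interface rather than the gigabit ethernet network. It is an illustration added here, not part of the cluster build; the interface name ib0 and the script itself are assumptions. It simply lists the NFS mounts on a node and the server address each one uses, which can be compared against the IPoIB subnet.

<code python>
#!/usr/bin/env python
# Illustrative sketch only: list NFS mounts and the server each one uses.
# If the server address belongs to the IPoIB (ib0) subnet, home directory
# traffic is being routed over the Infiniband fabric as intended.

def nfs_mounts(proc_mounts="/proc/mounts"):
    """Yield (server, export, mountpoint) for every NFS mount on this node."""
    with open(proc_mounts) as fh:
        for line in fh:
            device, mountpoint, fstype = line.split()[:3]
            if fstype.startswith("nfs") and ":" in device:
                server, export = device.split(":", 1)
                yield server, export, mountpoint

if __name__ == "__main__":
    for server, export, mountpoint in nfs_mounts():
        print("%s:%s mounted on %s" % (server, export, mountpoint))
</code>

Comparing the printed server addresses against the address assigned to ib0 (for example with ''ifconfig ib0'') shows whether the NFS data path is on the Infiniband or the ethernet network.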
  
===== Performance =====

During the scheduled power outage of December 28th, 2010, some benchmarks were performed on the old and new clusters.  To read about the details, view this [[cluster:91|page]].

In short, using Linpack (more about [[http://en.wikipedia.org/wiki/LINPACK|Linpack on Wikipedia]]), here are the results.  The results depend on the combination of total memory, total cores, and processor speed (a back-of-the-envelope sketch of that relationship follows the list).

  * greentail's Voltaire Infiniband switch is capable of just over 1,500 gigaflops, or 1.5 teraflops.
  * petaltail/swallowtail's Cisco Infiniband switch is capable of 325 gigaflops.
  * petaltail/swallowtail's Force 10 ethernet switch is capable of 245 gigaflops.
    * So the total is 570 gigaflops, but you would never want to run across both switches simultaneously.
  * sharptail's HP ProCurve ethernet switch is estimated at delivering between 500 and 700 gigaflops.
    * We never quite got all the nodes working together; not sure why.

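As a rough illustration of how such gigaflop numbers relate to core counts and clock speeds, here is a back-of-the-envelope peak calculation. The node count, cores per node, clock speed, and flops per cycle below are hypothetical placeholders, not the actual specifications of greentail or sharptail, and Linpack only ever delivers a fraction of this theoretical peak.

<code python>
# Back-of-the-envelope theoretical peak; every number below is a hypothetical
# placeholder, not the actual greentail/sharptail hardware specification.

def peak_gflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak performance in gigaflops."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle

if __name__ == "__main__":
    # e.g. 32 nodes x 8 cores x 2.26 GHz x 4 flops per cycle
    peak = peak_gflops(32, 8, 2.26, 4)
    print("theoretical peak  : %6.0f gigaflops" % peak)
    # a well-tuned Linpack run over Infiniband often reaches 70-85% of peak
    print("at 75%% efficiency: %6.0f gigaflops" % (0.75 * peak))
</code>

The measured numbers above sit below such a theoretical peak because memory per node, interconnect bandwidth and latency, and problem size all limit what Linpack can actually sustain.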
  
===== Home Dirs =====