====== Greentail ======
  
Time to introduce our new high performance cluster ''greentail'', a Hewlett Packard HPC solution.  If you want to read more about the details of the hardware, you can find them at this [[https://dokuwiki.wesleyan.edu/doku.php?id=cluster:83#round_2_of_quotes|Internal Link]]. The ''green'' in ''greentail'' reflects that this cluster consumes 18-24% less power and cooling than the competing bids.  The green tail refers to the **Smooth Green Snake**, which, no surprise, has a green tail.  See this [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|External Link]] for more information.
  
In order to accommodate the new cluster, we have reduced the Blue Sky Studios cluster from 3 racks in production to a single rack.  That rack contains nothing but 24 GB memory nodes, offering just over 1.1 TB of memory across 46 nodes.  Because that cluster is not power consumption friendly, it is our "on demand" cluster.  If jobs are pending in the sole ''bss24'' queue (offering 92 job slots), we will get notified and will power on more nodes.  Or just email us.  If it is not being used, we'll power down the nodes.  The login node for this cluster is host sharptail, which can only be reached by first ssh'ing into host petaltail or swallowtail and then ssh'ing to sharptail.
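A minimal sketch of reaching sharptail and checking the ''bss24'' queue is shown below.  The fully qualified hostnames and the LSF/Lava-style commands are assumptions, so adjust them to your own account and to whatever scheduler that cluster actually runs.

<code bash>
# hop through a public login node first (petaltail or swallowtail);
# the username and the wesleyan.edu domain suffix are assumptions
ssh username@petaltail.wesleyan.edu

# from petaltail, hop to the bss24 login node
ssh sharptail

# assuming an LSF/Lava-style scheduler: show the bss24 queue summary
# and any pending jobs that would trigger powering on more nodes
bqueues bss24
bjobs -q bss24 -u all -p
</code>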
  
There are no changes to the Dell cluster (petaltail/swallowtail).  However, be sure to read the home directory section below.  __It is important that all users understand the impact of the changes to come.__

If we like the HP management tools, in the future we may fold the petaltail/swallowtail and sharptail clusters into greentail for a single point of access.  Regardless of that move, the home directories will be served by greentail.  That is a significant change; more details below.

As always, suggestions are welcome.
  
===== Design =====
The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to be met.
  
  * We continually run out of disk space for our home directories.  So the new cluster had to have a large disk array on board.
  * We wanted more nodes, in fewer queues, with a decent memory footprint.
  * All nodes should be on an Infiniband switch.
  * A single queue is preferred.
  * Data (NFS) was to be served up via a secondary gigabit ethernet switch, so as not to compete with administrative traffic.
  * (With the HP solution we will actually route data (NFS) traffic over the Infiniband switch using OFED/MPI, a practice called [[http://en.wikipedia.org/wiki/OpenFabrics_Alliance|IPoIB]]; see the sketch after this list.)
  * Linux (Red Hat or CentOS) as the operating system.
  * Flexible on scheduler (options: Lava, LSF, Sun Grid Engine)
  * The disk array, switches and login node should be backed by some form of UPS (not the compute nodes)
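To make the IPoIB item above concrete, here is a minimal sketch of what serving home directories over the Infiniband fabric could look like on a compute node.  The ''greentail-ib'' hostname, the ''ib0'' interface name, and the NFS mount options are illustrative assumptions, not the cluster's actual configuration.

<code bash>
# check that the IPoIB interface is up on the node
ifconfig ib0

# hypothetical /etc/fstab entry: mount /home from the greentail disk
# array over the IPoIB interface instead of gigabit ethernet
# greentail-ib:/home  /home  nfs  rw,hard,intr,rsize=32768,wsize=32768  0 0

# or mount it by hand for a quick test (mount point must exist)
mount -t nfs greentail-ib:/home /mnt/home
</code>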