cluster:93

Differences

This shows you the differences between two versions of the page.

cluster:93 [2011/01/09 20:30]
hmeij
cluster:93 [2011/01/11 15:39]
hmeij
Line 1: Line 1:
 \\
 **[[cluster:0|Back]]**
 +
 +|{{:cluster:greentail.jpg|}}|
 +|  greentail  |
 +
  
 ====== Greentail ======
  
-Time to introduce our new high performance cluster ''greentail'', an Hewlett Packard HPC solution.  If you want to read more about the details of the hardware, you can find it at [[https://dokuwiki.wesleyan.edu/doku.php?id=cluster:83#round_2_of_quotes|Enternal Link]]. The reference for ''greentail'' is because this cluster consumes 18-24% less power/cooling than the competing bids.  The green tail refers to the **Smooth Green Snake**, which no surprise, has a green tail.  [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|External Link]] for more information.
+Time to introduce our new high performance cluster ''greentail'', a Hewlett Packard HPC solution.  If you want to read more about the details of the hardware, you can find it at [[https://dokuwiki.wesleyan.edu/doku.php?id=cluster:83#round_2_of_quotes|Enternal Link]]. The name refers to the **Smooth Green Snake**, which, no surprise, has a green tail.  [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|External Link]] for more information.  The name ''greentail'' was also chosen because this cluster consumes 18-24% less power/cooling than the competing bids.
  
 In order to accommodate the new cluster, we have reduced the Blue Sky Studios cluster from 3 racks in production to a single rack.  That rack contains nothing but 24 GB memory nodes offering just over 1.1 TB of memory across 46 nodes.  Because the cluster is not power consumption friendly, it is our "on demand" cluster.  If jobs are pending in the sole ''bss24'' queue (offering 92 job slots), we will get notified and will power on more nodes.  Or just email us. If it is not being used, we'll power down the nodes.  The login node for this cluster is host sharptail (which can only be reached by first logging in via ssh to host petaltail or swallowtail, then using ssh to reach sharptail).
  
-There are no changes to the Dell cluster (petaltail/swallowtail).  However be sure to read the home directory section below.  __It is important all users understand the impact of changes to come.__
+There are no changes to the Dell cluster (petaltail/swallowtail).  However be sure to read the home directory section below.  __It is important that all users understand the impact of changes to come.__
  
-If we like the HP management tools, in the future we may ingest cluster petaltail/swallowtail and sharptail into greentail for a single point of access.  Regardless of that move, the home directories will be served by greentail.  That is a significant change. More details below.
+If we like the HP management tools, in the future we may ingest cluster petaltail/swallowtail and sharptail into greentail for a single point of access.  Regardless of that move, the home directories will be served by greentail in the future.  That is a significant change. More details below.
  
 As always, suggestions welcome.
Line 16: Line 20:
 ===== Design =====
  
-The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to met.
+The purchase of the HP hardware followed a fierce bidding round in which certain design aspects had to be met.
  
   * We continually run out of disk space for our home directories.  So the new cluster had to have a large disk array on board.
Line 35: Line 39:
 In short using linpack (More about [[http://en.wikipedia.org/wiki/LINPACK|Linpack on wikipedia]]) here are the results.  The results are dependent on the combination of total memory, total cores, and speed of processors.
  
-  * greentail's Voltaire infiniband switch is capable of just over 1,500 gigaflops or 1.5 teraflops.
+  * greentail's nodes, all on Voltaire infiniband switch, are capable of just over 1,500 gigaflops or 1.5 teraflops.
-  * petaltail/swallowtail's Cisco infiniband switch is capable of 325 gigaflops
+  * petaltail/swallowtail's nodes on Cisco infiniband switch are capable of 325 gigaflops
-  * petaltail/swallowtail's Force 10 ethernet switch is capable of 245 gigaflops
+  * petaltail/swallowtail's nodes on Force 10 ethernet switch are capable of 245 gigaflops
     * so the total is 570 gigaflops, but you'd never want to run across both switches simultaneously
-  * sharptail's HP ProCurve ethernet switch is estimated at deleivering between 500-700 gigaflops
+  * sharptail's nodes on HP ProCurve ethernet switch are estimated at delivering between 500-700 gigaflops
     * never got quite all the nodes working together, not sure why
  
  
 ===== Home Dirs =====
 +
 +The home directory disk space (5 TB) on the clusters is served up via NFS from one of our data center NetApp storage servers (named filer3).  (Let's refer to those as "old home dirs".) We will be migrating off filer3 to greentail's local disk array.  The path will remain the same on greentail: /home/username. (Let's refer to those as "new home dirs".)
 +
 +In order to do this, your old home directory content was copied over during the Christmas-New Year's break.  Since then, it has been copied weekly from filer3 to greentail's disk array.  When you create new files in your old home dirs, they will show up in greentail's new home dirs after the next weekly copy.  However, if you delete files in old home dirs, and they have already been copied over, the files will remain in your new home dirs.  If you create new files in greentail's new home dirs, they will **not** be copied back to your old home dirs.
 +
 +To avoid a conflict between home dirs I strongly suggest you create a directory to store the files you will be creating on greentail, for example /home/username/greentail or /home/username/hp.
 +
 +At some point in the future, greentail's new home dirs will be mounted on the petaltail/swallowtail and sharptail clusters.  Filer3's old home dirs will then disappear permanently.
 +
 +Greentail's new home dirs will provide 10 TB of disk space.  Again, the cluster's file system should not be used to archive data. However, doubling the home directory size should provide much needed relief.
 +
 +Because of the size of the new home dirs, we will also not be able to provide backup via TSM (Tivoli).  Backup via TSM to our Virtual Tape Library (VTL) will be replaced with disk to disk backup on greentail's disk array.  That has some serious implications.  Please read the section about RSnapshot.
 +
 +
 +===== Passwords =====
 +
 +The password, shadow and group files of host petaltail were used to populate greentail's equivalent files.
 +
 +If you change your password, do it on all four hosts (petaltail, swallowtail, sharptail and greentail).
 +
 +I know, a pain.
 +
  
 ===== SSH Keys =====
  
-Within the directory **/home/username/.ssh** there is a file named **authorized_keys**.  Within this file are your public SSH keys.  Because your home directory contents are copied over to host greentail, you should be able to ssh from host petaltail or swallowtail to host greentail without a password prompt.  If not, your keys are not set up properly.
+Within the directory **/home/username/.ssh** there is a file named **known_hosts**.  Within this file are host level public SSH keys.  Because your home directory contents are copied over to host greentail, you should be able to ssh from host petaltail or swallowtail to host greentail without a password prompt.  If not, your keys are not set up properly.
  
 You can also log in to host greentail directly (''ssh username@greentail.wesleyan.edu'').  From host greentail you should be able to ssh to host petaltail or swallowtail without a password prompt. If not, your keys are not set up properly.
  
-To set up your ssh keys:
+
 +Note: the software stack on host petaltail/swallowtail created ssh keys for you automatically upon your first login, so for most of you this is all set.  To set up your private/public ssh keys:
  
   * log into a host, then issue the command ''ssh-keygen -t rsa''
   * supply an empty passphrase (just hit return)
-  * then copy the contents of /home/username/.ssh/id_rsa.pub into the file authroized_keys
+  * then copy the contents of /home/username/.ssh/id_rsa.pub into the file authorized_keys
-  * yoiu can have multiple public ssh key entries in this file
+  * you can have multiple public ssh key entries in this file
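
The key setup steps listed above can be sketched as a small shell script. This is only an illustration under stated assumptions: the ''SSH_DIR'' variable is introduced here for convenience (it defaults to the .ssh directory in your home directory), and the ''-N'', ''-f'' and ''-q'' options of ssh-keygen supply the empty passphrase and key file name non-interactively instead of answering the prompts by hand.

```shell
#!/bin/sh
# Sketch only: create an RSA key pair with an empty passphrase and authorize it.
# SSH_DIR is an illustrative variable; on the cluster it would be /home/username/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Generate a key pair only if none exists, so an existing
# (possibly auto-generated) key is never overwritten.
if [ ! -f "$SSH_DIR/id_rsa" ]; then
    ssh-keygen -t rsa -N "" -f "$SSH_DIR/id_rsa" -q
fi

# Append the public key to authorized_keys unless that exact line is already present.
touch "$SSH_DIR/authorized_keys"
if ! grep -qxFf "$SSH_DIR/id_rsa.pub" "$SSH_DIR/authorized_keys"; then
    cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
fi
chmod 600 "$SSH_DIR/authorized_keys"
```

Appending with ''>>'' rather than overwriting keeps any other public keys already in the file, which matches the note that the file can hold multiple entries.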
  
-Note: the software stack on host petaltail/swallowtail created ssh keys for you automatically upon your first login, so for most you this is all set. 
- 
-To test if your keys are set up right, simply ssh around the hosts petaltail, swallowtail and greentail. 
  
 ===== Rsnapshot =====
 +
 +[[http://rsnapshot.org|Rsnapshot]] can be used to perform disk to disk backup of file systems using Linux tools such as hard and soft links and rsync.  It will replace our practice of backing up to the virtual tape library. Since I had to disable the functionality of keeping modified files for 30 days (one file version only) because of the mass of files in /home (18 million by last count), we actually gain functionality using rsnapshot.  Rsnapshot will take daily, weekly and monthly point in time backups. We will keep backups for the last 6 days, the last 4 weeks and the last 3 months.
 +
 +Rsnapshot backups of all the new home directory content are made available to you at /snapshot/repository/? where ? is a single letter from a to z.  This file system is read only.  Users can retrieve deleted data by simply copying the lost data back into their new home directories.
 +
 +Within the snapshot repository you will find directories:
 +
 +  * daily.0 (yesterday), daily.1 (day before yesterday) etc ... daily backups are taken mon-sat at 11 pm
 +  * weekly.0 (last week), weekly.1 (week before last week) etc ... weekly backups are taken sunday at 10:30 pm
 +  * monthly.0 (last month), monthly.1 (month before last month) etc ... monthly backups are taken on the first day of each month at 10:00 pm
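
Retrieving a lost file from the snapshot directories described above could look roughly like the sketch below. The function name ''restore_from_snapshot'' is hypothetical, and the layout underneath each daily.N, weekly.N or monthly.N directory is an assumption (the path relative to the snapshot is passed in as an argument), so check the actual layout before relying on it. The retention counts (6 daily, 4 weekly, 3 monthly) are taken from the text above.

```shell
#!/bin/sh
# Sketch only: copy a file or directory back from the newest snapshot that has it.
# Usage: restore_from_snapshot SNAP_ROOT REL_PATH DEST_DIR
#   SNAP_ROOT  directory holding daily.N, weekly.N and monthly.N
#   REL_PATH   path of the lost item relative to a snapshot directory (assumed layout)
#   DEST_DIR   writable directory to copy the item into
restore_from_snapshot() {
    snap_root=$1
    rel_path=$2
    dest_dir=$3
    # Search newest first: 6 daily, then 4 weekly, then 3 monthly snapshots.
    for snap in daily.0 daily.1 daily.2 daily.3 daily.4 daily.5 \
                weekly.0 weekly.1 weekly.2 weekly.3 \
                monthly.0 monthly.1 monthly.2; do
        src="$snap_root/$snap/$rel_path"
        if [ -e "$src" ]; then
            mkdir -p "$dest_dir" || return 1
            # -p keeps timestamps and permissions, -R handles directories.
            cp -pR "$src" "$dest_dir/" || return 1
            echo "restored $rel_path from $snap"
            return 0
        fi
    done
    echo "$rel_path not found in any snapshot" >&2
    return 1
}
```

A call might look like ''restore_from_snapshot /snapshot/repository/u username/project/results.txt /home/username/restored'', where the letter ''u'' and the paths are placeholders. Since the snapshot file system is read only, the destination has to be inside your home directory or another writable location.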
 +
 +/home and /snapshot are different logical volumes using a different set of disks to protect against loss of data.  In addition, both use RAID 6 (double parity) for another layer of protection.  However, it is one disk array comprised of 4 disk shelves directly attached to greentail.  A catastrophic failure implies the potential of data loss.  I therefore encourage you to archive data elsewhere for permanent storage.
 +
 +===== Sanscratch =====
 +
 +Previously there were two scratch areas available to your programs: /localscratch, which is roughly 50 GB on each node's local hard disk, and /sanscratch, a shared scratch area available to all nodes.  Sanscratch allows you to monitor your job's progress by looking in /sanscratch/jobpid. It was also much larger (1 TB).
 +
 +However, since our fantastic crash of June 2008 ([[cluster:67|The catastrophic crash of June 08]]), /sanscratch was simply a directory inside /home and thus competed for disk space.
 +
 +On greentail, /sanscratch will be a separate logical volume of 5 TB using a different disk set.  So I urge those who have very large files to stage their files in /sanscratch when running their jobs for best performance.  The scheduler will always create (and delete!) two directories for you.  The JOBPID of your job is used to create /localscratch/jobpid and /sanscratch/jobpid.
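
Staging files in the scratch area, as suggested above, could be sketched as follows. The function name ''stage_and_run'' and its arguments are hypothetical. The scratch directory is passed in explicitly because the name of the environment variable holding the job id is not stated here (on LSF-style schedulers it is commonly ''LSB_JOBID'', but treat that as an assumption for Lava).

```shell
#!/bin/sh
# Sketch only: stage an input file into scratch, run a command there,
# and copy everything back before the scheduler deletes the scratch directory.
# Usage: stage_and_run SCRATCH_DIR INPUT_FILE RESULT_DIR COMMAND [ARGS...]
stage_and_run() {
    scratch=$1
    input=$2
    results=$3
    shift 3
    # Stage the (large) input file into the scratch area.
    cp "$input" "$scratch/" || return 1
    # Run the job command with the scratch directory as working directory.
    ( cd "$scratch" && "$@" ) || return 1
    # Save the scratch contents (input copy and outputs) to the results directory.
    mkdir -p "$results" || return 1
    cp -pR "$scratch"/. "$results"/
}
```

Inside a job script this might be called as ''stage_and_run "/sanscratch/$LSB_JOBID" /home/username/greentail/input.dat /home/username/greentail/results ./myprogram input.dat'', with all names being placeholders. Copying results back inside the job matters because the scheduler removes the per-job scratch directories when the job ends.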
 +
 +===== MPI =====
 +
 +For those of you running MPI or MPI enabled applications, you will need to make some changes to your scripts.  The ''wrapper'' program to use with greentail's Lava scheduler is the same as for cluster sharptail. It can be found here:  /share/apps/bin/lava.openmpi.mpirun.   If other flavors are desired, you can inform me or look at the example scripts lava.//mpi_flavor//.mpi[run|exec].
 +
 +Some time ago I wrote some code to detect if a node is infiniband enabled or not, and based on the result, add command line arguments to the mpirun invocation.  If you use that code, you will need to change two things: the path used to obtain the port status (/usr/bin/ibv_devinfo) and, in the block that specifies the interface, eth1 to ib0.
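
The original detection code is not reproduced on this page, so the following is only a guess at its general shape, not the author's script. It assumes that an infiniband enabled node is one where ''ibv_devinfo'' reports a port in state ''PORT_ACTIVE''. The function name ''pick_interface'' and the ''IBV_DEVINFO'' variable are hypothetical, and the fallback interface is supplied by the caller.

```shell
#!/bin/sh
# Sketch only: choose ib0 when an active infiniband port is reported,
# otherwise fall back to the interface name given by the caller.
# IBV_DEVINFO is an illustrative variable pointing at the port status tool.
IBV_DEVINFO="${IBV_DEVINFO:-/usr/bin/ibv_devinfo}"

# Usage: pick_interface FALLBACK_INTERFACE
pick_interface() {
    fallback=$1
    # ibv_devinfo prints a line such as "state: PORT_ACTIVE (4)" for a live port.
    if [ -x "$IBV_DEVINFO" ] && "$IBV_DEVINFO" 2>/dev/null | grep -q 'PORT_ACTIVE'; then
        echo ib0
    else
        echo "$fallback"
    fi
}
```

A job script could then use something like ''iface=$(pick_interface eth1)'' and build its mpirun arguments from ''$iface''. Which arguments to add depends on the MPI flavor and is not covered by this sketch.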
  
 ===== ... =====
 +
 +|{{:cluster:swallowtail.jpg|}}|{{:cluster:petaltail.jpg|}}|{{:cluster:sharptail.jpg|}}|
 +|  swallowtail  |  petaltail  |  sharptail  |
  
  
cluster/93.txt · Last modified: 2011/01/11 20:55 by hmeij