cluster:93

Differences

This shows you the differences between two versions of the page.

cluster:93 [2011/01/11 10:38]
hmeij
cluster:93 [2011/01/11 15:55] (current)
hmeij
Line 14: Line 14:
 There are no changes to the Dell cluster (petaltail/swallowtail).  However be sure to read the home directory section below.  __It is important that all users understand the impact of changes to come.__ There are no changes to the Dell cluster (petaltail/swallowtail).  However be sure to read the home directory section below.  __It is important that all users understand the impact of changes to come.__
  
-If we like the HP management tools, in the future we may ingest cluster petaltail/swallowtail and sharptail into greentail for a single point of access.  Regardless of that move, the home directories will be served by greentail.  That is a significant change. More details below.+If we like the HP management tools, in the future we may ingest cluster petaltail/swallowtail and sharptail into greentail for a single point of access.  Regardless of that move, the home directories will be served by greentail in the future.  That is a significant change. More details below.
  
 As always, suggestions welcome. As always, suggestions welcome.
Line 23: Line 23:
  
   * We continually run out of disk space for our home directories.  So the new cluster had to have a large disk array on board.   * We continually run out of disk space for our home directories.  So the new cluster had to have a large disk array on board.
-  * We wanted more nodes, in fewer queues, with a decent memory footprint.+  * We wanted more nodes,  with a decent memory footprint (settled on 12 gb per node).
   * All nodes should be on an Infiniband switch.   * All nodes should be on an Infiniband switch.
   * A single queue is preferred.   * A single queue is preferred.
-  * Data (NFS) was to be served up via a secondary gigabit ethernet switch, hence not compete with administrative traffic.+  * Data (NFS) traffic was to be served up via a secondary gigabit ethernet switch, hence not compete with administrative traffic.
   * (With the HP solution we will actually route data (NFS) traffic over the infiniband switch using OFED/MPI, a practice called [[http://en.wikipedia.org/wiki/OpenFabrics_Alliance|IPoIB]])   * (With the HP solution we will actually route data (NFS) traffic over the infiniband switch using OFED/MPI, a practice called [[http://en.wikipedia.org/wiki/OpenFabrics_Alliance|IPoIB]])
   * Linux (Redhat or CentOS) as operating system.   * Linux (Redhat or CentOS) as operating system.
Line 51: Line 51:
 The home directory disk space (5 TB) on the clusters is served up via NFS from one of our data center NetApp storage servers (named filer3).  (Let's refer to those as "old home dirs"). We will be migrating off filer3 to greentail's local disk array.  The path will remain the same on greentail: /home/username. (Let's refer to those as "new home dirs"). The home directory disk space (5 TB) on the clusters is served up via NFS from one of our data center NetApp storage servers (named filer3).  (Let's refer to those as "old home dirs"). We will be migrating off filer3 to greentail's local disk array.  The path will remain the same on greentail: /home/username. (Let's refer to those as "new home dirs").
  
-In order to do this, your old home directory content was copied over christmas-newyears break.  Since then, it will be copied weekly from filer3 to greentail's disk array.  When you create new files in your old home dirs they will show up on greentail's new home dirs.  However, if you delete files in old home dirs, and they have already been copied over, the files will remain in your new home dirs.  If you create new files in greentail's new home dirs they will **not** be copied back to your old home dirs.+In order to do this, your old home directory content was copied over christmas-newyears break to greentail.  Since then, it will be refreshed //weekly// from filer3 to greentail's disk array.   
 +  * When you create new files in your old home dirs they will show up on greentail's new home dirs after a week.   
 +  * If you delete files in old home dirs, and they have already been copied over, the files will remain in your new home dirs.   
 +  * If you create new files in greentail's new home dirs they will **not** be copied back to your old home dirs.
 +  * If you modify a file on greentail's new home dirs that //also// exists on the old home dirs, you will lose your changes when the weekly refresh happens.
  
-To avoid a conflict between home dirs I strongly suggest you create a directory to store the files you will be creating on greentail, for example /home/username/greentail or /home/username/hp.+To avoid a conflict between home dirs I strongly suggest you create a directory to store the files you will be creating on greentail's new home dirs, for example /home/username/greentail or /home/username/hp.  That way the weekly refresh will not interfere with new files created on greentail.
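
The refresh rules above can be illustrated with a small simulation. This is only a sketch: it models each home directory as a Python dictionary of file names and contents, and it assumes the weekly refresh behaves like a one-way copy that adds and overwrites files but never deletes anything. The function name ''weekly_refresh'' and the file names are made up for the illustration; the actual copy tool is not described here.

```python
def weekly_refresh(old_home, new_home):
    """Simulate a one-way copy from the old home dirs to the new home dirs.

    Assumption: the copy adds and overwrites files but never deletes.
    """
    refreshed = dict(new_home)
    refreshed.update(old_home)  # files from the old home dirs always win
    return refreshed


# Initial copy made over the break.
old_home = {"a.txt": "a1", "b.txt": "b1"}
new_home = weekly_refresh(old_home, {})

# Activity during the following week.
del old_home["a.txt"]                       # deleted on the old home dirs
old_home["c.txt"] = "c1"                    # created on the old home dirs
new_home["b.txt"] = "edited on greentail"   # modified on the new home dirs
new_home["d.txt"] = "d1"                    # created on the new home dirs
new_home["greentail/e.txt"] = "e1"          # created in a dedicated directory

# Next weekly refresh.
new_home = weekly_refresh(old_home, new_home)

for name in sorted(new_home):
    print(name, "->", new_home[name])
```

After the second refresh, the deleted file is still present on the new side, the edit to b.txt has been overwritten, and the files created only on greentail (including the one in the dedicated directory) are untouched and absent from the old side.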
  
-At some point in the future, greentail's new home dirs will be mounted on the petaltail/swallowtail and sharptail clusters.  Filer3's old home dirs will then disappear permanently.+At some point in the future, greentail's new home dirs will be mounted on the petaltail/swallowtail and sharptail clusters.  Filer3's old home dirs content will then disappear permanently.
  
-Greentail's new home dirs will provide 10 TB of disk space.  Again, the clusters file system should not be used to archive data. However, doubling the home directory size should provide much needed relief.+Greentail's new home dirs will provide 10 TB of disk space.  Again, the clusters file systems should not be used to archive data. Doubling the home directory size should provide much needed relief.
  
-Because of the size of the new home dirs, we will also not be able to provide backup via TSM (Tivoli).  Backup via TSM to our Virtual Tape Library (VTL) will be replaced with disk to disk backup on greentail's disk array.  That has some serious implications.  Please read the section about RSnapshot.+Because of the size of the new home dirs, we will also not be able to provide backup via TSM (Tivoli).  Backup via TSM to our Virtual Tape Library (VTL) will be replaced with disk to disk (D2D) backup on greentail's disk array.  That has some serious implications.  Please read the section about Rsnapshot.
  
  
Line 73: Line 77:
 ===== SSH Keys ===== ===== SSH Keys =====
  
-Within the directory **/home/username/.ssh** there is a file named **known_hosts**.  Within this file are host level public SSH keys.  Because your home directory contents are copied over to host greentail, you should be able to ssh from host petaltail or swallowtail to host greentail without a password prompt If not, your keys are not set up properly.+You can log in to host greentail directly (''ssh username@greentail.wesleyan.edu''). VPN required for off campus access.
  
-You can also log in to host greentail directly (''ssh username@greentail.wesleyan.edu'').  From host greentail should be able to to ssh to host petaltail or swallowtail without a password prompt. If not, your keys are not set up properly.+Within the directory **/home/username/.ssh** there is a file named **authorized_keys**.  Within this file are public SSH keys.  Because your home directory contents are copied over to host greentail, you should be able to ssh from host petaltail or swallowtail to host greentail without a password prompt.  If not, your keys are not set up properly.  You may need to add your ''id_rsa.pub'' content to this file.
  
- +Note: the software stack on host petaltail (administrative server) creates ssh keys for you automatically upon your first login, so for most of you this is all set.  To set up your private/public ssh keys manually:
-Note: the software stack on host petaltail/swallowtail created ssh keys for you automatically upon your first login, so for most of you this is all set.  To set up your private/public ssh keys:+
  
   * log into a host, then issue the command ''ssh-keygen -t rsa''   * log into a host, then issue the command ''ssh-keygen -t rsa''
Line 84: Line 87:
   * then copy the contents of /home/username/.ssh/id_rsa.pub into the file authorized_keys   * then copy the contents of /home/username/.ssh/id_rsa.pub into the file authorized_keys
   * you can have multiple public ssh key entries in this file   * you can have multiple public ssh key entries in this file
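
The manual step of copying a public key into authorized_keys can be sketched as follows. This is an illustration, not the method petaltail's software stack uses: the function name ''authorize_key'' is hypothetical, and setting the file mode to 600 is my assumption, based on sshd commonly refusing key files that other users can write to. It appends the key only if it is not already listed, so it is safe to run more than once.

```python
from pathlib import Path


def authorize_key(pub_key_file, authorized_keys_file):
    """Append a public key to authorized_keys unless it is already there.

    Returns True if the key was added, False if it was already present.
    """
    pub_key = Path(pub_key_file).read_text().strip()
    auth = Path(authorized_keys_file)
    current = auth.read_text() if auth.exists() else ""

    if pub_key in (line.strip() for line in current.splitlines()):
        return False

    with auth.open("a") as handle:
        # Keep one key per line even if the file lacks a trailing newline.
        if current and not current.endswith("\n"):
            handle.write("\n")
        handle.write(pub_key + "\n")

    # Assumption: restrict the file to its owner (mode 600).
    auth.chmod(0o600)
    return True
```

A typical call, after running ''ssh-keygen -t rsa'', would be ''authorize_key("/home/username/.ssh/id_rsa.pub", "/home/username/.ssh/authorized_keys")''.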
 +
 +The file **known_hosts** contains server level ssh keys.  This is necessary for MPI programs to log into compute nodes without a password prompt and submit your jobs.  That file has been prepped for you.
 +
  
  
Line 104: Line 110:
 Previously there were two scratch areas available to your programs: /localscratch which is roughly 50 GB on each node's local hard disk and /sanscratch a shared scratch area available to all nodes.  Sanscratch allows you to monitor your jobs progress by looking in /sanscratch/jobpid. It was also much larger (1 TB). Previously there were two scratch areas available to your programs: /localscratch which is roughly 50 GB on each node's local hard disk and /sanscratch a shared scratch area available to all nodes.  Sanscratch allows you to monitor your jobs progress by looking in /sanscratch/jobpid. It was also much larger (1 TB).
  
-However, since our fantastic crash of June 2008 ([[cluster:67|The catastrophic crash of June 08]] /snapshot was simply a directory inside /home and thus compete for disk space.+However, since our fantastic crash of June 2008 ([[cluster:67|The catastrophic crash of June 08]] page) /sanscratch was simply a directory inside /home and thus competes for disk space and IO.
  
-On greentail, /sanscratch will be a separate logical volume of 5 TB using a different disk set.  SO i urge those that have very large files to stage their files in /sanscratch when running their jobs for best performance.  The scheduler will always create (and delete!) two directories for you.  The JOBPID of your job is used to create /localscratch/jobpid and /sanscratch/jobpid.+On greentail, /sanscratch will be a separate logical volume of 5 TB using a different disk set.  So I urge those that have very large files, or generate lots of IO, to stage their files in /sanscratch/jobpid when running their jobs for best performance.  The scheduler will always create (and delete!) two directories for you.  The JOBPID of your job is used to create /localscratch/jobpid and /sanscratch/jobpid.
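
Staging files into the job's scratch directory might look like the sketch below. Several things here are assumptions rather than facts about greentail: that the scheduler exposes the job id in the environment variable ''LSB_JOBID'' (as LSF-style schedulers usually do), and the helper names ''job_scratch_dir'', ''stage_in'' and ''stage_out'', which are made up for this example. Because the scheduler deletes the scratch directory when the job ends, results must be copied back to the home directory before the job script exits.

```python
import os
import shutil
from pathlib import Path


def job_scratch_dir(root="/sanscratch", job_id=None):
    """Return the per-job scratch directory, e.g. /sanscratch/12345.

    Assumption: the job id is available in LSB_JOBID when not passed in.
    """
    job_id = job_id or os.environ.get("LSB_JOBID")
    if not job_id:
        raise RuntimeError("no job id given and LSB_JOBID is not set")
    return Path(root) / str(job_id)


def stage_in(files, scratch):
    """Copy input files into the scratch directory created by the scheduler."""
    return [Path(shutil.copy2(name, scratch)) for name in files]


def stage_out(scratch, names, destination):
    """Copy named result files back before the scheduler deletes scratch."""
    return [Path(shutil.copy2(Path(scratch) / name, destination)) for name in names]
```

Inside a job script the pattern would be: call ''stage_in'' with the large input files, run the program with the scratch directory as its working directory, then call ''stage_out'' with a destination such as /home/username/greentail.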
  
 ===== MPI ===== ===== MPI =====
Line 112: Line 118:
 For those of you running MPI or MPI enabled applications, you will need to make some changes to your scripts.  The ''wrapper'' program to use with greentail's Lava scheduler is the same as for cluster sharptail. It can be found here:  /share/apps/bin/lava.openmpi.mpirun.   If other flavors are desired, you can inform me or look at the example scripts lava.//mpi_flavor//.mpi[run|exec]. For those of you running MPI or MPI enabled applications, you will need to make some changes to your scripts.  The ''wrapper'' program to use with greentail's Lava scheduler is the same as for cluster sharptail. It can be found here:  /share/apps/bin/lava.openmpi.mpirun.   If other flavors are desired, you can inform me or look at the example scripts lava.//mpi_flavor//.mpi[run|exec].
  
-Sometime ago I wrote some code to detect if a node is infiniband enabled or not, and based on the result, add command line arguments to the mpirun invocation.  If you use that code, you will need to change:  the path to obtain the port status (/usr/bin/ibv_devinfo) and in the block specify the interface change eth1 to ib0.+Some time ago I wrote some code to detect whether a node is infiniband enabled, and based on the result, add command line arguments to the mpirun invocation.  If you use that code, you will need to change two things: the path used to obtain the port status (/usr/bin/ibv_devinfo), and, in the block specifying the interface, eth1 to ib0.
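
For readers who do not have that detection code, the general idea can be sketched as below. This is not the original code: the function names are hypothetical, and the check simply looks for a port reported as ''PORT_ACTIVE'' in the output of ibv_devinfo, which is how that tool normally reports a live port. Which mpirun arguments to add in each case depends on your MPI flavor and is deliberately left out.

```python
import subprocess

# Path given in the text for greentail.
IBV_DEVINFO = "/usr/bin/ibv_devinfo"


def query_devinfo(command=IBV_DEVINFO):
    """Run ibv_devinfo and return its output, or '' if it cannot be run."""
    try:
        result = subprocess.run(
            [command], capture_output=True, text=True, check=False
        )
    except OSError:
        # The tool is missing, so treat the node as having no infiniband.
        return ""
    return result.stdout


def infiniband_active(devinfo_output):
    """Return True if any port in the ibv_devinfo output is active."""
    return any(
        "state:" in line and "PORT_ACTIVE" in line
        for line in devinfo_output.splitlines()
    )


if __name__ == "__main__":
    if infiniband_active(query_devinfo()):
        print("infiniband port is active; use interface ib0")
    else:
        print("no active infiniband port found")
```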
 + 
 +===== Software ===== 
 + 
 +The same /share/apps software directory built when we were working on petaltail as a new administrative host has been copied to greentail.  Petaltail is redhat linux 5.1 while greentail is redhat linux 5.5.  I anticipate everything to work but we may encounter missing libraries (major/minor versions).
  
 ===== ... ===== ===== ... =====
cluster/93.1294760330.txt.gz · Last modified: 2011/01/11 10:38 by hmeij