\\
**[[cluster:0|Back]]**

''/home'' is defunct but remains for compatibility. It has been moved from sharptail to whitetail. New home directories are at ''/zfshomes''. Although quotas are in place (starting at 1 TB for new accounts) users typically get what they need. Static content should eventually be migrated to our Rstore platform.

 --- //[[hmeij@wesleyan.edu|Henk]] 2020/07/28 13:18//
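
A minimal way to check your usage against that quota, assuming your new home directory sits at ''/zfshomes/<username>'':

<code>

# total up your home directory (can take a while with many small files)
du -sh /zfshomes/$USER

</code>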
  
==== HomeDir & Storage Options ====
The HPCC cluster's file server ''sharptail.wesleyan.edu'' serves out all home directories to all nodes at location /home. It is 10 TB in size and the nightly backup process currently takes 1-2 hours to churn through it. Making it larger would thus generate more traffic on /home. So, for now while it works for us, we've come up with this policy:
  
  * All users are under a quota, which is automatically increased in 100 GB increments.
  * When a user consumes 1024 GB (1 TB) the automatic increases stop.
    * this home file system is backed up twice a month from sharptail's to greentail's disk array
    * nightly snapshots (point in time backups) are done on sharptail's disk array and stored there too
  
At this point users need to offload static content to other locations. Contents like old analyses, results of published papers, etc. Users typically have one local option available:
  
  * Keep contents out of /home and migrate them to /archives (7 TB, accessible on all "tail" nodes)
    * request a directory for yourself in this file system and move contents to it
    * this archive file system is backed up twice a month from sharptail's to greentail's disk array
  * Users with home directories of 500 GB in size should start considering moving data to /archives (see the quick check below)
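
To find candidates worth moving, a quick sketch that lists your largest top-level directories and shows how full /archives is (assuming your home directory still lives under ''/home''):

<code>

# list your top-level directories by size, largest last
du -sh /home/$USER/* | sort -h

# how much space is left on the archive file system?
df -h /archives

</code>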
  
Users who are considered inactive have their home directories relocated to /archives/inactive

  * these accounts are kept around until we do an account edit and purge (has never happened so far)

The remote storage option, if your storage needs cannot be supported by /archives, is off-cluster storage. Rstore is our latest storage solution for groups and labs with such needs.

  * ask your lead faculty member if your lab/group has such an area or request one
  * then move your static content permanently off the HPCC cluster environment
  * details can be found at [[cluster:135|RSTORE FAQ]]

==== Moving Content ====

Our file server is named ''sharptail.wesleyan.edu'' (or ''sharptail'' when on cluster) and it is a 4U integrated storage and server module with a 48 TB disk array. Moving content can severely cripple this server. **/home** is served out by this server to all nodes and if the server cannot handle all read/write requests everything comes to a halt. So when moving content please monitor the server and check whether others are currently doing something along these lines. Here are some tips.

Do not use any type of copy tool with a GUI, or cp/scp or s/ftp. GUI (drag & drop) tools especially are verboten! These tools are not smart enough and frequently generate blocked processes that halt everything. Use ''rsync'' in a Linux/Unix environment.

**Check it out:**

  * ''ssh sharptail.wesleyan.edu''
  * is the server busy? (''uptime'', load averages < 8 are ok)
  * is there memory available? (''free -m'', look at the free values)
  * is anybody else using rsync? (''ps -efl | grep rsync'')
  * is the server busy writing? (''iotop'', look at the M/s disk writes (q to quit); values > 100-200 M/s == busy!)
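
A minimal sketch bundling those checks into one sequence (run on sharptail; ''iotop'' typically needs root or sudo):

<code>

# pre-flight checks before starting a large transfer (run on sharptail)
uptime                       # load averages < 8 are ok
free -m                      # look at the free/available values
ps -efl | grep [r]sync       # is anybody else already running rsync?
iotop -b -o -n 1 | head -20  # one batch sample; writes > 100-200 M/s == busy

</code>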

Three scenarios are depicted below. When crossing the vertical boundaries you are not dealing with local content anymore, thus the content needs to flow over the network. ''rsync'' has many features; one of the most important is the use of a remote shell, allowing an elegant way to cross these boundaries.

<code>

                        |        /home          |    group share     |    some lab location
some lab location       |                       |                    |
                  <-----------> sharptail <-----------> Rstore <----------->
some other college      |                       |                    |
                        |        /archives      |    lab share       |    some other college

</code>

**Some feature examples**

  * preserve permissions, do a checksum between source/target files, observe what will happen
      * ''rsync -vac --dry-run''
  * delete files on destination not present on source (careful!)
      * ''rsync --delete''
  * throttle the rate of traffic generated, make your sysadmin happy, use < 5000
      * ''rsync --bwlimit=2500''
  * do not look inside files (copy whole files, skip the delta algorithm)
      * ''rsync --whole-file''
  * use a remote shell from host to host (crossing those vertical boundaries above)
      * ''rsync -vac /home/my/stuff/ user@somehost.wesleyan.edu:/home/my/stuff/''

Note the use of trailing slashes: it means update everything inside source ''stuff/'' within target ''stuff/''. If you leave the first trailing slash off the above command it means put the source directory ''stuff/'' inside the target directory ''stuff/'', meaning you'll end up with ''/home/my/stuff/stuff'' on the target. You've been warned. Use the dry run option if unsure what will happen, as shown below.
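
A dry run makes the difference visible before any data moves; a minimal sketch using the example paths and placeholder host from above:

<code>

# with the trailing slash: updates the contents of the target stuff/
rsync -vac --dry-run /home/my/stuff/ user@somehost.wesleyan.edu:/home/my/stuff/

# without it: creates stuff/ inside the target, ending up with /home/my/stuff/stuff
rsync -vac --dry-run /home/my/stuff  user@somehost.wesleyan.edu:/home/my/stuff/

</code>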

**Putting it all together**

<code>

# copy the dir stuff from lab or remote college to my home on HPCC in tmp area
# (first log in to remote location)

rsync -vac --bwlimit=2500 --whole-file /home/user/stuff user@sharptail.wesleyan.edu:/home/user/tmp/

# sync my HPCC dir stuff folder into /archives locally on sharptail, then clean up
# (first log in to sharptail)

rsync -vac --bwlimit=2500 /home/user/stuff/  /archives/user/stuff/
rm -rf /home/user/stuff/*

# generate a copy of content on Rstore disk array outside of HPCC but within wesleyan.edu
# (get paths and share names from faculty member, on sharptail do)

rsync -vac --bwlimit=2500 /home/user/stuff  user@rstoresrv0.wesleyan.edu:/data/2/labcontent/projects/

# you can also do this in reverse, log in to sharptail first

rsync -vac --bwlimit=2500 user@rstoresrv0.wesleyan.edu:/data/2/labcontent/projects/stuff  /home/user/

</code>
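
Before running the ''rm -rf'' cleanup above it is worth verifying the copy; a minimal sketch re-using the example paths, where an empty file list on a checksum dry run means source and target match:

<code>

# verify the archive copy; an empty file list means nothing differs
rsync -vac --dry-run /home/user/stuff/  /archives/user/stuff/

</code>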
  
\\