The HPCC cluster's file server
sharptail.wesleyan.edu serves all home directories to all nodes at /home. It is 10 TB in size, and the nightly backup currently takes 1-2 hours to churn through it. Making it larger would generate more traffic on /home. So for now, while it works for us, we have come up with this policy:
At this point users need to offload static content, such as old analyses and results of published papers, to other locations. Users typically have one local option available:
Users who are considered inactive have their home directories relocated to /archives/inactive
The remote storage option, if your storage needs cannot be supported by /archives, is off-cluster storage. Rstore is our latest storage solution for groups and labs with such needs.
Our file server is named sharptail.wesleyan.edu (sharptail when on cluster), and it is a 4U integrated storage and server module with a 48 TB disk array. Moving content can severely cripple this server. /home is served by this server to all nodes, and if the server cannot handle all read/write requests, everything comes to a halt. So when moving content, please monitor the server and check whether others are currently doing something similar. Here are some tips.
Do not use any copy tool with a GUI, or cp/scp or s/ftp. GUI (drag & drop) tools especially are verboten! These tools are not smart enough and frequently generate blocked processes that halt everything. Use rsync in a Linux/Unix environment.
Check it out:
uptime (loads < 8 are ok)
free -m (look at free values)
ps -efl | grep rsync
iotop (look at the M/s disk writes; q to quit; values > 100-200 M/s are bad)
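The load and process checks above can be folded into a small pre-flight script. This is a minimal sketch, not a site-provided tool: the function name `load_ok` is hypothetical, the threshold of 8 is taken from the tip above, and reading `/proc/loadavg` assumes a Linux host.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a load average is below a threshold.
# Usage: load_ok LOAD THRESHOLD
load_ok() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load < max) }'
}

# Assumption: Linux, where the first field of /proc/loadavg is the 1-minute load.
if [ -r /proc/loadavg ]; then
    read -r load1 rest < /proc/loadavg
    if load_ok "$load1" 8; then
        echo "load $load1 is below 8: ok to start"
    else
        echo "load $load1 is 8 or higher: wait before moving content"
    fi
fi

# Is someone else already running rsync? The [r] keeps grep from matching itself.
ps -efl | grep '[r]sync' || echo "no rsync processes found"
```

Run it on the file server before starting a transfer. It does not replace a look at free -m and iotop, which need a human eye on the numbers.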
Three scenarios are depicted below
[Diagram: <-----------> sharptail <-----------> Rstore <-----------> some other college, with vertical links from sharptail and Rstore to "some lab location" and "some other college" endpoints]
rsync -vac --dry-run
So, to put it all together: for example, to move a directory named stuff in my home directory elsewhere
rsync -vac --delete --bwlimit=2500 --dry-run /home/username/stuff rstore0:/data/2/somelabgroup/mydirecotory/
Is the output ok? Then run it again with the --dry-run option omitted.
Note that the source has no trailing slash but the destination does; this means: put the source directory inside the destination location. If both had a trailing slash, it would mean: make the contents of the destination match the contents of the source. Beware.
--delete may bite: it removes files at the destination that do not exist at the source.
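The trailing-slash rule and the effect of --delete are easy to try out safely on local throwaway directories before touching real data. The following is a sketch only; the sandbox layout and file names are made up for illustration, and it runs entirely under a temporary directory.

```shell
#!/bin/sh
# Throwaway sandbox; all paths below are hypothetical stand-ins.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home/stuff" "$sandbox/dest_a" "$sandbox/dest_b"
echo "data" > "$sandbox/home/stuff/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source: the directory itself lands inside
    # the destination (dest_a/stuff/file.txt).
    rsync -ac "$sandbox/home/stuff" "$sandbox/dest_a/"

    # Trailing slash on the source: only the directory's contents are
    # copied (dest_b/file.txt).
    rsync -ac "$sandbox/home/stuff/" "$sandbox/dest_b/"
    find "$sandbox/dest_a" "$sandbox/dest_b" -type f

    # A file that exists only at the destination is what --delete removes.
    # The dry run lists it as "deleting extra.txt" without touching it.
    echo "old" > "$sandbox/dest_b/extra.txt"
    rsync -vac --delete --dry-run "$sandbox/home/stuff/" "$sandbox/dest_b/"
else
    echo "rsync not installed; skipping demo"
fi
```

Remove the sandbox with rm -rf "$sandbox" when done.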
Once the contents have been migrated, remove the original:
rm -rf /home/username/stuff
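Because rm -rf cannot be undone, it may be worth confirming the copy first. The sketch below is one possible approach, not part of the policy: the helper names `in_sync` and `remove_if_synced` are hypothetical, and the check relies on rsync's --itemize-changes printing nothing when a checksum dry run finds no differences.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when a checksum dry run reports no differences.
# Usage: in_sync SOURCE DESTINATION
in_sync() {
    # --itemize-changes prints one line per item that would change;
    # empty output means the destination already matches the source.
    changes=$(rsync -ac --dry-run --itemize-changes "$1" "$2") || return 1
    [ -z "$changes" ]
}

# Hypothetical helper: delete SOURCE only after in_sync confirms the copy.
# Usage: remove_if_synced SOURCE DESTINATION
remove_if_synced() {
    if in_sync "$1" "$2"; then
        rm -rf -- "$1"
    else
        echo "differences remain; not removing $1" >&2
        return 1
    fi
}
```

With the example above, the call would be remove_if_synced /home/username/stuff rstore0:/data/2/somelabgroup/mydirecotory/. Keep in mind that the checksum pass rereads every file on both sides, so it adds load of its own; check the server first.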