Sharptail Cluster

The new hardware has been delivered, racked, and stacked. The first priority was to look around while /home was copied from greentail:/home. This cluster consists of one head node (sharptail) and 5 compute nodes (n33-n37). The head node has a 48 TB disk array and 128 GB of memory. The nodes each have 256 GB of memory and dual 8-core chips. With so much memory, hyperthreading has been turned on, doubling the number of cores the operating system recognizes (so 32 cores per node). Each node also contains 4 GPUs. The entire cluster provides 1.3+ TB of memory and 20+ teraflops of computational power. That's almost 7x what we currently have. All these resources will be made available via the Lava scheduler later on.
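The totals quoted above can be checked with a little shell arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the 1.3+ TB figure counts the head node's 128 GB along with the five compute nodes.

```shell
#!/bin/bash
# Figures taken from the paragraph above; variable names are illustrative.
nodes=5
node_mem_gb=256
head_mem_gb=128
sockets_per_node=2
cores_per_socket=8
gpus_per_node=4

# Cores the operating system sees on one node with hyperthreading on.
physical_cores=$(( sockets_per_node * cores_per_socket ))
logical_cores=$(( physical_cores * 2 ))

# Cluster-wide totals (assumption: memory total includes the head node).
total_logical_cores=$(( nodes * logical_cores ))
total_gpus=$(( nodes * gpus_per_node ))
total_mem_gb=$(( nodes * node_mem_gb + head_mem_gb ))

echo "logical cores per node: $logical_cores"
echo "logical cores on all nodes: $total_logical_cores"
echo "GPUs on all nodes: $total_gpus"
echo "total memory in GB: $total_mem_gb"
```

At 1024 GB per TB, 1408 GB works out to 1.375 TB, which lines up with the 1.3+ TB quoted above.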

What is a GPU-HPC cluster

Recess Period

July and August 2013 I'll call the “Recess! Stay&Play” period. Due to vacation days and all that, final configuration will not be achieved till later this summer. So sharptail is open for ssh access during this time. You may run whatever you want directly on these nodes. There is no scheduler.

ssh sharptail.wesleyan.edu
- then ssh to one of the nodes
- set up your environment as in a submit script, then run your program
Reboots may happen. I'll try to warn folks when.
Shell access will disappear in final production mode! (use greentail or swallowtail)

/home

Sharptail is slated to become our file server for /home, taking over from greentail. The cutover will be the last step before going into production. Meanwhile, the first sync from greentail to sharptail is about to finish, but refreshes will happen. When a refresh happens:

Files that are created on greentail are pushed to sharptail
Files that disappeared on greentail also disappear on sharptail
Files that were created on sharptail (and do not exist on greentail) disappear

So if you want to keep anything you create on sharptail, you need to copy it to greentail before a refresh happens. I suggest you create a ~/sharptail directory on sharptail and work inside of that. You can transfer files like so:

cp -rp /home/username/sharptail /mnt/greentail_home/username/
scp -rp ~/sharptail greentail:~
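To see what would be lost in a refresh, something like the sketch below could list the files under ~/sharptail that have not yet made it to the greentail copy. The function name `list_unsaved` is hypothetical, and it assumes greentail's home is reachable under /mnt/greentail_home as in the cp example above.

```shell
#!/bin/bash
# list_unsaved SRC DST
# Print every regular file under SRC that is missing from DST or newer
# than the copy in DST. Hypothetical helper, not an installed tool.
list_unsaved() {
    local src=$1 dst=$2 f
    # Walk SRC with relative paths so they can be matched under DST.
    ( cd "$src" && find . -type f ) | while IFS= read -r f; do
        f=${f#./}
        # -nt is true when the DST copy is older or does not exist.
        if [ "$src/$f" -nt "$dst/$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `list_unsaved ~/sharptail /mnt/greentail_home/username/sharptail` prints nothing once everything has been copied with `cp -rp`, since `-p` preserves the timestamps.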

So in short, in the future sharptail:/home will be our active file system, while greentail:/home will become, say, greentail:/home_backup (inactive). They will be kept in sync and rsnapshot'ed on both disk arrays so we have a better backup/restore strategy.

/sanscratch

Sharptail will provide the users (and scheduler) with another 5 TB scratch file system. During this period it is only provided to the sharptail nodes (n33-n37). In the future it will be provided to all nodes except the greentail nodes (n1-n32).

Please offload as much IO as possible from /home by staging your jobs in /sanscratch.
For an example, see SAS and read the submit2 section.
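Staging a job by hand could look roughly like the sketch below: copy the input to a private scratch directory, run there, and copy everything back to /home in one pass. The function name `stage_and_run` and its arguments are hypothetical, and this is not the submit2 script itself.

```shell
#!/bin/bash
# stage_and_run SCRATCH_ROOT INPUT_DIR RESULT_DIR COMMAND [ARGS...]
# Hypothetical helper: run COMMAND inside a private directory under
# SCRATCH_ROOT so the heavy IO stays off /home.
stage_and_run() {
    local scratch_root=$1 input_dir=$2 result_dir=$3
    shift 3
    local workdir status=0
    # Uniquely named working directory on the scratch file system.
    workdir=$(mktemp -d "$scratch_root/job.XXXXXX") || return 1
    # Stage the input files in, preserving timestamps and permissions.
    cp -rp "$input_dir"/. "$workdir"/ || return 1
    # Run the program with the scratch directory as its working directory.
    ( cd "$workdir" && "$@" ) || status=$?
    # Copy everything back, then clean up scratch. If the copy fails,
    # the scratch directory is left in place for inspection.
    mkdir -p "$result_dir" && cp -rp "$workdir"/. "$result_dir"/ \
        && rm -rf "$workdir" || return 1
    # Report the exit status of the program itself.
    return "$status"
}
```

A call might look like `stage_and_run /sanscratch ~/sharptail/input ~/sharptail/output ./my_program`, where the directories and `my_program` (staged in with the input) are placeholders.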

GPU code

LAMMPS, Amber and NAMD have been compiled using Nvidia's toolkit. They are located in /cm/share/apps.

Module files have been created for these apps and are automatically loaded upon login. For example:

[hmeij@sharptail ~]$ module list
Currently Loaded Modulefiles:
  1) cuda50/toolkit/5.0.35              3) namd/ibverbs-smp-cuda/2013-06-02   5) lammps/cuda/2013-01-27
  2) mvapich2/gcc/64/1.6                4) amber/gpu/13

The GPU testing done at vendor sites may help you get an idea of how to run code.

Lammps GPU Testing

Amber GPU Testing
