The new hardware has been delivered, racked, and stacked. First priority was looking around while /home was copied from greentail:/home. This cluster consists of one head node (sharptail) and 5 compute nodes (n33-n37). The head node has a 48 TB disk array and 128 GB of memory. Each compute node has 256 GB of memory and dual 8-core chips. With so much memory, hyperthreading has been turned on, doubling the number of cores the operating system recognizes (so 32 cores per node). Each node also contains 4 GPUs. The entire cluster provides 1.3+ TB of memory and 20+ teraflops of computational power, almost 7x what we currently have. All these resources will be made available via the Lava scheduler later on.
July and August 2013 I'll call the “Recess! Stay&Play” period. Due to vacation days and all that, final configuration will not be achieved until later this summer. So sharptail is open for ssh access during this time. You may run whatever you want directly on these nodes; there is no scheduler.
Sharptail is slated to become our file server for /home, taking over from greentail. The cutover will be the last step before going into production. Meanwhile, the first sync from greentail to sharptail is about to finish, but refreshes will happen. When a refresh happens, anything on sharptail:/home that is not also on greentail:/home will be lost.
So it's important that if you want to keep stuff created on sharptail, you copy it back to greentail before a refresh happens. I suggest you create a ~/sharptail directory on sharptail and work inside of that. You can transfer files like so:
So in short: in the future, sharptail:/home will be our active file system, while greentail:/home becomes, say, greentail:/home_backup (inactive). They will be kept in sync and rsnapshot'ed on both disk arrays so we have a better backup/restore strategy.
Sharptail will also provide users (and the scheduler) with another 5 TB scratch file system. During this period it is only provided to the sharptail nodes (n33-n37). In the future it will be provided to all nodes except the greentail nodes (n1-n32).
With hyperthreading on, the 5 nodes provide 160 cores. We need to reserve 20 cores for the GPUs (one per GPU, 4 per node), and let's reserve another 20 cores for the OS (4 per node). That still leaves 120 cores for regular jobs like you are used to on greentail. These 120 cores will show up later as a new queue, one that is fit for jobs that need a lot of memory: 256 GB per node minus 20 GB for the 4 GPUs minus 20 GB for the OS leaves 216 GB per node, or roughly 9 GB per job core.
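A quick back-of-the-envelope check of these core and memory numbers (all values taken from this page):

```shell
# Core and memory budget for the sharptail nodes.
nodes=5
cores_per_node=32                                   # dual 8-core chips, doubled by hyperthreading
total_cores=$(( nodes * cores_per_node ))           # 160
gpu_cores=$(( nodes * 4 ))                          # one core reserved per GPU
os_cores=$(( nodes * 4 ))                           # 20 cores reserved for the OS
job_cores=$(( total_cores - gpu_cores - os_cores ))

job_mem=$(( 256 - 20 - 20 ))                        # GB per node after GPU/OS reservations
echo "$job_cores cores, $(( nodes * job_mem / job_cores )) GB per core"
```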
Since there is no scheduler, you need to set up your environment and execute your program yourself. Here is an example of a program that normally runs on the imw queue. If your program involves MPI, you need to be a bit up to speed on what the Lava wrapper normally does for you.
First create the machine file, set up your environment by sourcing the appropriate files, launch your program, and monitor the parallel jobs starting using top.
[hmeij@sharptail cd]$ cat mpi_machines
n33
n33
n33
n33
n34
n34
n34
n34
[hmeij@sharptail cd]$ . /share/apps/intel/cce/10.0.025/bin/iccvars.sh
[hmeij@sharptail cd]$ . /share/apps/intel/fce/10.0.025/bin/ifortvars.sh
[hmeij@sharptail cd]$ time /home/apps/openmpi/1.2+intel-10/bin/mpirun \
  -x LD_LIBRARY_PATH -machinefile ./mpi_machines \
  /share/apps/amber/9+openmpi-1.2+intel-9/exe/pmemd \
  -O -i inp/mini.in -p 1g6r.cd.parm -c 1g6r.cd.randions.crd.1 -ref 1g6r.cd.randions.crd.1 &
[1] 3304

[hmeij@sharptail cd]$ ssh n33 top -b -n1 -u hmeij
top - 14:49:28 up 1 day, 5:24, 1 user, load average: 0.89, 0.20, 0.06
Tasks: 769 total, 5 running, 764 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264635888k total, 2364236k used, 262271652k free, 44716k buffers
Swap: 31999992k total, 0k used, 31999992k free, 382224k cached

  PID USER  PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
24348 hmeij 20  0  334m  58m 6028 R 100.0  0.0 0:17.23 pmemd
24345 hmeij 20  0  307m  44m 8816 R 100.0  0.0 0:17.20 pmemd
24346 hmeij 20  0  310m  42m 8824 R  98.3  0.0 0:17.22 pmemd
24347 hmeij 20  0  318m  48m 8004 R  98.3  0.0 0:17.19 pmemd
24353 hmeij 20  0 15552 1636  832 R   1.9  0.0 0:00.03 top
24344 hmeij 20  0 86828 2324 1704 S   0.0  0.0 0:00.01 orted
24352 hmeij 20  0  107m 1864  860 S   0.0  0.0 0:00.00 sshd

[hmeij@sharptail cd]$ ssh n34 top -b -n1 -u hmeij
top - 14:49:37 up 1 day, 2:40, 0 users, load average: 1.89, 0.47, 0.16
Tasks: 766 total, 5 running, 761 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264635888k total, 2310176k used, 262325712k free, 29788k buffers
Swap: 31999992k total, 0k used, 31999992k free, 359596k cached

  PID USER  PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
12198 hmeij 20  0  334m  61m 5328 R  99.8  0.0 0:25.88 pmemd
12200 hmeij 20  0  302m  33m 5368 R  99.8  0.0 0:25.88 pmemd
12201 hmeij 20  0  310m  40m 5352 R  99.8  0.0 0:25.88 pmemd
12199 hmeij 20  0  310m  39m 5372 R  97.8  0.0 0:25.87 pmemd
12205 hmeij 20  0 15552 1636  832 R   3.8  0.0 0:00.04 top
12197 hmeij 20  0 86828 2324 1704 S   0.0  0.0 0:00.01 orted
12204 hmeij 20  0  107m 1864  860 S   0.0  0.0 0:00.00 sshd
LAMMPS, Amber and NAMD have been compiled using Nvidia's toolkit. They are located in /cm/share/apps.
Module files have been created for these applications and are automatically loaded upon login. For example:
[hmeij@sharptail ~]$ module list
Currently Loaded Modulefiles:
  1) cuda50/toolkit/5.0.35   3) namd/ibverbs-smp-cuda/2013-06-02   5) lammps/cuda/2013-01-27
  2) mvapich2/gcc/64/1.6     4) amber/gpu/13
The GPU testing done at vendor sites may help give you an idea of how to run code on these GPUs.