Scratch Spaces

We have several locations for scratch space, some local to the nodes and some mounted across the network. Here is the current setup as of August 2015; a quick way to check what is mounted on a given node is sketched after the list.

  • /localscratch
    • Local to each node; sizes vary, roughly 50-80 GB
    • Warning: nodes n46-n59 have no hard disk, only a SATA DOM (a small flash module plugged directly into the system board, 16 GB in size, holding just the OS). Do not use /localscratch on these nodes.
  • /sanscratch
    • Two 5 TB file systems mounted via NFS over IPoIB
      • One from greentail's disk array for nodes n1-n32 and b0-b50
      • One from sharptail's disk array for all other nodes
  • /localscratch5tb
    • 5 TB file system provided by local drives (3x2 TB, RAID 0) on nodes in the mw256fd queue
    • Nodes converted so far: n38 n39 n40 n41 n43 n44 n45
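
A quick way to see which of these scratch areas exist on the node you are logged into, and how full they are, is a plain df call; this is just a convenience check, nothing a job script needs:

# report size and free space of the scratch areas on this node
# (a path that is not present here simply produces a "No such file or directory" note)
df -h /localscratch /sanscratch /localscratch5tb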

48 TB of local scratch space will be made available in 6 TB chunks on the nodes in the mw256fd queue. Each chunk yields roughly 5 TB of usable local scratch space per node using RAID 0 and an ext4 file system, mounted at /localscratch5tb. Everybody may use this space, but it has been put in place specifically for Gaussian jobs that generate massive RWF files (application scratch files).
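
The number of job slots requested from the scheduler has to match the processor count declared inside the Gaussian input (see the sample script below). A minimal sketch of such an input for a 4-slot job, written here as a shell here-document so it can be pasted on the command line: the memory value, method, and water geometry are illustrative placeholders, not site recommendations. With GAUSS_SCRDIR pointing at the job's scratch directory, the RWF files land there automatically.

# write a minimal 4-slot Gaussian input; all values below are illustrative placeholders
cat > gaussian.com <<'EOF'
%NProcShared=4
%Mem=16GB
%Chk=gaussian.chk
#P B3LYP/6-31G(d) Opt

water test case

0 1
O   0.000000   0.000000   0.000000
H   0.000000   0.757000   0.587000
H   0.000000  -0.757000   0.587000

EOF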

Note: Everybody is welcome to store content in /localscratch5tb/username/ permanently for easy job access to large data files, unless it interferes with jobs. However, be warned that a) it is local storage, b) it is RAID 0 (one disk failure and all data is lost), c) like /tmp it has read and write permissions for all (do chmod go-rwx /localscratch5tb/username for some protection), and d) this file system is not backed up. In addition, /sanscratch/username/ will also be allowed.
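
To create and protect such a directory on a node that mounts /localscratch5tb, a minimal sketch (using $USER for your own login name) is:

# create a personal staging area and make it unreadable to others
mkdir -p /localscratch5tb/$USER
chmod go-rwx /localscratch5tb/$USER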

You need to change your working directory to the location the scheduler has created for you, and save your output before the job terminates, because the scheduler will remove that working directory when the job ends. Here is the workflow…

#!/bin/bash
# submit like so: bsub < run.forked

# if writing large checkpoint files, uncomment the next lines to lower
# this job's I/O priority (the second line just reports the new setting)
#ionice -c 2 -n 7 -p $$
#ionice -p $$

#BSUB -q mw256fd
#BSUB -o out
#BSUB -e err
#BSUB -J test

# job slots: must match %NProcShared inside gaussian.com
#BSUB -n 4
# force all onto one host (shared code and data stack)
#BSUB -R "span[hosts=1]"

# unique job scratch dirs
MYSANSCRATCH=/sanscratch/$LSB_JOBID
MYLOCALSCRATCH=/localscratch/$LSB_JOBID
MYLOCALSCRATCH5TB=/localscratch5tb/$LSB_JOBID
export MYSANSCRATCH MYLOCALSCRATCH MYLOCALSCRATCH5TB

# cd to the job's scratch directory on the execution host
cd $MYLOCALSCRATCH5TB
pwd

# environment
export GAUSS_SCRDIR="$MYLOCALSCRATCH5TB"

export g09root="/share/apps/gaussian/g09root"
. $g09root/g09/bsd/g09.profile

#export gdvroot="/share/apps/gaussian/gdvh11"
#. $gdvroot/gdv/bsd/gdv.profile

# stage input data to localscratch5tb
cp ~/jobs/forked/gaussian.com .
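# create an empty log file up front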
touch gaussian.log

# run plain vanilla
g09 < gaussian.com > gaussian.log

# run dev
#gdv < gaussian.com > gaussian.log

# save results back to homedir !!!
cp gaussian.log ~/jobs/forked/output.$LSB_JOBID
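
Submitting and keeping an eye on the job then looks roughly like this; <jobid> stands for the id that bsub reports, and the scheduler's out and err files land in the directory you submitted from:

bsub < run.forked      # submit the script to LSF
bjobs                  # list your pending and running jobs
bpeek <jobid>          # peek at the job's output while it runs
ls ~/jobs/forked/      # after completion the Gaussian log is saved here as output.<jobid>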

