cluster:103
Differences
This shows you the differences between two versions of the page.
| cluster:103 [2011/12/21 15:11] – [SAS] hmeij | cluster:103 [2011/12/22 19:34] (current) – [Submit 2] hmeij | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| ==== SAS ==== | ==== SAS ==== | ||
| - | SAS, the statistical software, and much more, frequently used in the social sciences, is available on the High Performance Academic Computing Cluster. | + | SAS, the statistical |
| SAS is typically invoked in batch mode by submitting a script (*.sas text file). | SAS is typically invoked in batch mode by submitting a script (*.sas text file). | ||
| - | SAS can be invoked in interactive mode on the head node for debugging and code development if needed. | + | SAS can be invoked in interactive mode on the head node for debugging and code development, if needed. |
| * at sas.com [[http:// | * at sas.com [[http:// | ||
| * SAS/ODS examples [[http:// | * SAS/ODS examples [[http:// | ||
| * SAS/ | * SAS/ | ||
| - | * | + | |
| + | A tutor application is available at [[http:// | ||
| + | |||
| + | ==== Program ==== | ||
| + | |||
| + | So let's generate a little SAS program using a Unix editor like vi/vim, emacs or pico. | ||
| + | |||
| + | * First we generate the input data file '' | ||
| + | |||
| + | < | ||
| + | 1234567890 | ||
| + | 0987654321 | ||
| + | 2468097531 | ||
| + | </ | ||
| + | |||
| + | * Next a simple SAS file '' | ||
| + | |||
| + | < | ||
| + | options nocenter; | ||
| + | filename test ' | ||
| + | |||
| + | data one; | ||
| + | infile test; | ||
| + | input @2 x 3.1 @6 y 3.1; | ||
| + | total = x * y; | ||
| + | run; | ||
| + | |||
| + | proc print; run; | ||
| + | </ | ||
| + | |||
| + | * Let's test it by running it directly on the head node | ||
| + | |||
| + | < | ||
| + | [root@greentail sas]# ll | ||
| + | total 8 | ||
| + | -rw-r--r-- 1 root root 33 Dec 21 10:16 test.dat | ||
| + | -rw-r--r-- 1 root root 140 Dec 21 10:22 test.sas | ||
| + | [root@greentail sas]# sas test | ||
| + | [root@greentail sas]# cat test.lst | ||
| + | The SAS System | ||
| + | |||
| + | Obs x | ||
| + | |||
| + | | ||
| + | | ||
| + | | ||
| + | |||
| + | </ | ||
| + | |||
| + | ==== Submit ==== | ||
| + | |||
| + | Ok, so now we have a program that works. | ||
| + | |||
| + | * Create a shell script for example with the name '' | ||
| + | * Set execute permissions '' | ||
| + | * Submit (see below) | ||
| + | |||
| + | < | ||
| + | |||
| + | # | ||
| + | # submit via 'bsub < run' | ||
| + | |||
| + | #BSUB -q hp12 | ||
| + | #BSUB -J test | ||
| + | #BSUB -o stdout | ||
| + | #BSUB -e stderr | ||
| + | |||
| + | time sas test | ||
| + | |||
| + | </ | ||
| + | |||
| + | The leading '#' | ||
| + | |||
| + | < | ||
| + | |||
| + | [hmeij@greentail sas]$ bsub < run | ||
| + | Job < | ||
| + | |||
| + | [hmeij@greentail sas]$ bjobs | ||
| + | JOBID | ||
| + | 492637 | ||
| + | |||
| + | [hmeij@greentail sas]$ bqueues | ||
| + | QUEUE_NAME | ||
| + | hp12 | ||
| + | matlab | ||
| + | stata 50 | ||
| + | elw 50 | ||
| + | emw 50 | ||
| + | ehw 50 | ||
| + | ehwfd 50 | ||
| + | imw 50 | ||
| + | bss24 50 | ||
| + | |||
| + | [hmeij@greentail sas]$ bjobs | ||
| + | JOBID | ||
| + | 492637 | ||
| + | [hmeij@greentail sas]$ bjobs | ||
| + | No unfinished job found | ||
| + | |||
| + | [hmeij@greentail sas]$ ll | ||
| + | total 28 | ||
| + | -rwxr--r-- 1 hmeij its 115 Dec 21 10:48 run | ||
| + | -rw-r--r-- 1 hmeij its 42 Dec 21 10:49 stderr | ||
| + | -rw-r--r-- 1 hmeij its 838 Dec 21 10:49 stdout | ||
| + | -rw-r--r-- 1 hmeij its 33 Dec 21 10:16 test.dat | ||
| + | -rw-r--r-- 1 hmeij its 2565 Dec 21 10:49 test.log | ||
| + | -rw-r--r-- 1 hmeij its 258 Dec 21 10:49 test.lst | ||
| + | -rw-r--r-- 1 hmeij its 140 Dec 21 10:22 test.sas | ||
| + | </ | ||
| + | |||
| + | And so the job was dispatched to host '' | ||
| + | |||
| + | The '' | ||
| + | |||
| + | ==== Submit 2 ==== | ||
| + | |||
| + | On the back end compute nodes, unless specified, the job runs inside your home directory. | ||
| + | |||
| + | In the SAS program we add the following lines | ||
| + | |||
| + | < | ||
| + | %let jobpid = %sysget(LSB_JOBID); | ||
| + | libname here "/ | ||
| + | </ | ||
| + | |||
| + | And change this line to use local disks for storage | ||
| + | |||
| + | < | ||
| + | data here.one; | ||
| + | </ | ||
| + | |||
| + | In the submission script we change the following | ||
| + | |||
| + | * new submission file with edits | ||
| + | * -n reserves job slots (CPU cores) for the job (not necessary, since SAS jobs always use only one) | ||
| + | * -R reserves memory, for example, reserve 200 MB of memory on target compute node | ||
| + | * scheduler creates unique dirs in scratch by JOBPID for you, so we'll stage the job there | ||
| + | * but now we must copy relevant files //to// scratch dir and results back //to// home dir | ||
| + | |||
| + | |||
| + | < | ||
| + | |||
| + | # | ||
| + | # submit via 'bsub < run' | ||
| + | |||
| + | #BSUB -q hp12 | ||
| + | #BSUB -J test | ||
| + | #BSUB -o stdout | ||
| + | #BSUB -e stderr | ||
| + | #BSUB -n 1 | ||
| + | #BSUB -R " | ||
| + | |||
| + | # unique job dir in scratch | ||
| + | export MYSANSCRATCH=/ | ||
| + | cd $MYSANSCRATCH | ||
| + | |||
| + | cp ~/ | ||
| + | time sas test | ||
| + | cp test.log test.lst ~/sas | ||
| + | |||
| + | </ | ||
| + | |||
| + | |||
| + | * you can monitor the progress of your job from greentail while it runs | ||
| + | |||
| + | < | ||
| + | [hmeij@greentail sas]$ ll / | ||
| + | total 16 | ||
| + | -rw-r--r-- 1 hmeij its 33 Dec 21 14:31 test.dat | ||
| + | -rw-r--r-- 1 hmeij its 2568 Dec 21 14:31 test.log | ||
| + | -rw-r--r-- 1 hmeij its 258 Dec 21 14:31 test.lst | ||
| + | -rw-r--r-- 1 hmeij its 140 Dec 21 14:31 test.sas | ||
| + | </ | ||
| + | |||
| + | ==== Best Practices ==== | ||
| + | |||
| + | * You may submit as many SAS jobs as you like; just leave enough resources available for others to also get work done | ||
| + | * Because SAS submissions are serial, non-parallel jobs, your -n flag is always 1 | ||
| + | * Reserve resources if you know what you need, especially memory | ||
| + | * Use /sanscratch for large data jobs with heavy read/write operations | ||
| + | * Queue ehwfd is preferentially for Gaussian users; please also stay off the stata and matlab queues | ||
| + | * Write smart SAS code, for example, use data set indexes and PROC SQL (this can be your best friend) | ||
| + | | ||
| \\ | \\ | ||
| **[[cluster: | **[[cluster: | ||
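
The staging steps described under "Submit 2" (copy the program and data to a per-job scratch directory, run there, copy the log and listing back) can be sketched as a small shell function. This is a minimal sketch, not the cluster's actual submission script: the function name `stage_and_run` and its arguments are hypothetical, and the scratch root is passed in as a parameter because the real path is cut off in the revision shown above.

```shell
#!/bin/bash
# Sketch only: stage_and_run and all of its arguments are hypothetical names.
# usage: stage_and_run SRC_DIR SCRATCH_ROOT JOB_ID COMMAND [ARGS...]
stage_and_run() {
    local src_dir=$1 scratch_root=$2 job_id=$3
    shift 3
    local work_dir="$scratch_root/$job_id"
    local status=0

    # Use the per-job directory; create it if the scheduler has not already.
    mkdir -p "$work_dir" || return 1

    # Stage the program and its input data into the job directory.
    cp "$src_dir"/*.sas "$src_dir"/*.dat "$work_dir"/ || return 1

    # Run the command from inside the job directory; the subshell keeps
    # the caller's working directory unchanged.
    ( cd "$work_dir" && "$@" ) || status=$?

    # Copy log and listing files back even after a failed run,
    # because the log is what you need to debug it.
    local f
    for f in "$work_dir"/*.log "$work_dir"/*.lst; do
        [ -e "$f" ] && cp "$f" "$src_dir"/
    done
    return "$status"
}
```

Inside a job script this might be called as `stage_and_run ~/sas "$SCRATCH_ROOT" "$LSB_JOBID" sas test`, where `SCRATCH_ROOT` is an assumed variable that you would set to your cluster's scratch location (the best practices above mention /sanscratch). Returning the command's exit status lets the scheduler see a failed SAS run as a failed job.
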
cluster/103.1324480305.txt.gz · Last modified: by hmeij
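
To see what the column-based `input @2 x 3.1 @6 y 3.1;` statement in the sample program picks out of each data line, the same slicing can be approximated outside SAS. The helper below is a hypothetical sketch, not a SAS feature. It assumes digit-only fields, for which the `3.1` informat reads three columns and implies one decimal place, and it prints x, y and their product for each line.

```shell
#!/bin/bash
# Sketch only: preview_columns is a hypothetical helper, not part of SAS.
# It mimics "input @2 x 3.1 @6 y 3.1; total = x * y;" for digit-only fields.
preview_columns() {
    awk '{
        x = substr($0, 2, 3) / 10   # columns 2-4, one implied decimal place
        y = substr($0, 6, 3) / 10   # columns 6-8, one implied decimal place
        printf "%.1f %.1f %.2f\n", x, y, x * y
    }' "$1"
}
```

Running `preview_columns test.dat` on the three-line data file from the Program section prints `23.4 67.8 1586.52` for the first line, `1234567890`. Fields that already contain a decimal point are handled differently by SAS, so treat this only as a quick check of column positions.
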
