This shows you the differences between two versions of the page.
cluster:103 [2011/12/21 15:25] hmeij [SAS] |
cluster:103 [2011/12/22 19:34] (current) hmeij [Submit 2] |
||
---|---|---|---|
Line 6: | Line 6: | ||
==== SAS ==== | ==== SAS ==== | ||
- | SAS, the statistical software, and much more, frequently used in the social sciences, is available on the High Performance Academic Computing Cluster. | + | SAS, the statistical |
SAS is typically invoked in batch mode by submitting a script (*.sas text file). | SAS is typically invoked in batch mode by submitting a script (*.sas text file). | ||
- | SAS can be invoked in interactive mode on the head node for debugging and code development if needed. | + | SAS can be invoked in interactive mode on the head node for debugging and code development, if needed. |
* at sas.com [[http:// | * at sas.com [[http:// | ||
Line 16: | Line 16: | ||
* SAS/ | * SAS/ | ||
- | * A tutor is available at [[http:// | + | A tutor application |
==== Program ==== | ==== Program ==== | ||
Line 63: | Line 63: | ||
</ | </ | ||
+ | |||
+ | ==== Submit ==== | ||
+ | |||
+ | Ok, so now we have a program that works. | ||
+ | |||
+ | * Create a shell script for example with the name '' | ||
+ | * Set execute permissions '' | ||
+ | * Submit (see below) | ||
+ | |||
+ | < | ||
+ | |||
+ | #!/bin/bash | ||
+ | # submit via 'bsub < run' | ||
+ | |||
+ | #BSUB -q hp12 | ||
+ | #BSUB -J test | ||
+ | #BSUB -o stdout | ||
+ | #BSUB -e stderr | ||
+ | |||
+ | time sas test | ||
+ | |||
+ | </ | ||
+ | |||
+ | The leading '#' | ||
+ | |||
+ | < | ||
+ | |||
+ | [hmeij@greentail sas]$ bsub < run | ||
+ | Job < | ||
+ | |||
+ | [hmeij@greentail sas]$ bjobs | ||
+ | JOBID | ||
+ | 492637 | ||
+ | |||
+ | [hmeij@greentail sas]$ bqueues | ||
+ | QUEUE_NAME | ||
+ | hp12 | ||
+ | matlab | ||
+ | stata 50 | ||
+ | elw 50 | ||
+ | emw 50 | ||
+ | ehw 50 | ||
+ | ehwfd 50 | ||
+ | imw 50 | ||
+ | bss24 50 | ||
+ | |||
+ | [hmeij@greentail sas]$ bjobs | ||
+ | JOBID | ||
+ | 492637 | ||
+ | [hmeij@greentail sas]$ bjobs | ||
+ | No unfinished job found | ||
+ | |||
+ | [hmeij@greentail sas]$ ll | ||
+ | total 28 | ||
+ | -rwxr--r-- 1 hmeij its 115 Dec 21 10:48 run | ||
+ | -rw-r--r-- 1 hmeij its 42 Dec 21 10:49 stderr | ||
+ | -rw-r--r-- 1 hmeij its 838 Dec 21 10:49 stdout | ||
+ | -rw-r--r-- 1 hmeij its 33 Dec 21 10:16 test.dat | ||
+ | -rw-r--r-- 1 hmeij its 2565 Dec 21 10:49 test.log | ||
+ | -rw-r--r-- 1 hmeij its 258 Dec 21 10:49 test.lst | ||
+ | -rw-r--r-- 1 hmeij its 140 Dec 21 10:22 test.sas | ||
+ | </ | ||
+ | |||
+ | And so the job was dispatched to host '' | ||
+ | |||
+ | The '' | ||
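When scripting around the submit step, it can help to capture the job ID that ''bsub'' reports so it can be passed to ''bjobs''. The following is a minimal sketch, assuming the usual LSF confirmation message of the form ''Job <ID> is submitted to queue <NAME>.''; the function name ''extract_job_id'' is hypothetical and not part of LSF.

```shell
#!/bin/bash
# Sketch: pull the numeric job ID out of bsub's confirmation line.
# Assumption: the message looks like "Job <ID> is submitted to queue <NAME>."
# extract_job_id is a hypothetical helper, not an LSF command.

extract_job_id() {
    # Print the number between "Job <" and ">"; print nothing if the line does not match.
    printf '%s\n' "$1" | sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Example with a literal message, so no scheduler is needed to try it:
extract_job_id 'Job <492637> is submitted to queue <hp12>.'

# On the cluster this might be used as:
#   jobid=$(extract_job_id "$(bsub < run)")
#   bjobs "$jobid"
```

Printing nothing on a non-matching line lets a calling script test for an empty result before querying the scheduler.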
+ | |||
+ | ==== Submit 2 ==== | ||
+ | |||
+ | On the back-end compute nodes, unless you specify otherwise, the job runs inside your home directory. | ||
+ | |||
+ | In the SAS program we add the following lines | ||
+ | |||
+ | < | ||
+ | %let jobpid = %sysget(LSB_JOBID); | ||
+ | libname here "/ | ||
+ | </ | ||
+ | |||
+ | And change this line to use local disks for storage | ||
+ | |||
+ | < | ||
+ | data here.one; | ||
+ | </ | ||
+ | |||
+ | In the submission script we change the following | ||
+ | |||
+ | * new submission file with edits | ||
+ | * -n reserves job slots (CPU cores) for the job (not necessary; SAS jobs will always use only one) | ||
+ | * -R reserves memory, for example, reserve 200 MB of memory on target compute node | ||
+ | * scheduler creates unique dirs in scratch by JOBPID for you, so we'll stage the job there | ||
+ | * but now we must copy relevant files //to// scratch dir and results back //to// home dir | ||
+ | |||
+ | |||
+ | < | ||
+ | |||
+ | #!/bin/bash | ||
+ | # submit via 'bsub < run' | ||
+ | |||
+ | #BSUB -q hp12 | ||
+ | #BSUB -J test | ||
+ | #BSUB -o stdout | ||
+ | #BSUB -e stderr | ||
+ | #BSUB -n 1 | ||
+ | #BSUB -R " | ||
+ | |||
+ | # unique job dir in scratch | ||
+ | export MYSANSCRATCH=/ | ||
+ | cd $MYSANSCRATCH | ||
+ | |||
+ | cp ~/ | ||
+ | time sas test | ||
+ | cp test.log test.lst ~/sas | ||
+ | |||
+ | </ | ||
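The stage-in, run, stage-out sequence used above can be sketched as a small reusable function. This is only an illustration under stated assumptions: ''stage_and_run'' and its arguments are hypothetical names, and the demo uses temporary directories and a stand-in command rather than the real scratch area and ''sas''.

```shell
#!/bin/bash
# Sketch of the stage-in / run / stage-out pattern.
# stage_and_run and its arguments are hypothetical, for illustration only.

stage_and_run() {
    local home_dir=$1 scratch_dir=$2 f rc
    shift 2
    # Stage in: copy the regular files from the home job directory to scratch.
    for f in "$home_dir"/*; do
        if [ -f "$f" ]; then cp "$f" "$scratch_dir"/; fi
    done
    # Run the remaining arguments as the command, from inside scratch.
    ( cd "$scratch_dir" && "$@" )
    rc=$?
    # Stage out: copy log and listing files back, even if the command failed,
    # so the log is available for debugging.
    for f in "$scratch_dir"/*.log "$scratch_dir"/*.lst; do
        if [ -f "$f" ]; then cp "$f" "$home_dir"/; fi
    done
    return "$rc"
}

# Demo with temporary directories and a stand-in command instead of 'sas test':
demo_home=$(mktemp -d)
demo_scratch=$(mktemp -d)
echo "sample input" > "$demo_home/test.dat"
stage_and_run "$demo_home" "$demo_scratch" sh -c 'echo "finished" > test.log'
ls "$demo_home"

# Inside a submission script this might become:
#   stage_and_run ~/sas "$MYSANSCRATCH" sas test
```

Returning the command's exit status after copying results back means the scheduler still sees a failure, while the log file reaches the home directory either way.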
+ | |||
+ | |||
+ | * you can monitor the progress of your job from greentail while it runs | ||
+ | |||
+ | < | ||
+ | [hmeij@greentail sas]$ ll / | ||
+ | total 16 | ||
+ | -rw-r--r-- 1 hmeij its 33 Dec 21 14:31 test.dat | ||
+ | -rw-r--r-- 1 hmeij its 2568 Dec 21 14:31 test.log | ||
+ | -rw-r--r-- 1 hmeij its 258 Dec 21 14:31 test.lst | ||
+ | -rw-r--r-- 1 hmeij its 140 Dec 21 14:31 test.sas | ||
+ | </ | ||
+ | |||
+ | ==== Best Practices ==== | ||
+ | |||
+ | * You may submit as many SAS jobs as you like; just leave enough resources available for others to get work done too | ||
+ | * Because SAS submissions are serial, non-parallel jobs, your -n flag is always 1 | ||
+ | * Reserve resources if you know what you need, especially memory | ||
+ | * Use /sanscratch for large data jobs with heavy read/write operations | ||
+ | * Queue ehwfd is preferentially for Gaussian users; please also stay off the stata and matlab queues | ||
+ | * Write smart SAS code, for example, use data set indexes and PROC SQL (this can be your best friend) | ||
+ | * ... suggestions will be added to this page | ||
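As a small aid for the ''-n'' guideline above, a submission script could be checked before it is submitted. The sketch below is an assumption-laden illustration: ''check_slots'' is a hypothetical helper, and it treats a script without a ''#BSUB -n'' line as requesting one slot.

```shell
#!/bin/bash
# Sketch: warn when a submission script requests more than one job slot.
# check_slots is a hypothetical helper, not an LSF command.
# Assumption: a script with no '#BSUB -n' line is treated as requesting 1 slot.

check_slots() {
    local n
    # Take the value from the last '#BSUB -n N' line in the file, if any.
    n=$(sed -n 's/^#BSUB[[:space:]]*-n[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1)
    n=${n:-1}
    if [ "$n" -gt 1 ]; then
        echo "warning: $1 requests $n slots; a serial SAS job uses only one"
        return 1
    fi
    echo "ok: $1 requests $n slot"
}

# Possible usage before submitting:
#   check_slots run && bsub < run
```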
\\ | \\ | ||
**[[cluster: | **[[cluster: |