cluster:32 [2007/05/16 11:27] (current)
\\
**[[cluster:

=> Lava, the scheduler, is not natively capable of running parallel jobs.

=> There is a splendid course offered by NCSA at UIUC about MPI. If you are serious about MPI, take it; you can find a link to access this course **[[cluster:

=> In all the examples below, ''

===== Jobs =====

Infiniband!

PLEASE READ THE 'ENV TEST' SECTION; IT EXPLAINS WHY THIS IS COMPLICATED.
Also, you need to test that your environment is set up correctly.

This write-up focuses only on how to submit jobs using scripts, meaning in batch mode. A single bash shell script (it must be a bash script!), named ''imyscript'' below, is submitted to the scheduler.

**imyscript**
<code>
#!/bin/bash

# queue
#BSUB -q idebug -n 16

# email me (##BSUB) or save in $HOME (#BSUB)
##BSUB -o outfile.email # standard output
#BSUB -e outfile.err

# unique job scratch dirs
MYSANSCRATCH=/
MYLOCALSCRATCH=/
export MYSANSCRATCH MYLOCALSCRATCH

# run my job
/

echo DONE ... these dirs will be removed via post_exec
echo $MYSANSCRATCH $MYLOCALSCRATCH

# label my job
#BSUB -J myLittleiJob
</code>

This looks much like a non-Infiniband job submission, but there are some key changes.

The most significant change is that we will be calling a '

If you want to use the [[http://

"make today an [[http://

===== bsub and bjobs =====

Straightforward.

<code>
[hmeij@swallowtail ~]$ bsub < imyscript
Job <1011> is submitted to queue <idebug>.
</code>

<code>
[hmeij@swallowtail ~]$ bjobs
JOBID  USER   STAT  QUEUE   FROM_HOST    EXEC_HOST  JOB_NAME      SUBMIT_TIME
1011   hmeij  PEND  idebug  swallowtail             myLittleiJob  Apr 19 14:54
</code>

<code>
[hmeij@swallowtail ~]$ bjobs
JOBID  USER   STAT  QUEUE   FROM_HOST    EXEC_HOST      JOB_NAME      SUBMIT_TIME
1011   hmeij  RUN   idebug  swallowtail  compute-1-16:  myLittleiJob  Apr 19 14:54
                                         compute-1-16:
                                         compute-1-15:
                                         compute-1-15:
</code>

<code>
[hmeij@swallowtail ~]$ bjobs
No unfinished job found
</code>

Note: as expected, 8 cores (EXEC_HOST) were invoked on each node.

===== bhist =====

You can query the scheduler regarding the status of your job.

<code>
[hmeij@swallowtail ~]$ bhist -l 1011

Job <1011>, Job Name <myLittleiJob>, User <hmeij>, ...

Thu Apr 19 14:54:19: Submitted from host <swallowtail>, ...
Thu Apr 19 14:54:24: Dispatched to 16 Hosts/Processors ...
Thu Apr 19 14:54:24: Starting (Pid 6266);
Thu Apr 19 14:54:31: Running with execution home </
Thu Apr 19 14:55:47: Done successfully. The CPU time used is 0.0 seconds;
Thu Apr 19 14:55:57: Post job process done successfully;

Summary of time in seconds spent in various states by Thu Apr 19 14:55:57
PEND  PSUSP  RUN  USUSP  SSUSP  UNKWN  TOTAL
5     0      83   0      0      0      88
</code>

===== Job Output =====

The above job submission yields ...

<code>
[hmeij@swallowtail ~]$ cat outfile.err
Process 11 on compute-1-16.local
Process 6 on compute-1-15.local
Process 14 on compute-1-16.local
Process 0 on compute-1-15.local
Process 1 on compute-1-15.local
Process 2 on compute-1-15.local
Process 3 on compute-1-15.local
Process 8 on compute-1-16.local
Process 4 on compute-1-15.local
Process 9 on compute-1-16.local
Process 5 on compute-1-15.local
Process 10 on compute-1-16.local
Process 7 on compute-1-15.local
Process 12 on compute-1-16.local
Process 13 on compute-1-16.local
Process 15 on compute-1-16.local
</code>

and the following email

<code>
Job <1011> ...
Job was executed on host(s) ...

Started at Thu Apr 19 14:54:24 2007
Results reported at Thu Apr 19 14:55:47 2007

Your job looked like:

------------------------------------------------------------
# LSBATCH: User input
#!/bin/bash

# queue
#BSUB -q idebug -n 16

# email me (##BSUB) or save in $HOME (#BSUB)
##BSUB -o outfile.email # standard output
#BSUB -e outfile.err

# unique job scratch dirs
MYSANSCRATCH=/
MYLOCALSCRATCH=/
export MYSANSCRATCH MYLOCALSCRATCH

# run my job
/

# label my job
#BSUB -J myLittleiJob

------------------------------------------------------------

Successfully completed.

Resource usage summary:

    CPU time      :
    Max Memory    : 7 MB
    Max Swap      :

    Max Processes :
    Max Threads   :

The output (if any) follows:

pi is approximately 3.1416009869231245,
wall clock time = 0.312946
DONE ... these dirs will be removed via post_exec
/

PS:

Read file <outfile.err> for stderr output of this job.
</code>

===== Bingo =====

When I ran these OpenMPI invocations, I was also running an HPLinpack benchmark on the nodes over the Infiniband (to assess whether the nodes would still respond).

The idebug queue overrides the job slots set for each node (Max Job Slots = # of cores => 8). It allows for QJOB_LIMIT=16 and UJOB_LIMIT=16.

{{:

And so it was.\\

===== The Problem =====

(Important: I repeat this from another page --- //

Once your binary is compiled, you can execute it on the head node or on any other node with a hardcoded ''

<code>
[hmeij@swallowtail ~]$ /
</code>
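Assuming the hardcoded file referred to here is an MPI machinefile (the usual case), it is nothing more than a plain text list with one hostname per process slot; the hostnames below are taken from the examples on this page:

```text
compute-1-15
compute-1-15
compute-1-16
compute-1-16
```

With OpenMPI, a hostname repeated n times means n ranks are placed on that host. The problem, as described next, is that a hardcoded list ignores whatever nodes the scheduler actually allocated.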
+ | |||
+ | This will not work when submitting your program to '' | ||
+ | |||
+ | <hi yellow> | ||
+ | Lava (your scheduler) is not natively capable for parallel jobs, so you will have to write your own integration script to parse the hosts allocated by LSF (with LSB_HOSTS variable) and integrate them to your MPI distribution. | ||
+ | </hi> | ||
+ | |||
+ | <hi orange> | ||
+ | Also, because the lack of LSF's parallel support daemons, these scripts can only provide a loose integration to Lava. Specifically, | ||
+ | </hi> | ||
+ | |||
+ | |||
+ | And this makes the job submission process for parallel jobs tedious. | ||
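The parsing step the quote describes can be sketched in a few lines of bash. This is a minimal illustration, not the site's actual integration script: the machinefile name and the mpirun path are assumptions, and the fallback host list exists only so the sketch runs outside a real batch job (inside a job, Lava/LSF sets LSB_HOSTS for you):

```shell
#!/bin/bash
# Turn the space-separated LSB_HOSTS list (one hostname per allocated
# job slot) into a machinefile that mpirun understands.

# Demo fallback so the sketch runs outside a batch job; a real job
# inherits LSB_HOSTS from the scheduler.
LSB_HOSTS="${LSB_HOSTS:-compute-1-15 compute-1-15 compute-1-16 compute-1-16}"

MACHINEFILE="machinefile.$$"

# One line per slot; duplicate lines mean multiple ranks on that host.
for host in $LSB_HOSTS; do
    echo "$host"
done > "$MACHINEFILE"

NSLOTS=$(wc -l < "$MACHINEFILE")
echo "allocated $NSLOTS slots"

# Hand the list to OpenMPI (the path is illustrative -- adjust to
# your install):
# mpirun -np "$NSLOTS" -machinefile "$MACHINEFILE" ./my_mpi_program
```

Because there are no parallel support daemons, nothing ties the MPI processes launched this way back to the scheduler, which is exactly the loose integration the quote warns about.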
+ | |||
+ | |||
+ | \\ | ||
+ | **[[cluster: |