This shows you the differences between two versions of the page.
cluster:59 [2008/01/09 14:04] |
cluster:59 [2008/01/09 14:04] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | \\ | ||
+ | **[[cluster: | ||
+ | |||
+ | |||
+ | ====== Complete Documentation ====== | ||
+ | |||
+ | It's all at this link **[[https:// | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ====== New Features in LSF 6.2 ====== | ||
+ | |||
+ | This page will be expanded to show examples of LSF/HPC advanced features. | ||
+ | |||
+ | The more information you can give the scheduler about run times and about which resources are needed and when, the more efficient the scheduling will be. The examples below are made-up scenarios. | ||
+ | |||
+ | |||
+ | => Also read up on the new queue configurations: | ||
+ | |||
+ | |||
+ | As part of the upgrade: | ||
+ | |||
+ | * Jobs were terminated ... for a list of which ones view [[http:// | ||
+ | |||
+ | * The working directories of those terminated jobs were saved in **/ | ||
+ | |||
+ | * When the new scheduler came online it started with JOBPID 101 ... that may clobber some of your old output files, so I've spooled the JOBPIDs forward to 30,000. | ||
+ | |||
+ | * Some home directories have been relocated but / | ||
+ | |||
+ | * **Parallel job submission syntax has changed or will change!** However, the "old way" still works. | ||
+ | |||
+ | * We're still experiencing license issues ... more later. | ||
+ | |||
+ | |||
+ | |||
+ | ===== Exclusive ===== | ||
+ | |||
+ | If you wish to use a compute node in an " | ||
+ | |||
+ | Here is how it works, in your program ... | ||
+ | |||
+ | < | ||
+ | #BSUB -q elw | ||
+ | #BSUB -x | ||
+ | #BSUB -J " | ||
+ | </ | ||
+ | |||
+ | Once your job runs ... | ||
+ | |||
+ | < | ||
+ | [hmeij@swallowtail ~]$ bhosts | ||
+ | HOST_NAME | ||
+ | compute-1-18 | ||
+ | </ | ||
+ | |||
+ | you will notice that the host status is now " | ||
+ | |||
+ | < | ||
+ | [hmeij@swallowtail ~]$ bhosts -l compute-1-18 | ||
+ | HOST compute-1-18 | ||
+ | STATUS | ||
+ | closed_Excl | ||
+ | </ | ||
+ | |||
+ | Please note that the '' | ||
+ | |||
+ | * for serial jobs '' | ||
+ | * '' | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== Resource Reservation ===== | ||
+ | |||
+ | **'' | ||
+ | |||
+ | Very powerful argument you can give to '' | ||
+ | |||
+ | Here is a simple example: a script that asks for 200 MB of memory. | ||
+ | |||
+ | < | ||
+ | ... | ||
+ | # queue | ||
+ | #BSUB -q elw | ||
+ | #BSUB -R " | ||
+ | #BSUB -J " | ||
+ | ... | ||
+ | </ | ||
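For the 200 MB request described above, one plausible form is LSF's standard ''rusage'' resource string, where ''mem'' is given in MB by default. This is a minimal sketch under that assumption and not necessarily the exact string used on this cluster; the helper name ''rusage_mem'' is made up for illustration. It only prints the directive, it does not submit anything.

```shell
#!/bin/bash
# Build an LSF resource requirement string that reserves memory (in MB).
# Assumption: the standard rusage[mem=...] form; the string actually used
# in the script on this page may differ.
rusage_mem() {
    local mb="$1"
    printf 'rusage[mem=%s]\n' "$mb"
}

# Print (do not submit) the directive line for a 200 MB reservation.
echo "#BSUB -R \"$(rusage_mem 200)\""
```

The printed line can be pasted into a job script next to the other ''#BSUB'' directives.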
+ | |||
+ | Submit the job and observe the resource reservation (note the value under " | ||
+ | |||
+ | Resource reservation offers many, many more options. | ||
+ | |||
+ | < | ||
+ | [hmeij@swallowtail ~]$ bsub < ./ | ||
+ | Job < | ||
+ | |||
+ | [hmeij@swallowtail ~]$ bjobs | ||
+ | JOBID | ||
+ | 30238 | ||
+ | |||
+ | [hmeij@swallowtail ~]$ bhosts -l compute-1-21 | ||
+ | HOST compute-1-21 | ||
+ | STATUS | ||
+ | ok | ||
+ | |||
+ | | ||
+ | r15s | ||
+ | | ||
+ | | ||
+ | ... | ||
+ | </ | ||
+ | |||
+ | There are two custom resources that have been defined outside of LSF. They are ' | ||
+ | |||
+ | Remember that / | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== Wall Clock Time ===== | ||
+ | |||
+ | Not a //new// feature, but one which I strongly encourage you to use. \\ | ||
+ | Queue policy of BACKFILL //is a new option//, defined at queue level. | ||
+ | |||
+ | With wall clock time information available for each job, the scheduler is able to exercise the BACKFILL policy. | ||
+ | |||
+ | To specify ... | ||
+ | |||
+ | < | ||
+ | #BSUB -W hours: | ||
+ | </ | ||
+ | |||
+ | For efficient backfilling, | ||
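To get a feel for why the run limit matters, here is a rough sketch of the backfill idea: a short job can slip into an idle window only if its declared wall clock limit fits inside that window. It assumes the usual LSF ''-W'' format of ''[hours:]minutes''. The function names and the window length are hypothetical, and the real scheduler logic is more involved than this comparison.

```shell
#!/bin/bash
# Convert an LSF-style run limit ("MM" or "HH:MM") to total minutes.
runlimit_minutes() {
    local limit="$1" hours=0 minutes
    case "$limit" in
        *:*) hours="${limit%%:*}"; minutes="${limit#*:}" ;;
        *)   minutes="$limit" ;;
    esac
    # Force base 10 so values like "08" are not parsed as octal.
    echo $(( 10#$hours * 60 + 10#$minutes ))
}

# Succeed if a job with the given run limit fits in an idle window.
# $1 = run limit as passed to -W, $2 = window length in minutes (hypothetical).
fits_backfill_window() {
    local limit_min
    limit_min="$(runlimit_minutes "$1")"
    [ "$limit_min" -le "$2" ]
}

# Example: a 2 hour 30 minute limit expressed in minutes.
runlimit_minutes "2:30"
```

A job declared with ''-W 0:30'' passes ''fits_backfill_window 0:30 60'', while one declared with ''-W 2:00'' does not, which is why an accurate (and not overly generous) limit helps your job get backfilled.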
+ | |||
+ | |||
+ | ===== Parallel Jobs ===== | ||
+ | |||
+ | |||
+ | ==== Old Way ==== | ||
+ | |||
+ | Good news! It appears the "old way" of submitting jobs still works. | ||
+ | |||
+ | |||
+ | |||
+ | ==== Spanning ==== | ||
+ | |||
+ | A very handy feature. | ||
+ | |||
+ | But consider ... 16 jobslots (cores) are requested and we want no more than 2 allocated per host. The resource request '' | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | #BSUB -q imw | ||
+ | #BSUB -n 16 | ||
+ | #BSUB -J test | ||
+ | #BSUB -o out | ||
+ | #BSUB -e err | ||
+ | #BSUB -R " | ||
+ | ... | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | |||
+ | [hmeij@swallowtail cd]$ bjobs | ||
+ | |||
+ | JOBID | ||
+ | 30244 | ||
+ | 2*compute-1-8: | ||
+ | 2*compute-1-7 test Nov 20 11:04 | ||
+ | |||
+ | </ | ||
+ | |||
+ | This also works with the "old way" of submitting ;-)\\ | ||
+ | Some jobs will benefit from this tremendously, | ||
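The arithmetic behind the example above is simple but worth spelling out: with 16 job slots and at most 2 per host, the job needs 8 hosts. A small sketch (the helper name ''hosts_needed'' is made up for illustration) that rounds up when the slots do not divide evenly:

```shell
#!/bin/bash
# Number of hosts needed for a given slot count and per-host cap.
# $1 = total job slots requested, $2 = maximum slots per host.
hosts_needed() {
    local slots="$1" per_host="$2"
    # Integer ceiling division: round up when slots is not a multiple of per_host.
    echo $(( (slots + per_host - 1) / per_host ))
}

# The scenario from this section: 16 slots, no more than 2 per host.
hosts_needed 16 2
```

Keep in mind that the tighter the per-host cap, the more hosts must have free slots at the same time, so the job may wait longer in the queue.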
+ | |||
+ | |||
+ | ==== New Way ==== | ||
+ | |||
+ | Let's start a **[[cluster: | ||
+ | |||
+ | --- // | ||
+ | |||
+ | \\ | ||
+ | **[[cluster: | ||