cluster:64 [2008/01/10 16:42] (current)
\\
**[[cluster:

===== LSF & MPI =====

The new installation of LSF supports an integrated environment for submitting parallel jobs: the scheduler can track the resource consumption of a job that spawns many parallel tasks.

Only small changes are needed to your job script.

Below is a sequence of steps detailing how it all works.

  - change the references to the old lava wrapper scripts to the new lsf wrapper scripts:
    * ''/
    * ''/
    * ''/
  - you do not have to specify the ''
    * ''#
==== cpi ====

First, let's compile a tiny C program.

^Flavor^MPI Compiler^Arguments^
| mvapich | | |
| openmpi | | |
| openmpi_intel | | |
| topspin | | |

The surprise here is that we end up with binaries ranging in size from 10 KB to 3 MB. Topspin is the MPI flavor that came with our cluster for the Infiniband switch.

<code>

lrwxrwxrwx
-rwxr-xr-x
-rwxr-xr-x
-rwxr-xr-x
-rwxr-xr-x

</code>

==== Job Script ====

Here is the job script we'll use for testing. Submit it with ''bsub < mpi.lsf''; note that the ''#BSUB'' options must appear before the first executable command, or LSF will ignore them.

<code>

#!/bin/bash

#BSUB -q imw
#BSUB -n 4
#BSUB -R "
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

rm -f ./err ./out

# WRAPPERS

echo topspin
time /

echo openmpi
time /
echo openmpi_intel
time /

</code>

==== Infiniband ====

  * queue: imw
  * all 3 MPI flavors

  * err

<code>
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real 0m6.837s
user 0m0.032s
sys
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real 0m2.071s
user 0m0.018s
sys
Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real 0m1.489s
user 0m0.014s
sys
</code>

  * out

<code>
The output (if any) follows:

topspin
pi is approximately 3.1416009869231249,
wall clock time = 0.000184
openmpi
pi is approximately 3.1416009869231249,
wall clock time = 0.010902
openmpi_intel
pi is approximately 3.1416009869231249,
wall clock time = 0.021757

</code>

==== Ethernet ====

  * queue ''
  * MPI: openmpi_intel only.

  * err

<code>
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real 0m1.344s
user 0m0.014s
sys

</code>

  * out

<code>
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249,
wall clock time = 0.013772

</code>

==== Amber ====

To test Amber with the new scripts, we rerun the programs from [[cluster:

^Amber^MPI Flavor^Switch^NProcs^JAC bench 1^JAC bench 2^Factor_IX bench 1^Factor_IX bench 2^
|9 |topspin | | | | | | |
|9 |openmpi | | | | | | |
|9 |openmpi | | | | | | |

\\
**[[cluster: