cluster:64 [2008/01/10 21:42] (current) – created – external edit 127.0.0.1
===== LSF & MPI =====

The new installation of LSF supports an integrated environment for submitting parallel jobs. This means the scheduler can track the resource consumption of a job that spawns many parallel tasks.

Only small changes are needed to your job script.

Below is a sequence of steps detailing how it all works.
| + | |||
| + | - change the references to the old lava wrapper scripts to the new lsf wrapper scripts: | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | - you do not have to specify the '' | ||
| + | * ''# | ||
| + | |||
==== cpi ====

First, let's compile a tiny C program.

^ Flavor ^ MPI Compiler ^ Arguments ^
| mvapich | | |
| openmpi | | |
| openmpi_intel | | |
| topspin | | |

The surprise here is that we end up with binaries ranging in size from 10 KB to 3 MB. Topspin is the MPI flavor that came with our cluster for the Infiniband switch.

<code>
lrwxrwxrwx
-rwxr-xr-x
-rwxr-xr-x
-rwxr-xr-x
-rwxr-xr-x
</code>

==== Job Script ====

Here is the job script we'll use for testing.

<code>
#!/bin/bash

#BSUB -q imw
#BSUB -n 4
#BSUB -R "
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

# clean up output from previous runs; kept after the #BSUB lines,
# since bsub stops parsing directives at the first command
rm -f ./err ./out

# WRAPPERS

echo topsin
time /

echo openmpi
time /
echo openmpi_intel
time /
</code>

==== Infiniband ====

  * queue: imw
  * all 3 MPI flavors

  * err

<code>
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real 0m6.837s
user 0m0.032s
sys
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real 0m2.071s
user 0m0.018s
sys
Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real 0m1.489s
user 0m0.014s
sys
</code>

  * out

<code>
The output (if any) follows:

topsin
pi is approximately 3.1416009869231249,
wall clock time = 0.000184
openmpi
pi is approximately 3.1416009869231249,
wall clock time = 0.010902
openmpi_intel
pi is approximately 3.1416009869231249,
wall clock time = 0.021757
</code>

==== Ethernet ====

  * queue ''
  * MPI: openmpi_intel only.

  * err

<code>
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real 0m1.344s
user 0m0.014s
sys
</code>

  * out

<code>
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249,
wall clock time = 0.013772
</code>

==== Amber ====

To test Amber with the new scripts, we rerun the programs from [[cluster:

^ Amber ^ MPI Flavor ^ Switch ^ NProcs ^ JAC bench 1 ^ JAC bench 2 ^ Factor_IX bench 1 ^ Factor_IX bench 2 ^
| 9 | topspin | | | | | | |
| 9 | openmpi | | | | | | |
| 9 | openmpi | | | | | | |
| + | |||
| + | |||
| + | \\ | ||
| + | **[[cluster: | ||
cluster/64.txt · Last modified: by 127.0.0.1
