\\
**[[cluster:0|Home]]**
===== LSF & MPI =====
The new installation of LSF supports an integrated environment for submitting parallel jobs. This means the scheduler can keep track of the resource consumption of a job that spawns many parallel tasks, something Lava was unable to do.
Small changes are needed in your job scripts. We will first implement a **[[http://lsfdocs.wesleyan.edu/hpc6.2_using/parallel_jobs.html#196824|Generic Parallel Job Launcher Framework]]**. The two MPI flavors that will be fully integrated at a later date are OpenMPI and Topspin.
Below is a sequence of steps detailing how it all works. For a quick synopsis, here is what you need to change to start using this new Framework right away:
- change the references to the old Lava wrapper scripts to the new LSF wrapper scripts:
* ''/share/apps/bin/lsf.topspin.wrapper''
* ''/share/apps/bin/lsf.openmpi.wrapper''
* ''/share/apps/bin/lsf.openmpi_intel.wrapper''
- you no longer have to specify the ''-n'' option when invoking your application; the ''#BSUB'' line is enough (see the sketch below this list)
* ''#BSUB -n 4''
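As a minimal sketch (the binary name ''./my_app'' is a placeholder, and the commented-out Lava line is illustrative, not the actual old wrapper name), the relevant lines of a converted job script look like this:
<code bash>
#BSUB -n 4      # number of parallel tasks, declared once here

# old style (Lava wrapper, placeholder name), task count repeated on the command line:
#   /share/apps/bin/lava.openmpi.wrapper -n 4 ./my_app

# new style (LSF wrapper), no -n; the wrapper reads the allocation from LSF:
/share/apps/bin/lsf.openmpi.wrapper ./my_app
</code>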
==== cpi ====
First, let's compile a tiny C program. ''cpi.c'' calculates **[[http://en.wikipedia.org/wiki/Pi|Pi (π)]]**. It fires off a bunch of parallel tasks, each of which calculates Pi; worker #0 reports its result back to standard out.
^Flavor^MPI Compiler^Arguments^
| mvapich |/share/apps/mvapich-0.9.9/bin/mpicc | -o cpi_mvapich cpi.c |
| openmpi |/share/apps/openmpi-1.2/bin/mpicc | -o cpi_openmpi cpi.c |
| openmpi_intel |/share/apps/openmpi-1.2_intel/bin/mpicc | -o cpi_openmpi_intel cpi.c |
| topspin |/usr/local/topspin/mpi/mpich/bin/mpicc | -o cpi_topspin cpi.c |
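Spelled out as shell commands, the four compiles from the table above are:
<code bash>
# one binary per MPI flavor, all built from the same cpi.c
/share/apps/mvapich-0.9.9/bin/mpicc     -o cpi_mvapich       cpi.c
/share/apps/openmpi-1.2/bin/mpicc       -o cpi_openmpi       cpi.c
/share/apps/openmpi-1.2_intel/bin/mpicc -o cpi_openmpi_intel cpi.c
/usr/local/topspin/mpi/mpich/bin/mpicc  -o cpi_topspin       cpi.c
</code>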
The surprise here is that we end up with binaries ranging in size from roughly 10 KB to 3 MB. Topspin is the MPI flavor that came with our cluster for the Infiniband switch. MVAPICH and OpenMPI were downloaded and compiled from source with ''gcc''. The alternate OpenMPI flavor (openmpi_intel) was compiled with Intel's compilers. Topspin can only run across the Infiniband switch, but both OpenMPI flavors can use either switch.
<code>
lrwxrwxrwx 1 hmeij its      33 Jan 3 15:49 cpi.c -> /share/apps/openmpi-1.2/bin/cpi.c
-rwxr-xr-x 1 hmeij its  406080 Jan 7 14:38 cpi_mvapich
-rwxr-xr-x 1 hmeij its   10166 Jan 8 15:36 cpi_openmpi
-rwxr-xr-x 1 hmeij its 3023929 Jan 3 16:32 cpi_openmpi_intel
-rwxr-xr-x 1 hmeij its    9781 Jan 3 16:25 cpi_topspin
</code>
==== Job Script ====
Here is the job script we'll use for testing. Note the lack of the ''-n'' option on the lines invoking our application. We ask for 4 parallel tasks with 2 tasks per node.
<code bash>
#!/bin/bash
rm -f ./err ./out

#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

# WRAPPERS
echo topspin
time /share/apps/bin/lsf.topspin.wrapper ./cpi_topspin
echo openmpi
time /share/apps/bin/lsf.openmpi.wrapper ./cpi_openmpi
echo openmpi_intel
time /share/apps/bin/lsf.openmpi_intel.wrapper ./cpi_openmpi_intel
</code>
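Assuming the script above is saved as ''mpi.lsf'' (matching the ''-J'' job name; the filename itself is an assumption), submitting and checking it would look like this:
<code bash>
bsub < mpi.lsf     # submit; LSF reads the embedded #BSUB options
bjobs              # watch the job's state (PEND, RUN, DONE)
cat out err        # inspect results and timings after the job completes
</code>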
==== Infiniband ====
* queue: ''imw''
* all 3 MPI flavors
* err
<code>
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real    0m6.837s
user    0m0.032s
sys     0m0.086s

Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real    0m2.071s
user    0m0.018s
sys     0m0.035s

Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real    0m1.489s
user    0m0.014s
sys     0m0.035s
</code>
* out
<code>
The output (if any) follows:

topspin
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.000184

openmpi
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.010902

openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.021757
</code>
==== Ethernet ====
* queue: ''emw''
* MPI: openmpi_intel only.
* err
<code>
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------

Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real    0m1.344s
user    0m0.014s
sys     0m0.022s
</code>
* out
<code>
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.013772
</code>
==== Amber ====
To test Amber with the new scripts, we rerun the programs from [[cluster:42#results|these test results]]. Note that the memory footprint of the nodes involved has changed. Let's tally up some runs and note the run times; we run each benchmark twice. A sketch of a matching job script follows the table below.
^Amber^MPI Flavor^Switch^NProcs^JAC bench 1^JAC bench 2^Factor_IX bench 1^Factor_IX bench 2^
|9 |topspin |infiniband | 4 | 01m38s | 01m57s | 02m45s | 02m38s |
|9openmpi |openmpi_intel |infiniband | 4 | 01m30s | 01m35s | 02m34s | 02m35s |
|9openmpi |openmpi_intel |ethernet | 4 | 02m15s | 02m06s | 03m32s | 03m47s |
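As a rough sketch of such a rerun (hypothetical: ''$AMBERHOME'', the ''sander.MPI'' location, and the input file names are assumptions, not taken from the original benchmark scripts), the job script differs from the cpi example only in the line handed to the wrapper:
<code bash>
#!/bin/bash
#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J amber_jac
#BSUB -e err
#BSUB -o out

# hypothetical JAC benchmark run through the new LSF/OpenMPI(Intel) wrapper;
# $AMBERHOME and the -i/-p/-c input files are placeholders
time /share/apps/bin/lsf.openmpi_intel.wrapper \
  $AMBERHOME/exe/sander.MPI -O -i mdin -o mdout -p prmtop -c inpcrd
</code>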
\\
**[[cluster:0|Home]]**