LSF & MPI

The new installation of LSF supports an integrated environment for submitting parallel jobs: the scheduler can keep track of the resource consumption of a job that spawns many parallel tasks, something Lava was unable to do.

Only small changes are needed in your job scripts. We will first implement a Generic Parallel Job Launcher Framework; the two MPI flavors that will be fully integrated at a later date are OpenMPI and Topspin.

Below is a sequence of steps detailing how it all works. For a quick synopsis, here is what you need to change to start using this new Framework right away:

  1. change the references to the old Lava wrapper scripts to the new LSF wrapper scripts:
    • /share/apps/bin/lsf.topspin.wrapper
    • /share/apps/bin/lsf.openmpi.wrapper
    • /share/apps/bin/lsf.openmpi_intel.wrapper
  2. you no longer have to specify the -n option when invoking your application; the BSUB line is enough (see the sketch after this list)
    • #BSUB -n 4
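As an example of the change, here is a minimal sketch of a job script fragment before and after the switch. The old Lava wrapper path and the application name my_app are only illustrative placeholders, not actual paths on the cluster:

#BSUB -n 4

# old Lava-style invocation (illustrative path only): the task count
# had to be repeated with -n on the command line
# /path/to/old/lava.wrapper -n 4 ./my_app

# new LSF wrapper: no -n needed, the task count comes from the BSUB line above
/share/apps/bin/lsf.openmpi.wrapper ./my_app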

cpi

First, let's compile a small C program. cpi.c calculates π (pi): it fires off a number of parallel tasks, each of which calculates pi, and worker #0 reports its own result back to standard out.

Flavor          MPI Compiler                                 Arguments
mvapich         /share/apps/mvapich-0.9.9/bin/mpicc          -o cpi_mvapich cpi.c
openmpi         /share/apps/openmpi-1.2/bin/mpicc            -o cpi_openmpi cpi.c
openmpi_intel   /share/apps/openmpi-1.2_intel/bin/mpicc      -o cpi_openmpi_intel cpi.c
topspin         /usr/local/topspin/mpi/mpich/bin/mpicc       -o cpi_topspin cpi.c
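If you want to build all four binaries in one pass, a small shell loop along these lines will do it. This is a sketch only; it assumes cpi.c sits in the current working directory and reuses the compiler paths from the table above:

#!/bin/bash
# compile cpi.c once per MPI flavor, producing cpi_<flavor> binaries
for pair in \
    mvapich:/share/apps/mvapich-0.9.9/bin/mpicc \
    openmpi:/share/apps/openmpi-1.2/bin/mpicc \
    openmpi_intel:/share/apps/openmpi-1.2_intel/bin/mpicc \
    topspin:/usr/local/topspin/mpi/mpich/bin/mpicc
do
    flavor=${pair%%:*}     # text before the first colon
    mpicc=${pair#*:}       # text after the first colon
    "$mpicc" -o "cpi_${flavor}" cpi.c
done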

The surprise here is that we end up with binaries ranging in size from roughly 10 KB to 3 MB. Topspin is the MPI flavor that came with our cluster for the Infiniband switch. MVAPICH and OpenMPI were downloaded and compiled from source with gcc. The alternate OpenMPI build (openmpi_intel) was compiled with Intel's compilers. Topspin can only run across the Infiniband switch, but both OpenMPI flavors can use either switch.

lrwxrwxrwx  1 hmeij its      33 Jan  3 15:49 cpi.c -> /share/apps/openmpi-1.2/bin/cpi.c
-rwxr-xr-x  1 hmeij its  406080 Jan  7 14:38 cpi_mvapich
-rwxr-xr-x  1 hmeij its   10166 Jan  8 15:36 cpi_openmpi
-rwxr-xr-x  1 hmeij its 3023929 Jan  3 16:32 cpi_openmpi_intel
-rwxr-xr-x  1 hmeij its    9781 Jan  3 16:25 cpi_topspin
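If you are curious where the size differences come from, one quick check is to list the shared libraries each binary is linked against with ldd; MPI code that is linked in statically will not show up in that list. A sketch:

# show the run-time library dependencies of each cpi binary
for bin in cpi_mvapich cpi_openmpi cpi_openmpi_intel cpi_topspin; do
    echo "== $bin =="
    ldd "./$bin"
done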

Job Script

Here is the job script we will use for testing. Note the lack of the -n option on the lines invoking our application. We will ask for 4 parallel tasks, with 2 tasks per node.

#!/bin/bash
# clean up the output files from any previous run
rm -f ./err ./out

# LSF directives: queue, 4 parallel tasks, 2 tasks per node,
# job name, and files for standard error and standard output
#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

# WRAPPERS

echo topspin
time /share/apps/bin/lsf.topspin.wrapper ./cpi_topspin

echo openmpi
time /share/apps/bin/lsf.openmpi.wrapper ./cpi_openmpi

echo openmpi_intel
time /share/apps/bin/lsf.openmpi_intel.wrapper ./cpi_openmpi_intel
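To run it, hand the script to bsub on standard input and keep an eye on it with bjobs (assuming the script above was saved as mpi.lsf):

# submit the job script to LSF and check on its state
bsub < mpi.lsf
bjobs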

Infiniband

  • queue: imw
  • all 3 MPI flavors
  • err
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real    0m6.837s
user    0m0.032s
sys     0m0.086s
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real    0m2.071s
user    0m0.018s
sys     0m0.035s
Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real    0m1.489s
user    0m0.014s
sys     0m0.035s
  • out
The output (if any) follows:

topspin
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.000184
openmpi
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.010902
openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.021757

Ethernet

  • queue: emw
  • MPI: openmpi_intel only.
  • err
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in 
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in 
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in 
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in 
lower performance.
--------------------------------------------------------------------------
Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real    0m1.344s
user    0m0.014s
sys     0m0.022s
  • out
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.013772

Amber

To test Amber with the new scripts, we rerun the programs from these test results. Note that the memory footprint of the nodes involved has changed. Let's tally up some runs and note the run times; we'll run each benchmark twice (a sketch of such a job script follows the table).

Amber   MPI Flavor      Switch       NProcs   JAC bench 1   JAC bench 2   Factor_IX bench 1   Factor_IX bench 2
9       topspin         infiniband   4        01m38s        01m57s        02m45s              02m38s
9       openmpi_intel   infiniband   4        01m30s        01m35s        02m34s              02m35s
9       openmpi_intel   ethernet     4        02m15s        02m06s        03m32s              03m47s
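For reference, here is a sketch of what one of these Amber runs could look like under the new framework. The sander.MPI invocation and the input file names (mdin, prmtop, inpcrd) follow generic Amber conventions, and $AMBERHOME is assumed to point at the local Amber 9 install; adjust both to match the actual benchmark setup:

#!/bin/bash
# sketch: Amber 9 benchmark run through the openmpi_intel wrapper
# assumptions: $AMBERHOME points at the Amber 9 install and the
# benchmark inputs (mdin, prmtop, inpcrd) sit in this directory
rm -f ./err ./out

#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J amber.lsf
#BSUB -e err
#BSUB -o out

time /share/apps/bin/lsf.openmpi_intel.wrapper \
    $AMBERHOME/exe/sander.MPI -O -i mdin -o mdout -p prmtop -c inpcrd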

