\\
**[[cluster:0|Home]]**

===== LSF & MPI =====

The new installation of LSF supports an integrated environment for submitting parallel jobs.  This means the scheduler can keep track of the resource consumption of a job that spawns many parallel tasks, something Lava was unable to do.

Only small changes are needed in your job script.  We will first implement a **[[http://lsfdocs.wesleyan.edu/hpc6.2_using/parallel_jobs.html#196824|Generic Parallel Job Launcher Framework]]**.  The two MPI flavors that will be fully integrated at a later date are OpenMPI and Topspin.

Below is a sequence of steps detailing how it all works.  For a quick synopsis, here is what you need to change to start using this new framework right away (a minimal before/after sketch follows the list):

  - change the references to the old Lava wrapper scripts to the new LSF wrapper scripts:
    * ''/share/apps/bin/lsf.topspin.wrapper''
    * ''/share/apps/bin/lsf.openmpi.wrapper''
    * ''/share/apps/bin/lsf.openmpi_intel.wrapper''
  - you no longer have to specify the ''-n'' option when invoking your application; the BSUB line is enough
    * ''#BSUB -n 4''

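As a quick illustration, here is a minimal before/after sketch of the invocation line.  The old Lava wrapper name and the ''./my_app'' binary are hypothetical placeholders; substitute your own wrapper flavor and application:

<code>
# before (Lava era, hypothetical wrapper name): task count passed by hand
/share/apps/bin/lava.openmpi.wrapper -n 4 ./my_app

# after (LSF framework): the task count comes from the #BSUB directive
#BSUB -n 4
/share/apps/bin/lsf.openmpi.wrapper ./my_app
</code>
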
==== cpi ====

First, let's compile a tiny C program.  ''cpi.c'' is a program that calculates **[[http://en.wikipedia.org/wiki/Pi|π]], Pi**.  It fires off a bunch of parallel tasks, each of which calculates Pi; worker #0 reports its own calculation results back to standard out.

^Flavor^MPI Compiler^Arguments^
|mvapich  |/share/apps/mvapich-0.9.9/bin/mpicc  |-o cpi_mvapich cpi.c  |
|openmpi  |/share/apps/openmpi-1.2/bin/mpicc  |-o cpi_openmpi cpi.c  |
|openmpi_intel  |/share/apps/openmpi-1.2_intel/bin/mpicc  |-o cpi_openmpi_intel cpi.c  |
|topspin  |/usr/local/topspin/mpi/mpich/bin/mpicc  |-o cpi_topspin cpi.c  |

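As typed on the command line, the compiles in the table look like this (assuming ''cpi.c'' sits in the current directory):

<code>
# one binary per MPI flavor, names matching the table above
/share/apps/mvapich-0.9.9/bin/mpicc        -o cpi_mvapich        cpi.c
/share/apps/openmpi-1.2/bin/mpicc          -o cpi_openmpi        cpi.c
/share/apps/openmpi-1.2_intel/bin/mpicc    -o cpi_openmpi_intel  cpi.c
/usr/local/topspin/mpi/mpich/bin/mpicc     -o cpi_topspin        cpi.c
</code>
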
The surprise here is that we end up with binaries ranging in size from 10 KB to 3 MB.  Topspin is the MPI flavor that came with our cluster for the Infiniband switch.  MVAPICH and OpenMPI were downloaded and compiled from source with ''gcc''; the alternate OpenMPI (openmpi_intel) was compiled with Intel's compilers.  Topspin can only run across the Infiniband switch, but both OpenMPI flavors can use either switch.

<code>

lrwxrwxrwx  1 hmeij its      33 Jan  3 15:49 cpi.c -> /share/apps/openmpi-1.2/bin/cpi.c
-rwxr-xr-x  1 hmeij its  406080 Jan  7 14:38 cpi_mvapich
-rwxr-xr-x  1 hmeij its   10166 Jan  8 15:36 cpi_openmpi
-rwxr-xr-x  1 hmeij its 3023929 Jan  3 16:32 cpi_openmpi_intel
-rwxr-xr-x  1 hmeij its    9781 Jan  3 16:25 cpi_topspin

</code>

==== Job Script ====

Here is the job script we will use for testing.  Note the lack of the ''-n'' option on the lines invoking our application.  We ask for 4 parallel tasks with 2 tasks per node.

<code>

#!/bin/bash
# remove output files from any previous run
rm -f ./err ./out

# infiniband queue, 4 parallel tasks spread 2 per node
#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

# WRAPPERS (note: no -n option, the #BSUB -n line above is enough)

echo topspin
time /share/apps/bin/lsf.topspin.wrapper ./cpi_topspin

echo openmpi
time /share/apps/bin/lsf.openmpi.wrapper ./cpi_openmpi

echo openmpi_intel
time /share/apps/bin/lsf.openmpi_intel.wrapper ./cpi_openmpi_intel

</code>

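To submit, feed the script to ''bsub'' on standard input so the embedded ''#BSUB'' directives are picked up, then watch it with ''bjobs''.  A rough sketch, assuming the script above is saved as ''mpi.lsf'' (the job id shown is made up):

<code>
$ bsub < mpi.lsf
Job <1234> is submitted to queue <imw>.

$ bjobs 1234
</code>
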
==== Infiniband ====

  * queue: ''imw''
  * all 3 MPI flavors

  * err

<code>
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real    0m6.837s
user    0m0.032s
sys     0m0.086s
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real    0m2.071s
user    0m0.018s
sys     0m0.035s
Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real    0m1.489s
user    0m0.014s
sys     0m0.035s
</code>

  * out

<code>
The output (if any) follows:

topspin
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.000184
openmpi
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.010902
openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.021757

</code>

==== Ethernet ====

  * queue: ''emw''
  * MPI: openmpi_intel only

  * err

<code>
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real    0m1.344s
user    0m0.014s
sys     0m0.022s

</code>

  * out

<code>
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.013772

</code>
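
The warnings in ''err'' above simply mean that OpenMPI looked for Infiniband HCAs on these ethernet-only nodes, found none, and fell back to TCP; the run still completes.  When invoking ''mpirun'' directly (outside the wrapper) the warnings can be avoided by restricting OpenMPI to its TCP, self, and shared-memory transports.  A sketch, assuming a plain interactive run of the binary built above; whether the wrapper passes such options through has not been tested:

<code>
/share/apps/openmpi-1.2_intel/bin/mpirun --mca btl tcp,self,sm -np 4 ./cpi_openmpi_intel
</code>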

==== Amber ====

To test Amber with the new scripts, we rerun the programs from [[cluster:42#results|these test results]].  Note that the memory footprint of the nodes involved has changed.  Let's tally up some runs and note the run times.  We'll run each benchmark twice; a hedged sketch of the wrapper invocation used for these runs follows the table.

^Amber^MPI Flavor^Switch^NProcs^JAC bench 1^JAC bench 2^Factor_IX bench 1^Factor_IX bench 2^
|9  |topspin  |infiniband  |  4  |  01m38s  |  01m57s  |  02m45s  |  02m38s  |
|9  |openmpi_intel  |infiniband  |  4  |  01m30s  |  01m35s  |  02m34s  |  02m35s  |
|9  |openmpi_intel  |ethernet  |  4  |  02m15s  |  02m06s  |  03m32s  |  03m47s  |

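For reference, these Amber runs use the same kind of job script as above, with the old ''mpirun'' line replaced by the matching wrapper.  A minimal sketch, assuming Amber 9's parallel ''sander.MPI'' and the standard benchmark input file names (adjust the wrapper flavor, paths, and filenames to your own setup):

<code>
# inside the job script, after the #BSUB directives
time /share/apps/bin/lsf.openmpi_intel.wrapper \
    $AMBERHOME/exe/sander.MPI -O -i mdin -o mdout -p prmtop -c inpcrd
</code>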

\\
**[[cluster:0|Home]]**