\\
**[[cluster:0|Home]]**

===== LSF & MPI =====

The new installation of LSF supports an integrated environment for submitting parallel jobs.  This means the scheduler can keep track of the resource consumption of a job that spawns many parallel tasks, something Lava was unable to do.

Small changes are needed to your job script.  We will first implement a **[[http://lsfdocs.wesleyan.edu/hpc6.2_using/parallel_jobs.html#196824|Generic Parallel Job Launcher Framework]]**.  The two MPI flavors I will fully integrate at a later date are OpenMPI and Topspin.

Below is a sequence of steps detailing how it all works.  For a quick synopsis, here is what you need to change to start using this new framework right away (a minimal before/after sketch follows the list):

  - change the references to the old lava wrapper scripts to the new lsf wrapper scripts:
    * ''/share/apps/bin/lsf.topspin.wrapper''
    * ''/share/apps/bin/lsf.openmpi.wrapper''
    * ''/share/apps/bin/lsf.openmpi_intel.wrapper''
  - you no longer need to specify the ''-n'' option when invoking your application; the BSUB line is enough
    * ''#BSUB -n 4''

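For example, assuming an old Lava script invoked its wrapper along these lines (the old wrapper name below is illustrative, not taken from this page), the change amounts to:

<code>
# old (Lava) style: task count repeated when invoking the wrapper (hypothetical old wrapper name)
# /share/apps/bin/lava.openmpi.wrapper -n 4 ./my_app

# new (LSF) style: the #BSUB -n 4 line alone sets the task count
/share/apps/bin/lsf.openmpi.wrapper ./my_app
</code>
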
==== cpi ====

First, let's compile a tiny little C program.  ''cpi.c'' is a program that calculates **[[http://en.wikipedia.org/wiki/Pi|π]], Pi**.  It fires off a bunch of parallel tasks, each of which calculates Pi; worker #0 reports its own result back to standard out.
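
For reference, here is a minimal sketch of what such a program looks like; it is not necessarily the exact ''cpi.c'' that ships with OpenMPI.  This version integrates 4/(1+x^2) over [0,1] with the midpoint rule, splits the intervals across the workers, and combines the partial sums on worker #0; its print statements mirror the output captured further down.

<code>
#include <stdio.h>
#include <math.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int n = 10000;                 /* number of integration intervals */
    int rank, size, i, namelen;
    double PI25DT = 3.141592653589793238462643;
    double mypi, pi, h, sum, x, t0, t1;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(name, &namelen);
    printf("Process %d on %s\n", rank, name);

    t0 = MPI_Wtime();
    h = 1.0 / (double) n;
    sum = 0.0;
    /* each worker handles every size-th interval, starting at its rank */
    for (i = rank + 1; i <= n; i += size) {
        x = h * ((double) i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* combine the partial sums onto worker 0, which reports the result */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0) {
        printf("pi is approximately %.16f, Error is %.16f\n",
               pi, fabs(pi - PI25DT));
        printf("wall clock time = %f\n", t1 - t0);
    }

    MPI_Finalize();
    return 0;
}
</code>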

^Flavor^MPI Compiler^Arguments^
| mvapich  |/share/apps/mvapich-0.9.9/bin/mpicc  | -o cpi_mvapich cpi.c  |
| openmpi  |/share/apps/openmpi-1.2/bin/mpicc  | -o cpi_openmpi cpi.c  |
| openmpi_intel  |/share/apps/openmpi-1.2_intel/bin/mpicc  | -o cpi_openmpi_intel cpi.c  |
| topspin  |/usr/local/topspin/mpi/mpich/bin/mpicc  | -o cpi_topspin cpi.c  |

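Spelled out as shell commands, the compile step for each flavor looks like this (compilers and arguments taken from the table above):

<code>
# build one cpi binary per MPI flavor; each mpicc links against its own MPI libraries
/share/apps/mvapich-0.9.9/bin/mpicc       -o cpi_mvapich       cpi.c
/share/apps/openmpi-1.2/bin/mpicc         -o cpi_openmpi       cpi.c
/share/apps/openmpi-1.2_intel/bin/mpicc   -o cpi_openmpi_intel cpi.c
/usr/local/topspin/mpi/mpich/bin/mpicc    -o cpi_topspin       cpi.c
</code>
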
The surprise here is that we end up with binaries ranging in size from roughly 10 KB to 3 MB.  Topspin is the MPI flavor that came with our cluster for the Infiniband switch.  MVAPICH and OpenMPI were downloaded and compiled from source with ''gcc''; the alternate OpenMPI (openmpi_intel) was compiled with Intel's compilers.  Topspin can only run across the Infiniband switch, but both OpenMPI flavors can use either switch.

<code>

lrwxrwxrwx  1 hmeij its      33 Jan  3 15:49 cpi.c -> /share/apps/openmpi-1.2/bin/cpi.c
-rwxr-xr-x  1 hmeij its  406080 Jan  7 14:38 cpi_mvapich
-rwxr-xr-x  1 hmeij its   10166 Jan  8 15:36 cpi_openmpi
-rwxr-xr-x  1 hmeij its 3023929 Jan  3 16:32 cpi_openmpi_intel
-rwxr-xr-x  1 hmeij its    9781 Jan  3 16:25 cpi_topspin

</code>

==== Job Script ====

Here is the job script we'll use for testing.  Note the lack of the ''-n'' option on the lines invoking our application.  We will ask for 4 parallel tasks with 2 tasks per node.

<code>

#!/bin/bash
rm -f ./err ./out

#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J mpi.lsf
#BSUB -e err
#BSUB -o out

# WRAPPERS

echo topspin
time /share/apps/bin/lsf.topspin.wrapper ./cpi_topspin

echo openmpi
time /share/apps/bin/lsf.openmpi.wrapper ./cpi_openmpi

echo openmpi_intel
time /share/apps/bin/lsf.openmpi_intel.wrapper ./cpi_openmpi_intel

</code>
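
Submitting and monitoring the job is done with the usual LSF commands; assuming the script above is saved as ''mpi.lsf'', something along these lines:

<code>
bsub < mpi.lsf        # submit; LSF reads the #BSUB directives from the script
bjobs                 # check the job's state (PEND, RUN, ...)
cat out err           # inspect the results once the job has finished
</code>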

==== Infiniband ====

  * queue: imw
  * all 3 MPI flavors

  * err

<code>
Process 1 on compute-1-3.local
Process 0 on compute-1-3.local
Process 3 on compute-1-13.local
Process 2 on compute-1-13.local

real    0m6.837s
user    0m0.032s
sys     0m0.086s
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local
Process 0 on compute-1-3.local

real    0m2.071s
user    0m0.018s
sys     0m0.035s
Process 0 on compute-1-3.local
Process 1 on compute-1-3.local
Process 2 on compute-1-13.local
Process 3 on compute-1-13.local

real    0m1.489s
user    0m0.014s
sys     0m0.035s
</code>

  * out

<code>
The output (if any) follows:

topspin
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.000184
openmpi
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.010902
openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.021757

</code>

==== Ethernet ====

  * queue: ''emw''
  * MPI: openmpi_intel only

  * err

<code>
--------------------------------------------------------------------------
[0,1,0]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,1]: MVAPI on host compute-1-23 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,2]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,3]: MVAPI on host compute-1-18 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Process 0 on compute-1-23.local
Process 2 on compute-1-18.local
Process 3 on compute-1-18.local
Process 1 on compute-1-23.local

real    0m1.344s
user    0m0.014s
sys     0m0.022s

</code>

  * out

<code>
The output (if any) follows:

openmpi_intel
pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 0.013772

</code>
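
The warnings in ''err'' are expected on the ethernet-only nodes: OpenMPI probes for an Infiniband HCA, finds none, and falls back to TCP.  When running ''mpirun'' by hand (outside the wrapper) the fallback can be made explicit by restricting OpenMPI's byte transfer layers; whether and how the lsf wrapper passes such options through is an assumption here, not something covered on this page.

<code>
# restrict OpenMPI 1.2 to the TCP transport (plus "self" for a rank sending to itself);
# this avoids the "unable to find any HCAs" warnings on nodes without Infiniband
/share/apps/openmpi-1.2_intel/bin/mpirun --mca btl tcp,self -np 4 ./cpi_openmpi_intel
</code>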

==== Amber ====

To test Amber with the new scripts, we rerun the programs from [[cluster:42#results|these test results]].  Note that the memory footprint of the nodes involved has changed.  Let's tally up some runs and note the run times; we'll run each benchmark twice.

^Amber^MPI Flavor^Switch^NProcs^JAC bench 1^JAC bench 2^Factor_IX bench 1^Factor_IX bench 2^
|9  |topspin  |infiniband  |  4  |  01m38s  |  01m57s  |  02m45s  |  02m38s  |
|9openmpi  |openmpi_intel  |infiniband  |  4  |  01m30s  |  01m35s  |  02m34s  |  02m035s  |
|9openmpi  |openmpi_intel  |ethernet  |  4  |  02m15s  |  02m06s  |  03m32s  |  03m47s  |
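
For reference, an Amber run under the new framework differs from the ''cpi'' script only in the line handed to the wrapper.  A hypothetical example is sketched below; the ''sander.MPI'' path and the input file names are assumptions for illustration, not taken from this page:

<code>
#!/bin/bash
rm -f ./err ./out

#BSUB -q imw
#BSUB -n 4
#BSUB -R "span[ptile=2]"
#BSUB -J amber.lsf
#BSUB -e err
#BSUB -o out

# hypothetical Amber 9 parallel run via the topspin wrapper;
# adjust the executable path and the -i/-o/-p/-c files for your own job
time /share/apps/bin/lsf.topspin.wrapper /share/apps/amber9/exe/sander.MPI \
     -O -i mdin -o mdout -p prmtop -c inpcrd
</code>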

\\
**[[cluster:0|Home]]**