cluster:119 [2013/08/21 11:03]
hmeij
cluster:119 [2021/06/17 15:32] (current)
hmeij07
\\
**[[cluster:0|Back]]**
  
==== Submitting GPU Jobs ====

Please leave plenty of time between multiple GPU job submissions. Minutes, not seconds.

Jobs need to be submitted to the scheduler via cottontail to the queues mwgpu, amber128, and exx96.

This page is old; the gpu resource ''gpu4'' should now be used. A more recent page can be found at [[cluster:173|K20 Redo Usage]], although this page may still contain some useful information explaining gpu jobs.
 --- //[[hmeij@wesleyan.edu|Henk]] 2021/06/17 15:29//

**Articles**

  * [[http://www.pgroup.com/lit/articles/insider/v5n2a1.htm]] Tesla vs. Xeon Phi vs. Radeon: A Compiler Writer's Perspective
  * [[http://www.pgroup.com/lit/articles/insider/v5n2a5.htm]] Calling CUDA Fortran kernels from MATLAB

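The advice above on spacing out submissions can be sketched as a simple loop. This is a sketch only: ''run1.gpu'' through ''run3.gpu'' are hypothetical job script names, and the ''echo'' stands in for a real ''bsub'' call.

```shell
# Sketch: space out GPU job submissions so the scheduler's gpu
# resource (sampled by the elim roughly every 10 seconds) reflects
# each job's allocation before the next job is dispatched.
# run1.gpu .. run3.gpu are hypothetical job script names.
for job in run1.gpu run2.gpu run3.gpu; do
    echo "bsub < $job"   # real use: bsub < $job
    sleep 1              # real use: wait 60 seconds or more
done
```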
  
  
</code>
  
With ''gpu-info'' we can view our running job.  ''gpu-info'' and ''gpu-free'' are available at <del>[[http://ambermd.org/gpus/]]</del> [[http://ambermd.org/gpus12/#Running]] (I had to hard-code my GPU string information as they came in at 02,03,82&83; you can use deviceQuery to find them).
  
<code>
3       Tesla K20m      21 C            0 %
====================================================

[hmeij@sharptail sharptail]$ ssh n33 gpu-free
1,3,0


  
</code>
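The ''gpu-free'' output above is exactly what the wrapper parses to pick idle devices at random. A standalone sketch of that selection, with the sample string ''1,3,0'' hard-coded in place of the ''ssh n33 gpu-free'' call (the ''tr''/''paste'' pipeline here is an equivalent of the wrapper's shuf-based loop):

```shell
# Sketch: pick 2 idle GPUs from gpu-free style output.
# "1,3,0" stands in for the output of: ssh n33 gpu-free
free="1,3,0"
want=2
picked=$(echo "$free" | tr ',' '\n' | shuf | head -$want | paste -sd, -)
export CUDA_VISIBLE_DEVICES="$picked"
echo "GPU allocation instance n33:$CUDA_VISIBLE_DEVICES"
```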
#!/bin/bash
# submit via 'bsub < run.gpu'
rm -f mdout.[0-9]* auout.[0-9]* apoa1out.[0-9]*
#BSUB -e err
#BSUB -o out
#BSUB -J test
  
# from greentail we need to set up the module env
export PATH=/home/apps/bin:/cm/local/apps/cuda50/libs/304.54/bin:\
/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:/cm/shared/apps/lammps/cuda/2013-01-27/:\
/cm/shared/apps/amber/amber12/bin:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:\
/usr/sbin:/cm/shared/apps/cuda50/toolkit/5.0.35/bin:/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:\
/cm/shared/apps/cuda50/libs/current/bin:/cm/shared/apps/cuda50/toolkit/5.0.35/open64/bin:\
/cm/shared/apps/mvapich2/gcc/64/1.6/bin:/cm/shared/apps/mvapich2/gcc/64/1.6/sbin
export LD_LIBRARY_PATH=/cm/local/apps/cuda50/libs/304.54/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/amber/amber12/lib:\
/cm/shared/apps/amber/amber12/lib64:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/cuda50/libs/current/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/open64/lib:/cm/shared/apps/cuda50/toolkit/5.0.35/extras/CUPTI/lib:\
/cm/shared/apps/mvapich2/gcc/64/1.6/lib
  
## leave sufficient time between job submissions (30-60 secs)
# save results
cp apoa1out.$LSB_JOBID ~/sharptail/


</code>


==== gromacs.sub ====

<code>

#!/bin/bash

rm -rf gromacs.out gromacs.err \#* *.log

# from greentail we need to recreate module env
export PATH=/home/apps/bin:/cm/local/apps/cuda50/libs/304.54/bin:\
/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:/cm/shared/apps/lammps/cuda/2013-01-27/:\
/cm/shared/apps/amber/amber12/bin:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:\
/usr/sbin:/cm/shared/apps/cuda50/toolkit/5.0.35/bin:/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:\
/cm/shared/apps/cuda50/libs/current/bin:/cm/shared/apps/cuda50/toolkit/5.0.35/open64/bin:\
/cm/shared/apps/mvapich2/gcc/64/1.6/bin:/cm/shared/apps/mvapich2/gcc/64/1.6/sbin
export LD_LIBRARY_PATH=/cm/local/apps/cuda50/libs/304.54/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/amber/amber12/lib:\
/cm/shared/apps/amber/amber12/lib64:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/cuda50/libs/current/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/open64/lib:/cm/shared/apps/cuda50/toolkit/5.0.35/extras/CUPTI/lib:\
/cm/shared/apps/mvapich2/gcc/64/1.6/lib

#BSUB -o gromacs.out
#BSUB -e gromacs.err
#BSUB -N
#BSUB -J 325monolayer

# read /share/apps/gromacs/build.sh
. /share/apps/intel/composerxe/bin/iccvars.sh intel64
export VMDDIR=/share/apps/vmd/1.8.6

## CPU RUN: queue mw256, n<=28, must run on one node (thread_mpi)
##BSUB -q mw256
##BSUB -n 2
##BSUB -R "rusage[gpu=0],span[hosts=1]"
#export PATH=/share/apps/gromacs/4.6-icc-gpu/bin:$PATH
#. /share/apps/gromacs/4.6-icc-gpu/bin/GMXRC.bash
#mdrun -nt 2 -s 325topol.tpr -c 325monolayer.gro -e 325ener.edr -o 325traj.trr -x 325traj.xtc

## GPU RUN: gpu (1-4), queue mwgpu, n (1-4, matches gpu count), must run on one node
##BSUB -q mwgpu
##BSUB -n 1
##BSUB -R "rusage[gpu=1:mem=7000],span[hosts=1]"
## signal GMXRC is a gpu run with: 1=thread_mpi
#export GMXRC=1
#export PATH=/share/apps/gromacs/4.6-icc-gpu/bin:$PATH
#. /share/apps/gromacs/4.6-icc-gpu/bin/GMXRC.bash
#lava.mvapich2.wrapper mdrun \
#-testverlet -s 325topol.tpr -c 325monolayer.gro -e 325ener.edr -o 325traj.trr -x 325traj.xtc

# GPU RUN: gpu (1-4), queue mwgpu, n (1-4, matches gpu count), must run on one node
#BSUB -q mwgpu
#BSUB -n 1
#BSUB -R "rusage[gpu=1:mem=7000],span[hosts=1]"
# signal GMXRC is a gpu run with: 2=mvapich2
export GMXRC=2
export PATH=/share/apps/gromacs/4.6-mpi-gpu/bin:$PATH
. /share/apps/gromacs/4.6-mpi-gpu/bin/GMXRC.bash
lava.mvapich2.wrapper mdrun_mpi \
-testverlet -s 325topol.tpr -c 325monolayer.gro -e 325ener.edr -o 325traj.trr -x 325traj.xtc


</code>
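Note that the wrapper, not the submit script, supplies ''-gpu_id'': GROMACS wants device ids renumbered from base 0 regardless of which physical GPUs were allocated, so two allocated GPUs always become the string ''01''. A minimal sketch of that mapping:

```shell
# Sketch: map the allocated GPU count (1-4) to the base-0
# -gpu_id string GROMACS expects; e.g. 2 GPUs -> "01".
ngpus=2
gmxrc_gpus=""
i=0
while [ $i -lt $ngpus ]; do
    gmxrc_gpus="${gmxrc_gpus}${i}"
    i=$((i + 1))
done
echo "mdrun_mpi -gpu_id $gmxrc_gpus"   # -> mdrun_mpi -gpu_id 01
```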

==== matlab.sub ====

<code>

#!/bin/bash

rm -rf out err *.out

# from greentail we need to recreate module env
export PATH=/home/apps/bin:/cm/local/apps/cuda50/libs/304.54/bin:\
/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:/cm/shared/apps/lammps/cuda/2013-01-27/:\
/cm/shared/apps/amber/amber12/bin:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:\
/usr/sbin:/cm/shared/apps/cuda50/toolkit/5.0.35/bin:/cm/shared/apps/cuda50/sdk/5.0.35/bin/linux/release:\
/cm/shared/apps/cuda50/libs/current/bin:/cm/shared/apps/cuda50/toolkit/5.0.35/open64/bin:\
/cm/shared/apps/mvapich2/gcc/64/1.6/bin:/cm/shared/apps/mvapich2/gcc/64/1.6/sbin
export PATH=/share/apps/matlab/2013a/bin:$PATH
export LD_LIBRARY_PATH=/cm/local/apps/cuda50/libs/304.54/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/amber/amber12/lib:\
/cm/shared/apps/amber/amber12/lib64:/cm/shared/apps/namd/ibverbs-smp-cuda/2013-06-02/:\
/cm/shared/apps/cuda50/toolkit/5.0.35/lib64:/cm/shared/apps/cuda50/libs/current/lib64:\
/cm/shared/apps/cuda50/toolkit/5.0.35/open64/lib:/cm/shared/apps/cuda50/toolkit/5.0.35/extras/CUPTI/lib:\
/cm/shared/apps/mvapich2/gcc/64/1.6/lib

#BSUB -o out
#BSUB -e err
#BSUB -N
#BSUB -J test

# GPU RUN: (1-4), queue mwgpu, n (1-4, matches gpu count), must run on one node
#BSUB -q mwgpu
#BSUB -n 1
#BSUB -R "rusage[gpu=1:mem=7000],span[hosts=1]"
# signal MATGPU is a gpu run
export MATGPU=1
lava.mvapich2.wrapper matlab -nodisplay -r test


</code>

==== lava.mvapich2.wrapper ====

<code>

#!/bin/sh

# This is a copy of lava.openmpi.wrapper which came with lava OCS kit
# Trying to make it work with mvapich2
# -hmeij 13aug2013

#
#  Copyright (c) 2007 Platform Computing
#
# This script is a wrapper for openmpi mpirun
# it generates the machine file based on the hosts
# given to it by Lava.
#

# RLIMIT_MEMLOCK problem with libibverbs -hmeij
ulimit -l unlimited


usage() {
        cat <<USEEOF
USAGE:  $0
        This command is a wrapper for mpirun (openmpi).  It can
        only be run within Lava using bsub e.g.
                bsub -n # "$0 -np # {my mpi command and args}"

        The wrapper will automatically generate the
        machinefile used by mpirun.

        NOTE:  The list of hosts cannot exceed 4KBytes.
USEEOF
}

if [ x"${LSB_JOBFILENAME}" = x -o x"${LSB_HOSTS}" = x ]; then
    usage
    exit -1
fi

MYARGS=$*
WORKDIR=`dirname ${LSB_JOBFILENAME}`
MACHFILE=${WORKDIR}/mpi_machines
ARGLIST=${WORKDIR}/mpi_args

# Check if mpirun is in the PATH -hmeij
T=`which --skip-alias mpirun_rsh`
#T=`which mpirun_rsh`
if [ $? -ne 0 ]; then
    echo "Error:  mpirun_rsh is not in your PATH."
    exit -2
fi

echo "${MYARGS}" > ${ARGLIST}
#T=`grep -- -machinefile ${ARGLIST} |wc -l`
T=`grep -- -hostfile ${ARGLIST} |wc -l`
if [ $T -gt 0 ]; then
    echo "Error:  Do not provide the machinefile for mpirun."
    echo "        It is generated automatically for you."
    exit -3
fi

# Make the open-mpi machine file
echo "${LSB_HOSTS}" > ${MACHFILE}.lst
tr '\/ ' '\r\n' < ${MACHFILE}.lst > ${MACHFILE}

MPIRUN=`which --skip-alias mpirun_rsh`
#MPIRUN=/share/apps/openmpi/1.2+intel-9/bin/mpirun
#echo "executing: ${MPIRUN} -x LD_LIBRARY_PATH -machinefile ${MACHFILE} ${MYARGS}"

# sanity checks number of processes 1-4
np=`wc -l ${MACHFILE} | awk '{print $1}'`
if [ $np -lt 1 -o $np -gt 4 ]; then
    echo "Error:  Incorrect number of processes ($np)"
    echo "        -n can be an integer in the range of 1 to 4"
    exit -4
fi

# sanity check single node
nh=`cat ${MACHFILE} | sort -u | wc -l`
if [ $nh -ne 1 ]; then
    echo "Error:  No host or more than one host specified ($nh)"
    exit -5
fi

# one host, one to four gpus
gpunp=`cat ${MACHFILE} | wc -l | awk '{print $1}'`
gpuhost=`cat ${MACHFILE} | sort -u | tr -d '\n'`
gpuid=( $(for i in `ssh $gpuhost gpu-free | sed "s/,/ /g"`; do echo $i; done | shuf | head -$gpunp) )
if [ $gpunp -eq 1 ]; then
        CUDA_VISIBLE_DEVICES=$gpuid
        echo "GPU allocation instance $gpuhost:$gpuid"
else
        gpuids=`echo ${gpuid[@]} | sed "s/ /,/g"`
        CUDA_VISIBLE_DEVICES="$gpuids"
        echo "GPU allocation instance $gpuhost:$CUDA_VISIBLE_DEVICES"
fi
# namd ignores this
export CUDA_VISIBLE_DEVICES
#debug# setid=`ssh $gpuhost echo $CUDA_VISIBLE_DEVICES | tr '\n' ' '`
#debug# echo "setid=$setid";


if [ -n "$GMXRC" ]; then
        # gromacs needs them from base 0, so gpu 2,3 is string 01
        if [ ${#gpuid[*]} -eq 1 ]; then
                gmxrc_gpus="0"
        elif [ ${#gpuid[*]} -eq 2 ]; then
                gmxrc_gpus="01"
        elif [ ${#gpuid[*]} -eq 3 ]; then
                gmxrc_gpus="012"
        elif [ ${#gpuid[*]} -eq 4 ]; then
                gmxrc_gpus="0123"
        fi

        if [ $GMXRC -eq 1 ]; then
                newargs=`echo ${MYARGS} | sed "s/mdrun/mdrun -gpu_id $gmxrc_gpus/g"`
                echo "executing: $newargs"
                $newargs
        elif [ $GMXRC -eq 2 ]; then
                newargs=`echo ${MYARGS} | sed "s/mdrun_mpi/mdrun_mpi -gpu_id $gmxrc_gpus/g"`
                echo "executing: ${MPIRUN} -ssh -hostfile ${MACHFILE} -np $gpunp $newargs"
                ${MPIRUN} -ssh -hostfile ${MACHFILE} -np $gpunp $newargs
        fi

elif [ -n "$MATGPU" ] && [ $MATGPU -eq 1 ]; then
        echo "executing: ${MYARGS}"
        ${MYARGS}
elif [ -n "$CHARMRUN" ] && [ $CHARMRUN -eq 1 ]; then
        cat ${MACHFILE}.lst | tr '\/ ' '\r\n' | sed 's/^/host /g' > ${MACHFILE}
        echo "executing: charmrun $NAMD_DIR/namd2 +p$gpunp ++nodelist ${MACHFILE} +idlepoll +devices $CUDA_VISIBLE_DEVICES ${MYARGS}"
        charmrun $NAMD_DIR/namd2 +p$gpunp ++nodelist ${MACHFILE} +idlepoll +devices $CUDA_VISIBLE_DEVICES ${MYARGS}
else
        echo "executing: ${MPIRUN} -ssh -hostfile ${MACHFILE} -np $gpunp ${MYARGS}"
        ${MPIRUN} -ssh -hostfile ${MACHFILE} -np $gpunp ${MYARGS}
fi

exit $?


</code>
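The wrapper's machinefile generation and its two sanity checks can be exercised outside of Lava. In this sketch ''LSB_HOSTS'' is faked with a 4-slot allocation on one node; in production the scheduler sets it.

```shell
# Sketch: build a machinefile (one host per line) from an
# LSB_HOSTS style list and apply the wrapper's sanity checks.
LSB_HOSTS="n33 n33 n33 n33"        # faked; normally set by Lava
MACHFILE=$(mktemp)
echo "${LSB_HOSTS}" | tr ' ' '\n' > ${MACHFILE}
np=$(wc -l < ${MACHFILE})          # process count, must be 1-4
nh=$(sort -u ${MACHFILE} | wc -l)  # distinct hosts, must be 1
echo "np=$np nh=$nh"
rm -f ${MACHFILE}
```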


===== elim code =====

<code>

#!/usr/bin/perl

while (1) {

        $gpu = 0;
        $log = '';
        if (-e "/usr/local/bin/gpu-info" ) {
                $tmp = `/usr/local/bin/gpu-info | egrep "Tesla K20"`;
                @tmp = split(/\n/,$tmp);
                foreach $i (0..$#tmp) {
                        ($a,$b,$c,$d,$e,$f,$g) = split(/\s+/,$tmp[$i]);
                        if ( $f == 0 ) { $gpu = $gpu + 1; }
                        #print "$a $f $gpu\n";
                        $log .= "$f,";
                }
        }
        # nr_of_args name1 value1
        $string = "1 gpu $gpu";

        $h = `hostname`; chop($h);
        $d = `date +%m/%d/%y_%H:%M:%S`; chop($d);
        foreach $i ('n33','n34','n35','n36','n37') {
                if ( "$h" eq "$i" ) {
                        `echo "$d,$log" >> /share/apps/logs/$h.gpu.log`;
                }
        }

        # you need the \n to flush -hmeij
        # you also need the space before the line feed -hmeij
        print "$string \n";
        # or use
        #syswrite(OUT,$string,1);

        # smaller than specified in lsf.shared
        sleep 10;

}

  
</code>
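The elim's counting logic can be reproduced with awk on sample ''gpu-info'' style lines: field 6 of each line is the utilization, and 0 % means the GPU is free. The three sample lines here are made up.

```shell
# Sketch: count idle K20s the way the elim does; field 6 of each
# line is the utilization percentage (0 = free). Sample data only.
gpuinfo="0  Tesla K20m  21 C  0 %
1  Tesla K20m  35 C  99 %
2  Tesla K20m  22 C  0 %"
free=$(echo "$gpuinfo" | awk '$6 == 0 {n++} END {print n+0}')
echo "1 gpu $free"   # -> 1 gpu 2
```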
cluster/119.1377097412.txt.gz · Last modified: 2013/08/21 11:03 by hmeij