==== K20 Redo Usage ====

One node ''
Use the ''#
Update n33-n36 same as n37 (the wrapper

 --- //

Usage is about the same as jobs going to the ''
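
Submitting and monitoring happens with the standard LSF commands. As a quick reminder (only the ''mwgpu'' queue name and ''run.sh'' are taken from the script below; the host list is the n33-n37 set mentioned above):

<code>
bsub < run.sh                 # submit the script shown below
bjobs -u all -q mwgpu         # what is pending/running in the queue
bhosts n33 n34 n35 n36 n37    # slot usage per K20 node
</code>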
** Please check your new results against previous output **.

Details on how the environment was set up:

  * [[cluster:
  * [[cluster:
  * [[cluster:

Here is a submit script for locally recompiled versions of Amber, Gromacs, and Lammps using a custom wrapper.
<code>
#!/bin/bash
# submit via 'bsub < run.sh'
rm -f out err
#BSUB -e err
#BSUB -o out
#BSUB -q mwgpu
#BSUB -J "K20 test"
###BSUB -m n37

# n33-n37 are done and all the same 11Oct2018
# the wrapper is called the same on all hosts

# cuda 9 & openmpi
export PATH=/
export LD_LIBRARY_PATH=/
export PATH=/
export LD_LIBRARY_PATH=/

## leave sufficient time between job submissions (30-60 secs)
## the number of GPUs allocated matches -n value automatically
## always reserve GPU (gpu=1), setting this to 0 is a cpu job only
## reserve 12288 MB (11 GB + 1 GB overhead) memory per GPU
## run all processes (1<
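
## illustrative example only (an assumption, not copied from this page): with
## the reservation rules above, a one-GPU job would end up with something like
###BSUB -n 1
###BSUB -R "rusage[gpu=1:mem=12288]"
## where "gpu" is assumed to be a custom consumable resource in the local LSF setup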

# unique job scratch dirs
MYSANSCRATCH=/
MYLOCALSCRATCH=/
export MYSANSCRATCH MYLOCALSCRATCH
cd $MYLOCALSCRATCH

# uncomment one software block by removing ONLY one # on each line

## AMBER we need to recreate env, $AMBERHOME is already set
##BSUB -n 1
##BSUB -R "
#export PATH=/
#export LD_LIBRARY_PATH=/
#source /
## stage the data
#cp -r ~/
## feed the wrapper
#
#-O -o mdout.$LSB_JOBID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd
## save results
#scp mdout.$LSB_JOBID ~/k20redo/

## GROMACS (using all GPUs example)
##BSUB -n 4
##BSUB -R "
#export CPU_GPU_REQUEST=4:
## signal GMXRC is a gpu run with: 1=thread_mpi 2=openmpi
#export GMXRC=2
#export PATH=/
#export LD_LIBRARY_PATH=/
#. /
#cd /
#
# -maxh 0.5 -nsteps 600000 -multidir 01 02 03 04 -gpu_id 0123 \
# -ntmpi 0 -npme 0 -s topol.tpr -ntomp 0 -pin on -nb gpu

## LAMMPS
##BSUB -n 1
##BSUB -R "
## GPUIDX=1 use allocated GPU(s), GPUIDX=0 cpu run only (view header input file)
#export GPUIDX=1 # use with -var $GPUIDX in input file, view au.in, or use -suffix
#export PATH=/
## stage the data
#cp -r ~/
## feed the wrapper
#
#-suffix gpu -var GPUIDX $GPUIDX -in in.colloid -l out.colloid.$LSB_JOBID
## save results
#scp out.colloid.$LSB_JOBID ~/k20redo/
</code>
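
The wrapper itself is not shown here (its name and path are truncated in this copy of the page). As a rough sketch only, assuming a generic LSF plus CUDA setup rather than the site's actual script: such a wrapper typically works out how many GPUs the job should use, exposes them through ''CUDA_VISIBLE_DEVICES'', and then execs the command line it was fed. The file name, the use of ''LSB_DJOB_NUMPROC'', and the GPU selection by current memory use below are all assumptions.

<code>
#!/bin/bash
# hypothetical_gpu_wrapper.sh -- NOT the site wrapper, just a sketch:
# pick as many lightly used GPUs as the job has slots, expose them via
# CUDA_VISIBLE_DEVICES, then exec the wrapped command line.

# number of job slots granted by LSF (LSB_DJOB_NUMPROC is set by LSF); default 1
NGPU=${LSB_DJOB_NUMPROC:-1}

# GPU indices sorted by least memory currently in use; keep the first $NGPU
IDS=$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits |
      sort -t, -k2 -n | head -n "$NGPU" | cut -d, -f1 | paste -sd,)

export CUDA_VISIBLE_DEVICES=$IDS
echo "wrapper: host=$(hostname) gpus=$CUDA_VISIBLE_DEVICES" >&2

# hand off to the real program (pmemd.cuda, gmx mdrun, lmp, ...)
exec "$@"
</code>

Fed from the submit script, the commented wrapper lines above would then carry the engine and its flags as arguments, along the lines of ''wrapper pmemd.cuda -O -o mdout.$LSB_JOBID ...''.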
==== ib0 ====

We've lost the ability to bring up interface ''ib0''

Details are described here ... http://

Reflecting on this, it is not necessarily that bad. For GPU compute nodes we do not really need it. This would also free up 5 InfiniBand ports on the switch, bringing the total of available ports to 7. Those could be allocated to the new servers we're thinking of buying.
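
For context, on a CentOS-style node ''ib0'' is normally brought up from an interface file plus ''ifup ib0''. The sketch below follows the generic RHEL/CentOS ''network-scripts'' convention only; it is not this cluster's actual configuration and the address is made up.

<code>
# /etc/sysconfig/network-scripts/ifcfg-ib0  (generic example; address is made up)
DEVICE=ib0
TYPE=InfiniBand
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.10.100.37
NETMASK=255.255.0.0
CONNECTED_MODE=yes
</code>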
\\
**[[cluster: