This shows you the differences between two versions of the page.
cluster:182 [2019/08/12 17:05] hmeij07 [Lammps] |
cluster:182 [2019/08/12 17:50] hmeij07 [Gromacs] |
||
---|---|---|---|
Line 72: | Line 72: | ||
Precision for GPU calculations | Precision for GPU calculations | ||
- | * DD -D_DOUBLE_DOUBLE | + | * [DD] -D_DOUBLE_DOUBLE |
- | * SD -D_SINGLE_DOUBLE | + | * [SD] -D_SINGLE_DOUBLE |
- | * SS -D_SINGLE_SINGLE | + | * [SS] -D_SINGLE_SINGLE |
Line 85: | Line 85: | ||
But the T4 shines in this application. The mixed or single precision modes compete well given the T4's price and wattage consumption. | But the T4 shines in this application. The mixed or single precision modes compete well given the T4's price and wattage consumption. | ||
+ | |||
+ | ==== Gromacs ==== | ||
+ | |||
+ | Gromacs was built on each of the nodes locally, letting it select the optimal CPU (AVX, SSE) and GPU accelerators. The '' | ||
+ | |||
+ | |||
+ | ^ ns/ | ||
+ | | Mixed | | | 254| | | gpu=1, 01-04 | | ||
+ | | Mixed | | 551| | | 546| gpu=4, 01-04 | | ||
+ | | Mixed | | | | | 650| gpu=4, 01-08 | | ||
+ | | Mixed | | | | | 733| gpu=4, 01-16 | | ||
+ | |||
+ | The T4 is the P100's equal in mixed precision performance. Factor in its lower power draw and the T4 is the favorite. | ||
+ | |||
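The comparison in the table can be checked with a little arithmetic. The sketch below assumes that 551 is the P100 result and 546 the T4 result in the gpu=4, 01-04 row, and that the 01-08 and 01-16 rows belong to the same T4 column. The column headers are cut off here, so that mapping is an assumption, and the helper names `pct_of` and `speedup` are hypothetical.

```shell
#!/bin/bash
# Ratio of two throughput figures as a percentage (first / second).
pct_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# Speedup of one throughput figure relative to a baseline.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Assumption: 546 = T4 and 551 = P100 in the gpu=4, 01-04 row.
echo "T4 vs P100:     $(pct_of 546 551) %"
# Assumption: the 01-08 and 01-16 rows are T4 runs in the same column.
echo "01-08 vs 01-04: $(speedup 650 546)x"
echo "01-16 vs 01-04: $(speedup 733 546)x"
```

Under those assumptions the T4 reaches about 99 % of the P100 figure, and the larger runs improve on the 01-04 figure by roughly 19 % and 34 %.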
==== Scripts ==== | ==== Scripts ==== | ||
Line 147: | Line 161: | ||
</ | </ | ||
+ | |||
+ | * Gromacs | ||
+ | |||
+ | < | ||
+ | |||
+ | #!/bin/bash | ||
+ | #SBATCH --nodes=1 | ||
+ | #SBATCH --nodelist=node9 | ||
+ | #SBATCH --job-name=" | ||
+ | #SBATCH --ntasks-per-node=32 | ||
+ | #SBATCH --gres=gpu: | ||
+ | #SBATCH --exclusive | ||
+ | |||
+ | export PATH=$HOME/ | ||
+ | export LD_LIBRARY_PATH=$HOME/ | ||
+ | . $HOME/ | ||
+ | rm -f gpu/??/c* gpu/??/e* gpu/??/s* gpu/??/ | ||
+ | cd gpu | ||
+ | |||
+ | # T4 | ||
+ | #export CUDA_VISIBLE_DEVICES=0,1,2,3 | ||
+ | |||
+ | mpirun -np 8 gmx_mpi mdrun -maxh 1 -gpu_id 0123 \ | ||
+ | -nsteps 1000000 -multidir 05 06 07 08 \ | ||
+ | -ntmpi 0 -npme 0 -s topol.tpr -ntomp 0 -pin on -nb gpu | ||
+ | |||
+ | </ | ||
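One way to turn a `-multidir` run like the one in this script into a single throughput figure is to read the `Performance:` line that mdrun writes near the end of each log and add up the ns/day values. This is a minimal sketch, assuming the logs use the default name `md.log` inside each run directory. The function name `sum_ns_day` is hypothetical, and the page does not say whether its table was produced this way.

```shell
#!/bin/bash
# Sum the ns/day figure (2nd field) from the "Performance:" line of each log.
# Usage: sum_ns_day gpu/05/md.log gpu/06/md.log ...
sum_ns_day() {
    awk '$1 == "Performance:" { total += $2; n++ }
         END { printf "%d runs, %.3f ns/day total\n", n, total }' "$@"
}
```

For the four directories used in the script, a call might look like `sum_ns_day gpu/0[5-8]/md.log`.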
+ | |||
+ | GPU utilization was outstanding, at 97 to 99 % on all four T4s during this run. | ||
+ | |||
+ | [heme@login1 gromacs-2018]$ ssh node9 ./ | ||
+ | id, | ||
+ | 0, Tesla T4, 66, 866 MiB, 14213 MiB, 98 %, 9 %\\ | ||
+ | 1, Tesla T4, 67, 866 MiB, 14213 MiB, 98 %, 9 %\\ | ||
+ | 2, Tesla T4, 66, 866 MiB, 14213 MiB, 99 %, 9 %\\ | ||
+ | 3, Tesla T4, 64, 866 MiB, 14213 MiB, 97 %, 9 %\\ | ||
+ | |||
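The command that produced this listing is cut off. Rows of this shape can be obtained from the CSV query mode of `nvidia-smi`. The sketch below is an assumption about an equivalent query, not the script used on the page, and the helpers `gpu_stats` and `avg_gpu_util` are hypothetical.

```shell
#!/bin/bash
# Print one CSV row per GPU, e.g. "0, Tesla T4, 66, 866 MiB, 14213 MiB, 98 %, 9 %".
gpu_stats() {
    nvidia-smi \
        --query-gpu=index,name,temperature.gpu,memory.used,memory.free,utilization.gpu,utilization.memory \
        --format=csv,noheader
}

# Average the GPU utilization column (6th CSV field) of rows read from stdin.
avg_gpu_util() {
    awk -F', *' 'NF >= 6 { sub(/ *%.*/, "", $6); sum += $6; n++ }
                 END { if (n) printf "%.1f\n", sum / n }'
}
```

On a GPU node this could be run as `gpu_stats | avg_gpu_util`. Fed the four rows listed here, the average works out to 98.0.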
\\ | \\ | ||
**[[cluster: | **[[cluster: |