This shows you the differences between two versions of the page.
cluster:182 [2019/08/12 15:00] hmeij07 [Amber] |
cluster:182 [2019/08/12 16:28] hmeij07 [Scripts] |
||
---|---|---|---|
Line 52: | Line 52: | ||
^ ns/ | ^ ns/ | ||
- | | DPFP | 5.21| | + | | DPFP | 5.21| 18.35| |
- | | SXFP | 11.82| | + | | SXFP | 11.82| 37.44| |
- | | SFFP | 11.91| | + | | SFFP | 11.91| 40.98| |
+ | As in the previous round of testing, SFFP precision mode performs best when four individual jobs are run, one per GPU (mpi=1, gpu=1). The P100 gives the best performance at 47.64 ns/day per node, versus 39.69 ns/day per node for the RTX. The T4 runs about one third as fast and falters badly in DPFP precision mode, but in SXFP (experimental) precision mode it makes up ground. | ||
+ | Can't complain about utilization rates.\\ | ||
+ | Amber mpi=4 gpu=4\\ | ||
+ | |||
+ | [heme@login1 amber16]$ ssh node7 ./ | ||
+ | id, | ||
+ | 0, Tesla P100-PCIE-16GB, | ||
+ | 1, Tesla P100-PCIE-16GB, | ||
+ | 2, Tesla P100-PCIE-16GB, | ||
+ | 3, Tesla P100-PCIE-16GB, | ||
+ | |||
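The per-node figures quoted in the Amber notes can be checked with a little arithmetic. The sketch below is not part of the original benchmark scripts. It assumes that the 11.91 ns/day SFFP entry in the table is the rate of a single one-GPU job (mpi=1, gpu=1) on the P100, and that four such jobs on one node do not slow each other down, so their rates simply add. The helper names `node_rate` and `ratio` are hypothetical.

```shell
#!/bin/bash
# Aggregate per-node throughput when running one independent job per GPU.
# Assumption (not stated in the source): the jobs do not interfere,
# so the per-job ns/day rates add up linearly.

# node_rate PER_JOB_NS_DAY NUM_GPUS -> prints per-node ns/day, 2 decimals
node_rate() {
    LC_ALL=C awk -v r="$1" -v n="$2" 'BEGIN { printf "%.2f\n", r * n }'
}

# ratio A B -> prints A/B, 2 decimals
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

node_rate 11.91 4     # P100, SFFP, four one-GPU jobs -> 47.64
ratio 39.69 47.64     # RTX per-node rate relative to the P100 -> 0.83
```

Under these assumptions, four one-GPU jobs reproduce the 47.64 ns/day per-node figure for the P100, and the RTX node delivers roughly 83% of that throughput.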
+ | ==== Lammps ==== | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ^ ns/ | ||
+ | | DPFP | | | | | | | ||
+ | | SXFP | | | | | | | ||
+ | | SFFP | | | | | | | ||
==== Scripts ==== | ==== Scripts ==== | ||
+ | |||
+ | All three software applications were compiled in the default environment with CUDA 10.1. | ||
+ | |||
+ | Currently Loaded Modules:\\ | ||
+ | 1) GCCcore/ | ||
+ | 2) zlib/ | ||
+ | 3) binutils/ | ||
+ | |||
+ | Follow\\ | ||
+ | https:// | ||
* Amber | * Amber | ||
Line 81: | Line 111: | ||
</ | </ | ||
+ | * Lammps | ||
+ | < | ||
+ | |||
+ | #!/bin/bash | ||
+ | |||
+ | #SBATCH --nodes=1 | ||
+ | #SBATCH --nodelist=node5 | ||
+ | #SBATCH --job-name=" | ||
+ | #SBATCH --gres=gpu: | ||
+ | #SBATCH --ntasks-per-node=1 | ||
+ | #SBATCH --exclusive | ||
+ | |||
+ | # RTX | ||
+ | mpirun --oversubscribe -x LD_LIBRARY_PATH -np 1 \ | ||
+ | -H localhost \ | ||
+ | ~/ | ||
+ | -in in.colloid > rtx-1:1 | ||
+ | |||
+ | [heme@login1 lammps-5Jun19]$ squeue | ||
+ | JOBID PARTITION | ||
+ | 2239 normal | ||
+ | |||
+ | [heme@login1 lammps-5Jun19]$ ssh node5 ./gpu-info | ||
+ | id, | ||
+ | 0, Quadro RTX 6000, 50, 186 MiB, 24004 MiB, 51 %, 0 % | ||
+ | |||
+ | </ | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: |