This shows you the differences between two versions of the page.
cluster:182 [2019/08/13 12:39] hmeij07 [Gromacs] |
cluster:182 [2019/12/13 13:33] (current) hmeij07 |
||
---|---|---|---|
Line 2: | Line 2: | ||
**[[cluster: | **[[cluster: | ||
+ | |||
==== P100 vs RTX 6000 & T4 ==== | ==== P100 vs RTX 6000 & T4 ==== | ||
Line 54: | Line 54: | ||
| DPFP | 5.21| 18.35| | | DPFP | 5.21| 18.35| | ||
| SXFP | 11.82| | | SXFP | 11.82| | ||
- | | | + | | |
As in the previous round of testing, in SPFP precision mode it is best to run four individual jobs, one per GPU (mpi=1, gpu=1). The best performance is the P100 at 47.64 ns/day per node versus the RTX at 39.69 ns/day per node. The T4 runs about 1/3 as fast and really falters in DPFP precision mode, but in SXFP (experimental) precision mode the T4 makes up ground in performance. | As in the previous round of testing, in SPFP precision mode it is best to run four individual jobs, one per GPU (mpi=1, gpu=1). The best performance is the P100 at 47.64 ns/day per node versus the RTX at 39.69 ns/day per node. The T4 runs about 1/3 as fast and really falters in DPFP precision mode, but in SXFP (experimental) precision mode the T4 makes up ground in performance. | ||
Line 90: | Line 90: | ||
==== Gromacs ==== | ==== Gromacs ==== | ||
- | Gromacs was built locally on each of the nodes, letting the build select the optimal CPU (AVX, SSE) and GPU accelerators. The '' | + | Gromacs was built locally on each of the nodes, letting the build select the optimal CPU (AVX, SSE) and GPU accelerators. |
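
Because each node gets its own local build, it can be useful to see which SIMD instruction sets a node advertises before building. The following is a minimal sketch, not the detection logic Gromacs itself uses: the helper names `best_simd` and `read_cpu_flags` and the preference order are assumptions made for illustration, and the flag names are the ones Linux lists in `/proc/cpuinfo`.

```python
# Hypothetical helpers for illustration; not part of Gromacs or its build system.

# Assumed preference order, widest instruction set first.
SIMD_PREFERENCE = ["avx512f", "avx2", "avx", "sse4_1", "sse2"]


def best_simd(cpu_flags):
    """Return the first preferred SIMD flag found in a space-separated
    flags string, or None if none of them is present."""
    present = set(cpu_flags.lower().split())
    for flag in SIMD_PREFERENCE:
        if flag in present:
            return flag
    return None


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flags string of the first CPU in a Linux cpuinfo file,
    or an empty string if no flags line is found."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    # On a Linux node this prints the widest SIMD flag the CPU reports.
    print(best_simd(read_cpu_flags()))
```

Running the script on each node type gives a quick check that two nodes which report different flags should indeed get separate local builds.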