Differences

This shows you the differences between two versions of the page.

--- cluster:182 [2019/08/13 12:39] – [Gromacs] hmeij07
+++ cluster:182 [2019/12/13 13:33] (current) – hmeij07
@@ Line 2: / Line 2: @@
 **[[cluster:0|Back]]**
 ==== P100 vs RTX 6000 & T4 ====
@@ Line 54: / Line 54: @@
 |  DPFP  |  5.21|  18.35|  0.75|  0.35|  1.29|
 |  SXFP  |  11.82|  37.44|  17.05|  7.01|  18.91|
-|  SFFP  |  11.91|  40.98|  9.92|  4.35|  16.22|
+|  SPFP  |  11.91|  40.98|  9.92|  4.35|  16.22|
 As in the previous round of testing, in SPFP precision mode it is best to run four individual jobs, one per GPU (mpi=1, gpu=1). The P100 performs best at 47.64 ns/day per node, versus 39.69 for the RTX. The T4 runs about 1/3 as fast and falters badly in DPFP precision mode, but in SXFP (experimental) precision mode it makes up ground.
@@ Line 90: / Line 90: @@
 ==== Gromacs ====
-Gromacs was build on each of the nodes locally letting it select the optimal CPU (AVX, SSE) and GPU accelerators. The ''cmake'' flag -DGMX_BUILD_OWN_FFTW=ON yields a mixed precision compilation which is recommended. Then we ran multidir options 01-04 on single GPU, and 01-08 and 01-16 on all 4 GPUs when possible.
+Gromacs was built locally on each of the nodes, letting the build select the optimal CPU (AVX, SSE) and GPU accelerators. "GROMACS simulations are normally run in “mixed” floating-point precision, which is suited for the use of single precision in FFTW." The ''cmake'' flag ''-DGMX_BUILD_OWN_FFTW=ON'' has the build download and compile its own FFTW to match, which keeps the recommended mixed precision compilation. Then we ran multidir options 01-04 on a single GPU, and 01-08 and 01-16 on all 4 GPUs when possible.
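
The build and multidir run described in the Gromacs hunk can be sketched roughly as follows. This is a minimal sketch, not the commands actually used on the cluster: only ''-DGMX_BUILD_OWN_FFTW=ON'' and the multidir directories 01-04 come from the text. The GPU and MPI flags, the job count for ''make'', the install prefix, and the presence of a ''topol.tpr'' in each directory are assumptions.

```shell
# Assumed layout: started from an unpacked GROMACS source tree.
mkdir -p build && cd build

# -DGMX_BUILD_OWN_FFTW=ON is the flag named in the text.
# -DGMX_GPU=ON and -DGMX_MPI=ON are assumptions: CUDA support plus an MPI
# build (gmx_mpi), which mdrun -multidir needs. Newer GROMACS releases
# spell the GPU option -DGMX_GPU=CUDA instead.
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON -DGMX_MPI=ON
make -j 8          # job count is arbitrary
make install       # default prefix is /usr/local/gromacs

# Put the installed binaries on the PATH (path assumes the default prefix).
source /usr/local/gromacs/bin/GMXRC

# Run from the directory that contains 01..04: four independent
# simulations, one MPI rank each. Each directory is assumed to hold
# its own topol.tpr.
mpirun -np 4 gmx_mpi mdrun -multidir 01 02 03 04
```

Precision is left at the default here. A double precision build would need ''-DGMX_DOUBLE=ON'', which is not what the text describes.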