cluster:175 [2018/09/25 12:35] hmeij07
==== P100 vs GTX & K20 ====
^ ^ P100 ^ GTX ^ K20 ^
| cores | 3,584 | 3,584 | 2,496 |
| mem (GB) | 12/16 | | |
| ghz | 2.6 | 1.6 | 0.7 |

Comparing these GPUs yields the following results.
Credits: This work was made possible, in part, through HPC time donated by Microway, Inc. We gratefully acknowledge Microway for providing access to their GPU-accelerated compute cluster.
</code>
+ | |||
+ | Look at these gpu temperatures, | ||
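One way to sample those temperatures programmatically is to poll ''nvidia-smi'' (assumed present with the NVIDIA driver). The sketch below is illustrative; ''parse_gpu_temps'' and ''read_gpu_temps'' are made-up helper names, not part of any existing tool.

```python
# Sketch: per-GPU temperature and utilization via nvidia-smi.
# Assumes an NVIDIA driver; helper names are hypothetical.
import subprocess

QUERY = ["nvidia-smi",
         "--query-gpu=index,temperature.gpu,utilization.gpu",
         "--format=csv,noheader,nounits"]

def parse_gpu_temps(csv_text):
    """Turn 'index, temp, util' CSV lines into {index: (temp_C, util_pct)}."""
    temps = {}
    for line in csv_text.strip().splitlines():
        idx, temp, util = (field.strip() for field in line.split(","))
        temps[int(idx)] = (int(temp), int(util))
    return temps

def read_gpu_temps():
    # Raises if nvidia-smi is absent, e.g. on a CPU-only node.
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_temps(out)
```

Run in a loop (or via ''nvidia-smi ... -l 5'') this gives a quick view of which GPUs run hot under load.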
==== Lammps ====
</code>
+ | |||
+ | === WZ === | ||
+ | |||
+ | a) 1GPU with X CPUs | ||
+ | |||
+ | #cpus | ||
+ | 4 89.6 | ||
+ | 2 61.6 | ||
+ | 1 34.3 | ||
+ | |||
+ | b) 4 GPUs with 4 CPUs | ||
+ | 92 ns/day | ||
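As a sanity check on the a) scaling, the speedup and parallel efficiency implied by those numbers can be computed directly (helper names here are illustrative):

```python
# Sketch: speedup and efficiency implied by the 1-GPU numbers above
# (89.6 / 61.6 / 34.3 ns/day at 4 / 2 / 1 CPUs).
ns_per_day = {1: 34.3, 2: 61.6, 4: 89.6}

def speedup(ncpus):
    """Throughput gain over the single-CPU run."""
    return ns_per_day[ncpus] / ns_per_day[1]

def efficiency(ncpus):
    """Speedup divided by the ideal (linear) speedup."""
    return speedup(ncpus) / ncpus

for n in (2, 4):
    print(f"{n} CPUs: {speedup(n):.2f}x speedup, {efficiency(n):.0%} efficiency")
```

Going from one to four CPUs yields about a 2.6x speedup, i.e. roughly 65% parallel efficiency, so the CPU side stops paying off well before the GPU saturates.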
+ | |||
+ | ---------- | ||
Compare these results to:

^ setup ^ ns/day ^
| 1 K20 GPU + 4 CPUs | 37 |
| 1 K20 GPU + 6 CPUs | 47 |
| 1 GTX GPU + 4 CPUs | 73 |
| 1 GTX GPU + 6 CPUs | 90 |
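For quick reference, the relative throughput of the P100 run (89.6 ns/day with 1 GPU + 4 CPUs, from a) above) against these K20/GTX results is simply the ratio of the quoted numbers:

```python
# Sketch: throughput ratios of the P100 result (89.6 ns/day, 1 GPU + 4 CPUs)
# against the K20 and GTX results quoted above.
p100 = 89.6
others = {
    "1 K20 GPU + 4 CPUs": 37.0,
    "1 K20 GPU + 6 CPUs": 47.0,
    "1 GTX GPU + 4 CPUs": 73.0,
    "1 GTX GPU + 6 CPUs": 90.0,
}
for setup, nd in others.items():
    print(f"P100 vs {setup}: {p100 / nd:.2f}x")
```

At four CPUs apiece, the P100 is roughly 2.4x the K20 and about 1.2x the GTX for this job.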
+ | |||
+ | ---- | ||
+ | Test dir location | ||
+ | / | ||
+ | |||
+ | |||
+ | -------------- | ||
+ | I am running a material called PMMA (https:// | ||
+ | |||
+ | The reason for the different benchmark than yours is that the PMMA simulations require the calculation of molecular bonds, which is not implemented in GPU. | ||
==== Gromacs ====
Gromacs has shown vastly improved performance between versions. v5 delivered about 20 ns/day on the K20 server and 350 ns/day on the GTX server; v2018 delivered 75 ns/day on the K20 server and 900 ns/day on the GTX server, roughly a 3x improvement.
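The per-server improvement factors implied by those numbers can be checked directly:

```python
# Sketch: v5 -> v2018 improvement factors from the ns/day figures above.
v5    = {"K20 server": 20.0, "GTX server": 350.0}
v2018 = {"K20 server": 75.0, "GTX server": 900.0}
for server in v5:
    print(f"{server}: {v2018[server] / v5[server]:.2f}x")
```

That works out to 3.75x on the K20 server and about 2.57x on the GTX server, hence "roughly 3x" overall.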
On the P100 test node I could not invoke the multidir option of Gromacs (it has run on the GTX node, which is odd). GPU utilization drops as more and more GPUs are deployed.
<code>
localhost,
localhost,
gmx_mpi mdrun -gpu_id 0123 -ntmpi 0 \
  -s topol.tpr -ntomp 4 -npme 1 -nsteps 20000 -pin on -nb gpu