cluster:175

Differences

This shows you the differences between two versions of the page.

cluster:175 [2018/09/22 18:35]
hmeij07
cluster:175 [2018/09/26 20:01]
hmeij07 [Gromacs]
Line 9: Line 9:
 |  mem  |  12/16  |  11  |  5  |  gb  |
 |  ghz  |  2.6  |  1.6  |  0.7  |  speed  |
-|  flops  |  4.7  |  0.355  |  1.15  |  dpfp  |
+|  flops  |  4.7/5.3  |  0.355  |  1.15  |  dpfp  |
  
 Comparing these GPUs yields the results presented below. These are not "benchmark suites", so your mileage may vary. They give us some comparative information for decision making on our 2018 GPU Expansion Project.  The GTX & K20 data comes from this page: [[cluster:164|GTX 1080 Ti]]
Line 69: Line 69:
  
 </code>
 +
 +==== Lammps (PMMA) ====
 +
 +This test uses a material called PMMA (https://en.wikipedia.org/wiki/Poly(methyl_methacrylate)), also known as acrylic glass or plexiglas ("safety glass"). The PMMA simulations require the calculation of molecular bonds, which is not implemented on the GPU, so more CPU cores are required than in the Colloid example. The optimal cpu:gpu ratio appears to be 4-6:1.
 + 
 +^  gpu  ^  cpus  ^  ns/day  ^  quad  ^  ns/day/node  ^
 +|  1 P100  |  4  |  89  |  x4  |  356  |
 +|  1 GTX  |  6  |  90  |  x4  |  360  |
 +|  1 K20  |  6  |  47  |  x4  |  188  |
 +
 +That means the P100 performs as well as the GTX. The K20 delivers roughly 50% of the performance of the others, which is impressive for this old gpu.
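The per-node figures in the table are just the single-gpu rate multiplied by four. A minimal Python sketch of that arithmetic follows, assuming throughput scales linearly across the four gpus in a node (the table implies this, but it was not measured here). The variable and function names are illustrative only.

```python
# ns/day for a single gpu, copied from the PMMA table above
ns_per_day = {"P100": 89, "GTX": 90, "K20": 47}

# The "quad" column: four gpus per node (assumes linear scaling)
GPUS_PER_NODE = 4


def per_node(ns_day, gpus=GPUS_PER_NODE):
    """Scale a single-gpu result to a full node."""
    return ns_day * gpus


node_totals = {gpu: per_node(rate) for gpu, rate in ns_per_day.items()}
relative_to_gtx = {gpu: rate / ns_per_day["GTX"] for gpu, rate in ns_per_day.items()}

for gpu in ns_per_day:
    print(f"{gpu}: {node_totals[gpu]} ns/day/node, {relative_to_gtx[gpu]:.0%} of GTX")
```

Run as-is, this reproduces the ns/day/node column and shows the K20 at a little over half the GTX rate.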
 +
 +
 +
  
 ==== Gromacs ====
Line 74: Line 88:
 Gromacs has shown vastly improved performance between versions. v5 delivered about 20 ns/day per K20 server and 350 ns/day on GTX server. v2018 delivered 75 ns/day per K20 server and 900 ns/day on GTX server. A roughly 3x improvement.
  
-On the P100 test node, I could not invoke the multidir option of gromacs (have run it on GTX, weird). The utilization of the gpu drops as more and more gpus are deployed.  The optimum performance was with dual gpus achieving 36 ns/day. Four one gpu jobs would deliver 136 ns/day/server, far short of the 900 ns/day for our GTX server. (We only have dual P100 nodes quoted).
+On the P100 test node, I could not invoke the multidir option of gromacs (have run it on GTX, weird). The utilization of the gpu drops as more and more gpus are deployed.  The optimum performance was with dual gpus achieving 36 ns/day. Four one gpu jobs would deliver 120 ns/day/server, far short of the 900 ns/day for our GTX server. (We only have dual P100 nodes quoted).
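To put the quoted Gromacs numbers side by side, here is a small Python sketch that computes the ratios. It uses only the per-server ns/day figures stated in the text, taking 120 ns/day/server from the newer revision for the P100. The names are illustrative.

```python
# ns/day per server, as quoted in the text above
k20 = {"v5": 20, "v2018": 75}
gtx = {"v5": 350, "v2018": 900}
p100_four_jobs = 120  # four one-gpu jobs on the P100 test node (newer revision)


def speedup(new, old):
    """Ratio of two throughput figures."""
    return new / old


k20_gain = speedup(k20["v2018"], k20["v5"])          # v5 -> v2018 on the K20 server
gtx_gain = speedup(gtx["v2018"], gtx["v5"])          # v5 -> v2018 on the GTX server
gtx_vs_p100 = speedup(gtx["v2018"], p100_four_jobs)  # GTX server vs P100 test node

print(f"K20 v5 -> v2018: {k20_gain:.2f}x")
print(f"GTX v5 -> v2018: {gtx_gain:.2f}x")
print(f"GTX server vs P100 node: {gtx_vs_p100:.1f}x")
```

The two version-to-version gains bracket the "roughly 3x" figure, and the last ratio quantifies how far the P100 node falls short of the GTX server.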
  
 <code>
Line 83: Line 97:
 localhost,localhost,localhost,localhost,localhost,localhost,\
 localhost,localhost,localhost,localhost,localhost,localhost,localhost \
-gmx_mpi mdrun -gpu_id 0123 -ntmpi 0 \
+gmx_mpi mdrun -gpu_id 0123 -ntmpi 0 -nt 0 \
  -s topol.tpr -ntomp 4 -npme 1 -nsteps 20000 -pin on -nb gpu
  
Line 103: Line 117:
 0, Tesla P100-PCIE-16GB, 36, 7048 MiB, 9232 MiB, 97 %, 4 %
  
 +</code>
 +
 +==== Gromacs 2018.3 ====
 +
 +<code>
 +multidir -gpu_id 0123
 +-np  8 -ntomp  4 -npme 1 -maxh 0.1 -pin on -nb gpu
 +01/md.log:Performance:       36.692        0.654
 +02/md.log:Performance:       36.650        0.655
 +03/md.log:Performance:       36.623        0.655
 +04/md.log:Performance:       36.663        0.655
 +-np 16 -ntomp  8 -npme 1 -maxh 0.1 -pin on -nb gpu
 +01/md.log:Performance:       25.151        0.954
 +02/md.log:Performance:       25.257        0.950
 +03/md.log:Performance:       25.247        0.951
 +04/md.log:Performance:       25.345        0.947
 +
 +multidir -gpu_id 00112233
 +-np  8 -ntomp  4 -npme 1 -maxh 0.1 -pin on -nb gpu
 +Error in user input:
 +The string of available GPU device IDs '00112233' may not contain duplicate
 +device IDs
 </code>
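The multidir results above can be totalled per run to get a node-level figure. The following Python sketch parses the pasted lines and sums the first Performance column, which Gromacs reports as ns/day (the second column is hour/ns). It assumes the four simulations in each run executed concurrently, so the node throughput is their sum. The `GREP_OUTPUT` sample and the function name are illustrative, not part of the original test scripts.

```python
import re
from collections import defaultdict

# Sample copied from the results above; each "-np ..." line starts a new run
GREP_OUTPUT = """\
-np  8 -ntomp  4 -npme 1 -maxh 0.1 -pin on -nb gpu
01/md.log:Performance:       36.692        0.654
02/md.log:Performance:       36.650        0.655
03/md.log:Performance:       36.623        0.655
04/md.log:Performance:       36.663        0.655
-np 16 -ntomp  8 -npme 1 -maxh 0.1 -pin on -nb gpu
01/md.log:Performance:       25.151        0.954
02/md.log:Performance:       25.257        0.950
03/md.log:Performance:       25.247        0.951
04/md.log:Performance:       25.345        0.947
"""

# <dir>/md.log:Performance: <ns/day> <hour/ns>
PERF = re.compile(r"^(\S+)/md\.log:Performance:\s+([\d.]+)\s+([\d.]+)$")


def total_ns_per_day(text):
    """Sum ns/day over all multidir directories, keyed by the run's option line."""
    totals = defaultdict(float)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("-np"):
            current = " ".join(line.split())  # normalize the column padding
            continue
        match = PERF.match(line)
        if match and current is not None:
            totals[current] += float(match.group(2))
    return dict(totals)


for options, total in total_ns_per_day(GREP_OUTPUT).items():
    print(f"{total:.1f} ns/day/node  <-  {options}")
```

On this sample the `-np 8 -ntomp 4` run totals noticeably more than the `-np 16 -ntomp 8` run, which matches the per-directory numbers.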
  
cluster/175.txt · Last modified: 2018/11/29 18:00 by hmeij07