cluster:164

Differences

This shows you the differences between two versions of the page.

cluster:164 [2017/10/27 19:30]
hmeij07
cluster:164 [2017/10/30 14:05]
hmeij07 [PPMA Bench]
Line 564: Line 564:
  
 ==== PMMA Bench ==== ==== PMMA Bench ====
 +
 +  * Runs fastest when constrained to one gpu with 4 mpi threads
 +  * Room for improvement as gpu and gpu memory are not fully utilized
 +  * Adding mpi threads or more gpus reduces ns/day performance
 +  * Untested: whether adding omp threads changes this picture
 +  * Untested: how this compares to K20 gpus
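The fastest case noted above (one gpu, 4 mpi threads) can be sketched as a dry run that only prints the command line. This is a sketch under assumptions, not the author's script: it assumes `$STRING_1` in the benchmark command stands for the mpi thread count and `$STRING_2` for the gpu count passed to `-pk gpu`, and the helper name `build_cmd` is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the benchmark command line instead of executing it.
# Assumption: STRING_1 is the mpi thread count, STRING_2 the gpu count.
# build_cmd is a hypothetical helper, not part of the original page.

# build_cmd NPROCS NGPUS -> print the mpirun command for that configuration
build_cmd() {
    nprocs=$1
    ngpus=$2
    echo "/usr/local/mpich-3.1.4/bin/mpirun -launcher ssh -f ./hostfile -n $nprocs" \
         "/usr/local/lammps-11Aug17/lmp_mpi-double-double-with-gpu -suffix gpu -pk gpu $ngpus" \
         "-in nvt.in -var t 310"
}

# Expose only gpu 0 to the run, then show the 1 gpu / 4 mpi threads case.
export CUDA_VISIBLE_DEVICES=0
build_cmd 4 1
```

Printing rather than executing keeps the sketch safe to run on a machine without gpus; on n78 the printed line could be passed to `sh -c` once it looks right.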
  
 <code> <code>
  
-PMMA Benchmark Performance Metric (x  nr of gpus)
+nvidia-smi -pm 0; nvidia-smi -c 0
 +# gpu_id is done via CUDA_VISIBLE_DEVICES 
 +export CUDA_VISIBLE_DEVICES=0   # one of 0, 1, 2, 3, or a comma-separated list such as 0,1,2,3
 + 
 +# on n78 
 +cd /home/hmeij/lammps/benchmark 
 +rm -f /tmp/lmp-run.log;rm -f *.jpg;\ 
 +time /usr/local/mpich-3.1.4/bin/mpirun -launcher ssh -f ./hostfile  -n $STRING_1 \ 
 +/usr/local/lammps-11Aug17/lmp_mpi-double-double-with-gpu -suffix gpu -pk gpu $STRING_2 \ 
 +-in nvt.in -var t 310 > /dev/null 2>&1; grep ^Performance /tmp/lmp-run.log 
 + 
 + 
 +PMMA Benchmark Performance Metric ns/day (x  nr of gpus for node output)
  
  
-GTX on n78
+Lammps 11Aug17 on GTX1080Ti (n78)
  
 -n 1, -gpu_id 3 -n 1, -gpu_id 3
Line 617: Line 635:
 -n 4, -gpu_id 0 -n 4, -gpu_id 0
 -n 4, -gpuid 0123 -n 4, -gpuid 0123
 +
 +# comparison of binaries running PMMA
 +# 1 gpu 4 mpi threads each run
 +
 +# lmp_mpi-double-double-with-gpu.log
 +Performance: 49.833 ns/day, 0.482 hours/ns, 576.769 timesteps/s
 +# lmp_mpi-single-double-with-gpu.log
 +Performance: 58.484 ns/day, 0.410 hours/ns, 676.899 timesteps/s
 +# lmp_mpi-single-single-with-gpu.log
 +Performance: 56.660 ns/day, 0.424 hours/ns, 655.793 timesteps/s
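To put the three binaries on a common scale, the ns/day figures above can be turned into a percent gain over the double-double build. A minimal sketch, assuming `awk` is available; the function name `speedup` is hypothetical and the inputs are the values from the logs listed above.

```shell
#!/bin/sh
# speedup BASE VALUE -> percent gain of VALUE over BASE, one decimal place
# (hypothetical helper; inputs are the ns/day figures from the logs above)
speedup() {
    awk -v base="$1" -v val="$2" 'BEGIN { printf "%.1f\n", (val / base - 1) * 100 }'
}

speedup 49.833 58.484   # single-double vs double-double, prints 17.4
speedup 49.833 56.660   # single-single vs double-double, prints 13.7
```

On these single runs the mixed-precision build comes out roughly 17% ahead of double-double and the single-precision build roughly 14% ahead; with one run per binary, the small gap between the last two may be within run-to-run noise.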
  
  
cluster/164.txt · Last modified: 2018/09/21 11:59 by hmeij07