This shows you the differences between two versions of the page.
cluster:164 [2017/10/27 19:38] hmeij07 |
cluster:164 [2017/11/16 17:41] hmeij07 [PPMA Bench] |
||
---|---|---|---|
Line 569: | Line 569: | ||
* Adding mpi threads or more gpus reduces ns/day performance | * Adding mpi threads or more gpus reduces ns/day performance | ||
* No idea if adding omp threads shows a different picture | * No idea if adding omp threads shows a different picture | ||
+ | * No idea how it compares to K20 gpus | ||
< | < | ||
+ | |||
nvidia-smi -pm 0; nvidia-smi -c 0 | nvidia-smi -pm 0; nvidia-smi -c 0 | ||
# gpu_id is done via CUDA_VISIBLE_DEVICES | # gpu_id is done via CUDA_VISIBLE_DEVICES | ||
Line 634: | Line 636: | ||
-n 4, -gpuid 0123 | -n 4, -gpuid 0123 | ||
+ | # comparison of binaries running PMMA | ||
+ | # 1 gpu 4 mpi threads each run | ||
+ | |||
+ | # lmp_mpi-double-double-with-gpu.log | ||
+ | Performance: | ||
+ | # lmp_mpi-single-double-with-gpu.log | ||
+ | Performance: | ||
+ | # lmp_mpi-single-single-with-gpu.log | ||
+ | Performance: | ||
</ | </ | ||
+ | |||
+ | ==== FSL ==== | ||
+ | |||
+ | **User Time Reported** from the time command | ||
+ | |||
+ | * mwgpu cpu run | ||
+ | * 2013 model name : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz | ||
+ | * All tests 45m | ||
+ | * Bft test 16m28s (bedpostx) | ||
+ | |||
+ | * amber128 cpu run | ||
+ | * 2017 model name : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz | ||
+ | * All tests 17m - about 2.6x faster than the mwgpu cpu run | ||
+ | * Bft test 3m39s - about 4.5x faster than the mwgpu cpu run (bedpostx) | ||
+ | |||
+ | * amber128 gpu run | ||
+ | * 2017 CUDA Device Name: GeForce GTX 1080 Ti | ||
+ | * Bft gpu test 0m1.881s (what!? from command line) - 116x faster than the amber128 cpu run (bedpostx_gpu) | ||
+ | * Bft gpu test 0m1.850s (what!? via scheduler) - 118x faster than the amber128 cpu run (bedpostx_gpu) | ||
+ | |||
+ | |||
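The speedup factors in the FSL list can be recomputed from the reported user times. The sketch below is one way to do that in Python. The helper names `to_seconds` and `speedup` are hypothetical and not part of FSL or of the cluster tooling, and the duration strings are simply copied from the list above.

```python
import re


def to_seconds(text):
    """Convert a time-style duration such as '16m28s', '45m' or '0m1.881s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text)
    if match is None or not text:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0)
    return minutes * 60 + seconds


def speedup(baseline, candidate):
    """Return how many times faster the candidate run is than the baseline run."""
    return to_seconds(baseline) / to_seconds(candidate)


if __name__ == "__main__":
    # (label, baseline user time, candidate user time), values taken from the list above
    comparisons = [
        ("All tests, mwgpu cpu vs amber128 cpu", "45m", "17m"),
        ("Bft test, mwgpu cpu vs amber128 cpu", "16m28s", "3m39s"),
        ("Bft test, amber128 cpu vs gpu (command line)", "3m39s", "0m1.881s"),
        ("Bft test, amber128 cpu vs gpu (scheduler)", "3m39s", "0m1.850s"),
    ]
    for label, baseline, candidate in comparisons:
        print(f"{label}: {speedup(baseline, candidate):.1f}x")
```

Taking the amber128 cpu run as the baseline reproduces the 116x and 118x gpu figures. Against the mwgpu cpu run the same gpu times would come out at roughly 525x, so it matters which baseline is quoted.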
\\ | \\ | ||
**[[cluster: | **[[cluster: |