cluster:175 [2018/09/22 18:25] hmeij07
**[[cluster:
==== P100 vs GTX & K20 ====
+ | |||
+ | ^ ^ P100 ^ GTX ^ K20 ^ | ||
+ | | cores | 3,584 | 3,584 | 2,496 | | ||
+ | | mem | 12/16 | 11g | 5g | | ||
+ | | ghz | 2.6 | 1.6 | 0.7 | | ||
+ | | dpfp | 4.7 | 0.355 | 1.15 | | ||
Comparing these GPUs yields the following data. These are not "
On our GTX server the best performance was at a cpu:gpu ratio of 16:4, yielding 932,493 tau/day (11x faster than our K20). However, scaling the job down to a cpu:gpu ratio of 4:2 yields 819,207 tau/day, which means a quad-GPU server can deliver about 1.6 million tau/day.
A single P100 gpu beats this easily, coming in at 2.6 million tau/day. Spreading the problem over more gpus raised overall performance to 3.3 million tau/day. However, four cpu:gpu 1:1 jobs would achieve slightly over 10 million tau/day, almost 10x faster than our GTX server.
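As a sanity check on these aggregate figures, here is a small arithmetic sketch using the per-job rates quoted above (the variable names are mine, not from any benchmark script):

```python
# Aggregate tau/day estimates from the per-job rates quoted in the text.
rates = {
    "gtx_16:4": 932_493,    # best single job on the GTX server
    "gtx_4:2":  819_207,    # smaller job; two fit on a quad-GPU server
    "p100_1:1": 2_600_000,  # single P100 with one cpu core
}

gtx_quad  = 2 * rates["gtx_4:2"]    # two 4:2 jobs on one quad-GPU server
p100_four = 4 * rates["p100_1:1"]   # four 1:1 jobs across four P100s

print(f"GTX quad server: {gtx_quad:,} tau/day")    # about 1.6 million
print(f"P100 x4 jobs:    {p100_four:,} tau/day")   # slightly over 10 million
```

Which matches the "about 1.6 million" and "slightly over 10 million" figures above.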
Gromacs has shown vastly improved performance between versions. v5 delivered about 20 ns/day per K20 server and 350 ns/day on the GTX server; v2018 delivered 75 ns/day per K20 server and 900 ns/day on the GTX server, roughly a 3x improvement.
On the P100 test node I could not invoke the multidir option of gromacs (it has run on the GTX node, which is odd). The utilization of each gpu drops as more and more gpus are deployed.
<code>
gpu=4 mpi=25 ntomp=4 -npme 1
Performance:
gpu=3 (same)
</code>
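For reference, a multi-simulation run of the sort described above would be launched along these lines. This is a sketch only: ''gmx_mpi'' assumes an MPI-enabled build, and the ''run1''..''run4'' directory names and the rank count are hypothetical, chosen to mirror the gpu=4, ntomp=4, -npme 1 settings shown.

```shell
# Hypothetical launch line for four concurrent simulations on a 4-GPU node.
# -multidir splits the MPI ranks evenly over the four directories;
# -gpu_id 0123 makes all four gpus visible to the run.
mpirun -np 24 gmx_mpi mdrun \
    -multidir run1 run2 run3 run4 \
    -ntomp 4 -npme 1 \
    -gpu_id 0123
```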
+ | |||
==== What to Buy ====

  * Amber folks: does not matter
  * Lammps folks: P100 nodes please
  * Gromacs folks: GTX nodes please

Remember that placing GTX gpus in a data center voids their warranty.
+ | |||
\\ | \\ |