cluster:192 [2020/02/26 18:34] hmeij07 [Usage] |
cluster:192 [2020/02/26 20:05] hmeij07 [EXX96] |
===== EXX96 ===== | ===== EXX96 ===== |
| |
A page for me on how these 12 nodes were built up after they arrived. To make them "ala n37" which as the test node in redoing our K20 nodes, see [[cluster:172|K20 Redo]] | A page for me on how these 12 nodes were built up after they arrived. To make them "ala n37" which was the test node in redoing our K20 nodes, see [[cluster:172|K20 Redo]] |
| |
Page best followed bottom to top. | Page best followed bottom to top. |
==== Usage ==== | ==== Usage ==== |
| |
The new queue ''exx96'' will be comprised of nodes ''n79-n90''. Each node holds 4x RTX2080S gpus, 2x Xeon Silver 4214 2.2 GHz cpus, 96 GB memory and a 1TB SSD. ''/localscratch'' is around 800 GD. | The new queue ''exx96'' will be comprised of nodes ''n79-n90''. Each node holds 4x RTX2080S gpus, 2x Xeon Silver 4214 2.2 GHz 12-core cpus, 96 GB memory and a 1TB SSD. ''/localscratch'' is around 800 GB. |
| |
A new static resource is introduced for all nodes holding gpus. ''n78'' in queue ''amber128'' and ''n33-n37'' in queue ''mwgpu''. The name of this resource is ''gpu4''. Moving forward please use it instaed of ''gpu'' or ''gputest''. | A new static resource is introduced for all nodes holding gpus. ''n78'' in queue ''amber128'' and ''n33-n37'' in queue ''mwgpu'' and the nodes mentioned above. The name of this resource is ''gpu4''. Moving forward please use it instead of ''gpu'' or ''gputest''. |
| |
The wrappers provided assume your cpu:gpu ration is 1:1 hence in your submit code you will have ''#BSUB -n 1'' and in your resource allocation line ''gpu4=1''. If your ratio is something else you can set CPU_GPU_REQUEST, for example CPU_GPU_REQUEST=4:2 which expectas the lines ''#BSUB -n 4'' and ''gpu4=2'' in your submit script. | The wrappers provided assume your cpu:gpu ratio is 1:1 hence in your submit code you will have ''#BSUB -n 1'' and in your resource allocation line ''gpu4=1''. If your ratio is something else you can set CPU_GPU_REQUEST. For example CPU_GPU_REQUEST=4:2 expects the lines ''#BSUB -n 4'' and ''gpu4=2'' in your submit script. Sample script at ''/home/hmeij/k20redo/run.rtx'' |
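A minimal sketch of the header of a submit script for the 4:2 case described above. The ''#BSUB -n'', ''rusage'' and ''CPU_GPU_REQUEST'' lines are the ones this page names. The ''-q exx96'' line is an assumption that the job targets the new queue. The ''mem=6288'' value is only an illustration and should be sized to your own job. The wrapper invocation itself is left out because its arguments are not described here; the sample script ''/home/hmeij/k20redo/run.rtx'' is the reference for that part.

<code bash>
#!/bin/bash
# Sketch only: header of an LSF submit script for a 4:2 cpu:gpu request.

# Assumption: job goes to the new exx96 queue (nodes n79-n90)
#BSUB -q exx96

# 4 cpu cores, matching the "4" in CPU_GPU_REQUEST
#BSUB -n 4

# 2 gpus via the gpu4 static resource, all on one host;
# mem=6288 is an illustrative value, adjust for your job
#BSUB -R "rusage[gpu4=2:mem=6288],span[hosts=1]"

# Tell the wrapper the ratio is not the default 1:1
export CPU_GPU_REQUEST=4:2

# The application launch through the wrapper in /usr/local/bin goes here;
# see /home/hmeij/k20redo/run.rtx for the actual invocation.
</code>

For the default 1:1 ratio the same header would use ''#BSUB -n 1'' and ''gpu4=1'', and the ''CPU_GPU_REQUEST'' line can be dropped.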
| |
The wrappers (78.mpich3.wrapper for n78, and n37.openmpi.wrapper for all others) are located in ''/usr/local/bin'' and will set up the environment and start these applications: amber, lammps, gromacs, matlab and namd. | The wrappers (78.mpich3.wrapper for ''n78'', and n37.openmpi.wrapper for all others) are located in ''/usr/local/bin'' and will set up your environment and start either of these applications: amber, lammps, gromacs, matlab and namd from ''/usr/local''. |
| |
| |
<code> | <code> |
| |
| # command that shows gpu reservations |
bhosts -l n79 | bhosts -l n79 |
gputest gpu4 | gputest gpu4 |
Total 0 3 | Total 0 3 |
Reserved 0.0 0.1 | Reserved 0.0 1.0 |
| |
| # old way of doing that |
lsload -l n79 | lsload -l n79 |
| |
n79 ok 0.0 0.0 0.0 0% 0.0 0 0 2e+08 826G 10G 90G 3.0 | n79 ok 0.0 0.0 0.0 0% 0.0 0 0 2e+08 826G 10G 90G 3.0 |
| |
mdout.325288:| Master Total CPU time: 982.60 seconds 0.27 hours 1:1 | </code> |
mdout.325289:| Master Total CPU time: 611.08 seconds 0.17 hours 4:2 | |
mdout.326208:| Master Total CPU time: 537.97 seconds 0.15 hours 36:4 | |
| |
#BSUB -n 4 | Peer to peer communication is possible (via PCIe rather than NVlink) with this hardware. This will get rather messy in setting up. Some quick off the cuff performance data reveals some impact. Generally in our environment the gains are not worth the effort. Using Amber and ''pmemd.cuda.MPI'' |
#BSUB -R "rusage[gpu4=2:mem=6288],span[hosts=1]" | |
export CPU_GPU_REQUEST=4:2 | |
| |
</code> | <code> |
| cpu:gpu |
| mdout.325288:| Master Total CPU time: 982.60 seconds 0.27 hours 1:1 |
| mdout.325289:| Master Total CPU time: 611.08 seconds 0.17 hours 4:2 |
| mdout.326208:| Master Total CPU time: 537.97 seconds 0.15 hours 36:4 |
| |
| </code> |
==== Miscellaneous ==== | ==== Miscellaneous ==== |
| |