This shows you the differences between two versions of the page.
cluster:208 [2022/06/01 08:42] hmeij07 [gpu testing] |
cluster:208 [2022/06/03 08:30] hmeij07 [gpu testing] |
||
---|---|---|---|
Line 411: | Line 411: | ||
* twisted logic | * twisted logic | ||
* so recent openhpc version but old slurm version in software stack | * so recent openhpc version but old slurm version in software stack | ||
- | * trying standalone install on openhpc prod cluster | + | * trying standalone install on openhpc prod cluster |
+ | * do all 4 jobs have similar wall times? Yes, on n100 they vary from 0.6 to 0.7 hours | ||
+ | |||
+ | * ohpc v2.4 slurm v 20.11.8 | ||
+ | * part=test, n 1, B 1:1:1, cuda_visible=0, | ||
+ | * same as above but all 16 jobs run on gpu 0 | ||
+ | * so is the limit of 4 jobs on the rtx5000 gpu a hardware phenomenon? | ||
+ | * all 16 jobs finished, wall times of 3.11 to 3.60 hours | ||
+ | |||
+ | |||
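The job settings noted above (part=test, n 1, B 1:1:1, cuda_visible=0) could be written out as a Slurm batch script roughly like the sketch below. This is not the script used in the test. It assumes "n 1" means `sbatch -n 1`, "B 1:1:1" means `sbatch -B 1:1:1` (sockets:cores:threads), and "cuda_visible=0" means exporting `CUDA_VISIBLE_DEVICES=0`. The function name `render_job_script` and the command `./gpu_test.sh` are hypothetical placeholders.

```python
def render_job_script(partition="test", ntasks=1, extra_node_info="1:1:1",
                      gpu_index=0, command="./gpu_test.sh"):
    """Return the text of a Slurm batch script pinned to one GPU.

    Assumptions (not confirmed by the notes): the options map to
    --partition, -n, -B and CUDA_VISIBLE_DEVICES; `command` is a
    hypothetical stand-in for the real workload.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",          # part=test
        f"#SBATCH -n {ntasks}",                      # n 1
        f"#SBATCH -B {extra_node_info}",             # B 1:1:1
        f"export CUDA_VISIBLE_DEVICES={gpu_index}",  # every job sees only this GPU
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script; submitting 16 copies of it would put all jobs on gpu 0.
    print(render_job_script())
```

Saving the printed text to a file and submitting it 16 times with `sbatch` would approximate the "all 16 jobs run on gpu 0" case, again assuming the option mapping above is right.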