cluster:208

Differences

This shows you the differences between two versions of the page.

cluster:208 [2022/05/31 12:34] hmeij07 [gpu testing]
cluster:208 [2022/06/01 12:42] hmeij07 [gpu testing]
Line 385:
 ===== gpu testing =====
 
-  * test slurm v 21.08.1
+  * test standalone slurm v 21.08.1
   * n33-n37 each: 4 gpus, 16 cores, 16 threads, 32 cpus
   * submit one at a time, observe
Line 405:
   * do all 16 jobs log the same wall time? Yes, between 10.10 and 10.70 hours.
 
-  * ohpc slurm v 20.11.8
+  * ohpc v2.4 slurm v 20.11.8
   * part=test, n 1, B 1:1:1, cuda_visible=0, no node specified, n100 only
-  *
+  * hit a bug, you must specify cpus-per-gpu **and** mem-per-gpu
+  * then slurm detects 4 gpus on allocated node and allows 4 jobs on a single allocated gpu
+  * twisted logic
+  * so recent openhpc version but old slurm version in software stack
+  * trying standalone install on openhpc prod cluster
 + 
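Editor's sketch, not part of either revision: the added notes say that under slurm v 20.11.8 a GPU job must specify both cpus-per-gpu and mem-per-gpu. A minimal dry run of what such a submission line might look like is below. The values `1` and `7168M`, the `--gpus=1` request, and the job script name `run.sh` are assumptions for illustration only; the page does not state them. The script only prints the command, it does not submit anything.

```shell
#!/bin/sh
# Dry run: compose (but do not execute) an sbatch command line that sets
# both --cpus-per-gpu and --mem-per-gpu, as the notes above require.

partition="test"      # from the notes: part=test
cpus_per_gpu=1        # assumed value, not given in the notes
mem_per_gpu="7168M"   # assumed value, not given in the notes
job_script="run.sh"   # hypothetical job script name

# -n 1 and -B 1:1:1 from the notes, written in their long forms.
cmd="sbatch --partition=$partition --ntasks=1 --extra-node-info=1:1:1"
# --gpus=1 is an assumption; the notes only mention cuda_visible=0.
cmd="$cmd --gpus=1 --cpus-per-gpu=$cpus_per_gpu --mem-per-gpu=$mem_per_gpu"
cmd="$cmd $job_script"

echo "$cmd"
```

Printing the command first makes it easy to confirm that both per-GPU options are present before running it on the test partition.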
 ===== Changes =====
  
cluster/208.txt · Last modified: 2022/11/02 17:28 by hmeij07