cluster:207 [DokuWiki]


Differences

This shows you the differences between two versions of the page.

cluster:207 [2021/10/13 13:06]
hmeij07
cluster:207 [2023/10/27 14:44]
hmeij07
Line 2: Line 2:
 **[[cluster:0|Back]]**
  
 +**Make sure munge/unmunge work between 1.3/2.4 and that the date is in sync (else you get error #16)**
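
The munge and date check in the note above could be scripted roughly as follows. This is a minimal sketch, not the procedure used on these hosts: the helper names `clock_skew` and `clocks_in_sync` and the 60 second default tolerance are assumptions for illustration, and `<node>` is a placeholder.

```shell
# clock_skew A B: absolute difference in seconds between two epoch timestamps.
# (hypothetical helper, not part of munge or Slurm)
clock_skew() {
    d=$(( $1 - $2 ))
    [ "$d" -lt 0 ] && d=$(( 0 - d ))
    printf '%s\n' "$d"
}

# clocks_in_sync A B [MAX]: succeed if the skew is at most MAX seconds.
# The default of 60 is an assumed tolerance, not a munge setting.
clocks_in_sync() {
    [ "$(clock_skew "$1" "$2")" -le "${3:-60}" ]
}

# Usage sketch (needs munge running and ssh access; <node> is a placeholder):
#   munge -n | unmunge               # encode and decode locally
#   munge -n | ssh <node> unmunge    # encode here, decode on the other host
#   clocks_in_sync "$(date +%s)" "$(ssh <node> date +%s)" || echo "clocks differ"
```

If the remote `unmunge` fails while the local one works, comparing the two clocks first is a cheap way to rule out a date problem.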
  
 ===== Slurm Test Env =====
Line 7: Line 8:
 Getting a head start on our new login node plus two cpu+gpu compute node project. Hardware has been purchased but there is a long delivery time. Meanwhile it makes sense to set up a standalone Slurm scheduler, do some testing, and have it as a backup. Slurm will be running on ''greentail52'' with some compute nodes.
  
-This page is just intended to keep documentation sources handy.
+This page is just intended to keep documentation sources handy. Go to the **Users** page [[cluster:208|Slurm Test Env]]
  
 **SLURM documentation**
Line 34: Line 35:
 https://slurm.schedmd.com/slurm.conf.html
 section: node configuration
 +
 +The node range expression can contain one pair of square brackets with a sequence of comma-separated numbers and/or ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or "lx[15,18,32-33]")
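
To make the bracket syntax concrete, here is a minimal sketch of expanding such an expression in plain shell. The function name `expand_nodes` is hypothetical; it handles only one bracket pair with plain (not zero-padded) numbers. On a host with Slurm installed, `scontrol show hostnames` does this properly.

```shell
# expand_nodes EXPR: print one host name per line for a node range
# expression with ONE bracket pair, e.g. "lx[15,18,32-33]".
# Hypothetical helper; zero-padded ranges such as [001-010] are not handled.
# The body runs in a subshell so IFS and the variables stay local.
expand_nodes() (
    expr_in=$1
    case $expr_in in
        *"["*"]"*) ;;                                 # has brackets, expand below
        *) printf '%s\n' "$expr_in"; exit 0 ;;        # plain host name
    esac
    prefix=${expr_in%%\[*}      # text before the "["
    body=${expr_in#*\[}         # text after the "["
    suffix=${body#*\]}          # text after the "]"
    body=${body%%\]*}           # comma-separated list inside the brackets
    IFS=,
    set -- $body                # split the list on commas
    for part in "$@"; do
        case $part in
            *-*)                # a range such as 32-33
                lo=${part%-*}
                hi=${part#*-}
                i=$lo
                while [ "$i" -le "$hi" ]; do
                    printf '%s%s%s\n' "$prefix" "$i" "$suffix"
                    i=$((i + 1))
                done
                ;;
            *) printf '%s%s%s\n' "$prefix" "$part" "$suffix" ;;
        esac
    done
)
```

For example, `expand_nodes 'lx[15,18,32-33]'` prints lx15, lx18, lx32 and lx33, one per line.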
 +
 Features (hasGPU, hasRTX5000)
 are intended to be used to filter nodes eligible to run jobs via the --constraint argument.
Line 50: Line 54:
 https://slurm.schedmd.com/gres.html#GPU_Management
 setting up gres.conf
 +
 +give GPU jobs priority using the Multifactor Priority plugin:
 +https://slurm.schedmd.com/priority_multifactor.html#tres
 +PriorityWeightTRES=GRES/gpu=1000
 +example here: https://slurm.schedmd.com/SLUG19/Priority_and_Fair_Trees.pdf
 +requires fairshare, thus the database
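
Put together, the GPU priority notes above might translate into a slurm.conf fragment along these lines. This is a sketch under assumptions, not the configuration used on this cluster: only the `GRES/gpu=1000` weight comes from the notes, and the accounting lines assume a slurmdbd database is already set up.

```
# slurm.conf fragment (sketch; only the GRES/gpu weight is from the notes above)
PriorityType=priority/multifactor
PriorityWeightTRES=GRES/gpu=1000
# the TRES has to be tracked by accounting for its weight to apply
AccountingStorageTRES=gres/gpu
# assumption: accounting goes through slurmdbd ("thus the database")
AccountingStorageType=accounting_storage/slurmdbd
```

After a change like this, `scontrol show config | grep -i priority` shows what the running controller actually picked up.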
  
 https://slurm.schedmd.com/mc_support.html
Line 172: Line 182:
  
 </code>
 +
 +** SLURM installation Updated **
 +
 +<code>
 +
 +export PATH=/usr/local/cuda/bin:$PATH
 +export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
 +
 +[root@cottontail2 slurm-22.05.2]# which gcc mpicc nvcc
 +/opt/ohpc/pub/compiler/gcc/9.4.0/bin/gcc
 +/opt/ohpc/pub/mpi/openmpi4-gnu9/4.1.1/bin/mpicc
 +/usr/local/cuda/bin/nvcc
 +
 +
 +./configure \
 +--prefix=/usr/local/slurm-22.05.2 \
 +--sysconfdir=/usr/local/slurm-22.05.2/etc \
 +--with-nvml=/usr/local/cuda
 +make
 +make install
 +
 +export PATH=/usr/local/slurm/bin:$PATH
 +export LD_LIBRARY_PATH=/usr/local/slurm/lib:$LD_LIBRARY_PATH
 +
 +[root@cottontail2 slurm-22.05.2]# find /usr/local/slurm-22.05.2/ -name auth_munge.so
 +/usr/local/slurm-22.05.2/lib/slurm/auth_munge.so
 +
 +</code>
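
Before running `./configure`, the `which gcc mpicc nvcc` check from the block above can be turned into a small guard. This is a minimal sketch; the function name `missing_tools` is hypothetical.

```shell
# missing_tools NAME...: print each command that is not found on PATH.
# (hypothetical helper; prints nothing when every tool is available)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage sketch: stop before configure if a compiler is missing.
#   missing=$(missing_tools gcc mpicc nvcc)
#   [ -z "$missing" ] || { echo "not on PATH: $missing"; exit 1; }
```

Empty output means all named tools resolve on the current PATH, which is the state the `which` output above shows.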
 +
  
 ** SLURM installation **
cluster/207.txt · Last modified: 2023/10/27 14:47 by hmeij07