cluster:218 [2022/06/30 17:46] hmeij07 [Rocky8 Slurm Template]
# sorta like bqueues
sinfo -l

# more node info
  * manual pages for conf files or commands, for example
    * ''man slurm.conf''
    * ''man sbatch''
    * etc ... (see above commands)
Slurm has a built-in MPI flavor. I suggest you do not rely on it. The documentation states that on major release upgrades the ''libslurm.so'' library is not backwards compatible; all software linked against this library would need to be recompiled.

There is a handy parallel job launcher called ''srun'' which may be of use. ''srun'' commands can be embedded in a job submission script, but the launcher can also be used interactively to test commands out. The submitted job will have a single job ID and launch multiple tasks.
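A quick interactive sketch of the launcher (the task count and binary name here are illustrative, not from this cluster's docs):

```shell
# ask Slurm for 4 tasks and run one copy of hostname per task;
# all four tasks run under a single job ID
srun -n 4 hostname

# embedded in a batch script, the same launcher starts the MPI ranks,
# e.g. (hypothetical binary name):
#   srun -n 8 ./my_mpi_app
```

Because ''srun'' allocates through Slurm, the interactive form is a convenient way to verify a command behaves before wrapping it in a submission script.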

<code>
[hmeij@cottontail2 ~]$ module avail

------------------------ /opt/ohpc/pub/moduledeps/gnu9-openmpi4 -------------------------
adios/1.13.1   fftw/3.3.8    netcdf-cxx/4.3.1      petsc/3.16.1      py3-scipy/1.5.1  slepc/3.16.0
boost/1.76.0   hypre/2.18.1  netcdf-fortran/4.5.3  phdf5/1.10.8      scalapack/2.1.0  superlu_dist/6.4.0
dimemas/5.4.2  imb/2019.6    netcdf/4.7.4          pnetcdf/1.12.2    scalasca/2.5     tau/2.29
example2/1.0   mfem/4.3      omb/5.8               ptscotch/6.0.6    scorep/6.0       trilinos/13.2.0
extrae/3.7.0   mumps/5.2.1   opencoarrays/2.9.2    py3-mpi4py/3.0.3  sionlib/1.7.4

------------------------------ /opt/ohpc/pub/moduledeps/gnu9 ------------------------------
R/4.1.2      impi/2021.5.1  mpich/3.4.2-ofi      openblas/0.3.7      plasma/2.8.0      superlu/5.2.1
gsl/2.7      likwid/5.0.1   mpich/3.4.2-ucx (D)  openmpi4/4.1.1 (L)  py3-numpy/1.19.5
hdf5/1.10.8  metis/5.1.0    mvapich2/2.3.6       pdtoolkit/3.25.1    scotch/6.0.6

-------------------------------- /opt/ohpc/pub/modulefiles -------------------------------
EasyBuild/4.5.0    example1/1.0    libfabric/1.13.0 (L)  prun/2.2 (L)
autotools (L)      gnu9/9.4.0 (L)  ohpc (L)              singularity/3.7.1
charliecloud/0.15  hwloc/2.5.0 (L) os                    ucx/1.11.2 (L)
cmake/3.21.3       intel/2022.0.2  papi/5.7.0            valgrind/3.18.1

--------------------------- /share/apps/CENTOS8/ohpc/modulefiles ---------------------------
amber/20  cuda/11.6  hello-mpi/1.0  hello/1.0  miniconda3/py39

#SBATCH -B 1:1:1    # S:C:T=sockets/node:cores/socket:threads/core
###SBATCH -B 2:4:1  # S:C:T=sockets/node:cores/socket:threads/core
#
# GPU control
#SBATCH --cpus-per-gpu=1
#SBATCH --mem-per-gpu=7168
###SBATCH --gres=gpu:geforce_gtx_1080_ti:1   # n78
#SBATCH --gres=gpu:quadro_rtx_5000:1         # n[100-101]
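Put together, a minimal GPU submission script using the directives above might look like the sketch below; the job name and the command run on the node are assumptions, not part of this template:

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test            # hypothetical job name
#SBATCH -N 1                           # one node
#SBATCH -B 1:1:1                       # S:C:T=sockets/node:cores/socket:threads/core
#SBATCH --cpus-per-gpu=1
#SBATCH --mem-per-gpu=7168
#SBATCH --gres=gpu:quadro_rtx_5000:1   # lands on n[100-101]

# report which GPU Slurm allocated (nvidia-smi assumed present on GPU nodes)
nvidia-smi
```

Submit with ''sbatch scriptname'' and check placement with ''squeue'' once it is running.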