<code>
[hmeij@cottontail2 ~]$ module avail

------------------- /opt/ohpc/pub/moduledeps/gnu9-openmpi4 -------------------
adios/1.13.1    netcdf-cxx/4.3.1       py3-scipy/1.5.1
boost/1.76.0    netcdf-fortran/4.5.3   scalapack/2.1.0
dimemas/5.4.2   netcdf/4.7.4           scalasca/2.5
example2/1.0    omb/5.8                scorep/6.0
extrae/3.7.0    opencoarrays/2.9.2     sionlib/1.7.4
fftw/3.3.8      petsc/3.16.1           slepc/3.16.0
hypre/2.18.1    phdf5/1.10.8           superlu_dist/6.4.0
imb/2019.6      pnetcdf/1.12.2         tau/2.29
mfem/4.3        ptscotch/6.0.6         trilinos/13.2.0
mumps/5.2.1     py3-mpi4py/3.0.3

------------------------ /opt/ohpc/pub/moduledeps/gnu9 -----------------------
R/4.1.2         mpich/3.4.2-ofi       plasma/2.8.0
gsl/2.7         mpich/3.4.2-ucx (D)   py3-numpy/1.19.5
hdf5/1.10.8     mvapich2/2.3.6        scotch/6.0.6
impi/2021.5.1   openblas/0.3.7        superlu/5.2.1
likwid/5.0.1    openmpi4/4.1.1 (L)
metis/5.1.0     pdtoolkit/3.25.1

-------------------------- /opt/ohpc/pub/modulefiles -------------------------
EasyBuild/4.5.0     hwloc/2.5.0 (L)        prun/2.2 (L)
autotools (L)       intel/2022.0.2         singularity/3.7.1
charliecloud/0.15   libfabric/1.13.0 (L)   ucx/1.11.2 (L)
cmake/3.21.3        ohpc (L)               valgrind/3.18.1
example1/1.0        os
gnu9/9.4.0 (L)      papi/5.7.0

--------------------- /share/apps/CENTOS8/ohpc/modulefiles -------------------
amber/20   cuda/11.6   hello-mpi/1.0   hello/1.0   miniconda3/py39
</code>

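Any stack in the listing can be activated with the usual Lmod commands, for example (a minimal illustration; loading any other entry works the same way):

<code bash>
# load a module from the listing above, then confirm what is active
module load amber/20
module list
</code>
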
<code bash>
#SBATCH -B 1:1:1     # S:C:T=sockets/node:cores/socket:threads/core
###SBATCH -B 2:4:1   # S:C:T=sockets/node:cores/socket:threads/core
#
# GPU control
#SBATCH --cpus-per-gpu=1
#SBATCH --mem-per-gpu=7168
###SBATCH --gres=gpu:geforce_gtx_1080_ti:1   # n78
#SBATCH --gres=gpu:quadro_rtx_5000:1         # n[100-101]
</code>
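Pieced together into a complete job file, those directives might look as follows (a sketch only; the job name, node count, and ''nvidia-smi'' check are illustrative additions, not part of the template above):

<code bash>
#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH -N 1
#SBATCH -B 1:1:1                       # S:C:T=sockets/node:cores/socket:threads/core
#SBATCH --cpus-per-gpu=1
#SBATCH --mem-per-gpu=7168
#SBATCH --gres=gpu:quadro_rtx_5000:1   # n[100-101]

# confirm which GPU Slurm handed the job
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
nvidia-smi -L
</code>
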
==== CentOS7 Slurm Template ====

In this job template I have it set up to run ''pmemd.MPI'', but with proper parameter settings it could also invoke ''pmemd.cuda''. On queues ''mwgpu'' and ''exx96'', amber[16,20] are local-disk CentOS7 software installations. Amber16 will not run on Rocky8 (I tried it but forgot the error message... we can expect problems like this, hence testing!).

Note also that we're running mwgpu's K20 cuda version 9.2 on the exx96 queue (default cuda version 10.2). Not proper, but it works, so this script will run on both queues. Oh, now I remember: amber16 was compiled against cuda 9.2 drivers, which are supported in cuda 10.x but not in cuda 11.x. So Amber16, if needed, would have to be recompiled in the Rocky8 environment (and may then work like the amber20 module).

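The ''pmemd.MPI''/''pmemd.cuda'' toggle boils down to a few lines like the following (a hedged sketch; the install path and input file names are illustrative, not the actual contents of ''run.centos''):

<code bash>
# /usr/local/amber20 is an assumed local-disk install path on the CentOS7 nodes
source /usr/local/amber20/amber.sh

# CPU engine; comment this out and uncomment the cuda line for a GPU run
mpirun -np 8 pmemd.MPI -O -i mdin -o mdout -p prmtop -c inpcrd
#pmemd.cuda -O -i mdin -o mdout -p prmtop -c inpcrd
</code>
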
  * ''/zfshomes/hmeij/slurm/run.centos''
<code bash>
#
# GPU control
###SBATCH --cpus-per-gpu=1
###SBATCH --mem-per-gpu=7168
###SBATCH --gres=gpu:tesla_k20m:1             # n[33-37]
###SBATCH --gres=gpu:geforce_rtx_2080_s:1     # n[79-90]
#
# Node control
</code>
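Submitting and monitoring the job is plain Slurm (commands shown for illustration):

<code bash>
sbatch /zfshomes/hmeij/slurm/run.centos   # submit the template
squeue -u hmeij                           # watch it in the queue
</code>
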
July 2022 is for **testing...** lots to learn!

Kudos to Abhilash and Colin for working our way through all this.

\\
**[[cluster:0|Back]]**