Introducing our new login node cottontail2. It is a server designed to run the Slurm scheduler and sports the OpenHPC v2.4 software stack. We are deploying the Slurm/Warewulf recipe for Rocky 8.5 with Architecture = x86_64.

The original design was described on the new primary login node page, but all of that was pre-pandemic. The major deviation is that we could not obtain 10G Ethernet switches, so we are going with 1G for now.

cottontail2 runs the Rocky 8.5 operating system and has two fast Intel Xeon 5222 “Cascade Lake-SP” 3.8 GHz 4-core 14nm CPUs. In addition it has 96GB DDR4 2933 MHz ECC/Registered Memory.

On cottontail2 you can submit Slurm jobs to the test queue. From this server you can SSH to cottontail, much like from greentail52. You may also continue to log in to those servers; both will be around for a while.

The hope is that most of our compute nodes will be converted to Rocky 8.5 and added to the Slurm queues. Probably not hp12 nodes (too old) nor mwgpu nodes (K20 gpu model not supported anymore).


The n[100-101] compute nodes each have:

  • dual Intel Xeon 4214R “Cascade Lake Refresh” 2.4 GHz 12-core 14nm CPUs
  • 192GB DDR4 2933 MHz ECC/Registered Memory
  • A single 2T hard disk providing for 1.4T /localscratch
  • 4 NVIDIA “Turing” Quadro RTX 5000 PCI-E+NVLink 16GB GPU Accelerator / Graphics Card
    • 16GB GDDR6 ECC Memory
    • FP Performance (with GPU Boost): 22.3 TFLOPS (half), 11.2 TFLOPS (single), 0.35 TFLOPS (double)
    • Provides up to 89.2 Deep Learning TFLOPS
    • Quadro RTX NVLink: 25GB/sec (bi-directional), 50GB/sec total bandwidth (requires bridge kit)

The nodes are defined at the bottom of these files:

  • cottontail2:/etc/slurm/slurm.conf
  • cottontail2:/etc/slurm/gres.conf
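To look at only those definitions without paging through the whole file, something like the following could be used. It is a sketch: show_node_defs is a hypothetical helper name, and it assumes the definitions use the standard NodeName=, PartitionName= and Name= keywords at the start of a line.

```shell
# Hypothetical helper: print node, partition and gres definitions from a
# Slurm configuration directory (default /etc/slurm, as on cottontail2).
show_node_defs() {
    confdir=${1:-/etc/slurm}
    for f in "$confdir/slurm.conf" "$confdir/gres.conf"; do
        if [ -r "$f" ]; then
            echo "== $f"
            grep -E '^(NodeName|PartitionName|Name)=' "$f" || echo "(no matching lines)"
        else
            echo "== $f (not readable on this host)"
        fi
    done
    return 0
}

show_node_defs
```

On a host without Slurm installed the function only reports that the files are not readable.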


[hmeij@cottontail2 ~]$ sinfo

test*        up 1-00:00:00      2   idle n[100-101]

[hmeij@cottontail2 ~]$ sinfo -lN

Thu Mar 24 14:18:45 2022
n100           1     test*        idle 48     2:12:2 192071        0    100 hasLocal none                
n101           1     test*        idle 48     2:12:2 192071        0    100 hasLocal none  

[hmeij@cottontail2 ~]$ scontrol show node n100

NodeName=n100 Arch=x86_64 CoresPerSocket=12 
   CPUAlloc=0 CPUTot=48 CPULoad=0.00
   NodeAddr=n100 NodeHostName=n100 Version=20.11.8
   OS=Linux 4.18.0-348.12.2.el8_5.x86_64 #1 SMP Wed Jan 19 17:53:40 UTC 2022 
   RealMemory=192071 AllocMem=0 FreeMem=190797 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=100 Owner=N/A MCS_label=N/A
   BootTime=2022-03-23T15:59:28 SlurmdStartTime=2022-03-23T15:59:53
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Note: in the sinfo -lN output above, S:C:T stands for Sockets:Cores:Threads; you can reserve resources based on these values. Features= allows you to filter the nodes that can run your job by requesting a certain feature, for example to run only on nodes with 192 GB of memory (hasMem192gb). Gres= (Generic RESource) defines the resources available, for example quadro_rtx_5000 GPUs. You can request the detailed resource or just rtx_5000 or just quadro.
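The S:C:T arithmetic can be sketched in a few lines of shell. The function name sct_tasks is hypothetical (it is not a Slurm tool); it only multiplies the three fields, which for n100 gives the same number that scontrol reports as CPUTot.

```shell
# Hypothetical helper (not part of Slurm): multiply the three fields of an
# S:C:T string (sockets:cores:threads) to get the number of logical CPUs.
sct_tasks() {
    sockets=${1%%:*}      # text before the first colon
    rest=${1#*:}          # text after the first colon
    cores=${rest%%:*}
    threads=${rest#*:}
    echo $(( sockets * cores * threads ))
}

sct_tasks 2:12:2   # n100 and n101 as listed by sinfo -lN, prints 48
sct_tasks 2:4:1    # prints 8
```

The result is the upper bound for the number of tasks (-n) that fit on one node with that layout.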

You can find more information on the Slurm Test Env page; it is worth reading.

Slurm Jobs

Read the Slurm Test Env page; it will be helpful. This section is a brief introduction to running a job on the RTX 5000 GPUs of the n[100-101] compute nodes.

We will submit a job performing a GPU burn operation and CUDA memory tests. These typically run overnight, so we'll terminate after 15 minutes. Here is the submit script.

# [found at XStream]
# Slurm will IGNORE all lines after the FIRST BLANK LINE,
# even the ones containing #SBATCH.
# Always put your SBATCH parameters at the top of your batch script.
# Took me days to find ... really silly behavior -Henk
#SBATCH --job-name="test"
#SBATCH --output=out   # or both in default file
#SBATCH --error=err    # slurm-$SLURM_JOBID.out
#SBATCH --mail-type=END
#SBATCH --time=00:15:00
# NODE control
#SBATCH -N 1     # default, nodes 
# CPU control
#SBATCH -n 48     # tasks=S*C*T
#SBATCH -B 2:12:2 # S:C:T=sockets/node:cores/socket:threads/core
# GPU control
#SBATCH --gres=gpu:quadro_rtx_5000:4  # n[100-101]
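Since misplaced #SBATCH lines are silently ignored, a small pre-submit check can save time. The sketch below is deliberately conservative: it flags any #SBATCH line that appears after the first blank line (the behavior described in the script comments) or after the first command (where the sbatch man page says directive processing stops). check_sbatch is a hypothetical helper, not a Slurm command.

```shell
# Hypothetical helper: report #SBATCH lines that come after the first blank
# line or the first command in a batch script. Returns 1 if any are found.
check_sbatch() {
    awk '
        /^[ \t]*$/         { body = 1; next }    # blank line seen
        !/^#/              { body = 1; next }    # first command seen
        body && /^#SBATCH/ { printf "%s:%d: %s\n", FILENAME, FNR, $0; bad = 1 }
        END                { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demonstration on a throwaway script with one misplaced directive.
demo=$(mktemp)
printf '%s\n' '#!/bin/bash' '#SBATCH --job-name="test"' '' \
              '#SBATCH --time=00:15:00' 'hostname' > "$demo"
check_sbatch "$demo" || echo "move the lines listed above to the top of the script"
rm -f "$demo"
```

Lines starting with ###SBATCH (disabled directives) are not reported, because they are ordinary comments.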


And the submit process.

[hmeij@cottontail2 microway]$ sbatch run.slurm 
Submitted batch job 1000004

[hmeij@cottontail2 microway]$ squeue
           1000004      test     test    hmeij  R       0:08      1 n100
[hmeij@cottontail2 ~]$ ssh n100 gpu-info
0, Quadro RTX 5000, 47, 13656 MiB, 2469 MiB, 100 %, 43 %
1, Quadro RTX 5000, 49, 13656 MiB, 2469 MiB, 100 %, 37 %
2, Quadro RTX 5000, 49, 13656 MiB, 2469 MiB, 100 %, 29 %
3, Quadro RTX 5000, 48, 13656 MiB, 2469 MiB, 100 %, 76 %

[hmeij@cottontail2 ~]$ ssh n100 gpu-process
gpu_name, gpu_id, pid, process_name
Quadro RTX 5000, 0, 18714, ./gpu_burn
Quadro RTX 5000, 1, 18743, ./gpu_burn
Quadro RTX 5000, 2, 18744, ./gpu_burn
Quadro RTX 5000, 3, 18745, ./gpu_burn
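The gpu-info command used above is a local wrapper. As an assumption only, output in that shape could be produced with nvidia-smi query options as sketched below; the function name gpu_info_sketch is hypothetical and this is not necessarily how the wrapper on the nodes is written.

```shell
# Assumption: one possible way to produce gpu-info style output.
# Columns: index, name, temperature, memory used, memory free,
# GPU utilization, memory utilization.
gpu_info_sketch() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi \
          --query-gpu=index,name,temperature.gpu,memory.used,memory.free,utilization.gpu,utilization.memory \
          --format=csv,noheader || echo "nvidia-smi failed (no GPU visible?)"
    else
        echo "nvidia-smi not found on this host"
    fi
}

gpu_info_sketch
```

Run it on a compute node, for example via ssh n100, since the login node has no GPUs listed in this page.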

And the results in standard output and error files


4 GPUs detected: 

GPU 0,CUDA device 0: Quadro RTX 5000 was detected as a quadro card. GPU burn will be run in single precision.
         16384 MiB detected, will run gpu_burn with a matrix size of 16384.

Running gpu_burn  50000     for all GPUs

Quadro RTX 5000
Quadro RTX 5000
Quadro RTX 5000
Quadro RTX 5000

3.3%  proc: 0/0/11/0 err: 0/0/0/0 tmp: 51C/53C/53C/53C
6.3%  proc: 0/11/11/0 err: 0/0/0/0 tmp: 51C/53C/53C/53C

slurmstepd: error: *** JOB 1000004 ON n100 CANCELLED AT 2022-03-28T13:20:12 DUE TO TIME LIMIT ***


OpenHPC provides recent versions of the GNU autotools collection, the Valgrind memory debugger, EasyBuild, and Spack (which we skipped). For more information on EasyBuild read the EasyBuild page.

Requests for software and toolchains installations can be made, consult


OpenHPC presently packages the GNU compiler toolchain integrated with the underlying Lmod modules system in a hierarchical fashion. The modules system will conditionally present compiler-dependent software based on the toolchain currently loaded.

  • gnu9-compilers-ohpc

MPI stacks

For MPI development and runtime support, OpenHPC provides pre-packaged builds for a variety of MPI families and transport layers. OpenHPC 2.x introduces the use of two related transport layers for the MPICH and OpenMPI builds that support a variety of underlying fabrics: UCX (Unified Communication X) and OFI (OpenFabrics Interfaces). Both versions support Ethernet, InfiniBand and Omni-Path. We do not use the latter two fabrics (we do have some InfiniBand but do not custom compile for it).

  • openmpi4-gnu9-ohpc # ofi & ucx
  • mpich-ofi-gnu9-ohpc # ofi only
  • mpich-ucx-gnu9-ohpc # ucx only

Default Environment

OpenHPC provides a default development environment for compiling parallel programs that require MPI. This setup is enabled via modules: the OpenHPC modules environment is pre-configured to load an ohpc module on login (if present, and it is). Our default environment enables autotools, the GNU compiler toolchain, and the OpenMPI stack.

  • lmod-defaults-gnu9-openmpi4-ohpc
  • meta-site file /opt/ohpc/pub/modulefiles/ohpc
  • loaded in via /etc/profile.d/lmod.[sh|csh]

3rd party Libs

OpenHPC provides pre-packaged builds for a number of popular open-source tools and libraries, for example FFTW and HDF5 (including serial and parallel I/O support), and the GNU Scientific Library (GSL).

# libraries/tools meta-packages built with GNU toolchain

  • ohpc-gnu9-serial-libs
  • ohpc-gnu9-io-libs
  • ohpc-gnu9-python-libs
  • ohpc-gnu9-runtimes

# parallel lib meta-packages for all available MPI toolchains

  • ohpc-gnu9-mpich-parallel-libs
  • ohpc-gnu9-openmpi4-parallel-libs


OpenHPC also provides compatible builds for use with the compilers and MPI stack included in newer versions of the Intel® OneAPI HPC Toolkit (using the classic compiler variants).

  • intel-oneapi-toolkit-release-ohpc
  • intel-compilers-devel-ohpc
  • intel-mpi-devel-ohpc

# libs and tools

  • ohpc-intel-serial-libs
  • ohpc-intel-geopm
  • ohpc-intel-io-libs
  • ohpc-intel-perf-tools
  • ohpc-intel-python3-libs
  • ohpc-intel-mpich-parallel-libs
  • ohpc-intel-openmpi4-parallel-libs
  • ohpc-intel-impi-parallel-libs

Module Environment

So what does all this look like? On the compute nodes /opt/intel and /opt/ohpc/pub are mounted from cottontail2. The user environment is managed with the Lmod package (the module command). This eliminates the need to control your environment with PATH and LD_LIBRARY_PATH exports. (But you will still have to do so when using software compiled in the cottontail/greentail52 environments, CentOS 6 and 7.)

For the new environment I will probably compile software using
the OpenHPC software stack and stage the modules in
duplicating the /opt/ohpc/pub setup. We'll have to experiment a bit.

So after login, the default environment shows.
Please note that in this OpenHPC default environment
has been removed from $PATH. Probably about time.

# default environment
[hmeij@cottontail2 ~]$ module list

Currently Loaded Modules:
  1) autotools   3) gnu9/9.4.0    5) ucx/1.11.2         7) openmpi4/4.1.1
  2) prun/2.2    4) hwloc/2.5.0   6) libfabric/1.13.0   8) ohpc


[hmeij@cottontail2 ~]$ which gcc mpicc

# set at login
[hmeij@cottontail2 ~]$ env | grep -i modulepath

# more is available
# a serial gnu9 and a parallel gnu9-openmpi4 toolchains and a tool/compiler chain.

[hmeij@cottontail2 ~]$ module avail

-------------------- /opt/ohpc/pub/moduledeps/gnu9-openmpi4 --------------------
   adios/1.13.1     netcdf-cxx/4.3.1        py3-scipy/1.5.1
   boost/1.76.0     netcdf-fortran/4.5.3    scalapack/2.1.0
   dimemas/5.4.2    netcdf/4.7.4            scalasca/2.5
   example2/1.0     omb/5.8                 scorep/6.0
   extrae/3.7.0     opencoarrays/2.9.2      sionlib/1.7.4
   fftw/3.3.8       petsc/3.16.1            slepc/3.16.0
   hypre/2.18.1     phdf5/1.10.8            superlu_dist/6.4.0
   imb/2019.6       pnetcdf/1.12.2          tau/2.29
   mfem/4.3         ptscotch/6.0.6          trilinos/13.2.0
   mumps/5.2.1      py3-mpi4py/3.0.3

------------------------ /opt/ohpc/pub/moduledeps/gnu9 -------------------------
   R/4.1.2          mpich/3.4.2-ofi         plasma/2.8.0
   gsl/2.7          mpich/3.4.2-ucx  (D)    py3-numpy/1.19.5
   hdf5/1.10.8      mvapich2/2.3.6          scotch/6.0.6
   impi/2021.5.1    openblas/0.3.7          superlu/5.2.1
   likwid/5.0.1     openmpi4/4.1.1   (L)
   metis/5.1.0      pdtoolkit/3.25.1

-------------------------- /opt/ohpc/pub/modulefiles ---------------------------
   EasyBuild/4.5.0          hwloc/2.5.0      (L)    prun/2.2          (L)
   autotools         (L)    intel/2022.0.2          singularity/3.7.1
   charliecloud/0.15        libfabric/1.13.0 (L)    ucx/1.11.2        (L)
   cmake/3.21.3             ohpc             (L)    valgrind/3.18.1
   example1/1.0      (L)    os
   gnu9/9.4.0        (L)    papi/5.7.0

   D:  Default Module
   L:  Module is loaded


Switching to Intel's OneAPI toolchain requires swapping the gnu9 compiler module for the intel module.

[hmeij@cottontail2 ~]$ module load intel/2022.0.2
Loading compiler version 2022.0.2
Loading tbb version 2021.5.1
Loading compiler-rt version 2022.0.2
Loading oclfpga version 2022.0.2
  Load "debugger" to debug DPC++ applications with the gdb-oneapi debugger.
  Load "dpl" for additional DPC++ APIs:
Loading mkl version 2022.0.2

Lmod has detected the following error: You can only have one compiler module loaded at a time.
You already have gnu9 loaded.
To correct the situation, please execute the following command:

  $ module swap gnu9 intel/2022.0.2

# after the swap we observe 

[hmeij@cottontail2 ~]$ which icc icx mpicc

# debugger module
[hmeij@cottontail2 ~]$ module load debugger
Loading debugger version 2021.5.0
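The swap suggested by Lmod, and the way back to the default GNU toolchain, might look like this in a script. It is a sketch that is guarded so it does nothing on hosts without the module command; toolchain_demo is only a marker variable for illustration.

```shell
# Sketch: switch to the Intel toolchain and back again.
# Module names are the ones shown by "module avail" on cottontail2.
if command -v module >/dev/null 2>&1; then
    module swap gnu9 intel/2022.0.2   # replace the GNU compiler module
    module list                       # inspect what is loaded now
    module swap intel gnu9/9.4.0      # return to the GNU compiler module
    toolchain_demo=done
else
    echo "module command not available in this shell"
    toolchain_demo=skipped
fi
```

In a batch job the module command may only exist after the login profile has been read, which is why the guard is there.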


Disabled the following; it sets up a ~/.ssh/config file that conflicts with the old HPC head node.


##if [ -x "/usr/bin/cluster-env" ]; then
##   /usr/bin/cluster-env
##fi

  • get stateless PXE booting working
    • tinymem, mw128
  • then goldenimage imaging
    • amber128


Figure out how to launch Docker containers with Charliecloud (NGC catalog).

NGC Containers: We built libnvidia-container to make it easy to run CUDA applications inside containers

salloc, srun and sbatch (in Slurm 21.08+) have the '--container' argument … greentail52's test Slurm version is 21.08.1, while cottontail2 runs Slurm version 20.11.8, so test on greentail52 first.
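A quick way to decide on which host the container argument can be tested is to compare version strings. The sketch below assumes a sort that supports -V (GNU coreutils does); version_ge is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The two Slurm versions mentioned above.
for v in 21.08.1 20.11.8; do
    if version_ge "$v" 21.08; then
        echo "$v: has --container"
    else
        echo "$v: too old for --container"
    fi
done

# On a host with Slurm installed, check the local version the same way.
if command -v sinfo >/dev/null 2>&1; then
    installed=$(sinfo --version | awk '{print $2}')   # output looks like "slurm 20.11.8"
    version_ge "$installed" 21.08 && echo "local Slurm $installed has --container"
fi
```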


Figure out an upgrade process before going production.

  • Do you actually want to upgrade OpenHPC?
    • v2.6 deploys ww4.x (maybe not want this, containers)
    • chroot images and rebuild images running rocky 8
    • OneAPI similar conflicts? (/opt/intel and /opt/ohpc/pub)
    • slurm complications?
  • Upgrade OpenHPC, OneAPI should be on new head node
    • test compatibility compilers
    • slurm clients
yum upgrade "*-ohpc"
yum upgrade "ohpc-base"


yum update --disablerepo=* --enablerepo=[oneAPI,OpenHPC]

Upgrade history

  • OS only, 30 Jun 2022 (90+ days up) - no ohpc, oneapi (/opt)
  • OS only, 18 Aug 2023 (440+ days up) - no ohpc, oneapi (/opt)

example modules

Independent modules can be inserted in the OpenHPC environment. But I will try to keep them separate from the beginning so as to not accidentally customize the environment. Two independent application examples are explained below.

# an application with no compiler or MPI runtime dependencies

mkdir /opt/ohpc/pub/modulefiles/example1
cp /opt/ohpc/pub/examples/example.modulefile \

# an application dependent on OpenMPI and the GNU toolchain

mkdir /opt/ohpc/pub/moduledeps/gnu9-openmpi4/example2
cp /opt/ohpc/pub/examples/example-mpi-dependent.modulefile \

# why would you put these in pub/libs ???

[hmeij@cottontail2 ~]$ module show example1/1.0
whatis("Name: example ")
whatis("Version: 1.0 ")
whatis("Category: runtime library ")
whatis("Description: example independant module ")
whatis("URL ")
This module loads the example program

Version 1.0
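To keep site modules separate from the OpenHPC tree, a private modulefiles directory can be added with module use. The sketch below writes a minimal Tcl modulefile into a temporary directory; the application name myapp, its version and the /path/to prefix are placeholders, not software installed on the cluster.

```shell
# Sketch: a minimal Tcl modulefile in a private tree (all names hypothetical).
moddir=$(mktemp -d)                  # stand-in for a site modulefiles directory
mkdir -p "$moddir/myapp"
cat > "$moddir/myapp/1.0" <<'EOF'
#%Module1.0
# Placeholder application; adjust prefix to the real install location.
module-whatis "Name: myapp"
module-whatis "Version: 1.0"
set prefix /path/to/myapp/1.0
prepend-path PATH            $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib
EOF

# Make the private tree visible and inspect the module, if Lmod is present.
if command -v module >/dev/null 2>&1; then
    module use "$moddir"
    module show myapp/1.0
fi
```

Because the directory is added with module use, nothing under /opt/ohpc/pub is touched.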



Sample job to run Amber20 on n[100-101]

The Amber cmake download fails with a READLINE error … the package readline-devel needs to be installed to get past that, which pulls in ncurses-c++-libs-6.1-9.20180224.el8.x86_64, ncurses-devel-6.1-9.20180224.el8.x86_64 and readline-devel-7.0-10.el8.x86_64.

Example script run.rocky for a CPU or GPU run (queues amber128 [n78] and test [n100-n101] for GPUs, and mw128 and tinymem for CPUs)

# [found at XStream]
# Slurm will IGNORE all lines after the FIRST BLANK LINE,
# even the ones containing #SBATCH.
# Always put your SBATCH parameters at the top of your batch script.
# Took me days to find ... really silly behavior -Henk
#SBATCH --job-name="test"
#SBATCH --output=out   # or both in default file
#SBATCH --error=err    # slurm-$SLURM_JOBID.out
#SBATCH --mail-type=END
# NODE control
#SBATCH -N 1     # default, nodes
# CPU control
#SBATCH -n 8     # tasks=S*C*T
###SBATCH -B 1:1:1 # S:C:T=sockets/node:cores/socket:threads/core
#SBATCH -B 2:4:1 # S:C:T=sockets/node:cores/socket:threads/core
###SBATCH --cpus-per-gpu=1
###SBATCH --mem-per-gpu=7168 
# GPU control
###SBATCH --gres=gpu:geforce_gtx_1080_ti:1  # n78
###SBATCH --gres=gpu:quadro_rtx_5000:1  # n[100-101]
# Node control
#SBATCH --partition=tinymem
#SBATCH --nodelist=n57

# unique job scratch dirs

### AMBER20
#source /share/apps/CENTOS8/ohpc/software/amber/20/
# OR #
module load amber/20
# check
which nvcc gcc mpicc pmemd.cuda

# stage the data
cp -r ~/sharptail/* .


# for amber20 on n[100-101] gpus, select gpu model
#mpirun -x LD_LIBRARY_PATH -machinefile ~/slurm/localhosts-one.txt \
#-np  1 \
#pmemd.cuda \
#-O -o mdout.$SLURM_JOB_ID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd

# for amber20 on n59/n77 cpus, select partition
mpirun -x LD_LIBRARY_PATH -machinefile ~/slurm/localhosts.txt \
-np  8 \
pmemd.MPI \
-O -o mdout.$SLURM_JOB_ID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd

scp mdout.$SLURM_JOB_ID ~/tmp/
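The "unique job scratch dirs" step in the script is left to the reader. One way to do it, as a sketch with hypothetical names (make_job_scratch, JOB_SCRATCH), is to create a directory named after the Slurm job ID under the node's scratch area, which is /localscratch on n[100-101].

```shell
# Hypothetical helper: create a per-job directory under a scratch base
# (default /localscratch) and print its path.
make_job_scratch() {
    base=${1:-/localscratch}
    dir="$base/${SLURM_JOB_ID:-manual.$$}"   # falls back to the PID outside Slurm
    mkdir -p "$dir" && echo "$dir"
}

# Demonstration with a temporary base so the sketch runs anywhere.
demo_base=$(mktemp -d)
JOB_SCRATCH=$(make_job_scratch "$demo_base")
cd "$JOB_SCRATCH" && echo "working in $(pwd)"
# ... stage the data and run the job here ...
cd / && rm -rf "$demo_base"     # in a real job remove only your own job directory
```

In a batch script the call would simply be JOB_SCRATCH=$(make_job_scratch) before the data is staged.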

Example script run.centos for cpus or gpu run (queues mwgpu, exx96)

# [found at XStream]
# Slurm will IGNORE all lines after the FIRST BLANK LINE,
# even the ones containing #SBATCH.
# Always put your SBATCH parameters at the top of your batch script.
# Took me days to find ... really silly behavior -Henk
#SBATCH --job-name="test"
#SBATCH --output=out   # or both in default file
#SBATCH --error=err    # slurm-$SLURM_JOBID.out
##SBATCH --mail-type=END
# NODE control
#SBATCH -N 1     # default, nodes
# CPU control
#SBATCH -n 1     # tasks=S*C*T
#SBATCH -B 1:1:1 # S:C:T=sockets/node:cores/socket:threads/core
###SBATCH -B 2:4:1 # S:C:T=sockets/node:cores/socket:threads/core
# GPU control
###SBATCH --gres=gpu:tesla_k20m:1  # n[33-37]
#SBATCH --gres=gpu:geforce_rtx_2080_s:1  # n[79-90]
#SBATCH --cpus-per-gpu=1
#SBATCH --mem-per-gpu=7168
# Node control
#SBATCH --partition=exx96
#SBATCH --nodelist=n88

# unique job scratch dirs

# amber20/cuda 9.2/openmpi good for n33-n37 and n79-n90
export PATH=/share/apps/CENTOS7/openmpi/4.0.4/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/CENTOS7/openmpi/4.0.4/lib:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/n37-cuda-9.2
export PATH=/usr/local/n37-cuda-9.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/n37-cuda-9.2/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="/usr/local/n37-cuda-9.2/lib:${LD_LIBRARY_PATH}"
export PATH=/share/apps/CENTOS7/python/3.8.3/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/CENTOS7/python/3.8.3/lib:$LD_LIBRARY_PATH
which nvcc mpirun python

source /usr/local/amber20/
# stage the data
cp -r ~/sharptail/* .

###export CUDA_VISIBLE_DEVICES=`shuf -i 0-3 -n 1`

# for amber20 on n[33-37] gpus, select gpu model
mpirun -x LD_LIBRARY_PATH -machinefile ~/slurm/localhosts-one.txt \
-np  1 \
pmemd.cuda \
-O -o mdout.$SLURM_JOB_ID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd

# for amber20 on n59/n100 cpus, select partition
#mpirun -x LD_LIBRARY_PATH -machinefile ~/slurm/localhosts.txt \
#-np  8 \
#pmemd.MPI \
#-O -o mdout.$SLURM_JOB_ID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd

scp mdout.$SLURM_JOB_ID ~/tmp/

The script was converted to a module like so

# or do this and add content of foo_1.0 to this module
#$LMOD_DIR/sh_to_modulefile  --to TCL --from=bash \
#--output /tmp/foo_1.0 \

# need Lmod 8.6+, ohpc has 8.5.1
#switch -- [module-info shelltype] {
#    sh {
#        source-sh bash $scriptpath/
#    }
#    csh {
#        source-sh tcsh $scriptpath/amber.csh
#    }
#}

# which generated these lines with the Tcl header, then add these to the modulefile for amber/20

setenv AMBERHOME {/share/apps/CENTOS8/ohpc/software/amber/20}
setenv LD_LIBRARY_PATH {/share/apps/CENTOS8/ohpc/software/amber/20/lib}
prepend-path PATH {/share/apps/CENTOS8/ohpc/software/amber/20/bin}
setenv PERL5LIB {/share/apps/CENTOS8/ohpc/software/amber/20/lib/perl}
setenv PYTHONPATH {/share/apps/CENTOS8/ohpc/software/amber/20/lib/python3.9/site-packages}
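The idea behind such a conversion is to compare the environment before and after sourcing the script. The sketch below is a rough illustration of that idea only, not the implementation of sh_to_modulefile; env_diff is a hypothetical helper and it ignores multi-line variable values.

```shell
# Hypothetical helper: print variables that are new or changed after
# sourcing the script given as $1. The script runs in a subshell, so the
# calling environment is left untouched.
env_diff() {
    before=$(mktemp)
    after=$(mktemp)
    env | LC_ALL=C sort > "$before"
    ( . "$1" >/dev/null 2>&1; env | LC_ALL=C sort ) > "$after"
    LC_ALL=C comm -13 "$before" "$after"    # lines present only after sourcing
    rm -f "$before" "$after"
}

# Demonstration with a throwaway script (paths are placeholders).
demo_script=$(mktemp)
printf '%s\n' 'export DEMO_HOME=/path/to/demo' \
              'export PATH=/path/to/demo/bin:$PATH' > "$demo_script"
env_diff "$demo_script"
rm -f "$demo_script"
```

Each reported line then maps to a setenv or prepend-path statement in the modulefile.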


Amber22 is somehow incompatible with the CentOS/Rocky openmpi (yum install). Hence the latest version of openmpi was compiled and installed into $AMBERHOME. No need to set PATHs, just be sure to source the Amber environment script in your script. (Compile instructions below are for me…)
“download a recent version of OpenMPI at, untar the distribution in amber22_src/AmberTools/src, and execute in that directory the configure_openmpi script. (Do this after you have done a serial install, and have sourced the script in the installation folder to create an AMBERHOME)”

[hmeij@n79 src]$ echo $AMBERHOME

[hmeij@n79 src]$ which mpirun mpicc

First establish a successful run with the run.rocky script for Amber20 (listed above). Then change the module in your script. (for queues amber128 [n78] and test [n100-n101] for gpus and mw128 and tinymem for cpus)

module load amber/22

# if the module does not show up in the output of your console

module avail

# treat your module cache as out of date

module --ignore_cache avail

First establish a successful run with the run.centos script for Amber20 (listed above, for cpus or gpus on queues mwgpu and exx96).

Then edit the script and apply these edits. We had to use a specific compatible gcc/g++ version to make this work. Hardware is getting too old.

# comment out the 2 export lines pointing to openmpi
##export PATH=/share/apps/CENTOS7/openmpi/4.0.4/bin:$PATH
##export LD_LIBRARY_PATH=/share/apps/CENTOS7/openmpi/4.0.4/lib:$LD_LIBRARY_PATH

# additional gcc 6.5.0
export PATH=/share/apps/CENTOS7/gcc/6.5.0/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/CENTOS7/gcc/6.5.0/lib64:$LD_LIBRARY_PATH

# edit or add correct source line, which and ldd lines just for debugging
###source /usr/local/amber16/ # works on mwgpu
###source /usr/local/amber20/ # works on exx96
source /share/apps/CENTOS7/amber/amber22/ # works on mwgpu and exx96
which nvcc mpirun python
ldd `which pmemd.cuda_SPFP`
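The ldd line is only useful if someone reads its output. A small sketch that turns it into a pass/fail check follows; missing_libs is a hypothetical helper name and the message texts are illustrative.

```shell
# Hypothetical helper: list shared libraries that ldd cannot resolve.
# Succeeds (exit 0) only when at least one library is missing.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

bin=$(command -v pmemd.cuda_SPFP || true)
if [ -z "$bin" ]; then
    echo "pmemd.cuda_SPFP not in PATH; check the source line above"
elif missing_libs "$bin"; then
    echo "unresolved libraries listed above; check LD_LIBRARY_PATH"
else
    echo "all shared libraries resolved for $bin"
fi
```

Placed after the source line in the job script, this makes a broken environment visible in the err file before the run starts.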


cluster/214.txt · Last modified: 2023/08/18 12:19 by hmeij07