Slurm entangles

So, I vaguely remember having trouble with this ASUS hardware and Warewulf 3.6 when redoing our K20 GPU nodes. Now I have deployed a production cluster using OpenHPC 2.4, Rocky 8.5 and Warewulf 3.9. Same deal. I do not know what is going on, but I am documenting it here.

That's too bad, as I was hoping to have a single-operating-system cluster. Now I will have to think about what to do with our CentOS 7 hardware, which is running the old scheduler. The hope was to migrate everything to the Slurm scheduler.

ASUS

First we reset the BIOS and make sure PXE boot is enabled, in legacy boot mode.

  1. Save & Exit > Restore Defaults > Yes > Save & Reset, then on the next boot:
  2. Advanced > CSM Configuration > Enable > Boot Option filter = Legacy
  3. Advanced > CSM Configuration > Network = Legacy
  4. Advanced > Network Stack = Enabled
  5. Within that tab, enable IPv4 PXE and IPv6 PXE support
  6. Boot > Boot order: network first, then hard drive
  7. Save & Exit > Yes & Reset
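When walking the BIOS menus gets tedious, the same one-shot PXE override can usually be set remotely through the node's BMC with ipmitool. A hedged sketch (the BMC hostname and credentials are hypothetical, and the leading echo makes this a dry run):

```shell
# force the next boot to PXE via IPMI, then power-cycle the node
# (drop the leading "echo" to actually issue the commands)
pxe_next_boot() {
  local bmc=$1 user=$2 pass=$3
  echo ipmitool -I lanplus -H "$bmc" -U "$user" -P "$pass" chassis bootdev pxe
  echo ipmitool -I lanplus -H "$bmc" -U "$user" -P "$pass" chassis power cycle
}
pxe_next_boot n90-bmc ADMIN secret
```

Note this only sets a one-time boot device; the permanent boot order still comes from the BIOS steps above.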

Next we create the Warewulf node object and boot the node (see the deploy script at the bottom).

When this ASUS hardware boots, it sends over the correct MAC address. We observe:

# in /var/log/messages

Jun 10 09:13:41 cottontail2 dhcpd[380262]: DHCPDISCOVER from 04:d9:f5:bc:6e:c2 via eth0
Jun 10 09:13:41 cottontail2 dhcpd[380262]: DHCPOFFER on 192.168.102.100 to 04:d9:f5:bc:6e:c2 via eth0

# also in /var/log/messages (in.tftpd logs via syslog)

Jun  10 09:13:57 cottontail2 in.tftpd[388239]: Client ::ffff:192.168.102.100 \
finished /warewulf/ipxe/bin-i386-pcbios/undionly.kpxe

That's it. Everything goes quiet. On the node's console during PXE boot I observe iPXE's net0 being configured with the correct MAC address; then it times out with the error “no more network devices available”, or some such. The node then continues to boot from the hard disk and CentOS 6 shows up.

And when testing connectivity between the node and the SMS all is well … but the GET never happens: the iPXE config file is there, the correct NIC is responding. Weird. The ASUS splash screen says “In search of the incredible”. Indeed.

[root@n90 tmp]# telnet cottontail2 80
Trying 192.168.102.250...
Connected to cottontail2.
Escape character is '^]'.
GET /WW/file?hwaddr=04:d9:f5:bc:6e:c2&timestamp=0

# all files are retrieved

Slurm #1

First idea: install the OHPC v1.3 CentOS 7 slurmd client on the node, then join it to the OHPC v2.4 slurmctld. To do that, first yum install the ohpc-release from this location

Next do a 'yum install generic-package-name' of these packages to install the slurmd client of OHPC 1.3 for CentOS 7.

-rw-r--r-- 1 root root    35264 Feb 23  2017 munge-devel-ohpc-0.5.12-21.1.x86_64.rpm
-rw-r--r-- 1 root root    51432 Jun 16 14:40 munge-libs-ohpc-0.5.12-21.1.x86_64.rpm
-rw-r--r-- 1 root root   114060 Jun 16 14:40 munge-ohpc-0.5.12-21.1.x86_64.rpm
-rw-r--r-- 1 root root     3468 Jun 16 14:44 ohpc-filesystem-1.3-26.1.ohpc.1.3.6.noarch.rpm
-rw-r--r-- 1 root root     2396 Jun 16 14:40 ohpc-slurm-client-1.3.8-3.1.ohpc.1.3.8.x86_64.rpm
-rw-r--r-- 1 root root  4434196 Jun 16 14:46 pmix-ohpc-2.2.2-9.1.ohpc.1.3.7.x86_64.rpm
-rw-r--r-- 1 root root    17324 Jun 16 14:40 slurm-contribs-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm
-rw-r--r-- 1 root root   198028 Jun 16 14:40 slurm-example-configs-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm
-rw-r--r-- 1 root root 13375940 Jun 16 14:40 slurm-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm
-rw-r--r-- 1 root root   148980 Jun 16 14:40 slurm-pam_slurm-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm
-rw-r--r-- 1 root root   796280 Jun 16 14:44 slurm-perlapi-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm
-rw-r--r-- 1 root root   654104 Jun 16 14:40 slurm-slurmd-ohpc-18.08.8-4.1.ohpc.1.3.8.1.x86_64.rpm

Make sure munge/unmunge work between 1.3 and 2.4 and that the dates are in sync (else you get error #16), then start slurmd with the 2.4 config files in place. This works, but the 1.3 slurmd client fails to register. This appears to be because the Slurm versions are too far apart (18.08 from 2018 vs 20.11 from 2020). Hmm, why is OpenHPC v2.4 running such an old Slurm version?
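The registration failure is consistent with Slurm's documented compatibility rule: a slurmctld can talk to slurmd daemons of the same or the two previous major releases, so a 20.11 controller accepts 20.02 and 19.05 but not 18.08. A throwaway sketch of that rule (the release list is just the majors relevant to this page plus neighbors):

```shell
# slurm_compat CTLD_MAJOR SLURMD_MAJOR -> "yes" if the slurmd is the same or
# at most two major releases behind the slurmctld, else "no"
slurm_compat() {
  local majors="18.08 19.05 20.02 20.11 21.08 22.05"
  local i=0 j=0 k=0 v
  for v in $majors; do
    k=$((k+1))
    [ "$v" = "$1" ] && i=$k
    [ "$v" = "$2" ] && j=$k
  done
  if [ $i -ge $j ] && [ $((i-j)) -le 2 ]; then echo yes; else echo no; fi
}
slurm_compat 20.11 18.08   # the failing combination from this page -> no
```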

Had to comment these out for slurmd to start (seems OK because they are slurmctld settings not used by the slurmd client, according to the Slurm mailing list):

#SelectType=select/cons_tres
#SelectTypeParameters=CR_CPU_Memory

[root@cottontail2 ~]#  munge -n -t 10 | ssh n90 unmunge
STATUS:           Success (0)
ENCODE_HOST:      cottontail2 (192.168.102.250)
ENCODE_TIME:      2022-06-17 09:35:08 -0400 (1655472908)
DECODE_TIME:      2022-06-17 09:35:07 -0400 (1655472907)
TTL:              10
CIPHER:           aes128 (4)
MAC:              sha256 (5)
ZIP:              none (0)
UID:              root (0)
GID:              root (0)
LENGTH:           0

Too bad. OK, we'll keep the munge packages and remove all other OHPC v1.3 packages.

Slurm #2

Second idea: download the Slurm source for the closest version just above what OHPC v2.4 ships. Compile Slurm 20.11.9 and see if it is accepted for registration by the OHPC v2.4 slurmctld running 20.11.8 …

export PATH=/share/apps/CENTOS7/openmpi/4.0.4/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/CENTOS7/openmpi/4.0.4/lib:$LD_LIBRARY_PATH
[root@n90 ~]# which gcc mpicc
/usr/bin/gcc
/share/apps/CENTOS7/openmpi/4.0.4/bin/mpicc

./configure \
--prefix=/usr/local/slurm-20.11.9 \
--sysconfdir=/usr/local/slurm-20.11.9/etc \
--with-nvml=/usr/local/cuda
make
make install

[root@n90 slurm-20.11.9]# find /usr/local/slurm-20.11.9 -name auth_munge.so
/usr/local/slurm-20.11.9/lib/slurm/auth_munge.so


YES! it does register, hurray.

Finish with

  • make generic /usr/local/slurm → /usr/local/slurm-20.11.9 link
  • copy over munge.key, restart munge
  • startup at boot in /etc/rc.local
  • export envs in /etc/bashrc
  • make dirs /var/log/slurm /var/spool/slurm
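The finish list above can be sketched as a script. ROOT is parameterized here so the sketch can be dry-run anywhere; on the real node ROOT would be empty and the rc.local/bashrc edits go to /etc (the slurmd path under the versioned prefix's sbin is an assumption):

```shell
# sketch of the post-install finish steps (dry-run friendly; set ROOT= on the node)
ROOT=${ROOT:-/tmp/slurm-finish-demo}
VER=20.11.9
mkdir -p "$ROOT/usr/local/slurm-$VER" "$ROOT/var/log/slurm" "$ROOT/var/spool/slurm" "$ROOT/etc"
# generic link so configs and envs can refer to /usr/local/slurm
ln -sfn "slurm-$VER" "$ROOT/usr/local/slurm"
# startup at boot (munge.key copy and munge restart not shown)
echo "/usr/local/slurm/sbin/slurmd" >> "$ROOT/etc/rc.local"
# environment for interactive shells
echo 'export PATH=/usr/local/slurm/bin:$PATH' >> "$ROOT/etc/bashrc"
```

The generic symlink keeps slurm.conf and /etc/bashrc version-agnostic: a future upgrade only has to repoint the link.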

Do NOT mount /opt/intel and /opt/ohpc/pub from the SMS; that's all Rocky 8.5 stuff.

Slurm #3

There is a warning on the Slurm web page regarding the older-versions archive page:

“Due to a security vulnerability (CVE-2022-29500), all versions of Slurm prior to 21.08.8 or 20.11.9 are no longer available for download” … so why is openhpc v2.4 running such an old slurm version?

Third idea: once we're fully deployed I may go to the latest Slurm version, run it on different ports, and maybe use a newer munge version (although that should not matter; why does this scheduler even need munge?)

Run Slurm outside of OpenHPC via a local compile in /usr/local/: a standalone build of the most recent version.

Decided to go this route: the v22.05.2 standalone version (with the OHPC v1.3 or v2.4 munge packages). You need all three packages (munge, munge-libs and munge-devel) on the host where you compile Slurm (note: cottontail2 for Rocky 8.5, node n90 for CentOS 7). Then just copy.

Henk 2022/06/23 19:07

Deploy

I use a script to make sure I do not miss any steps when imaging. It works like a charm, except with this ASUS hardware. The script can do stateless, stateful or golden-image provisioning. For golden image creation, follow this Warewulf Golden Image link.

#!/bin/bash

# FIX vnfs & bootstrap for appropriate node
# formats 1t /dev/nvme0n1 !!!

# deploy a chroot server via PXE golden image transfer
# templates are always in stateless CHROOT/rocky8.5/root/wwtemplates
# look at header deploy.txt

node=$1
hwaddr0=$2
ipaddr0=$3
hwaddr1=$4
ipaddr1=$5

if [ $# -ne 5 ]; then
        echo "usage: $0 node hwaddr0 ipaddr0 hwaddr1 ipaddr1"
        exit 1
fi

wwsh object delete $node -y
sleep 3

wwsh node new $node --netdev=eth0 \
--hwaddr=$hwaddr0 --ipaddr=$ipaddr0 \
--netmask=255.255.0.0  --network=255.255.0.0 -y

wwsh node set $node --netdev=eth1 \
--hwaddr=$hwaddr1 --ipaddr=$ipaddr1 \
--netmask=255.255.0.0  --network=255.255.0.0 -y

wwsh provision set $node --fileadd hosts,munge.key -y
wwsh provision set $node --fileadd passwd,shadow,group -y
wwsh provision set $node --fileadd network.ww,ifcfg-eth1.ww -y

# PRESHELL & POSTSHELL 1=enable, 0=disable
#wwsh provision set $node --postshell=1 -y
#wwsh provision set $node --kargs="net.ifnames=0,biosdevname=0" -y
#wwsh provision set --postnetdown=1 $node -y

# stateless, comment out for golden image
# wwsh provision set $node --bootstrap=4.18.0-348.12.2.el8_5.x86_64 -y
# wwsh provision set $node --vnfs=rocky8.5 -y

# stateful, comment out for golden image and stateless
# install grub2 in $CHROOT first, rebuild vnfs
# wwsh provision set --filesystem=efi-n90  $node -y
# wwsh provision set --bootloader=nvme0n1  $node -y

# uncomment for golden image, comment out stateless and stateful
 wwsh provision set $node --bootstrap=4.18.0-348.12.2.el8_5.x86_64 -y
 wwsh provision set $node --vnfs=n101.chroot -y
 wwsh provision set --filesystem=efi-n90  $node -y
 wwsh provision set --bootloader=nvme0n1  $node -y


wwsh provision set --bootlocal=UNDEF $node -y
echo "for stateful or golden image, after first boot issue"
echo "wwsh provision set --bootlocal=normal $node -y"

wwsh pxe update
wwsh dhcp update
systemctl restart dhcpd
systemctl restart httpd
systemctl restart tftp.socket
# crontab will shutdown these services at 5pm

  • n90.deploy.txt
# formats 1T /dev/nvme0n1 !!!
n90 04:D9:F5:BC:6E:C2 192.168.102.100 04:D9:F5:BC:6E:C3 10.10.102.100
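Since the deploy.txt header carries the exact arguments in order, it can be fed straight to the script. A sketch (deploy.sh is my placeholder name for the script above, and the echo keeps it a dry run):

```shell
# recreate the one-line deploy file, then feed its fields to the deploy script
cat > n90.deploy.txt <<'EOF'
# formats 1T /dev/nvme0n1 !!!
n90 04:D9:F5:BC:6E:C2 192.168.102.100 04:D9:F5:BC:6E:C3 10.10.102.100
EOF
grep -v '^#' n90.deploy.txt | while read -r node hw0 ip0 hw1 ip1; do
  echo ./deploy.sh "$node" "$hw0" "$ip0" "$hw1" "$ip1"   # drop echo to run
done
```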



cluster/217.txt · Last modified: 2022/06/24 09:53 by hmeij07