




Warewulf Golden Image

Build an OpenHPC provisioning server using the Warewulf/Slurm recipe for CentOS 7.5 x86_64, as described on the local page OpenHPC 1.3.1 and at http://openhpc.community/downloads/. Make sure stateless provisioning works.

We have a standalone Warewulf 3.6.99 provisioning server on CentOS 6.10 with golden images so we can fall back if necessary.

We will be adding custom software in /usr/local for the K20 GPUs; see the K20 Redo page. With the golden image approach you custom-build your environment directly on the compute node, make sure it all works, and then grab the golden image.

That's the grand plan.

Step 1

  • Hook up a DVD drive to a node and install CentOS 7.x; I will be using 7.2
    • sda1 /boot 768 MB, sda3 swap 32 GB and / sda2 75 GB (7.x reorders…)
    • select “Compute Node” software install
    • nic1 private ip, nic2 public ip (see below)
    • my node name will be n37
    • Upon reboot, stay on console, accept license
  • Generate ssh keys on node, ssh-keygen -t rsa
  • From SMS_server scp global authorized_keys, known_hosts and /etc/hosts to the node
  • From SMS_server add /root/.ssh/cluster.pub to n37:/root/.ssh/authorized_keys
  • Disable selinux, /etc/selinux/config
  • Allow root login, /etc/ssh/sshd_config
  • Install iptables, disable firewalld, disable NetworkManager
  • Configure nic1 on private network 192.168.102.x so it can reach OHPC SMS_server via “eth0”
  • Configure nic2 on public network for temporary access via “eth1”
    • so we can run yum update, yum install
    • also some software will want to download stuff (amber, gromacs)
    • I snake a temporary ethernet cable from node to a public switch
    • warewulf will make this a private ip when imaging
  • Save generic files, handy for a reset
    • cp /etc/sysconfig/network-scripts/ifcfg-[“eth0”,“eth1”] to /root
    • fdisk -l > /root/fdisk-l.txt
    • cp from /etc files passwd shadow group to /root
    • cp from /etc files fstab hosts profile bashrc to /root
  • reboot
  • Check firewall, iptables -L
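
The SELinux and sshd edits in the list above can be scripted so that a node reset is repeatable. What follows is a minimal sketch, not the procedure actually used on n37: the function names are made up for illustration, and it assumes GNU sed and the stock CentOS 7 file formats.

```shell
#!/bin/bash
# Hypothetical helpers for the Step 1 edits; each takes the file to edit.

# Set SELINUX=disabled (the SELINUXTYPE= line is left alone).
disable_selinux() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Ensure "PermitRootLogin yes": rewrite an existing line, commented or not,
# otherwise append one.
permit_root_login() {
    if grep -qE '^#?[[:space:]]*PermitRootLogin' "$1"; then
        sed -i -E 's/^#?[[:space:]]*PermitRootLogin.*/PermitRootLogin yes/' "$1"
    else
        echo 'PermitRootLogin yes' >> "$1"
    fi
}

# Usage on the node, as root:
#   disable_selinux   /etc/selinux/config
#   permit_root_login /etc/ssh/sshd_config
#   systemctl stop firewalld NetworkManager
#   systemctl disable firewalld NetworkManager
#   yum -y install iptables-services
#   systemctl enable iptables
```

Taking the file path as an argument means the helpers can be tried on copies in /root before touching the live files.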

Step 2

Leverage the OpenHPC tools in the CentOS/Slurm recipe to build the compute node n37.

yum -y install http://build.openhpc.community/OpenHPC:/1.3/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm
yum -y update ohpc-release

yum -y install ohpc-base-compute 

# skipping ohpc-slurm-client

yum -y install ntp tcl tcl-devel 

# skipping lmod-ohpc

yum -y groupinstall "Infiniband Support" 

yum -y install infinipath-psm   

systemctl enable rdma

# install scheduler rpm, copy config file from primary login node
# install missing rpms needed for mounts 
yum -y install nfs-utils nfs-utils-lib nfs4-acl-tools

# some local stuff
echo 'relayhost = 192.168.102.42' >> /etc/postfix/main.cf
ln -s /home /share
mkdir /localscratch /sanscratch
chmod ugo+rwx /localscratch /sanscratch
chmod o+t /localscratch /sanscratch

# should have a working compute node now, stay with 7.2 kernel
yum update  --exclude=kernel*
reboot

# login, poke around, take public "eth1" nic down
uname -a
ifdown enp6s0  # hook up 10.10 private network cable
systemctl disable iptables
yum clean all
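
The two chmod calls on the scratch directories above should leave them world-writable with the sticky bit set, that is mode 1777. A small check like the following can confirm that before the image is grabbed. It is only a sketch: the function name is hypothetical and it relies on GNU stat.

```shell
#!/bin/bash
# Hypothetical check: succeed only if DIR has mode 1777
# (rwx for user, group and other, plus the sticky bit).
is_scratch_mode_ok() {
    [ "$(stat -c '%a' "$1")" = "1777" ]
}

# Usage on the node:
#   for d in /localscratch /sanscratch; do
#       is_scratch_mode_ok "$d" || echo "fix permissions on $d"
#   done
```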

Step 3

Next we prepare SMS_server's warewulf to use the golden image approach. Consult the page OpenHPC 1.3.1 in the Warewulf section on what is already set up using the stateless node approach.

No hybridization path. (Not necessary, just making sure).

# /etc/warewulf/vnfs.conf

#off hybridize += /usr/X11R6
#off hybridize += /usr/lib/locale
#off hybridize += /usr/lib64/locale
#off hybridize += /usr/include
#off hybridize += /usr/share/man
#off hybridize += /usr/share/doc
#off hybridize += /usr/share/locale

In the buildchroot function of /usr/libexec/warewulf/wwmkchroot/golden-system.tmpl make sure you keep adding your NFS mounts. In my case I'll be adding…

  ...
  --exclude=/var/cache/* \
  --exclude=/sanscratch/* \
  --exclude=/localscratch/* \
  root@SOURCEADDR:/ .
  

Make sure you can login passwordless via ssh from SMS_server to your node n37. If not, add the SMS_server:/root/.ssh/cluster.pub content to n37:/root/.ssh/authorized_keys, then on SMS_server execute the command to make the CHROOT.

SOURCEADDR=n37 wwmkchroot golden-system /data/goldimages/vanilla.chroot 

[root@ohpc0-test ~]# du -hs /data/goldimages/vanilla.chroot
1.6G    /data/goldimages/vanilla.chroot
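
The passwordless-login precondition mentioned above can be tested non-interactively before wwmkchroot starts its rsync. The sketch below uses the OpenSSH options BatchMode and ConnectTimeout; the function name and the overridable SSH variable are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical pre-flight check, run on SMS_server before wwmkchroot.
# SSH can be overridden for testing; it defaults to the real ssh client.
SSH=${SSH:-ssh}

# Succeeds only if root@HOST accepts key-based login without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_login_passwordless() {
    "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

# Usage:
#   can_login_passwordless n37 && \
#       SOURCEADDR=n37 wwmkchroot golden-system /data/goldimages/vanilla.chroot
```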

Step 4

Create the files deploy.[sh|txt] (attached at bottom of page, I keep them in CHROOT/root/). Make sure this new deploy.sh script assigns the node n37 to vanilla.chroot, and change the bootstrap string if a newer kernel is present.

Ran into UEFI boot loader problems. More on that later.

# final touches in CHROOT
# edit deploy.sh (check filesystems, vnfs, bootstrap, UEFI stuff)

# version
echo "vanilla compute node with scheduler `date`" > \
/data/goldimages/vanilla.chroot/root/VERSION

# make bootstrap, takes couple of mins
wwbootstrap --chroot=/data/goldimages/vanilla.chroot 3.10.0-327.el7.x86_64

# make vnfs, patience, most of the time is spent in "adding image to datastore"
wwvnfs --chroot=/data/goldimages/vanilla.chroot

VNFS NAME            SIZE (M) CHROOT LOCATION
vanilla.chroot      533.9    /data/goldimages/vanilla.chroot

# configure node, deploy
# assumes node pxe boots first

cd /data/goldimages/vanilla.chroot/root
./deploy.sh `grep ^n37 deploy.txt`
ssh n37 reboot

# once the deploy is on its way, imaging might take 5 mins or so
# for next node boot to be from local disk, on SMS_server issue

wwsh provision set --bootlocal=EXIT n37 -y
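
Step 4 says to change the bootstrap string if a newer kernel is present. One way to see which kernel versions the CHROOT actually holds is to list its module directories and compare them with the string passed to wwbootstrap. This is a rough sketch with a hypothetical helper name, assuming the usual lib/modules layout inside the chroot.

```shell
#!/bin/bash
# Hypothetical helper: print the kernel versions found in a chroot by
# listing the directories under CHROOT/lib/modules, one per line.
# Returns non-zero if none are found.
chroot_kernels() {
    local d found=1
    for d in "$1"/lib/modules/*/; do
        if [ -d "$d" ]; then
            basename "$d"
            found=0
        fi
    done
    return "$found"
}

# Usage on SMS_server:
#   chroot_kernels /data/goldimages/vanilla.chroot
```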

Step 5

So after imaging and reboot, what do we have? Definitely an imaged node: the partitions have shuffled around, and our VERSION file came from the vnfs made from CHROOT. We also have eth0, eth1 and ib0.

[root@n37 ~]# uname -a
Linux n37 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@n37 ~]# cat /root/VERSION 
vanilla compute node with scheduler Fri Aug 17 10:24:18 EDT 2018

[root@n37 ~]# fdisk -l | grep sda
Disk /dev/sda: 120.0 GB, 120034123776 bytes, 234441648 sectors
/dev/sda1              63     1510109      755023+  83  Linux
/dev/sda2         1510110    65529134    32009512+  82  Linux swap / Solaris
/dev/sda3        65529135   234436544    84453705   83  Linux

[root@n37 etc]# ifconfig | grep 'inet '
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
        inet 192.168.102.47  netmask 255.255.0.0  broadcast 192.168.255.255
        inet 10.10.102.47  netmask 255.255.0.0  broadcast 10.10.255.255
        inet 10.11.103.47  netmask 255.255.0.0  broadcast 10.11.255.255
        inet 127.0.0.1  netmask 255.255.255.0

Before the reboot we grabbed passwd/shadow/group/fstab and pulled them to our global archive. We spliced in our user and group bases and added our NFS mounts to fstab (commenting out the UUID lines). A post-boot script will copy those in place, as well as the tarball for 'usr/local', which will contain our K20 compiled software (hence the rsync exclude, to keep the vnfs small). Then all that is left to do is:

[root@n37 ~]# mount -a

[root@n37 rpms]# cd /sanscratch/tmp/rpms
[root@n37 rpms]# yum install --tolerant *3.10.0*

...
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.0-862.11.6.el7 will be installed
---> Package kernel-devel.x86_64 0:3.10.0-862.11.6.el7 will be installed
---> Package kernel-tools.x86_64 0:3.10.0-327.el7 will be updated
---> Package kernel-tools.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools-libs.x86_64 0:3.10.0-327.el7 will be updated
---> Package kernel-tools-libs.x86_64 0:3.10.0-862.11.6.el7 will be an update
---> Package kernel-tools-libs-devel.x86_64 0:3.10.0-862.11.6.el7 will be installed
--> Finished Dependency Resolution
...

# reboot from local disk, do not make a golden image of this, stay on 7.2 till 7.5
# is fixed, only add non-kernel packages to vanilla.chroot if needed and re-image
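
The post-boot script mentioned in Step 5 is not shown on this page. As an illustration only, it could look roughly like the sketch below. The function names, the archive layout, and the tarball name usr-local.tar.gz are all assumptions, not the site's actual script.

```shell
#!/bin/bash
# Hypothetical post-boot helpers; archive layout and names are assumptions.

# Comment out UUID= mounts in an fstab file, as described in the text.
comment_uuid_mounts() {
    sed -i 's/^UUID=/#UUID=/' "$1"
}

# Copy the spliced account files and fstab from ARCHIVE into ROOT/etc, then
# unpack ARCHIVE/usr-local.tar.gz (member paths start with usr/local)
# under ROOT.
restore_node_files() {
    local archive=$1 root=$2 f
    for f in passwd shadow group fstab; do
        cp -p "$archive/$f" "$root/etc/$f" || return 1
    done
    tar -xzf "$archive/usr-local.tar.gz" -C "$root"
}

# Usage on a freshly imaged node (archive path is made up):
#   restore_node_files /path/to/archive /
```

Passing the target root as an argument allows a dry run into a scratch directory before pointing it at /.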

UEFI

As far as I understand it, from CentOS 7.3 on, OHPC/Warewulf switches to a GPT disk with a UEFI boot loader. This involves a boot manager (efibootmgr) that looks not at a master boot record (MBR) on a bootable partition but instead at a file (grubx64.efi). To boot this way, UEFI environment variables need to be supported, and you enable that in the BIOS. Read the OpenHPC thread “Stateful provisioning with warewulf does not work (ohpc 1.3.5)”. You also need to add these packages:

  • efibootmgr efivar-libs grub2-efi-x64 dosfstools

My hardware is an ASUS esc4000fdr G2 bought in 2013. In the BIOS boot CSM menus I can set boot filter options to “UEFI and Legacy” and for each type (PXE, Storage,…) I set “UEFI first”. These settings allow the CentOS 7.2 image to still Legacy PXE boot. Nice.

But the CentOS 7.5 image…not. No matter what settings I use to boot UEFI, I continue to receive “EFI variables not supported”. If I boot the 7.5 image in Legacy mode, I receive a “can find path to tmpfs error” in mkbootable. The owner of the list thread posted this solution. Look in the deploy script for how to handle that.

  • efi.sh
#!/bin/bash -x
ent=$(chroot "$NEWROOT" /usr/sbin/efibootmgr | /bin/grep "CentOS-Warewulf")
[[ -n "$ent" ]] && chroot "$NEWROOT" /usr/sbin/efibootmgr -b "${ent:4:4}" -B 
chroot "$NEWROOT" /usr/sbin/efibootmgr -c -d /dev/sda -l "\EFI\centos\grubx64.efi" -L CentOS-Warewulf
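
The `${ent:4:4}` expansion in efi.sh is a bash substring: it skips the first four characters of the matched line and keeps the next four. efibootmgr lists entries in the form `BootNNNN* label`, so this yields the entry number that `-b` expects. A tiny demonstration, with a made-up entry number:

```shell
#!/bin/bash
# Assumed shape of one efibootmgr output line; the entry number is made up.
ent='Boot0003* CentOS-Warewulf'

# ${ent:4:4} skips the 4 characters of "Boot" and keeps the next 4.
bootnum=${ent:4:4}
echo "$bootnum"   # prints 0003
```

If more than one entry carries the CentOS-Warewulf label, `$ent` holds several lines and only the first entry's number is extracted, so only that one is deleted.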

deploy.sh

#!/bin/bash
# deploy a n33.chroot type server 
# templates are in /data/templates
node=$1
hwaddr0=$2
ipaddr0=$3
hwaddr1=$4
ipaddr1=$5
ipaddri=$6

if [ $# -ne 6 ]; then
        echo "missing args: node hwaddr0 ipaddr0 hwaddr1 ipaddr1 ipaddri" >&2
        # exit non-zero so callers can tell the deploy did not run
        exit 1
fi

wwsh object delete $node -y ; sleep 3

wwsh node new $node --netdev=eth0 \
--hwaddr=$hwaddr0 --ipaddr=$ipaddr0 \
--netmask=255.255.0.0  --network=255.255.0.0 -y

wwsh node set $node --netdev=eth1 \
--hwaddr=$hwaddr1 --ipaddr=$ipaddr1 \
--netmask=255.255.0.0  --network=255.255.0.0 -y

wwsh node set $node --netdev=ib0 \
--ipaddr=$ipaddri \
--netmask=255.255.0.0  --network=255.255.0.0 -y

# database file imports must already have been performed
wwsh provision set $node --fileadd network.ww,ifcfg-eth1.ww,ifcfg-ib0.ww -y

wwsh object modify -s bootloader=sda $node -y
wwsh object modify -s diskpartition=sda $node -y
wwsh object modify -s diskformat=sda1,sda3 $node -y
wwsh object modify -s filesystems="mountpoint=/boot:dev=sda1:type=ext4:size=768,dev=sda2:type=swap:size=32768,mountpoint=/:dev=sda3:type=ext4:size=+" $node -y

# centos 7.2 cdrom
wwsh provision set $node --vnfs=vanilla.chroot -y
wwsh provision set $node --bootstrap=3.10.0-327.el7.x86_64  -y

wwsh provision set --bootlocal=UNDEF $node -y
wwsh provision set --postshell=0 $node  -y

# needed with centos7.5 efibootmgr, also setup BIOS
# wwsh file import /data/templates/efi.sh --name=efi  -y
# wwsh object modify -s format=shell efi  -y
# wwsh object modify -s postscript=efi $node  -y  

wwsh pxe update
wwsh dhcp update
# cron turns them off at 4pm
systemctl restart dhcpd
systemctl restart httpd

echo "wwsh provision set --bootlocal=EXIT $node -y"
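
deploy.sh only checks the number of arguments, so a typo in a MAC or IP address goes straight into the Warewulf datastore. The validators below are a possible addition, not part of the original script; the function names are hypothetical and the checks are purely syntactic.

```shell
#!/bin/bash
# Hypothetical validators for the deploy.sh arguments.

# Six colon-separated hex pairs, e.g. 50:46:5D:E8:1F:A8.
is_mac() {
    [[ $1 =~ ^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$ ]]
}

# Dotted quad with each octet in 0..255.
is_ipv4() {
    local IFS=. octets o
    [[ $1 =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]] || return 1
    read -r -a octets <<< "$1"
    for o in "${octets[@]}"; do
        # 10# forces base 10 so a leading zero is not read as octal
        (( 10#$o <= 255 )) || return 1
    done
}

# Possible use near the top of deploy.sh:
#   is_mac "$hwaddr0" && is_ipv4 "$ipaddr0" || { echo "bad eth0 args" >&2; exit 1; }
```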

deploy.txt


# n33.chroot type nodes, ASUS type servers with 4x K20 gpus
n37 50:46:5D:E8:1F:A8 192.168.102.47 50:46:5D:E8:1F:A9 10.10.102.47 10.11.103.47
# more servers ...
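
Step 4 feeds deploy.sh with `grep ^n37 deploy.txt`, which relies on word splitting to turn the matching line into six arguments. That pattern would also match a node named n370 if one were ever added. An exact match on the first field avoids this; the sketch below uses awk, and the function name is made up.

```shell
#!/bin/bash
# Hypothetical lookup: print the line of FILE whose first field is exactly
# NODE. Comment lines are skipped because their first field is "#".
node_line() {
    awk -v node="$1" '$1 == node' "$2"
}

# Usage, from CHROOT/root on SMS_server:
#   ./deploy.sh $(node_line n37 deploy.txt)
```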



cluster/171.1534771360.txt.gz · Last modified: 2018/08/20 09:22 by hmeij07