cluster:216

Differences

This shows you the differences between two versions of the page.

cluster:216 [2022/04/29 10:57]
hmeij07
cluster:216 [2022/06/07 16:07] (current)
hmeij07 [queues left]
Line 153: Line 153:
 yum --installroot=/opt/ohpc/admin/images/rocky8.5 install grub2
 touch /opt/ohpc/admin/images/rocky8.5/root/VNFS-TEST-WITH-GRUB2
 +
 +# build out stateful if desired
 +dnf --installroot $CHROOT install yum
 +dnf --installroot $CHROOT groupinstall "Server with GUI"
 +dnf --installroot $CHROOT install iptables-services
 +dnf --installroot $CHROOT clean all
  
 # rebuild vnfs
Line 212: Line 218:
 ==== golden image ====
  
-After stateful imaging we touch another file on imaged server then build a golden image. The touching of this new file represents customixing and testing the node. So for complex designs we might put the node temporarily on the internet and install nvidia drivers and toolkit. And perhaps install software that will optimize itself based on resources found (like gromacs/lammps probing gpu models for proper architecture). Then we build a golden image.
+After stateful imaging we touch another file on the imaged server, then build a golden image. Touching this new file stands in for customizing and testing the node before creating the golden image. For complex designs we might, for example, put the node temporarily on the internet and install NVIDIA drivers and toolkit, and perhaps install software that optimizes itself based on the resources it finds (like gromacs/lammps probing GPU models for the proper architecture). We build the golden image once everything works as expected. This is hard to do in a CHROOT environment.
  
 <code> <code>
Line 235: Line 241:
 # view /etc/warewulf/vnfs.conf
 # the HYBRIDIZE section is commented out
 +
 +# the /var/[log|spool|run] entries need to be removed from
 +# /usr/libexec/warewulf/wwmkchroot/golden-tmpl
 +
 +# try on compute nodes
 +systemctl enable slurmd
  
 SOURCEADDR=n59 wwmkchroot golden-system \
Line 276: Line 288:
 </code> </code>
  
 +Awesome. You also have a backup now. Image away. There is also no need for a dhcp server to always be at the ready. Linux will usually repair journaled file system errors on its own when rebooted after, say, a utility power loss.\\ Thank you Warewulf team.
 +
 +There are also EFI and EFI + NVMe filesystem examples in ''/etc/warewulf/filesystem/examples''.
 +
 +==== logger ====
 +
 +For some reason, after the vnfs has been compiled and deployed, ''/dev/log'' is a socket file that generates permission denied errors. The manual fix below has to be applied; it could be put in ''/etc/rc.local'' in the future.
 +
 +<code>
 +
 +cd /dev
 +mv log log-orig
 +ln -s /run/systemd/journal/dev-log log
 +
 +logger test
 +journalctl --since=-1m
 +-- Logs begin at Thu 2022-05-12 10:46:49 EDT, end at Thu 2022-05-12 10:52:17 EDT. --
 +May 12 10:52:17 n59 root[3748]: test
 +
 +</code>
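
If the fix does end up in ''/etc/rc.local'', it may help to make it safe to run on every boot. The following is only a minimal sketch of the same ''mv'' and ''ln -s'' steps wrapped in a function; ''fix_dev_log'' is a hypothetical helper name, and the optional directory and target arguments are assumptions added so the function can be tried on a scratch directory before it touches ''/dev''.

```shell
# Sketch only: fix_dev_log is a hypothetical helper, not part of Warewulf or systemd.
# It repeats the manual steps (mv log log-orig; ln -s ... log) but does nothing
# when the symlink is already in place, so it can run at every boot.
fix_dev_log() {
    local dev_dir="${1:-/dev}"                          # directory holding "log"
    local target="${2:-/run/systemd/journal/dev-log}"   # journal socket to link to
    local log="$dev_dir/log"

    # Already a symlink to the target: nothing to do.
    if [ -L "$log" ] && [ "$(readlink "$log")" = "$target" ]; then
        return 0
    fi

    # Keep whatever is there now (socket, file, or stale link) as log-orig.
    if [ -e "$log" ] || [ -L "$log" ]; then
        mv -f "$log" "$dev_dir/log-orig" || return 1
    fi

    ln -s "$target" "$log"
}
```

Called without arguments, ''fix_dev_log'' acts on ''/dev/log''; a ''logger test'' followed by ''journalctl --since=-1m'' can then confirm the result as shown above.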
 +
 +==== queues left ====
 +
 +Nodes in these queues will not be imaged:
 +
 +  * hp12 n[1-32] Too old and failing fast, CentOS 6
 +  * mwgpu n[33-37] K20 GPUs are EOL, no more CUDA driver updates, CentOS 7
 +  * mw256fd n[38-45] When Warewulf starts imaging, these nodes get stuck in a loop of "disks not ready", CentOS 6
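
To keep these nodes out of an imaging loop, the list above could be encoded as a small guard. This is a rough sketch, not an existing tool: ''queue_left_for'' is a hypothetical name, and it assumes the ranges mean n1 through n32, n33 through n37 and n38 through n45.

```shell
# Sketch only: queue_left_for is a hypothetical helper.
# Prints the queue name and returns 0 if the node is in a queue that is
# left un-imaged; returns 1 for any other node name.
queue_left_for() {
    # Require the form n<digits>.
    case "$1" in
        n*) ;;
        *) return 1 ;;
    esac
    local num="${1#n}"
    case "$num" in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Ranges assumed from the list above.
    if   [ "$num" -ge 1 ]  && [ "$num" -le 32 ]; then echo hp12
    elif [ "$num" -ge 33 ] && [ "$num" -le 37 ]; then echo mwgpu
    elif [ "$num" -ge 38 ] && [ "$num" -le 45 ]; then echo mw256fd
    else return 1
    fi
}
```

A provisioning script could call it as ''if q=$(queue_left_for "$node"); then echo "skip $node ($q)"; fi'' before doing anything to the node.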
  
  
cluster/216.1651244270.txt.gz · Last modified: 2022/04/29 10:57 by hmeij07