cluster:217

Differences

This shows you the differences between two versions of the page.

cluster:217 [2022/06/20 15:28]
hmeij07 [Slurm entangles]
cluster:217 [2022/06/24 09:53] (current)
hmeij07 [Slurm #3]
Line 17: Line 17:
   - Advanced > Network Stack – Enable   - Advanced > Network Stack – Enable
   - within that tab enable PXE4 and PXE6 support   - within that tab enable PXE4 and PXE6 support
-  - Boot - boot order: network first then hard drive
+  - Boot > Boot order: network first then hard drive
   - Save & Exit > Yes & Reset   - Save & Exit > Yes & Reset
  
-Next we create the warewulf object and boot (see deploy script, at bottom).
+Next we create the warewulf node object and boot (see deploy script, at bottom).
  
 When this ASUS hardware boots, it sends over the correct mac address. We observe.... When this ASUS hardware boots, it sends over the correct mac address. We observe....
Line 33: Line 33:
 # in /etc/httpd/logs/access_log # in /etc/httpd/logs/access_log
  
-Jun  10 09:13:57 cottontail2 in.tftpd[388239]: Client ::ffff:192.168.102.100 finished /warewulf/ipxe/bin-i386-pcbios/undionly.kpxe
+Jun  10 09:13:57 cottontail2 in.tftpd[388239]: Client ::ffff:192.168.102.100
+finished /warewulf/ipxe/bin-i386-pcbios/undionly.kpxe
  
 </code> </code>
Line 58: Line 59:
 ==== Slurm #1 ==== ==== Slurm #1 ====
  
-First thought was to install OHPC v1.3 CentOS7 slurmd client on the node, then join that to OHPC v2.4 slurmctld. To do that first ''yum install'' the ohpc-release from this location
+First idea: install the OHPC v1.3 CentOS7 slurmd client on the node, then join it to the OHPC v2.4 slurmctld. To do that, first ''yum install'' the ohpc-release package from this location
  
   * http://repos.openhpc.community/ohpc-1.3/1.3.9/base/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm   * http://repos.openhpc.community/ohpc-1.3/1.3.9/base/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm
Line 113: Line 114:
 ==== Slurm #2 ==== ==== Slurm #2 ====
  
-I downloaded Slurm source the closest version just above ohpc v2.4 version. Next compile 20.11.9 slurm and see if it is accepted on ohpc v2.4 slurm 20.11.8
+Second idea: download the Slurm source for the closest version just above the one shipped with ohpc v2.4. Next, compile Slurm 20.11.9 and see if the ohpc v2.4 Slurm 20.11.8 accepts its registration ....
  
 <code> <code>
Line 133: Line 134:
 /usr/local/slurm-20.11.9/lib/slurm/auth_munge.so /usr/local/slurm-20.11.9/lib/slurm/auth_munge.so
  
-make the generic /usr/local/slurm -> /usr/local/slurm-20.11.9 link
+
  
 </code> </code>
Line 139: Line 140:
 YES! it does register, hurray. YES! it does register, hurray.
  
-Finish with startup at boot in ''/etc/rc.local'' and exports envs in ''/etc/bashrc''
+Finish with
+  * make generic /usr/local/slurm -> /usr/local/slurm-20.11.9 link
+  * copy over munge.key, restart munge
+  * startup at boot in ''/etc/rc.local''
+  * export envs in ''/etc/bashrc''
+  * make dirs /var/log/slurm /var/spool/slurm
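
The finishing steps listed above could be scripted roughly as follows. This is a minimal sketch, not the deploy script: ''ROOT'' is a hypothetical scratch prefix (an assumption, not part of the original setup) so the script can be dry-run without root, and the munge.key copy, munge restart, rc.local and bashrc edits stay as comments because their exact contents are not shown here.

```shell
#!/bin/sh
# Sketch only: ROOT is a hypothetical scratch prefix so nothing real is touched.
# On a real node you would run this as root with ROOT set to the empty string.
set -eu
ROOT="${ROOT-$(mktemp -d)}"
SLURM_VERSION=20.11.9

# Stand-in for the compiled tree; on a real node it already exists.
mkdir -p "$ROOT/usr/local/slurm-$SLURM_VERSION"

# Generic link: /usr/local/slurm -> /usr/local/slurm-20.11.9
ln -sfn "/usr/local/slurm-$SLURM_VERSION" "$ROOT/usr/local/slurm"

# Log and spool directories for Slurm
mkdir -p "$ROOT/var/log/slurm" "$ROOT/var/spool/slurm"

# Still to do by hand (details are not shown on this page):
#   - copy over munge.key, restart munge
#   - startup at boot in /etc/rc.local
#   - export envs in /etc/bashrc
echo "prepared under: ${ROOT:-/}"
```

Keeping the versioned directory and pointing a generic link at it means a later Slurm build can be swapped in by moving only the link.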
  
 Do **NOT** mount ''/opt/intel'' and ''/opt/ohpc/pub'' from SMS, that's all Rocky8.5 stuff. Do **NOT** mount ''/opt/intel'' and ''/opt/ohpc/pub'' from SMS, that's all Rocky8.5 stuff.
 +
  
 ==== Slurm #3 ==== ==== Slurm #3 ====
  
-Just a mental note. There is a warning on Slurm web page re the older versions archives page
+There is a warning on the Slurm website's archive page for older versions
  
   * https://www.schedmd.com/archives.php   * https://www.schedmd.com/archives.php
  
-Due to a security vulnerability (CVE-2022-29500), all versions of Slurm prior to 21.08.8 or 20.11.9 are no longer available for download...so why is openhpc v2.4 running such an old slurm version?
+"Due to a security vulnerability (CVE-2022-29500), all versions of Slurm prior to 21.08.8 or 20.11.9 are no longer available for download" ... so why is openhpc v2.4 running such an old Slurm version?
 + 
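
To check whether a given Slurm version falls under that notice, a small version comparison helps. The sketch below is an illustration only: the function names ''version_lt'' and ''affected'' are made up here, it relies on the ''-V'' (version sort) option of GNU sort, and it reads the notice as "older than 20.11.9, or a 21.08.x release older than 21.08.8", which is an interpretation rather than something SchedMD spells out on that page.

```shell
#!/bin/sh
# version_lt A B: succeeds when version A sorts strictly before version B.
# Relies on the -V (version sort) option of GNU sort.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# affected VERSION: succeeds when VERSION falls under the archive notice,
# read here (an assumption) as: older than 20.11.9, or a 21.08.x release
# older than 21.08.8.
affected() {
    if version_lt "$1" 20.11.9; then
        return 0
    fi
    case "$1" in
        21.08.*) version_lt "$1" 21.08.8 ;;
        *) return 1 ;;
    esac
}

# Versions mentioned on this page.
for v in 20.11.8 20.11.9 22.05.02; do
    if affected "$v"; then
        echo "$v: affected, upgrade"
    else
        echo "$v: ok"
    fi
done
```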
+Third idea: once we're fully deployed I may go to the latest Slurm version and run it on different ports, maybe with a newer munge version (although that should not matter; why does this scheduler even need munge?)
+
+Run Slurm outside of openhpc via a local compile in ''/usr/local/'': a standalone install of the most recent version.
  
-Once we're fully deployed I may go to latest Slurm version, run on different ports with maybe newer munge version (although that should not matter, why does this scheduler even need munge?) and run Slurm outside of openhpc. A standalone version.
+Decided to go this route: a v22.05.02 standalone version (with the ohpc v1.3 or v2.4 munge packages). You need all three packages (munge, munge-libs and munge-devel) on the host where you compile Slurm (note: cottontail2 for Rocky8.5, node n90 for CentOS7). Then just copy.
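
For the "different ports" part of the third idea, slurm.conf has the ''SlurmctldPort'' and ''SlurmdPort'' settings, which default to 6817 and 6818. The sketch below writes a fragment with other ports into a scratch file; the port numbers 7817 and 7818 are hypothetical, and this is only a fragment, not a complete slurm.conf.

```shell
#!/bin/sh
set -eu
# Hypothetical ports for the standalone install; the stock defaults are
# SlurmctldPort=6817 and SlurmdPort=6818.
CTLD_PORT=7817
SLURMD_PORT=7818

# Scratch file standing in for the standalone slurm.conf (fragment only).
CONF="$(mktemp)"
cat > "$CONF" <<EOF
SlurmctldPort=$CTLD_PORT
SlurmdPort=$SLURMD_PORT
EOF

# Show the port lines that were written.
grep -E '^Slurm(ctld|d)Port=' "$CONF"
```

The controller and all nodes of the standalone install would need to agree on these values, so the same slurm.conf is typically copied everywhere.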
  
  
 + --- //[[hmeij@wesleyan.edu|Henk]] 2022/06/23 19:07//
 ==== Deploy ==== ==== Deploy ====
  
cluster/217.1655753283.txt.gz · Last modified: 2022/06/20 15:28 by hmeij07