cluster:217 [2022/06/24 13:53] (current) hmeij07
==== Slurm entangles ====
So, vaguely I remember when redoing
That's too bad, as I was hoping to have a single operating system cluster. Now I will have to think about what to do with our CentOS 7 hardware, which is running the old scheduler. The hope was to migrate everything to the Slurm scheduler.
  - Advanced > Network Stack – Enable
  - within that tab, enable PXE4 and PXE6 support
  - Boot > Boot order; network first, then hard drive
  - Save & Exit > Yes & Reset
Next we create the warewulf object and boot (see deploy script, at bottom).
When this ASUS hardware boots, it sends over the correct mac address. We observe....
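The node-object creation step above might be sketched with Warewulf 3's ''wwsh'' as shipped with OpenHPC 2.x; the node name, IP, MAC, VNFS, and bootstrap names below are placeholders, not this site's actual values:

```shell
# Hypothetical node name/IP/MAC -- substitute the ASUS node's real values.
# Create the node object so the master answers its PXE request.
wwsh node new n100 --ipaddr=192.168.1.100 --hwaddr=aa:bb:cc:dd:ee:ff --netdev=eth0

# Attach an existing bootstrap image and VNFS capsule (names assumed).
wwsh provision set n100 --vnfs=rocky8 --bootstrap=$(uname -r)

# Regenerate the PXE and DHCP configs so the new entry is served.
wwsh pxe update
wwsh dhcp update
```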
# in /
Jun 10 09:13:57 cottontail2 in.tftpd[388239]:
finished /
</
==== Slurm #1 ====
First Idea: install the OHPC v1.3 CentOS7 slurmd client on the node, then join that to the OHPC v2.4 slurmctld. To do that, first ''
  * http://
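A hedged sketch of that first idea, assuming the standard OpenHPC package names; the release rpm file name is illustrative only, since the repo URL above is truncated:

```shell
# On the CentOS 7 node: install the OHPC 1.3 release rpm (file name illustrative,
# fetched from the OpenHPC 1.3 repo referenced above).
yum -y install ./ohpc-release-1.3-1.el7.x86_64.rpm

# Pull in the Slurm client side from OHPC 1.3 (brings slurmd and munge).
yum -y install ohpc-slurm-client

# The node would then need the v2.4 cluster's slurm.conf and munge.key
# before slurmd could try to register with the newer slurmctld.
```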
==== Slurm #2 ====
Second Idea: download
<
/
# make the generic /
</
YES! it does register, hurray.
Finish with:
  * make generic /
  * copy over munge.key, restart munge
  * startup at boot in ''/
  * export
  * make dirs /
Do **NOT** mount ''/
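The finishing checklist above might look like this in shell form. The wiki truncates all paths, so every path below is an assumption, and systemd is used here although the truncated ''/'' entry for "startup at boot" may refer to a different mechanism:

```shell
# Copy the cluster's munge key from the master, then restart munge
# (master hostname and key path are assumptions).
scp cottontail2:/etc/munge/munge.key /etc/munge/munge.key
chown munge:munge /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
systemctl restart munge

# Create the local spool/log directories slurmd expects (locations assumed).
mkdir -p /var/spool/slurmd /var/log/slurm
chown slurm:slurm /var/spool/slurmd /var/log/slurm

# Enable startup at boot.
systemctl enable --now slurmd
```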
+ | |||
==== Slurm #3 ====
There is a warning on the Slurm web page, on the older versions archive page:
  * https://
"Due to a security vulnerability (CVE-2022-29500),
+ | |||
Third Idea: once we're fully deployed I may go to the latest Slurm version, run on different ports, with maybe a newer munge version (although that should not matter; why does this scheduler even need munge?)
+ | |||
Run Slurm outside of OpenHPC via a local compile in ''/
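A sketch of such a local build outside of OpenHPC, with non-default ports picked in ''slurm.conf'' so the two instances do not collide; the version, prefix, and port numbers are all assumptions:

```shell
# Build Slurm from source into a local prefix (version and prefix assumed).
tar xjf slurm-22.05.2.tar.bz2
cd slurm-22.05.2
./configure --prefix=/usr/local/slurm --sysconfdir=/usr/local/slurm/etc
make -j"$(nproc)"
make install

# In /usr/local/slurm/etc/slurm.conf, choose non-default ports so this
# instance can coexist with the openhpc one (defaults are 6817/6818;
# the values here are assumptions):
#   SlurmctldPort=6827
#   SlurmdPort=6828
```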
Decided
--- //
==== Deploy ====