==== Slurm entangles ====
So, vaguely, I remember from when I was redoing the cluster that Slurm versions entangle: the OpenHPC v2.4 slurmctld will not talk to the much older slurmd clients on our CentOS 7 nodes. That's too bad, as I was hoping to have a single operating system cluster. But now I will have to think about what to do with our CentOS 7 hardware, which is running the old scheduler. The hope was to migrate everything to the Slurm scheduler.
  - Advanced > Network Stack – Enable
  - within that tab, enable PXE4 and PXE6 support
  - Boot > Boot order; network first, then hard drive
  - Save & Exit > Yes & Reset
Next we create the warewulf object and boot (see deploy script, at bottom, and the sketch below).
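As a rough sketch of what that object creation involves, assuming the Warewulf 3 ''wwsh'' tooling that ships with OpenHPC 2.x (the node name, MAC, IP, and VNFS name below are placeholders, not our actual values):

<code bash>
# placeholders: node n100, provisioning NIC eth0
wwsh -y node new n100 --ipaddr=192.168.1.100 --hwaddr=aa:bb:cc:dd:ee:ff -D eth0
# attach a VNFS image, bootstrap kernel, and files to sync to the node
wwsh -y provision set n100 --vnfs=rocky8 --bootstrap=`uname -r` \
     --files=dynamic_hosts,passwd,group,shadow,munge.key
# refresh services so the node gets the new object on its next PXE boot
systemctl restart dhcpd
wwsh pxe update
</code>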
When this ASUS hardware boots, it sends over the correct MAC address. We observe...
# in /...
Jun 10 09:13:57 cottontail2 in.tftpd[388239]: ...
finished /...
</code>
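To catch those lines live as the node boots, something along these lines on the head node works (assuming the PXE chatter lands in the default syslog file):

<code bash>
# follow dhcp and tftp activity during the PXE boot
tail -f /var/log/messages | egrep 'dhcpd|in.tftpd'
</code>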
==== Slurm #1 ====
First Idea: install the OHPC v1.3 CentOS 7 slurmd client on the node, then join that to the OHPC v2.4 slurmctld. To do that, first ''...''
  * http://...
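The snag is Slurm's compatibility window: slurmd may be at most two major releases older than the slurmctld it registers with. A hedged sketch of the version check (the versions below are assumptions about what those OHPC releases shipped, not confirmed values):

<code bash>
# on the CentOS 7 node (ohpc v1.3)
slurmd -V         # e.g. slurm 18.08.x (assumed)
# on the Rocky 8 head node (ohpc v2.4)
slurmctld -V      # e.g. slurm 21.08.x (assumed)
# 18.08 to 21.08 spans three major releases, outside the two-release
# window, so the old slurmd cannot register with the new slurmctld
</code>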
==== Slurm #2 ====
Second Idea: download ...

<code>
/...
# make the generic /...
</code>
YES! It does register, hurray.
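A quick way to confirm the registration from the head node (the node name is a placeholder):

<code bash>
sinfo -N -l               # the node should now be listed in its partition
scontrol show node n100   # check the State= field for the placeholder node
</code>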
Finish with:
  * make the generic /...
  * copy over munge.key, restart munge
  * startup at boot in ''/...''
  * export ...
  * make dirs /...

Do **NOT** mount ''/...''
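A minimal sketch of those finish-up steps, assuming stock slurm and munge paths (the elided files and directories above are replaced with guesses here; the export step is left out since its target was elided):

<code bash>
# copy the generic config and munge key from the head node (paths assumed)
scp cottontail2:/etc/slurm/slurm.conf /etc/slurm/slurm.conf
scp cottontail2:/etc/munge/munge.key /etc/munge/munge.key
systemctl restart munge
# start slurmd at boot (assuming rc.local is the startup hook meant above)
echo "/usr/sbin/slurmd" >> /etc/rc.local
chmod +x /etc/rc.local
# spool and log dirs slurmd expects (assumed locations)
mkdir -p /var/spool/slurm /var/log/slurm
</code>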
+ | |||
==== Slurm #3 ====
There is a warning on the Slurm web page regarding the older versions archive page:
  * https://...

"Due to a security vulnerability (CVE-2022-29500), ..."
+ | |||
Third Idea: once we're fully deployed I may go to the latest Slurm version, run on different ports, with maybe a newer munge version (although that should not matter; why does this scheduler even need munge?).

Run Slurm outside of OpenHPC, a standalone version. Decided to go this route: v22.05.02 standalone (with the ohpc v1.3 or v2.4 munge packages).
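A minimal sketch of that standalone route, assuming the stock autotools build and the standard slurm.conf port parameters (the prefix, ports, and exact tarball name are assumptions):

<code bash>
# build slurm from source, outside of the openhpc packaging
wget https://download.schedmd.com/slurm/slurm-22.05.2.tar.bz2
tar xjf slurm-22.05.2.tar.bz2 && cd slurm-22.05.2
./configure --prefix=/usr/local/slurm-22.05.2
make -j4 && make install
</code>

Then, in that build's slurm.conf, move the daemons off the default 6817/6818 ports so they do not collide with any openhpc slurm still running, e.g. ''SlurmctldPort=7817'' and ''SlurmdPort=7818''.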
--- //hmeij07 2022/06/23//
==== Deploy ====