==== Slurm entangles ====
So, vaguely I remember when redoing the cluster with OpenHPC v2.x that the CentOS 7 nodes might not simply join the new Slurm setup. And so it turns out.
That's too bad as I was hoping to have a single operating system cluster. But now I will have to think about what to do with our CentOS 7 hardware, which is running the old scheduler. The hope was to migrate everything to the Slurm scheduler.
  - Advanced > Network Stack – Enable
  - within that tab enable PXE4 and PXE6 support
  - Boot > Boot order; network first then hard drive
  - Save & Exit > Yes & Reset
Next we create the warewulf object and boot (see deploy script, at bottom; a rough sketch follows here).
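Not the actual deploy script, just a sketch of the Warewulf 3 side of it; node name, MAC, and vnfs/bootstrap names below are made up:

<code>
# hypothetical wwsh sketch; node name, hwaddr, vnfs/bootstrap names assumed
wwsh node new n100 --netdev=eth0 --hwaddr=aa:bb:cc:dd:ee:ff --ipaddr=192.168.102.100
wwsh provision set n100 --vnfs=rocky8.5 --bootstrap=4.18.0-348.el8.x86_64 \
     --files=dynamic_hosts,passwd,group,shadow,munge.key
wwsh pxe update
wwsh dhcp update
</code>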
When this ASUS hardware boots, it sends over the correct mac address. We observe....
<code>
# in /...
Jun 10 09:13:57 cottontail2 in.tftpd[388239]: ...
finished /...
</code>
==== Slurm #1 ====
First Idea: install the OHPC v1.3 CentOS 7 slurmd client on the node, then join that to the OHPC v2.4 slurmctld. To do that, first set up the v1.3 repository (link below; a hedged sketch follows it).
  * http://...
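A sketch of that route on the CentOS 7 node, assuming the v1.3 release RPM from the (truncated) link above; ''ohpc-slurm-client'' is the OHPC client meta-package:

<code>
# hypothetical: enable the OHPC 1.3 repo, then pull the slurm client stack
yum install ./ohpc-release-1.3*.el7.x86_64.rpm
yum install ohpc-slurm-client
systemctl enable munge slurmd
</code>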
Make sure munge/...
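Presumably the truncated note above concerns munge; a quick sanity check that the key matches between head node and client (node name hypothetical):

<code>
munge -n | unmunge              # local encode/decode must succeed
munge -n | ssh n100 unmunge     # fails if munge.key differs on the node
</code>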
| + | |||
| + | Had to uncomment this for slurmd to start (seems ok because they are slurmctld settings not used by slurmd client...according to slurm list) | ||
| + | |||
| + | < | ||
| + | |||
| + | # | ||
| + | # | ||
| + | |||
| + | </ | ||
| < | < | ||
==== Slurm #2 ====
Second Idea: download the matching Slurm version and compile it locally on the CentOS 7 node:
| < | < | ||
| Line 124: | Line 134: | ||
| / | / | ||
| - | # make the generic / | + | # |
| </ | </ | ||
YES! it does register, hurray.
Finish with:
  * make generic /...
  * copy over munge.key, restart munge
  * startup at boot in ''/...'' (a hedged sketch follows this list)
  * export ...
  * make dirs /...
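A minimal sketch of that startup-at-boot step, assuming ''/etc/rc.d/rc.local'' on CentOS 7 and the made-up install prefix from above:

<code>
# CentOS 7: rc.local only runs at boot if it is executable
chmod +x /etc/rc.d/rc.local
# hypothetical additions; prefix assumed
echo "systemctl start munge" >> /etc/rc.d/rc.local
echo "/usr/local/slurm-20.11.9/sbin/slurmd" >> /etc/rc.d/rc.local
</code>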
Do **NOT** mount ''/...''
==== Slurm #3 ====
There is a warning on the Slurm web page regarding the older versions archive page:
  * https://...
"Due to a security vulnerability (CVE-2022-29500), ..."
Third Idea: once we're fully deployed I may go to the latest Slurm version and run on different ports, with maybe a newer munge version (although that should not matter; why does this scheduler even need munge?).

Run Slurm outside of OpenHPC via local compile in ''/...''
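The different-ports part would just be a pair of lines in ''slurm.conf''; the defaults are 6817/6818, so a side-by-side scheduler needs its own pair (values below made up):

<code>
# hypothetical slurm.conf lines for running alongside the OHPC scheduler
SlurmctldPort=6827
SlurmdPort=6828
</code>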
Decided ...
| + | --- // | ||
==== Deploy ====