This shows you the differences between two versions of the page.
cluster:160 [2017/04/05 12:32] hmeij07 [OpenHPC page 4] |
cluster:160 [2017/05/31 15:07] (current) hmeij07 [OpenHPC page 4] |
||
---|---|---|---|
Line 5: | Line 5: | ||
** ib0 ** | ** ib0 ** | ||
+ | |||
+ | Using InfiniBand for MPI traffic takes somewhat more configuration. Starting from the ground up (v1.3 documentation), we begin by installing the required packages into the CHROOT. | ||
+ | |||
+ | < | ||
+ | |||
+ | # Add IB support and enable | ||
+ | yum -y --installroot=$CHROOT groupinstall " | ||
+ | yum -y --installroot=$CHROOT install infinipath-psm | ||
+ | chroot $CHROOT systemctl enable rdma | ||
+ | # User environment | ||
+ | yum -y --installroot=$CHROOT install lmod-ohpc | ||
+ | |||
+ | </ | ||
+ | |||
+ | Next we import a template file in which the IPADDR and NETMASK values of the '' | ||
+ | |||
+ | < | ||
+ | |||
+ | wwsh file import / | ||
+ | wwsh -y file set ifcfg-ib0.ww --path=/ | ||
+ | |||
+ | |||
+ | wwsh node set $node --netdev=ib0 \ | ||
+ | | ||
+ | | ||
+ | |||
+ | wwsh provision set $node --fileadd ifcfg-ib0.ww -y | ||
+ | |||
+ | </ | ||
+ | |||
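To make the template step above more concrete, here is a minimal sketch of what the provisioned ''ifcfg-ib0'' file might look like on a node once the per-node IPADDR and NETMASK values have been filled in. The helper name `render_ifcfg_ib0` and the addresses are hypothetical and only illustrate the expected result; the actual substitution is done by Warewulf from the imported template, not by this function.

```shell
#!/bin/sh
# Hypothetical helper (not part of Warewulf): print the ifcfg-ib0 contents
# a node could end up with after the per-node values are substituted.
# $1 = IP address for ib0, $2 = netmask for ib0
render_ifcfg_ib0() {
    ip="$1"
    netmask="$2"
    cat <<EOF
DEVICE=ib0
BOOTPROTO=static
IPADDR=${ip}
NETMASK=${netmask}
ONBOOT=yes
EOF
}

# Placeholder values only; use the addresses of your own ib0 network.
render_ifcfg_ib0 192.0.2.29 255.255.255.0
```

Comparing this against ''/etc/sysconfig/network-scripts/ifcfg-ib0'' on a freshly imaged node is a quick way to confirm that the values set with `wwsh node set` actually reached the node.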
+ | Reassemble the VNFS and reimage the nodes. Now you can follow the IPoIB instructions [[cluster: | ||
+ | |||
+ | Then add these lines to ~test/ | ||
+ | |||
+ | < | ||
+ | |||
+ | # User specific aliases and functions | ||
+ | module load gnu/5.4.0 | ||
+ | module load openmpi/ | ||
+ | module load prun/1.1 | ||
+ | module list | ||
+ | |||
+ | # job.102.out | ||
+ | / | ||
+ | [prun] Master compute host = n29 | ||
+ | [prun] Resource manager = slurm | ||
+ | [prun] Launch cmd = mpirun ./a.out | ||
+ | |||
+ | | ||
+ | --> Process # 0 of 8 is alive. -> n29.localdomain | ||
+ | --> Process # 1 of 8 is alive. -> n29.localdomain | ||
+ | --> Process # 2 of 8 is alive. -> n29.localdomain | ||
+ | --> Process # 3 of 8 is alive. -> n29.localdomain | ||
+ | --> Process # 4 of 8 is alive. -> n31.localdomain | ||
+ | --> Process # 5 of 8 is alive. -> n31.localdomain | ||
+ | --> Process # 6 of 8 is alive. -> n31.localdomain | ||
+ | --> Process # 7 of 8 is alive. -> n31.localdomain | ||
+ | |||
+ | </ | ||
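The job output above shows the eight ranks split across two nodes. A small sketch for checking that distribution from the output file follows; the function name `ranks_per_host` is hypothetical, and it assumes each rank prints a line containing "is alive" that ends with the host name, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: count how many MPI ranks reported in from each host.
# $1 = job output file containing lines such as
#      "--> Process # 0 of 8 is alive. -> n29.localdomain"
ranks_per_host() {
    awk '/is alive/ { n[$NF]++ } END { for (h in n) print h, n[h] }' "$1" | sort
}

# Only run against the sample file name if it exists in the current directory.
if [ -f job.102.out ]; then
    ranks_per_host job.102.out
fi
```

For the listing above this would report four ranks on n29.localdomain and four on n31.localdomain. An uneven split would be a hint to check the task layout requested from the resource manager.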
Line 11: | Line 70: | ||
* http:// | * http:// | ||
+ | * http:// | ||
+ | |||
+ | |||
+ | [[cluster: | ||
+ | --- // | ||
- | [[cluster: | + | **POSTFIX NOTE** |
+ | To let users send email from inside their jobs (for example, job progress reports), install mailx on the nodes. In ''/ | ||
+ | |||
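As an illustration of the kind of progress report mentioned in the note above, here is a minimal sketch of a helper a user could call from a job script. The function names `job_report` and `send_job_report` and the recipient address are hypothetical; the sketch assumes mailx is installed on the node and that the job runs under Slurm, which sets `SLURM_JOB_ID`.

```shell
#!/bin/sh
# Hypothetical helper: build a one-line progress message for the current job.
# $1 = free-form status text
job_report() {
    printf 'Job %s on %s: %s\n' "${SLURM_JOB_ID:-unknown}" "$(uname -n)" "$1"
}

# Hypothetical helper: mail the progress message if mailx is available.
# $1 = recipient address, $2 = status text
send_job_report() {
    if command -v mailx >/dev/null 2>&1; then
        job_report "$2" | mailx -s "job ${SLURM_JOB_ID:-unknown} progress" "$1"
    else
        echo "mailx not found; report not sent" >&2
        return 1
    fi
}

# Example call inside a job script (placeholder address):
#   send_job_report user@example.com "step 3 of 10 finished"
```

Keeping the message construction separate from the sending step makes it easy to test the report text on a login node before relying on mail delivery from the compute nodes.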
\\ | \\ | ||
**[[cluster: | **[[cluster: |