Using Infiniband for MPI traffic requires somewhat more configuration. Starting from the ground up (v1.3 documentation), we first install the needed packages in the CHROOT. Be sure to follow the recipe and install these on the SMS too.

# Add IB support and enable
yum -y --installroot=$CHROOT groupinstall "InfiniBand Support"
yum -y --installroot=$CHROOT install infinipath-psm
chroot $CHROOT systemctl enable rdma
# User environment
yum -y --installroot=$CHROOT install lmod-ohpc
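
The remark about the SMS can be sketched as one helper that installs the same packages either on the local host or into an image. This is only a sketch: the function names `run` and `install_ib` and the `DRY_RUN` switch are made up for illustration, and by default the commands are printed rather than executed.

```shell
#!/bin/bash
# Sketch: install the InfiniBand packages on the SMS or into a chroot image.
# Hypothetical helpers (not part of OpenHPC): run, install_ib, DRY_RUN.

# Print the command unless DRY_RUN=0 is set, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

install_ib() {
    local root=$1            # empty string means "this host" (the SMS)
    local yum_root=()
    [ -n "$root" ] && yum_root=(--installroot="$root")

    run yum -y "${yum_root[@]}" groupinstall "InfiniBand Support"
    run yum -y "${yum_root[@]}" install infinipath-psm

    if [ -n "$root" ]; then
        # Inside the image we can only enable the service for the next boot.
        run chroot "$root" systemctl enable rdma
    else
        # On the running SMS, enable and start it right away.
        run systemctl enable rdma
        run systemctl start rdma
    fi
}
```

Call it as `install_ib ""` for the SMS and `install_ib "$CHROOT"` for the compute image; set `DRY_RUN=0` once the printed commands look right.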

Next, import a template file in which the IPADDR and NETMASK values of the ib0 interface are replaced with values from the Warewulf database. Add lines like these to your deploy scripts:

wwsh file import /opt/ohpc/pub/examples/network/centos/ifcfg-ib0.ww
wwsh -y file set ifcfg-ib0.ww --path=/etc/sysconfig/network-scripts/ifcfg-ib0

 wwsh node set $node --netdev=ib0 \
 --ipaddr=$ipaddri \
 --netmask=  --network= -y

 wwsh provision set $node --fileadd ifcfg-ib0.ww -y
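
A minimal sketch of how the per-node lines might be wrapped in a loop. Everything specific here is an assumption for illustration: the node names, the 172.16.1.0/24 IPoIB subnet, the netmask and network values, and the helper names `run`, `ipoib_addr` and `set_ib0`. Substitute the values of your own fabric. Commands are printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: register an ib0 address for each compute node in Warewulf.
# Assumed values (not from the original notes): subnet, netmask, node names.
IPOIB_PREFIX=${IPOIB_PREFIX:-172.16.1}
IPOIB_NETMASK=${IPOIB_NETMASK:-255.255.255.0}
IPOIB_NETWORK=${IPOIB_NETWORK:-172.16.1.0}

# Print the command unless DRY_RUN=0 is set, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Derive the host part from the digits in the node name (n29 -> 29, n05 -> 5).
ipoib_addr() {
    local digits=${1//[!0-9]/}
    echo "${IPOIB_PREFIX}.$((10#$digits))"
}

set_ib0() {
    local node=$1
    run wwsh node set "$node" --netdev=ib0 \
        --ipaddr="$(ipoib_addr "$node")" \
        --netmask="$IPOIB_NETMASK" --network="$IPOIB_NETWORK" -y
    run wwsh provision set "$node" --fileadd ifcfg-ib0.ww -y
}

for node in n29 n30 n31; do
    set_ib0 "$node"
done
```

Deriving the address from the node number keeps the Ethernet and IPoIB numbering aligned, which makes later debugging easier; a lookup table works just as well if your node names carry no number.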

Reassemble the VNFS and reimage the nodes. Now you can follow the IPoIB instructions under Infiniband.

Then add these lines to the ~test/.bashrc file and resubmit job.mpi; the job now runs MPI over Infiniband.

# User specific aliases and functions
module load gnu/5.4.0
module load openmpi/1.10.4
module load prun/1.1
module list

# job.102.out
[prun] Master compute host = n29
[prun] Resource manager = slurm
[prun] Launch cmd = mpirun ./a.out

 Hello, world (8 procs total)
    --> Process #   0 of   8 is alive. -> n29.localdomain
    --> Process #   1 of   8 is alive. -> n29.localdomain
    --> Process #   2 of   8 is alive. -> n29.localdomain
    --> Process #   3 of   8 is alive. -> n29.localdomain
    --> Process #   4 of   8 is alive. -> n31.localdomain
    --> Process #   5 of   8 is alive. -> n31.localdomain
    --> Process #   6 of   8 is alive. -> n31.localdomain
    --> Process #   7 of   8 is alive. -> n31.localdomain
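
The output above looks the same whichever transport Open MPI picked. One way to make the difference visible is to exclude the TCP transport, so that a multi-node run fails instead of silently falling back to Ethernet. The sketch below writes such a job script; the file name job-ib.mpi and the #SBATCH values are assumptions, and whether excluding TCP is appropriate depends on your fabric and Open MPI build.

```shell
#!/bin/bash
# Sketch: generate a variant of job.mpi that refuses to use TCP between nodes.
# Assumed for illustration: file name, job name, node/task counts, time limit.
cat > job-ib.mpi <<'EOF'
#!/bin/bash
#SBATCH -J test-ib
#SBATCH -o job.%j.out
#SBATCH -N 2
#SBATCH -n 8
#SBATCH -t 00:10:00

# Open MPI reads MCA parameters from OMPI_MCA_* environment variables.
# "^tcp" excludes the TCP BTL, leaving only the non-TCP transports.
export OMPI_MCA_btl=^tcp

prun ./a.out
EOF

echo "wrote job-ib.mpi; submit with: sbatch job-ib.mpi"
```

If the job still prints its "Hello, world" lines from both nodes with TCP excluded, the traffic between them is not going over Ethernet.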


Henk 2017/05/30 11:03


To let users send email from inside their jobs, such as job progress reports, install mailx and postfix on the nodes. In /etc/postfix/ on the nodes, define a relayhost (like sms-eth0-private), and on the SMS define a relayhost as well. On the nodes you will also need to remove the /usr/sbin/sendmail link and create a new link to /usr/sbin/sendmail.postfix.
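
The node-side mail steps might look like the following when applied to the compute image. This is a sketch under assumptions: the default CHROOT path, the helper names `run` and `setup_node_mail`, and the `DRY_RUN` switch are illustrative, and the relayhost setting is written to main.cf with `postconf -e`. The SMS side is left out because its relayhost depends on your site.

```shell
#!/bin/bash
# Sketch: let compute nodes send mail through postfix, relayed via the SMS.
# Assumed values: image path and relay host name; adjust both for your site.
CHROOT=${CHROOT:-/opt/ohpc/admin/images/centos7.3}
RELAY=${RELAY:-sms-eth0-private}

# Print the command unless DRY_RUN=0 is set, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

setup_node_mail() {
    run yum -y --installroot="$CHROOT" install mailx postfix
    # Point the image's postfix at the relay host (edits /etc/postfix/main.cf).
    run chroot "$CHROOT" postconf -e "relayhost = $RELAY"
    run chroot "$CHROOT" systemctl enable postfix
    # Replace the sendmail link so it resolves to postfix's sendmail on the node.
    run rm -f "$CHROOT/usr/sbin/sendmail"
    run ln -s /usr/sbin/sendmail.postfix "$CHROOT/usr/sbin/sendmail"
}

setup_node_mail
```

As with the other image changes, the VNFS has to be reassembled and the nodes reimaged before the new mail setup takes effect.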


