OpenHPC page 4

ib0

Using InfiniBand for MPI traffic requires some additional configuration. Starting from the ground up (per the v1.3 documentation), we first install the required packages into the CHROOT. Be sure to follow the recipe and install these on the SMS too.

# Add IB support and enable
yum -y --installroot=$CHROOT groupinstall "InfiniBand Support"
yum -y --installroot=$CHROOT install infinipath-psm
chroot $CHROOT systemctl enable rdma
# User environment
yum -y --installroot=$CHROOT install lmod-ohpc

Next we import a template file in which the IPADDR and NETMASK values of the ib0 interface are replaced with values from the database. Add lines like these to your deploy scripts:

wwsh file import /opt/ohpc/pub/examples/network/centos/ifcfg-ib0.ww
wwsh -y file set ifcfg-ib0.ww --path=/etc/sysconfig/network-scripts/ifcfg-ib0


 wwsh node set $node --netdev=ib0 \
 --ipaddr=$ipaddri \
 --netmask=255.255.0.0  --network=255.255.0.0 -y

 wwsh provision set $node --fileadd ifcfg-ib0.ww -y
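The commands above rely on $node and $ipaddri being set by the surrounding deploy script. A minimal sketch of such a loop follows, assuming node names of the form n29 and an ib0 address range of 10.11.0.x. The helper ib_addr_for and that address range are hypothetical and not taken from this page; the wwsh flags are copied from the lines above. The loop only prints the commands so they can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper: derive an ib0 address from a node name such as "n29".
# The 10.11.0.x range is an assumption for illustration only.
ib_addr_for() {
    local node=$1
    local num=${node#n}      # strip the leading "n", e.g. n29 -> 29
    echo "10.11.0.${num}"
}

# Print (rather than run) the wwsh commands for each node.
for node in n29 n31; do
    ipaddri=$(ib_addr_for "$node")
    echo "wwsh node set $node --netdev=ib0 --ipaddr=$ipaddri --netmask=255.255.0.0 --network=255.255.0.0 -y"
    echo "wwsh provision set $node --fileadd ifcfg-ib0.ww -y"
done
```

Once the printed commands look right, the echo wrappers can be dropped so the script runs wwsh directly.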

Reassemble the VNFS and reimage the nodes. Now you can follow the IPoIB instructions (see Infiniband).
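After the reimage, it can help to confirm on a compute node that ib0 actually came up before testing MPI. The sketch below is one way to do that, assuming a Linux node where network devices appear under /sys/class/net. The helper has_netdev is hypothetical; the wwvnfs line in the comments is the rebuild step as given in the OpenHPC recipe and assumes $CHROOT is defined as earlier.

```shell
#!/bin/bash
# Rebuild the VNFS image from the chroot (run on the SMS):
#   wwvnfs --chroot $CHROOT
# then reboot the compute nodes so they pick up the new image.

# Hypothetical helper: succeed if the named network device exists on this host.
has_netdev() {
    [ -d "/sys/class/net/$1" ]
}

if has_netdev ib0; then
    echo "ib0 present"
    ip -4 addr show ib0    # should list the address set with wwsh
else
    echo "ib0 missing: check that rdma is enabled and the VNFS was rebuilt"
fi
```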

Then add these lines to the ~test/.bashrc file and resubmit job.mpi. The job now runs MPI over InfiniBand.

# User specific aliases and functions
module load gnu/5.4.0
module load openmpi/1.10.4
module load prun/1.1
module list

# job.102.out
/opt/ohpc/pub/prun/1.1/prun
[prun] Master compute host = n29
[prun] Resource manager = slurm
[prun] Launch cmd = mpirun ./a.out

 Hello, world (8 procs total)
    --> Process #   0 of   8 is alive. -> n29.localdomain
    --> Process #   1 of   8 is alive. -> n29.localdomain
    --> Process #   2 of   8 is alive. -> n29.localdomain
    --> Process #   3 of   8 is alive. -> n29.localdomain
    --> Process #   4 of   8 is alive. -> n31.localdomain
    --> Process #   5 of   8 is alive. -> n31.localdomain
    --> Process #   6 of   8 is alive. -> n31.localdomain
    --> Process #   7 of   8 is alive. -> n31.localdomain

LMod

OpenHPC page 1 - OpenHPC page 2 - OpenHPC page 3 - page 4

Henk 2017/05/30 11:03

POSTFIX NOTE

To let users send email from inside their jobs (for example, job progress reports), install mailx on the nodes. In /etc/ssmtp/ssmtp.conf, define the relay host (such as sms-eth0-private) as the "mailhub". No other changes are needed.
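As an illustration, a job script could send a progress report roughly as sketched below. The helpers job_mail_subject and send_progress and the recipient address are hypothetical; the mailhub line in the comment uses the example host name from the note above.

```shell
#!/bin/bash
# Assumed line in /etc/ssmtp/ssmtp.conf on each node:
#   mailhub=sms-eth0-private

# Hypothetical helper: build a subject line that identifies the job.
# SLURM_JOB_ID is set by Slurm inside a running job; "unknown" is a fallback.
job_mail_subject() {
    echo "job ${SLURM_JOB_ID:-unknown}: $1"
}

# Hypothetical helper: mail a short progress report to the given address.
send_progress() {
    local to=$1 stage=$2
    echo "Reached stage '$stage' on $(hostname)" \
        | mailx -s "$(job_mail_subject "$stage")" "$to"
}

# Example call inside a job script (address is a placeholder):
# send_progress user@example.com "step 1 done"
```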



cluster/160.txt · Last modified: 2017/05/31 15:07 by hmeij07