This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
cluster:145 [2015/12/11 19:57] 127.0.0.1 external edit |
cluster:145 [2017/04/05 15:22] (current) hmeij07 |
||
---|---|---|---|
Line 2: | Line 2: | ||
**[[cluster: | **[[cluster: | ||
- | ===== Warewulf Golden Image ===== | + | ===== IPoIB ===== |
- | Also read these pages and this page will make more sense: [[cluster: | + | Redoing our RHEL5.5 HP Proliant blade servers with CentOS 6.7 using [[cluster:144|Warewulf |
- | For some time now I have been looking for a provisioning tool. I've tried along the way ... | + | Not quite there yet, but I'll document here how InfiniBand was installed. These compute nodes are connected to a Voltaire interconnect, |
- | * Project Kusu, now defunct, but a great, simple template driven system. No fancy gui. | + | First install |
- | * HP's CMU, also a great tool, golden image approach. The nice feature of CMU is that the master node can delegate the imaging of hundreds of nodes to a designated compute node, relieving the master node. | + | |
- | * Bright Computing, a very complex tool that takes over every config file imaginable. Simple tasks become very burdensome, never achieved traction with this tool. | + | |
- | * xCAT, the behemoth of open source provisioning tools and more. It does it all, which means a huge learning curve. | + | |
- | * [[http:// | + | |
- | + | ||
- | The requirements of the provisioning tool were twofold: | + | |
- | + | ||
- | * My HPCC environment is flooded with small jobs that run for weeks to months (no wall time) but have small memory requirements (< 1GB). Thus I want to design stateless compute nodes, or virtual compute nodes, and frequently tailor the config & setup to the scientific needs (mostly non graphical, just CPU compute bound jobs, very little IO). | + | |
- | * The HPCC also encounters very large jobs (for us that is 16-32 cores with memory requirements in the 256 GB range) utilizing X11, OpenGL, Nvidia and other large, complex analysis software. In this case one compute node is built up to satisfaction, | + | |
- | + | ||
- | So I settled on Warewulf which does these two approaches and sports an active forum for questions. | + | |
- | + | ||
- | + | ||
- | Not finding much on the " | + | |
- | + | ||
- | Install Warewulf and poke around the shell '' | + | |
< | < | ||
- | wwsh node new b6 --netdev=eth0 \ | + | # Install and reboot |
- | --hwaddr=00: | + | yum groupinstall " |
- | --netmask=255.255.0.0 | + | yum install infiniband-diags perftest qperf opensm |
- | --groups=wwnodes | + | chkconfig rdma on |
- | wwsh node set b6 --netdev=eth1 \ | + | # for openhpc |
- | --hwaddr=00: | + | yum install infinipath-psm |
- | --netmask=255.255.0.0 | + | |
- | wwsh provision set b6 --fileadd passwd, | + | yum install opensm |
- | wwsh provision set b6 --fileadd hosts, | + | chkconfig opensm on |
- | wwsh provision set b6 --fileadd network.ww, | + | |
- | </ | + | yum install tcl tk |
+ | yum install infiniband-diags | ||
- | As opposed to the stateless, which grabs its OS content from the master node, in the " | + | shutdown -r now |
- | Set '' | + | # after reboot |
+ | lsmod | grep ib | ||
- | * / | + | # and the output (ipoib is the important one) |
- | < | + | ib_ipoib |
- | + | ib_ucm | |
- | # minder: all NFS file systems unmounted? | + | ib_uverbs |
- | + | ib_umad | |
- | mkdir / | + | ib_cm 36996 3 ib_ipoib, |
- | + | mlx4_ib | |
- | SOURCEADDR=b0 wwmkchroot golden-system / | + | ib_sa 24060 5 ib_ipoib, |
+ | ib_mad | ||
+ | ib_core | ||
+ | ib_addr | ||
+ | ipv6 335525 | ||
+ | mlx4_core | ||
</ | </ | ||
- | Next, modify | + | Connect |
- | [[http:// | + | |
< | < | ||
- | wwsh object modify -s bootloader=sda b6 | + | # Check IB ports |
- | wwsh object modify -s diskformat=sda1, | + | for i in `ls /sys/class/ |
- | wwsh object modify -s filesystems= \ | + | |
- | " | + | |
- | dev=sda3: | + | |
- | mountpoint=/: | + | |
- | b6 | + | |
- | </ | + | # Your output may vary |
- | + | /sys/class/infiniband/mlx4_0/ | |
- | More on the '' | + | 4: ACTIVE |
- | + | /sys/class/infiniband/mlx4_0/ | |
- | Next we need to get the node booted and transfer the VNFS image made from the node b0 contents. At this time look on your master node in /var/lib/mysql and make sure you have enough disk space (these VNFS images will be around | + | 1: DOWN |
- | + | ||
- | < | + | |
- | + | ||
- | # make the image, takes 10 minutes or so | + | |
- | wwvnfs --chroot=/var/chroots/goldimages/b0.chroot | + | |
- | + | ||
- | # switch node to image VNFS | + | |
- | wwsh provision set b6 --vnfs=b0.chroot | + | |
- | + | ||
- | # just to be prudent | + | |
- | wwsh pxe update | + | |
- | wwsh dhcp update | + | |
- | service dhcpd restart | + | |
- | + | ||
- | # check the configs | + | |
- | wwsh object print b6 -p :all | + | |
- | wwsh provision list | + | |
- | + | ||
- | # next for provisioning (just to be sure) on first PXE boot | + | |
- | wwsh provision set --bootlocal=UNDEF b6 | + | |
- | # turn the node on | + | # some test commands |
+ | ibhosts | ||
+ | iblinkinfo | ||
+ | ibstatus | ||
</ | </ | ||
- | The console of the target node will now show the IP being assigned, the '' | + | Edit '' |
- | + | ||
- | After all that is done, disable provisioning so that the master ignores the PXE boot and the target node boots off its local disk. | + | |
< | < | ||
- | # ignore PXE boot | + | DEVICE=ib0 |
- | wwsh provision set --bootlocal=EXIT b6 | + | TYPE=InfiniBand |
+ | UUID=eac9f00a-245d-4c88-b56f-1bcb6e6ed933 | ||
+ | ONBOOT=yes | ||
+ | NM_CONTROLLED=no | ||
+ | BOOTPROTO=none | ||
+ | HWADDR=80: | ||
+ | CONNECTED_MODE=no | ||
+ | IPADDR=10.11.103.31 | ||
+ | PREFIX=16 | ||
+ | DEFROUTE=no | ||
+ | IPV4_FAILURE_FATAL=yes | ||
+ | IPV6INIT=no | ||
+ | NAME=" | ||
- | </ | + | # then start interface |
+ | ifup ib0 | ||
- | **filesystems** | + | # check the route, then mount /home on this interface |
- | This is currently not working as expected. In my first attempts I'd specify sda1 (size=500), sda2 (size=2048, type=swap) and sda3 (size=fill), but what I end up with looks like a standard layout. Any sizes are also ignored. So for now I just pick the ones I want (sda1, sda4, sda7). | + | route |
+ | # Your output may vary | ||
- | This also happens after I remove any UUID references in /etc/fstab, clean up /etc/mtab and clean any and all files in / | + | Kernel IP routing table |
+ | Destination | ||
+ | 10.11.0.0 | ||
+ | link-local * | ||
+ | link-local | ||
+ | 192.168.0.0 | ||
+ | default | ||
- | < | + | # home from sharptail |
+ | 10.11.103.42:/ | ||
- | fdisk -l | ||
- | |||
- | Disk /dev/sda: 80.0 GB, 80026361856 bytes | ||
- | 255 heads, 63 sectors/ | ||
- | Units = cylinders of 16065 * 512 = 8225280 bytes | ||
- | | ||
- | I/O size (minimum/ | ||
- | Disk identifier: 0x000ce092 | ||
- | |||
- | Device Boot Start | ||
- | / | ||
- | / | ||
- | / | ||
- | / | ||
- | / | ||
- | / | ||
- | / | ||
- | |||
- | df -h | ||
- | |||
- | | ||
- | / | ||
- | | ||
- | / | ||
</ | </ | ||
- | |||
- | Warewulf 3.6.99 and CentOS 6.5 | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: |
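
The "Check IB ports" loop in the new revision is cut off in this diff view, so here is a minimal sketch of the same idea rather than the author's exact command. It assumes the standard Linux sysfs layout ''/sys/class/infiniband/<device>/ports/<port>/state''. The function name ''ib_port_states'' and its optional base-directory argument are hypothetical and exist only so the loop can be pointed at a test tree.

```shell
#!/bin/sh
# Print the link state of every InfiniBand port found under a sysfs-style tree.
# Assumption: standard layout <base>/<device>/ports/<port>/state.
# ib_port_states is a hypothetical helper name, not part of any package.
ib_port_states() {
    base="${1:-/sys/class/infiniband}"
    for state_file in "$base"/*/ports/*/state; do
        # Skip the unexpanded glob when no HCA (or no such directory) is present.
        [ -r "$state_file" ] || continue
        # Each state file holds one line such as "4: ACTIVE" or "1: DOWN".
        printf '%s: %s\n' "$state_file" "$(cat "$state_file")"
    done
}

# With no argument the function reads the live sysfs tree.
ib_port_states
```

On a node without an HCA the function prints nothing, so it can safely run on every node. A port that stays in the DOWN state usually means no cable is attached or no subnet manager (for example ''opensm'') is running on the fabric.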