cluster:139

Differences

This shows you the differences between two versions of the page.

cluster:139 [2015/04/08 17:45]
hmeij
cluster:139 [2018/08/15 15:00]
hmeij07 [3.99]
Line 2: Line 2:
 **[[cluster:0|Back]]**
  
-===== Warewulf LBL =====
+===== Warewulf Stateless =====
  
   * [[http://warewulf.lbl.gov/trac]] Warewulf is a scalable systems management suite originally developed to manage large high-performance Linux clusters.  It is my replacement for Project Kusu, since IBM bought up Platform LSF and ditched the hpccommunnity.org web site, grrrh. (old info at https://dokuwiki.wesleyan.edu/doku.php?id=cluster:88)
  
 +Get RPMs and install.
 +
 +<code>
 +
 +[root@petaltail ~]# wget -O /etc/yum.repos.d/warewulf-rhel6.repo http://warewulf.lbl.gov/downloads/repo/warewulf-rhel6.repo
 +--2015-04-07 08:45:56--  http://warewulf.lbl.gov/downloads/repo/warewulf-rhel6.repo                                        
 +Resolving warewulf.lbl.gov... 128.3.7.27                                                                                   
 +Connecting to warewulf.lbl.gov|128.3.7.27|:80... connected.                                                                
 +HTTP request sent, awaiting response... 200 OK                                                                             
 +Length: 126 [text/plain]                                                                                                   
 +Saving to: “/etc/yum.repos.d/warewulf-rhel6.repo”
 +
 +100%[======================================================================================>] 126         --.-K/s   in 0s
 +
 +2015-04-07 08:45:56 (22.9 MB/s) - “/etc/yum.repos.d/warewulf-rhel6.repo” saved [126/126]
 +
 +[root@petaltail ~]# yum install warewulf-common warewulf-cluster warewulf-provision 
 +
 +Installed:
 +  warewulf-cluster.x86_64 0:3.6-1.el6          warewulf-common.noarch 0:3.6-1.el6        warewulf-icr.x86_64 0:3.6-1.el6
 +  warewulf-provision.x86_64 0:3.6-1.el6
 +
 +Dependency Installed:
 +  dhcp.x86_64 12:4.1.1-43.P1.el6.centos.1    tftp-server.x86_64 0:0.49-7.el6    warewulf-provision-server.x86_64 0:3.6-1.el6
 +  warewulf-vnfs.noarch 0:3.6-1.el6
 +
 +Complete!
 +
 +</code>
 +
 +Install MySQL, set the MySQL ''root'' user's password, and store it in /etc/warewulf/database-root.conf, which should not be world-readable.
 +
 +<code>
 +                                              
 +[root@petaltail warewulf]# vi /etc/warewulf/database-root.conf 
 +[root@petaltail warewulf]# service mysqld status
 +
 +mysql> set password for 'root'@'localhost' = PASSWORD('some_string');
 +Query OK, 0 rows affected (0.00 sec)                                
 +
 +[root@petaltail warewulf]# chmod o-r /etc/warewulf/database-root.conf                                  
 +     
 +</code>
 +
 +In the provision config file I turned dynamic_hosts, hostfile and localdomain off (I'll manage those manually) and my private network is run over eth0 (192.168.0.0/255.255.0.0).
 +
 +<code>
 +
 +[root@petaltail warewulf]# vi /etc/warewulf/provision.conf 
 +
 +</code>
 +
 +Next comes a piece of mystery. When executing ''wwinit ALL'' I ran into a "lethal" error, no less. After much digging I found the answer in the mailing list thread below; surprisingly, there is little in the documentation on this.
 +
 +  * Lethal error thrown by module: /usr/libexec/warewulf/wwinit/91-vnfs.init
 +  * answer is here https://groups.google.com/a/lbl.gov/forum/#!topic/warewulf/5OhfZCf_eqE
 +  * quoting 
 +
 +<code>
 +What it appears you'll need to do initially is build out a chroot
 +directory. For example, for a Scientific Linux 6.x install, you'd do:
 +
 +    # wwmkchroot sl-6 /var/chroots/sl-6     (or whatever path you want
 +to store them at -- run wwmkchroot -h for the help)
 +
 +  That will give the base chroot for the VNFS at /var/chroots/sl-6.
 +You'll then set the environment variable CHROOTDIR when you execute
 +wwinit.
 +
 +    # CHROOTDIR=/var/chroots/sl-6 wwinit VNFS
 +
 +  I know when you use the 'icr' wwinit functions, it creates a base
 +chroot based upon the host OS. But none of the other wwinit scripts,
 +by default, create one; So you will need to build the chroot and
 +specify where the directory is. Then when the VNFS part of wwinit is
 +ran, it will build the VNFS and import it into the datastore at that
 +time, setting the one specified as the default VNFS in the
 +configuration.
 +</code>
 +
 +So we start by making the chroot directory; first we'll build a generic centos-6. Then we initialize the Warewulf environment with ''CHROOTDIR'' pointing at it.
 +
 +<code>
 +
 +[root@petaltail ~]# wwmkchroot centos-6 /var/chroots/centos-6
 +
 +[root@petaltail ~]# CHROOTDIR=/var/chroots/centos-6  wwinit ALL
 +database:     Checking /etc/rc.d/init.d/mysqld is installed                  OK
 +database:     Confirming mysqld is configured to start at boot:
 +database:      + chkconfig mysqld on                                         OK
 +database:     Checking to see if MySQL needs to be started:
 +database:      + service mysqld start                                        OK
 +wwsh:         Confirming that wwsh accepts some basic commands
 +wwsh:          + wwsh quit                                                   OK
 +wwsh:          + wwsh help                                                   OK
 +wwsh:          + wwsh node new testnode0000                                  OK
 +wwsh:          + wwsh node list                                              OK
 +wwsh:          + wwsh node delete testnode0000                               OK
 +domain:       Setting default node domain to: "cluster"                      OK
 +authfiles:    Checking to see if /etc/passwd is in the WW Datastore          OK
 +authfiles:    Checking if /etc/passwd is part of default node configuration
 +authfiles:    Checking to see if /etc/group is in the WW Datastore           OK
 +authfiles:    Checking if /etc/group is part of default node configuration
 +nfsd:         Setting domain "cluster" for IDMAPD/NFSv4                      OK
 +nfsd:          + chkconfig nfs on                                            OK
 +nfsd:          + service nfs restart                                         OK
 +nfsd:          + exportfs -a                                                 OK
 +ntpd:         Configured NTP services
 +ntpd:          + chkconfig ntpd on                                           OK
 +ntpd:          + service ntpd restart                                        OK
 +ssh_keys:     Checking ssh keys for root                                     OK
 +ssh_keys:     Checking root's ssh config                                     OK
 +ssh_keys:     Checking for default RSA1 host key for nodes                   OK
 +ssh_keys:     Checking for default RSA host key for nodes                    OK
 +ssh_keys:     Checking for default DSA host key for nodes                    OK
 +tftp:          + /sbin/chkconfig xinetd on                                   OK
 +tftp:          + /sbin/chkconfig tftp on                                     OK
 +tftp:          + /sbin/service xinetd restart                                OK
 +bootstrap:    Creating bootstrap for 2.6.32-504.8.1.el6.x86_64:
 +bootstrap:     + wwbootstrap 2.6.32-504.8.1.el6.x86_64                       OK
 +vnfs:         Building the VNFS image and importing into Warewulf:
 +vnfs:          + wwvnfs -y --hybridpath=/var/chroots/centos-6 --chroot /var/ OK
 +
 +
 +</code>
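The two commands above can be wrapped so that the chroot path is only typed once. This is a minimal sketch and not part of Warewulf: the ''run'' helper, the ''DRY_RUN'' switch and the ''bootstrap_warewulf'' function are names made up for illustration. By default it only prints what it would execute.

```shell
#!/bin/bash
# Sketch only: run(), DRY_RUN and bootstrap_warewulf are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really execute the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: bootstrap_warewulf <wwmkchroot template> [chroot base directory]
bootstrap_warewulf() {
    local template="$1" base="${2:-/var/chroots}"
    local chrootdir="$base/$template"
    # 1. build the base chroot (skipped if the directory already exists)
    if [ ! -d "$chrootdir" ]; then
        run wwmkchroot "$template" "$chrootdir" || return 1
    fi
    # 2. initialise Warewulf, telling the vnfs step where the chroot lives
    run env CHROOTDIR="$chrootdir" wwinit ALL
}

bootstrap_warewulf centos-6 /var/chroots
```

Run it once as is to review the commands, then again with ''DRY_RUN=0'' as root.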
 +
 +Next restart MySQL, httpd and xinetd. Open your firewall on the private network.
 +
 +  * edit /etc/sysconfig/iptables and restart iptables
 +
 +<code>
 +
 +# local allow
 +-A INPUT -i eth0 -d 192.168.0.0/16 -p tcp --dport 0:65535 -j ACCEPT
 +-A INPUT -i eth0 -d 192.168.0.0/16 -p udp --dport 0:65535 -j ACCEPT
 +
 +</code>
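The two rules differ only in the protocol, so they can be generated for any interface and network. A minimal sketch follows; ''private_net_rules'' is a made-up helper name, not an iptables or Warewulf tool.

```shell
#!/bin/bash
# Sketch only: private_net_rules is an illustrative helper.
# Prints the "allow everything on the private network" rules shown above.
# usage: private_net_rules <interface> <network/prefix>
private_net_rules() {
    local iface="$1" net="$2" proto
    echo "# local allow"
    for proto in tcp udp; do
        echo "-A INPUT -i $iface -d $net -p $proto --dport 0:65535 -j ACCEPT"
    done
}

private_net_rules eth0 192.168.0.0/16
```

The output is meant to be pasted into /etc/sysconfig/iptables ahead of any REJECT rule, followed by ''service iptables restart''.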
 +
 +Next, restart some Warewulf services. I had to edit /etc/warewulf/dhcpd-template.conf several times and ended up hardcoding the values for network, netmask and ipaddr. The reason: this server is also my KVM virtualization test host, so eth0 is bridged to br0, and Warewulf, which looks for the first ethernet device, picks up eth1 instead.
 +
 +<code>
 +
 +[root@petaltail ~]# wwsh dhcp update
 +Rebuilding the DHCP configuration
 +Done.
 +
 +[root@petaltail ~]# wwsh pxe update
 +No nodes found
 +
 +[root@petaltail ~]# wwsh
 +Warewulf> node new b51 --netdev=eth0 --hwaddr=00:00:00:00:00:00 --ipaddr=192.168.1.5 --groups=wwnodes
 +Warewulf> provision set --lookup groups wwnodes --vnfs=centos-6 --bootstrap=2.6.32-504.8.1.el6.x86_64
 +Are you sure you want to make the following changes to 1 node(s):
 +
 +     SET: BOOTSTRAP            = 2.6.32-504.8.1.el6.x86_64
 +     SET: VNFS                 = centos-6
 +
 +Yes/No> y
 +Warewulf> quit
 +
 +</code>
 +
 +Booting the node, we observe the eth0 MAC address being picked up by DHCPd, followed by the tftpboot process.  The node boots and, et voila, a CentOS 6.6 //stateless// compute node. Next, import the files that must be identical on all nodes and associate them with the nodes.
 +
 +<code>
 +
 +[root@petaltail ~]# wwsh file import /etc/passwd
 +
 +[root@petaltail ~]# wwsh file list
 +dynamic_hosts           : -rw-r--r-- 0   root root            10174 /etc/hosts
 +group                   : -rw-r--r-- 1   root root             6247 /etc/group
 +passwd                  : -rw-r--r-- 1   root root            20094 /etc/passwd
 +shadow                  : ---------- 1   root root            23663 /etc/shadow
 +
 +[root@petaltail ~]# wwsh provision set b[0-51] --fileadd passwd
 +
 +[root@petaltail ~]# wwsh provision print
 +#### b51.cluster ##############################################################
 +    b51.cluster: BOOTSTRAP        = 2.6.32-504.8.1.el6.x86_64
 +    b51.cluster: VNFS             = centos-6
 +    b51.cluster: FILES            = dynamic_hosts,group,passwd,shadow
 +    b51.cluster: PRESHELL         = FALSE
 +    b51.cluster: POSTSHELL        = FALSE
 +    b51.cluster: CONSOLE          = UNDEF
 +    b51.cluster: PXELINUX         = UNDEF
 +    b51.cluster: SELINUX          = DISABLED
 +    b51.cluster: KARGS            = "quiet"
 +    b51.cluster: BOOTLOCAL        = FALSE
 +...
 +
 +</code>
 +
 +Next, build a hybrid VNFS; this is the way to add packages to the nodes. NTP serves as the example.
 +
 +<code>
 +
 +cd /var/chroots/centos-6/
 +mkdir vnfs
 +
 +vi etc/fstab (inside of chroot area, edit)
 +192.168.1.217:/var/chroots/centos-6 /vnfs nfs defaults 0 0
 +
 +wwvnfs --chroot /var/chroots/centos-6 --hybridpath=/vnfs
 +Overwrite original: y
 +
 +# reboot node, observe mount, then add NTP to chroot
 +
 +yum --tolerant --installroot /var/chroots/centos-6 -y install ntp
 +
 +vi etc/init.d/ntpd (inside of chroot, edit)
 +# chkconfig: 35 58 74
 +
 +# then create the links etc/rc3.d/S58ntpd and etc/rc5.d/S58ntpd, both pointing to ../init.d/ntpd
 +
 +vi etc/ntp.conf (inside of chroot, sole contents are, IP is master on private network)
 +restrict default ignore
 +restrict 127.0.0.0
 +server 192.168.1.217
 +restrict 192.168.1.217 nomodify
 +
 +wwvnfs --chroot /var/chroots/centos-6 --hybridpath=/vnfs -y
 +VNFS 'centos-6 has been imported
 +
 +# reboot node
 +
 +</code>
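The runlevel links mentioned in the block above (and again for postfix further down) can be created with a small helper. This is a sketch under assumptions: ''enable_in_chroot'' is a made-up name, and the start priority must match the ''chkconfig'' header of the init script.

```shell
#!/bin/bash
# Sketch only: enable_in_chroot is an illustrative helper name.
# Creates S<prio><service> start links for runlevels 3 and 5 inside a chroot.
# usage: enable_in_chroot <chroot dir> <service> <start priority>
enable_in_chroot() {
    local chroot="$1" svc="$2" prio="$3" level
    if [ ! -e "$chroot/etc/init.d/$svc" ]; then
        echo "no init script for $svc in $chroot" >&2
        return 1
    fi
    for level in 3 5; do
        mkdir -p "$chroot/etc/rc$level.d"
        # relative target, so the link resolves once the image is the node's root
        ln -sfn "../init.d/$svc" "$chroot/etc/rc$level.d/S$prio$svc"
    done
}

# example: enable_in_chroot /var/chroots/centos-6 ntpd 58
```

After creating the links, rebuild the VNFS with ''wwvnfs'' as shown above so the nodes pick them up.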
 +
 +Then build the node up:
 +
 +  * use yum to install the openlava 2.2.1 RPM (pulls in tcl)
 +    * copy the openlava config files into the centos-6 area
 +  * use yum to install postfix 
 +    * add links in rc3.d and rc5.d
 +    * remove sendmail links
 +  * yum install perl
 +  * yum install munge
 +    * build RPMs from tar ball
 +  * rebuild the VNFS and reboot
 +
 +Sometimes you can edit the files in the chroot directory from the outside; sometimes you must ''chroot'' into the installroot and make the change there.  It's easiest to just do the latter all the time.
 +
 +<code>
 +
 +[root@petaltail ~]# cd /var/chroots
 +[root@petaltail chroots]# chroot centos-6
 +[root@petaltail /]# pwd                  
 +/                          <--- inside of installroot              
 +[root@petaltail /]# mkdir /var/log/munge 
 +[root@petaltail /]# chown munge:munge /var/log/munge
 +[root@petaltail /]# mkdir /var/log/slurm
 +[root@petaltail /]# chown slurm:munge /var/log/slurm
 +chown: invalid user: `slurm:munge'
 +
 +# the passwd|shadow|group files come from the Warewulf datastore, so the slurm user is unknown here; create the relevant lines first, then retry
 +
 +[root@petaltail /]# chown slurm:munge /var/log/slurm
 +[root@petaltail /]# exit
 +exit
 +[root@petaltail chroots]# ls
 +centos-6
 +
 +# back outside the chroot: edit rc.local and comment out the directives that create these dirs, etc.
 +[root@petaltail chroots]# vi centos-6/etc/rc.local
 +
 +# and don't forget
 +[root@petaltail chroots]# wwvnfs --chroot /var/chroots/centos-6 --hybridpath=/vnfs  -y
 +
 +# reboot node
 +</code>
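The "create the relevant lines" step is not spelled out above. One possible way to do it, as a sketch: ''ensure_line'' is a made-up helper, the page does not record which files were edited, and the uid/gid values in the example are hypothetical.

```shell
#!/bin/bash
# Sketch only: ensure_line is an illustrative helper name.
# Appends a passwd- or group-style entry unless an entry with that name exists.
# usage: ensure_line <file> <entry>
ensure_line() {
    local file="$1" entry="$2"
    local name="${entry%%:*}"   # text before the first colon
    if grep -q "^$name:" "$file" 2>/dev/null; then
        return 0                # already present, leave the file alone
    fi
    echo "$entry" >> "$file"
}

# example (uid/gid 500 is hypothetical; use the ids from the master's files):
# ensure_line /var/chroots/centos-6/etc/group  'slurm:x:500:'
# ensure_line /var/chroots/centos-6/etc/passwd 'slurm:x:500:500::/home/slurm:/sbin/nologin'
```

Because the helper checks for an existing entry first, it is safe to run again after every chroot rebuild.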
 +
 +
 +To build short hostnames you can create a template inside of the chroot environment.
 +
 +<code>
 +
 +#--- build file CHROOT/root/wwtemplates/network.ww
 + NETWORKING=yes
 + HOSTNAME=%{NODENAME}
 +#--- end 
 +
 +# add that file (using wwsh provision) to the nodes.
 +
 +[root@]# wwsh file import /var/chroots/centos-6/root/wwtemplates/network.ww \
 + --path=/etc/sysconfig/network --name=network.ww
 +
 +[root@]# wwsh provision set n[00-15] --fileadd=network.ww
 +
 +</code>
 +
 +Second interface: create a template inside of the chroot environment.
 +
 +<code>
 +
 +wwsh node set b49 --netdev=eth1 \
 +--hwaddr=00:00:00:00:00:00 --ipaddr=10.10.1.55 \
 +--netmask=255.255.0.0  --network=10.10.0.0
 +
 +#--- build file CHROOT/root/wwtemplates/ifcfg-eth1.ww
 +DEVICE=eth1
 +BOOTPROTO=static
 +ONBOOT=yes
 +HWADDR=%{NETDEVS::ETH1::HWADDR}
 +IPADDR=%{NETDEVS::ETH1::IPADDR}
 +NETMASK=%{NETDEVS::ETH1::NETMASK}
 +NETWORK=%{NETDEVS::ETH1::NETWORK}
 +#--- end 
 +
 +# add that file (using wwsh provision) to the nodes.
 +
 +[root@]# wwsh file import /var/chroots/centos-6/root/wwtemplates/ifcfg-eth1.ww \
 + --path=/etc/sysconfig/network-scripts/ifcfg-eth1 --name=ifcfg-eth1.ww
 +
 +[root@]# wwsh provision set n[00-15] --fileadd=ifcfg-eth1.ww
 +
 +</code>
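The template follows a regular pattern, so a file for another interface can be generated rather than typed. A minimal sketch: ''ifcfg_template'' is a made-up helper, and it assumes the variable key for any device is the upper-cased device name, as with ''ETH1'' above.

```shell
#!/bin/bash
# Sketch only: ifcfg_template is an illustrative helper, not a Warewulf command.
# Prints an ifcfg-<device>.ww template for the given device on stdout.
# usage: ifcfg_template <device>
ifcfg_template() {
    local dev="$1" key field
    # assumption: the template variable key is the upper-cased device name
    key="$(echo "$dev" | tr '[:lower:]' '[:upper:]')"
    echo "DEVICE=$dev"
    echo "BOOTPROTO=static"
    echo "ONBOOT=yes"
    for field in HWADDR IPADDR NETMASK NETWORK; do
        echo "$field=%{NETDEVS::$key::$field}"
    done
}

ifcfg_template eth1
```

Redirect the output into the wwtemplates directory of the chroot, then import it with ''wwsh file import'' as shown above.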
 +
 +Now let's put it all together; this can form the basis for a script.
 +
 +<code>
 +
 +# make sure it boots across network, alter BIOS settings
 +
 +wwsh node new b6 --netdev=eth0 \
 +--hwaddr=00:00:00:00:00:00 --ipaddr=192.168.1.12 \
 +--netmask=255.255.0.0  --network=192.168.0.0 \
 +--groups=wwnodes
 +
 +wwsh node set b6 --netdev=eth1 \
 +--hwaddr=00:00:00:00:00:00 --ipaddr=10.10.100.12 \
 +--netmask=255.255.0.0  --network=10.10.0.0
 +
 +wwsh provision set b6 --fileadd passwd,shadow,group
 +wwsh provision set b6 --fileadd hosts,bashrc,profile
 +wwsh provision set b6 --fileadd network.ww,ifcfg-eth1.ww
 +
 +
 +</code>
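As a starting point for such a script, the commands above can be folded into one function per node. This is a sketch under assumptions: ''run'', ''DRY_RUN'' and ''add_node'' are made-up names, all addresses are placeholders, and it passes the network address (192.168.0.0, 10.10.0.0) to ''--network'', which is an assumption about what that option expects. ''wwsh'' may still ask for confirmation when run for real.

```shell
#!/bin/bash
# Sketch only: run(), DRY_RUN and add_node are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really execute the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: add_node <name> <eth0 mac> <eth0 ip> <eth1 mac> <eth1 ip>
add_node() {
    local name="$1" mac0="$2" ip0="$3" mac1="$4" ip1="$5"
    # provisioning interface on the private network
    run wwsh node new "$name" --netdev=eth0 --hwaddr="$mac0" --ipaddr="$ip0" \
        --netmask=255.255.0.0 --network=192.168.0.0 --groups=wwnodes || return 1
    # second interface
    run wwsh node set "$name" --netdev=eth1 --hwaddr="$mac1" --ipaddr="$ip1" \
        --netmask=255.255.0.0 --network=10.10.0.0 || return 1
    # files and templates imported earlier
    run wwsh provision set "$name" \
        --fileadd passwd,shadow,group,hosts,bashrc,profile,network.ww,ifcfg-eth1.ww
}

add_node b6 00:00:00:00:00:00 192.168.1.12 00:00:00:00:00:00 10.10.100.12
```

With the default ''DRY_RUN=1'' the script only prints the three ''wwsh'' commands, which makes it easy to review them before adding a batch of nodes.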
 +
 +
 +[[cluster:144|Warewulf Golden Image]]
 +
 +Useful links
 +
 +  * [[http://www.admin-magazine.com/HPC/Articles/Warewulf-Cluster-Manager-Master-and-Compute-Nodes]]
 +  * [[http://www.admin-magazine.com/HPC/Articles/warewulf_cluster_manager_completing_the_environment]]
 +  * http://warewulf.lbl.gov/trac/wiki/Documentation
 +
 +==== 3.99 ====
 +
 +<code>
 +
 +# fetch the 3.6.99 testing tarballs
 +cd warewulf/
 +wget http://warewulf.lbl.gov/downloads/testing/3.6.99/warewulf-common-3.6.99.tar.gz
 +wget http://warewulf.lbl.gov/downloads/testing/3.6.99/warewulf-provision-3.6.99.tar.gz
 +wget http://warewulf.lbl.gov/downloads/testing/3.6.99/warewulf-cluster-3.6.99.tar.gz
 +wget http://warewulf.lbl.gov/downloads/testing/3.6.99/warewulf-vnfs-3.6.99.tar.gz
 +
 +# build tools
 +yum groupinstall "Development Tools"
 +
 +# build RPMs straight from the tarballs
 +rpmbuild -ta warewulf-common-3.6.99.tar.gz
 +rpmbuild -ta warewulf-cluster-3.6.99.tar.gz
 +rpmbuild -ta warewulf-vnfs-3.6.99.tar.gz
 +
 +# these -devel packages were installed before building warewulf-provision
 +yum install libselinux-devel
 +yum install libacl-devel libattr-devel
 +rpmbuild -ta warewulf-provision-3.6.99.tar.gz
 +
 +</code>
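The four ''rpmbuild'' calls above follow one pattern and can be looped. A minimal sketch: ''run'', ''DRY_RUN'' and ''build_warewulf_rpms'' are made-up names, and it assumes the tarballs are already downloaded and the build dependencies installed.

```shell
#!/bin/bash
# Sketch only: run(), DRY_RUN and build_warewulf_rpms are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to really execute the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: build_warewulf_rpms <version> [directory holding the tarballs]
build_warewulf_rpms() {
    local version="$1" dir="${2:-.}" pkg
    # same order as the commands above, with provision built last
    for pkg in common cluster vnfs provision; do
        run rpmbuild -ta "$dir/warewulf-$pkg-$version.tar.gz" || return 1
    done
}

build_warewulf_rpms 3.6.99
```

The loop stops at the first failing build, so a missing -devel package shows up right away instead of being buried in later output.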
 +
 +[[cluster:143|Warewulf Statefull]]  [[cluster:144|Warewulf Golden Image]]
 \\ \\
 **[[cluster:0|Back]]**
cluster/139.txt · Last modified: 2018/08/16 12:58 by hmeij07