Blue Sky Studios

Hardware

We have 4 racks, of which 3 are powered up, all on utility power including the head/login node. The racks run surprisingly cool compared to our Dell cluster. Some digging revealed that the AMD Opteron chips cycle down to 1 GHz when idle instead of running at 2.4 GHz all the time (you can observe this in /proc/cpuinfo).
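A quick way to watch this frequency scaling on a blade (plain Linux commands, nothing cluster-specific assumed): idle cores report roughly 1000 MHz and climb to roughly 2400 MHz under load.

  # show the current clock speed of every core
  grep "cpu MHz" /proc/cpuinfo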

If you want to use the switches, you need to power up the top two shelves within each rack or use an alternate source of power.

We wanted to separate the data traffic (NFS) from the software management and MPI traffic, so we will be leveraging both ethernet ports on each blade. In order to do that we changed the cabling. In our setup the top ProCurve switch is always the provision switch (192.168.1.y/255.255.255.0) and the bottom switch is the data switch (10.10.100.y/255.255.0.0). Port 48 of each switch cascades horizontally into the next switch, so that all 3 ProCurve switches of a given role form one network: provision or data.
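On a blade this translates into two interface configurations, one per subnet. A minimal sketch of what the CentOS network-scripts could look like for an example blade (the .12 host addresses are made up for illustration; the provisioning software writes these files for compute nodes, so this only shows the intent of the two subnets):

  # /etc/sysconfig/network-scripts/ifcfg-eth0  -- provision network
  DEVICE=eth0
  BOOTPROTO=static
  IPADDR=192.168.1.12
  NETMASK=255.255.255.0
  ONBOOT=yes

  # /etc/sysconfig/network-scripts/ifcfg-eth1  -- data network
  DEVICE=eth1
  BOOTPROTO=static
  IPADDR=10.10.100.12
  NETMASK=255.255.0.0
  ONBOOT=yes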

We bought 52 three-foot CAT6 ethernet cables for each rack. The original purple cables connecting blade to rack in the top two shelves within a rack connect to the bottom ethernet blade port (eth0). For the bottom two shelves, the purple cables connect to the top ethernet blade port (eth1). The rest of the ethernet blade ports were then connected with the three-foot cables. This results in each blade being connected to both the top and bottom switch. The math does not work out smoothly, though: 4 shelves with 13 blades is 52 eth[0|1] connections, but each switch has only 48 ports (minus the uplink port). So you end up with some blades not connected in each rack.

Our storage is provided by one of our NetApp filers (5TB). The filer is known as filer3a or filer13a and sits on our internal private network with IPs in the 10.10.0.y/255.255.0.0 network range. Two ethernet cables, link aggregated, connect our Dell cluster data switch to this private network (hence we have failover and possibly a 2 Gbit pipe). For simplicity's sake, we connected the first ProCurve switch into the Dell data switch rather than running more cables to the private network. This means that each blade mounts the file system (home directories) directly off the NetApp filer over the private network.
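From a blade's point of view this is an ordinary NFS mount of the home directories. A sketch of what the /etc/fstab entry might look like, assuming the filer exports a volume named /vol/home (the export path and mount options are illustrative, not copied from our actual configuration):

  # NFS-mount home directories from the NetApp filer over the private network
  filer3a:/vol/home   /home   nfs   rw,hard,intr,tcp   0 0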

Our head node has a similar setup (provision and data ports). This means that the BSS cluster is entirely on the private network and not reachable from our domain wesleyan.edu. Users must first log in to the head/login nodes of the Dell cluster and then reach the BSS head/login node passwordlessly via ssh keys. This has worked out, but with only one cluster and only two ports on the head node, there needs to be a connection to the outside world (for example, eth1 could become the connection to the external world, in which case the storage must be mounted over that connection as well; eth0 must stay on the 192.168 provision subnet).
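Setting up the passwordless hop is standard ssh key work. A minimal sketch, run from a user account on the Dell head node (bss-head is a placeholder for whatever name the BSS head/login node has on the private network):

  # generate a key pair if you do not already have one
  ssh-keygen -t rsa
  # copy the public key into the BSS head node's authorized_keys
  ssh-copy-id bss-head
  # subsequent logins should no longer prompt for a password
  ssh bss-head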

Management

For our operating system we chose CentOS 5.3 and burned the ISO images to cdrom. For our management software we chose Project Kusu, which can be found at http://www.hpccommunity.org/. Project Kusu is the open source counterpart of Platform.com's OCS software stack, a ROCKS-based but enhanced commercial version (which we run on the Dell cluster). For our scheduler we chose Lava, also found at this site, which is the open source counterpart of Platform.com's LSF scheduler. You can also find monitoring tools at this site; in addition to the Kusu Installer and Lava kits, we burned to cdrom the ISO images for the Ganglia, NTop and Cacti kits.

Once you have all these burned to cdrom, you are ready to step through 12 installation screens, which are fairly straightforward. The screens are described at http://www.hpccommunity.org/section/kusu-45/, along with Installation and Overview guides. Boot a selected blade (this will become the installer node, also referred to as the head or login node) from the Kusu Installer cdrom. Provide the information configuring the network, root account, local hard disks, etc. Towards the last step Kusu will ask for the kits you want installed. Feed it the CentOS, Lava, Ganglia, NTop and Cacti kits. After this step Kusu will finish the installation and reboot. One customization we inserted in this process is a new partition, /localscratch, of about 50GB.

After reboot, Kusu will have created a /depot with the CentOS repository in it. It can be manipulated with repoman (for example, take a snapshot before you change anything). Configuration information is loaded into PostgreSQL databases. A DHCP server is started, listening on the provision network. In /opt you'll find GNU compilations of many MPI flavors including OpenMPI, and a working installation of Lava can be queried (bhosts, bqueues, etc). Ganglia, NTop and Cacti will also be running and monitoring your installer node.
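A few quick sanity checks right after that first reboot; these are standard commands plus the Lava tools named above, except the repoman flag, which is an assumption on my part — check repoman's help output for the exact options on your Kusu version:

  df -h /localscratch        # confirm the custom ~50GB scratch partition exists
  service dhcpd status       # DHCP server should be running on the provision network
  repoman -l                 # list the repositories under /depot (verify flag with repoman -h)
  ls /opt                    # MPI flavors (OpenMPI etc.) and other kit software
  bhosts                     # Lava: hosts known to the scheduler
  bqueues                    # Lava: configured queues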

