
Expansion

Donation of hardware by Blue Sky Studios

4 racks of blade servers, 52 servers per rack, will be dropped off tomorrow. Not all will be turned on, as roughly 5% of the servers are kaput, so the first step is to build one fully working rack. The memory footprint of the servers is either 12 or 24 GB. Each server holds dual single-core AMD Opteron Model 250 CPUs at 1.0 GHz.

Since there is no budget, we will be installing CentOS with Lava as the scheduler. More details to come.
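
As a rough preview of what job submission could look like once Lava is running (Lava uses the LSF-style bsub command), here is a minimal Python sketch; the queue name and job script are placeholders, not anything that has been configured yet.

  # Minimal sketch: submit a job through Lava's LSF-style "bsub" command.
  # The queue name and job script below are placeholders.
  import subprocess

  def submit(script, queue="normal"):
      cmd = [
          "bsub",
          "-q", queue,        # placeholder queue name
          "-o", "%J.out",     # stdout file, %J expands to the job id
          "-e", "%J.err",     # stderr file
          script,
      ]
      out = subprocess.run(cmd, capture_output=True, text=True, check=True)
      print(out.stdout.strip())   # e.g. "Job <123> is submitted to queue <normal>."

  if __name__ == "__main__":
      submit("./myjob.sh")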

Pictures

Two Angstrom racks on each side of Dell racks.

The closest two racks each hold 52 servers. Each server holds dual AMD Opteron model 248 chips running at 1.0 GHz, has a memory footprint of 12 GB, and holds a single 80 GB disk plus a dual-port NIC. Each rack also holds two HP ProCurve 2848 gigabit switches. Memory-wise, each rack holds 52*12 = 624 GB.

The two racks on the other side are similar to the front two, except the memory footprint of each server is 24 GB, which means each of those racks holds 52*24 = 1,248 GB. So all four Angstrom racks hold 208 servers that can access a total of 3,744 GB. Compare that to the Dell cluster, which, as of July 2009, holds a total of 320 GB. The Dell racks offer 288 job slots; the Angstrom racks could offer 416 job slots.
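
The totals above can be sanity-checked with a few lines of Python; the rack counts and per-server numbers are taken straight from the paragraphs above, and the two-slots-per-server figure simply reflects the dual-CPU servers.

  # Sanity check of the Angstrom rack totals quoted above.
  servers_per_rack = 52
  racks_12gb, racks_24gb = 2, 2
  slots_per_server = 2                      # dual single-core Opterons per server

  mem_rack_12 = servers_per_rack * 12       # 624 GB
  mem_rack_24 = servers_per_rack * 24       # 1,248 GB
  total_mem = racks_12gb * mem_rack_12 + racks_24gb * mem_rack_24

  total_servers = (racks_12gb + racks_24gb) * servers_per_rack
  total_slots = total_servers * slots_per_server

  print(mem_rack_12, mem_rack_24)           # 624 1248
  print(total_servers, total_mem)           # 208 3744
  print(total_slots)                        # 416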

Ha! But not so fast. Cooling-capacity-wise we could not turn them on today, and their electrical consumption would be a critical issue. Only one rack holding the 12 GB servers is currently slated to be turned into a cluster, while one of the 24 GB server racks is slated to become a virtualization farm for an ITS pilot project.

The rest are, for now, spare parts. And 5% of the servers have something broken.

But they are pretty cute-looking racks. Sort of animated. Thank you, Blue Sky!

Migration

Mental notes to self …

  • no need to block ssh access to swallowtail or petaltail
  • bring job count up, delete accounts files
  • make sure dhcpd is stopped on swallowtail
  • remove compute-01
  • back up 'good files'
  • release nodes on swallowtail
  • PXEboot
  • add to new cluster
  • custom image
  • scp over lsb.acct files from lsf_archive (see the sketch after this list)
  • also the disk and queue log files
  • had to stop|start lsf_daemons to get LSF going
  • reroute the power cords on side of rack
  • free ports 47 & 48 on both ethernet switches
  • reconfigure LSF queues (&licenses, remove & add)
  • open petaltail for business (jobs can be submitted)
  • keep swallowtail up for a few days (no access)
  • reimage swallowtail as master LSF candidate
  • reposition swallowtail as LSF master, petaltail as LSF secondary (and repository manager)
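
For the accounting and log file steps above, something along the lines of this Python sketch could do the copying; the lsf_archive path, the destination host, and the directory layout are illustrative assumptions, not the real paths.

  # Illustrative sketch: copy archived LSF accounting files (lsb.acct*) and
  # the event/queue log files to the new LSF master. All paths and the
  # destination below are assumptions, not the actual layout.
  import glob
  import subprocess

  ARCHIVE_DIR = "/lsf_archive"                             # assumed archive location
  DEST = "petaltail:/usr/local/lsf/work/cluster/logdir/"   # assumed destination

  def copy_logs():
      files = glob.glob(ARCHIVE_DIR + "/lsb.acct*")
      files += glob.glob(ARCHIVE_DIR + "/lsb.events*")     # event/queue logs, if archived here
      for path in sorted(files):
          # scp each file, preserving timestamps; check=True stops on the first failure
          subprocess.run(["scp", "-p", path, DEST], check=True)

  if __name__ == "__main__":
      copy_logs()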