HP HPC

Notes for the cluster design conference with HP.

“do later” means we tackle it after the HP on-site visit.

S & H

  • Shipping Address: 5th floor data center
  • No 13'6" trucks; a 12'6" truck or a box truck is OK
  • Delivery is to a standard raised dock; there is no way to lift the rack out of the truck if it is not docked
  • Freight Elevator and pallet jack available

Network

Basically, the layout is:

  • configure all console port switches with an IP
    • depending on the switch, the IP is in 192.168.102.x or 10.10.102.x
    • the Voltaire console can go in either range
  • the head node will be connected to our private network via two link-aggregated Ethernet cables in the 10.10.x.y range so that current home directories can be mounted somewhere (these directories will not be available on the back-end nodes)
  • x.y.z.255 is broadcast
  • x.y.z.254 is the head or login node
  • x.y.z.0 is the gateway
  • x.y.z.1 up to x.y.z.24 is for all switches and console ports
  • x.y.z.25 up to x.y.z.253 is for all compute nodes

We are planning to fold our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence this approach.

The netmask is, finally, 255.255.0.0 (excluding the public 129.133 subnet); a sketch of the addressing convention follows.
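
A minimal sketch of the addressing convention above, assuming .254 for the head node, .1 through .24 for switches and console ports, and .25 through .253 for compute nodes; the script itself is illustrative only and not part of the HP deliverables.

  # Addressing convention from these notes; node count and output are examples.
  NETWORKS = {
      "provision": "192.168.102",   # eth0
      "data":      "10.10.102",     # eth1, private
      "ipmi":      "192.168.103",   # do later
      "ipoib0":    "10.10.103",     # ib0
      "ipoib1":    "10.10.104",     # ib1
  }

  HEAD_OCTET = 254                  # head/login node
  FIRST_NODE = 25                   # compute nodes occupy .25 through .253

  def head_ip(network):
      """Head node (greentail) address on the given network."""
      return "%s.%d" % (NETWORKS[network], HEAD_OCTET)

  def node_ip(network, index):
      """Compute node hp000, hp001, ... address on the given network."""
      octet = FIRST_NODE + index
      if not FIRST_NODE <= octet <= 253:
          raise ValueError("only .25 through .253 are reserved for compute nodes")
      return "%s.%d" % (NETWORKS[network], octet)

  print(head_ip("provision"))                    # 192.168.102.254
  for i in range(3):                             # hp000 .. hp002
      print("hp%03d" % i, node_ip("provision", i), node_ip("data", i))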

DL380 G7

HP Link (head node)

  • Dual power (one to UPS, one to utility, do later)
  • hostname greentail, another local “tail” name, also a reference to HP being 18-24% more efficient in power/cooling
  • eth0, provision, 192.168.102.254/255.255.0.0 (greentail-eth0, should go to the better switch, ProCurve 2910); see the interface sketch after this list
    • do we need an iLO eth? in range 192.168.104.254?
  • eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
  • eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu, we provide the cable connection)
  • eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0 (greentail-ipmi, should go to the better switch, ProCurve 2910, do later)
    • see the iLO/IPMI discussion under CMU
  • ib0, ipoib, 10.10.103.254/255.255.0.0 (greentail-ib0)
  • ib1, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib1, configure; might not have cables! split traffic across ports?)
  • RAID 1 mirrored disks (2x 250 GB)
  • /home mount point for the home directory volume, ~10 TB (contains /home/apps/src)
  • /snapshot mount point for the snapshot volume, ~10 TB
  • /sanscratch mount point for the sanscratch volume, ~5 TB
  • logical volume LOCALSCRATCH: mount at /localscratch, ~100 GB (should match the nodes at 160 GB, leave the rest for the OS)
  • logical volumes ROOT/VAR/BOOT/TMP: defaults
  • IPoIB configuration
  • SIM configuration
  • CMU configuration
  • SGE configuration
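
A sketch of the interface plan above as RHEL-style ifcfg files, assuming a RHEL/CentOS-family install (the notes do not name the distribution); the output directory is a hypothetical staging area for review, and eth3/ipmi is omitted since it is a "do later" item.

  # Illustrative only: writes ifcfg files for the greentail interface plan.
  import os

  IFACES = {
      # device: (address, netmask)
      "eth0": ("192.168.102.254", "255.255.0.0"),    # provision
      "eth1": ("10.10.102.254",   "255.255.0.0"),    # data/private
      "eth2": ("129.133.1.226",   "255.255.255.0"),  # public
      "ib0":  ("10.10.103.254",   "255.255.0.0"),    # IPoIB
      "ib1":  ("10.10.104.254",   "255.255.0.0"),    # IPoIB, if cabled
  }

  OUTDIR = "ifcfg-preview"                           # hypothetical staging directory
  os.makedirs(OUTDIR, exist_ok=True)

  for dev, (addr, mask) in IFACES.items():
      with open(os.path.join(OUTDIR, "ifcfg-%s" % dev), "w") as f:
          f.write("DEVICE=%s\n" % dev)
          f.write("BOOTPROTO=static\n")
          f.write("IPADDR=%s\n" % addr)
          f.write("NETMASK=%s\n" % mask)
          f.write("ONBOOT=yes\n")

This only captures the addressing; the ib0/ib1 files will likely need additional IPoIB-specific settings, covered under the IPoIB item in Misc.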

StorageWorks MSA60

HP Link (storage device)

  • Dual power (one to UPS, one to utility, do later)
  • Three volumes to start with (see the capacity sketch after this list):
    • home (RAID 6), 10 TB
    • snapshot (RAID 6), 10 TB … see todos.
    • sanscratch (RAID 1 or 0, no backup), 5 TB
  • SIM
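
As a rough check on those volume sizes, a sketch of the usable-capacity arithmetic per RAID level; the drive size and drive counts below are assumptions for illustration, not figures from HP.

  # Back-of-the-envelope usable capacity per RAID level (sizes in TB).
  def raid6_usable(drives, drive_tb):
      """RAID 6 keeps two drives' worth of parity, so (N - 2) are usable."""
      return (drives - 2) * drive_tb

  def raid1_usable(drives, drive_tb):
      """RAID 1 mirrors everything, so half the raw capacity is usable."""
      return drives * drive_tb / 2

  def raid0_usable(drives, drive_tb):
      """RAID 0 stripes with no redundancy, so all raw capacity is usable."""
      return drives * drive_tb

  # Example with assumed 2 TB drives: a 7-drive RAID 6 set yields ~10 TB,
  # in line with the home and snapshot volumes above.
  print(raid6_usable(7, 2), raid1_usable(6, 2), raid0_usable(3, 2))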

SL2x170z G6

HP Link (compute nodes)

  • node names hp000, hp001, … (increment by 1)
  • eth0, provision, 192.168.102.25/255.255.0.0, last octet incremented by 1 per node (hp000-eth0, should go to the better switch, ProCurve 2910)
    • do we need an iLO eth? in range 192.168.104.25, incremented by 1 per node
    • CMU wants eth0 on NIC1 and PXE boot
  • eth1, data/private, 10.10.102.25/255.255.0.0, incremented by 1 per node (hp000-eth1, should go to ProCurve 2610)
  • eth2 (over eth1), ipmi, 192.168.103.25/255.255.0.0, incremented by 1 per node (hp000-ipmi, should go to the better switch, ProCurve 2910, do later)
    • see the iLO/IPMI discussion under CMU
  • ib0, ipoib, 10.10.103.25/255.255.0.0, incremented by 1 per node (hp000-ib0)
  • ib1, ipoib, 10.10.104.25/255.255.0.0, incremented by 1 per node (hp000-ib1, configure; might not have cables!)
  • /home mount point for the home directory volume, ~10 TB (contains /home/apps/src); see the mount sketch after this list
  • /snapshot mount point for the snapshot volume, ~10 TB
  • /sanscratch mount point for the sanscratch volume, ~5 TB
  • logical volume LOCALSCRATCH: mount at /localscratch, ~100 GB (60 GB left for the OS)
  • logical volumes ROOT/VAR/BOOT/TMP: defaults
  • SIM
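
The shared mount points above imply that the compute nodes mount the head node's volumes over the network. Assuming plain NFS served by greentail over the data network (the notes do not state the protocol or which network carries it), per-node fstab entries could be generated roughly like this; the server address and options are placeholders.

  # Assumption: /home, /snapshot and /sanscratch are NFS exports from the
  # head node; adjust server, options and network (e.g. IPoIB) as decided.
  SERVER = "10.10.102.254"          # greentail-eth1 on the data network
  SHARES = ["/home", "/snapshot", "/sanscratch"]

  for share in SHARES:
      print("%s:%s  %s  nfs  defaults  0 0" % (SERVER, share, share))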

Misc

  • IPoIB
    • configuration, fine tune
    • monitor
  • Systems Insight Manager (SIM)
  • HP Link (Linux Install and Configure Guide, and User Guide)
    • Do we need a Windows box (virtual) to run the Central Management Server on?
    • SIM + Cluster Monitor (MSCS)?
    • install, configure
    • requires an Oracle install? no, hpsmdb is installed with the automatic installation (PostgreSQL)
    • linux deployment utilities, and management agents installation
    • configure managed systems, automatic discovery
    • configure automatic event handling
  • Cluster Management Utility (CMU up to 4,096 nodes)
  • HP Link (Getting Started - Hardware Preparation, Setup and Install – Installation Guide v4.2, Users Guides)
    • HP iLO probably removes the need for IPMI; consult External Link; do the blades have a management card?
    • well, maybe not: IPMI (External Link) can be scripted to power on/off (see the ipmitool sketch after this list); not sure about iLO (all web based)
    • is the head node the management server? possibly; it needs access to the provision and public networks
    • we may need an iLO eth? in range … 192.168.104.x? Consult the Hardware Preparation Guide.
    • CMU wants eth0 on NIC1 and PXE boot
    • install CMU management node
    • install X and CMU GUI client node
    • start CMU, start client, scan for nodes, build golden image
    • install monitoring client when building golden image node via CMU GUI
    • clone nodes, deploy management agent on nodes
    • not sure we can implement CMU HA
  • Sun Grid Engine (SGE)
    • install, configure
    • there will only be one queue (hp12); see the submit sketch below
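
On the iLO/IPMI scripting question above: a sketch of scripted power control with ipmitool against the planned 192.168.103.x IPMI addresses; the interface, username and password are placeholders and depend on how the BMCs end up being configured.

  # Scripted power control over IPMI; requires ipmitool on the head node
  # and BMCs reachable on the 192.168.103.x network (credentials below
  # are placeholders, not real ones).
  import subprocess

  def ipmi_power(node_index, action="status"):
      """Run 'power on|off|cycle|status' against node hpNNN's BMC."""
      bmc = "192.168.103.%d" % (25 + node_index)
      cmd = ["ipmitool", "-I", "lanplus", "-H", bmc,
             "-U", "admin", "-P", "changeme", "power", action]
      return subprocess.call(cmd)

  ipmi_power(0, "status")           # query hp000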
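
And on the single SGE queue (hp12): a minimal job submission sketch using qsub's binary-submission mode; the command being run and the job name are placeholders, and the hp12 queue must exist before qsub will accept it.

  # Submit a trivial job to the hp12 queue; placeholders only.
  import subprocess

  qsub = ["qsub",
          "-q", "hp12",             # the single queue from these notes
          "-N", "testjob",          # job name (placeholder)
          "-cwd",                   # run in the submit directory
          "-b", "y",                # treat the command as a binary
          "sleep", "60"]
  subprocess.call(qsub)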

Other

  • KVM utility
    • functionality
  • Placement
    • where in data center (do later), based on environmental works

ToDo

All of these are “do later” items, to be tackled after the HP cluster is up.

  • Lava. Install from source and evaluate.

