Differences

This shows you the differences between two versions of the page.

cluster:89 [2010/08/14 01:46]
hmeij
cluster:89 [2010/08/17 18:51]
hmeij
Line 14: Line 14:
  * Delivery on standard raised dock, no way to lift rack out of truck if not docked
  * Freight Elevator and pallet jack available
 +
 +===== Network =====
 +
 +Basically ...
 +
 +  * x.y.z.255 is broadcast
 +  * x.y.z.254 is the head (login) node
 +  * x.y.z.0 is the gateway
 +  * x.y.z.1 up to x.y.z.24 is for all switches and console ports
 +  * x.y.z.25 up to x.y.z.253 is for all compute nodes
 +
 +We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence this addressing approach.
 +
 +The netmask is, in the end, 255.255.0.0 (excluding the public 129.133 subnet).
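
As a sanity check on the scheme above, here is a minimal Python sketch (an illustrative helper, not part of the cluster tooling) that lays out the plan for one third octet, e.g. the 192.168.102.x provision network:

<code python>
def address_plan(third_octet, prefix="192.168"):
    """Sketch of the convention above: .255 broadcast, .254 head node,
    .0 gateway, .1-.24 switches/console ports, .25-.253 compute nodes."""
    base = f"{prefix}.{third_octet}"
    return {
        "broadcast": f"{base}.255",
        "head_node": f"{base}.254",
        "gateway":   f"{base}.0",
        "switches_console": [f"{base}.{i}" for i in range(1, 25)],
        "compute_nodes":    [f"{base}.{i}" for i in range(25, 254)],  # .25 through .253
        "netmask": "255.255.0.0",
    }

plan = address_plan(102)            # the provision network, 192.168.102.x
print(plan["head_node"])            # 192.168.102.254 (greentail-eth0)
print(plan["compute_nodes"][0])     # 192.168.102.25  (hp000-eth0)
</code>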
  
===== DM380G7 =====
Line 20: Line 34:
  * Dual power (one to UPS, one to utility, do later)
  
-  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP burning 18-24% more efficient in power/cooling
+  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP being 18-24% more efficient in power/cooling
  * eth0, provision, 192.168.102.254/255.255.0.0 (greentail-eth0, should go to better switch ProCurve 2910)
 +    * do we need an iLo eth? in range 192.168.104.254?
  * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
  * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu)
-  * eth3, ipmi, 10.10.103.254/255.255.0.0, (greentail-ipmi, do later)
-  * ib0, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib0)
-  * ib1, ipoib, 10.10.105.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
+  * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0, (greentail-ipmi, should go to better switch ProCurve 2910, do later)
+    * see discussion of iLo/IPMI under CMU
+  * ib0, ipoib, 10.10.103.254/255.255.0.0 (greentail-ib0)
+  * ib1, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
  
  * Raid 1 mirrored disks (2x250gb)
Line 34: Line 50:
  * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)
  * logical volumes ROOT/VAR/BOOT/TMP: defaults
 +
 +  * IPoIB configuration
 +  * SIM configuration
 +  * CMU configuration
 +  * SGE configuration
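
For reference, a small Python sketch that renders the greentail interface plan above as RHEL-style ifcfg snippets; the addresses come from this page, but the file layout and keys are an assumption about the eventual install, not HP's procedure:

<code python>
# Head node interfaces as listed above (ipmi handled separately, "do later").
IFACES = {
    "eth0": ("192.168.102.254", "255.255.0.0"),   # provision
    "eth1": ("10.10.102.254",   "255.255.0.0"),   # data/private
    "eth2": ("129.133.1.226",   "255.255.255.0"), # public, greentail.wesleyan.edu
    "ib0":  ("10.10.103.254",   "255.255.0.0"),   # ipoib
    "ib1":  ("10.10.104.254",   "255.255.0.0"),   # ipoib, cables pending
}

def ifcfg(dev, ip, mask):
    """Return the body of a hypothetical ifcfg-<dev> file for one interface."""
    lines = [f"DEVICE={dev}", f"IPADDR={ip}", f"NETMASK={mask}", "ONBOOT=yes"]
    if dev.startswith("ib"):
        lines.append("TYPE=InfiniBand")   # IPoIB interfaces
    return "\n".join(lines) + "\n"

for dev, (ip, mask) in IFACES.items():
    print(f"# ifcfg-{dev}")
    print(ifcfg(dev, ip, mask))
</code>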
  
=====  StorageWorks MSA60  =====
Line 45: Line 66:
    * sanscratch (raid 1, no backup), 5 tb
  
-  * Systems Insight Manager (SIM)
-    * install, configure, monitor
-    * event actions
+  * SIM
  
- 
-===== Network ===== 
- 
-Basically ... 
- 
-  * x.y.z.255 is broadcast 
-  * x.y.z.254 is head or log in node 
-  * x.y.z.0 is gateway 
-  * x.y.z.<25 is for all switches and console ports 
-  * x.y.z.25(and up, but less than 250) is for all compute nodes 
- 
-We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence the approach. 
- 
-Netmask is, finally, 255.255.0.0 (banning public 129.133 subnet). 
  
===== SL2x170z G6 =====
Line 69: Line 74:
    * node names hp000, increment by 1 (see the addressing sketch at the end of this section)
    * eth0, provision, 192.168.102.25(increment by 1)/255.255.0.0 (hp000-eth0, should go to better switch ProCurve 2910)
 +      * do we need an iLo eth? in range 192.168.104.25(increment by 1)
 +      * CMU wants eth0 on NIC1 and PXEboot
    * eth1, data/private, 10.10.102.25(increment by 1)/255.255.0.0 (hp000-eth1, should go to ProCurve 2610)
-    * eth2, ipmi, 10.10.103.25(increment by 1)/255.255.0.0, (hp000-ipmi, do later)
-    * ib0, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib0)
-    * ib1, ipoib, 10.10.105.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
+    * eth2 (over eth1), ipmi, 192.168.103.25(increment by 1)/255.255.0.0, (hp000-ipmi, should go to better switch ProCurve 2910, do later)
+      * see discussion of iLo/IPMI under CMU
+    * ib0, ipoib, 10.10.103.25(increment by 1)/255.255.0.0 (hp000-ib0)
+    * ib1, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
  
    * /home mount point for home directory volume ~ 10tb
Line 80: Line 88:
    * logical volumes ROOT/VAR/BOOT/TMP: defaults
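
The node naming and addressing rule above (".25, increment by 1") can be made concrete with a short Python sketch; the helper names are placeholders, not the provisioning tool itself:

<code python>
# Networks for the compute nodes as listed above; the fourth octet starts at .25.
NETWORKS = {
    "eth0": "192.168.102",   # provision
    "eth1": "10.10.102",     # data/private
    "ipmi": "192.168.103",   # over eth1, do later
    "ib0":  "10.10.103",     # ipoib
    "ib1":  "10.10.104",     # ipoib, cables pending
}

def node_addresses(index):
    """Return (hostname, {alias: ip}) for compute node number `index` (0-based)."""
    name = f"hp{index:03d}"           # hp000, hp001, ...
    octet = 25 + index                # .25 and up, staying below .254
    assert octet <= 253, "compute nodes are limited to .25 through .253"
    return name, {f"{name}-{net}": f"{base}.{octet}" for net, base in NETWORKS.items()}

name, addrs = node_addresses(0)
print(name, addrs["hp000-eth0"])      # hp000 192.168.102.25
</code>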
  
 +  * SIM
  
===== Misc =====
Line 87: Line 96:
    * monitor
  
-  * Cluster Management Utility (CMU)
+  * Systems Insight Manager (SIM) [[http://h18013.www1.hp.com/products/servers/management/hpsim/index.html?jumpid=go/hpsim|HP Link]] (Linux Install and Configure Guide, and User Guide)
 +    * Do we need a Windows box (virtual) to run the Central Management Server on?
 +    * SIM + Cluster Monitor (MSCS)?
 +    * install, configure
 +    * requires an Oracle install? no, hpsmdb (PostgreSQL) is installed with the automatic installation
 +    * Linux deployment utilities, and management agent installation
 +    * configure managed systems, automatic discovery
 +    * configure automatic event handling
 + 
 +  * Cluster Management Utility (CMU) [[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install, User Guides)
 +  * iLo/IPMI
 +    * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]]; do the blades have a management card?
 +    * well, maybe not: IPMI ([[http://en.wikipedia.org/wiki/Ipmi|External Link]]) can be scripted to power on/off, not sure about iLo (all web based); see the sketch after this list
 +    * is the head node the management server? possibly; it needs access to the provision and public networks
 +    * we may need an iLo eth, in range ... 192.168.104.x? Consult the Hardware Preparation Guide.
 +    * CMU wants eth0 on NIC1 and PXEboot
    * install, configure, monitor
    * golden image capture, deploy (there will initially only be one image)
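
As noted in the iLo/IPMI item above, IPMI lends itself to scripting; a minimal Python sketch of what that could look like follows. It assumes ipmitool is installed, that the BMCs answer on the planned hp000-ipmi style addresses (192.168.103.x, "do later"), and uses placeholder credentials:

<code python>
import subprocess

IPMI_NET = "192.168.103"           # ipmi range planned above (do later)
USER, PASS = "admin", "changeme"   # placeholder credentials, not the real ones

def node_power(index, action="status"):
    """Run `ipmitool chassis power <action>` against one compute node's BMC."""
    host = f"{IPMI_NET}.{25 + index}"    # hp000-ipmi is .25, hp001-ipmi is .26, ...
    cmd = ["ipmitool", "-I", "lanplus", "-H", host,
           "-U", USER, "-P", PASS, "chassis", "power", action]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

print(node_power(0, "status"))     # query hp000
# node_power(0, "off"); node_power(0, "on")   # power off/on by hand
</code>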
Line 94: Line 118:
    * install, configure
    * there will only be one queue (hp12)
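
Since hp12 will be the only queue, job submission stays simple; a minimal sketch, assuming a standard SGE qsub on the PATH and a placeholder job script named run.sh:

<code python>
import subprocess

def submit(script="run.sh", queue="hp12"):
    """Submit a job script to the single queue via qsub (-cwd keeps the working dir)."""
    out = subprocess.run(["qsub", "-q", queue, "-cwd", script],
                         capture_output=True, text=True)
    return out.stdout.strip()      # qsub prints the assigned job id on success
</code>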
 +
 +===== Other =====
  
  * KVM utility