Differences

This shows you the differences between two versions of the page.

cluster:89 [2010/08/14 01:33]
hmeij
cluster:89 [2010/08/17 18:46]
hmeij
Line 14: Line 14:
   * Delivery on standard raised dock, no way to lift rack out of truck if not docked
   * Freight Elevator and pallet jack available
 +
 +===== Network =====
 +
 +The basic addressing scheme (see the sketch below):
 +
 +  * x.y.z.255 is the broadcast address
 +  * x.y.z.254 is the head (login) node
 +  * x.y.z.0 is the gateway
 +  * x.y.z.1 up to x.y.z.24 is for all switches and console ports
 +  * x.y.z.25 up to x.y.z.253 is for all compute nodes
 +
 +We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence the approach.
 +
 +Netmask is, finally, 255.255.0.0 (excluding the public 129.133 subnet).
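
A minimal Python sketch of the addressing convention above; the role boundaries come straight from the list, while the function name and the example octets are purely illustrative.

<code python>
# Sketch only: encode the planned x.y.z.N address roles listed above.
def address_role(last_octet):
    """Map the last octet of an x.y.z.N address to its planned role."""
    if last_octet == 255:
        return "broadcast"
    if last_octet == 254:
        return "head/login node"
    if last_octet == 0:
        return "gateway"
    if last_octet < 25:
        return "switches and console ports"
    return "compute nodes"  # 25 up to 253

for octet in (0, 1, 24, 25, 253, 254, 255):
    print("x.y.z.%-3d -> %s" % (octet, address_role(octet)))
</code>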
  
 ===== DM380G7 =====
Line 20: Line 34:
   * Dual power (one to UPS, one to utility, do later)
  
-  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP burning 18-24% more efficient in power/cooling
+  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP being 18-24% more efficient in power/cooling
   * eth0, provision, 192.168.102.254/255.255.0.0 (greentail-eth0, should go to better switch ProCurve 2910)
 +    * do we need an iLo eth? in range 192.168.104.254?
   * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
   * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu)
-  * eth3, ipmi, 10.10.103.254/255.255.0.0, (greentail-ipmi, do later)
-  * ib0, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib0)
-  * ib1, ipoib, 10.10.105.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
+  * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0 (greentail-ipmi, should go to better switch ProCurve 2910, do later)
+    * see discussion of iLo/IPMI under CMU
+  * ib0, ipoib, 10.10.103.254/255.255.0.0 (greentail-ib0)
+  * ib1, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
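
A hypothetical sketch of the head node interface plan above as RHEL-style ifcfg stanzas; the addresses and netmasks come from the list, while the file naming, the BOOTPROTO value, and the choice to print rather than write files are assumptions for illustration.

<code python>
# Sketch only: print ifcfg-style stanzas for greentail's planned interfaces.
# greentail-ipmi (192.168.103.254) rides the iLo/BMC over eth2, not an OS
# interface, so it is omitted here.
GREENTAIL_IFACES = [
    ("eth0", "192.168.102.254", "255.255.0.0"),    # provision
    ("eth1", "10.10.102.254",   "255.255.0.0"),    # data/private
    ("eth2", "129.133.1.226",   "255.255.255.0"),  # public
    ("ib0",  "10.10.103.254",   "255.255.0.0"),    # ipoib
    ("ib1",  "10.10.104.254",   "255.255.0.0"),    # ipoib
]

for dev, ip, mask in GREENTAIL_IFACES:
    print("# /etc/sysconfig/network-scripts/ifcfg-%s" % dev)
    print("DEVICE=%s" % dev)
    print("BOOTPROTO=static")
    print("IPADDR=%s" % ip)
    print("NETMASK=%s" % mask)
    print("ONBOOT=yes")
    print("")
</code>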
  
   * Raid 1 mirrored disks (2x250gb)
Line 34: Line 50:
   * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)
   * logical volumes ROOT/VAR/BOOT/TMP: defaults
 +
 +  * IPoIB configuration
 +  * SIM configuration
 +  * CMU configuration
 +  * SGE configuration
  
 =====  StorageWorks MSA60  =====
Line 45: Line 66:
     * sanscratch (raid 1, no backup), 5 tb
  
-  * Systems Insight Manager (SIM)
-    * install, configure, monitor
-    * event actions
+  * SIM
  
 ===== SL2x170z G6 =====
Line 53: Line 73:
  
     * node names hp000, increment by 1
-    * eth0, provision, 192.168.102.10(increment by 1)/255.255.0.0 (hp000-eth0, should go to better switch ProCurve 2910)
-    * eth1, data/private, 10.10.102.10(increment by 1)/255.255.0.0 (hp000-eth1, should go to ProCurve 2610)
-    * eth2, ipmi, 10.10.103.10(increment by 1)/255.255.0.0, (hp000-ipmi, do later)
-    * ib0, ipoib, 10.10.104.10(increment by 1)/255.255.0.0 (hp000-ib0)
-    * ib1, ipoib, 10.10.105.10(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
+    * eth0, provision, 192.168.102.25 (increment by 1)/255.255.0.0 (hp000-eth0, should go to better switch ProCurve 2910)
+      * do we need an iLo eth? in range 192.168.104.25 (increment by 1)
+    * eth1, data/private, 10.10.102.25 (increment by 1)/255.255.0.0 (hp000-eth1, should go to ProCurve 2610)
+    * eth2 (over eth1), ipmi, 192.168.103.25 (increment by 1)/255.255.0.0 (hp000-ipmi, should go to better switch ProCurve 2910, do later)
+      * see discussion of iLo/IPMI under CMU
+    * ib0, ipoib, 10.10.103.25 (increment by 1)/255.255.0.0 (hp000-ib0)
+    * ib1, ipoib, 10.10.104.25 (increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
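
A hypothetical sketch that expands the per-node numbering above (hp000 upward, last octet starting at 25) into /etc/hosts-style entries; the node count, the output format, and the hard 253 ceiling check are illustrative assumptions.

<code python>
# Sketch only: expand the compute node addressing plan into hosts entries.
NETWORKS = [
    ("192.168.102.%d", "%s-eth0"),  # provision
    ("10.10.102.%d",   "%s-eth1"),  # data/private
    ("192.168.103.%d", "%s-ipmi"),  # ipmi (over eth1)
    ("10.10.103.%d",   "%s-ib0"),   # ipoib
    ("10.10.104.%d",   "%s-ib1"),   # ipoib
]

def hosts_entries(node_count, first_octet=25):
    for i in range(node_count):
        node, octet = "hp%03d" % i, first_octet + i
        if octet > 253:  # 254 is the head node, 255 the broadcast
            raise ValueError("node %s falls outside the 25-253 compute range" % node)
        for ip_fmt, name_fmt in NETWORKS:
            yield "%-18s %s" % (ip_fmt % octet, name_fmt % node)

for entry in hosts_entries(3):  # e.g. hp000..hp002
    print(entry)
</code>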
  
     * /home mount point for home directory volume ~ 10tb
Line 65: Line 87:
     * logical volumes ROOT/VAR/BOOT/TMP: defaults
  
 +  * SIM
  
 ===== Misc =====
Line 72: Line 95:
     * monitor
  
-  * Cluster Management Utility (CMU)
+  * Systems Insight Manager (SIM) [[http://h18013.www1.hp.com/products/servers/management/hpsim/index.html?jumpid=go/hpsim|HP Link]] (Linux Install and Configure Guide, and User Guide)
 +    * Do we need a Windows box (virtual) to run the Central Management Server on?
 +    * SIM + Cluster Monitor (MSCS)?
 +    * install, configure
 +    * requires an Oracle install? no, hpsmdb (PostgreSQL) is installed with the automatic installation
 +    * Linux deployment utilities and management agents installation
 +    * configure managed systems, automatic discovery 
 +    * configure automatic event handling 
 + 
 +  * Cluster Management Utility (CMU) [[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install, Users Guides)
 +  * iLo/IPMI 
 +    * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]]; do the blades have a management card?
 +    * well, maybe not: IPMI ([[http://en.wikipedia.org/wiki/Ipmi|External Link]]) can be scripted to power on/off (see the sketch after this list); not sure about iLo (all web based)
 +    * is the head node the Management server? possibly; it needs access to the provision and public networks
 +  * we may need an iLo eth? in range ... 192.168.104.x
     * install, configure, monitor
     * golden image capture, deploy (there will initially only be one image)
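
As noted in the iLo/IPMI bullets, IPMI can be scripted to power nodes on and off. A hypothetical sketch, assuming ipmitool is installed and the BMCs answer on the hp???-ipmi addresses (192.168.103.x); the user name and password handling are purely illustrative.

<code python>
# Sketch only: scripted power control of one node's BMC via ipmitool.
import subprocess

def ipmi_power(host, action, user="admin", password="changeme"):
    """Run 'chassis power <action>' (status/on/off/cycle) against one BMC."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host,
           "-U", user, "-P", password, "chassis", "power", action]
    return subprocess.call(cmd)

ipmi_power("192.168.103.25", "status")  # hp000-ipmi
</code>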
Line 79: Line 116:
     * install, configure
     * there will only be one queue (hp12)
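
Since the plan is a single SGE queue (hp12), a hypothetical usage sketch, assuming the SGE client tools are on the path; the job script name is illustrative.

<code python>
# Sketch only: submit a job script to the single planned queue, hp12.
import subprocess

def submit_to_hp12(script):
    """qsub the given script into the hp12 queue and return qsub's exit code."""
    return subprocess.call(["qsub", "-q", "hp12", script])

submit_to_hp12("myjob.sh")
</code>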
 +
 +===== Other =====
  
   * KVM utility