Differences

This shows you the differences between two versions of the page.

cluster:89 [2010/08/14 01:46]
hmeij
cluster:89 [2010/08/17 18:51]
hmeij
Line 14: Line 14:
  * Delivery on standard raised dock, no way to lift rack out of truck if not docked
  * Freight Elevator and pallet jack available
 +
 +===== Network =====
 +
 +Basically ...
 +
 +  * x.y.z.255 is broadcast
 +  * x.y.z.254 is the head (login) node
 +  * x.y.z.0 is the gateway
 +  * x.y.z.1 up to x.y.z.24 is for all switches and console ports
 +  * x.y.z.25 up to x.y.z.253 is for all compute nodes
 +
 +We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence this addressing approach.
 +
 +The netmask is, in the end, 255.255.0.0 (excluding the public 129.133 subnet).
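
As a sanity check on the scheme above, here is a minimal Python sketch (an illustrative helper, not part of the cluster tooling) that lays out the plan for one third octet, e.g. the 192.168.102.x provision network:

<code python>
def address_plan(third_octet, prefix="192.168"):
    """Sketch of the convention above: .255 broadcast, .254 head node,
    .0 gateway, .1-.24 switches/console ports, .25-.253 compute nodes."""
    base = f"{prefix}.{third_octet}"
    return {
        "broadcast": f"{base}.255",
        "head_node": f"{base}.254",
        "gateway":   f"{base}.0",
        "switches_console": [f"{base}.{i}" for i in range(1, 25)],
        "compute_nodes":    [f"{base}.{i}" for i in range(25, 254)],  # .25 through .253
        "netmask": "255.255.0.0",
    }

plan = address_plan(102)            # the provision network, 192.168.102.x
print(plan["head_node"])            # 192.168.102.254 (greentail-eth0)
print(plan["compute_nodes"][0])     # 192.168.102.25  (hp000-eth0)
</code>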
  
===== DM380G7 =====
Line 20: Line 34:
  * Dual power (one to UPS, one to utility, do later)
  
-  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP burning 18-24% more efficient in power/cooling
+  * hostname [[http://www.ct.gov/dep/cwp/view.asp?A=2723&Q=325780|greentail]], another local "tail", also in reference to HP being 18-24% more efficient in power/cooling
  * eth0, provision, 192.168.102.254/255.255.0.0 (greentail-eth0, should go to better switch ProCurve 2910)
 +    * do we need an iLo eth? in range 192.168.104.254?
  * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
  * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu)
-  * eth3, ipmi, 10.10.103.254/255.255.0.0, (greentail-ipmi, do later)
-  * ib0, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib0)
-  * ib1, ipoib, 10.10.105.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
+  * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0, (greentail-ipmi, should go to better switch ProCurve 2910, do later)
+    * see discussion of iLo/IPMI under CMU
+  * ib0, ipoib, 10.10.103.254/255.255.0.0 (greentail-ib0)
+  * ib1, ipoib, 10.10.104.254/255.255.0.0 (greentail-ib1, configure, might not have cables!, split traffic across ports?)
  
  * Raid 1 mirrored disks (2x250gb)
Line 34: Line 50:
  * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)
  * logical volumes ROOT/VAR/BOOT/TMP: defaults
 +
 +  * IPoIB configuration
 +  * SIM configuration
 +  * CMU configuration
 +  * SGE configuration
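
For reference, a small Python sketch that renders the greentail interface plan above as RHEL-style ifcfg snippets; the addresses come from this page, but the file layout and keys are an assumption about the eventual install, not HP's procedure:

<code python>
# Head node interfaces as listed above (ipmi handled separately, "do later").
IFACES = {
    "eth0": ("192.168.102.254", "255.255.0.0"),   # provision
    "eth1": ("10.10.102.254",   "255.255.0.0"),   # data/private
    "eth2": ("129.133.1.226",   "255.255.255.0"), # public, greentail.wesleyan.edu
    "ib0":  ("10.10.103.254",   "255.255.0.0"),   # ipoib
    "ib1":  ("10.10.104.254",   "255.255.0.0"),   # ipoib, cables pending
}

def ifcfg(dev, ip, mask):
    """Return the body of a hypothetical ifcfg-<dev> file for one interface."""
    lines = [f"DEVICE={dev}", f"IPADDR={ip}", f"NETMASK={mask}", "ONBOOT=yes"]
    if dev.startswith("ib"):
        lines.append("TYPE=InfiniBand")   # IPoIB interfaces
    return "\n".join(lines) + "\n"

for dev, (ip, mask) in IFACES.items():
    print(f"# ifcfg-{dev}")
    print(ifcfg(dev, ip, mask))
</code>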
  
=====  StorageWorks MSA60  =====
Line 45: Line 66:
    * sanscratch (raid 1, no backup), 5 tb
  
-  * Systems Insight Manager (SIM)
-    * install, configure, monitor
-    * event actions
+  * SIM
  
- 
-===== Network ===== 
- 
-Basically ... 
- 
-  * x.y.z.255 is broadcast 
-  * x.y.z.254 is head or log in node 
-  * x.y.z.0 is gateway 
-  * x.y.z.<25 is for all switches and console ports 
-  * x.y.z.25(and up, but less than 250) is for all compute nodes 
- 
-We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence the approach. 
- 
-Netmask is, finally, 255.255.0.0 (banning public 129.133 subnet). 
  
===== SL2x170z G6 =====
Line 69: Line 74:
    * node names hp000, increment by 1 (see the addressing sketch at the end of this section)
    * eth0, provision, 192.168.102.25(increment by 1)/255.255.0.0 (hp000-eth0, should go to better switch ProCurve 2910)
 +      * do we need an iLo eth? in range 192.168.104.25(increment by 1)
 +      * CMU wants eth0 on NIC1 and PXEboot
    * eth1, data/private, 10.10.102.25(increment by 1)/255.255.0.0 (hp000-eth1, should go to ProCurve 2610)
-    * eth2, ipmi, 10.10.103.25(increment by 1)/255.255.0.0, (hp000-ipmi, do later)
-    * ib0, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib0)
-    * ib1, ipoib, 10.10.105.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
+    * eth2 (over eth1), ipmi, 192.168.103.25(increment by 1)/255.255.0.0, (hp000-ipmi, should go to better switch ProCurve 2910, do later)
+      * see discussion of iLo/IPMI under CMU
+    * ib0, ipoib, 10.10.103.25(increment by 1)/255.255.0.0 (hp000-ib0)
+    * ib1, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
  
    * /home mount point for home directory volume ~ 10tb
Line 80: Line 88:
    * logical volumes ROOT/VAR/BOOT/TMP: defaults
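
The node naming and addressing rule above (".25, increment by 1") can be made concrete with a short Python sketch; the helper names are placeholders, not the provisioning tool itself:

<code python>
# Networks for the compute nodes as listed above; the fourth octet starts at .25.
NETWORKS = {
    "eth0": "192.168.102",   # provision
    "eth1": "10.10.102",     # data/private
    "ipmi": "192.168.103",   # over eth1, do later
    "ib0":  "10.10.103",     # ipoib
    "ib1":  "10.10.104",     # ipoib, cables pending
}

def node_addresses(index):
    """Return (hostname, {alias: ip}) for compute node number `index` (0-based)."""
    name = f"hp{index:03d}"           # hp000, hp001, ...
    octet = 25 + index                # .25 and up, staying below .254
    assert octet <= 253, "compute nodes are limited to .25 through .253"
    return name, {f"{name}-{net}": f"{base}.{octet}" for net, base in NETWORKS.items()}

name, addrs = node_addresses(0)
print(name, addrs["hp000-eth0"])      # hp000 192.168.102.25
</code>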
  
 +  * SIM
  
===== Misc =====
Line 87: Line 96:
    * monitor
  
-  * Cluster Management Utility (CMU)
+  * Systems Insight Manager (SIM) [[http://h18013.www1.hp.com/products/servers/management/hpsim/index.html?jumpid=go/hpsim|HP Link]] (Linux Install and Configure Guide, and User Guide)
 +    * Do we need a Windows box (virtual) to run the Central Management Server on?
 +    * SIM + Cluster Monitor (MSCS)?
 +    * install, configure
 +    * requires an Oracle install? no, hpsmdb (PostgreSQL) is installed with the automatic installation
 +    * Linux deployment utilities, and management agent installation
 +    * configure managed systems, automatic discovery
 +    * configure automatic event handling
 + 
 +  * Cluster Management Utility (CMU) [[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install, User Guides)
 +  * iLo/IPMI
 +    * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]]; do the blades have a management card?
 +    * well, maybe not: IPMI ([[http://en.wikipedia.org/wiki/Ipmi|External Link]]) can be scripted to power on/off, not sure about iLo (all web based); see the sketch after this list
 +    * is the head node the management server? possibly; it needs access to the provision and public networks
 +    * we may need an iLo eth, in range ... 192.168.104.x? Consult the Hardware Preparation Guide.
 +    * CMU wants eth0 on NIC1 and PXEboot
    * install, configure, monitor
    * golden image capture, deploy (there will initially only be one image)
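
As noted in the iLo/IPMI item above, IPMI lends itself to scripting; a minimal Python sketch of what that could look like follows. It assumes ipmitool is installed, that the BMCs answer on the planned hp000-ipmi style addresses (192.168.103.x, "do later"), and uses placeholder credentials:

<code python>
import subprocess

IPMI_NET = "192.168.103"           # ipmi range planned above (do later)
USER, PASS = "admin", "changeme"   # placeholder credentials, not the real ones

def node_power(index, action="status"):
    """Run `ipmitool chassis power <action>` against one compute node's BMC."""
    host = f"{IPMI_NET}.{25 + index}"    # hp000-ipmi is .25, hp001-ipmi is .26, ...
    cmd = ["ipmitool", "-I", "lanplus", "-H", host,
           "-U", USER, "-P", PASS, "chassis", "power", action]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

print(node_power(0, "status"))     # query hp000
# node_power(0, "off"); node_power(0, "on")   # power off/on by hand
</code>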
Line 94: Line 118:
    * install, configure
    * there will only be one queue (hp12)
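
Since hp12 will be the only queue, job submission stays simple; a minimal sketch, assuming a standard SGE qsub on the PATH and a placeholder job script named run.sh:

<code python>
import subprocess

def submit(script="run.sh", queue="hp12"):
    """Submit a job script to the single queue via qsub (-cwd keeps the working dir)."""
    out = subprocess.run(["qsub", "-q", queue, "-cwd", script],
                         capture_output=True, text=True)
    return out.stdout.strip()      # qsub prints the assigned job id on success
</code>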
 +
 +===== Other =====
  
  * KVM utility