cluster:89

Differences

This shows you the differences between two versions of the page.

cluster:89 [2010/08/17 20:10]
hmeij
cluster:89 [2010/08/18 21:39]
hmeij
Line 22: Line 22:
     * depending on switch IP in 192.168.102.x or 10.10.102.x     * depending on switch IP in 192.168.102.x or 10.10.102.x
     * voltaire console can be stuffed in either     * voltaire console can be stuffed in either
 +
 +  * head node will be connected to our private network via two link-aggregated ethernet cables in the 10.10.x.y range so current home directories can be mounted somewhere (these dirs will not be available on the back end nodes).
  
   * x.y.z.255 is broadcast   * x.y.z.255 is broadcast
Line 42: Line 44:
     * do we need an iLo eth? in range 192.168.104.254?     * do we need an iLo eth? in range 192.168.104.254?
   * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)   * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
-  * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu) +  * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu, we provide cable connection)
   * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0,  (greentail-ipmi, should go to better switch ProCurve 2910, do later)   * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0,  (greentail-ipmi, should go to better switch ProCurve 2910, do later)
     * see discussion iLo/IPMI under CMU     * see discussion iLo/IPMI under CMU
Line 49: Line 51:
  
   * Raid 1 mirrored disks (2x250gb)   * Raid 1 mirrored disks (2x250gb)
-  * /home mount point for home directory volume ~ 10tb +  * /home mount point for home directory volume ~ 10tb (contains /home/apps/src) 
-  * /home/apps mount point for software volume ~ 1tb (contains /home/apps/src) +  * /snapshot mount point for snapshot volume ~ 10tb  
-  * /home/sanscratch mount point for sanscratch volume ~ 5 tb+  * /sanscratch mount point for sanscratch volume ~ 5 tb
   * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)   * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)
   * logical volumes ROOT/VAR/BOOT/TMP: defaults   * logical volumes ROOT/VAR/BOOT/TMP: defaults
Line 66: Line 68:
  
   * Three volumes to start with:    * Three volumes to start with: 
-    * home (raid 6, design a backup path, do later), 10 tb +    * home (raid 6), 10 tb 
-    * apps (raid 6, design a backup path, do later), 1tb +    * snapshot (raid 6), 10 tb ... see todos. 
-    * sanscratch (raid 1, no backup), 5 tb+    * sanscratch (raid 1 or 0, no backup), 5 tb
  
   * SIM   * SIM
Line 86: Line 88:
     * ib1, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)     * ib1, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
  
-    * /home mount point for home directory volume ~ 10tb +    * /home mount point for home directory volume ~ 10tb (contains /home/apps/src) 
-    * /home/apps mount point for software volume ~ 1tb (contains /home/apps/src) +    * /snapshot mount point for snapshot volume ~ 10tb  
-    * /home/sanscratch mount point for sanscratch volume ~ 5 tb+    * /sanscratch mount point for sanscratch volume ~ 5 tb
     * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (60 gb left for OS)     * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (60 gb left for OS)
     * logical volumes ROOT/VAR/BOOT/TMP: defaults     * logical volumes ROOT/VAR/BOOT/TMP: defaults
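
The "increment by 1" addressing noted for the ib1 interfaces in this hunk could be generated rather than typed by hand. The following is a minimal sketch using Python's standard `ipaddress` module. The node count, the `hp{index:03d}-ib1` name pattern and the starting index of 1 are assumptions made for illustration; the page only gives the first address (10.10.104.25), the netmask and the `hp000-ib1` naming hint.

```python
import ipaddress


def node_addresses(first_ip, netmask, count,
                   name_format="hp{index:03d}-ib1", first_index=1):
    """Return (hostname, ip) pairs, incrementing the address by 1 per node.

    name_format and first_index are hypothetical; adjust them to the
    naming scheme actually chosen for the cluster.
    """
    # strict=False lets us pass a host address together with its netmask.
    network = ipaddress.ip_network(f"{first_ip}/{netmask}", strict=False)
    start = ipaddress.ip_address(first_ip)
    pairs = []
    for offset in range(count):
        address = start + offset
        # Guard against running off the end of the subnet.
        if address not in network:
            raise ValueError(f"{address} is outside {network}")
        name = name_format.format(index=first_index + offset)
        pairs.append((name, str(address)))
    return pairs


if __name__ == "__main__":
    # Print /etc/hosts style lines for four hypothetical nodes.
    for name, ip in node_addresses("10.10.104.25", "255.255.0.0", 4):
        print(f"{ip}\t{name}")
```

The same function could be reused for the other per-node interfaces by passing a different first address and name pattern.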
Line 100: Line 102:
     * monitor     * monitor
  
-  * Systems Insight Manager (SIM) [[http://h18013.www1.hp.com/products/servers/management/hpsim/index.html?jumpid=go/hpsim|HP Link]] (Linux Install and Configure Guide, and User Guide) +  * Systems Insight Manager (SIM)  
 +  * [[http://h18013.www1.hp.com/products/servers/management/hpsim/index.html?jumpid=go/hpsim|HP Link]] (Linux Install and Configure Guide, and User Guide)
     * Do we need a windows box (virtual) to run the Central Management Server on?     * Do we need a windows box (virtual) to run the Central Management Server on?
     * SIM + Cluster Monitor (MSCS)?     * SIM + Cluster Monitor (MSCS)?
Line 109: Line 112:
     * configure automatic event handling     * configure automatic event handling
  
-  * Cluster Management Utility (CMU)[[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install -- Installation Guide v4.2, Users Guides) +  * Cluster Management Utility (CMU, up to 4,096 nodes) 
-  * iLo/IPMI +  * [[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install -- Installation Guide v4.2, Users Guides)
     * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]], do the blades have a management card?     * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]], do the blades have a management card?
     * well maybe not, IPMI ([[http://en.wikipedia.org/wiki/Ipmi|External Link]]) can be scripted to power on/off, not sure about iLo (all web based)      * well maybe not, IPMI ([[http://en.wikipedia.org/wiki/Ipmi|External Link]]) can be scripted to power on/off, not sure about iLo (all web based) 
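
As a rough illustration of the "IPMI can be scripted to power on/off" remark above, the sketch below builds `ipmitool` command lines for a list of nodes. It assumes `ipmitool` is installed, that the management controllers accept the `lanplus` (IPMI v2.0) interface, and that the password is supplied through the `IPMI_PASSWORD` environment variable (the `-E` option). The host names and the user name are hypothetical. By default it only prints the commands.

```python
import subprocess

VALID_ACTIONS = ("on", "off", "status")


def ipmi_power_command(host, user, action):
    """Build an ipmitool command for one chassis power action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    # -E makes ipmitool read the password from $IPMI_PASSWORD,
    # so it does not show up in the process list.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", action]


def run_power_action(hosts, user, action, dry_run=True):
    """Print (dry run) or execute the power action on every host."""
    for host in hosts:
        cmd = ipmi_power_command(host, user, action)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical IPMI host names; nothing is executed in dry-run mode.
    run_power_action(["hp001-ipmi", "hp002-ipmi"], "admin", "status")
```

Whether iLo offers an equivalent scriptable interface is left open, as in the notes above.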
Line 119: Line 122:
     * install X and CMU GUI client node     * install X and CMU GUI client node
     * start CMU, start client, scan for nodes, build golden image     * start CMU, start client, scan for nodes, build golden image
-    * +    * install monitoring client when building golden image node via CMU GUI 
 +    * clone nodes, deploy management agent on nodes 
 +    * not sure we can implement CMU HA
  
   * Sun Grid Engine (SGE)   * Sun Grid Engine (SGE)
Line 133: Line 138:
     * where in data center (do later), based on environmental works     * where in data center (do later), based on environmental works
  
 +===== ToDo =====
 +
 +All to be done later, after the HP cluster is up.
 +
 +  * Backups.  /snapshot volume
 +  * Use trickery with Linux and rsync to provide snapshots? [[http://forum.synology.com/enu/viewtopic.php?f=9&t=11471|External Link]] and another [[http://www.mikerubel.org/computers/rsync_snapshots/|External Link]]
 +    * Exclude very large files?
 +    * petaltail:/root/snapshot.sh or rotate_backups.sh as examples
 +    * or better [[http://www.rsnapshot.org/|http://www.rsnapshot.org/]]
 +
 +  * Lava.  Install from source and evaluate.
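
The rsync snapshot idea in the ToDo list above usually relies on hard links: unchanged files in a new snapshot are linked to the previous one, so each snapshot looks complete but only changed files take extra space. Below is a minimal sketch of that technique, not the contents of `snapshot.sh` or `rotate_backups.sh` (which are not shown on this page). The `daily.N` directory naming, the retention count and the size limit are assumptions; `-a`, `--delete`, `--link-dest` and `--max-size` are standard rsync options.

```python
import shutil
from pathlib import Path


def rotate_snapshots(root, keep, prefix="daily"):
    """Shift prefix.0 -> prefix.1 -> ... and drop the oldest snapshot."""
    root = Path(root)
    oldest = root / f"{prefix}.{keep - 1}"
    if oldest.exists():
        shutil.rmtree(oldest)
    # Rename from the oldest remaining snapshot down to the newest.
    for n in range(keep - 2, -1, -1):
        src = root / f"{prefix}.{n}"
        if src.exists():
            src.rename(root / f"{prefix}.{n + 1}")


def build_rsync_command(source, root, prefix="daily", max_size=None):
    """Build the rsync call that fills prefix.0 after a rotation."""
    root = Path(root)
    newest = root / f"{prefix}.0"
    previous = root / f"{prefix}.1"
    cmd = ["rsync", "-a", "--delete"]
    if max_size:
        # Skip very large files, e.g. max_size="10G".
        cmd.append(f"--max-size={max_size}")
    if previous.exists():
        # Hard-link unchanged files against the previous snapshot.
        cmd.append(f"--link-dest={previous}")
    # Trailing slash: copy the contents of source, not the directory itself.
    cmd += [str(source).rstrip("/") + "/", str(newest)]
    return cmd


if __name__ == "__main__":
    # Only prints the command; pass it to subprocess.run to execute it.
    print(" ".join(build_rsync_command("/home", "/snapshot", max_size="10G")))
```

A nightly job would call `rotate_snapshots` on the /snapshot volume first and then run the generated command. rsnapshot, mentioned in the list, packages the same approach with a configuration file.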
  
 \\ \\
 **[[cluster:0|Back]]** **[[cluster:0|Back]]**
cluster/89.txt · Last modified: 2010/11/22 19:05 by hmeij