  
Basically ...

  * configure all console port switches with an IP
    * depending on the switch, IP in 192.168.102.x or 10.10.102.x
    * voltaire console can be stuffed in either

  * head node will be connected to our private network via two link-aggregated ethernet cables in the 10.10.x.y range so current home directories can be mounted somewhere (these dirs will not be available on the back end nodes); see the bonding sketch below
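
A minimal sketch of what the two-cable aggregation could look like on a RHEL-style head node. The bond name (bond0), the slave interfaces (eth4/eth5), the address, and the LACP bonding mode are all assumptions to be confirmed against the switch configuration:

<code bash>
# /etc/sysconfig/network-scripts/ifcfg-bond0 -- bond name, slaves and address are placeholders
DEVICE=bond0
IPADDR=10.10.1.254        # replace with the address actually assigned on the private network
NETMASK=255.255.0.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"   # LACP assumed; must match the switch side

# /etc/sysconfig/network-scripts/ifcfg-eth4 (repeat for the second cable, e.g. eth5)
DEVICE=eth4
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
</code>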
  
  * x.y.z.255 is broadcast
    * do we need an iLo eth? in range 192.168.104.254?
  * eth1, data/private, 10.10.102.254/255.255.0.0 (greentail-eth1, should go to ProCurve 2610)
  * eth2, public, 129.133.1.226/255.255.255.0 (greentail.wesleyan.edu, we provide cable connection)
  * eth3 (over eth2), ipmi, 192.168.103.254/255.255.0.0 (greentail-ipmi, should go to better switch ProCurve 2910, do later)
    * see discussion iLo/IPMI under CMU
  
  * Raid 1 mirrored disks (2x250gb)
  * /home mount point for home directory volume ~ 10 tb (contains /home/apps/src)
  * /snapshot mount point for snapshot volume ~ 10 tb
  * /sanscratch mount point for sanscratch volume ~ 5 tb
  * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (should match nodes at 160 gb, leave rest for OS)
  * logical volumes ROOT/VAR/BOOT/TMP: defaults
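
A sketch of how the head node layout above might translate to /etc/fstab once the SAN volumes are presented. The device-mapper names, the volume group name, and the ext3 filesystem choice are assumptions, not confirmed details:

<code bash>
# /etc/fstab fragment -- device names and filesystem type are placeholders
/dev/mapper/homevol           /home          ext3  defaults  0 0
/dev/mapper/snapshotvol       /snapshot      ext3  defaults  0 0
/dev/mapper/sanscratchvol     /sanscratch    ext3  defaults  0 0
/dev/VolGroup00/LOCALSCRATCH  /localscratch  ext3  defaults  0 0

# carving the ~100 gb local scratch logical volume (volume group name is a guess)
lvcreate -L 100G -n LOCALSCRATCH VolGroup00
mkfs.ext3 /dev/VolGroup00/LOCALSCRATCH
</code>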
  
  * Three volumes to start with:
    * home (raid 6), 10 tb
    * snapshot (raid 6), 10 tb ... see the ToDo section below
    * sanscratch (raid 1 or 0, no backup), 5 tb
  
  * SIM
    * eth0, provision, 192.168.102.25(increment by 1)/255.255.0.0 (hp000-eth0, should go to better switch ProCurve 2910)
      * do we need an iLo eth? in range 192.168.104.25(increment by 1)
      * CMU wants eth0 on NIC1 and PXEboot
    * eth1, data/private, 10.10.102.25(increment by 1)/255.255.0.0 (hp000-eth1, should go to ProCurve 2610)
    * eth2 (over eth1), ipmi, 192.168.103.25(increment by 1)/255.255.0.0 (hp000-ipmi, should go to better switch ProCurve 2910, do later)
    * ib1, ipoib, 10.10.104.25(increment by 1)/255.255.0.0 (hp000-ib1, configure, might not have cables!)
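
Since all four node interfaces follow the same start-at-.25, increment-by-1 scheme, the /etc/hosts entries can be generated rather than typed out. The node count of 32 below is only an example:

<code bash>
# Generate /etc/hosts entries for the compute nodes (node count is an example).
for i in $(seq 0 31); do
    name=$(printf "hp%03d" $i)
    last=$((25 + i))
    echo "192.168.102.${last}  ${name}-eth0"
    echo "10.10.102.${last}    ${name}-eth1"
    echo "192.168.103.${last}  ${name}-ipmi"
    echo "10.10.104.${last}    ${name}-ib1"
done >> /etc/hosts
</code>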
  
    * /home mount point for home directory volume ~ 10 tb (contains /home/apps/src)
    * /snapshot mount point for snapshot volume ~ 10 tb
    * /sanscratch mount point for sanscratch volume ~ 5 tb
    * logical volume LOCALSCRATCH: mount at /localscratch ~ 100 gb (60 gb left for OS)
    * logical volumes ROOT/VAR/BOOT/TMP: defaults
    * configure automatic event handling
  
  * Cluster Management Utility (CMU) [[http://h20338.www2.hp.com/HPC/cache/412128-0-0-0-121.html|HP Link]] (Getting Started - Hardware Preparation, Setup and Install -- Installation Guide v4.2, Users Guides)
  * iLo/IPMI
    * HP iLo probably removes the need for IPMI, consult [[http://en.wikipedia.org/wiki/HP_Integrated_Lights-Out|External Link]]; do the blades have a management card? (see the ipmitool sketch after this list)
    * is head node the Management server? possibly, needs access to provision and public networks
    * we may need an iLo eth? in range ... 192.168.104.x? Consult the Hardware Preparation Guide.
    * CMU wants eth0 on NIC1 and PXEboot
    * install CMU management node
    * install X and CMU GUI client node
    * start CMU, start client, scan for nodes, build golden image
    * clone nodes, deploy management agent on nodes
    * install monitoring
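
If IPMI over LAN ends up being used alongside (or instead of) iLo, a quick per-node out-of-band check could look like the following; the address, username, and password are placeholders:

<code bash>
# Verify out-of-band access to a node's BMC/iLo (address and credentials are placeholders)
ipmitool -I lanplus -H 192.168.103.25 -U admin -P secret chassis power status
ipmitool -I lanplus -H 192.168.103.25 -U admin -P secret sensor list
</code>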
  
  * Sun Grid Engine (SGE)
    * where in data center (do later), based on environmental works
  
===== ToDo =====

All do later, after the HP cluster is up.

  * Backups. Use trickery with Linux and rsync to provide snapshots? [[http://forum.synology.com/enu/viewtopic.php?f=9&t=11471|External Link]] and another [[http://www.mikerubel.org/computers/rsync_snapshots/|External Link]]
    * Exclude very large files?
    * petaltail:/root/snapshot.sh for example
    * or better [[http://www.rsnapshot.org/|http://www.rsnapshot.org/]]
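
A minimal sketch of the hard-link rotation described in the links above, using rsync's --link-dest. The source, destination, and retention count are placeholders, and very large files would be skipped with --exclude:

<code bash>
#!/bin/bash
# Rotating hard-link snapshots in the style of the links above (paths are placeholders).
SRC=/home/
DST=/snapshot/home
KEEP=7

# shift older snapshots up by one, dropping the oldest
rm -rf ${DST}.${KEEP}
for i in $(seq $((KEEP - 1)) -1 0); do
    [ -d ${DST}.${i} ] && mv ${DST}.${i} ${DST}.$((i + 1))
done

# new snapshot: files unchanged since the last run become hard links into snapshot .1
rsync -a --delete --link-dest=${DST}.1 ${SRC} ${DST}.0
</code>

rsnapshot packages this same approach with configurable retention, which is probably the simpler route.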
 +
  * Lava. Install from source and evaluate.
  
\\
**[[cluster:0|Back]]**