cluster:89

Differences

This shows you the differences between two versions of the page.

cluster:89 [2010/08/19 15:15]
hmeij
cluster:89 [2010/08/31 20:25]
hmeij
Line 34: Line 34:
  
 Netmask is, finally, 255.255.0.0 (excluding public 129.133 subnet).
 +
 +===== Infiniband =====
 +
 +[[http://h20000.www2.hp.com/bizsupport/TechSupport/Home.jsp?lang=en&cc=vn&prodTypeId=12883&prodSeriesId=3758753&lang=en&cc=vn|HP Link]] 
 +
 +  * Voltaire 4036
 +  * 519571-B21
 +  * Voltaire InfiniBand 4X QDR 36-Port Managed Switch
 +
 +
 +Tasks: configuration, fine-tuning, bottleneck identification, monitoring, administration. Investigate [[http://www.voltaire.com/Products/Unified_Fabric_Manager|Voltaire UFM]]?
  
 ===== DM380G7 =====
-[[http://h10010.www1.hp.com/wwpc/us/en/sm/WF31a/15351-15351-3328412-241644-241475-4091412.html|HP Link]] (head node)
 +[[http://h10010.www1.hp.com/wwpc/us/en/sm/WF31a/15351-15351-3328412-241644-241475-4091412.html|HP Link]] (head node)\\
 +[[http://vimeo.com/9938744|External Link]] video about hardware
  
   * Dual power (one to UPS, one to utility, do later)
Line 127: Line 139:
     * clone nodes, deploy management agent on nodes
       * PXEboot and wake-on-lan must be done manually in BIOS
 +      * pre_reconf.sh (/localscratch partition?) and reconf.sh (NIC2 definition)
     * not sure we can implement CMU HA
 +    * collectl/colplot seems nice
  
   * Sun Grid Engine (SGE)
Line 152: Line 166:
  
   * Lava.  Install from source and evaluate.
 +
 +  * Location
 +    * remove 2 BSS racks (to pace.edu?), rack #3 & 4
 +    * add an L6-30 if needed (have 3? check)
 +    * fill remaining 2 BSS racks with good 24 GB servers, then power them off
  
 \\
 **[[cluster:0|Back]]**
cluster/89.txt · Last modified: 2010/11/22 19:05 by hmeij