This shows you the differences between two versions of the page.
cluster:89 [2010/08/12 15:53] hmeij |
cluster:89 [2010/08/17 15:20] hmeij |
||
---|---|---|---|
Line 5: | Line 5: | ||
Notes for the cluster design conference with HP. | Notes for the cluster design conference with HP. | ||
+ | |||
+ | "do later" means we tackle it after the HP on-site visit. | ||
===== S & H ===== | ===== S & H ===== | ||
Line 10: | Line 12: | ||
* Shipping Address: 5th floor data center | * Shipping Address: 5th floor data center | ||
* No 13' | * No 13' | ||
- | * Delivery on standard raised dock, no ways to lift out of truck | + | * Delivery on standard raised dock, no way to lift the rack out of the truck if not docked |
* Freight Elevator and pallet jack available | * Freight Elevator and pallet jack available | ||
- | ===== Head Node ===== | + | ===== Network |
- | * Dual power (one to UPS, one to utility) | + | Basically ... |
- | * hostname greentail | + | |
- | * eth0, provision, 192.168.102.254/ | + | * x.y.z.254 is head or log in node |
- | * eth1, data/ | + | * x.y.z.0 is gateway |
+ | * x.y.z.< | ||
+ | * x.y.z.25 (up to 253) is for all compute nodes | ||
+ | |||
+ | We are planning to ingest our Dell cluster (37 nodes) and our Blue Sky Studios cluster (130 nodes) into this setup, hence the approach. | ||
+ | |||
+ | Netmask is, finally, 255.255.0.0 (excluding public 129.133 subnet). | ||
+ | |||
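The addressing convention above can be sketched in a few lines of Python. This is only an illustration, not part of the conference notes: the function names `role_of` and `same_private_net` are hypothetical, the 192.168.0.0 base is inferred from the head node addresses listed for eth0 and eth3, and the one range the notes leave truncated is reported as "unspecified" rather than guessed.

```python
import ipaddress

# Netmask from the notes; the 192.168.0.0 base is an assumption inferred
# from the head node addresses on the provision and ipmi interfaces.
PRIVATE_NET = ipaddress.ip_network("192.168.0.0/255.255.0.0")


def role_of(address: str) -> str:
    """Classify an address by its last octet, following the scheme above."""
    last_octet = int(ipaddress.ip_address(address)) & 0xFF
    if last_octet == 254:
        return "head"
    if last_octet == 0:
        return "gateway"
    if 25 <= last_octet <= 253:
        return "compute"
    return "unspecified"  # ranges the notes do not spell out


def same_private_net(a: str, b: str) -> bool:
    """True when both addresses fall inside the 255.255.0.0 network."""
    return (ipaddress.ip_address(a) in PRIVATE_NET
            and ipaddress.ip_address(b) in PRIVATE_NET)


if __name__ == "__main__":
    print(role_of("192.168.102.254"))
    print(role_of("192.168.102.25"))
    print(same_private_net("192.168.102.254", "192.168.103.254"))
```

With a 255.255.0.0 mask the provision (192.168.102.x) and ipmi (192.168.103.x) addresses share one network, which is presumably what lets additional clusters be folded in under further third-octet values.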
+ | ===== DL380 G7 ===== | ||
+ | [[http:// | ||
+ | |||
+ | * Dual power (one to UPS, one to utility, do later) | ||
+ | |||
+ | | ||
+ | * eth0, provision, 192.168.102.254/ | ||
+ | * eth1, data/ | ||
* eth2, public, 129.133.1.226/ | * eth2, public, 129.133.1.226/ | ||
- | * eth3, ipmi, do later?, (greentail-ipmi) | + | * eth3, ipmi, 192.168.103.254/ |
* ib0, ipoib, 10.10.103.254/ | * ib0, ipoib, 10.10.103.254/ | ||
- | * ib1, split ipoib traffic? (might not have cables), 10.10.104.254/ | + | * ib1, ipoib, 10.10.104.254/ |
* Raid 1 mirrored disks (2x250gb) | * Raid 1 mirrored disks (2x250gb) | ||
* /home mount point for home directory volume ~ 10tb | * /home mount point for home directory volume ~ 10tb | ||
- | * /home/apps mount point for software volume ~ 1tb | + | * /home/apps mount point for software volume ~ 1tb (contains / |
* / | * / | ||
* logical volume LOCALSCRATCH: | * logical volume LOCALSCRATCH: | ||
- | * logical | + | * logical |
- | * logical volumes ROOT/ | + | |
+ | ===== StorageWorks MSA60 ===== | ||
+ | [[http:// | ||
+ | |||
+ | * Dual power (one to UPS, one to utility, do later) | ||
+ | |||
+ | * Three volumes to start with: | ||
+ | * home (raid 6, design a backup path, do later), 10 tb | ||
+ | * apps (raid 6, design a backup path, do later), 1 tb | ||
+ | * sanscratch (raid 1, no backup), 5 tb | ||
+ | |||
+ | * Systems Insight Manager (SIM) [[http:// | ||
+ | * Do we need a Windows box (virtual) to run the Manager on? | ||
+ | * install, configure | ||
+ | * requires an Oracle install? no, hpsmdb (PostgreSQL) is set up by the automatic installation | ||
+ | * linux deployment utilities, and management agents installation | ||
+ | * configure managed systems, automatic discovery | ||
+ | * configure automatic event handling | ||
+ | |||
+ | |||
+ | ===== SL2x170z G6 ===== | ||
+ | [[http:// | ||
+ | |||
+ | * node names hp000, increment by 1 | ||
+ | * eth0, provision, 192.168.102.25(increment by 1)/ | ||
+ | * eth1, data/ | ||
+ | * eth2, ipmi, 192.168.103.25(increment by 1)/ | ||
+ | * ib0, ipoib, 10.10.103.25(increment by 1)/ | ||
+ | * ib1, ipoib, 10.10.104.25(increment by 1)/ | ||
+ | |||
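As a rough sketch of the naming and numbering rule in the list above, the following Python generates a per-node address plan. It assumes, without confirmation from the notes, that node hp000 takes host number 25 on every interface and that each later node adds one to both the name and the host number. The function name `node_plan` is hypothetical, and eth1 is left out because its address is not legible here.

```python
FIRST_OCTET = 25   # first compute-node host number in the notes
LAST_OCTET = 253   # upper bound given in the Network section

# Interface -> network prefix, copied from the list above.
# eth1 is omitted because its address is truncated in the notes.
PREFIXES = {
    "eth0": "192.168.102",  # provision
    "eth2": "192.168.103",  # ipmi
    "ib0": "10.10.103",     # ipoib
    "ib1": "10.10.104",     # ipoib
}


def node_plan(index: int) -> dict:
    """Return the name and addresses for the index-th node (0-based).

    Assumption: hp000 maps to host number 25 and both increase together.
    """
    octet = FIRST_OCTET + index
    if index < 0 or octet > LAST_OCTET:
        raise ValueError(f"node index {index} is outside the 25..253 range")
    plan = {"name": f"hp{index:03d}"}
    for iface, prefix in PREFIXES.items():
        plan[iface] = f"{prefix}.{octet}"
    return plan


if __name__ == "__main__":
    for i in range(3):
        print(node_plan(i))
```

Under these assumptions the 25..253 range leaves room for 229 compute nodes per subnet, so `node_plan(228)` is the last valid entry.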
+ | * /home mount point for home directory volume ~ 10tb | ||
+ | * /home/apps mount point for software volume ~ 1tb (contains | ||
+ | * / | ||
+ | * logical volume LOCALSCRATCH: | ||
+ | * logical volumes ROOT/ | ||
+ | |||
+ | |||
+ | ===== Misc ===== | ||
+ | |||
+ | * IPoIB | ||
+ | * configuration, | ||
+ | * monitor | ||
+ | |||
+ | * Cluster Management Utility (CMU) | ||
+ | * install, configure, monitor | ||
+ | * golden image capture, deploy (there will initially only be one image) | ||
- | ===== Disk Array ===== | + | * Sun Grid Engine (SGE) |
+ | * install, configure | ||
+ | * there will only be one queue (hp12) | ||
- | * Dual power (one to UPS, one to utility) | + | * KVM utility |
+ | * functionality | ||
- | ===== Compute Nodes ===== | + | * Placement |
+ | * where in data center (do later), based on environmental works | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: |