This shows you the differences between two versions of the page.
cluster:88 [2010/08/11 17:32] hmeij |
cluster:88 [2010/08/17 19:56] (current) hmeij |
||
---|---|---|---|
Line 79: | Line 79: | ||
* name: kusu101prov, | * name: kusu101prov, | ||
* eth1: 10.10.101.254/ | * eth1: 10.10.101.254/ | ||
- | * name: kusupriv, type: other | + | * name: kusu101priv, type: other |
* 4 - gateway & dns: gateway 192.168.101.0 (is not used but required field), dns server 192.168.101.254 (installer node) | * 4 - gateway & dns: gateway 192.168.101.0 (is not used but required field), dns server 192.168.101.254 (installer node) | ||
* 5 - host: FQDN kusu101, PCD kusu101 (basically we will not provide internet accessible names) | * 5 - host: FQDN kusu101, PCD kusu101 (basically we will not provide internet accessible names) | ||
Line 177: | Line 177: | ||
Now reboot the entire cluster and confirm that the changes are permanent. Sidebar: for Pace, you can now assign eth1 on the installer node a pace.edu IP, and have the necessary changes made to the ProCurve switch, so your users can log into the installer/ | Now reboot the entire cluster and confirm that the changes are permanent. Sidebar: for Pace, you can now assign eth1 on the installer node a pace.edu IP, and have the necessary changes made to the ProCurve switch, so your users can log into the installer/ | ||
+ | Actually, a better idea: create another node group template from your _BSS template, remove eth1, use the naming convention login#N, and set the starting IP to something like 192.168.101.10 ... call this node group _BSS_login or so. Then start addhost and add the new host to this node group. | ||
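To get a feel for what the login#N convention and a starting IP of 192.168.101.10 would produce, here is a rough preview sketch. It does not touch Kusu at all: the function name `preview_login_nodes` is hypothetical, and it assumes numbering starts at 1 and that addresses are handed out consecutively, which may not match what addhost actually does.

```shell
#!/bin/bash
# Hypothetical preview of login node names and addresses.
# Assumption: numbering starts at 1 and IPs are assigned consecutively.
preview_login_nodes() {
    local count="$1"                 # how many login nodes to list
    local prefix="${2:-192.168.101}" # first three octets of the private network
    local first="${3:-10}"           # last octet of the starting IP
    local i=1
    while [ "$i" -le "$count" ]; do
        echo "login$i $prefix.$((first + i - 1))"
        i=$((i + 1))
    done
}
```

Calling `preview_login_nodes 2` lists login1 and login2 with the first two addresses of the range.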
===== Step 5 ===== | ===== Step 5 ===== | ||
Line 266: | Line 267: | ||
More fun. Parallel jobs can be submitted over Ethernet interconnects but will not achieve the performance of InfiniBand interconnects, of course. | More fun. Parallel jobs can be submitted over Ethernet interconnects but will not achieve the performance of InfiniBand interconnects, of course. | ||
- | * yum install libibverbs; pdsh yum install libibverbs -q -y | + | * yum install libibverbs |
+ | * pdsh yum install libibverbs -q -y | ||
* yum install gcc-c++ | * yum install gcc-c++ | ||
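The pattern in the list above (install on the installer node first, then on every compute node through pdsh) could be wrapped in a small helper. This is only a sketch: the function name `install_everywhere` and the `DRY_RUN` switch are hypothetical, and it assumes pdsh already knows its target nodes, as the bare `pdsh yum ...` call on this page implies.

```shell
#!/bin/bash
# Hypothetical helper: install one package locally and on all compute nodes.
# Assumption: pdsh's target node list is already configured on the installer node.
install_everywhere() {
    local pkg="$1"
    local run=""
    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Installer node first, quiet and non-interactive.
    $run yum install "$pkg" -q -y
    # Then the same install on every node pdsh targets.
    $run pdsh yum install "$pkg" -q -y
}
```

Try `DRY_RUN=1 install_everywhere libibverbs` first to see the two commands before running them for real.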
Line 272: | Line 274: | ||
* download tarball, stage in / | * download tarball, stage in / | ||
- | * cd /opt; tar zxvf / | + | * cd /opt; tar zxvf / |
+ | * pdsh "cd /opt; tar zxvf / | ||
* examples in / | * examples in / | ||
* export PATH=/ | * export PATH=/ | ||
Line 280: | Line 283: | ||
Ok, so now we need to write a script to submit a parallel job. A parallel job is submitted with command ' | Ok, so now we need to write a script to submit a parallel job. A parallel job is submitted with command ' | ||
+ | |||
+ | * irun | ||
< | < | ||
#!/bin/bash | #!/bin/bash | ||
+ | |||
+ | rm -f err out | ||
#BSUB -e err | #BSUB -e err | ||
Line 296: | Line 303: | ||
which mpirun | which mpirun | ||
- | mpirun / | + | / |
- | mpirun / | + | / |
</ | </ | ||
+ | * 'bsub < irun' (submits) | ||
+ | * ' | ||
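For reference, a job script of the kind described above could be generated like this. It is a sketch under assumptions, not the script from this page: the function name `write_job_script`, the job name `partest`, the slot count and the launch command are all placeholders, and only the `#BSUB -e err` line and the `which mpirun` check are taken from the original.

```shell
#!/bin/bash
# Hypothetical helper: write a minimal LSF job script to the file named in $1.
# Assumption: job name, slot count and launch command are placeholders, not values from this page.
write_job_script() {
    local script="$1"
    local slots="${2:-4}"
    local launch="${3:-/path/to/launcher /path/to/mpi_program}"
    cat > "$script" <<EOF
#!/bin/bash
#BSUB -J partest
#BSUB -n $slots
#BSUB -e err
#BSUB -o out
# Show which mpirun is first in PATH, then start the (placeholder) program.
which mpirun
$launch
EOF
}
```

Something like `write_job_script irun 8` followed by `bsub < irun` would then submit it, once the placeholder launch line is replaced with the real one.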
+ | |||
+ | ===== Step 7 ===== | ||
+ | |||
+ | Tools. As you add nodes, monitoring tools are added to Ganglia and Cacti. | ||
+ | But first we must fix firefox. | ||
+ | * ' | ||
+ | * ' | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * http:// | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: |