| + | \\ | ||
| + | **[[cluster: | ||
| + | ====== Upgrading to LSF ====== | ||
| + | |||
| + | Why? Here is my summation of some items i wish to take advantage of: **[[cluster: | ||
| + | |||
| + | We're running Platform/ | ||
| + | |||
| + | ===== First Stumble ===== | ||
| + | |||
| + | What version to upgrade to? Well i thought that would be easy, the latest stable version which is LSF v7.0.1. | ||
| + | |||
| + | Our OCS version is 4.1.1 and the only " | ||
| + | |||
| + | In order to install a v7 " | ||
| + | |||
| + | Another option is to perform a manual install of v7 from source. | ||
| + | |||
| + | |||
| + | |||
| + | ===== Next Step ===== | ||
| + | |||
| + | * process a License Change request via [[http:// | ||
| + | * obtain a new license file | ||
| + | * download the LSF/HPC v6.2 roll at [[http:// | ||
| + | * plan the upgrade steps | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
===== Lots of Next Steps =====

#0 Shut off the NAT box.

Reset the root password, shut off the box.

#1a Close the head node to all ssh traffic (firewall to trusted user VLAN access only).

#1b Inactivate all queues, back up scratch dirs, stop all jobs (see the sketch below).

  * Make friends!
  * Take a snapshot of the jobs running (like the clumon jobs page or with '' ... '')
  * Stop all jobs: '' ... ''
  * Disable any cronjobs.
  * Reset sknauert's ...
#1c Back up all files needed to rebuild the io-node.

The io-node is currently a compute node (but not a member of any queue, and admin_closed). It has fiber channel (2 cards) to the Netapp storage device.

#1d Back up all files needed to rebuild the compute nodes.

This includes two varieties of nodes: a light weight node and a heavy weight node sample. Some of this should be customized with extend-compute.xml (minor changes for now ...). Rebuilding is documented on the [[https:// ... ]] — a rough sketch of what to grab follows.
#1e Stop the lava system across the cluster (a sketch follows).

''/ ... ''

  * also on the io-node!!
  * also on the head node!!
  * also on the head node run ''/ ... ''
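A sketch of doing this cluster-wide; the init-script name is an assumption about how the Lava roll installs itself:

<code bash>
cluster-fork '/etc/init.d/lava stop'   # all compute nodes (Rocks cluster-fork runs it everywhere)
ssh ionode '/etc/init.d/lava stop'     # the io-node, if it is not covered by cluster-fork
/etc/init.d/lava stop                  # the head node itself
chkconfig lava off                     # keep it from coming back on reboot (head node)
</code>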
| + | |||
| + | #1f Backup all files in /opt/lava. | ||
| + | |||
| + | => copy the / | ||
| + | |||
| + | LSF/HPC will install in /opt/lsfhpc but make sure you have remote backup copy of /opt/lava ... rsync to / | ||
| + | |||
| + | -> Disable Tivoli agents and start a manual incremental backup. | ||
| + | |||
#1g Unmount all io-node exported file systems; leave the nodes running.

We'll force a reboot followed by a re-image later, in staggered fashion, after we are done with the LSF install.

#1h Good time to clean all orphaned jobs' working dirs in / ...

-> fix: set this LUN to space reservation enabled (1 TB)

#1i Unmount all multipathed LUN filesystems on the io-node (/ ... ). A sketch of the checks follows.

** => <hi #ff0000> AFTER THAT, DISCONNECT THE FIBER CABLES </hi> **

Node re-imaging involves formatting and partitioning.
| + | |||
| + | #2. Remove the lava roll. | ||
| + | |||
| + | '' | ||
| + | '' | ||
| + | '' | ||
| + | |||
| + | 3. Add the LSFHPC roll. | ||
| + | |||
| + | '' | ||
| + | |||
#4. Prep ENV and license info.

Edit / ... \\
Change this section to point at the appropriate LSF location:

<code>
# source the job scheduler environment
if [ -f / ... ]; then
    . / ...
fi
</code>

Source that new environment: '' ... ''\\
Next copy the license info to / ...
#5a. Start the license daemon ... port 1700 is currently free.

'' ... ''

#5b. Add this startup command to / ...

#5c. Check the license daemons: '' ... '' (see the sketch below).
#6. Assign compute nodes to additional resources.

'' ... ''\\
( ... )

This will add the Infiniband MPI implementation.
<hi #ffff00> #7a. Rebuild the io-node. </hi>

=> Before you do this, redefine the io node as a compute appliance in the cluster database and turn ''/ ... ''

''/ ... ''

Once done, mount all NFS file systems on the head node.

=> Redefine the io node as a "nas appliance" ...

''/ ... ''\\
'' ... ''

<hi #ffff00> #7b. Rebuild the compute nodes. </hi> (See the re-image sketch below.)

''/ ... ''

<hi #ffff00> Add the memory modules at this time? </hi>
#8. Starting and testing the LSF HPC cluster.

Steps #7a & #7b should add the nodes to the LSF cluster.

On the head node:\\
'' ... ''\\
''/ ... ''\\
'' ... ''\\
'' ... ''\\
'' ... ''\\
'' ... ''

After this is done, and all nodes are back up, walk through the lava configuration files and add any information that is missing to the LSF equivalent files.

On the head node:\\
'' ... ''\\
'' ... ''\\
'' ... ''\\
'' ... ''\\
'' ... ''

Some sanity checks are sketched below.
#9. Configure master failover.

Skip this step.

#10. "Go To 1"

Walk through the items in #1 and enable/re-open everything that was disabled ...

Kick off Tivoli for an automated backup.

Test some job submissions ...

Document the new MPI job submission procedure ...

Add our eLIM after a while ...

#11. Relocate some home directories.

  * " ... "
  * relocate lvargarslara, ...

#12. NAT box.

Reconfigure compute-1-1 for Scott, maybe.

----

So how long does this take?

  * one morning to install LSF/HPC + rebuild the io-node
  * one afternoon to rebuild all other nodes (and deal with unexpected hardware problems)
  * one morning to open every node and remove/add memory sticks

--- // ... //
===== Adding Memory =====

The depts of **CHEM** and **PHYS** will each contribute $2,400 towards the purchase of additional memory.

The $7,680 is enough to purchase 64 DIMMs, adding 128 GB of memory to the cluster.
| + | |" | ||
| + | |||
| + | The 4 heavy weight nodes, with local dedicated fast disks, will not be changed. | ||
| + | |||
| + | So the first suggestion is to remove the 1 GB DIMMs from the 16 gigE enabled nodes (queue '' | ||
| + | |||
| + | That then leaves 16 empty nodes and 64 2GB DIMMs to play with. What to do?\\ | ||
| + | Here are some options. | ||
| + | |||
| + | |||
| + | ^ Scenario A ^^^^ uniform, matches infiniband nodes ^ | ||
| + | | 64< | ||
| + | ^ Scenario B ^^^^ add equal medium and heavy nodes ^ | ||
| + | | 16 | 08 | 2x2 | 64 | " eight 4 GB light weight nodes " | | ||
| + | | 16 | 04 | 4x2 | 32 | " four 8 GB medium weight nodes " | | ||
| + | | 32 | 04 | 8x2 | | ||
| + | ^ Scenario C ^^^^ emphasis on medium nodes ^ | ||
| + | | 08 | 04 | 2x2 | 32 | " four 4 GB light weight nodes " | | ||
| + | | 40 | 10 | 4x2 | 80 | " ten 8 GB medium weight nodes " | | ||
| + | | 16 | 02 | 8x2 | 16 | " two 16 GB heavy weight nodes " | | ||
| + | ^ Scenario D ^^^^ ... ^ | ||
| + | < | ||
| + | < | ||
| + | < | ||
| + | < | ||
| + | |||
| + | * Personally, i was initially leaning towards **A**. | ||
| + | * But now, viewing this table, i like the distribution of cores across light, medium and heavy weight nodes in **B**. | ||
| + | * **C** really depends on if we need 8 GB nodes. Not sure why we would do this vs **A**. | ||
| + | |||
| + | Actually, the perfect argument for **B** was offered by Francis: | ||
| + | |If machines have 8 GB of RAM, 1 job locks up the node. So two jobs lock up 2 nodes, rendering a total of 14 cores unused and unavailable. Suppose instead we have 16GB machines. | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | ===== Renaming Queues ===== | ||
| + | |||
| + | In **Scenario A** above nothing really changes but the concept of a "light weight" | ||
| + | |||
| + | In **Scenario B & C**, things change. Now we have light, medium and heavy weight nodes. | ||
| + | |||
| + | | queue_name | = | number of nodes | + | which switch | + | GB mem per node | + | total cores | + | additional info | ; | | ||
| + | |||
| + | Then our queues could be named like so: | ||
| + | |||
| + | | **16i08g128c** | 16 nodes, infiniband enabled, each 8gb mem (medium), comprising 128 cores total | | | ||
| + | | **08e04g064c** | 08 nodes, gigE enabled, each 4 gb mem (light), comprising 64 cores total | | | ||
| + | | **04e08g032c** | 04 nodes, gigE enabled, each 8 gb mem (medium), comprising 32 cores total | | | ||
| + | | **04e16g032c** | 04 nodes, gigE enabled, each 16 gb mem (heavy), comprising 32 cores total | | | ||
| + | | **04e16g032cfd** | 04 nodes, gigE enabled, each 16 gb mem (heavy), comprising 32 cores total | fast local disk access| | ||
| + | |||
| + | Or is this too cumbersome? Maybe.\\ | ||
| + | Perhaps just an abbreviation: | ||
| + | |||
| + | | **imw** | 16 nodes, infiniband enabled, each 8gb mem (medium), comprising 128 cores total | | | ||
| + | | **elw** | 08 nodes, gigE enabled, each 4 gb mem (light), comprising 64 cores total | | | ||
| + | | **emw** | 04 nodes, gigE enabled, each 8 gb mem (medium), comprising 32 cores total | | | ||
| + | | **ehw** | 04 nodes, gigE enabled, each 16 gb mem (heavy), comprising 32 cores total | | | ||
| + | | **ehwfd** | 04 nodes, gigE enabled, each 16 gb mem (heavy), comprising 32 cores total | fast local disk access | | ||
| + | |||
| + | | NEW QUEUES all priority = 50 || | ||
| + | | **imw** | compute-1-1 ... compute-1-16 | | ||
| + | | **elw** | compute-1-17 ... compute-1-24 | | ||
| + | | **emw** | compute-1-25 ... compute-1-27 compute-2-28 | | ||
| + | | **ehw** | compute-2-29 ... compute-2-32 | | ||
| + | | **ehwfd** | nfs-2-1 ... nfs-2-4 | | ||
| + | | ** matlab ** | imw + emw | | ||
| + | |||
| + | delete queues: idle, [i]debug, molscat, gaussian, nat-test | ||
| + | |||
| + | \\ | ||
| + | **[[cluster: | ||