cluster:166 [2018/06/18 18:53] hmeij07
cluster:166 [2018/06/27 11:51] (current) hmeij07 [Notes]
==== HPC Users Meeting ====

  * Brief history
    * 2006 swallowtail (Dell PE1955, Infiniband, imw, emw)
    * 2010 greentail (HP gen6 blade servers, hp12)
  * 2017 Benchmarks of some new hardware
    * Donation led to purchase of a commercial grade GPU server
    * Amber 16. Nucleosome bench runs 4.5x faster than on a K20
    * Gromacs 5.1.4. Colin'
    * Lammps 11Aug17. Colloid example runs about 11x faster than on a K20
    * FSL 5.0.10. BFT bedpostx tests run 16x faster on CPU, a whopping 118x faster on GPU vs CPU.
    * Price of 128gb node in 2017 was $8,
  * 2016 IBM bought Platform Inc (developers of LSF, Openlava is LSF4.2 open source branch)
    * IBM promptly
    * Fall back option to v2.2 (definitely free of infringement,
    * Move forward option, adopt SLURM (LLNL developers, major disruption)
    * If we adopt SLURM should we transition to OpenHPC Warewulf/
      * http://
      * new login node and a couple of compute nodes to start?
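To gauge the "major disruption" of a SLURM adoption, a minimal sketch of what changes for users: Openlava/LSF ''bsub'' options map onto ''sbatch'' counterparts. The translation table below lists the common, well-known equivalences; the queue and job names in the example are hypothetical, not site configuration.

```python
# Sketch: translate common Openlava/LSF (bsub) options into SLURM (sbatch)
# options. Option equivalences are the standard LSF->SLURM counterparts;
# the example queue name "mw128" is a hypothetical illustration.

LSF_TO_SLURM = {
    "-q": "--partition",   # queue        -> partition
    "-n": "--ntasks",      # job slots    -> tasks
    "-o": "--output",      # stdout file
    "-e": "--error",       # stderr file
    "-J": "--job-name",    # job name
}

def translate_bsub(args):
    """Translate a flat list of bsub options into sbatch-style options."""
    out = []
    i = 0
    while i < len(args):
        flag = args[i]
        if flag in LSF_TO_SLURM and i + 1 < len(args):
            out.append(f"{LSF_TO_SLURM[flag]}={args[i + 1]}")
            i += 2
        else:
            out.append(flag)  # pass unrecognized tokens through unchanged
            i += 1
    return out

print(translate_bsub(["-q", "mw128", "-n", "8", "-J", "bench"]))
# -> ['--partition=mw128', '--ntasks=8', '--job-name=bench']
```

The one-to-one flag mapping is the easy part; the disruption comes from resyntaxing every user's job scripts and retraining habits around ''squeue''/''sinfo'' instead of ''bjobs''/''bhosts''.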
  * New HPC Advisory Group Member
  * Tidbits
    * Bought deep U42 rack with AC cooling onboard and two PDUs
    * Pushed Angstrom rack (bss24) out of our area, ready to recycle that (Done. 06/20/2018)
    * Currently we have two U42 racks empty with power
    * Cooling needs to be provided with any new major purchases (provost, ITS, HPC?)
  * cottontail (03/
  * ringtail & n78 (10/2020)
  * mw128_nodes
  * All Infiniband ports are in use
===== Notes =====

  * First make a page comparing CPU vs GPU usage which may influence future purchase [[cluster:
  * $100k quote, 3 to 5 vendors, data points mid-2018
  * One node (or all) should have configured on it: amber, gromacs, lammps, namd, latest versions
    * Nvidia latest version, optimal configs cpu:gpu ratios
      * Amber 1:1 (may be 1:2 in future releases) - amber certified GPU!
      * Gromacs 10:1 (could ramp up to claiming all resources per node)
      * Namd 13:1 (could ramp up to claiming all resources per node)
      * Lammps 2-4:1
  * 128g with enough CPU slots to take over ''
  * Anticipated target (also to manage heat exchange)
    * 2x10 Xeon CPU (~100gb left) with 2x gtx1080ti GPU (25gb memory required)
    * as many as fit the budget, but no more than 15 rack-wise
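A rough sketch of what those cpu:gpu ratios mean on the anticipated node shape (2x10-core Xeon, 2 GPUs): multiply the per-GPU core claim by the GPU count and cap at the node's core count. Ratios are taken from the list above (Lammps shown at the high end of its 2-4:1 range); the node shape is the anticipated target, not purchased hardware.

```python
# Sketch: CPU cores claimed per application when a job uses all GPUs on the
# anticipated node (2 x 10-core Xeon, 2 x gtx1080ti). Ratios from the notes
# above; Lammps uses the high end of its 2-4:1 range.

NODE_CPUS = 20   # 2 sockets x 10 cores
NODE_GPUS = 2    # 2 x gtx1080ti

CPU_PER_GPU = {"amber": 1, "gromacs": 10, "namd": 13, "lammps": 4}

def cpus_claimed(app, gpus=NODE_GPUS):
    """CPU cores a job claims when driving all GPUs, capped at the node."""
    return min(CPU_PER_GPU[app] * gpus, NODE_CPUS)

for app in CPU_PER_GPU:
    used = cpus_claimed(app)
    print(f"{app:8s} claims {used:2d}/{NODE_CPUS} cores, {NODE_CPUS - used} left for CPU jobs")
```

The cap matters: at 10:1 and 13:1, Gromacs and Namd already claim the whole node with only two GPUs, while Amber at 1:1 leaves 18 cores free for CPU-only jobs, which bears on whether a shared 128g node makes sense.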
\\
**[[cluster: