HPC Users Meeting

  • Brief history of HPC
    • 2006 swallowtail (Dell PE1955, InfiniBand, imw, emw)
    • 2010 greentail (HP gen6 blades, hp12)
    • 2013 sharptail (Microway storage, K20s, InfiniBand, mw256/mwgpu)
    • 2014 mw256fd (replacement of the 2006 Dell nodes with Supermicro nodes)
    • 2015 tinymem (Supermicro bare metal, expansion for serial jobs)
    • 2017 mw128 (first purchase with faculty startup funds)
    • 2018 6/25: today's meeting
  • Since 2006
    • Grown from 256 to roughly 1,200 physical CPU cores
    • Processed 3,165,752 jobs (as of 18 Jun 2018)
    • Compute capacity over 60 teraflops double-precision floating point (38 on the CPU side, 25 on the GPU side)
    • Total memory footprint is nearly 7.5 TB
    • About 500 accounts have been created (including 22 collaborator accounts and 100 class accounts)
  • Funding / charge scheme: is it working for you?
    • In each of the last two years, the $15K target was realized
  • Status of our cluster development fund
    • $140K as of July 1, 2018
    • Time for some new hardware? Retirement of hp12 nodes?
  • 2017 Benchmarks of some new hardware
    • A donation led to the purchase of four GTX 1080 Ti commercial-grade GPUs
    • Amber 16: the nucleosome benchmark runs 4.5x faster than on a K20 (see the run sketch below)
    • Gromacs 5.1.4: Colin's multidir benchmark runs about 2x faster than on a K20
    • LAMMPS 11Aug17: the colloid example runs about 11x faster than on a K20
    • FSL 5.0.10: BFT bedpostx tests run 16x faster on CPU, and a whopping 118x faster on GPU vs CPU
    • Price of a 128 GB node in 2017: $8,250; price of a 256 GB node in 2018: $10,500
  • 2012 IBM bought Platform Computing (developers of LSF; OpenLava is an open source branch of LSF 4.2)
    • IBM subsequently accused OpenLava of copyright infringement in v3.x (under the US DMCA, no proof is required)
    • Fall-back option: revert to OpenLava v2.2 (definitely free of infringement, minor disruption)
    • Move-forward option: adopt SLURM (developed at LLNL, major disruption); see the job script sketch below
  • New HPC Advisory Group Member
  • Tidbits
    • Bought a deep 42U rack with onboard AC cooling and two PDUs
    • Pushed the Angstrom rack (bss24) out of our area; it is ready to be recycled
    • We currently have two empty 42U racks with power
    • Cooling needs to be provided with any major new purchases (Provost, ITS, HPC?)
    • 60 TB of raw storage purchased for sharptail (/home2, for users with specific needs)
    • Everything is out of warranty except
      • cottontail (until 03/2019)
      • ringtail & n78 (until 10/2020)
      • mw128_nodes (until 06/2020)
    • All InfiniBand ports are in use
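
A minimal sketch of how a GPU-vs-GPU comparison like the Amber result above can be run, assuming Amber 16's pmemd.cuda; the device indices and input file names (mdin, prmtop, inpcrd) are generic placeholders, not our actual nucleosome bench files:

  # Pin each run to a single GPU, then compare reported throughput.
  export CUDA_VISIBLE_DEVICES=0   # hypothetical index of the GTX 1080 Ti
  pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.1080ti
  export CUDA_VISIBLE_DEVICES=1   # hypothetical index of the K20
  pmemd.cuda -O -i mdin -p prmtop -c inpcrd -o mdout.k20
  # The speedup is the ratio of the ns/day values reported in the two mdout files.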
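
To see why adopting SLURM is the "major disruption" option, compare the same serial job under both schedulers. This is a minimal sketch with a hypothetical program name; the hp12 queue is used only as an example:

  # OpenLava/LSF job script -- submit with: bsub < job.sh
  #BSUB -q hp12             # queue
  #BSUB -n 1                # one slot
  #BSUB -o out.%J           # stdout file; %J = job id
  ./myprog

  # SLURM equivalent -- submit with: sbatch job.sh
  #SBATCH --partition=hp12  # partitions replace queues
  #SBATCH -n 1              # one task
  #SBATCH --output=out.%j   # stdout file; %j = job id
  ./myprog

Every user-facing command changes as well (bsub becomes sbatch, bjobs becomes squeue, bkill becomes scancel), which is what makes this the high-disruption path.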

