We used to use Zenoss as our health and alerting monitor.
Because a research project needed quick insight into resource consumption on the compute nodes, we first quickly installed Ganglia. It is not developed anymore, but it is a great tool. You can quickly download CentOS 8 packages and grab CentOS 7 packages. For the latter you need to change the yum repo URLs to the following (and comment out the mirrorlist URLs):
baseurl=http://vault.centos.org/centos/$releasever/os/$basearch/
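If several repo sections or several nodes need the same change, the edit can be scripted. The following is only a minimal sketch, not the procedure actually used here: it assumes the stock repo file carries active `mirrorlist=` lines and commented-out `#baseurl=http://mirror.centos.org/...` lines, and that the mirrorlist lines are the ones to disable. The function name `point_repo_at_vault` and the sample text are made up for illustration.

```python
# Hypothetical helper: rewrite yum .repo text so it uses vault.centos.org.
# Assumes baseurl lines point at mirror.centos.org and may be commented out.

SAMPLE = """[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
"""


def point_repo_at_vault(repo_text: str) -> str:
    """Comment out mirrorlist= lines and enable baseurl= lines on the vault host."""
    out = []
    for line in repo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("mirrorlist="):
            # Disable the mirrorlist so yum falls back to baseurl.
            out.append("#" + stripped)
        elif stripped.lstrip("#").lstrip().startswith("baseurl="):
            # Uncomment the baseurl and swap the host for the vault.
            url_line = stripped.lstrip("#").lstrip()
            out.append(url_line.replace("mirror.centos.org", "vault.centos.org"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(point_repo_at_vault(SAMPLE), end="")
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to re-apply. Check the result against the real files under /etc/yum.repos.d/ before relying on it.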
The only change I made beyond the obviously needed ones was specifying that the agent, gmond, reports in every 60 seconds (send_metadata_interval = 60).
I love abstract graphs like this: you know all is humming along in one view. You can also obtain GPU metrics (for CentOS 7 nodes) by finding templates here.
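For the gmond interval mentioned above, the setting lives in the `globals` block of gmond.conf. As a rough sketch of applying it the same way on many compute nodes, the snippet below patches the value in the config text. The function name `set_send_metadata_interval` and the abbreviated sample block are assumptions for illustration, not the configuration actually deployed.

```python
import re

# Abbreviated, illustrative globals block; a real gmond.conf has more settings.
SAMPLE_GLOBALS = """globals {
  daemonize = yes
  send_metadata_interval = 0 /*secs */
}
"""


def set_send_metadata_interval(conf_text: str, seconds: int) -> str:
    """Return conf_text with send_metadata_interval set to the given seconds."""
    pattern = re.compile(r"(send_metadata_interval\s*=\s*)\d+")
    new_text, count = pattern.subn(lambda m: m.group(1) + str(seconds), conf_text)
    if count == 0:
        # Refuse to guess where the setting belongs if it is absent.
        raise ValueError("send_metadata_interval not found in gmond.conf text")
    return new_text


if __name__ == "__main__":
    print(set_send_metadata_interval(SAMPLE_GLOBALS, 60), end="")
```

gmond has to be restarted after the file changes for the new interval to take effect.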