Netdata
We use Zenoss for monitoring and alerting across the whole HPC. The page can be found here: Zenoss
At the PEARC20 conference I became aware of Netdata, which seems like a good tool for our “tails” (login and storage servers, for example). It provides lots of detailed information and installs with a single command:
bash <(curl -Ss https://my-netdata.io/kickstart.sh)
Then open port 19999 in the firewall for wesleyan.edu.
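As a rough sketch of the firewall step, the snippet below only prints the commands rather than running them. It assumes firewalld (as on CentOS 7); CentOS 6 hosts would need an equivalent iptables rule instead. The helper name `netdata_rule` is hypothetical, and 192.0.2.0/24 is a documentation placeholder, not the actual wesleyan.edu address range.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not execute) a firewalld rich rule that accepts
# TCP traffic to the Netdata port from a single source network.
# netdata_rule is a hypothetical helper; 192.0.2.0/24 is a placeholder
# range and must be replaced with the real campus network.
netdata_rule() {
  local cidr="$1" port="${2:-19999}"
  printf 'rule family="ipv4" source address="%s" port port="%s" protocol="tcp" accept\n' \
    "$cidr" "$port"
}

rule="$(netdata_rule 192.0.2.0/24)"

# Review the printed commands, then run them as root if they look right.
echo "firewall-cmd --permanent --add-rich-rule='${rule}'"
echo "firewall-cmd --reload"
```

Printing first makes it easy to review the rule on each host before changing a live firewall.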
Gosh, it does alerting too.
- hpcmon Zenoss server
- cottontail
- cottontail2 Backup scheduler, CentOS 6 compile environment (needs reboot)
- greentail52 NFS server /sanscratch (needs reboot)
- ringtail
- sharptail
- mindstoresrv0
- mindstoresrv1
- petaltail Sandbox, Warewulf CentOS 6
- swallowtail Sandbox
- whitetail NFS server /lvhomes (old /home), OpenHPC Warewulf CentOS 7
- sharptail2dr Disaster recovery host for hpcstore (active users) /homesdr
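A quick way to see which of the hosts above answer on the Netdata port might look like the sketch below. It assumes the short host names resolve from wherever it is run, that Netdata is listening on the default port 19999, and that the agent exposes its `/api/v1/info` endpoint; the function names `netdata_url` and `check_hosts` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the Netdata info URL for a host.
# Assumes the default port 19999 unless a second argument is given.
netdata_url() {
  printf 'http://%s:%s/api/v1/info\n' "$1" "${2:-19999}"
}

# Probe each host given as an argument. The short timeout keeps one
# unreachable host from stalling the whole loop.
check_hosts() {
  local h
  for h in "$@"; do
    if curl -fsS --max-time 3 -o /dev/null "$(netdata_url "$h")" 2>/dev/null; then
      echo "$h up"
    else
      echo "$h DOWN"
    fi
  done
}

# Usage (host names taken from the list above; use FQDNs if needed):
# check_hosts cottontail cottontail2 greentail52 ringtail sharptail
```

A host reported as DOWN may simply have the port closed in its firewall or still be waiting on a reboot, so this is only a first check.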
cluster/196.1596463689.txt.gz · Last modified: by hmeij07
