Netdata

We use Zenoss for monitoring and alerting across the whole HPC. The page can be found here: Zenoss

At the PEARC20 conference I became aware of Netdata, which seems a good tool for our “tails” (login and storage servers, for example). It provides lots of detailed information.

bash <(curl -Ss https://my-netdata.io/kickstart.sh)

Then open port 19999 in the firewall for wesleyan.edu.

Gosh, it does alerting too. The original monitor scripts, unsilenced, exist on cottontail:/usr/lib/netdata/conf.d/health.d-orig

Silence

cd /usr/lib/netdata/conf.d/health.d/

# point every default recipient role at "silent"; glob instead of `ls` so odd file names survive
for f in *; do
  perl -pi -e 's/to: (sysadmin|webmaster|dba|sitemgr|domainadmin|proxyadmin)/to: silent/g' "$f"
done

grep -i to: * | grep -v silent



cluster/196.txt · Last modified: 2020/11/10 14:43 by hmeij07