cluster:196 [DokuWiki]


Differences

This shows you the differences between two versions of the page.

cluster:196 [2020/07/29 11:55]
hmeij07
cluster:196 [2020/08/18 08:51]
hmeij07 [Netdata]
Line 1: Line 1:
 +
 +\\
 +**[[cluster:0|Back]]**
 +
 +
 ==== Netdata ====
  
 We use Zenoss for monitoring and alerting across the whole HPC. Its page can be found here: [[cluster:183|Zenoss]]
  
-At PEARC20 conference I became aware of [[https://www.netdata.cloud/|Netdata]] which seems a good tool for our "tails" (login, storage servers for example)
+At the PEARC20 conference I became aware of [[https://www.netdata.cloud/|Netdata]], which seems a good tool for our "tails" (login and storage servers, for example). It provides lots of detailed information.
  
 <code>
Line 11: Line 16:
 </code>
  
-  * [[http://hpcmon.wesleyan.edu:19999|hpcmon]]
+Then open port 19999 in the firewall for wesleyan.edu.
 + 
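A minimal sketch of what opening the port could look like, assuming firewalld on the CentOS 7 hosts and plain iptables on the CentOS 6 hosts. The page does not say which firewall tool is in use, and ''CAMPUS_NET'' is a placeholder documentation range, not the real wesleyan.edu network. The script only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: prints firewall commands instead of executing them.
# CAMPUS_NET is a placeholder (documentation range), NOT the real campus network.
CAMPUS_NET="${CAMPUS_NET:-192.0.2.0/24}"
NETDATA_PORT=19999

print_firewall_cmds() {
    net="$1"
    port="$2"
    # firewalld variant (assumed for the CentOS 7 hosts)
    printf '%s\n' "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"${net}\" port port=\"${port}\" protocol=\"tcp\" accept'"
    printf '%s\n' "firewall-cmd --reload"
    # iptables variant (assumed for the CentOS 6 hosts)
    printf '%s\n' "iptables -I INPUT -p tcp -s ${net} --dport ${port} -j ACCEPT"
}

print_firewall_cmds "$CAMPUS_NET" "$NETDATA_PORT"
```

Restricting the rule to a source network rather than opening the port to everyone matches the "for wesleyan.edu" wording; the actual address range has to come from the campus network team.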
 +Gosh, it does alerting too. The original, unsilenced monitor scripts are kept on ''cottontail:/usr/lib/netdata/conf.d/health.d-orig''
 + 
 +  * [[http://hpcmon.wesleyan.edu:19999|hpcmon]] Zenoss server 
 +  * [[http://cottontail.wesleyan.edu:19999|cottontail]] Primary login and scheduler server (needs reboot)
 +  * [[http://cottontail2.wesleyan.edu:19999|cottontail2]] Backup scheduler, CentOS 6 compile environment
 +  * [[http://greentail52.wesleyan.edu:19999|greentail52]] NFS server for /sanscratch, CentOS 7 compile environment (needs reboot)
 +  * [[http://ringtail.wesleyan.edu:19999|ringtail]] NFS server /home33 (needs reboot) 
 +  * sharptail 
 +  * [[http://mindstoresrv0.wesleyan.edu:19999|mstore0]] NFS server /mindstore (needs reboot) 
 +  * [[http://mindstoresrv1.wesleyan.edu:19999|mindstoresrv1]] Replication target for mstore0 (needs reboot) 
 +  * [[http://petaltail.wesleyan.edu:19999|petaltail]] Sandbox, Warewulf CentOS 6
 +  * [[http://swallowtail.wesleyan.edu:19999|swallowtail]] Sandbox
 +  * [[http://whitetail.wesleyan.edu:19999|whitetail]] OpenHPC Warewulf CentOS 7 (powered down)
 +  * [[http://sharptail2dr.wesleyan.edu:19999|sharptail2dr]] Disaster recovery host for hpcstore (active users) /homesdr 
 + 
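With this many dashboards, a quick reachability check can be handy. The sketch below builds the dashboard URL for each linked host and probes it with curl. The helper names, the ''RUN_CHECK'' switch and the 5 second timeout are illustrative choices, not part of the original setup; sharptail is left out because the page gives no URL for it.

```shell
#!/bin/sh
# Hosts taken from the dashboard list; sharptail is omitted (no URL listed).
HOSTS="hpcmon cottontail cottontail2 greentail52 ringtail mindstoresrv0 mindstoresrv1 petaltail swallowtail whitetail sharptail2dr"
DOMAIN="wesleyan.edu"
PORT=19999

# Build the Netdata dashboard URL for one short host name.
dashboard_url() {
    printf 'http://%s.%s:%s/\n' "$1" "$DOMAIN" "$PORT"
}

# Print "<host> up" or "<host> down"; the timeout value is an arbitrary choice.
check_host() {
    if curl -sf -o /dev/null --max-time 5 "$(dashboard_url "$1")"; then
        echo "$1 up"
    else
        echo "$1 down"
    fi
}

# Only probe when explicitly asked, e.g. RUN_CHECK=1 sh check.sh
if [ "${RUN_CHECK:-0}" = 1 ]; then
    for h in $HOSTS; do
        check_host "$h"
    done
fi
```

Hosts marked "powered down" or "needs reboot" may report down until they are back in service.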
 + 
 +==== Silence ==== 
 + 
 +<code> 
 + 
 +cd /usr/lib/netdata/conf.d/health.d/ 
 + 
 +# Point every notification role at the "silent" recipient
 +for i in *; do
 +  perl -pi -e 's/to: (sysadmin|webmaster|dba|sitemgr|domainadmin|proxyadmin)/to: silent/g' "$i"
 +done
 +
 +# List any recipients that are still not silenced
 +grep -i 'to:' * | grep -v silent
 + 
 +</code> 
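To try the silencing step without touching a live host, the same substitution can be rehearsed in a scratch directory. This is a sketch under assumptions: the sample alarm file is a made-up stand-in for the real health.d files, sed is swapped in for the perl one-liner purely for portability, and the backup copy mirrors the ''health.d-orig'' directory mentioned earlier.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing under /usr/lib/netdata is touched.
work=$(mktemp -d)
mkdir "$work/health.d"

# Hypothetical alarm file, a minimal stand-in for the real health.d/*.conf files.
cat > "$work/health.d/sample.conf" <<'EOF'
 alarm: disk_full
    to: sysadmin
 alarm: db_slow
    to: dba
EOF

# Keep an untouched copy first, mirroring the health.d-orig directory.
cp -r "$work/health.d" "$work/health.d-orig"

# Rewrite every listed role to "silent", going through a temp file for portability.
for f in "$work"/health.d/*; do
    sed -E 's/to: (sysadmin|webmaster|dba|sitemgr|domainadmin|proxyadmin)/to: silent/g' "$f" > "$f.tmp"
    mv "$f.tmp" "$f"
done

# Count recipients that are still not silenced (none expected for this sample).
remaining=$(grep -i 'to:' "$work"/health.d/* | grep -vc silent || true)
echo "unsilenced recipients: $remaining"
```

Comparing ''health.d'' against ''health.d-orig'' with diff afterwards shows exactly which recipients were changed.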
 + 
 +\\ 
 +**[[cluster:0|Back]]** 
 + 
cluster/196.txt · Last modified: 2020/11/10 09:43 by hmeij07