cluster:196 [DokuWiki]

Revisions compared: 2020/07/28 16:20 (hmeij07, created) and 2020/11/10 09:43 (hmeij07)

\\
**[[cluster:0|Back]]**

==== Netdata ====

We use Zenoss for monitoring and alerting across the whole HPC. Its page is here: [[cluster:183|Zenoss]].

At the PEARC20 conference I became aware of [[https://www.netdata.cloud/|Netdata]], which seems a good tool for our "tails" (login and storage servers, for example). It provides lots of detailed information.

<code>
...
</code>
  
Then open port 19999 in the firewall for wesleyan.edu.
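
On a host that runs firewalld (the CentOS 7 machines, presumably), the port could be opened with a rich rule along these lines. This is only a sketch: the function name ''netdata_rich_rule'' is hypothetical, and the source network is a placeholder from the documentation address range, not the real campus network. The block only prints the commands; nothing is applied.

```shell
# Placeholder network (RFC 5737 documentation range); substitute the real campus range.
CAMPUS_NET="192.0.2.0/24"

# Emit a firewalld rich rule that admits one source network on the Netdata port.
netdata_rich_rule() {
  printf 'rule family="ipv4" source address="%s" port port="19999" protocol="tcp" accept\n' "$1"
}

# Dry run: print the commands instead of executing them.
echo firewall-cmd --permanent --add-rich-rule="'$(netdata_rich_rule "$CAMPUS_NET")'"
echo firewall-cmd --reload
```

Run the printed commands as root to apply them. Hosts still on iptables (CentOS 6) would need an equivalent ''iptables'' rule instead.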

Gosh, it does alerting too. The original, unsilenced monitor scripts are kept on ''cottontail:/usr/lib/netdata/conf.d/health.d-orig''.

  * [[http://hpcmon.wesleyan.edu:19999|hpcmon]] Zenoss server
  * [[http://cottontail.wesleyan.edu:19999|cottontail]] Primary login and scheduler server
  * [[http://cottontail2.wesleyan.edu:19999|cottontail2]] Backup scheduler, CentOS 6 compile env
  * [[http://greentail52.wesleyan.edu:19999|greentail52]] NFS server /sanscratch, CentOS 7 compile env
  * [[http://ringtail.wesleyan.edu:19999|ringtail]] NFS server /home33
  * sharptail
  * [[http://mindstoresrv0.wesleyan.edu:19999|mstore0]] NFS server /mindstore
  * [[http://mindstoresrv1.wesleyan.edu:19999|mindstoresrv1]] Replication target for mstore0
  * [[http://petaltail.wesleyan.edu:19999|petaltail]] Sandbox, Warewulf CentOS 6
  * [[http://swallowtail.wesleyan.edu:19999|swallowtail]] Sandbox
  * [[http://whitetail.wesleyan.edu:19999|whitetail]] OpenHPC Warewulf CentOS 7 (powered down)
  * [[http://sharptail2dr.wesleyan.edu:19999|sharptail2dr]] Disaster recovery host for hpcstore (active users) /homesdr

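To confirm that the dashboards listed above actually answer on port 19999, something like the following could be run from a machine inside the firewall. It is a sketch that assumes ''curl'' is installed and that the agents expose Netdata's ''/api/v1/info'' endpoint; the function names are hypothetical and the host list is a shortened copy of the one above. As written, the block only prints the URLs and makes no network calls.

```shell
# Short host names from the list above; extend as needed.
HOSTS="hpcmon cottontail cottontail2 greentail52 ringtail"
DOMAIN="wesleyan.edu"
PORT=19999

# Build the dashboard URL for a short host name.
netdata_url() {
  printf 'http://%s.%s:%s\n' "$1" "$DOMAIN" "$PORT"
}

# Report whether a host's Netdata agent answers within 5 seconds.
check_netdata() {
  if curl -s -f -m 5 -o /dev/null "$(netdata_url "$1")/api/v1/info"; then
    echo "$1 up"
  else
    echo "$1 DOWN"
  fi
}

# Print the URLs that would be probed.
for h in $HOSTS; do
  netdata_url "$h"
done
```

To probe the hosts, call ''check_netdata'' in the same loop, for example ''for h in $HOSTS; do check_netdata "$h"; done''.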

==== Silence ====

<code>

cd /usr/lib/netdata/conf.d/health.d/

# Redirect every alarm recipient role to "silent" so no notifications go out
for i in *; do
  perl -pi -e "s/to: sysadmin/to: silent/g" "$i"
  perl -pi -e "s/to: webmaster/to: silent/g" "$i"
  perl -pi -e "s/to: dba/to: silent/g" "$i"
  perl -pi -e "s/to: sitemgr/to: silent/g" "$i"
  perl -pi -e "s/to: domainadmin/to: silent/g" "$i"
  perl -pi -e "s/to: proxyadmin/to: silent/g" "$i"
done

# Verify: any line printed here still names a non-silent recipient
grep -i "to:" * | grep -v silent

</code>
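
The six substitutions above can also be folded into a single pattern. The sketch below tries that on a scratch directory rather than the live ''health.d''; the sample file names and contents are made up for illustration, ''silence_dir'' is a hypothetical helper, and it assumes the health files end in ''.conf''.

```shell
# Work on a scratch copy, not the live health.d directory.
demo=$(mktemp -d)
printf '      to: sysadmin\n' > "$demo/cpu.conf"
printf '      to: dba\n      to: webmaster\n' > "$demo/mysql.conf"

# One pass per file: any of the six recipient roles becomes "silent".
silence_dir() {
  perl -pi -e 's/to: (sysadmin|webmaster|dba|sitemgr|domainadmin|proxyadmin)\b/to: silent/g' "$1"/*.conf
}

silence_dir "$demo"

# Count recipient lines that are still not silenced.
remaining=$(grep -h 'to:' "$demo"/*.conf | grep -vc silent || true)
echo "unsilenced recipients: $remaining"

rm -rf "$demo"
```

A single alternation keeps the role list in one place, so adding a role later means editing one line instead of adding another ''perl'' call.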

\\
**[[cluster:0|Back]]**

cluster/196.txt · Last modified: 2020/11/10 09:43 by hmeij07