This shows you the differences between two versions of the page.
cluster:227 [2024/10/15 19:30] hmeij07 [HPC Monitoring] |
cluster:227 [2024/10/23 12:16] (current) hmeij07 |
||
---|---|---|---|
Line 12: | Line 12: | ||
</ | </ | ||
- | The only change I made obvious to the needed ones was specifying that the agent '' | + | The only change I made obvious to the needed ones was specifying that the agent '' |
* https:// | * https:// | ||
- | {{: | + | Here is what it looks like: either select Grid > Wesleyan HPC > Server, or select Wesleyan HPC, scroll down the page to view all nodes, and pick a metric.
+ | |||
+ | * http:// | ||
+ | |||
+ | {{: | ||
+ | |||
+ | Ganglia does not provide alerting, so we added **Zabbix**.
+ | |||
+ | We set up agent monitoring with the Zabbix agent (on both version 7 and version 8 hosts, CentOS or Rocky) and added the GPU templates from the links below. The XML loads as a Template on the zabbix_server,
+ | |||
+ | * set up data collection with the Zabbix agent, then set up monitoring with the Zabbix agent
+ | * enable discovery on both with the IP range 192.168.102.1-254
+ | * https:// | ||
+ | * https:// | ||
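As a rough illustration of what the discovery range above covers, the following Python sketch expands a range written as `a.b.c.X-Y` into individual addresses. It is not part of Zabbix or of the setup described here. The helper name `expand_range` is hypothetical, and the sketch assumes the range only varies in the last octet, as it does in `192.168.102.1-254`.

```python
import ipaddress


def expand_range(spec):
    """Expand 'a.b.c.X-Y' (or a single 'a.b.c.X') into a list of address strings.

    Hypothetical helper for illustration only; it handles a range in the
    last octet and nothing more elaborate.
    """
    prefix, _, last = spec.rpartition(".")
    start, _, end = last.partition("-")
    lo = int(start)
    hi = int(end) if end else lo
    if not 0 <= lo <= hi <= 255:
        raise ValueError(f"invalid last-octet range in {spec!r}")
    # IPv4Address validates each address as it is built
    return [str(ipaddress.IPv4Address(f"{prefix}.{n}")) for n in range(lo, hi + 1)]


hosts = expand_range("192.168.102.1-254")
print(len(hosts), hosts[0], hosts[-1])
```

A list like this can be handy for checking which addresses in the range actually answer before comparing against what discovery reports.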
+ | |||
+ | And that looks like this:
+ | |||
+ | * http:// | ||
+ | |||
+ | Log in as guest. Then you can go to " | ||