— |
cluster:68 [2008/08/19 13:34] (current) |
||
---|---|---|---|
+ | \\ | ||
+ | **[[cluster: | ||
+ | |||
+ | |||
+ | ===== RTM ===== | ||
+ | |||
+ | This is a collection of interesting graphs generated by the Real Time Monitoring (RTM) tool that Platform is developing. | ||
+ | |||
+ | **What is RTM?** RTM is used to monitor and graph LSF resources (including networks, disks, applications, | ||
+ | |||
+ | [[http:// | ||
+ | |||
+ | [[http:// | ||
+ | |||
+ | ===== Graph Tab ===== | ||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - GRID Available Memory' | ||
+ | |{{1.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - GRID IO Levels' | ||
+ | |{{2.png | Weekly (30 Minute Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - GRID Load Average' | ||
+ | |{{3.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - GRID CPU Utilization' | ||
+ | |{{4.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - GRID Job Statistics' | ||
+ | |{{5.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - GridJobs Collection Stats' | ||
+ | |{{6.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - Overall Job Efficiency^ | ||
+ | |{{7.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - Pending Jobs^ | ||
+ | |{{8.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - imw_nodes - CPU Capacity' | ||
+ | |{{9.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - elw_nodes - CPU Capacity' | ||
+ | |{{10.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - emw_nodes - CPU Capacity' | ||
+ | |{{11.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehw_nodes - CPU Capacity' | ||
+ | |{{12.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehwfd_nodes - CPU Capacity' | ||
+ | |{{13.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - imw_nodes - CPU Utilization' | ||
+ | |{{14.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - elw_nodes - CPU Utilization' | ||
+ | |{{15.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - emw_nodes - CPU Utilization' | ||
+ | |{{16.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehw_nodes - CPU Utilization' | ||
+ | |{{17.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehwfd_nodes - CPU Utilization' | ||
+ | |{{18.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - imw_nodes - Slot Utilization' | ||
+ | |{{19.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - elw_nodes - Slot Utilization' | ||
+ | |{{20.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - emw_nodes - Slot Utilization^ | ||
+ | |{{21.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehw_nodes - Slot Utilization' | ||
+ | |{{22.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - ehwfd_nodes - Slot Utilization^ | ||
+ | |{{23.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - imw - Job Details' | ||
+ | |{{24.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - elw - Job Details' | ||
+ | |{{25.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - emw - Job Details' | ||
+ | |{{26.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehw - Job Details' | ||
+ | |{{27.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - ehwfd - Job Details' | ||
+ | |{{28.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Cluster Swallowtail-> | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'LSF 62 - imw - Queue Pending Times' | ||
+ | |{{29.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - elw - Queue Pending Times' | ||
+ | |{{30.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'LSF 62 - emw - Queue Pending Times' | ||
+ | |{{31.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - ehw - Queue Pending Times^ | ||
+ | |{{32.png | Monthly (2 Hour Average)}}| | ||
+ | ^Graphs -> Tree Mode -> LSF 62 - ehwfd - Queue Pending Times^ | ||
+ | |{{33.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Hosts-> Host: head node ==== | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ^Viewing Graph 'head node - Memory Usage' | ||
+ | |{{34.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'head node - Load Average' | ||
+ | |{{35.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph 'head node - Processes' | ||
+ | |{{36.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | ==== Tree: Compute Hosts-> Host: " | ||
+ | |||
+ | **<hi # | ||
+ | |||
+ | ... example ... | ||
+ | |||
+ | ^Viewing Graph ' | ||
+ | |{{37.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph ' | ||
+ | |{{38.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph ' | ||
+ | |{{39.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph ' | ||
+ | |{{40.png | Monthly (2 Hour Average)}}| | ||
+ | ^Viewing Graph ' | ||
+ | |{{41.png | Monthly (2 Hour Average)}}| | ||
+ | |||
+ | |||
+ | |||
+ | ===== Grid Tab ===== | ||
+ | |||
+ | These are examples of the tabular global statistics for currently running jobs. | ||
+ | |||
+ | ==== Queue Level Stats ==== | ||
+ | |{{42.png}}| | ||
+ | |||
+ | ==== User Level Stats ==== | ||
+ | |{{44.png}}| | ||
+ | |||
+ | ==== Cluster Level Stats ==== | ||
+ | |{{43.png}}| | ||
+ | |||
+ | ==== Dashboard ==== | ||
+ | |{{45.png}}| | ||
+ | |||
+ | That's it. | ||
+ | |||