cluster:167

Differences

This shows you the differences between two versions of the page.

cluster:167 [2018/06/28 13:56]
hmeij07
cluster:167 [2018/06/28 14:02]
hmeij07
Line 13: Line 13:
   * There is no good/bad metric
   * Never collated such data before
-  * The GPU usage is based on detecting gpu reservations (gpu= flag)
+  * The GPU jobs are detected based on GPU resource reservations (gpu= flag)
  
  
Line 21: Line 21:
 | Memory | 7,408  |  51:1  |  144 | GB |
 | Teraflops | 38  |  1.5:1  |  25 | double precision, floating point, theoretical |
-| Job Count | 2,834  |  3:1  |  1,045 | scheduled jobs regardless of exit status |
+| Job Count | 2,834  |  3:1  |  1,045 | processed jobs regardless of exit status |
-| Avail Hours | 715,200  |  50:1  |  14,400 | total cpu cores, total gpus |
+| Avail Hours | 715,200  |  50:1  |  14,400 | total for cpu cores, total for gpus |
 | Job Hours | 221,136  |  77:1  |  2,872 | cumulative hours of consumed usage |
-| Job Hours % | 31  |  6:1  |  5 | as a percentage |
+| Job Hours % | 31  |  6:1  |  5 | as a percentage of available |
 | Avail Hours2 | 561,600  |  39:1  |  14,400 | total cpu cores - hp12's 256 cores, total gpus |
-| Job Hours % | 39  |  8:1  |  5 | more realistic...hp12 rarely used in June18 |
+| Job Hours2 % | 39  |  8:1  |  5 | more realistic...hp12 rarely used in June18 |
  
 The logs showing gpu %util confirm the extremely low GPU usage. When the four gpu %util values are concatenated into a string, the string '0000' has occurred 10 million times out of 16 million observations since 01Jan2017. (GPUs are polled every 10 mins). Sad. The surprisingly strong GPU job count is due to the Amber group launching lots of small GPU jobs.
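The '0000' count described above can be sketched roughly as follows. This is only an illustration: the actual log format is not shown on this page, so the `sample_polls` data and the helper names `util_key` and `idle_fraction` are hypothetical, and each poll is assumed to be a tuple of four integer %util readings.

```python
from collections import Counter

def util_key(readings):
    """Concatenate the per-GPU %util values of one poll into a single string."""
    return "".join(str(value) for value in readings)

def idle_fraction(polls):
    """Return the share of polls whose concatenated key is exactly '0000'."""
    counts = Counter(util_key(poll) for poll in polls)
    total = sum(counts.values())
    return counts["0000"] / total if total else 0.0

# Hypothetical readings, one tuple of four GPU %util values per 10 minute poll.
sample_polls = [
    (0, 0, 0, 0),
    (0, 0, 0, 0),
    (97, 0, 0, 0),
    (0, 0, 0, 0),
    (0, 45, 0, 12),
    (0, 0, 0, 0),
    (100, 100, 99, 98),
    (0, 0, 0, 0),
]

print(f"idle share: {idle_fraction(sample_polls):.3f}")
```

Only a poll where all four values are zero yields the key '0000', so the fraction counts polls in which every GPU on the node was idle.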
Line 41: Line 41:
 | Memory | 7,408  |  74:1  |  100 | GB |
 | Teraflops | 38  |  1.7:1  |  23 | double precision, floating point, theoretical |
-| Job Count | 12,798  |  18:1  |  722 | scheduled jobs regardless of exit status |
+| Job Count | 12,798  |  18:1  |  722 | processed jobs regardless of exit status |
 | Avail Hours | 886,848  |  60:1  |  14,880 | total cpu cores, total gpus |
 | Job Hours |  260,997  |  69:1  |  3,805 | cumulative hours of consumed usage |
-| Job Hours % | 30  |  1:1  |  26 | as a percentage |
+| Job Hours % | 30  |  1:1  |  26 | as a percentage of available |
-| Avail Hours2 | 696,384  |  47:1  |  14,880 | total cpu cores hp12's 256 cores, total gpus |
+| Avail Hours2 | 696,384  |  47:1  |  14,880 | total for cpu cores minus hp12's 256 cores, total for gpus |
-| Job Hours % | 37  |  1.5:1  |  26 | more realistic...hp12 rarely used in June18 |
+| Job Hours2 % | 37  |  1.5:1  |  26 | more realistic...hp12 rarely used in June18 |
  
   * Some noise in this data due to the inability to match the start and end of a job (~15% of records)
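The ratio and percentage columns can be recomputed from the raw hours. The sketch below uses the figures from the second table in this comparison and assumes that a percentage is simply job hours divided by available hours and that the ratio is the CPU value divided by the GPU value. The function and variable names are made up for illustration. Recomputed values may differ by a point from the table (the plain CPU Job Hours % comes out a little under the listed 30), so the page may have rounded differently or used slightly different inputs.

```python
# Values copied from the second table; names are illustrative only.
cpu_avail_hours = 886_848
cpu_avail_hours2 = 696_384   # available hours with hp12's 256 cores left out
gpu_avail_hours = 14_880
cpu_job_hours = 260_997
gpu_job_hours = 3_805
HP12_CORES = 256

def ratio(cpu_value, gpu_value):
    """CPU:GPU ratio, returned as the left-hand side of 'x:1'."""
    return cpu_value / gpu_value

def percent_used(job_hours, avail_hours):
    """Consumed hours as a percentage of available hours (assumed definition)."""
    return 100 * job_hours / avail_hours

# Hours per core implied by the difference between the two Avail Hours rows.
hours_per_core = (cpu_avail_hours - cpu_avail_hours2) / HP12_CORES

print(f"Avail Hours ratio   {ratio(cpu_avail_hours, gpu_avail_hours):.1f}:1")
print(f"Avail Hours2 ratio  {ratio(cpu_avail_hours2, gpu_avail_hours):.1f}:1")
print(f"Job Hours ratio     {ratio(cpu_job_hours, gpu_job_hours):.1f}:1")
print(f"CPU Job Hours2 %    {percent_used(cpu_job_hours, cpu_avail_hours2):.1f}")
print(f"GPU Job Hours %     {percent_used(gpu_job_hours, gpu_avail_hours):.1f}")
print(f"hours per hp12 core {hours_per_core:.0f}")
```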
cluster/167.txt · Last modified: 2018/08/01 14:11 by hmeij07