This shows you the differences between two versions of the page.
cluster:125 [2014/02/17 18:09] hmeij [What changes?] |
cluster:125 [2014/02/19 14:14] hmeij |
||
---|---|---|---|
Line 23: | Line 23: | ||
* mw256fd appears | * mw256fd appears | ||
* on both mw256 (n33-n37) and mw256fd (n38-n45) exclusive use is disabled (#BSUB -x) | * on both mw256 (n33-n37) and mw256fd (n38-n45) exclusive use is disabled (#BSUB -x) | ||
- | * the max number of job slots per node is 32 on '' | + | * the max number of job slots per node is 32 on '' | ||
+ | |||
+ | Memory: | ||
+ | |||
+ | * Since fewer and fewer nodes with large memory footprints are deployed in our cluster, it is important to estimate how much memory your job needs, add 10-20% headroom, and reserve that amount via the scheduler so your jobs do not crash. | ||
+ | |||
+ | < | ||
+ | #BSUB -R " | ||
+ | </ | ||
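The 10-20% padding rule can be sketched as a small calculation. This is only an illustration: the function name, the use of MB as the unit, and the sample peak of 4000 MB are assumptions, and how the resulting number is written into the `#BSUB -R` line is left to the example above.

```python
def memory_to_reserve_mb(observed_peak_mb, headroom_percent=20):
    """Pad an observed peak memory figure by a headroom percentage.

    Integer arithmetic with rounding up, so the request never falls
    below the padded estimate. Names and units are illustrative.
    """
    if observed_peak_mb < 0 or headroom_percent < 0:
        raise ValueError("inputs must be non-negative")
    padded_times_100 = observed_peak_mb * (100 + headroom_percent)
    return (padded_times_100 + 99) // 100  # ceiling division by 100


# Hypothetical job that peaked at 4000 MB in a test run.
low = memory_to_reserve_mb(4000, headroom_percent=10)
high = memory_to_reserve_mb(4000, headroom_percent=20)
print(f"reserve between {low} and {high} MB")
```

Measuring the peak on a short test run first and then padding it keeps the reservation close to what the job actually uses, which leaves more memory for other jobs on the same node.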
Gaussian: | Gaussian: | ||
Line 56: | Line 64: | ||
* We'll schedule one as soon as '' | * We'll schedule one as soon as '' | ||
+ | |||
+ | ==== What May Change? ==== | ||
+ | |||
+ | There is a significant need to run many programs that require very little memory (on the order of 1-5 MB). When such a program runs, it consumes a job slot. When many such programs consume many job slots, like on the large servers in the '' | ||
+ | |||
+ | So we could enable hyperthreading on the nodes of the '' | ||
+ | |||
+ | * if no ‘sharing’ is required, the hyper-threaded node performs the same (that is, the operating system presents 16 cores but only up to 8 jobs are allowed to run, let's say by limiting the JL/H parameter of the queue) | ||
+ | * if there is ‘sharing’, jobs take a 44% speed penalty; however, twice as many of them can run | ||
+ | |||
+ | So it appears that we could turn hyperthreading on and, although the nodes would then present 16 cores, limit the number of jobs to 8 until the need arises to run many small jobs, and then reset the limit to 16. | ||
+ | |||
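The trade-off above can be put in numbers. The following is a rough sketch, not a benchmark. It takes the 8 and 16 job limits from the text, and because the text does not say how the 44% penalty was measured, it shows two possible readings: each job takes 44% longer to finish, or each job runs at 44% lower speed. The function and variable names are illustrative.

```python
def relative_throughput(jobs_shared, jobs_unshared, penalty, penalty_is_runtime):
    """Node throughput with sharing, relative to the unshared case.

    penalty_is_runtime=True  : each shared job takes (1 + penalty) times as long.
    penalty_is_runtime=False : each shared job runs at (1 - penalty) of full speed.
    Which reading applies to the 44% figure is an assumption.
    """
    if penalty_is_runtime:
        per_job_speed = 1.0 / (1.0 + penalty)
    else:
        per_job_speed = 1.0 - penalty
    return jobs_shared * per_job_speed / jobs_unshared


# 16 shared job slots versus 8 unshared ones, 44% penalty when sharing.
runtime_reading = relative_throughput(16, 8, 0.44, penalty_is_runtime=True)
speed_reading = relative_throughput(16, 8, 0.44, penalty_is_runtime=False)

print(f"penalty as longer runtime: {runtime_reading:.2f}x the unshared throughput")
print(f"penalty as lower speed:    {speed_reading:.2f}x the unshared throughput")
```

Under either reading, a node running 16 small jobs completes more work per unit of time than one running 8, even though each individual job finishes later. That is why raising the limit only makes sense when there are many small jobs waiting.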
\\ | \\ | ||
**[[cluster: | **[[cluster: |