This shows you the differences between two versions of the page.
cluster:134 [2014/08/14 15:05] hmeij [Slurm] |
cluster:134 [2014/08/14 22:34] hmeij [High Throughput] |
||
---|---|---|---|
Line 34: | Line 34: | ||
echo " | echo " | ||
- | echo DONE | + | date |
</ | </ | ||
Line 53: | Line 53: | ||
^v1 ^v2 ^v3 ^v4 ^v5 ^v6 ^v7 ^v8 ^ | ^v1 ^v2 ^v3 ^v4 ^v5 ^v6 ^v7 ^v8 ^ | ||
|3138|3130|3149|3133|3108|3119|3110|3113 | |3138|3130|3149|3133|3108|3119|3110|3113 | ||
+ | |||
+ | # time to process queues of different sizes (I stage them with the --begin parameter) | ||
+ | # jobs do have to open the output files though, just some crude testing of slurm | ||
+ | # scheduling prowess
+ | |||
+ | ^NrJobs^1, | ||
+ | |mm:ss | 0:33| 6:32| 19:37| | ||
+ | |||
+ | # 20 mins for 25,000 jobs via sbatch | ||
</ | </ | ||
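The staging with `--begin` mentioned in the comments above could look roughly like the sketch below. It is not the script used for these timings: the helper name `stage_jobs`, the job names, and `run.sh` are made up for illustration. The function only prints the `sbatch` commands, so it can be inspected before anything is submitted.

```shell
# Hypothetical helper: print one sbatch command per job.
# Arguments: number of jobs, --begin time spec, batch script (all illustrative).
# Nothing is submitted until the output is piped to sh.
stage_jobs() {
    n=$1
    begin=$2
    script=$3
    i=1
    while [ "$i" -le "$n" ]; do
        # --begin defers eligibility, so the whole queue becomes runnable at once
        echo "sbatch --begin=$begin --job-name=stage$i $script"
        i=$((i + 1))
    done
}
```

Usage, assuming a batch script named `run.sh` exists: `stage_jobs 1000 now+10minutes run.sh | sh`.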
Line 73: | Line 82: | ||
</ | </ | ||
+ | |||
+ | |||
+ | ==== IO error ==== | ||
+ | |||
+ | At around 32K jobs I ran into IO problems. | ||
+ | |||
+ | < | ||
+ | |||
+ | sbatch: error: Batch job submission failed: I/O error writing script/ | ||
+ | |||
+ | </ | ||
+ | |||
+ | Oh, this is an OS error from the ext3 file system: the maximum number of files and directories was exceeded.
+ | |||
+ | Switching " | ||
+ | |||
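A quick way to see how close a directory is to the ext3 limit is to count its subdirectories. This is a minimal sketch with two assumptions: ext3 caps an inode at 32000 links (about 31998 subdirectories per directory), and the directory that fills up is the Slurm state save location. The path below is a placeholder and the helper name `count_subdirs` is made up.

```shell
# Hypothetical helper: count the immediate subdirectories of a directory.
count_subdirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Assumption: ext3 allows 32000 links per inode, i.e. about 31998 subdirectories.
EXT3_SUBDIR_LIMIT=31998

# Placeholder path; substitute the StateSaveLocation from your own slurm.conf.
STATE_DIR="${STATE_DIR:-/var/spool/slurm/state}"

if [ -d "$STATE_DIR" ]; then
    n=$(count_subdirs "$STATE_DIR")
    echo "$STATE_DIR holds $n subdirectories (ext3 limit is about $EXT3_SUBDIR_LIMIT)"
fi
```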
+ | |||
+ | ==== High Throughput ==== | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | * MaxJobCount=100000 | ||
+ | * SlurmctldPort=6820-6825 | ||
+ | * SchedulerParameters=max_job_bf=100, | ||
+ | |||
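To confirm that the running controller actually picked up the settings listed above, the output of `scontrol show config` can be filtered for those three keys. A small sketch; the function name `show_ht_settings` is made up, and it assumes the usual `Key = value` layout of that output.

```shell
# Hypothetical filter: keep only the high-throughput settings of interest
# from "Key = value" lines read on stdin.
show_ht_settings() {
    grep -E '^(MaxJobCount|SlurmctldPort|SchedulerParameters)[[:space:]]*='
}
```

Usage on a node with Slurm installed: `scontrol show config | show_ht_settings`.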
+ | ^NrJobs^N^hh: | ||
+ | |50, | ||
+ | |||
+ | |||
+ | | ||
+ | |||
+ | The debug level is currently 3. Maybe lower it to 1.
\\ | \\ | ||
**[[cluster: | **[[cluster: |