===== High Throughput =====
  
[[https://computing.llnl.gov/linux/slurm/high_throughput.html]]
Vanilla out of the box, with these changes:

  * MaxJobCount=120000

Debug level is 3 above. Falling back to proctrack/pgid while setting debug to level 1, and also setting SchedulerType=sched/builtin (removing the backfill). This is throughput all right: just 8 KVM nodes handling the jobs.
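As a rough sketch, the changes described so far amount to the following slurm.conf fragment (only the parameters named above; everything else stays at its defaults):

<code>
# high-throughput trial settings in slurm.conf
MaxJobCount=120000            # allow a very deep job queue
SchedulerType=sched/builtin   # plain FIFO scheduling, no backfill
ProctrackType=proctrack/pgid  # process-group based job tracking
SlurmctldDebug=1              # down from 3 in the runs above
</code>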
  
^NrJobs^Nodes^hh:mm:ss^
| 1,000|8|00:00:34|
|10,000|8|00:05:57|
|100,000|8|00:58:16|
  
Next I will add a prolog/epilog script to my submit job script, which will create /localscratch/$SLURM_JOB_ID, echo the date into a file foo, cat foo to standard out, and finish by removing the scratch dir. These prolog/epilog actions really ought to be run by slurmd itself, but so far that errors for me, so the job script calls them directly. This does slow things down a bit; same conditions as above.
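The slurm_prolog.pl and slurm_epilog.pl wrappers are not listed here; per the description above they boil down to something like this (a sketch, not the actual Perl code):

<code>
# prolog equivalent: create the per-job scratch dir
mkdir -p /localscratch/$SLURM_JOB_ID

# epilog equivalent: remove it again when the job finishes
rm -rf /localscratch/$SLURM_JOB_ID
</code>

The submit job template then looks like this: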
  
<code>
#!/bin/bash
# sbatch stops reading #SBATCH directives at the first executable
# command, so they must come before the prolog call
#SBATCH --job-name="NUMBER"
#SBATCH --output="tmp/outNUMBER"
#SBATCH --begin=10:00:00

# site prolog: creates /localscratch/$SLURM_JOB_ID
/share/apps/lsf/slurm_prolog.pl

# unique job scratch dir
export MYLOCALSCRATCH=/localscratch/$SLURM_JOB_ID
cd $MYLOCALSCRATCH
pwd

echo "$SLURMD_NODENAME JOB_PID=$SLURM_JOB_ID" >> foo
date >> foo
cat foo

# site epilog: removes the scratch dir
/share/apps/lsf/slurm_epilog.pl
</code>
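NUMBER above is a placeholder for the job index. A driver along these lines would generate the submissions (the loop and the template.sh name are an assumption, not shown on this page):

<code>
#!/bin/bash
# hypothetical driver: submit 1,000 copies of the template job,
# giving each one a unique name and output file
for i in $(seq 1 1000); do
    sed "s/NUMBER/$i/g" template.sh | sbatch
done
</code>

sbatch reads the job script from standard input when no file argument is given, so no temporary copies are needed.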

^NrJobs^Nodes^hh:mm:ss^
| 1,000|8|00:05:00|
| 5,000|8|00:23:43|
|10,000|8|00:47:12|
|25,000|8|00:58:01|
  
  