==== High Throughput ====

[[https://computing.llnl.gov/linux/slurm/high_throughput.html]]

Vanilla out of the box with these changes:

  * MaxJobCount=120000

Debug level is 3 above. Falling back to proctrack/pgid while setting debug to level 1. Also setting SchedulerType=sched/builtin (removing the backfill). This is throughput all right, with just 8 KVM nodes handling the jobs.
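
A minimal slurm.conf sketch of those changes (the parameter names are standard slurm options; mapping "debug to level 1" onto SlurmctldDebug is my assumption):

<code>
# slurm.conf fragment -- sketch of the changes described above
MaxJobCount=120000            # allow the large job counts used in these tests
SchedulerType=sched/builtin   # plain FIFO scheduling, no backfill
ProctrackType=proctrack/pgid  # fall back to pgid-based process tracking
SlurmctldDebug=1              # drop the debug level from 3 to 1
</code>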

^NrJobs^Nodes^hh:mm:ss^
| 1,000|8|00:00:34|
|10,000|8|00:05:57|
|50,000|8|00:29:55|
|75,000|8|00:44:15|
|100,000|8|00:58:16|

Next I will add a prolog/epilog script to my submit job script which will create /localscratch/$SLURM_JOB_ID, echo the date into file foo, then cat foo to standard out, and finish by removing the scratch dir. These prolog/epilog actions should be done by slurmd, but so far that errors for me, so the scripts are called from within the job script. This does slow things down a bit. Same conditions as above.
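
Having slurmd run them would mean pointing the Prolog and Epilog parameters in slurm.conf at the scripts, roughly as sketched below (a sketch only; in the tests here the scripts are instead called from inside the job script):

<code>
# slurm.conf fragment -- sketch; slurmd would run these on the compute node
Prolog=/share/apps/lsf/slurm_prolog.pl
Epilog=/share/apps/lsf/slurm_epilog.pl
</code>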

<code>
#!/bin/bash
# note: #SBATCH directives must come before the first executable command
#SBATCH --job-name="NUMBER"
#SBATCH --output="tmp/outNUMBER"
#SBATCH --begin=10:00:00

# prolog creates the unique job scratch dir
/share/apps/lsf/slurm_prolog.pl

export MYLOCALSCRATCH=/localscratch/$SLURM_JOB_ID
cd $MYLOCALSCRATCH
pwd

echo "$SLURMD_NODENAME JOB_PID=$SLURM_JOB_ID" >> foo
date >> foo
cat foo

# epilog removes the scratch dir
/share/apps/lsf/slurm_epilog.pl
</code>
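
The NUMBER placeholders above get replaced with a unique id per job at submit time. A hypothetical driver loop (the template file name run.tmpl and the loop itself are my assumptions, not part of the original test setup):

<code>
#!/bin/bash
# generate and submit 1,000 copies of the template, each with a unique id
mkdir -p tmp
for i in $(seq 1 1000); do
  sed "s/NUMBER/$i/g" run.tmpl > tmp/run.$i
  sbatch tmp/run.$i
done
</code>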

^NrJobs^Nodes^hh:mm:ss^
| 1,000|8|00:05:00|
| 5,000|8|00:23:43|
|10,000|8|00:47:12|
|25,000|8|00:58:01|

==== MPI ====

With sbatch there is no need for a wrapper script; slurm figures it all out.

<code>
#!/bin/bash
#/share/apps/lsf/slurm_prolog.pl

#SBATCH --job-name="MPI"
#SBATCH --ntasks=8
#SBATCH --begin=now

# unique job scratch dir (not used in this test)
#export MYLOCALSCRATCH=/localscratch/$SLURM_JOB_ID
#cd $MYLOCALSCRATCH

echo "$SLURMD_NODENAME JOB_PID=$SLURM_JOB_ID"

# clean up output files from any previous run
rm -rf err out logfile mdout restrt mdinfo

# use the OpenMPI build that matches the Amber pmemd binary
export PATH=/share/apps/openmpi/1.2+intel-9/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/openmpi/1.2+intel-9/lib:$LD_LIBRARY_PATH
which mpirun

# no -np or host list; mpirun picks up the 8 tasks from the slurm allocation
mpirun /share/apps/amber/9+openmpi-1.2+intel-9/exe/pmemd -O -i inp/mini.in -p 1g6r.cd.parm -c 1g6r.cd.randions.crd.1 -ref 1g6r.cd.randions.crd.1

#/share/apps/lsf/slurm_epilog.pl
</code>
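
Submitting it is then a plain sbatch call (the script file name below is assumed); since slurm figures out the allocation, mpirun needs no -np or host list:

<code>
sbatch pmemd.sh
squeue
</code>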

\\
**[[cluster:0|Back]]**