\\
**[[cluster:0|Back]]**
====== Automated Submissions ======
Quanli walked into the office with a request: how can one automate the submission of tons of jobs? In his case, Gaussian jobs. "Job arrays," I answered confidently, but that turned out to be a bit of a problem at first; the Job Arrays section further down describes what eventually worked.
In the meantime, we took the approach of writing a script that generates the job files for you. The basic idea is to build a template, fill in the dynamic data, and then submit the resulting job files in one swoop.
I'm including the simple scripts and steps below (files are in /home/hmeij/batch). You can build the idea out based on your needs.
====== Script Approach ======
First you need to create the input data files. Since each file will be different (but not in my examples), you need to do this manually (or, heck, write another script for that step). In our case we have 5 input files, named ''in.1'' through ''in.5''.
Next we build a template for the job files we want to generate. Below is our sample. The fields made of three repeated uppercase letters (''QQQ'', ''NNN'', and so on) are the values we wish to fill in dynamically using our script.
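If the inputs differ only in one value, that "another script" can be quite small. Here is a minimal sketch, assuming a hypothetical base file ''in.base'' that contains a made-up placeholder ''@VALUE@''; neither the file nor the function name ''make_inputs'' is part of the files in /home/hmeij/batch.

```shell
#!/bin/bash
# Sketch only: in.base and the @VALUE@ placeholder are hypothetical names.
# make_inputs BASEFILE VALUE... writes in.1, in.2, ... (one file per value).
make_inputs() {
    local base=$1; shift
    local i=0 value
    for value in "$@"; do
        i=$((i + 1))
        # replace every @VALUE@ placeholder with the current value
        # (values must not contain "/" because it is the sed delimiter)
        sed "s/@VALUE@/$value/g" "$base" > "in.$i"
    done
    echo "Wrote $i input files."
}

# Example: make_inputs in.base 0.90 1.00 1.10
```

The files are numbered in the order the values are given, which matches the ''in.1'' through ''in.5'' naming used on this page.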
* file ''build.template''
#!/bin/bash
# TEMPLATE
#BSUB -q QQQ
#BSUB -n NNN
#BSUB -J JJJ
#BSUB -o OOO
#BSUB -e EEE
MYSANSCRATCH=/sanscratch/$LSB_JOBID
MYLOCALSCRATCH=/localscratch/$LSB_JOBID
export MYSANSCRATCH MYLOCALSCRATCH
cd $MYSANSCRATCH
export GAUSS_SCRDIR="$MYLOCALSCRATCH"
export g03root="/share/apps/gaussian/g03root"
. $g03root/g03/bsd/g03.profile
cp PPP/in.III ./in
g03 < ./in > ./out
cp ./out PPP/out.III."$LSB_JOBID"
Here is the simple script that will generate our job files. It has default values for the queue name and the number of processors, both of which you may override on the command line. The script counts the input data files, then generates a job file for each one from the template (it skips the first two template lines and writes its own ''#!/bin/bash'' header). Finally, the script builds a file you can use to submit those job files.
* file ''build.jobfiles''
#!/usr/bin/perl
if ($#ARGV == 1) {
$q = $ARGV[0];
$n = $ARGV[1];
print "Usage: make sure the -n value below matches \%nprocs\n";
print "Using user defined values of q=$q, -n=$n\n";
} else {
print "Usage: ./build.jobfiles queue_name nr_of_procs\n";
print " Using default of q=elw, -n=1\n";
$q = "elw";
$n = 1;
}
# load template into memory
open(F,"build.template") or die "Cannot open build.template: $!\n";
while (<F>) {
push @T, $_;
}
close F;
# how many input files
$t = ` ls in.* | wc -l`;
chop($t);
print " Found $t input data files.\n";
# where are we
$p = `pwd`;
chop($p);
# build job files for bsub
foreach $i (1..$t) {
undef $ss;
foreach $j (2..$#T) {
$s = $T[$j];
$s =~ s/QQQ/$q/g;
$s =~ s/NNN/$n/g;
$s =~ s/JJJ/job.$i/g;
$s =~ s/OOO/out.$i/g;
$s =~ s/EEE/err.$i/g;
$s =~ s/III/$i/g;
$s =~ s/PPP/$p/g;
$ss .= $s;
}
open(O,">job.$i");
print O "#!/bin/bash\n$ss";
close O;
}
print " Build $t job files.\n";
# lazy, build a submit script
open(O,">build.submit");
print O "#\!/bin/bash\nfor i in \`seq 1 $t\`\ndo\nbsub \< job.\$i\ndone\n";
close O;
`chmod u+x build.submit`;
print "Done. Verify a job file, then submit like so './build.submit'\n";
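The fill-in step does not have to be done in Perl. Here is a minimal sketch of the same substitutions in shell with ''sed'', assuming the ''build.template'' shown above; the function name ''fill_template'' is made up for illustration and is not part of the files in /home/hmeij/batch.

```shell
#!/bin/bash
# Sketch only: a sed-based equivalent of the substitution loop.
# fill_template TEMPLATE INDEX QUEUE NPROCS DIR prints one job file on stdout.
fill_template() {
    local template=$1 i=$2 q=$3 n=$4 p=$5
    # drop the "# TEMPLATE" marker line, then fill in each placeholder;
    # "|" is the sed delimiter for PPP because the path contains "/"
    sed -e '/^# TEMPLATE$/d' \
        -e "s/QQQ/$q/g" \
        -e "s/NNN/$n/g" \
        -e "s/JJJ/job.$i/g" \
        -e "s/OOO/out.$i/g" \
        -e "s/EEE/err.$i/g" \
        -e "s/III/$i/g" \
        -e "s|PPP|$p|g" \
        "$template"
}

# Example: fill_template build.template 3 elw 1 "$PWD" > job.3
```

Printing to standard output keeps the function easy to check by eye before any job file is written.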
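Here is how it works, step by step. (In the transcript below the submit script is named ''jobs.submit'', whereas the script listed above writes ''build.submit''; run whichever name your copy of the script prints.)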
[hmeij@swallowtail batch]$ newgrp gaussian
[hmeij@swallowtail batch]$ ll
total 28
-rwxr--r-- 1 hmeij its 1165 Dec 21 11:01 build.jobfiles
-rw-r--r-- 1 hmeij its 409 Dec 20 16:36 build.template
-rw-r--r-- 1 hmeij its 460 Dec 20 15:58 in.1
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.2
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.3
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.4
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.5
[hmeij@swallowtail batch]$ ./build.jobfiles
Usage: ./build.jobfiles queue_name nr_of_procs
Using default of q=elw, -n=1
Found 5 input data files.
Build 5 job files.
Done. Verify a job file, then submit like so './jobs.submit'
[hmeij@swallowtail batch]$ cat job.3
#!/bin/bash
#BSUB -q elw
#BSUB -n 1
#BSUB -J job.3
#BSUB -o out.3
#BSUB -e err.3
MYSANSCRATCH=/sanscratch/$LSB_JOBID
MYLOCALSCRATCH=/localscratch/$LSB_JOBID
export MYSANSCRATCH MYLOCALSCRATCH
cd $MYSANSCRATCH
export GAUSS_SCRDIR="$MYLOCALSCRATCH"
export g03root="/share/apps/gaussian/g03root"
. $g03root/g03/bsd/g03.profile
cp /home/hmeij/batch/in.3 ./in
g03 < ./in > ./out
cp ./out /home/hmeij/batch/out.3."$LSB_JOBID"
[hmeij@swallowtail batch]$ ./jobs.submit
Job <34529> is submitted to queue <elw>.
Job <34530> is submitted to queue <elw>.
Job <34531> is submitted to queue <elw>.
Job <34532> is submitted to queue <elw>.
Job <34533> is submitted to queue <elw>.
[hmeij@swallowtail batch]$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
34529 hmeij PEND elw swallowtail - job.1 Dec 21 11:01
34530 hmeij PEND elw swallowtail - job.2 Dec 21 11:01
34531 hmeij PEND elw swallowtail - job.3 Dec 21 11:01
34532 hmeij PEND elw swallowtail - job.4 Dec 21 11:01
34533 hmeij PEND elw swallowtail - job.5 Dec 21 11:01
[hmeij@swallowtail batch]$ ll
total 132
-rwxr--r-- 1 hmeij its 1165 Dec 21 11:01 build.jobfiles
-rw-r--r-- 1 hmeij its 409 Dec 20 16:36 build.template
-rw-r--r-- 1 hmeij gaussian 0 Dec 21 11:02 err.1
-rw-r--r-- 1 hmeij gaussian 0 Dec 21 11:02 err.2
-rw-r--r-- 1 hmeij gaussian 0 Dec 21 11:02 err.3
-rw-r--r-- 1 hmeij gaussian 0 Dec 21 11:02 err.4
-rw-r--r-- 1 hmeij gaussian 0 Dec 21 11:02 err.5
-rw-r--r-- 1 hmeij its 460 Dec 20 15:58 in.1
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.2
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.3
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.4
-rw-r--r-- 1 hmeij its 460 Dec 20 15:59 in.5
-rw-r--r-- 1 hmeij gaussian 426 Dec 21 11:01 job.1
-rw-r--r-- 1 hmeij gaussian 426 Dec 21 11:01 job.2
-rw-r--r-- 1 hmeij gaussian 426 Dec 21 11:01 job.3
-rw-r--r-- 1 hmeij gaussian 426 Dec 21 11:01 job.4
-rw-r--r-- 1 hmeij gaussian 426 Dec 21 11:01 job.5
-rwxr--r-- 1 hmeij gaussian 53 Dec 21 11:01 jobs.submit
-rw-r--r-- 1 hmeij gaussian 1304 Dec 21 11:02 out.1
-rw-r--r-- 1 hmeij gaussian 11389 Dec 21 11:02 out.1.34529
-rw-r--r-- 1 hmeij gaussian 1304 Dec 21 11:02 out.2
-rw-r--r-- 1 hmeij gaussian 11390 Dec 21 11:02 out.2.34530
-rw-r--r-- 1 hmeij gaussian 1304 Dec 21 11:02 out.3
-rw-r--r-- 1 hmeij gaussian 11478 Dec 21 11:02 out.3.34531
-rw-r--r-- 1 hmeij gaussian 1304 Dec 21 11:02 out.4
-rw-r--r-- 1 hmeij gaussian 11477 Dec 21 11:02 out.4.34532
-rw-r--r-- 1 hmeij gaussian 1304 Dec 21 11:02 out.5
-rw-r--r-- 1 hmeij gaussian 11449 Dec 21 11:02 out.5.34533
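With five jobs you can eyeball the listing, but with hundreds you may want a quick check that every input produced a Gaussian output. Here is a minimal sketch, assuming the ''in.N'' and ''out.N.JOBID'' naming used above; the function name ''missing_outputs'' is made up for illustration.

```shell
#!/bin/bash
# Sketch only: print the index of every in.N that has no out.N.<jobid> file.
missing_outputs() {
    local f i
    for f in in.*; do
        [ -e "$f" ] || continue      # the glob matched nothing: no input files
        i=${f#in.}                   # strip the "in." prefix to get the index
        # expand the glob into the positional parameters; if nothing matched,
        # $1 is the literal pattern, which does not exist as a file
        set -- "out.$i."*
        [ -e "$1" ] || echo "$i"
    done
}

# Example: run missing_outputs in the directory holding the in.N files
```

Note that the plain ''out.N'' files (the ''-o'' output from LSF) do not count here, only the copied-back ''out.N.JOBID'' files.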
====== Job Arrays ======
Well, this turned out to be easier than expected. The submission process is slightly different, though: we will not be using a job file but will submit the job on the command line with all the necessary arguments.
First you may wish to read
* **[[cluster:50|Simple Job Arrays]]**
or
* **[[http://lsfdocs/lsf6.2_admin/G_jobarrays.html#27813|Manual Pages]]**
When using job arrays, you submit a single job which contains many tasks. Each task is a copy of the original job submission but the input and output structures vary. Also in this case, we will **not** be using a job file with **''#BSUB''** commands anymore.
Here is one way it could work using the Gaussian example mentioned above. First we use the same input data files, ''in.1'' through ''in.5''. In addition we create array files. The only content in these array files is the iteration value, so for example ''array.1'' contains ''1'', ''array.2'' contains ''2'', etc. This content is passed from the array file as standard input to the program you specify on the command line.
That program file, named ''my_run.sh'' in this example, reads the value and uses it to set up the current task and build the Gaussian invocation. Seems convoluted? Sure, but think about the case in which you have thousands of jobs to process. This can now be done with a single job submission.
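Creating the array files by hand gets old quickly. Here is a minimal sketch of a loop that writes them; the function name ''make_array_files'' is made up for illustration.

```shell
#!/bin/bash
# Sketch only: make_array_files N writes array.1 .. array.N in the current
# directory, each file holding nothing but its own index.
make_array_files() {
    local n=$1 i
    for i in $(seq 1 "$n"); do
        echo "$i" > "array.$i"
    done
}

# Example: make_array_files 5
```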
Not clear? Here is how it works. First the contents of our files:
* file ''in.1''
%mem=1GB
%nproc=1
# hf/3-21g geom=connectivity
Title Card Required
0 1
N
H 1 B1
H 1 B2 2 A1
H 1 B3 3 A2 2 D1
B1 1.00000000
B2 1.00000000
B3 1.00000000
A1 109.47120255
A2 109.47125080
D1 -119.99998525
1 2 1.0 3 1.0 4 1.0
2
3
4
* file ''array.1''
1
* program file ''my_run.sh''
#!/bin/bash
read i
echo i:$i
echo '---------------------'
MYSANSCRATCH=/sanscratch/$LSB_JOBID
MYLOCALSCRATCH=/localscratch/$LSB_JOBID
export MYSANSCRATCH MYLOCALSCRATCH
export GAUSS_SCRDIR="$MYLOCALSCRATCH"
export g03root="/share/apps/gaussian/g03root"
. $g03root/g03/bsd/g03.profile
cp ./in.$i $MYSANSCRATCH/in
cd $MYSANSCRATCH
# note that we capture gaussian output as standard out
g03 < ./in
Here is the submission, step by step. **NOTE THE JOBID THAT GETS ASSIGNED** ... 34554 ... it is the same for all tasks within this job. That makes it easy to manage hundreds or thousands of tasks, for example when you need to stop them all with a single ''bkill''.
[hmeij@swallowtail arrays]$ newgrp gaussian
[hmeij@swallowtail arrays]$ ll
total 44
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.1
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.2
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.3
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.4
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.5
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.1
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.2
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.3
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.4
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.5
-rwxr--r-- 1 hmeij gaussian 346 Dec 21 11:35 my_run.sh
[hmeij@swallowtail arrays]$ bsub -q elw -n 1 -J "job[1-5]" -i "array.%I" -o "out.%J.%I" ./my_run.sh
Job <34554> is submitted to queue <elw>.
[hmeij@swallowtail arrays]$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
34554 hmeij PEND elw swallowtail - job[1] Dec 21 14:01
34554 hmeij PEND elw swallowtail - job[2] Dec 21 14:01
34554 hmeij PEND elw swallowtail - job[3] Dec 21 14:01
34554 hmeij PEND elw swallowtail - job[4] Dec 21 14:01
34554 hmeij PEND elw swallowtail - job[5] Dec 21 14:01
[hmeij@swallowtail arrays]$ ll
total 124
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.1
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.2
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.3
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.4
-rw-r--r-- 1 hmeij gaussian 2 Dec 21 11:20 array.5
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.1
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.2
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.3
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.4
-rw-r--r-- 1 hmeij gaussian 460 Dec 21 11:08 in.5
-rwxr--r-- 1 hmeij gaussian 346 Dec 21 11:35 my_run.sh
-rw-r--r-- 1 hmeij gaussian 12454 Dec 21 14:02 out.34554.1
-rw-r--r-- 1 hmeij gaussian 12455 Dec 21 14:02 out.34554.2
-rw-r--r-- 1 hmeij gaussian 12378 Dec 21 14:02 out.34554.3
-rw-r--r-- 1 hmeij gaussian 12339 Dec 21 14:02 out.34554.4
-rw-r--r-- 1 hmeij gaussian 12340 Dec 21 14:02 out.34554.5
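LSF also hands each task its array index in the environment variable ''LSB_JOBINDEX'', so the ''array.N'' files are one way of passing the index rather than the only way. Here is a minimal sketch of a hypothetical helper, ''task_index'', that prefers the environment variable and falls back to standard input; it is not part of ''my_run.sh'' as shown above, and treating an empty or ''0'' value as "not an array task" is an assumption of this sketch.

```shell
#!/bin/bash
# Sketch only: task_index prints the index of the current array task.
task_index() {
    local i
    if [ -n "${LSB_JOBINDEX:-}" ] && [ "$LSB_JOBINDEX" != "0" ]; then
        # set by LSF for tasks of a job array
        i=$LSB_JOBINDEX
    else
        # fall back to the value passed on standard input (the array.N file)
        read -r i
    fi
    echo "$i"
}

# Example inside a program file: i=$(task_index)
```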
Of course, you could pass more information in your array job. For example, you could pass a tilde-delimited string holding the variables you need to set up each individual task. Your program file would then read this long string and parse it apart.
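Parsing such a string takes one line in bash. Here is a minimal sketch; the three field names (''index'', ''basis'', ''memory'') and the function name ''parse_task_line'' are made up for illustration.

```shell
#!/bin/bash
# Sketch only: parse_task_line reads one "~"-delimited line from standard
# input and splits it into three hypothetical fields.
parse_task_line() {
    local index basis memory
    # IFS is changed only for this read, so the rest of the script is unaffected
    IFS='~' read -r index basis memory
    echo "index=$index basis=$basis memory=$memory"
}

# Example: echo '3~3-21g~1GB' | parse_task_line
```

In a real program file you would use the variables directly instead of echoing them, and each ''array.N'' file would hold one such line.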