cluster:218

Differences

This shows you the differences between two versions of the page.

cluster:218 [2022/08/09 15:25]
hmeij07 [Resources]
cluster:218 [2023/10/14 15:24] (current)
hmeij07 [Resources]
Line 101: Line 101:
 #SBATCH -B 2:4:1 # S:C:T=sockets/node:cores/socket:threads/core
 #SBATCH --mem=250           # needed to override oversubscribe
-#SBATCH --ntasks-per-node=1 # needed to override oversubscribe
+#SBATCH --ntasks-per-node=1 # perhaps needed to override oversubscribe
 #SBATCH --cpus-per-task=1   # needed to override oversubscribe
  
Line 121: Line 121:
  
 </code>
 +
 +** Pending Jobs  **
 +
 +I keep having to tell users that a job requesting -n 1 and one CPU per task can still sit in the pending state when the user forgot to reserve memory: without a memory request, slurm assumes the job needs all of the node's memory. Here is my template reply:
 +
 +<code>
 +
 +FirstName, your jobs are pending because you did not request memory,
 +and without a memory request slurm assumes you need all of the node's memory, silly.
 +Command "scontrol show job JOBID" will reveal ...
 +
 +JobId=1062052 JobName=3a_avgHbond_CPU
 +   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:1:1
 +   TRES=cpu=1,mem=191047M,node=1,billing=1    <---------
 +
 +I looked (command "ssh n?? top -u username -b -n 1", look for the VIRT value) 
 +and you need less than 1G per job so with --mem=1024 and n=1 and cpu=1 
 +you should be able to load 48 jobs onto n100. 
 +Consult output of command "sinfo -lN"
 +
 +</code>
 +
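The check described in the template can be scripted. The following is only a sketch: ''job_mem'' is a hypothetical helper (not a slurm command) that pulls the ''mem='' value out of the ''TRES='' line, and the sample line is copied from the template above rather than read from a live job.

```shell
#!/bin/bash
# Hypothetical helper, not part of slurm: read "scontrol show job" style
# text on stdin and print the mem= value found on the TRES= line.
job_mem() {
    sed -n 's/.*TRES=.*mem=\([0-9]*[KMGT]\{0,1\}\).*/\1/p' | head -n 1
}

# Sample line copied from the template above. On the cluster one would
# pipe the real command output instead: scontrol show job JOBID | job_mem
sample='   TRES=cpu=1,mem=191047M,node=1,billing=1'
printf '%s\n' "$sample" | job_mem
```

If the printed value is close to the node's total memory, the job most likely went in without a memory request, and adding a line such as ''#SBATCH --mem=1024'' to the submit script should let several jobs share the node.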
  
 ==== MPI ====
Line 509: Line 531:
 #SBATCH --nodelist=n88
  
 +# may or may not be needed, centos7 login env
 +source $HOME/.bashrc  
 +which ifort           # should be the parallel studio 2016 version
  
 # unique job scratch dirs
cluster/218.1660073156.txt.gz · Last modified: 2022/08/09 15:25 by hmeij07