cluster:218 [2022/06/30 17:46] hmeij07 [Rocky8 Slurm Template]
cluster:218 [2022/07/18 18:11] hmeij07 [Basic Commands]
# sorta like bhosts -l
scontrol show node n78

# sorta like stop/resume
scontrol suspend job 1000001
scontrol resume job 1000001
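# check the job's state after a suspend/resume (job id above is hypothetical)
scontrol show job 1000001 | grep JobState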
  
# sorta like bhist -l
  
  
  * ''/zfshomes/hmeij/slurm/run.rocky'' for tinymem, mw128, amber128, test queues
  
<code>
cd $MYLOCALSCRATCH
  
### AMBER20 works via slurm's imaged nodes, test and amber128 queues
#source /share/apps/CENTOS8/ohpc/software/amber/20/amber.sh
# OR #
==== CentOS7 Slurm Template ====
  
In this job template I have it set up to run ''pmemd.MPI'', but it could also invoke ''pmemd.cuda'' with the proper parameter settings. On queues ''mwgpu'' and ''exx96'', amber[16,20] are local-disk CentOS7 software installations. Amber16 will not run on Rocky8 (tried it but forgot the error message... we can expect problems like this, hence testing!).
  
Note also that we're running mwgpu's K20 cuda version 9.2 on the exx96 queue (default cuda version 10.2). Not proper, but it works, hence this script will run on both queues. Oh, now I remember: amber16 was compiled with cuda 9.2 drivers, which are supported in cuda 10.x but not in cuda 11.x. So Amber16, if needed, would need to be compiled in the Rocky8 environment (and may work like the amber20 module).
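A rough sketch of that version gate: the install paths under ''/usr/local'' are the ones used in the template below, but detecting the toolkit version (normally parsed from ''nvcc --version'') is an assumption, so it is hardcoded here for illustration.

```shell
# Sketch: pick an amber install based on the node's CUDA major version.
# Per the note above: amber16 needs cuda 9.2-era drivers (ok through cuda
# 10.x, not 11.x); amber20 is the safe choice otherwise.
# CUDA_MAJOR would normally come from `nvcc --version`; hardcoded here.
CUDA_MAJOR=10

if [ "$CUDA_MAJOR" -le 10 ]; then
    AMBERSH=/usr/local/amber16/amber.sh   # cuda 10.x or older
else
    AMBERSH=/usr/local/amber20/amber.sh   # cuda 11.x and newer
fi

echo "$AMBERSH"
```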
  
  * ''/zfshomes/hmeij/slurm/run.centos'' for mwgpu, exx96 queues
  
<code>
#
# GPU control
###SBATCH --cpus-per-gpu=1
###SBATCH --mem-per-gpu=7168
###SBATCH --gres=gpu:tesla_k20m: # n[33-37]
###SBATCH --gres=gpu:geforce_rtx_2080_s: # n[79-90]
#
# Node control
  
  

###source /usr/local/amber16/amber.sh # works via slurm's mwgpu
source /usr/local/amber20/amber.sh # works via slurm's exx96
# stage the data
cp -r ~/sharptail/* .
July 2022 is for **testing...** lots to learn!
  
Kudos to Abhilash and Colin for working our way through all this.
  
\\
**[[cluster:0|Back]]**
  
cluster/218.txt · Last modified: 2023/10/14 19:24 by hmeij07