cluster:223

Differences

This shows you the differences between two versions of the page.

cluster:223 [2023/09/07 20:03]
hmeij07
cluster:223 [2023/09/18 18:35]
hmeij07
Line 116: Line 116:
  
  
-==== Test ==== 
- 
-Script ~hmeij/slurm/run.centos, cuda 11.2, pmemd.cuda of local install of amber20 with 
- 
-  * #SBATCH -N 1 
-  * #SBATCH -n 1 
-  * #SBATCH -B 1:1:1 
-  * #SBATCH --mem-per-gpu=7168 
- 
-For some reason this yields cpus=8, which differs from the previous behavior (expected cpus=1). Slurm is overriding the settings above with the partition setting DefCpuPerGPU=8. The Slurm version has not changed, but the cuda version has. Odd. 
- 
-<code> 
- 
-# from slurmd.log 
-[2023-09-05T14:51:00.691] Gres Name=gpu Type=tesla_k20m Count=4 
- 
-JOBID   PARTITION         NAME          USER  ST          TIME NODES  CPUS    MIN_MEMORY NODELIST(REASON) 
-1053052 mwgpu             test         hmeij            0:09                                  n33 
- 
-[hmeij@cottontail2 slurm]$ ssh n33 gpu-info 
-id,name,temp.gpu,mem.used,mem.free,util.gpu,util.mem 
-0, Tesla K20m, 36, 95 MiB, 4648 MiB, 100 %, 25 % 
-1, Tesla K20m, 26, 0 MiB, 4743 MiB, 0 %, 0 % 
-2, Tesla K20m, 25, 0 MiB, 4743 MiB, 0 %, 0 % 
-3, Tesla K20m, 26, 0 MiB, 4743 MiB, 0 %, 0 % 
- 
-[hmeij@cottontail2 slurm]$ ssh n33 gpu-process 
-gpu_name, gpu_id, pid, process_name 
-Tesla K20m, 0, 28394, pmemd.cuda 
- 
-</code> 
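One way to probe the cpus=8 result is to request the CPU count per GPU explicitly, since Slurm documents that DefCpuPerGPU only applies when a job specifies neither --cpus-per-task nor --cpus-per-gpu. The following is a minimal sketch, not the original run.centos: the file name run.test is a placeholder, the job body only reports the allocation, and whether this restores cpus=1 on this cluster is untested.

```shell
#!/bin/bash
# Sketch only: generate a test job script that pins CPUs per GPU explicitly.
# "run.test" is a placeholder name; the real run.centos is not shown in the source.
job=run.test
cat > "$job" <<'EOF'
#!/bin/bash
#SBATCH -p mwgpu
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -B 1:1:1
#SBATCH --mem-per-gpu=7168
#SBATCH --cpus-per-gpu=1
# (the GPU request line of the real script is not shown in the source; add it here)

# Report what Slurm actually allocated to this job.
echo "SLURM_CPUS_ON_NODE=$SLURM_CPUS_ON_NODE"
scontrol show job "$SLURM_JOB_ID" | grep -o 'NumCPUs=[0-9]*'
EOF

# Submit only where Slurm is available; elsewhere the script is just written to disk.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job"
fi
```

The partition default itself can be inspected with scontrol show partition mwgpu, which should list DefCpuPerGPU under its job defaults if it is set.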
  
 ==== Testing ====
Line 267: Line 236:
  
 List of command line options supported by this LAMMPS executable:
 +<snip>
 +
 +# hmmm, with -suffix gpu it does not jump on the gpus, generic non-gpu libthread error
 +# same version on rocky8/cuda-11.6 works, centos7/cuda-10.2 works, all "make" compiles
 +# try "cmake" compile on n33-n36
 +# libpace tarball download fails on file hash and
 +# yields a status: [1;"Unsupported protocol" error for ML-PACE
 +
 +# without ML-PACE, the hash fails for the opencl-loader third party download, bad url
 +# https://download.lammps.org/thirdparty/opencl-loader-opencl-loadewer-version...tgz
 +# then extract in _deps/ dir
 +# and added -D GPU_LIBRARY=../lib/gpu/libgpu.a a la QUIP_LIBRARY
 +
  
 </code>
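When a third-party download fails its hash check, as noted in the comments above, it can help to verify a manually fetched tarball before extracting it into _deps/. This is a small sketch that assumes the expected checksum is an MD5 value taken from the LAMMPS CMake files; the function name verify_md5 is made up for illustration.

```shell
#!/bin/bash
# verify_md5 FILE EXPECTED_MD5  (hypothetical helper, not part of LAMMPS)
# Returns 0 when the file's MD5 matches, 1 otherwise (including a missing file).
verify_md5() {
    local file=$1 expected=$2 actual
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    # md5sum prints "<hash>  <file>"; keep only the hash field.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: got $actual, expected $expected" >&2
        return 1
    fi
}
```

A mismatch on a very small file often means the download saved an error page rather than the archive, which would be consistent with a bad url.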
cluster/223.txt · Last modified: 2023/09/18 20:56 by hmeij07