cluster:208 [2022/06/01 18:28] hmeij07 [gpu testing]
cluster:208 [2022/11/02 17:28] (current) hmeij07 [gpu testing]
  * same as above but all 16 jobs run on gpu 0
  * so the limit of 4 jobs on the rtx5000 gpu is a hardware phenomenon?
  * all 16 jobs finished, wall times of 3.11 to 3.60 hours
+ | |||
+ | ===== gpu testing 2 ===== | ||
+ | |||
+ | Newer 2022 version seems to have reversed the override options for oversubscribe. So here is our testing...back to CR_CPU_Memory and OverSubscribe=No | ||
+ | |||
<code>
+ | |||
+ | CR_Socket_Memory | ||
+ | PartitionName=test Nodes=n[100-101] | ||
+ | Default=YES MaxTime=INFINITE State=UP | ||
+ | OverSubscribe=No DefCpuPerGPU=12 | ||
+ | |||
+ | MPI jobs with -N 1, -n 8 and -B 2:4:1 | ||
+ | no override options, cpus=48 | ||
+ | --mem=2048, cpus=48 | ||
+ | and --cpus-per-task=1, | ||
+ | and --ntasks-per-node=8, | ||
+ | |||
+ | MPI jobs with -N, -n 8 and -B 1:8:1 | ||
+ | --mem=10240 cpus=48 | ||
+ | and --cpus-per-task=1, | ||
+ | and --ntasks-per-node=8, | ||
+ | |||
+ | GPU jobs with -N 1, -n 1 and -B 1:1:1 | ||
+ | no override options, no cuda export, cpus=48 | ||
+ | --cpus-per-gpu=1, | ||
+ | and --mem-per-gpu=7168, | ||
+ | while other gpu runs in queue but gpus are free???) | ||
+ | |||
+ | GPU jobs with -N 1, -n 1 and -B 1:1:1 | ||
+ | no override options, yes cuda export, cpus=48 | ||
+ | --cpus-per-gpu=1, | ||
+ | and --mem-per-gpu=7168, | ||
+ | while a gpu job runs, gpus are free, then it executes) | ||
+ | |||
+ | ...suddenly the cpus=1 turns into cpus=24 | ||
+ | when submitting, slurm confused becuase of all | ||
+ | the job cancellations? | ||
+ | |||
+ | CR_CPU_Memory test=no, mwgpu=force: | ||
+ | PartitionName=test Nodes=n[100-101] | ||
+ | Default=YES MaxTime=INFINITE State=UP | ||
+ | OverSubscribe=No DefCpuPerGPU=12 | ||
+ | |||
+ | MPI jobs with -N 1, -n 8 and -B 2:4:1 | ||
+ | no override options, cpus=8 (queue fills across nodes, | ||
+ | but only one job per node, test & mwgpu) | ||
+ | --mem=1024, cpus=8 (queue fills first node ..., | ||
+ | but only three jobs per node, test 3x8=24 full 4th job pending & | ||
+ | mwgpu 17th job goes pending on n33, overloaded with -n 8 !!) | ||
+ | (not needed) --cpus-per-task=?, | ||
+ | (not needed) | ||
+ | |||
+ | |||
+ | GPU jobs with -N 1, -n 1 and -B 1:1:1 on test | ||
+ | no override options, no cuda export, cpus=12 (one gpu per node) | ||
+ | --cpus-per-gpu=1, | ||
+ | and --mem-per-gpu=7168, | ||
+ | required else all mem allocated!, max 4 jobs per node, | ||
+ | fills first node first...cuda export not needed) | ||
+ | with cuda export, same node, same gpu, | ||
+ | with " | ||
+ | |||
+ | |||
+ | GPU jobs with -N 1, -n 1 and -B 1:1:1 on mwgpu | ||
+ | --cpus-per-gpu=1, | ||
+ | and --mem-per-gpu=7168, | ||
+ | (same node, same gpu, cuda export set, | ||
+ | with " | ||
+ | potential for overloading!) | ||
+ | |||
</code>
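
The CR_CPU_Memory MPI result above (cpus=8 with only a --mem override) can be sketched as a submit script. This is a sketch only: the script body and binary name (my_mpi_app) are assumptions; the Slurm options are the ones tested in the notes.

<code bash>
#!/bin/bash
# Sketch of the CR_CPU_Memory MPI submission tested above;
# my_mpi_app is a hypothetical binary name.
#SBATCH --partition=test
#SBATCH -N 1              # one node
#SBATCH -n 8              # eight tasks
#SBATCH -B 2:4:1          # sockets:cores-per-socket:threads-per-core
#SBATCH --mem=1024        # without a memory request, all node memory is allocated

# under CR_CPU_Memory this yields cpus=8; --cpus-per-task and
# --ntasks-per-node overrides were not needed
srun ./my_mpi_app
</code>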
+ | |||