
NGC Docker Containers

Trying to understand how to leverage GPU-ready applications on the Nvidia NGC (Nvidia GPU Cloud) web site: download docker containers, build your own on-premise catalog, and run GPU-ready software on compute nodes with docker containers. I can't wrap my head around the problem of how to integrate containers with our scheduler yet.

# get docker on centos 7
curl -fsSL -o

# systemctl
systemctl enable docker
systemctl start docker

# dockeruser, usermod change
adduser dockeruser
usermod -aG docker dockeruser
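A quick sanity check after the usermod; a sketch assuming the dockeruser account created above:

```shell
# sketch: confirm the account created above is in the docker group
if id -nG dockeruser 2>/dev/null | grep -qw docker; then
  echo "dockeruser can run docker without sudo"
else
  echo "dockeruser is not in the docker group (or does not exist)"
fi
```

Note the group change only takes effect for new login sessions.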

# get nvidia-docker
# wget -P /tmp

# rpm -i /tmp/nvidia-docker*.rpm
# make nvidia-docker

# systemctl
systemctl enable nvidia-docker
systemctl start nvidia-docker

systemctl status docker
systemctl status nvidia-docker

# fetch image and run command in container 
# then remove container, image remains

nvidia-docker run --rm nvidia/cuda nvidia-smi

# or 

docker pull nvidia/cuda
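When pulling, it can help to pin an explicit tag rather than rely on the implicit `latest`. A sketch (the `10.0-base` tag is an assumption; check the registry for current tags), guarded so it no-ops on hosts without a running docker daemon:

```shell
# sketch: pull a specific CUDA tag instead of the implicit "latest"
# (10.0-base is a placeholder tag; check the registry for current ones)
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker pull nvidia/cuda:10.0-base
  docker images nvidia/cuda
else
  echo "no usable docker daemon on this host"
fi
```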

Pull down other containers, for example from the Nvidia NGC Container Registry (

NGC Deep Learning Ready Docker Containers:

TensorFlow -
Caffe -
NVIDIA CUDA - (9.2, 10.1, 10.0)
PyTorch -
RapidsAI -

Additional Docker Images:
Portainer Docker Management - portainer/portainer

# in the catalog you can also find 

docker pull
docker pull
docker pull
docker pull

# not all are at the latest versions, as you can see
# and Amber would have to be custom built on top of nvidia/cuda

Make GPUs available to the container and apply some settings

# DIGITS example
# if you passed host GPU ID 2,3 the container would still see the GPUs as ID 0,1

NV_GPU=0,1 nvidia-docker run --name digits -d -p 5000:5000 nvidia/digits

# list containers running
nvidia-docker ps

There are some other issues…

  • inside the container the user-invoked application runs as root, so copying files back and forth is a problem
  • file systems such as home directories and scratch spaces need to be mounted inside the container
  • GPUs need to be reserved via the scheduler on a host and then made available to the container (see above)
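The issues above can be addressed with flags in a single invocation; a sketch combining the -u, -v and NV_GPU settings noted elsewhere on this page (the GPU IDs and the /tmp/$USER mount point are examples, not tested settings):

```shell
# sketch: one run command addressing all three issues
#  -u      run as the calling user instead of root
#  -v      mount the caller's home directory inside the container
#  NV_GPU  expose only the GPUs the scheduler granted (here 0,1)
# guarded so the snippet no-ops on hosts without nvidia-docker + GPUs
if command -v nvidia-docker >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
  NV_GPU=0,1 nvidia-docker run --rm \
    -u "$(id -u):$(id -g)" \
    -v "$HOME:/tmp/$USER" \
    nvidia/cuda nvidia-smi
else
  echo "would run as $(id -u):$(id -g) with $HOME mounted at /tmp/$USER"
fi
```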

Some notes from

# NGC containers are hosted in a repository called
# A Docker container is the running instance of a Docker image.

# All NGC Container images are based on the CUDA platform layer (

# mount host directory to container location

-v $HOME:/tmp/$USER

# pull images

docker pull
docker images

# detailed information about a container


# specifying a user

-u $(id -u):$(id -g)

# allocate GPUs

NV_GPU=0,1 nvidia-docker run ...

# custom-built images ...
# looks complex, based on commands in a Dockerfile config file
# see link 
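A custom build might look roughly like this; a hypothetical sketch (the base tag and yum packages are placeholders, not a tested Amber build) that only writes the Dockerfile:

```shell
# hypothetical sketch: a custom image layered on top of nvidia/cuda
# (the tag and yum packages are placeholders, not a tested Amber build)
cat > Dockerfile.sketch <<'EOF'
FROM nvidia/cuda:10.0-devel-centos7
RUN yum install -y gcc gcc-gfortran make && yum clean all
COPY app/ /opt/app/
WORKDIR /opt/app
EOF
# on a host with docker, build it with:
#   docker build -t local/app:sketch -f Dockerfile.sketch .
```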


cluster/187.txt · Last modified: 2020/08/17 12:01 by hmeij07