NGC Docker Containers

Trying to understand how to leverage GPU-ready applications from the Nvidia NGC web site (Nvidia GPU Cloud). Download docker containers and build your own on-premise catalog. I can't yet wrap my head around how to integrate containers with the Openlava scheduler.

# Assumes CentOS 7
# Assumes NVIDIA driver is installed as per nvidia-docker 1.0 requirements ( >= 340.29 )
# Install DOCKER
curl -fsSL https://get.docker.com/ | sudo sh
# Start DOCKER
sudo systemctl start docker
# Add dockeruser and put it in the docker group
sudo adduser dockeruser
sudo usermod -aG docker dockeruser
# Install NV-DOCKER
# GET NVIDIA-DOCKER
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
# INSTALL
sudo rpm -i /tmp/nvidia-docker*.rpm
# Start NV-DOCKER service
sudo systemctl start nvidia-docker

# Check that both services are running
systemctl status docker
systemctl status nvidia-docker

# fetch image and run command in container 
# then remove container, image remains

nvidia-docker run --rm nvidia/cuda nvidia-smi

# or just pull the image without running anything
docker pull nvidia/cuda
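
Before installing nvidia-docker it may help to confirm the driver version on the host. A minimal sketch, assuming GNU sort (for -V) and assuming 340.29 is the minimum from the nvidia-docker 1.0 requirements; MIN_DRIVER and version_ge are names made up for this example.

```shell
#!/bin/sh
# Sketch: check that the installed NVIDIA driver meets a minimum version.
# MIN_DRIVER is an assumption (nvidia-docker 1.0 requirements), adjust as needed.
MIN_DRIVER="340.29"

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V)
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query the driver when nvidia-smi is present on this host
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$MIN_DRIVER"; then
        echo "driver $driver is new enough"
    else
        echo "driver $driver is older than $MIN_DRIVER"
    fi
fi
```

On a host without nvidia-smi the script prints nothing, which is itself a hint that the driver is missing.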

Pull down other containers, for example from the Nvidia container registry (nvcr.io)

NGC Deep Learning Ready Docker Containers:
NVIDIA DIGITS - nvcr.io/nvidia/digits
TensorFlow - nvcr.io/nvidia/tensorflow
Caffe - nvcr.io/nvidia/caffe
NVIDIA CUDA - nvcr.io/nvidia/cuda (9.2, 10.1, 10.0)
PyTorch - nvcr.io/nvidia/pytorch
RapidsAI - nvcr.io/nvidia/rapidsai/rapidsai

Additional Docker Images:
Portainer Docker Management - portainer/portainer

# in the catalog you can also find 
docker pull nvcr.io/hpc/gromacs:2018.2
docker pull nvcr.io/hpc/lammps:24Oct2018
docker pull nvcr.io/hpc/namd:2.13-multinode
docker pull nvcr.io/partners/matlab:r2019b
# not all of these are at the latest versions
# and amber would have to be custom built on top of nvidia/cuda
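
To build the on-premise catalog in one go, the pulls above could be looped. A rough sketch, not a tested procedure: IMAGES, DRY_RUN and pull_catalog are names invented here, and nvcr.io may require a `docker login nvcr.io` with an NGC API key before the pulls succeed.

```shell
#!/bin/sh
# Sketch: pull a list of catalog images. Image tags are copied from the
# notes above; IMAGES, DRY_RUN and pull_catalog are hypothetical names.
IMAGES="nvcr.io/hpc/gromacs:2018.2
nvcr.io/hpc/lammps:24Oct2018
nvcr.io/hpc/namd:2.13-multinode"

# pull_catalog: print the pull commands by default, run them when DRY_RUN=0
pull_catalog() {
    for image in $IMAGES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "docker pull $image"
        else
            # keep going if one image fails, but report it
            docker pull "$image" || echo "failed: $image" >&2
        fi
    done
}
```

Calling `pull_catalog` only prints the commands; `DRY_RUN=0 pull_catalog` actually pulls the images.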

Make GPUs available to the container and set some options

# DIGITS example, exposing host GPUs 0 and 1 to the container
# if you passed GPU IDs 2,3 instead, the container would still see them as IDs 0,1
NV_GPU=0,1 nvidia-docker run --name digits -d -p 5000:5000 nvidia/digits

# list running containers
nvidia-docker ps
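
The GPU renumbering described above can be illustrated with a few lines of shell. This is only a sketch: gpu_map is a made-up helper, and it assumes the host IDs are listed in ascending order, so that the container numbers them 0, 1, ... in the same order.

```shell
#!/bin/sh
# Sketch: show how a host GPU list given in NV_GPU would be renumbered
# inside the container. gpu_map is a hypothetical helper, and the mapping
# assumes the IDs are given in ascending order.
gpu_map() {
    idx=0
    old_ifs=$IFS
    IFS=','                      # split the argument on commas
    for host_id in $1; do
        echo "host GPU $host_id -> container GPU $idx"
        idx=$((idx + 1))
    done
    IFS=$old_ifs
}

gpu_map "2,3"
# host GPU 2 -> container GPU 0
# host GPU 3 -> container GPU 1
```

Something like this could be handy when a scheduler hands a job a set of host GPU IDs and the application inside the container expects IDs starting at 0.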



cluster/187.1576002394.txt.gz · Last modified: 2019/12/10 13:26 by hmeij07