cluster:172

Differences

This shows you the differences between two versions of the page.

cluster:172 [2018/08/23 14:02]
hmeij07 [Gromacs]
cluster:172 [2018/09/25 13:57]
hmeij07 [Finish]
Line 20: Line 20:
   * unmount NFS mounts while installing nvidia as root   * unmount NFS mounts while installing nvidia as root
   * install other software as regular user    * install other software as regular user 
-  *  
 ==== Nvidia ==== ==== Nvidia ====
  
Line 34: Line 33:
 yum update kernel kernel-tools kernel-tools-libs yum update kernel kernel-tools kernel-tools-libs
 yum install kernel-devel kernel-headers (remove old headers after reboot) yum install kernel-devel kernel-headers (remove old headers after reboot)
-yum install gcc gcc-devel gcc-gfortran gcc-c+++yum install gcc gcc-gfortran gcc-c++  # CHROOT done
  
 # download runfiles from https://developer.nvidia.com/cuda-downloads # download runfiles from https://developer.nvidia.com/cuda-downloads
-sh cuda_name_of_runfile +# files in /usr/local/src 
-sh cuda_name_of_runfile_patch+sh cuda_9.2.148_396.37_linux.run 
  
 Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26? Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26?
Line 55: Line 55:
 (y)es/(n)o/(q)uit: n (y)es/(n)o/(q)uit: n
  
-#/etc/modprobe.d/blacklist-nouveau.confreboot before driver instllation+# /etc/modprobe.d/blacklist-nouveau.conf (new file by nvidia) 
 +reboot before driver installation # CHROOT done
 blacklist nouveau blacklist nouveau
 options nouveau modeset=0 options nouveau modeset=0
Line 61: Line 62:
  
 # nvidia driver # nvidia driver
-./cuda_name_of_runfile -silent -driver+./cuda_name_of_runfile \-\-silent \-\-accept-eula driver
  
 # backup # backup
Line 72: Line 73:
 [root@n37 src]# find /usr/local/cuda-9.2 -name nvidia-xconfig* [root@n37 src]# find /usr/local/cuda-9.2 -name nvidia-xconfig*
 [root@n37 src]# [root@n37 src]#
-[root@n37 src]# scp n78:/etc/X11/xorg.conf /etc/X11/+[root@n37 src]# scp n78:/etc/X11/xorg.conf /etc/X11/  # CHROOT done
  
 # Device files /dev/nvidia* exist with 0666 permissions? # Device files /dev/nvidia* exist with 0666 permissions?
Line 80: Line 81:
 # new kernel initramfs, load # new kernel initramfs, load
 dracut --force dracut --force
 +
 +# for mapd graphics support needs to be enabled
 +nvidia-smi --gom=0
 +# have left persistence and exclusivity at defaults for now
 +
 reboot reboot
  
Line 157: Line 163:
 ** Finish ** ** Finish **
  
-  * yum install freeglut-devel libX11-devel libXi-devel libXmu-devel \ make mesa-libGLU-devel+  * yum install freeglut-devel libX11-devel libXi-devel libXmu-devel \ make mesa-libGLU-devel # CHROOT done
   * check for /usr/lib64/libvdpau_nvidia.so   * check for /usr/lib64/libvdpau_nvidia.so
   * [root@n37 /]# tar -cvf /tmp/n37.chroot.ul.tar usr/local   * [root@n37 /]# tar -cvf /tmp/n37.chroot.ul.tar usr/local
Line 167: Line 173:
  
 <code> <code>
-# As root check requirements+# As root check requirements # CHROOT done
 rpm -qa | grep ^gcc rpm -qa | grep ^gcc
 rpm -qa | grep ^g++ rpm -qa | grep ^g++
Line 187: Line 193:
 rpm -qa | grep ^bison rpm -qa | grep ^bison
  
-# As root install missing+# As root install missing # CHROOT done
 yum install flex bzip2-devel libXdmcp zlib zlib-devel yum install flex bzip2-devel libXdmcp zlib zlib-devel
 yum install tkinter openmpi perl-ExtUtils-MakeMaker patch bison yum install tkinter openmpi perl-ExtUtils-MakeMaker patch bison
Line 274: Line 280:
 As root install As root install
  
-  * yum install cmake+  * cmake, latest version, never understand why so far ahead of distro...
  
 Download and extract source. Using same environment as Amber compilation. Download and extract source. Using same environment as Amber compilation.
 +
 +<code>
 +
 + cd gromacs-2018/
 + mkdir build
 + cd build
 +
 + which mpicc mpicxx
 +/share/apps/CENTOS6/openmpi/1.8.4/bin/mpicc
 +/share/apps/CENTOS6/openmpi/1.8.4/bin/mpicxx
 +
 + CC=mpicc CXX=mpicxx \
 +   /share/apps/CENTOS7/cmake/3.12.1/bin/cmake .. \
 +  -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-2018 \
 +  -DGMX_BUILD_OWN_FFTW=ON -DGMX_MPI=ON -DGMX_GPU=ON
 + CC=mpicc CXX=mpicxx make
 + CC=mpicc CXX=mpicxx make install
 +
 +</code>
 +
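
The Gromacs configure step above depends on the MPI compiler wrappers being found on PATH (the `which mpicc mpicxx` check). A minimal sketch of that check as a reusable function follows; `require_tools` is a hypothetical helper name, not something from the original notes, and it only reports what is or is not on PATH.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original page):
# report each tool found on PATH and return non-zero if any is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use before running cmake (tool names as used in the build above):
#   require_tools mpicc mpicxx make || exit 1
```

Stopping early here avoids a cmake run that silently picks up a different compiler than the openmpi wrappers.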
 ==== Lammps ==== ==== Lammps ====
  
 As root install As root install
  
-  * yum install libjpeg libjpeg-devel libjpeg-turbo libjpeg-turbo-devel  +  * yum install libjpeg libjpeg-devel libjpeg-turbo libjpeg-turbo-devel # CHROOT done 
-  * yum install blas blas-devel lapack lapack-devel boost boost-devel+  * yum install blas blas-devel lapack lapack-devel boost boost-devel # CHROOT done
  
 For Lammps-22Aug18 I followed the top installation instructions at this page For Lammps-22Aug18 I followed the top installation instructions at this page
Line 291: Line 317:
  
   * to stay with openmpi-1.8.4 (not mpich3...)   * to stay with openmpi-1.8.4 (not mpich3...)
-  * consulting the ARCH web page I choose -arch=sm_35+  * consulting the ARCH web page I choose -arch=sm_35 (on n37 for K20)
  
-Good thing we're doing this now, future versions of CUDA will not support the K20s anymore. In fact on that web site they are not mentioned, only the K40/K80 gpus. So we'll see what testing reveals.  Please double check results against previous runs. Compile as regular user and stage lmp_mpi in /usr/local/lammps-22Aug10/lmp-mpi-presision1precision2-with-cuda...+Good thing we're doing this now, future versions of CUDA will not support the K20s anymore. In fact on that web site they are not mentioned, only the K40/K80 gpus. So we'll see what testing reveals.  Please double check results against previous runs. Compile as regular user and stage lmp_mpi in /usr/local/lammps-22Aug18/
  
 <code> <code>
Line 299: Line 325:
 [hmeij@n37 src]$ ll /usr/local/lammps-22Aug18/ [hmeij@n37 src]$ ll /usr/local/lammps-22Aug18/
 total 104356 total 104356
--rwxr-xr-x 1 hmeij its 35739800 Aug 23 08:49 lmp_mpi-double-double-with-cuda +-rwxr-xr-x 1 hmeij its 35739800 Aug 23 08:49 lmp_mpi-double-double-with-gpu 
--rwxr-xr-x 1 hmeij its 35555672 Aug 23 09:11 lmp_mpi-single-double-with-cuda +-rwxr-xr-x 1 hmeij its 35555672 Aug 23 09:11 lmp_mpi-single-double-with-gpu 
--rwxr-xr-x 1 hmeij its 35559552 Aug 23 09:53 lmp_mpi-single-single-with-cuda+-rwxr-xr-x 1 hmeij its 35559552 Aug 23 09:53 lmp_mpi-single-single-with-gpu
  
 </code> </code>
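
The listing above suggests a naming convention of `lmp_mpi-<precision>-<precision>-with-gpu` for the staged binaries. A small sketch of building those names consistently is below; `staged_name` is a hypothetical helper, and the set of accepted precision pairs is taken only from the three files shown in the listing.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original page):
# build the staged binary name from the two precision fields.
staged_name() {
    case "$1-$2" in
        single-single|single-double|double-double)
            echo "lmp_mpi-$1-$2-with-gpu" ;;
        *)
            echo "unknown precision pair: $1-$2" >&2
            return 1 ;;
    esac
}

# Possible use after each compile, as regular user:
#   cp lmp_mpi "/usr/local/lammps-22Aug18/$(staged_name single double)"
```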
  
 ==== mapd ==== ==== mapd ====
 +
 +  * https://www.mapd.com/docs/latest/4_centos7-yum-gpu-ce-recipe.html
 +
 +<code>
 +
 +useradd -U mapd
 +
 +# mapd.repo
 +[mapd-ce-cuda]
 +name=mapd ce - cuda
 +baseurl=https://releases.mapd.com/ce/yum/stable/cuda
 +gpgcheck=1
 +gpgkey=https://releases.mapd.com/GPG-KEY-mapd
 +
 +yum  install \
 +  copy-jdk-configs java-1.8.0-openjdk-headless \
 +  javapackages-tools libxslt \
 +  lksctp-tools python-javapackages \
 +  python-lxml tzdata-java  # CHROOT done
 +
 +yum install mapd   # n37:/usr/local/src
 +
 +# User specific aliases and functions
 +export MAPD_USER=mapd
 +export MAPD_GROUP=mapd
 +export MAPD_STORAGE=/var/lib/mapd
 +export MAPD_PATH=/opt/mapd
 +# The $MAPD_STORAGE directory must be dedicated to MapD
 +
 +</code>
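
Since the MapD setup relies on the four exported variables above, a quick sanity check before starting anything may help. The sketch below assumes only those variable names; `check_mapd_env` is a hypothetical function, and it does not verify that the storage directory is actually dedicated to MapD, only that it exists.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original page):
# confirm the MAPD_* variables are set and MAPD_STORAGE is a directory.
check_mapd_env() {
    mapd_rc=0
    for name in MAPD_USER MAPD_GROUP MAPD_STORAGE MAPD_PATH; do
        # Indirect lookup of the variable named in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name" >&2
            mapd_rc=1
        fi
    done
    if [ -n "${MAPD_STORAGE:-}" ] && [ ! -d "$MAPD_STORAGE" ]; then
        echo "not a directory: $MAPD_STORAGE" >&2
        mapd_rc=1
    fi
    return "$mapd_rc"
}

# Possible use after sourcing the exports above:
#   check_mapd_env || echo "fix the MapD environment first"
```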
 +
 +==== Finish ====
 +
 +  * Make the final tar file for /usr/local and post with CHROOT # done
 +  * Install all the packages of this page in CHROOT # marked done
 +
 +
 +To do another node, the steps are
 +
 +  * add node in deploy.txt of n37.chroot/
 +  * ./deploy.txt `grep node_name deploy.txt`
 +  * scp in place passwd, shadow, group, hosts, fstab from global archive
 +  * umount -a
 +  * ONBOOT=no, ib0 ??? connectX mlx4_0 IB interface breaks in CentOS 7.3+
 +  * bootlocal=EXIT then reboot then check polkit user … screws up systemd-logind
 +  * hostnamectl set-hostname node_name (logout/login)
 +  * tar in place n37.chroot.ul.tar.gz in / FIRST
 +  * REMOVE /usr/local/cuda-9.2 then uplink eth1 install kernel-devel
 +  * Nvidia install: files in /usr/local/src
 +    * sh cuda_name_of_runfile
 +    * nvidia-modprobe.sh
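
After the Nvidia install step on a new node, the earlier check on this page (device files /dev/nvidia* exist with 0666 permissions) could be scripted roughly as follows. `check_mode` is a hypothetical helper, and the sketch assumes GNU coreutils `stat` (as shipped with CentOS) for the `-c '%a'` format.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original page).
# $1 = expected octal mode (e.g. 666), remaining arguments = files to check.
# Assumes GNU stat; returns non-zero if a file is missing or has another mode.
check_mode() {
    expected=$1
    shift
    bad=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev" >&2
            bad=1
            continue
        fi
        mode=$(stat -c '%a' "$dev")
        if [ "$mode" != "$expected" ]; then
            echo "mode $mode (want $expected): $dev" >&2
            bad=1
        fi
    done
    return "$bad"
}

# Possible use after the driver install and reboot:
#   check_mode 666 /dev/nvidia*
```

If no device files exist, the unmatched glob is passed through literally and reported as missing, so the check fails rather than passing on an empty list.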
  
  
 \\ \\
 **[[cluster:0|Back]]** **[[cluster:0|Back]]**
cluster/172.txt · Last modified: 2020/07/15 17:52 by hmeij07