cluster:172

Differences

This shows you the differences between two versions of the page.

cluster:172 [2018/08/23 18:33]
hmeij07
cluster:172 [2018/09/26 15:21]
hmeij07 [mapd]
Line 33: Line 33:
 yum update kernel kernel-tools kernel-tools-libs yum update kernel kernel-tools kernel-tools-libs
 yum install kernel-devel kernel-headers (remove old headers after reboot) yum install kernel-devel kernel-headers (remove old headers after reboot)
-yum install gcc gcc-gfortran gcc-c++  (done in CHROOT)+yum install gcc gcc-gfortran gcc-c++  CHROOT done 
 + 
 +# /etc/modprobe.d/blacklist-nouveau.conf (new file by nvidia) 
 +# reboot before driver installation # CHROOT done 
 +blacklist nouveau 
 +options nouveau modeset=0 
 + 
 +# new kernel initramfs, load 
 +dracut --force 
 + 
 +reboot 
  
 # download runfiles from https://developer.nvidia.com/cuda-downloads # download runfiles from https://developer.nvidia.com/cuda-downloads
 # files in /usr/local/src # files in /usr/local/src
-sh cuda_name_of_runfile +sh cuda_9.2.148_396.37_linux.run 
-sh cuda_name_of_runfile_patch+
  
 Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26? Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.26?
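The blacklist step in this hunk can be sanity-checked before the reboot. The sketch below assumes the file path shown above; `check_nouveau_blacklist` is a hypothetical helper name, not something the NVIDIA runfile provides.

```shell
# Sketch only: confirm a modprobe file carries both nouveau lines.
# check_nouveau_blacklist is a hypothetical helper, not part of the NVIDIA installer.
check_nouveau_blacklist() {
    conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    [ -r "$conf" ] || { echo "missing: $conf"; return 1; }
    # Tolerate leading/trailing whitespace around each directive.
    grep -Eq '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$conf" \
        || { echo "no 'blacklist nouveau' line in $conf"; return 1; }
    grep -Eq '^[[:space:]]*options[[:space:]]+nouveau[[:space:]]+modeset=0[[:space:]]*$' "$conf" \
        || { echo "no 'options nouveau modeset=0' line in $conf"; return 1; }
    echo "ok: $conf"
}
```

Run it without arguments on the node. After `dracut --force` and the reboot, `lsmod | grep nouveau` should print nothing if the blacklist took effect.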
Line 54: Line 65:
 Install the CUDA 9.2 Samples? Install the CUDA 9.2 Samples?
 (y)es/(n)o/(q)uit: n (y)es/(n)o/(q)uit: n
- 
-# /etc/modprobe.d/blacklist-nouveau.conf 
-# reboot before driver installation - done in CHROOT
-blacklist nouveau 
-options nouveau modeset=0 
-reboot 
  
 # nvidia driver # nvidia driver
 ./cuda_name_of_runfile -silent -driver ./cuda_name_of_runfile -silent -driver
 +
 +# Device files /dev/nvidia* exist with 0666 permissions?
 +# They were not 
 +/usr/local/src/nvidia-modprobe.sh
  
 # backup # backup
 [root@n37 src]# rpm -qf /usr/lib/libGL.so [root@n37 src]# rpm -qf /usr/lib/libGL.so
 file /usr/lib/libGL.so is not owned by any package file /usr/lib/libGL.so is not owned by any package
-cp /usr/lib/libGL.so /usr/lib/libGL.so-nvidia+cp /usr/lib/libGL.so   /usr/lib/libGL.so-nvidia 
 +cp /usr/lib/libGL.so.1 /usr/lib/libGL.so.1-nvidia
  
 [root@n37 src]# ls /etc/X11/xorg.conf [root@n37 src]# ls /etc/X11/xorg.conf
Line 73: Line 83:
 [root@n37 src]# find /usr/local/cuda-9.2 -name nvidia-xconfig* [root@n37 src]# find /usr/local/cuda-9.2 -name nvidia-xconfig*
 [root@n37 src]# [root@n37 src]#
-[root@n37 src]# scp n78:/etc/X11/xorg.conf /etc/X11/  - done in CHROOT+[root@n37 src]# scp n78:/etc/X11/xorg.conf /etc/X11/  CHROOT done
  
-Device files /dev/nvidia* exist with 0666 permissions? +# for mapd, graphics support needs to be enabled
-# They were not  +nvidia-smi --gom=0 
-/usr/local/src/nvidia-modprobe.sh+# have left persistence and exclusivity at defaults for now
  
-# new kernel initramfs, load 
-dracut --force 
 reboot reboot
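The "0666 permissions?" question in this hunk can be answered with a small check after the reboot. This is a sketch under assumptions: `check_dev_modes` is a hypothetical helper, and it only reports problems. Fixing them is left to the `/usr/local/src/nvidia-modprobe.sh` script mentioned above, whose contents are not shown on this page.

```shell
# Sketch only: report any given path that is missing or not mode 0666.
# check_dev_modes is a hypothetical helper name.
check_dev_modes() {
    rc=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev"; rc=1
        # find prints the path only when the permission bits are exactly 0666
        elif [ -z "$(find "$dev" -prune -perm 0666)" ]; then
            echo "not 0666: $dev"; rc=1
        fi
    done
    return $rc
}
# Usage on a GPU node: check_dev_modes /dev/nvidia*
```

If the glob matches nothing, the shell passes the literal pattern through and the helper reports it as missing, which is the "device files do not exist" case.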
  
Line 158: Line 166:
 ** Finish ** ** Finish **
  
-  * yum install freeglut-devel libX11-devel libXi-devel libXmu-devel \ make mesa-libGLU-devel+  * yum install freeglut-devel libX11-devel libXi-devel libXmu-devel \ make mesa-libGLU-devel # CHROOT done 
 +  * yum install blas blas-devel lapack lapack-devel # CHROOT done
   * check for /usr/lib64/libvdpau_nvidia.so   * check for /usr/lib64/libvdpau_nvidia.so
 +
   * [root@n37 /]# tar -cvf /tmp/n37.chroot.ul.tar usr/local   * [root@n37 /]# tar -cvf /tmp/n37.chroot.ul.tar usr/local
   * [root@n37 /]# scp /tmp/n37.chroot.ul.tar sms_server:/var/chroots/goldimages/   * [root@n37 /]# scp /tmp/n37.chroot.ul.tar sms_server:/var/chroots/goldimages/
Line 168: Line 178:
  
 <code> <code>
-# As root check requirements +# As root check requirements # CHROOT done
-rpm -qa | grep ^gcc +
-rpm -qa | grep ^g+++
 rpm -qa | grep ^flex rpm -qa | grep ^flex
 rpm -qa | grep ^tcsh rpm -qa | grep ^tcsh
Line 188: Line 196:
 rpm -qa | grep ^bison rpm -qa | grep ^bison
  
-# As root install missing +# As root install missing # CHROOT done 
-yum install flex bzip2-devel libXdmcp zlib zlib-devel +# CHROOT done
-yum install tkinter openmpi perl-ExtUtils-MakeMaker patch bison+
  
 </code> </code>
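The `rpm -qa | grep ^name` checks above can be collapsed into one pass. The sketch below is an assumption-laden convenience, not part of the original procedure: `missing_pkgs` is a hypothetical helper that reads installed package names on stdin and prints each required name that is absent. Unlike `grep ^flex`, it matches names exactly, so `bison-devel` does not count as `bison`.

```shell
# Sketch only: print required package names that are not installed.
# missing_pkgs is a hypothetical helper; installed names arrive on stdin, one per line.
missing_pkgs() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || echo "$pkg"
    done
}
# Usage on the node (package names taken from this page):
# rpm -qa --qf '%{NAME}\n' | missing_pkgs flex tcsh bison
```

Anything it prints is a candidate for the "As root install missing" step.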
Line 297: Line 304:
  
 </code> </code>
 +
 ==== Lammps ==== ==== Lammps ====
  
 As root install As root install
  
-  * yum install libjpeg libjpeg-devel libjpeg-turbo libjpeg-turbo-devel  +  * yum install libjpeg libjpeg-devel libjpeg-turbo libjpeg-turbo-devel # CHROOT done 
-  * yum install blas blas-devel lapack lapack-devel boost boost-devel+  * yum install blas blas-devel lapack lapack-devel boost boost-devel # CHROOT done
  
 For Lammps-22Aug18 I followed the top installation instructions at this page For Lammps-22Aug18 I followed the top installation instructions at this page
Line 311: Line 319:
  
   * to stay with openmpi-1.8.4 (not mpich3...)   * to stay with openmpi-1.8.4 (not mpich3...)
-  * consulting the ARCH web page I chose -arch=sm_35+  * consulting the ARCH web page I chose -arch=sm_35 (on n37 for K20)
  
-Good thing we're doing this now: future versions of CUDA will not support the K20s anymore. In fact, that web site does not mention them, only the K40/K80 GPUs. So we'll see what testing reveals.  Please double-check results against previous runs. Compile as a regular user and stage lmp_mpi in /usr/local/lammps-22Aug10/+Good thing we're doing this now: future versions of CUDA will not support the K20s anymore. In fact, that web site does not mention them, only the K40/K80 GPUs. So we'll see what testing reveals.  Please double-check results against previous runs. Compile as a regular user and stage lmp_mpi in /usr/local/lammps-22Aug18/
  
 <code> <code>
Line 319: Line 327:
 [hmeij@n37 src]$ ll /usr/local/lammps-22Aug18/ [hmeij@n37 src]$ ll /usr/local/lammps-22Aug18/
 total 104356 total 104356
--rwxr-xr-x 1 hmeij its 35739800 Aug 23 08:49 lmp_mpi-double-double-with-cuda +-rwxr-xr-x 1 hmeij its 35739800 Aug 23 08:49 lmp_mpi-double-double-with-gpu 
--rwxr-xr-x 1 hmeij its 35555672 Aug 23 09:11 lmp_mpi-single-double-with-cuda +-rwxr-xr-x 1 hmeij its 35555672 Aug 23 09:11 lmp_mpi-single-double-with-gpu 
--rwxr-xr-x 1 hmeij its 35559552 Aug 23 09:53 lmp_mpi-single-single-with-cuda+-rwxr-xr-x 1 hmeij its 35559552 Aug 23 09:53 lmp_mpi-single-single-with-gpu
  
 </code> </code>
Line 344: Line 352:
   javapackages-tools libxslt \   javapackages-tools libxslt \
   lksctp-tools python-javapackages \   lksctp-tools python-javapackages \
-  python-lxml tzdata-java +  python-lxml tzdata-java  nfs-utils 
-  mapd  +  # CHROOT done 
-  # n37:/usr/local/src+ 
 +yum install mapd   # n37:/usr/local/src
  
 # User specific aliases and functions # User specific aliases and functions
Line 359: Line 368:
 ==== Finish ==== ==== Finish ====
  
-  * Make the final tar file for /usr/local and post with CHROOT +  * Make the final tar file for /usr/local and post with CHROOT # done 
-  * Install all the packages of this page in CHROOT+  * Install all the packages of this page in CHROOT # marked done 
 + 
 + 
 +To do another node, the steps are 
 + 
 +  * add node in deploy.txt of n36.chroot/  (centos 7.2) 
 +  * ./deploy.txt `grep node_name deploy.txt` 
 +  * scp in place passwd, shadow, group, hosts, fstab from global archive 
 +  * umount -a 
 +  * ONBOOT=no, ib0 ??? connectX mlx4_0 IB interface breaks in CentOS 7.3+ 
 +  * bootlocal=EXIT then reboot then check polkit user … screws up systemd-logind 
 + 
 +  * hostnamectl set-hostname node_name (logout/login) 
 +  * eth1 on 129.133 
 +  * yum update 
 +  * yum install kernel-headers kernel-devel 
 +  * put n37 tarball in /, unpack 
 +  * remove cuda-9.2 
 + 
 +  * Nvidia install: files in /usr/local/src 
 +    * remove nouveau 
 +    * sh runfile 
 +    * ./runfile -silent -driver 
 +    * install all CHROOT done packages 
 +    * reboot 
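For the "install all CHROOT done packages" step, one way to collect the package names is to scan a saved copy of this page for single-line `yum install` entries marked `CHROOT done`. This is a rough sketch: `chroot_done_pkgs` and the saved-page path are hypothetical, and multi-line `yum install` commands that use backslash continuations are not picked up.

```shell
# Sketch only: list packages from 'yum install ... CHROOT done' lines in a text file.
# chroot_done_pkgs is a hypothetical helper; $1 is a saved copy of the page text.
chroot_done_pkgs() {
    grep 'yum install' "$1" | grep 'CHROOT done' |
        sed -e 's/.*yum install//' \
            -e 's/#.*//' \
            -e 's/CHROOT done.*//' \
            -e 's/\\//g' |
        tr -s ' \t' '\n\n' | grep -v '^$' | LC_ALL=C sort -u
}
# Usage (the path is an assumption): chroot_done_pkgs /tmp/cluster172.txt
```

Review the printed list by hand before passing it to `yum install`, since the markers on this page were added informally.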
 + 
 + 
 \\ \\
 **[[cluster:0|Back]]** **[[cluster:0|Back]]**
cluster/172.txt · Last modified: 2020/07/15 17:52 by hmeij07