cluster:172

Differences

This shows you the differences between two versions of the page.

cluster:172 [2018/09/25 13:57]
hmeij07 [Finish]
cluster:172 [2018/09/26 13:16]
hmeij07
Line 34: Line 34:
 yum install kernel-devel kernel-headers (remove old headers after reboot)
 yum install gcc gcc-gfortran gcc-c++  # CHROOT done
 +
 +# /etc/modprobe.d/blacklist-nouveau.conf (new file by nvidia)
 +# reboot before driver installation # CHROOT done
 +blacklist nouveau
 +options nouveau modeset=0
 +
 +# new kernel initramfs, load
 +dracut --force
 +
 +reboot
 +
 +# Device files/dev/nvidia* exist with 0666 permissions?
 +# They were not 
 +/usr/local/src/nvidia-modprobe.sh
  
 # download runfiles from https://developer.nvidia.com/cuda-downloads
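
The added lines in this hunk ask whether the `/dev/nvidia*` device files exist with 0666 permissions. A minimal sketch of that check follows. The function name `check_mode` is hypothetical and not part of the original page, and the sketch assumes GNU `stat` as shipped with CentOS.

```shell
#!/bin/sh
# Hypothetical helper (not from the original page): report files that are
# missing or whose permission bits differ from the expected octal mode.
check_mode() {
    expected="$1"; shift
    rc=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            rc=1
            continue
        fi
        mode=$(stat -c '%a' "$f")   # GNU stat; prints the mode in octal
        if [ "$mode" != "$expected" ]; then
            echo "wrong mode $mode: $f"
            rc=1
        fi
    done
    return $rc
}

# Possible use on a GPU node. If no device files exist the glob stays
# literal and is reported as missing.
# check_mode 666 /dev/nvidia* || /usr/local/src/nvidia-modprobe.sh
```

The commented last line shows one way to fall back to the page's `nvidia-modprobe.sh` script only when the check fails.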
Line 54: Line 68:
 Install the CUDA 9.2 Samples?
 (y)es/(n)o/(q)uit: n
- 
-# /etc/modprobe.d/blacklist-nouveau.conf (new file by nvidia) 
-# reboot before driver installation # CHROOT done 
-blacklist nouveau 
-options nouveau modeset=0 
-reboot 
  
 # nvidia driver
-./cuda_name_of_runfile \-\-silent \-\-accept-eula driver
+./cuda_name_of_runfile -silent -driver
  
 # backup
Line 74: Line 82:
 [root@n37 src]#
 [root@n37 src]# scp n78:/etc/X11/xorg.conf /etc/X11/  # CHROOT done
- 
-# Device files/dev/nvidia* exist with 0666 permissions? 
-# They were not  
-/usr/local/src/nvidia-modprobe.sh 
- 
-# new kernel initramfs, load 
-dracut --force 
  
 # for mapd graphics support needs to be enabled
Line 371: Line 372:
 To do another node, the steps are
  
-  * add node in deploy.txtof n37.chroot/
+  * add node in deploy.txt of n36.chroot/  (centos 7.2)
   * ./deploy.txt `grep node_name deploy.txt`
   * scp in place passwd, shadow, group, hosts, fstab from global archive
Line 377: Line 378:
   * ONBOOT=no, ib0 ??? connectX mlx4_0 IB interface breaks in CentOS 7.3+
   * bootlocal=EXIT then reboot then check polkit user … screws up systemd-logind
 +
   * hostnamectl set-hostname node_name (logout/login)
-  * tar in place n37.chroot.ul.tar.gz in / FIRST
-  * REMOVE /usr/local/cuda-9.2 then uplink eth1 install kernel-devel, reboot
+  * eth1 on 129.133
+  * yum update
+  * yum install kernel-headers kernel-devel
+  * put n37 tarball in /, unpack, remove cuda-9.2
+  * reboot
   * Nvidia install: files in /usr/local/src
-    * sh cuda_name_of_runfile
-    * nvidia-modprobe.sh
+    * sh runfile
+    * reboot (nouveau)
+    * ./runfile -silent -driver
+    * reboot
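
The `reboot (nouveau)` step above presumably lets the nouveau blacklist take effect before the driver runfile is executed. As a hedged sketch, the check below confirms that nouveau is not loaded after that reboot. The function name `module_loaded` is hypothetical, and the optional second argument exists only so the check can be tried against a saved copy of `/proc/modules`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original page): succeed if the named
# kernel module appears in the module list, fail otherwise.
module_loaded() {
    name="$1"
    list="${2:-/proc/modules}"
    # /proc/modules has one module per line, with the name in field 1.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Possible use before running the driver install:
# if module_loaded nouveau; then
#     echo "nouveau is still loaded; check the blacklist file and rerun dracut --force"
# fi
```

Matching on the whole first field avoids false positives from modules whose names merely contain the string.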
  
  
 \\
 **[[cluster:0|Back]]**
cluster/172.txt · Last modified: 2020/07/15 17:52 by hmeij07