cluster:172

Differences

This shows you the differences between two versions of the page.

cluster:172 [2018/08/22 11:55]
hmeij07 [Nvidia]
cluster:172 [2018/08/22 13:13]
hmeij07
Line 17: Line 17:
   * copy passwd, shadow, group, hosts, fstab from global archive
   * check polkit user ... screws up systemd-logind
-  * connextX mlx4_0 IB interface breaks in CentOS 7.3+
 +  * connectX mlx4_0 IB interface breaks in CentOS 7.3+
 +  * unmount NFS mounts while installing nvidia as root 
 +  * install other software as regular user 
 ==== Nvidia ====
 +
 +** Installation **
  
 <code>
Line 29: Line 33:
 yum update kernel kernel-tools kernel-tools-libs
 yum install kernel-devel kernel-headers (remove old headers after reboot)
-yum install gcc gcc-devel g++ g++-devel
 +yum install gcc gcc-devel gcc-gfortran
  
 # download runfiles from https://developer.nvidia.com/cuda-downloads
Line 82: Line 86:
   * export PATH=/usr/local/cuda/bin:$PATH
   * export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
 +
 +**Verification**
 +
 +<code>
 +
 +[root@n37 cuda-9.2]# /usr/local/cuda/extras/demo_suite/deviceQuery 
 +/usr/local/cuda/extras/demo_suite/deviceQuery Starting...          
 +
 + CUDA Device Query (Runtime API) version (CUDART static linking)
 +
 +Detected 4 CUDA Capable device(s)
 +
 +Device 0: "Tesla K20m"
 +  CUDA Driver Version / Runtime Version          9.2 / 9.2
 +  CUDA Capability Major/Minor version number:    3.5 
 +...
 +> Peer access from Tesla K20m (GPU0) -> Tesla K20m (GPU1) : Yes
 +> Peer access from Tesla K20m (GPU0) -> Tesla K20m (GPU2) : No
 +> Peer access from Tesla K20m (GPU0) -> Tesla K20m (GPU3) : No
 +> Peer access from Tesla K20m (GPU1) -> Tesla K20m (GPU0) : Yes
 +> Peer access from Tesla K20m (GPU1) -> Tesla K20m (GPU2) : No
 +> Peer access from Tesla K20m (GPU1) -> Tesla K20m (GPU3) : No
 +> Peer access from Tesla K20m (GPU2) -> Tesla K20m (GPU0) : No
 +> Peer access from Tesla K20m (GPU2) -> Tesla K20m (GPU1) : No
 +> Peer access from Tesla K20m (GPU2) -> Tesla K20m (GPU3) : Yes
 +> Peer access from Tesla K20m (GPU3) -> Tesla K20m (GPU0) : No
 +> Peer access from Tesla K20m (GPU3) -> Tesla K20m (GPU1) : No
 +> Peer access from Tesla K20m (GPU3) -> Tesla K20m (GPU2) : Yes
 +
 +deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, 
 +  CUDA Runtime Version = 9.2, NumDevs = 4, 
 +  Device0 = Tesla K20m, Device1 = Tesla K20m, 
 +  Device2 = Tesla K20m, Device3 = Tesla K20m
 +Result = PASS
 +
 +</code>
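
The deviceQuery run above can also be checked non-interactively, for example when imaging several nodes. The sketch below assumes the output keeps the "Detected N CUDA Capable device(s)" and "Result = PASS" lines shown above; the function name check_device_query is hypothetical.

```shell
# Hypothetical helper: read deviceQuery output on stdin and verify
# the device count and the final PASS line.
# $1 = expected number of GPUs (4 on the node shown above)
check_device_query() {
    expected=$1
    output=$(cat)
    # Pull N out of "Detected N CUDA Capable device(s)"
    count=$(printf '%s\n' "$output" |
        sed -n 's/^Detected \([0-9][0-9]*\) CUDA Capable device.*/\1/p')
    if [ "$count" != "$expected" ]; then
        echo "FAIL: expected $expected devices, found ${count:-0}"
        return 1
    fi
    if ! printf '%s\n' "$output" | grep -q '^Result = PASS'; then
        echo "FAIL: deviceQuery did not report PASS"
        return 1
    fi
    echo "OK: $count devices, Result = PASS"
}

# Usage on a GPU node (path taken from the notes above):
#   /usr/local/cuda/extras/demo_suite/deviceQuery | check_device_query 4
```

The function returns a non-zero status on any mismatch, so it can gate later steps in a provisioning script.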
 +
 +** Bandwidth Test **
 +
 +<code>
 +
 +[root@n37 cuda-9.2]# /usr/local/cuda/extras/demo_suite/bandwidthTest
 +[CUDA Bandwidth Test] - Starting...
 +Running on...
 +
 + Device 0: Tesla K20m
 + Quick Mode
 +
 + Host to Device Bandwidth, 1 Device(s)
 + PINNED Memory Transfers
 +   Transfer Size (Bytes)        Bandwidth(MB/s)
 +   33554432                     6181.3
 +
 + Device to Host Bandwidth, 1 Device(s)
 + PINNED Memory Transfers
 +   Transfer Size (Bytes)        Bandwidth(MB/s)
 +   33554432                     6530.0
 +
 + Device to Device Bandwidth, 1 Device(s)
 + PINNED Memory Transfers
 +   Transfer Size (Bytes)        Bandwidth(MB/s)
 +   33554432                     137200.1
 +
 +Result = PASS
 +
 +</code>
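
To compare nodes, the three bandwidth figures above can be reduced to one line per direction. This is a sketch only, assuming the quick-mode layout shown above (a section header containing "Bandwidth," followed by a two-column data row); summarize_bandwidth is a hypothetical name.

```shell
# Hypothetical helper: condense bandwidthTest output (read from stdin)
# into "<direction>: <value> MB/s" lines.
summarize_bandwidth() {
    awk '
        # Section header, e.g. " Host to Device Bandwidth, 1 Device(s)"
        / Bandwidth, / { label = $1 " " $2 " " $3 }
        # Data row: transfer size in bytes, then bandwidth in MB/s
        NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ {
            printf "%s: %s MB/s\n", label, $2
        }
    '
}

# Usage on a GPU node (path taken from the notes above):
#   /usr/local/cuda/extras/demo_suite/bandwidthTest | summarize_bandwidth
```

For the run shown above this would give three lines, one each for Host to Device, Device to Host, and Device to Device.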
 +
 +** Finish **
 +
 +  * yum install freeglut-devel libX11-devel libXi-devel libXmu-devel make mesa-libGLU-devel
 +  * check for /usr/lib64/libvdpau_nvidia.so
 +  * [root@n37 /]# tar -cvf /tmp/n37.chroot.ul.tar usr/local
 +  * [root@n37 /]# scp /tmp/n37.chroot.ul.tar sms_server:/var/chroots/goldimages/
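
The tar step above could be wrapped so the archive is read back before it is copied to the image server. A minimal sketch, assuming the same relative usr/local layout; the function name make_chroot_tar and its arguments are hypothetical.

```shell
# Hypothetical helper: archive usr/local relative to a root directory
# and confirm the archive can be listed before shipping it.
# $1 = root directory to archive from ("/" on the node above)
# $2 = output tar file
make_chroot_tar() {
    tar -C "$1" -cf "$2" usr/local || return 1
    # Listing the archive catches a truncated or unreadable file
    tar -tf "$2" >/dev/null
}

# Usage mirroring the notes above:
#   make_chroot_tar / /tmp/n37.chroot.ul.tar &&
#       scp /tmp/n37.chroot.ul.tar sms_server:/var/chroots/goldimages/
```

Using -C keeps the member names relative (usr/local/...), matching what the original command produces when run from /.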
 +
 +
 +
 \\
 **[[cluster:0|Back]]**
cluster/172.txt · Last modified: 2020/07/15 17:52 by hmeij07