cluster:225

Differences

This shows you the differences between two versions of the page.

cluster:225 [2024/05/06 18:25]
hmeij07 [Testing]
cluster:225 [2024/05/21 14:06] (current)
hmeij07
Line 63: Line 63:
 <code>
  
-# n78 first ... (no problem, tests success)
+# n78 first ... can reimage cuda-11.6 from n101 (no problem, tests success)
 # make sure /usr/src/kernels/$(uname -r) exists else
 # scp into place from n100 (centos8, possibly caused by warewulf...)
 +# however old nvidia packages still in OS (driver 510 toolkit 11.6)...
 +#  rpm -qa | grep ^nvidia | wc -l results in 16 packages...
 +# what happens on dnf check-update ???
 +
 +# n[100-101] skipping for now
 +# this is a package install and there is no nvidia-uninstall (runfile)
 +# (an upgrade would require internet 'dnf check-update; dnf update')
 +# switching between rpm install and runfile is NOT recommended
 +# and 'dnf erase nvidia*' may leave a hung system behind
  
 # n79 next (no problem)
Line 81: Line 90:
 # need to research it is somewhat related to cuda install
 # n80 (same error upon reboot after driver install)
 +# n81 (same error upon reboot after driver install)
 +# n90 (same error upon reboot after toolkit install, not driver. weird)
 +# n88 (failed toolkit install, ran /usr/bin/nvidia-uninstall, reboot,
 +#      re-installed driver, reboot, re-installed toolkit, reboot,
 +#      no error occurs! )
 +# n87 (ran nvidia-uninstall first, driver then toolkit, error shows up)
 +# n86 & n85 same as n87
 +# n84 (no error shows up)
 +# n82 (as n87 but error shows up after driver before toolkit install)
 +# n83 (error shows up after toolkit install, not driver install reboot)
  
 sh ./NVIDIA-Linux-x86_64-550.67.run
cluster/225.1715019927.txt.gz · Last modified: 2024/05/06 18:25 by hmeij07