DKMS fails to recompile NVIDIA drivers on kernel update

I am using Fedora 28 with GNOME 3. Lately, after every kernel update (and kernel updates seem to happen at least weekly), dkms fails to auto-compile the NVIDIA driver. So I have to switch back to the previous kernel, do systemctl set-default, reboot, re-install the driver, systemctl set-default, reboot, and then everything is back. Today, when I updated my system, I noticed that the error mentioned something related to driver 396.24. That's strange, because I am using the latest NVIDIA driver, 410.66. But when I look into /var/lib/dkms/nvidia, I see the following:

[xxx@yyy ~]$ ll /var/lib/dkms/nvidia/
total 16
drwxr-xr-x. 4 root root 4096 Oct 24 19:19 .
drwxr-xr-x. 3 root root 4096 May 31 00:02 ..
drwxr-xr-x. 6 root root 4096 Sep 23 18:01 396.24
drwxr-xr-x. 3 root root 4096 Oct 24 19:19 410.66
lrwxrwxrwx. 1 root root   37 May 31 00:03 kernel-4.16.12-300.fc28.x86_64-x86_64 -> 396.24/4.16.12-300.fc28.x86_64/x86_64
lrwxrwxrwx. 1 root root   37 Jun 12 23:04 kernel-4.16.14-300.fc28.x86_64-x86_64 -> 396.24/4.16.14-300.fc28.x86_64/x86_64
lrwxrwxrwx. 1 root root   37 Jun 21 01:27 kernel-4.16.16-300.fc28.x86_64-x86_64 -> 396.24/4.16.16-300.fc28.x86_64/x86_64
lrwxrwxrwx. 1 root root   37 Oct 24 19:19 kernel-4.18.16-200.fc28.x86_64-x86_64 -> 410.66/4.18.16-200.fc28.x86_64/x86_64

So it looks like only the latest kernel is pointing to the newest driver, which is strange, since there have been a few kernel updates since I started using 410.

Any idea what might be wrong?

I haven't found dkms that reliable. It has happened a few times that the PC ended up with a black screen, and then you have to boot with the kernel parameters "nomodeset" and/or "3", remove the dkms links manually, and recompile.

johanh ( 2018-10-25 01:11:32 -0500 )

But which link is correct and which one is not? Would it actually be a good idea to boot into each kernel and compile driver 410.66, thereby removing 396.24 entirely?

bioshark ( 2018-10-25 19:00:46 -0500 )

Your shell should show if the symlink has a target or if it is broken. I would maybe remove everything and recompile with dkms. If you want to do it for older kernels you should be able to specify the kernel version for dkms, so that you don't actually have to boot all kernels. Check the dkms documentation.
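A minimal sketch of that check, assuming the tree lives at /var/lib/dkms/nvidia as in the listing above. The function name list_broken_links and the DKMS_TREE variable are made up for illustration. It prints every kernel-* symlink whose target no longer exists; the trailing comment shows how a build for a kernel that is not currently running could be requested with the -k option of dkms.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list kernel-* symlinks in a DKMS module tree whose
# target directory no longer exists.
# DKMS_TREE defaults to the path shown in the question (an assumption).
DKMS_TREE="${DKMS_TREE:-/var/lib/dkms/nvidia}"

list_broken_links() {
    local tree="$1" link
    for link in "$tree"/kernel-*; do
        # -L: the entry is a symlink; ! -e: its target cannot be resolved
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            printf '%s\n' "$link"
        fi
    done
}

list_broken_links "$DKMS_TREE"

# To build for a kernel other than the running one, pass it with -k, e.g.:
#   sudo dkms install nvidia/410.66 -k 4.16.16-300.fc28.x86_64
# The matching kernel-devel package has to be installed for that kernel.
```

Links that print here are candidates for removal; links that do not print still resolve to a built module directory.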

johanh ( 2018-10-26 00:58:40 -0500 )

I'd recommend that you switch to RPM Fusion's nvidia driver. They use akmod to rebuild the modules, which has always worked for me. I know that this doesn't answer the question, but it could be better long-term.

Benjamin Doron ( 2019-03-21 19:22:20 -0500 )

I've managed to fix my dkms, and hopefully all future kernel updates will now compile my NVIDIA driver.

Here is how:

$ dkms status
Error! Could not locate dkms.conf file.
File: /var/lib/dkms/nvidia/396.24/source/dkms.conf does not exist.
$ sudo mv /var/lib/dkms/nvidia/ ~/stuff/old-dkms-folder   # just a backup
$ sudo dkms install nvidia/410.66
# ... after the install finished successfully
$ dkms status
nvidia, 410.66, 4.19.5-300.fc29.x86_64, x86_64: installed (original_module exists)

Hope it helps somebody else as well. I've read somewhere that the problem might have happened because dkms was updated at the same time as the kernel.
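Before moving the whole folder away, it may help to see which version directories are actually stale. The sketch below is only an illustration: the function name list_stale_versions and the DKMS_TREE variable are invented, and it assumes the layout implied by the error message, where each version directory is expected to contain source/dkms.conf. It prints the versions for which that file is missing, which is the condition "dkms status" complained about for 396.24.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report DKMS version directories that lack
# source/dkms.conf, the file "dkms status" failed to locate above.
# DKMS_TREE defaults to the path from the question (an assumption).
DKMS_TREE="${DKMS_TREE:-/var/lib/dkms/nvidia}"

list_stale_versions() {
    local tree="$1" dir
    for dir in "$tree"/*/; do
        dir="${dir%/}"
        # Skip the kernel-* symlinks; only real version directories matter.
        [ -L "$dir" ] && continue
        [ -d "$dir" ] || continue
        if [ ! -e "$dir/source/dkms.conf" ]; then
            basename "$dir"
        fi
    done
}

list_stale_versions "$DKMS_TREE"
```

On the tree from the question this would be expected to name 396.24, but that is an inference from the error message rather than something verified here.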

And has it really worked since the last updates? Because for me, it hasn't... For the last two kernel updates, namely from 4.19.15 to 4.20.4 and from 4.20.4 to 4.20.5, I had to remove the nvidia dkms, uninstall the driver and reinstall it again so that my system would boot correctly.

I'm looking for a solution to make it work without having to uninstall/reinstall the nvidia driver every time I update the kernel.

loboedu ( 2019-02-04 19:35:07 -0500 )

Except for one time, it has worked every time. I just removed the folders that were causing the conflict and did the "dkms install". However, on one of the updates, probably 4.20 (I don't remember exactly which kernel update), I had to install the latest NVIDIA driver the classical way, via CTRL+F2 on boot.

bioshark ( 2019-02-05 17:05:06 -0500 )

Oh well... mine's not working like that. I've just updated the kernel and again I had to remove the nvidia dkms, uninstall the driver and then reinstall it all over again.

By the way, I'm installing it via the If-not-true-then-false patched installer, not via the classical way you mentioned.

loboedu ( 2019-02-06 22:05:10 -0500 )

