Bug 1208939

Summary: Kernel 6.2.1 update no longer loads unsigned kernel modules
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Episteme PROMENEUR <epistemepromeneur>
Component: X11 3rd Party Driver
Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED DUPLICATE
QA Contact: Stefan Dirsch <sndirsch>
Severity: Critical
Priority: P2 - High
CC: brunopitrus, epistemepromeneur, jlee, werwolf131313
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: nvidia installed packages

Description Episteme PROMENEUR 2023-03-04 17:18:48 UTC
I use fah app (scientific computing).
It needs opencl and cuda.
I use my nvidia card(GT 1030) only for computing.

Today, 2023/03/04, I installed the current updates with PackageKit.

After restarting PC, fah service does not find opencl and cuda.

I deleted all the nvidia packages, then reinstalled them by installing nvidia-drivers-minimal-G06 package.

I restarted the PC. No success. Fah does not find any opencl or cuda.

Also:
- the "GPU current" temperature sensor is gone.
- the gkrellm "nvidia" plugin, which uses SMI, no longer reports any temperature or frequency.

The KDE KCM "data center" does not find OpenCL either.
Comment 1 Episteme PROMENEUR 2023-03-04 17:23:32 UTC
Created attachment 865282 [details]
nvidia installed packages
Comment 2 Stefan Dirsch 2023-03-04 17:52:18 UTC
I guess this is the change in our kernel for 6.2.1 to no longer load unsigned kernel modules. :-( 

To verify this, either boot an older kernel or disable Secure Boot in your firmware/BIOS.

But I see you're trying to use the openGPU driver from nvidia now. AFAIK the GT 1030 is still Pascal architecture, which this driver does not support. You would need to select nvidia-driver-G06-kmp-default instead and remove nvidia-open-driver-G06-signed-kmp-default and kernel-firmware-nvidia-gsp-G06 again.
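
The Secure Boot hypothesis above can also be checked from the running system. The following is only a minimal sketch: it assumes `mokutil` and `modinfo` are installed and that the kernel module is named `nvidia`; the helper name `describe_signer` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn the (possibly empty) "signer" field of a
# kernel module into a readable verdict.
describe_signer() {
    if [ -n "$1" ]; then
        echo "signed by: $1"
    else
        echo "unsigned"
    fi
}

# Firmware state: mokutil reports whether Secure Boot is enabled.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
fi

# modinfo -F prints a single field. For an unsigned module the
# "signer" field is empty; it is also empty when no module named
# "nvidia" is installed at all, so read the verdict with care.
if command -v modinfo >/dev/null 2>&1; then
    describe_signer "$(modinfo -F signer nvidia 2>/dev/null)"
fi
```

If Secure Boot is reported as enabled and the module turns out to be unsigned, that would be consistent with the explanation given in this comment.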
Comment 3 Episteme PROMENEUR 2023-03-04 18:16:21 UTC
I disabled "secure boot"

same result : no opencl or cuda
Comment 4 Episteme PROMENEUR 2023-03-04 18:19:35 UTC
>> But I see you're trying to use the openGPU driver from nvidia now. AFAIK
>> GT1030 is still Pascal architecture, which is not supported by this driver.
>> You would need to select nvidia-driver-G06-kmp-default instead and remove
>> again nvidia-open-driver-G06-signed-kmp-default and
>> kernel-firmware-nvidia-gsp-G06.

I did not install these packages. I just selected nvidia-drivers-minimal-G06 package.
Comment 5 Stefan Dirsch 2023-03-04 18:30:54 UTC
(In reply to Episteme PROMENEUR from comment #4)
> >> But I see you're trying to use the openGPU driver from nvidia now. AFAIK
> >> GT1030 is still Pascal architecture, which is not supported by this driver.
> >> You would need to select nvidia-driver-G06-kmp-default instead and remove
> >> again nvidia-open-driver-G06-signed-kmp-default and
> >> kernel-firmware-nvidia-gsp-G06.
> 
> I did not install these packages. I just selected nvidia-drivers-minimal-G06
> package.

... which then selected and installed these.

%package -n nvidia-drivers-minimal-G06
Summary:        Meta package for compute only installations
Group:          System/X11/Utilities
Requires:       nvidia-compute-utils-G06
Requires:       nvidia-compute-G06
Requires:       (nvidia-driver-G06-kmp = %{version} or nvidia-open-driver-G06-kmp = %{version} or nvidia-open-driver-G06-signed-kmp = %{version})

%description -n nvidia-drivers-minimal-G06
This is just a Meta package for compute only installations.

Please try what I told you.
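
Because the meta package accepts any one of the three kernel module packages, it is worth checking which one the solver actually picked. A minimal sketch, assuming an RPM-based system; the helper name `kmp_flavour` is hypothetical and the patterns simply mirror the package names in the spec excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: read package names on stdin (one per line) and
# report which G06 kernel module flavour is present.
kmp_flavour() {
    awk '
        /^nvidia-open-driver-G06(-signed)?-kmp/ { has_open = 1 }
        /^nvidia-driver-G06-kmp/                { has_prop = 1 }
        END {
            if (has_open && has_prop) print "both"
            else if (has_open)        print "open"
            else if (has_prop)        print "proprietary"
            else                      print "none"
        }'
}

# Feed it the names of all installed packages.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa --qf '%{NAME}\n' | kmp_flavour
fi
```

For a Pascal card such as the GT 1030 the result should be "proprietary"; "open" or "both" means the open driver packages still need to be removed.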
Comment 6 Episteme PROMENEUR 2023-03-04 18:34:24 UTC
Now :
- secureboot is disabled
- the required packages are installed
- the unsupported packages are no longer installed.

success : fah detects opencl and cuda.
Comment 7 Stefan Dirsch 2023-03-05 11:13:20 UTC
Ok. Hopefully this is fixed now with the following changes.

-------------------------------------------------------------------
Sat Mar  4 19:33:22 UTC 2023 - Stefan Dirsch <sndirsch@suse.com>

- sign kernel modules also on TW (boo#1208939) 

-------------------------------------------------------------------
Fri Feb 17 04:58:32 UTC 2023 - Stefan Dirsch <sndirsch@suse.com>

- moved cuda-drivers provides to nvidia-compute-utils-G06 

Updated packages should be available soon.
Comment 8 Episteme PROMENEUR 2023-03-05 11:34:32 UTC
What does it mean?

Can we enable secure boot ?

Can we use nvidia-drivers-minimal-G06 package with a GT 1030 ?

A secondary question:
with a headless installation we no longer get the adapter sensors or the “NVIDIA settings” app. It would be good to get them back, perhaps through a separate package.
Comment 9 Stefan Dirsch 2023-03-05 12:08:27 UTC
(In reply to Episteme PROMENEUR from comment #8)
> What does it mean?
> 
> Can we enable secure boot ?

Yes.

> Can we use nvidia-drivers-minimal-G06 package with a GT 1030 ?

Since it selects the open driver by default at least on your system (not sure why; maybe you can taboo it via zypper lock) the answer is no. What you can try is

  zypper in --no-recommends nvidia-compute-utils-G06 nvidia-driver-G06-kmp-default

> A secondary question : 
> when proceeding a headless installation we don't get anymore the adapter
> sensors 

Not sure what you mean with that. Special app? Maybe 

> and the app “NVIDIA settings”. It would be a good thing to get them
> again, perhaps with a separate package.

nvidia-settings is now in the nvidia-utils-G06 package due to its dependencies.
Comment 10 Episteme PROMENEUR 2023-03-05 13:15:30 UTC
"headless" installation of the driver <=> minimal installation for computing
Comment 11 Stefan Dirsch 2023-03-05 13:28:45 UTC
Sorry, I don't understand what you mean by "adapter sensors".
Comment 12 Episteme PROMENEUR 2023-03-05 13:31:34 UTC
The sensor of the nvidia card for delivering temp for example.

to get temp we need to install the video driver
Comment 13 Stefan Dirsch 2023-03-05 13:35:48 UTC
(In reply to Episteme PROMENEUR from comment #12)
> The sensor of the nvidia card for delivering temp for example.
> 
> to get temp we need to install the video driver

Which video driver? You mean the X driver? This sounds weird. So which file is missing here?
Comment 14 Episteme PROMENEUR 2023-03-05 13:42:33 UTC
I don't know which part to install to get sensors.

I only know that with a minimal installation for computing, gkrellm no longer has any sensors to display.
Comment 15 Stefan Dirsch 2023-03-05 13:43:34 UTC
I can see the temperature in the output of 'nvidia-smi' which is part of nvidia-compute-utils-G06. So together with nvidia-driver-G06-kmp-default I don't see what's missing here ...
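
For reference, the values can be read from `nvidia-smi` in a script-friendly form rather than from its default table. A minimal sketch, assuming `nvidia-smi` is on the PATH and a single GPU; the helper name `format_reading` and its output layout are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: format CSV lines of the form "temp, fan, clock"
# as produced by the query below.
format_reading() {
    while IFS=, read -r temp fan clock; do
        # Strip the blank that follows each comma in the CSV output.
        temp=${temp# }; fan=${fan# }; clock=${clock# }
        echo "temp=${temp}C fan=${fan}% sm_clock=${clock}MHz"
    done
}

# Ask for temperature, fan speed and SM clock without header or units.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=temperature.gpu,fan.speed,clocks.sm \
               --format=csv,noheader,nounits | format_reading
fi
```

A card without a readable fan sensor reports a placeholder in that field instead of a number; the helper passes it through unchanged.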
Comment 16 Episteme PROMENEUR 2023-03-05 13:46:05 UTC
We can get temp, fans etc. without using SMI.

Gkrellm does not detect any sensors with a minimal installation.
Comment 17 Stefan Dirsch 2023-03-05 13:48:37 UTC
gkrellm is a desktop app. So of course it needs some video driver. But I had the impression you had some different gfx card (internal Intel probably) for your desktop and used the nvidia card just for CUDA/OpenCL. I have no idea what gkrellm does: whether it checks the temperature only of the gfx card currently in use for the desktop, or of additional nvidia gfx cards as well.
Comment 18 Stefan Dirsch 2023-03-05 13:51:20 UTC
(In reply to Episteme PROMENEUR from comment #16)
> We can get temp, fans etc. without using SMI.
> 
> Gkrellm detects not any sensors with a minimal installation.

I don't force anybody to make use of the minimal installation.
Comment 19 Episteme PROMENEUR 2023-03-05 13:55:58 UTC
If I do a full installation for the NVIDIA card alongside the Intel iGPU and use the NVIDIA card only for computing, not for the monitor, there is no sensor problem. Gkrellm detects the nvidia sensors and displays temp, fan, etc. That's why I say the sensor part is missing from a minimal installation.
Comment 20 Episteme PROMENEUR 2023-03-05 14:00:37 UTC
I just say that for computing, it is useful to know GPU core temp, fan speed, etc.
Comment 21 Stefan Dirsch 2023-03-05 14:13:49 UTC
Then I guess it's some library in nvidia-gl-G06 or nvidia-video-G06, which is needed by gkrellm. You could try to figure out which one ...
Comment 22 Episteme PROMENEUR 2023-03-05 14:25:57 UTC
I installed nvidia-gl-G06.

No sensors.

Then certainly the sensor part is in nvidia-video-G06.
Comment 23 Stefan Dirsch 2023-03-05 14:33:12 UTC
Hmm. Interesting. That would be one of these:

rpm -qpl nvidia-video-G06-525.89.02-7.1.x86_64.rpm 
/usr/lib64/gbm
/usr/lib64/gbm/nvidia-drm_gbm.so
/usr/lib64/libnvcuvid.so
/usr/lib64/libnvcuvid.so.1
/usr/lib64/libnvcuvid.so.525.89.02
/usr/lib64/libnvidia-allocator.so.1
/usr/lib64/libnvidia-allocator.so.525.89.02
/usr/lib64/libnvidia-encode.so.1
/usr/lib64/libnvidia-encode.so.525.89.02
/usr/lib64/libnvidia-opticalflow.so.1
/usr/lib64/libnvidia-opticalflow.so.525.89.02
/usr/lib64/libvdpau_nvidia.so
/usr/lib64/vdpau
/usr/lib64/vdpau/libvdpau_nvidia.so.1
/usr/lib64/vdpau/libvdpau_nvidia.so.525.89.02
Comment 24 Stefan Dirsch 2023-03-05 14:51:07 UTC
Ok. ChatGPT just told me that gkrellm uses nVidia's XNVCtrl Xserver extension to display the temperature. I'm not sure how this extension gets added to the Xserver. Either by loading nVidia's glx extension or the X driver itself. But both are in nvidia-gl-G06.

/usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so.525.89.02
/usr/lib64/xorg/modules/drivers/nvidia_drv.so

You can check that with xdpyinfo. One of the extensions will be called 'XNVCtrl' or similar. So of course after installing nvidia-gl-G06 you also need to restart the Xserver.
Comment 25 Episteme PROMENEUR 2023-03-05 14:57:00 UTC
I don't understand how to use xdpyinfo.
Comment 26 Stefan Dirsch 2023-03-05 15:06:06 UTC
Just run it in a terminal on the desktop.

I was wrong. The extensions added by NVIDIA are

    NV-CONTROL
    NV-GLX

So I think NV-CONTROL is what is being used by gkrellm (using the XNVCtrl library).
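
The xdpyinfo check can be narrowed to the one extension of interest. A minimal sketch, assuming it runs in a terminal inside the X session; the helper name `has_nv_control` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the extension list on stdin contains
# NV-CONTROL as a whole entry.
has_nv_control() {
    grep -Eq '^[[:space:]]*NV-CONTROL[[:space:]]*$'
}

# xdpyinfo lists the extensions of the X server named by $DISPLAY.
if command -v xdpyinfo >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    if xdpyinfo 2>/dev/null | has_nv_control; then
        echo "NV-CONTROL is available on $DISPLAY"
    else
        echo "NV-CONTROL is missing on $DISPLAY"
    fi
fi
```

If the extension is missing after installing nvidia-gl-G06, the Xserver probably has not been restarted yet.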
Comment 27 Stefan Dirsch 2023-03-05 15:09:53 UTC
BTW, I only found a gkrellm plugin, which we don't ship AFAIK.

https://github.com/carcass82/gkrellm-nvidia

But this has now switched from the XNVCtrl library to NVML ...

This raises the question where your gkrellm is coming from ...
Comment 28 Episteme PROMENEUR 2023-03-05 15:12:08 UTC
With Yast software manager, I found a “libXNVCtrl0” 510.47.03 package.
There is not any 525.89 version.

Perhaps a solution ?
Comment 29 Episteme PROMENEUR 2023-03-05 15:13:08 UTC
My gkrellm is coming from tumbleweed main repo OSS.
Comment 30 Stefan Dirsch 2023-03-05 15:19:48 UTC
Ok. Just figured out that gkrellm uses the 'nvidia-settings' tool internally to display the temperature. This is in package nvidia-utils-G06. But of course you also still need nvidia-gl-G06, since 'nvidia-settings' itself uses the NV-CONTROL Xserver extension. But with that this is no longer a minimal installation. ;-)
Comment 31 Episteme PROMENEUR 2023-03-05 15:22:41 UTC
NVML is used by SMI, so it is already installed with a minimal, compute-only installation.

If we can't separate the sensor part of the Nvidia driver, perhaps it is possible to create fake sensors. 
These sensors would deliver temp, fan speed, frequency etc. These fake sensors would use SMI to deliver the values. 

The "nvidia" plugin of gkrellm does not create fake sensors, but just displays the output of SMI in Gkrellm. This plugin could inspire a developer to create fake sensors.
Comment 32 Episteme PROMENEUR 2023-03-05 15:31:41 UTC
there is 

/usr/lib64/libnvidia-ml.so
/usr/lib64/libnvidia-ml.so.1
/usr/lib64/libnvidia-ml.so.525.89.02

in nvidia-compute-G06
Comment 33 Stefan Dirsch 2023-03-05 15:34:47 UTC
Well, so why not use SMI from the beginning for this purpose? As I told you 'nvidia-smi' can tell you the temperature without the need for NV-CONTROL/libXNVCtrl0 interface.
Comment 34 Stefan Dirsch 2023-03-05 15:38:48 UTC
(In reply to Episteme PROMENEUR from comment #32)
> there is 
> 
> /usr/lib64/libnvidia-ml.so
> /usr/lib64/libnvidia-ml.so.1
> /usr/lib64/libnvidia-ml.so.525.89.02
> 
> in nvidia-compute-G06

Yes, nvidia-smi needs it.
Comment 35 Stefan Dirsch 2023-03-05 15:41:22 UTC
I think this gkrellm plugin making use of NVML (I guess it's just using libnvidia-ml, but I didn't check this) is the right approach here.
Comment 36 Episteme PROMENEUR 2023-03-05 15:42:15 UTC
Because SMI is not a monitoring tool.

A monitoring tool permanently displays various values on my desktop in an easy way, as Gkrellm does.

Gkrellm "nvidia" plugin runs well. But it is a workaround. It works only with Gkrellm.
Comment 37 Stefan Dirsch 2023-03-05 15:45:41 UTC
Obviously it does ...

https://github.com/carcass82/gkrellm-nvidia/blob/master/Makefile
[...]
LDLIBS  += -lX11 -lnvidia-ml
[...]
Comment 38 Episteme PROMENEUR 2023-03-05 15:53:16 UTC
What does "Obviously it does" refer to?
Comment 39 Stefan Dirsch 2023-03-05 15:54:52 UTC
(In reply to Episteme PROMENEUR from comment #36)
> Because SMI is not a monitoring tool.
> 
> A monitoring tool display permanently, in an easy way on my desktop, various
> values as Gkrellm does.
> 
> Gkrellm "nvidia" plugin runs well. But it is a workaround. It works only
> with Gkrellm.

Feel free to improve the situation. I currently don't see how changing the packaging would help here. Minimal installation should not have any deps to X libs or even Xserver. Period.
Comment 40 Stefan Dirsch 2023-03-05 15:55:53 UTC
That was an additional comment to comment#35
Comment 41 Stefan Dirsch 2023-03-06 13:05:14 UTC
*** Bug 1208966 has been marked as a duplicate of this bug. ***
Comment 42 Bruno Pitrus 2023-03-06 15:37:25 UTC
I built and installed the updated nvidia-driver-G06-kmp-default and it did not work for me. I noticed you've added some stuff using mokutil. I do not use shim for booting, instead have dracut configured to produce signed .efi images containing kernel + initramfs (the `uefi_secureboot_cert` option)

Is there anything I can do (most likely add something to `kernel_cmdline`) to make this work in such a setup?

(Disabling signature enforcement in kernel would be fine for me, as the initramfs is signed while the system drive is under FDE)
Comment 43 Stefan Dirsch 2023-03-06 16:12:11 UTC
(In reply to dziobian from comment #42)
> I built and installed the updated nvidia-driver-G06-kmp-default and it did
> not work for me. 

Repositories haven't been updated with my changes yet.

> I noticed you've added some stuff using mokutil. 

Yes, but today I noticed that my changes simply don't help. Looks like things need to be done differently with TW kernels than with kernels on openSUSE Leap. :-(

> I do not
> use shim for booting, instead have dracut configured to produce signed .efi
> images containing kernel + initramfs (the `uefi_secureboot_cert` option)
> 
> Is there anything i can do (most likely add something to `kernel_cmdline`)
> to make this work in such a setup?
> 
> (Disabling signature enforcement in kernel would be fine for me, as the
> initramfs is signed while the system drive is under FDE)

Honestly I'm not an expert in this area. So I'm afraid I can't help you here. :-(
Comment 44 Joey Lee 2023-03-07 14:02:02 UTC
(In reply to dziobian from comment #42)
> I built and installed the updated nvidia-driver-G06-kmp-default and it did
> not work for me. I noticed you've added some stuff using mokutil. I do not
> use shim for booting, instead have dracut configured to produce signed .efi
> images containing kernel + initramfs (the `uefi_secureboot_cert` option)
> 
> Is there anything i can do (most likely add something to `kernel_cmdline`)
> to make this work in such a setup?
> 
> (Disabling signature enforcement in kernel would be fine for me, as the
> initramfs is signed while the system drive is under FDE)

Without shim, which means without MOK, the only way is to rebuild the kernel and embed the public key of nvidia-driver-G06-kmp-default in it.

Or disable secure boot.
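
Before choosing between these options it may help to see what the running kernel currently enforces. A minimal sketch, assuming securityfs is mounted at /sys/kernel/security and the kernel was built with module signature support; the helper name `active_lockdown` is hypothetical. It only reports state and does not answer whether enforcement can be switched off from `kernel_cmdline`.

```shell
#!/bin/sh
# Hypothetical helper: extract the active lockdown mode, which the
# kernel marks with square brackets, e.g. "none [integrity] confidentiality".
active_lockdown() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Kernel lockdown mode.
if [ -r /sys/kernel/security/lockdown ]; then
    echo "lockdown: $(active_lockdown < /sys/kernel/security/lockdown)"
fi

# Y means the kernel only loads modules with a valid signature.
if [ -r /sys/module/module/parameters/sig_enforce ]; then
    echo "sig_enforce: $(cat /sys/module/module/parameters/sig_enforce)"
fi
```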
Comment 45 Joey Lee 2023-03-07 14:16:54 UTC
(In reply to Joey Lee from comment #44)
> (In reply to dziobian from comment #42)
> > I built and installed the updated nvidia-driver-G06-kmp-default and it did
> > not work for me. I noticed you've added some stuff using mokutil. I do not
> > use shim for booting, instead have dracut configured to produce signed .efi
> > images containing kernel + initramfs (the `uefi_secureboot_cert` option)
> > 
> > Is there anything i can do (most likely add something to `kernel_cmdline`)
> > to make this work in such a setup?
> > 
> > (Disabling signature enforcement in kernel would be fine for me, as the
> > initramfs is signed while the system drive is under FDE)
> 
> Without shim which means without MOK, the only way is rebuild kernel and put
> the public key of nvidia-driver-G06-kmp-default to kernel. 
> 
> Or disable secure boot.

Another approach: maybe we can try enabling CONFIG_SYSTEM_EXTRA_CERTIFICATE in the openSUSE Tumbleweed kernel. As I remember, it reserves space in the kernel image where the user can enroll a public key; the user then needs to re-sign the kernel himself. That way the user doesn't need to recompile the kernel to embed the public key.
Comment 46 Stefan Dirsch 2023-03-07 20:12:37 UTC
Closing as duplicate.

*** This bug has been marked as a duplicate of bug 1209006 ***