Bugzilla – Bug 1201392
Kernel 5.18.9-2.1 with enabled simpledrm breaks nvidia driver (use kernel boot option 'nosimplefb=1' as workaround)
Last modified: 2022-08-20 09:01:13 UTC
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0
Build Identifier:

After performing an update to Tumbleweed 20220710 and rebooting the system, the KDE login screen does not appear and the graphical display is left black.

Reproducible: Always

Steps to Reproduce:
1. tumbleweed update
2. reboot
3. After reboot, the graphical login screen is left blank.
Please attach the journal for the relevant boot as well as /var/log/Xorg.0.log.
Created attachment 860114 [details] Boot log
Created attachment 860115 [details] Xorg log
Logs supplied as requested.
Looks like something with the NVIDIA driver, reassigning. The only relevant error messages I can find:
[ 7.044] randr: failed to create shared pixmap
[ 7.044] failed to add fb -22
[ 7.044] (EE) modeset(G0): failed to set mode: Invalid argument
Hmm. The modeset driver gets loaded, but I don't think it's really becoming active. nvidia is already loaded and is the higher-prioritized driver. But you can try with
options nvidia-drm modeset=0
in /etc/modprobe.d/50-nvidia-default.conf to prevent the modeset driver from being loaded. I'm afraid it's a regression in the nvidia driver and you got the new nvidia driver together with the TW update.
I seem to have been hit by this, too. (GeForce GT 730, using the proprietary G05 driver from the nvidia repo.)

Some more information:
* The plymouth animation during boot is garbled.
* Blindly typing my password after boot will let me log in, but the desktop behaves strangely: if I move the mouse to the bottom of the screen, the screen will scroll up!
* Pressing ctrl-alt-f2 will not switch to a text console.

This is the same behavior I got a few months ago, when fbdev had been replaced with simpledrm, a change that was reverted, because of this problem. Now, using "dmesg | grep drm", I see that simpledrm has been enabled again.

Switching to nouveau for the time being, which works fine.
(In reply to Stefan Dirsch from comment #6)
> Hmm. modeset driver gets loaded, but I don't think it's really getting
> active. nvidia is already loaded and is the higher prioritized driver. But
> you can try with
>
> options nvidia-drm modeset=0
>
> in /etc/modprobe.d/50-nvidia-default.conf to prevent modeset driver loaded.
>
> I'm afraid it's a regression in the nvidia driver and you got the new nvidia
> driver together with the TW update.

Tried this, did not change anything.
(In reply to Kriton Kyrimis from comment #7)
> I seem to have been hit by this, too. (GeForce GT 730, using the proprietary
> G05 driver from the nvidia repo.)
>
> Some more information:
>
> * The plymouth animation during boot is garbled.
>
> * Blindly typing my password after boot will let me log in, but the desktop
> behaves strangely: if I move the mouse to the bottom of the screen, the
> screen will scroll up!
>
> * Pressing ctrl-alt-f2 will not switch to a text console.
>
> This is the same behavior I got a few months ago, when fbdev had been
> replaced with simpledrm, a change that was reverted, because of this
> problem. Now, using "dmesg | grep drm", I see that simpledrm has been
> enabled again.
>
> Switching to nouveau for the time being, which works fine.

I can partially concur with the above:
* Blindly typing my password allowed me to log in.
* Pressing ctrl-alt-f2 allowed me to switch to a text console, however pressing ctrl-alt-f7 afterwards displayed a blank screen again.
* Oh yes, rollback to a previous working system screwed up the codecs for my speakers, so they are not working any more. This is probably a kernel headache.
(In reply to Stefan Dirsch from comment #6)
> Hmm. modeset driver gets loaded, but I don't think it's really getting
> active. nvidia is already loaded and is the higher prioritized driver. But
> you can try with
>
> options nvidia-drm modeset=0
>
> in /etc/modprobe.d/50-nvidia-default.conf to prevent modeset driver loaded.

I mean: replace the line
options nvidia-drm modeset=1
with the one above, then run 'mkinitrd' afterwards and reboot the machine. Does this help?
Hmm. Please try with kernel boot option 'nosimplefb'. Does this issue also occur with older kernels?
(In reply to Stefan Dirsch from comment #10)
> (In reply to Stefan Dirsch from comment #6)
> > Hmm. modeset driver gets loaded, but I don't think it's really getting
> > active. nvidia is already loaded and is the higher prioritized driver. But
> > you can try with
> >
> > options nvidia-drm modeset=0
> >
> > in /etc/modprobe.d/50-nvidia-default.conf to prevent modeset driver loaded.
>
> I mean replace the line
>
> options nvidia-drm modeset=1
>
> then run 'mkinitrd' afterwards and reboot the machine. Does this help?

That is what I did.
(In reply to Stefan Dirsch from comment #11) > Hmm. Please try with kernel boot option 'nosimplefb'. Does this issue also > occur with older kernels? Please try this option on the kernel's command line.
Forgot to mention: this problem does not occur with the previous kernel (5.18.9-1); in fact, I have rolled back to that kernel so that I can continue to use the system.
(In reply to Thomas Zimmermann from comment #13) > (In reply to Stefan Dirsch from comment #11) > > Hmm. Please try with kernel boot option 'nosimplefb'. Does this issue also > > occur with older kernels? > > Please try this option on the kernel's command line. I tried this, and also tried adding it to the boot options in GRUB (using YaST). As previously stated, this had no effect.
Same for me:
* Adding nosimplefb in the boot options in the grub menu fixed the garbled plymouth animation, but didn't do anything else.
* Setting "options nvidia-drm modeset=0" in 50-nvidia-default.conf didn't change anything. (I found the file in /usr/lib/modprobe.d, not /etc/modprobe.d, and ran "dracut -f" instead of mkinitrd.)
* I can also confirm that the problem did not exist in the previous kernel (5.18.9-1.1).
Ok. I think we have an issue with our kernel update here. :-(
(In reply to Kriton Kyrimis from comment #16)
> Same for me:
>
> * Adding nosimplefb in the boot options in the grub menu fixed the garbled
> plymouth animation, but didn't do anything else.

Ok. Thanks.

> * Setting options nvidia-drm modeset=0 in 50-nvidia-default.conf didn't
> change anything. (I found the file in /usr/lib/modprobe.d, not
> /etc/modprobe.d, and ran "dracut -f" instead of mkinitrd.)

You're right. The config files moved to a different location in TW.

> * I can also confirm that the problem did not exist in the previous kernel
> (5.18.9-1.1).

Thanks. That's what I assumed.
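Since the location of the nvidia modprobe configuration differs between the comments above, it can help to check which file actually carries the "options nvidia-drm" line before editing anything. The following is only a sketch: the function name is made up for illustration, and the default directory list is an assumption taken from the two paths mentioned in this thread.

```shell
# Sketch: list every "options nvidia-drm ..." line found in the given
# modprobe.d directories. With no arguments it falls back to the two
# directories mentioned in this thread (an assumption, not a complete list).
find_nvidia_drm_options() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/modprobe.d /usr/lib/modprobe.d
    fi
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -H prints the file name, so it is clear which file sets the option.
        grep -H -E '^[[:space:]]*options[[:space:]]+nvidia[-_]drm([[:space:]]|$)' \
            "$dir"/*.conf 2>/dev/null || true
    done
    return 0
}
```

Calling find_nvidia_drm_options without arguments prints lines of the form "file:options nvidia-drm modeset=1". After changing the value, the initrd has to be rebuilt (for example with "dracut -f", as mentioned above) and the machine rebooted.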
That's from a different machine.
Jul 12 11:21:05 fortress.home.nordisch.org /usr/libexec/gdm/gdm-x-session[2316]: (II) modeset(G0): Damage tracking initialized
Jul 12 11:21:05 fortress.home.nordisch.org /usr/libexec/gdm/gdm-x-session[2316]: randr: failed to create shared pixmap
Jul 12 11:21:05 fortress.home.nordisch.org /usr/libexec/gdm/gdm-x-session[2316]: (EE) modeset(G0): failed to set mode: No space left on device
Jul 12 11:21:05 fortress.home.nordisch.org kernel: simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 344064 bytes), total 32768 (slots), used 326 (slots)
(In reply to Stefan Dirsch from comment #17) > Ok. I think we have an issue with our kernel update here. :-( Given that installation of the new kernel resulted in sound card codecs disappearing on the previous kernel (so a rollback to the previous kernel using snapper did not restore the sound card to operation), I think that you might be right there!
(In reply to Stefan Dirsch from comment #17)
> Ok. I think we have an issue with our kernel update here. :-(

Isn't the issue that simpledrm has been re-enabled in the latest kernel? I've uninstalled the previous kernel, so I can't check. (I understand that "dmesg | grep simpledrm" should not display anything if simpledrm is disabled.)
Hi, thanks for testing. (In reply to Alan Hughes from comment #15) > (In reply to Thomas Zimmermann from comment #13) > > (In reply to Stefan Dirsch from comment #11) > > > Hmm. Please try with kernel boot option 'nosimplefb'. Does this issue also > > > occur with older kernels? > > > > Please try this option on the kernel's command line. > > I tried this, and also tried adding it to the boot options in GRUB (using > YaST). As previously stated, this had no effect. (In reply to Kriton Kyrimis from comment #16) > Same for me: > > * Adding nosimplefb in the boot options in the grub menu fixed the garbled > plymouth animation, but didn't do anything else. After booting with 'nosimplefb' could you provide a dmesg output from that boot? I'd like to see if the option actually did anything. It's supposed to be a fallback for incompatible systems.
(In reply to Thomas Zimmermann from comment #22) > Hi, > > thanks for testing. > > (In reply to Alan Hughes from comment #15) > > (In reply to Thomas Zimmermann from comment #13) > > > (In reply to Stefan Dirsch from comment #11) > > > > Hmm. Please try with kernel boot option 'nosimplefb'. Does this issue also > > > > occur with older kernels? > > > > > > Please try this option on the kernel's command line. > > > > I tried this, and also tried adding it to the boot options in GRUB (using > > YaST). As previously stated, this had no effect. > > (In reply to Kriton Kyrimis from comment #16) > > Same for me: > > > > * Adding nosimplefb in the boot options in the grub menu fixed the garbled > > plymouth animation, but didn't do anything else. > > After booting with 'nosimplefb' could you provide a dmesg output from that > boot? I'd like to see if the option actually did anything. It's supposed to > be a fallback for incompatible systems. You can get the full log with sudo journalctl -b
Created attachment 860125 [details] journalctl -b output with nosimplefb boot option
> You can get the full log with
>
> sudo journalctl -b

See attachment. This time the plymouth animation was garbled again. Looking at the log, I see:

Malformed early option 'nosimplefb'

so the option probably didn't do anything.
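To tell whether a boot option reached the kernel as a bare word or with a value, the command line can be inspected directly. The helper below is a minimal sketch; the function name and the "value:", "bare", "absent" output labels are invented for illustration.

```shell
# Sketch: report how a parameter appears on a kernel command line.
# Prints "value:<v>" for name=<v>, "bare" for the name alone, and
# "absent" otherwise. Runs in a subshell so that "set -f" (no globbing
# while splitting the command line into words) does not leak out.
cmdline_param_state() (
    set -f
    name=$1
    state=absent
    for word in $2; do
        case $word in
            "$name"=*) state="value:${word#*=}" ;;
            "$name")   state=bare ;;
        esac
    done
    printf '%s\n' "$state"
)
```

On a running system this could be called as: cmdline_param_state nosimplefb "$(cat /proc/cmdline)". The kernel's complaint quoted above can be searched for with: sudo journalctl -b -k | grep -i 'malformed early option'.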
Regarding the new title of this issue, the problem is with kernel 5.18.9-2.1, not 5.18.9-1.1. 5.18.9-1.1 works OK, as I mentioned.
(In reply to Kriton Kyrimis from comment #16) > Same for me: > > * Adding nosimplefb in the boot options in the grub menu fixed the garbled > plymouth animation, but didn't do anything else. It might need an argument. What about nosimplefb=1 ?
(In reply to Kriton Kyrimis from comment #26) > Regarding the new title of this issue, the problem is with kernel > 5.18.9-2.1, not 5.18.9-1.1. > > 5.18.9-1.1 works OK, as I mentioned. Yes, thanks for the log. It's most certainly a collision between the newly enabled simpledrm driver and the nvidia driver. The former should have been disabled automatically. Kernel options parsing is black magic. :/
Created attachment 860126 [details] dmesg log Just to add to the mix, here is a dmesg log from booting my system. There is probably no additional information in it, but it may be useful to have.
Created attachment 860127 [details] journalctl -b output with nosimplefb=1 boot option
(In reply to Thomas Zimmermann from comment #27)
> It might need an argument. What about nosimplefb=1 ?

That's it!!!

(In reply to Thomas Zimmermann from comment #28)
> Kernel options parsing is black magic. :/

True. I tried nosimplefb, simplefb=0, and even video=simplefb:off, before finding that nosimplefb works!

I have attached a new log, when booting with the nosimplefb=1 option. I'm leaving the older one for reference, as it shows the default situation.

I think I'll add "nosimplefb=1" to /etc/default/grub.conf until further notice.
(In reply to Kriton Kyrimis from comment #31) > nosimplefb works! nosimplefb=1 !!!
(In reply to Kriton Kyrimis from comment #31)
> I think I'll add "nosimplefb=1" to /etc/default/grub.conf until further
> notice.

Hmmm.. no. This way I can't get a Linux console with ctrl-alt-f2. Oh, well. Back to nouveau.
(In reply to Kriton Kyrimis from comment #33)
> (In reply to Kriton Kyrimis from comment #31)
> > I think I'll add "nosimplefb=1" to /etc/default/grub.conf until further
> > notice.
>
> Hmmm.. no. This way I can't get a Linux console with ctrl-alt-f2. Oh, well.
> Back to nouveau.

Interesting, but ctrl-alt-f1, ctrl-alt-f2, ... all work for me.
(In reply to Kriton Kyrimis from comment #32)
> (In reply to Kriton Kyrimis from comment #31)
>
> > nosimplefb works!
>
> nosimplefb=1 !!!

Works for me too!
Hi

(In reply to Kriton Kyrimis from comment #33)
> (In reply to Kriton Kyrimis from comment #31)
> > I think I'll add "nosimplefb=1" to /etc/default/grub.conf until further
> > notice.
>
> Hmmm.. no. This way I can't get a Linux console with ctrl-alt-f2. Oh, well.
> Back to nouveau.

Thanks for the log. Nvidia requires efifb for displaying the console, but that driver isn't mentioned in your log, which is surprising. I'll try to reproduce the bug tomorrow.
I have the same issue. Also I would like to add a log from /var/log/lightdm/x-0.log.

MESA-LOADER: failed to open simpledrm: /usr/lib64/dri/simpledrm_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib64/dri, suffix _dri)
failed to load driver: simpledrm
MESA-LOADER: failed to open zink: /usr/lib64/dri/zink_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib64/dri, suffix _dri)
failed to load driver: zink
randr: failed to create shared pixmap
(II) Server terminated successfully (0). Closing log file.

This is running 5.18.9-2-default. When I revert back to 5.18.9-1-default, it boots and those logs don't show up.
I think you can ignore these error messages. When simplefb/simpledrm is active, Mesa tries to find a native driver. If none exists, it tries a driver that wraps to Vulkan (zink). If this fails as well, because there is no Vulkan driver, it falls back to the software rasterizer (swrast). But the simplefb/simpledrm driver should be disabled by default on systems with the nvidia driver. This is the issue here. For now, use the workaround that is now mentioned in the title of this ticket.
*** Bug 1201396 has been marked as a duplicate of this bug. ***
No issues with graphical login here, but tty stopped working.

$ uname -a
Linux 5.18.9-2-default #1 SMP PREEMPT_DYNAMIC Wed Jul 6 05:57:32 UTC 2022 (a7c5f9c) x86_64 x86_64 x86_64 GNU/Linux
$ dmesg | egrep -i '(modeset|efifb|simpledrm)'
[ 0.193310] pci 0000:01:00.0: BAR 1: assigned to efifb
[ 0.337295] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
[ 0.347103] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
[ 3.916820] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 515.57 Wed Jun 22 22:31:08 UTC 2022
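The dmesg filter above can be turned into a quick summary of which early framebuffer drivers a kernel log mentions at all. This is a rough sketch with an invented function name; it only checks whether the words simpledrm, efifb, or vesafb occur in the log and does not prove that a driver is actually bound to the display.

```shell
# Sketch: read kernel log text on stdin and print the early framebuffer
# drivers it mentions (simpledrm, efifb, vesafb), or "none".
early_fb_drivers() {
    log=$(cat)
    found=""
    for drv in simpledrm efifb vesafb; do
        # -w matches the driver name as a whole word, -i ignores case.
        if printf '%s\n' "$log" | grep -q -i -w "$drv"; then
            found="$found $drv"
        fi
    done
    # Strip the leading space; print "none" when nothing matched.
    found=${found# }
    printf '%s\n' "${found:-none}"
}
```

Typical use would be: dmesg | early_fb_drivers. If simpledrm is still listed after booting with nosimplefb=1, the option most likely did not take effect.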
(In reply to Alan Hughes from comment #34) > Interesting, but ctrl-alt-f1, ctrl-alt-f2, ... all work for me. (In reply to Thomas Zimmermann from comment #36) > Nvidia requires efifb for displaying the console, but > that driver isn't mentioned in your log, which is surprising. I don't know about the machine where that log was taken, but mine is a BIOS machine, so I don't have efifb. That's probably why I don't get a Linux console with nosimplefb=1.
(In reply to Kriton Kyrimis from comment #41) > (In reply to Alan Hughes from comment #34) > > Interesting, but ctrl-alt-f1, ctrl-alt-f2, ... all work for me. > > (In reply to Thomas Zimmermann from comment #36) > > Nvidia requires efifb for displaying the console, but > > that driver isn't mentioned in your log, which is surprising. > > I don't know about the machine where that log was taken, but mine is a BIOS > machine, so I don't have efifb. That's probably why I don't get a Linux > console with nosimplefb=1. Oh, indeed. And the respective driver, vesafb, is in the journal's output. I'm surprised that it doesn't work then. With nosimplefb, you're booting with the same drivers as before. I'm working on a fix. Let's see if this leads to anything.
(In reply to Thomas Zimmermann from comment #42) > And the respective driver, vesafb, is in the journal's output. > I'm surprised that it doesn't work then. With nosimplefb, you're booting > with the same drivers as before. It turns out that that particular problem was between keyboard and chair: the function lock on my keyboard was disabled, and the F2 key was sending "Undo" instead of "F2"! After enabling function lock by hitting the "F Lock" key, Ctl-Alt-Fn works fine. HOWEVER: after returning to KDE from a Linux console, the text under the icons on the desktop is garbled. Killing and restarting plasmashell restores the text. This does not happen under nouveau.
(In reply to Kriton Kyrimis from comment #43)
> (In reply to Thomas Zimmermann from comment #42)
> > And the respective driver, vesafb, is in the journal's output.
> > I'm surprised that it doesn't work then. With nosimplefb, you're booting
> > with the same drivers as before.
>
> It turns out that that particular problem was between keyboard and chair:
> the function lock on my keyboard was disabled, and the F2 key was sending
> "Undo" instead of "F2"! After enabling function lock by hitting the "F Lock"
> key, Ctl-Alt-Fn works fine.

Ah, ok. nosimplefb works then.

> HOWEVER: after returning to KDE from a Linux console, the text under the
> icons on the desktop is garbled. Killing and restarting plasmashell restores
> the text. This does not happen under nouveau.

This sounds like a problem with nvidia.ko itself. Did this behavior change with the recent updates?
(In reply to Thomas Zimmermann from comment #44) > This sounds like a problem with nvidia.ko itself. Did this behavior change > with the recent updates? I don't really know. I don't often switch to a Linux console, and there are all sorts of possibly related things that have changed since the last time I've needed to do that: kernel, KDE framework, KDE plasma, nvidia driver, and who knows what else. I can't revert to an older system, so I can't check, but I certainly hadn't noticed this behavior before.
Stefan has added a change to the nvidia RPMs so that nosimplefb is set automatically. This should resolve any related problems.
*** Bug 1201485 has been marked as a duplicate of this bug. ***
(In reply to Thomas Zimmermann from comment #46)
> Stefan has added change to the nvidia RPMs so that nosimplefb is set
> automatically. This should resolve any related problems.

Yes, this will get fixed with the next driver package update. Changelog entry:

-------------------------------------------------------------------
Fri Jul 15 10:14:41 UTC 2022 - Stefan Dirsch <sndirsch@suse.com>

- add "nosimplefb=1" kernel boot option as workaround for TW to
  disable simpledrm during install and remove it again during
  uninstall (boo#1201392)
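The package scriptlets themselves are not shown in this report. Purely as an illustration of what "add during install, remove again during uninstall" means for a kernel option string, here is a sketch with two hypothetical helpers that work on a plain space-separated string; it is not the code used in the nvidia RPMs.

```shell
# Sketch: append an option to a space-separated option string unless it is
# already present (so repeated installs do not duplicate it).
add_option() {
    opts=$1
    opt=$2
    case " $opts " in
        *" $opt "*) printf '%s\n' "$opts" ;;
        *)          printf '%s\n' "${opts:+$opts }$opt" ;;
    esac
}

# Sketch: remove every exact occurrence of an option from the string.
# Runs in a subshell so that "set -f" (no globbing) stays local.
del_option() (
    set -f
    out=""
    for word in $1; do
        [ "$word" = "$2" ] || out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
)
```

Both helpers are idempotent: adding an option that is already there, or deleting one that is missing, returns the string unchanged.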
Closing as fixed.
Upstream issue for this https://github.com/NVIDIA/open-gpu-kernel-modules/issues/228
*** Bug 1201453 has been marked as a duplicate of this bug. ***
After the latest update, the problem persists. I didn't add the nosimplefb=1 option before the update just to see what would happen. I'm on kernel version 5.18.11, Nvidia driver version 470.129.06, TW release 20220718.

This is the log I get from /var/log/lightdm/x-0.log:

(==) Log file: "/var/log/Xorg.0.log", Time: Wed Jul 20 12:41:01 2022
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
MESA-LOADER: failed to open simpledrm: /usr/lib64/dri/simpledrm_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib64/dri, suffix _dri)
failed to load driver: simpledrm
MESA-LOADER: failed to open zink: /usr/lib64/dri/zink_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib64/dri, suffix _dri)
failed to load driver: zink
randr: failed to create shared pixmap
(II) Server terminated successfully (0). Closing log file.

I've added nosimplefb=1 to grub now using yast, because I had no usable kernel left: the last working one got deleted after the latest update. Now with the option it runs fine.
(In reply to Sotir Danailov from comment #52) > After the latest update, the problem persists. I didn't add the nosimplefb=1 > option before the update just to see what would happen. It will be fixed in the next NVIDIA Driver update, not TW.
Oh right, I didn't notice that there's no release for the nvidia package yet, sorry. I added unnecessary noise.
*** Bug 1201764 has been marked as a duplicate of this bug. ***
After the latest update to the Nvidia package, I can confirm that it now adds nosimplefb=1 properly to the bootloader.
(In reply to Sotir Danailov from comment #56) > After the latest update to the Nvidia package, I can confirm that it now > adds nosimplefb=1 properly to the bootloader. Thanks for confirmation! ;-)
(In reply to Sotir Danailov from comment #56)
> After the latest update to the Nvidia package, I can confirm that it now
> adds nosimplefb=1 properly to the bootloader.

No, it doesn't for me. This is what I get in my `/etc/default/grub` (note the `'`s):

> GRUB_CMDLINE_LINUX_DEFAULT="'splash=silent resume=/dev/disk/by-id/nvme-Samsung_SSD_980_1TB_S649NX0RB25368R-part3 quiet mitigations=auto' nosimplefb=1"
(In reply to Andrei Dziahel from comment #58)
> (In reply to Sotir Danailov from comment #56)
> > After the latest update to the Nvidia package, I can confirm that it now
> > adds nosimplefb=1 properly to the bootloader.
>
> No, it doesn't for me, that's what I get in my `/etc/default/grub` (note the
> `'`s)
>
> > GRUB_CMDLINE_LINUX_DEFAULT="'splash=silent resume=/dev/disk/by-id/nvme-Samsung_SSD_980_1TB_S649NX0RB25368R-part3 quiet mitigations=auto' nosimplefb=1"

Indeed, you have these extra `'` in there. I suggest removing them and trying again via

pbl --del-option nosimplefb=1 --config
pbl --add-option nosimplefb=1 --config

AFAIK these quotes should not be there. I assume you've added them manually at some point.
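To check for the stray `'` characters described above before re-running pbl, the relevant line of the grub defaults file can be tested mechanically. This is a small sketch with an invented function name; it only looks at a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line and does not modify anything.

```shell
# Sketch: read an /etc/default/grub-style file on stdin. The exit status is 0
# if the double-quoted GRUB_CMDLINE_LINUX_DEFAULT value contains a single
# quote, and non-zero otherwise.
has_stray_quotes() {
    grep -E '^GRUB_CMDLINE_LINUX_DEFAULT="' | grep -q "'"
}
```

For example: has_stray_quotes < /etc/default/grub && echo "single quotes found in GRUB_CMDLINE_LINUX_DEFAULT".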