Bug 1141041

Summary: Nvidia Quadro + nouveau: after snapshot 20190708 the second screen no longer displays anything
Product: [openSUSE] openSUSE Tumbleweed Reporter: Philippe Condé <conde.philippe>
Component: X.Org Assignee: E-mail List <xorg-maintainer-bugs>
Status: RESOLVED UPSTREAM QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Normal    
Priority: P5 - None CC: conde.philippe, mrmazda
Version: Current   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE Factory   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=111110
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: xorg.0.log
list of packages installed via "zypper dup" on july,10
diff dmesg output with quadro4000 and quadro K4200

Description Philippe Condé 2019-07-10 16:10:49 UTC
Created attachment 810023 [details]
xorg.0.log

I have a Tumbleweed system with an Nvidia Quadro K4200 to which two screens are connected:
- one on the DVI port
- the second on the display port 1 (DP1)
Both screens run at 1920*1080 and are defined as one virtual screen of 3840*1080.
I use nouveau as the driver; the DE is KDE.

This worked for years without problems. After installing snapshot 20190708 I rebooted, and the second screen no longer displays anything.
I found that my config in 
- /etc/X11/xorg.conf.d/50-monitor.conf
- /etc/X11/xorg.conf.d/50-screen.conf
- /etc/X11/xorg.conf.d/50-device.conf
has been reset to the defaults. My configs were saved as *.rpmsave. I copied the rpmsave files back over the originals and rebooted, but this did not solve the problem.
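Before copying saved files back, it can help to see what the update actually changed. The following is a minimal sketch, not part of the original report: the function name compare_rpmsave is hypothetical, and it assumes the saved copies sit next to the active files with a .rpmsave suffix, as described above.

```shell
#!/bin/sh
# For every *.rpmsave file in a config directory, show how the saved copy
# differs from the file the package update put in its place.
# Assumption: the directory defaults to the one named in the report.
compare_rpmsave() {
    dir=${1:-/etc/X11/xorg.conf.d}
    for saved in "$dir"/*.rpmsave; do
        [ -e "$saved" ] || continue            # glob matched nothing
        current=${saved%.rpmsave}
        echo "== $current"
        if [ -e "$current" ]; then
            # "-" lines come from the active file, "+" lines from the saved copy.
            diff -u "$current" "$saved" || true   # diff exits 1 when files differ
        else
            echo "no active file; only the .rpmsave copy exists"
        fi
    done
}
# Usage: compare_rpmsave /etc/X11/xorg.conf.d
```

An empty diff for a file would mean the saved copy and the new default are identical, so restoring it cannot change anything.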
Symptoms:
The second screen receives a signal and wakes up, but after a few seconds it displays "displayport: no signal".
This occurs:
1. during boot, when the resolution changes from the default to 1920*1080
2. when the graphical display starts
3. on each login
4. when I start System Settings

I'll attach the Xorg.0.log to this bug report.

I tested with an older kernel (5.1.15-1), but the problem is now present there as well.
Regards
Philippe Condé
Comment 1 Felix Miata 2019-07-11 05:09:26 UTC
Philippe, my Quadro is slightly older than yours, but works fine with 20190708 automagically with dual displays whether in KDE3 or Plasma. Neither /etc/X11/xorg.conf nor /etc/X11/xorg.conf.d/50-[device,monitor,screen].conf exist:
# xrandr | egrep 'onnect|creen|\*' | grep -v disconn | sort -r
Screen 0: minimum 320 x 200, current 4480 x 1440, maximum 16384 x 16384
DP-2 connected 1920x1200+2560+0 (normal left inverted right x axis y axis) 519mm x 324mm
DP-1 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 598mm x 336mm
   2560x1440     59.95*+  74.92
   1920x1200     59.95*+
p5bse:~ # inxi -GxxS
System:    Host: p5bse Kernel: 5.1.16-1-default x86_64 bits: 64 compiler: gcc v: 9.1.1 Desktop: KDE 3.5.10 tk: Qt 3.3.8c
           wm: kwin dm: N/A Distro: openSUSE Tumbleweed 20190708
Graphics:  Device-1: NVIDIA GF119 [NVS 310] vendor: Hewlett-Packard driver: nouveau v: kernel bus ID: 01:00.0
           chip ID: 10de:107d
           Display: x11 server: X.Org 1.20.5 driver: modesetting unloaded: fbdev,vesa alternate: nouveau,nv,nvidia
           resolution: 2560x1440~60Hz, 1920x1200~60Hz
           OpenGL: renderer: llvmpipe (LLVM 8.0 128 bits) v: 3.3 Mesa 19.1.1 compat-v: 3.1 direct render: Yes
# xrandr | egrep 'onnect|creen|\*' | grep -v disconn | sort -r
Screen 0: minimum 320 x 200, current 4480 x 1440, maximum 16384 x 16384
DP-2 connected 1920x1200+2560+0 (normal left inverted right x axis y axis) 519mm x 324mm
DP-1 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 598mm x 336mm
   2560x1440     59.95*+  74.92  
   1920x1200     59.95*+
p5bse:/boot # inxi -GxxS
System:    Host: p5bse Kernel: 5.1.16-1-default x86_64 bits: 64 compiler: gcc v: 9.1.1 Desktop: KDE Plasma 5.16.2 
           tk: Qt 5.13.0 wm: kwin_x11 dm: KDM Distro: openSUSE Tumbleweed 20190708 
Graphics:  Device-1: NVIDIA GF119 [NVS 310] vendor: Hewlett-Packard driver: nouveau v: kernel bus ID: 01:00.0 
           chip ID: 10de:107d 
           Display: x11 server: X.Org 1.20.5 driver: modesetting unloaded: fbdev,vesa alternate: nouveau,nv,nvidia 
           compositor: kwin_x11 resolution: 2560x1440~60Hz, 1920x1200~60Hz 
           OpenGL: renderer: llvmpipe (LLVM 8.0 128 bits) v: 3.3 Mesa 19.1.1 compat-v: 3.1 direct render: Yes

Maybe you would have better luck if you removed or installed xf86-video-nouveau. I don't have it installed.

On KDE3 installation I use kdebase3-kdm. On Plasma installation I use KDM.
Comment 2 Philippe Condé 2019-07-11 07:02:37 UTC
Hello Felix,

xf86-video-nouveau was installed. I removed it and rebooted, but X didn't start. When I tried to start it manually, it gave a fatal error and suggested that the setuid bit is not set on /usr/bin/Xorg.

So I reinstalled xf86-video-nouveau and rebooted. I got the graphical login, but my second screen still displays nothing, although it wakes up on some actions (starting YaST, System Settings, etc.).
I have seen that there are errors in journalctl:
Jul 11 08:30:04 hpprol2 kernel: nouveau 0000:0a:00.0: disp: outp 03:0006:0f42: training failed

Running journalctl -f, I see that this error appears each time the second screen wakes up. If I switch to a console via Ctrl-Alt-Fx, the message is also displayed on the console (the second screen displays nothing).
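To check how often the error occurs in a given boot, the kernel log can be filtered for it. This is only a sketch: the function name count_training_failures is hypothetical, and the pattern is taken from the single log line quoted above, so other nouveau messages are not covered.

```shell
#!/bin/sh
# Count nouveau link-training failures in kernel log text read from stdin.
count_training_failures() {
    # grep -c prints 0 and exits 1 when nothing matches; keep the exit status 0.
    grep -c 'nouveau.*training failed' || true
}
# Usage (kernel messages of the current boot):
#   journalctl -k -b --no-pager | count_training_failures
```

Comparing the count before and after an action such as a login would show whether that action triggers a new training attempt.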

I shut down my system, disconnected it from power for 30 seconds, and rebooted, but this changed nothing.

It seems to be a problem with nouveau, but if I reboot into an old kernel (where the second screen worked), it no longer works either. Maybe it is not nouveau but something else calling nouveau with bad data?

I use sddm, but I don't see it as the cause because the problem is already present during boot.

You are right that the /etc/X11/xorg.conf.d/50-* files seem to be no longer needed. I created them four years ago when I started with the Nvidia K4200. I don't think their contents can cause this problem.

Regards
Philippe
Comment 3 Stefan Dirsch 2019-07-11 10:05:17 UTC
Most likely this is a regression in the nouveau kernel driver. Not much we can do here. You may want to try the Leap 15.1 kernel. I suggest using our G05 NVIDIA proprietary driver.

Standard disclaimer for usage of nouveau driver ...

---
Nouveau is an experimental driver under constant heavy development.
This means that we cannot follow it closely, as we are not part of its team with reverse engineered knowledge of NVIDIA cards.


In case you wish to stick with nouveau and to help us improve its support in openSUSE, you can try our latest kernel, Mesa, and xf86-video-nouveau packages:

  http://kernel.opensuse.org/
  https://build.opensuse.org/package/show/X11:XOrg/Mesa
  https://build.opensuse.org/package/show/X11:XOrg/xf86-video-nouveau

Testing the latest versions is a prerequisite in order to inform nouveau's upstream developers of any bugs you find:

  https://nouveau.freedesktop.org/wiki/Bugs/

Once you are aware of an upstream fix for your issue, please reopen the bug and let us know.
We will be happy to include it in your openSUSE distribution if it's technically feasible.


Alternatively, you can install NVIDIA's proprietary driver instead:

  https://en.opensuse.org/SDB:NVIDIA
---
Comment 4 Philippe Condé 2019-07-11 13:55:41 UTC
Hello Stefan,

I'm using the Xen kernel because I have a VM with Samba, so the Nvidia proprietary driver is a no-go.
I also tested the latest Tumbleweed kernel (without Xen), but the problem is the same.
I have installed the latest kernel, 5.2.0-2.1, and I'll install the Mesa and xf86-video-nouveau packages that you pointed to.

many thanks
Philippe
Comment 5 Philippe Condé 2019-08-16 09:10:41 UTC
Hello,

I have a problem with the Nvidia Quadro K4200, identified as GK104GL.
The screen connected to the DVI port is activated and displays correctly.
The screens connected to ports DP-1 and DP-2 are activated, but nothing is displayed.

I replaced the Quadro K4200 with an old Quadro 4000, identified as GF100GL.

With this card, the screen on DP-1 or DP-2 is activated and displays.
The only related note I can find is in the kernel-firmware changelog:
"
Mon Jun 10 2019 14:00:00 CEST
Martin Pluskal <mpluskal@suse.com>
- Update to version 20190607:
  * linux-firmware: update firmware for mhdp8546
  * linux-firmware: rsi: update firmware images for Redpine 9113 chipset
  * imx: sdma: update firmware to v3.5/v4.5
  * nvidia: update GP10[2467] SEC2 RTOS with the one already used on GP108
"

Could you take another look at this problem?

Many thanks in advance
Philippe
Comment 6 Philippe Condé 2019-08-16 14:57:12 UTC
I did the following:

Reinstalled kernel-firmware-20190514-1.3.noarch.rpm and kernel-default-5.1.10-1.1.x86_64.rpm, then rebooted.

The problem is still present with the Quadro K4200 after the reboot.

I then installed all of libdrm-2.4.98 and Mesa*19.1.0-222 plus some dependencies, and rebooted.
==> kernel 5.2* cannot boot (kernel panic)
==> kernel 5.1.10 boots, but the problem with the Nvidia K4200 is still present.

After this I ran zypper dup to reinstall the latest snapshot.
It seems the firmware is not the only package involved, but I'm unable to determine which other package(s) are.

I'll attach the list of packages installed on July 10, 2019, when the problem first occurred.
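A list like this can be pulled from the zypper history log. The sketch below is not how the attachment was produced. It assumes the pipe-separated layout of /var/log/zypp/history (timestamp|action|name|version|arch|...), and packages_installed_on is a hypothetical name.

```shell
#!/bin/sh
# Print "name version" for every package installed on a given day.
# Assumption: history lines look like
#   2019-07-10 16:01:02|install|NAME|VERSION|ARCH|...
packages_installed_on() {
    day=$1
    history=${2:-/var/log/zypp/history}
    awk -F'|' -v day="$day" \
        '$2 == "install" && index($1, day) == 1 { print $3, $4 }' "$history"
}
# Usage: packages_installed_on 2019-07-10
```

Running it for the day before the problem appeared and for the day it appeared would narrow the set of candidate packages.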

Regards
Philippe
Comment 7 Philippe Condé 2019-08-16 14:58:42 UTC
Created attachment 814285 [details]
list of packages installed via  "zypper dup" on july,10
Comment 8 Stefan Dirsch 2019-08-20 12:06:09 UTC
With no picture shown at all, this sounds like a kernel regression to me.
Comment 9 Stefan Dirsch 2019-08-20 12:08:52 UTC
Oh no. Looking again at the bugreport this never worked for you. I again suggest to use NVIDIA's proprietary driver. See my previous comment#3.
Comment 10 Philippe Condé 2019-08-21 06:36:30 UTC
hello,

My Nvidia K4200 worked with two screens until snapshot 20190708.

I replaced it with an old Quadro 4000, and with this old card the two screens work correctly.

I suppose that the Quadro K4200 is no longer correctly recognized, maybe because support for new Nvidia cards was introduced in nouveau or in another component (Mesa? drm?).

For me this is a regression.

On my server I also have two VMs under Xen. The proprietary Nvidia driver cannot work with this configuration, so I'm locked to nouveau.

My server is an HP ProLiant ML350, which only works with Nvidia Quadro cards. I chose the K4200 because it gives better results with full-screen video.

Regards
Philippe
Comment 11 Philippe Condé 2019-08-21 06:39:55 UTC
I created a bug report upstream. See
https://bugs.freedesktop.org/show_bug.cgi?id=111110

Their last comment was:

"Perhaps I don't quite understand the issue then, but display modesets are
controlled by the kernel. The error you're seeing is with link training. So it
seems likely that a kernel update is responsible.

Could also be a userspace update which makes use of the kernel differently,
triggering a pre-existing bug. libdrm_nouveau/mesa and such are not at all
involved in this though -- that's just for 2d/3d accelerated rendering"

Regards
Philippe
Comment 12 Stefan Dirsch 2019-08-21 08:59:27 UTC
Ok. I missed that you were using Xen and had in the end got the nouveau driver running with the Quadro K4200. Indeed, the Quadro 4000 is an older NVIDIA card.

I agree with the upstream comment. Likely a kernel regression. I cannot offer you a resolution here. I hope you still have a kernel installed that worked for you. Otherwise I'm afraid you either need to use the older Quadro 4000 or switch to the fbdev driver, which would be rather slow. For sure slower than using your Quadro 4000. Sigh.
Comment 13 Stefan Dirsch 2020-02-27 10:56:56 UTC
Any improvements with Leap 15.1, Leap 15.2-Beta or current Tumbleweed?
Comment 14 Philippe Condé 2020-02-27 13:20:45 UTC
(In reply to Stefan Dirsch from comment #13)
> Any improvements with Leap 15.1, Leap 15.2-Beta or current Tumbleweed?

I test the K4200 card with each new kernel (currently Linux hpprol2 5.5.5-1-default): there is no change, the error is still present.

I still think that the problem is in nouveau. If I compare the dmesg output with the Quadro 4000 and the Quadro K4200, I see the error shortly after nouveau starts. I'll attach a PDF with the diff of the dmesg output.
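A comparison like this is easier to read when the boot timestamps are removed first, since they differ on every line between two boots. The following is a sketch only: strip_dmesg_timestamps is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Strip the leading "[   12.345678] " timestamp from dmesg text on stdin,
# so that two captures can be diffed line by line.
strip_dmesg_timestamps() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //'
}
# Usage (file names are hypothetical):
#   dmesg | strip_dmesg_timestamps > dmesg-k4200.txt     # booted with the K4200
#   dmesg | strip_dmesg_timestamps > dmesg-q4000.txt     # booted with the Quadro 4000
#   diff -u dmesg-q4000.txt dmesg-k4200.txt | grep -i nouveau
```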

Regards
Philippe
Comment 15 Philippe Condé 2020-02-27 13:25:12 UTC
Created attachment 831428 [details]
diff dmesg output with quadro4000 and quadro K4200

The error is marked in yellow. At this boot step, the DP1 monitor is activated but displays nothing.

Regards
Philippe
Comment 16 Stefan Dirsch 2020-02-27 14:39:16 UTC
Ok. Thanks! I'm afraid this won't be addressed any longer in nouveau driver. :-( Anyway, let's close it as upstream.

https://bugs.freedesktop.org/show_bug.cgi?id=111110
Comment 17 Felix Miata 2020-02-27 16:51:36 UTC
(In reply to Philippe Condé from comment #2)
> xf86-video-nouveau was installed. I removed it and rebooted but X didn't
> start. When I tried to start it manually,  it gives a fatal error and as
> suggestion says that /usr/bin/Xorg has setuid not set????

I suggest trying again. Side-by-side is the default multi-display configuration. When you try, make sure not to have any xorg.con* files that would prevent use of the modesetting DDX. IOW, remove them all, then try. Also, try with a display manager running instead of using startx. This should avoid setuid trouble.
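Whether the modesetting DDX is actually in use after such a change can be checked in the X server log. This is a rough sketch: loaded_x_modules is a hypothetical name, it assumes the usual `(II) LoadModule: "name"` log lines, and the log path in the usage comment may differ (for example when X runs without root).

```shell
#!/bin/sh
# List the modules an X server loaded, given Xorg log text on stdin.
# Assumption: the log contains lines such as
#   [    10.050] (II) LoadModule: "modesetting"
loaded_x_modules() {
    sed -n 's/.*(II) LoadModule: "\([^"]*\)".*/\1/p' | sort -u
}
# Usage: loaded_x_modules < /var/log/Xorg.0.log
```

If the output lists nouveau but not modesetting, some configuration file or the xf86-video-nouveau package is probably still selecting the nouveau DDX.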