|
Bugzilla – Full Text Bug Listing |
| Summary: | intel [Skylake] GNOME shell 43.3.1 / X11 dumps core - Mesa 23.0.0 regression | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Martin Wilck <martin.wilck> |
| Component: | X.Org | Assignee: | Gfx Bugs <gfx-bugs> |
| Status: | RESOLVED FIXED | QA Contact: | Gfx Bugs <gfx-bugs> |
| Severity: | Normal | ||
| Priority: | P3 - Medium | CC: | martin.wilck, mkoutny, vliaskovitis |
| Version: | Current | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| URL: | https://gitlab.freedesktop.org/mesa/mesa/-/issues/8542 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
Changed component to X.org.

(In reply to Martin Wilck from comment #0)
> Note that neither gnome-shell nor mutter or gogl has been updated in the
> transaction that led to the issue. The only package in the list of updated
> packages that looks suspicious in the context shown below is Mesa, which was
> updated from 22.3.5-343.1 to 23.0.0-345.1, and the kernel, which has been
> updated from 6.1.12-1 to 6.2.1-1.

It happens with 6.1.12-1, too, so the kernel is not the culprit.

Ok. So try to downgrade Mesa. Possible package list you need to downgrade:

Mesa
Mesa-KHR-devel
Mesa-devel
Mesa-dri
Mesa-dri-devel
Mesa-dri-nouveau
Mesa-dri-vc4
Mesa-gallium
Mesa-libEGL-devel
Mesa-libEGL1
Mesa-libGL-devel
Mesa-libGL1
Mesa-libGLESv1_CM-devel
Mesa-libGLESv2-devel
Mesa-libGLESv3-devel
Mesa-libOpenCL
Mesa-libRusticlOpenCL
Mesa-libd3d
Mesa-libd3d-devel
Mesa-libglapi-devel
Mesa-libglapi0
Mesa-libva
Mesa-vulkan-device-select
Mesa-vulkan-overlay
libOSMesa-devel
libOSMesa8
libgbm-devel
libgbm1
libvdpau_nouveau
libvdpau_r300
libvdpau_r600
libvdpau_radeonsi
libvdpau_virtio_gpu
libvulkan_broadcom
libvulkan_freedreno
libvulkan_intel
libvulkan_lvp
libvulkan_radeon
libxatracker-devel
libxatracker2

And you need to restart the X server after this.
I just downgraded the following packages (sorry I started before I read your comment):

Mesa-libglapi0|22.3.5-343.1
Mesa-KHR-devel|22.3.5-343.1
Mesa-libEGL1|22.3.5-343.1
Mesa-libGL1|22.3.5-343.1
Mesa-gallium|22.3.5-343.1
Mesa|22.3.5-343.1
Mesa-dri|22.3.5-343.1
Mesa-libEGL-devel|22.3.5-343.1
Mesa-libGLESv2-devel|22.3.5-343.1
Mesa-libGLESv1_CM-devel|22.3.5-343.1
Mesa-dri-devel|22.3.5-343.1
Mesa-libGL-devel|22.3.5-343.1
libOSMesa8|22.3.5-343.1
libOSMesa-devel|22.3.5-343.1
Mesa-libglapi-devel|22.3.5-343.1
Mesa-gallium-32bit|22.3.5-343.1
Mesa-dri-32bit|22.3.5-343.1
Mesa-32bit|22.3.5-343.1
Mesa-libGL1-32bit|22.3.5-343.1
Mesa-devel|22.3.5-343.1
Mesa-libglapi0-32bit|22.3.5-343.1
Mesa-vulkan-device-select|22.3.5-343.1
Mesa-libva|22.3.5-343.1
Mesa-libEGL1-32bit|22.3.5-343.1

GNOME seems to work now. I noticed that GDM wouldn't offer me "GNOME/Xorg" any more. But when I start the "GNOME" session, it seems to start GNOME/Xorg. At least X is running. So the original problem isn't observed. However typing in this browser window feels sluggish. It seems that I'm not getting any acceleration any more. I'll downgrade the other packages you recommended and see how it goes.

All packages listed in comment 3 downgraded. I'm offered a GNOME/Xorg session now again, and it doesn't crash.

> However typing in this browser window feels sluggish

This was a different problem, related to bluetooth and my BT keyboard. Forget it.
Thanks. So apparently it's a Mesa issue. Which graphics hardware is this? The output of

glxinfo -B
inxi -aG

would be useful here.

$ glxinfo -B
name of display: :0
display: :0 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
Vendor: Intel (0x8086)
Device: Mesa Intel(R) HD Graphics 520 (SKL GT2) (0x1916)
Version: 22.3.5
Accelerated: yes
Video memory: 7713MB
Unified memory: yes
Preferred profile: core (0x1)
Max core profile version: 4.6
Max compat profile version: 4.6
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) HD Graphics 520 (SKL GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 22.3.5
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL version string: 4.6 (Compatibility Profile) Mesa 22.3.5
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 22.3.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
# inxi -aG
Graphics:
Device-1: Intel Skylake GT2 [HD Graphics 520] vendor: Dell Latitude E7470 driver: i915 v: kernel
arch: Gen-9 process: Intel 14n built: 2015-16 ports: active: DP-2,DP-4,eDP-1 empty: DP-1, DP-3,
HDMI-A-1, HDMI-A-2 bus-ID: 00:02.0 chip-ID: 8086:1916 class-ID: 0300
Device-2: Sunplus Innovation Integrated_Webcam_HD type: USB driver: uvcvideo bus-ID: 1-2:2
chip-ID: 1bcf:28b8 class-ID: 0e02
Display: server: X.org v: 1.21.1.7 with: Xwayland v: 22.1.8 compositor: gnome-shell v: 43.3
driver: X: loaded: intel unloaded: fbdev,modesetting,vesa dri: i965 gpu: i915 tty: 135x30
Monitor-1: DP-2 model: Dell P2414H serial: KKMMW62572JB built: 2016 res: 1920x1080 dpi: 93
gamma: 1.2 size: 527x297mm (20.75x11.69") diag: 605mm (23.8") ratio: 16:9 modes: max: 1920x1080
min: 720x400
Monitor-2: DP-4 model: Fujitsu Siemens P24W-7 LED serial: YV8S000307 built: 2014
res: 1920x1200 dpi: 94 gamma: 1.2 size: 518x324mm (20.39x12.76") diag: 611mm (24.1")
ratio: 16:10 modes: max: 1920x1200 min: 720x400
Monitor-3: eDP-1 model: LG Display 0x0490 built: 2014 res: 1920x1080 dpi: 158 gamma: 1.2
size: 309x174mm (12.17x6.85") diag: 355mm (14") ratio: 16:9 modes: 1920x1080
API: OpenGL Message: GL data unavailable in console for root.
Graphics:
Device-1: Intel Skylake GT2 [HD Graphics 520] vendor: Dell Latitude E7470
driver: i915 v: kernel arch: Gen-9 process: Intel 14n built: 2015-16 ports:
active: DP-2,DP-4,eDP-1 empty: DP-1, DP-3, HDMI-A-1, HDMI-A-2
bus-ID: 00:02.0 chip-ID: 8086:1916 class-ID: 0300
Device-2: Sunplus Innovation Integrated_Webcam_HD type: USB
driver: uvcvideo bus-ID: 1-2:2 chip-ID: 1bcf:28b8 class-ID: 0e02
Display: x11 server: X.Org v: 21.1.7 with: Xwayland v: 22.1.8
compositor: gnome-shell v: 43.3 driver: X: loaded: intel
unloaded: fbdev,modesetting,vesa dri: i965 gpu: i915 display-ID: :0
screens: 1
Screen-1: 0 s-res: 3648x1920 s-dpi: 96 s-size: 965x508mm (37.99x20.00")
s-diag: 1091mm (42.93")
Monitor-1: DP-2 mapped: DP1-1 pos: primary,top-center model: Dell P2414H
serial: KKMMW62572JB built: 2016 res: 1080x1920 hz: 60 dpi: 91 gamma: 1.2
size: 300x530mm (11.81x20.87") diag: 605mm (23.8") ratio: 16:9 modes:
max: 1920x1080 min: 720x400
Monitor-2: DP-4 mapped: DP1-3 pos: top-left model: Fujitsu Siemens P24W-7
LED serial: YV8S000307 built: 2014 res: 1200x1920 hz: 60 dpi: 95
gamma: 1.2 size: 320x520mm (12.6x20.47") diag: 611mm (24.1") ratio: 16:10
modes: max: 1920x1200 min: 720x400
Monitor-3: eDP-1 mapped: eDP1 pos: bottom-r model: LG Display 0x0490
built: 2014 res: 1368x768 dpi: 112 gamma: 1.2 size: 310x170mm (12.2x6.69")
diag: 355mm (14") ratio: 16:9 modes: 1920x1080
API: OpenGL v: 4.6 Mesa 22.3.5 renderer: Mesa Intel HD Graphics 520 (SKL
GT2) direct render: Yes
Seeing that the crash in mutter/gnome-shell is in code related to handling GLX_INTEL_swap_event, I wonder if reverting this Mesa 23.0 commit makes a difference:

"
From 19c57ea3bf6d77cf6f07f2a56e781f55b0e6013b Mon Sep 17 00:00:00 2001
From: Adam Jackson <ajax@redhat.com>
Date: Tue, 13 Dec 2022 12:26:58 -0500
Subject: [PATCH] glx: Remove pointless GLX_INTEL_swap_event paranoia

It's not our job to filter this out, it's the server's job to not send
events that haven't been selected for. We'll still throw the event away
if we don't have any client-side state for it though.
"

Debug package with this patch reverted:
https://build.opensuse.org/package/binaries/home:vliaskovitis:branches:X11:XOrg/Mesa/openSUSE_Tumbleweed

However as the upstream commit log implies, this is only for debug purposes: If the revert does fix things, it likely means the proper solution is to handle the event differently in either gnome's mutter/gnome-shell or xorg server, not in Mesa itself. So if the revert fixes things, it would just help us find the correct component to focus on.

Side note: there seems to be a package dependency issue here. Spin-off bug 1209086 created.
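The commit message quoted above distinguishes two client-side checks: dropping events for drawables without any client-side state (kept), and dropping events the client never selected (removed as "paranoia"). The following is only a rough model of that distinction, not Mesa's code. It assumes the selection is tracked as a per-drawable bit mask; `DrawableState`, `SWAP_COMPLETE_MASK`, `find_drawable` and `filter_swap_event` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-drawable client state; NOT Mesa's actual structures. */
#define SWAP_COMPLETE_MASK (1u << 26) /* illustrative "swap events selected" bit */

typedef struct {
    uint32_t xid;        /* X drawable ID */
    uint32_t event_mask; /* events the client selected for this drawable */
} DrawableState;

typedef enum { EVENT_DROP, EVENT_DELIVER } EventAction;

/* Linear lookup of the client-side state for a drawable; NULL if unknown. */
const DrawableState *find_drawable(const DrawableState *table, size_t n,
                                   uint32_t xid)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].xid == xid)
            return &table[i];
    }
    return NULL;
}

/* Decide what to do with a swap-complete event for drawable `xid`.
 * strict == true models the extra check (drop unselected events);
 * strict == false models trusting the server to send only selected events. */
EventAction filter_swap_event(const DrawableState *table, size_t n,
                              uint32_t xid, bool strict)
{
    const DrawableState *d = find_drawable(table, n, xid);

    if (d == NULL)
        return EVENT_DROP; /* no client-side state: dropped in both modes */
    if (strict && (d->event_mask & SWAP_COMPLETE_MASK) == 0)
        return EVENT_DROP; /* the additional client-side filter */
    return EVENT_DELIVER;
}
```

With `strict` set to false, an event for a known drawable that never selected swap events reaches the application, which then has to cope with it on its own. Whether that is what happened in this bug is not established by the sketch.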
My first attempt to update from Vasilis' repo resulted in the following package mix from Vasilis and Factory:

Factory:
Mesa-libglapi0-23.0.0-345.1
Mesa-KHR-devel-23.0.0-345.1
Mesa-libEGL1-23.0.0-345.1
Mesa-gallium-23.0.0-345.1
Mesa-23.0.0-345.1
Mesa-dri-23.0.0-345.1
Mesa-libEGL-devel-23.0.0-345.1
Mesa-libGLESv2-devel-23.0.0-345.1
Mesa-libGLESv1_CM-devel-23.0.0-345.1
Mesa-dri-devel-23.0.0-345.1
Mesa-libGL-devel-23.0.0-345.1
libOSMesa8-23.0.0-345.1
libOSMesa-devel-23.0.0-345.1
Mesa-libglapi-devel-23.0.0-345.1
Mesa-gallium-32bit-23.0.0-345.1
Mesa-dri-32bit-23.0.0-345.1
Mesa-32bit-23.0.0-345.1
Mesa-libGL1-32bit-23.0.0-345.1
Mesa-devel-23.0.0-345.1

Vasilis:
Mesa-libGL1-23.0.0-1453.1
Mesa-vulkan-device-select-23.0.0-1453.1
Mesa-libglapi0-32bit-23.0.0-1453.1
Mesa-libEGL1-32bit-23.0.0-1453.1
libvulkan_intel-23.0.0-1453.1
libgbm-devel-23.0.0-1453.1
libvdpau_virtio_gpu-23.0.0-1453.1
libvdpau_r600-23.0.0-1453.1
libvdpau_r300-23.0.0-1453.1
libvdpau_nouveau-23.0.0-1453.1
Mesa-libva-23.0.0-1453.1

Anyway, the issue is gone. As the commit mentioned in comment 11 affects Mesa-libGL1 (AFAICT), Vasilis' hypothesis is confirmed. Thanks!

(In reply to Martin Wilck from comment #18)
> Upstream: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8542

Thanks a lot! Watching now ...

@Martin While the issue is being addressed upstream: is this a fatal issue? I mean, does this break the GNOME desktop completely, so that I should reverse-apply this patch ASAP for now?

It breaks the desktop for me, because I have to use GNOME/X11 in order to use barrier, and GNOME/X11 crashes on every startup (not only once, but many times in a row - that's another bug actually, the number of desktop restarts by systemd should be limited somehow). If there aren't a lot of other people affected, I can just set a lock on Mesa and keep the package that works for me. That should work for a limited amount of time. But the issue doesn't seem to have got much upstream traction so far...

Thanks.
I can't imagine you being the only one affected by this. Probably you're just the first one reporting it. I'll do the following for the time being.

-------------------------------------------------------------------
Tue Mar 14 11:53:20 UTC 2023 - Stefan Dirsch <sndirsch@suse.com>

- U_glx-Remove-pointless-GLX_INTEL_swap_event-paranoia.patch
  * reverse apply this patch to fix a regression caused by this
    commit, which resulted in gnome-shell constantly crashing,
    which is making a GNOME/X11 session impossible (boo#1209005)

We'll see what upstream thinks about this ...

https://build.opensuse.org/request/show/1071497

Lowering severity due to regression commit reverted now.

*** Bug 1209203 has been marked as a duplicate of this bug. ***

Now also reverted upstream. I'll close this one once I update to a Mesa version, which supersedes our "revert"-patch.

Patch is now reverted in Mesa 23.1.3. Will be in TW soon. Closing ... .

This is an autogenerated message for OBS integration:
This bug (1209005) was mentioned in
https://build.opensuse.org/request/show/1094791 Factory / Mesa |
Since today's TW update, gnome shell (X11) dumps core repeatedly, roughly every 7s. I've collected >200 core dumps in a few minutes. I needed to kill the systemd user instance because systemd would try to restart gnome-shell forever.

GNOME/wayland seems to work, but I can't use it because I need the barrier KVM switch software which doesn't work with wayland.

Note that neither gnome-shell nor mutter or gogl has been updated in the transaction that led to the issue. The only package in the list of updated packages that looks suspicious in the context shown below is Mesa, which was updated from 22.3.5-343.1 to 23.0.0-345.1, and the kernel, which has been updated from 6.1.12-1 to 6.2.1-1.

Sample core (it's too large to attach to bugzilla but I can provide it on demand):

PID: 30814 (gnome-shell)
UID: 17326 (mwilck)
GID: 50 (suse)
Signal: 11 (SEGV)
Timestamp: Tue 2023-03-07 10:05:12 CET (1h 50min ago)
Command Line: /usr/bin/gnome-shell
Executable: /usr/bin/gnome-shell
Control Group: /user.slice/user-17326.slice/user@17326.service/session.slice/org.gnome.Shell@x11.service
Unit: user@17326.service
User Unit: org.gnome.Shell@x11.service
Slice: user-17326.slice
Owner UID: 17326 (mwilck)
Boot ID: 725dacfb20e34ebaa2c5bb3384db7c25
Machine ID: a0385656b74c9241b77c1bb6577a603b
Hostname: apollon.suse.de
Storage: /var/lib/systemd/coredump/core.gnome-shell.17326.725dacfb20e34ebaa2c5bb3384db7c25.30814.1678179912000000.zst (present)
Size on Disk: 18.4M
Message: Process 30814 (gnome-shell) of user 17326 dumped core.

(gdb) bt
#0  0x00007fd104389e3d in cogl_onscreen_glx_notify_swap_buffers (swap_event=0x7ffc351d7f00, onscreen=0x55655988d120 [CoglOnscreenGlx]) at ../cogl/cogl/winsys/cogl-onscreen-glx.c:991
#1  notify_swap_buffers (context=<optimized out>, swap_event=0x7ffc351d7f00) at ../cogl/cogl/winsys/cogl-winsys-glx.c:184
#2  glx_event_filter_cb (xevent=0x7ffc351d7f00, data=<optimized out>) at ../cogl/cogl/winsys/cogl-winsys-glx.c:224
#3  0x00007fd104388f18 in _cogl_renderer_handle_native_event (renderer=<optimized out>, event=0x7ffc351d7f00) at ../cogl/cogl/cogl-renderer.c:636
#4  cogl_xlib_renderer_handle_event (renderer=<optimized out>, event=0x7ffc351d7f00) at ../cogl/cogl/cogl-xlib-renderer.c:579
#5  0x00007fd1048de110 in cogl_xlib_filter (xevent=<optimized out>, event=<optimized out>, data=<optimized out>) at ../src/backends/x11/meta-clutter-backend-x11.c:94
#6  0x00007fd1048e9d93 in meta_clutter_backend_x11_process_event_filters (clutter_backend_x11=0x5565596b0010 [MetaClutterBackendX11], event=0x55655dd7a2e0, native=0x7ffc351d7f00) at ../src/backends/x11/meta-clutter-backend-x11.c:329
#7  meta_clutter_backend_x11_translate_event (clutter_backend=0x5565596b0010 [MetaClutterBackendX11], native=0x7ffc351d7f00, event=0x55655dd7a2e0) at ../src/backends/x11/meta-clutter-backend-x11.c:363
#8  0x00007fd10498c090 in meta_x11_handle_event.isra.0 (backend=backend@entry=0x5565595f31d0 [MetaBackendX11Cm], xevent=xevent@entry=0x7ffc351d7f00) at ../src/backends/x11/meta-event-x11.c:82
#9  0x00007fd1048e576d in handle_host_xevent (event=0x7ffc351d7f00, backend=0x5565595f31d0 [MetaBackendX11Cm]) at ../src/backends/x11/meta-backend-x11.c:421
#10 x_event_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../src/backends/x11/meta-backend-x11.c:475
#11 0x00007fd1056a0a90 in g_main_dispatch (context=0x5565595de9f0) at ../glib/gmain.c:3454
#12 g_main_context_dispatch (context=context@entry=0x5565595de9f0) at ../glib/gmain.c:4172
#13 0x00007fd1056a0e48 in g_main_context_iterate (context=0x5565595de9f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4248
#14 0x00007fd1056a110f in g_main_loop_run (loop=0x55655b770f00) at ../glib/gmain.c:4448
#15 0x00007fd1048c28c5 in meta_context_run_main_loop (context=<optimized out>, error=error@entry=0x7ffc351d8160) at ../src/core/meta-context.c:465
#16 0x000055655892d904 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:582

Crashes here:

976 cogl_onscreen_glx_notify_swap_buffers (CoglOnscreen *onscreen,
977                                        GLXBufferSwapComplete *swap_event)
978 {
979   CoglOnscreenGlx *onscreen_glx = COGL_ONSCREEN_GLX (onscreen);
980   CoglFramebuffer *framebuffer = COGL_FRAMEBUFFER (onscreen);
981   CoglContext *context = cogl_framebuffer_get_context (framebuffer);
982   gboolean ust_is_monotonic;
983   CoglFrameInfo *info;
984
985   /* We only want to notify that the swap is complete when the
986      application calls cogl_context_dispatch so instead of immediately
987      notifying we'll set a flag to remember to notify later */
988   set_sync_pending (onscreen);
989
990   info = cogl_onscreen_peek_head_frame_info (onscreen);
991   info->flags |= COGL_FRAME_INFO_FLAG_VSYNC; // <====
992

because info is NULL:

>    0x00007fd104389e31 <+417>: mov %rbp,%rdi
>    0x00007fd104389e34 <+420>: call 0x7fd104352e80 <cogl_onscreen_peek_head_frame_info@plt>
>    0x00007fd104389e39 <+425>: mov 0x30(%r13),%rsi
> => 0x00007fd104389e3d <+429>: orl $0x8,0x70(%rax)

(gdb) info reg
rax            0x0                 0
rbx            0x7ffc351d7f00      140721199611648
rcx            0x5565596b0fa0      93893780246432
rdx            0x5565596b0fa0      93893780246432
rsi            0x20000c            2097164
rdi            0x55655988d040      93893782196288
rbp            0x55655988d120      0x55655988d120
rsp            0x7ffc351d7c60      0x7ffc351d7c60
r8             0x28                40
r9             0x50                80
r10            0x0                 0
r11            0x1                 1
r12            0xfffffffffffffff0  -16
r13            0x55655988d120      93893782196512
r14            0x55655982f130      93893781811504
r15            0x5565595f6800      93893779482624
rip            0x7fd104389e3d      0x7fd104389e3d <glx_event_filter_cb+429>
eflags         0x10246             [ PF ZF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

The caller of notify_swap_buffers is handling a GLX_BufferSwapComplete event:

201 glx_event_filter_cb (XEvent *xevent, void *data)
...
217 #ifdef GLX_INTEL_swap_event
218   glx_renderer = context->display->renderer->winsys;
219
220   if (xevent->type == (glx_renderer->glx_event_base + GLX_BufferSwapComplete))
221     {
222       GLXBufferSwapComplete *swap_event = (GLXBufferSwapComplete *) xevent;
223
224       notify_swap_buffers (context, swap_event);
(gdb)
225
226       /* remove SwapComplete events from the queue */
227       return COGL_FILTER_REMOVE;
228     }
229 #endif /* GLX_INTEL_swap_event */

(gdb) p *swap_event
$12 = {
  type = 96,
  serial = 5387,
  send_event = 0,
  display = 0x5565595f6800,
  drawable = 2097163,
  event_type = 33153,
  ust = 1796404299,
  msc = 107467,
  sbc = 1
}
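
The analysis above can be condensed into a small stand-alone model. This is a sketch only: `Onscreen`, `FrameInfo`, `push_frame` and the FIFO layout are hypothetical stand-ins, not Cogl's real types. The flag value 0x8 is taken from the `orl $0x8` instruction in the disassembly. It shows how peeking at an empty queue yields NULL, and how a guard in the notify path would turn the NULL dereference into an ignored event. Whether ignoring the event is the right behavior for mutter is not something this report settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Cogl types; names and layout are illustrative. */
#define FRAME_INFO_FLAG_VSYNC 0x8u /* value seen as `orl $0x8` in the disassembly */
#define MAX_PENDING 4

typedef struct {
    uint32_t flags;
} FrameInfo;

typedef struct {
    FrameInfo pending[MAX_PENDING]; /* FIFO of frames awaiting a swap-complete event */
    size_t head;
    size_t count;
} Onscreen;

/* Return the oldest pending frame, or NULL when nothing is queued. */
FrameInfo *peek_head_frame_info(Onscreen *onscreen)
{
    assert(onscreen != NULL);
    if (onscreen->count == 0)
        return NULL;
    return &onscreen->pending[onscreen->head];
}

/* Queue a frame, as a buffer swap request would. Returns false if the FIFO is full. */
bool push_frame(Onscreen *onscreen)
{
    assert(onscreen != NULL);
    if (onscreen->count == MAX_PENDING)
        return false;
    size_t tail = (onscreen->head + onscreen->count) % MAX_PENDING;
    onscreen->pending[tail].flags = 0;
    onscreen->count++;
    return true;
}

/* Handle a swap-complete event. Returns false and changes nothing when no
 * frame is pending, i.e. when the event is unexpected for this onscreen. */
bool notify_swap_buffers(Onscreen *onscreen)
{
    FrameInfo *info = peek_head_frame_info(onscreen);

    if (info == NULL)
        return false; /* the unguarded code writes through NULL at this point */
    info->flags |= FRAME_INFO_FLAG_VSYNC;
    return true;
}
```

Calling `notify_swap_buffers` on a zero-initialized `Onscreen` returns false instead of faulting; after one `push_frame` call it returns true and sets the flag on the head frame.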