Bugzilla – Bug 1077885
GPU hang (Intel Mobile 4 Series Integrated Graphics Controller)
Last modified: 2018-03-26 14:13:43 UTC
This is similar to "Bug 1050256 - GPU hang", but on a different GPU. The symptoms are the same, but since the GPU is different I was told to create a new report.

I have had this issue since upgrading my laptop from 42.2 to 42.3, using the offline or DVD upgrade method.

CPU:
  Model: 6.23.10 "Pentium(R) Dual-Core CPU T4300 @ 2.10GHz"

Video:
  Model: "Intel Mobile 4 Series Chipset Integrated Graphics Controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x2a42 "Mobile 4 Series Chipset Integrated Graphics Controller"
  SubVendor: pci 0x103c "Hewlett-Packard Company"
  SubDevice: pci 0x3069
  Revision: 0x07
  Driver: "i915"
  Driver Modules: "i915"

(hwinfo output will be attached)

Crash log:

<3.6> 2018-01-27 12:47:05 minas-tirith systemd 1 - - Started Postfix Mail Transport Agent.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808879] [drm] GPU HANG: ecode 4:0:0xfdefffff, in X [2154], reason: Hang on render ring, action: reset
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808883] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808884] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808884] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808885] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808885] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<0.5> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808914] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 12:47:26 minas-tirith kernel - - - [ 1137.820965] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 12:47:36 minas-tirith kernel - - - [ 1147.820140] drm/i915: Resetting chip after gpu hang

I mentioned this on the openSUSE mailing list, and Dave Plater suggested nomodeset. This works, but the video mode drops to something like 800*600, which is pretty bad. He also suggested reopening this Bugzilla.

At that moment I had kernel 4.4.104-39 and drm-kmp-default 4.9.33_k4.4.79_4-5.2. I updated to his version, drm-kmp-default-4.9.33_k4.4.104_39-7.24.x86_64.rpm; this is more stable, but in the end the X environment froze: the mouse moves, but nothing responds. I could still switch to a console with ctrl-alt-f1.

I see several entries like this in the log (with different PIDs); I don't know if they are related:

<3.6> 2018-01-27 19:58:34 minas-tirith console-kit-daemon 3128 - - (process:10750): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

I hibernated the machine and went back home. I restored it (not restarted) and I see this in the log:

<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - System resumed.
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - INFO: running /usr/lib/systemd/system-sleep/grub2.sleep for hibernate
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - INFO: Running grub-once-restore ..
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - 2018-01-27 21:16:36+01:00 - Thawing the system now...
<3.4> 2018-01-27 21:16:36 minas-tirith systemd-sh - - - Thawing the system now...
<3.6> 2018-01-27 21:16:37 minas-tirith systemd 1 - - Stopped Deferred execution scheduler.
<3.6> 2018-01-27 21:16:37 minas-tirith systemd 1 - - Started Deferred execution scheduler.
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - Laptop mode
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - enabled, not active [unchanged]
<3.6> 2018-01-27 21:16:37 minas-tirith systemd-sleep 10886 - - INFO: Done.
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - Laptop mode
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - enabled, not active [unchanged]
<3.6> 2018-01-27 21:16:37 minas-tirith systemd-sleep 10886 - - tput: No value for $TERM and no -T specified
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816731] [drm] GPU HANG: ecode 4:0:0xfdeffdfb, in X [2171], reason: Hang on render ring, action: reset
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816736] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816736] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816737] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816737] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816738] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<0.5> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816792] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 21:17:00 minas-tirith kernel - - - [13697.816112] drm/i915: Resetting chip after gpu hang

I will attach gpu.2.log, the messages log since the machine upgrade, and the output of hwinfo --cpu and --gfxcard.

My desktop is XFCE and I have 4 GiB of RAM.
Created attachment 757825 CER: Messages log
Created attachment 757826 CER: gpu log
Created attachment 757827 CER: hwinfo output
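For anyone triaging a messages log like the one attached to this report, the GPU HANG events can be pulled out with a short script. This is only a sketch: the regular expression is inferred from the two "[drm] GPU HANG" lines quoted in the original description, and the names HANG_RE and parse_gpu_hangs are made up for illustration; they are not part of any i915 or openSUSE tooling. Other kernel versions may word the message differently.

```python
import re
from typing import Dict, List

# Pattern inferred from the "[drm] GPU HANG" lines quoted in this report.
# Assumption: other kernels may format the message differently.
HANG_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+\[drm\] GPU HANG: "
    r"ecode (?P<ecode>[^,\s]+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hangs(log_text: str) -> List[Dict[str, str]]:
    """Return one dict per GPU HANG line found in log_text (hypothetical helper)."""
    return [match.groupdict() for match in HANG_RE.finditer(log_text)]


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = (
        "<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808879] "
        "[drm] GPU HANG: ecode 4:0:0xfdefffff, in X [2154], "
        "reason: Hang on render ring, action: reset\n"
        "<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816731] "
        "[drm] GPU HANG: ecode 4:0:0xfdeffdfb, in X [2171], "
        "reason: Hang on render ring, action: reset\n"
    )
    for hang in parse_gpu_hangs(sample):
        print(hang["uptime"], hang["ecode"], hang["process"], hang["pid"], hang["reason"])
```

Comparing the ecode and process fields across hangs makes it easier to tell whether repeated hangs are the same event or different ones; the crash dump in /sys/class/drm/card0/error is still what the developers ask for.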
On suggestion from Felix Miata I add inxi output:

minas-tirith:/home/cer/Bugzilla/Bug_1050256 - GPU hang # inxi -c0 -G
Graphics:  Card: Intel Mobile 4 Series Integrated Graphics Controller
           Display Server: X.org 1.18.3 drivers: intel (unloaded: modesetting,fbdev,vesa)
           tty size: 150x51 Advanced Data: N/A for root
minas-tirith:/home/cer/Bugzilla/Bug_1050256 - GPU hang #
On suggestion from Stefan Dirsch I have uninstalled drm-kmp-default; I will see what happens.
I see a needinfo from me, but I don't see the question. :-? Clearing.
Well, question is in your own comment #5. ;-)
Ah, ok :-)

So far, no crashes (I left the machine running all night while I slept), and the display artefacts have disappeared. I will now hibernate and restore the machine; this usually causes some stress.

[...]

Restored fine, it seems. I can try rebooting with reduced memory.

[...]

Ok, did so: booted with 1G, opened thunderbird and firefox, the machine was swapping about another gig, alt-tabbed, switched workspaces, and no artifacts, no crashes.

So this machine should run without drm-kmp-default always? Or a patch is needed?
(In reply to Carlos Robinson from comment #8)
> So this machine should run without drm-kmp-default always? Or a patch is
> needed?

Yes, that's probably best. In addition you can try KOTD to see if the issue has been fixed upstream meanwhile.
Well, I'll see if I can. Means also installing corresponding drm-kmp- too, I guess. I also have to try installing Leap 15.0 in a test partition and report. Thanks.
(In reply to Carlos Robinson from comment #10)
> Well, I'll see if I can. Means also installing corresponding drm-kmp- too, I
> guess.

Oh no. *Un*installing, please!

> I also have to try installing Leap 15.0 in a test partition and report.

That's also useful. Thanks!
I don't understand. The crash doesn't happen unless I install drm-kmp, so there will be no way to know when the kernel solves the issue.
? drm-kmp provides the DRM drivers from kernel 4.9. I would like to know whether newer kernels (4.14/4.15) fix the issue again. We know the DRM of kernel 4.4 still worked.
I have exactly the same problem after updating to 42.3.

Device Name: "Onboard IGD"
Model: "Intel Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller"
Vendor: pci 0x8086 "Intel Corporation"
Device: pci 0x0412 "Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller"
SubVendor: pci 0x1462 "Micro-Star International Co., Ltd. [MSI]"
SubDevice: pci 0x7817
Revision: 0x06
Driver: "i915"
Driver Modules: "drm"

CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz

uname -r
4.4.104-39-default

The only way to overcome this is to zypper addlock drm-kmp-default :(
Since the old Intel chips don't work any better with the 4.9.x kernel than with 4.4.x, let's apply the limited Supplements to drm-kmp on Leap 42.3, as we do for SLE12-SP3. This won't uninstall the package automatically, but it helps a bit: at least you can remove the zypper lock.
SUSE-SU-2018:0509-1: An update that solves one vulnerability and has 8 fixes is now available.

Category: security (moderate)
Bug References: 1041744,1046821,1047277,1047729,1048155,1050256,1055493,1066175,1077885
CVE References: CVE-2017-10810
Sources used:
SUSE Linux Enterprise Workstation Extension 12-SP3 (src): drm-4.9.33-4.11.1
SUSE Linux Enterprise Desktop 12-SP3 (src): drm-4.9.33-4.11.1
Tested with openSUSE-Leap-15.0-DVD-x86_64-Build153.1-Media.iso

Minas-Anor:~ # rpm -q drm-kmp-default
package drm-kmp-default is not installed

and the machine seems to work perfectly.

On Leap 42.3, however, the package was automatically reinstalled by YaST online update. I noticed the artifacts and found the package installed. I had to taboo it.
This is an autogenerated message for OBS integration: This bug (1077885) was mentioned in https://build.opensuse.org/request/show/588685 42.3 / drm
This is an autogenerated message for OBS integration: This bug (1077885) was mentioned in https://build.opensuse.org/request/show/589148 42.3 / drm
openSUSE-RU-2018:0782-1: An update that has 6 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1041744,1047277,1047729,1055493,1066175,1077885
CVE References:
Sources used:
openSUSE Leap 42.3 (src): drm-4.9.33-10.2
The updated drm-kmp-default package for Leap 42.3 will no longer be (re-)installed automatically for older Intel GPUs. The hardware Supplements in the package have been adjusted. So let's close this as fixed.