Bugzilla – Full Text Bug Listing
| Summary: | BUG: unable to handle kernel NULL pointer dereference at 0000000000000060 in intel_fb_obj_invalidate+0x1c/0xf0 [i915] | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Forgotten User l03xIL5qZl <forgotten_l03xIL5qZl> |
| Component: | Kernel | Assignee: | E-mail List <kernel-maintainers> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | forgotten_l03xIL5qZl, forgotten_NXEif20qEv, jslaby, tiwai |
| Version: | Current | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | First (truncated) output of computer lock; Second (truncated) output of computer lock | | |
Description
Forgotten User l03xIL5qZl
2016-01-20 20:24:18 UTC
Hmm, there is no change in fs/proc/consoles.c itself, so the problem, if any, must be in the layer below that, e.g. console_lock() deadlocks, etc. Could you try to dump all tasks with Alt-SysRq-T?

Although I had the issue four times in 3 days, I have not encountered the same symptoms since I created the bug entry. However, if I face this issue one more time, I doubt I'll be able to dump that information: the screen is blank. And as far as I can remember, Magic SysRq only prints information on the console, not in remote SSH sessions.

It'll be logged in the journal, if it's still alive.

Created attachment 663626
First (truncated) output of computer lock

This is the first lockup I have had since this bug report.
Unfortunately, the ring buffer is not large enough, and this was not visible using journalctl -k.
Does journald redirect the kernel ring buffer?

Created attachment 663627
Second (truncated) output of computer lock
Cool. Could you install and probe a kernel with lockdep enabled: https://build.opensuse.org/project/monitor/home:jirislaby:stable-lockdep ?

Nevermind, the true reason is this:

WARNING: CPU: 0 PID: 1151 at ../include/linux/kref.h:46 drm_framebuffer_reference+0x64/0x70 [drm]()
Modules linked in: ... [last unloaded: vboxdrv]
CPU: 0 PID: 1151 Comm: X Tainted: G W O 4.4.0-1-default #1
Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.40 01/31/2013
ffffffffa01ee7e1 ffff88032ce7baa8 ffffffff8137f639 0000000000000000
ffff88032ce7bae0 ffffffff8107d132 ffff880036d3af40 ffff8800a19781c0
ffff8800a19781c0 ffff88030aee4400 ffff88032ef10000 ffff88032ce7baf0
Call Trace:
[<ffffffff8101a095>] try_stack_unwind+0x175/0x190
[<ffffffff81018fe9>] dump_trace+0x69/0x3a0
[<ffffffff8101a0fb>] show_trace_log_lvl+0x4b/0x60
[<ffffffff8101942c>] show_stack_log_lvl+0x10c/0x180
[<ffffffff8101a195>] show_stack+0x25/0x50
[<ffffffff8137f639>] dump_stack+0x4b/0x72
[<ffffffff8107d132>] warn_slowpath_common+0x82/0xc0
[<ffffffff8107d22a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa01c8294>] drm_framebuffer_reference+0x64/0x70 [drm]
[<ffffffffa01da1bd>] drm_atomic_set_fb_for_plane+0x2d/0x90 [drm]
[<ffffffffa026db6c>] __drm_atomic_helper_set_config+0xbc/0x3a0 [drm_kms_helper]
[<ffffffffa026fa9c>] drm_fb_helper_pan_display+0x18c/0x230 [drm_kms_helper]
[<ffffffffa037eb0a>] intel_fbdev_pan_display+0x1a/0x60 [i915]
[<ffffffff813f6a6f>] fb_pan_display+0xcf/0x160
[<ffffffff813f10b0>] bit_update_start+0x20/0x50
[<ffffffff813ee233>] fbcon_switch+0x3b3/0x600
[<ffffffff81481668>] redraw_screen+0x178/0x260
[<ffffffff81477f3f>] complete_change_console+0x3f/0xe0
[<ffffffff814786ce>] vt_ioctl+0x6ee/0x12c0
[<ffffffff8146bd61>] tty_ioctl+0x361/0xc30
[<ffffffff8120e328>] do_vfs_ioctl+0x288/0x470
[<ffffffff8120e589>] SyS_ioctl+0x79/0x90

BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffffa0375cfc>] intel_fb_obj_invalidate+0x1c/0xf0 [i915]
PGD 32cb59067 PUD 32131c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 1151 Comm: X Tainted: G W O 4.4.0-1-default #1
Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.40 01/31/2013
task: ffff88032c71e240 ti: ffff88032ce78000 task.ti: ffff88032ce78000
RIP: 0010:[<ffffffffa0375cfc>] [<ffffffffa0375cfc>] intel_fb_obj_invalidate+0x1c/0xf0 [i915]
RSP: 0018:ffff88032ce7bac8 EFLAGS: 00010246
RAX: ffff88032c71e240 RBX: ffff8802fb63be00 RCX: ffff88012ae9dfc0
RDX: ffff880036d3af40 RSI: 0000000000000000 RDI: ffff8802fb63be00
RBP: ffff88032ce7baf0 R08: ffffea000c3eb7df R09: 0000000000000008
R10: ffff8800aae9db80 R11: ffff8800aae9d780 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000200001 R15: 0000000000000080
FS: 00007f1dfefbca00(0000) GS:ffff88033ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 000000032d524000 CR4: 00000000001406f0
Stack:
ffff88032f62e600 ffff88032fb35800 0000000000000000 0000000000200001
0000000000000080 ffff88032ce7bb10 ffffffffa037ebf3 0000000000000000
ffff88032ce7bc68 ffff88032ce7bc48 ffffffff813f6ed8 ffff88032fb35860
Call Trace:
[<ffffffffa037ebf3>] intel_fbdev_set_par+0x43/0x60 [i915]
[<ffffffff813f6ed8>] fb_set_var+0x238/0x460
[<ffffffff813ed749>] fbcon_blank+0x2e9/0x330
[<ffffffff81482ac3>] do_unblank_screen+0xc3/0x190
[<ffffffff81477f59>] complete_change_console+0x59/0xe0
[<ffffffff814786ce>] vt_ioctl+0x6ee/0x12c0
[<ffffffff8146bd61>] tty_ioctl+0x361/0xc30
[<ffffffff8120e328>] do_vfs_ioctl+0x288/0x470
[<ffffffff8120e589>] SyS_ioctl+0x79/0x90
Code: 41 5f 5d c3 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 41 89 f5 53 4c 8b 67 08 48 89 fb <41> 8b 44 24 60 4d 8b 74 24 28 83 f8 01 74 58 8b b3 5c 01 00 00
RIP [<ffffffffa0375cfc>] intel_fb_obj_invalidate+0x1c/0xf0 [i915]
RSP <ffff88032ce7bac8>
CR2: 0000000000000060

I saw that in the log, but I was hoping that VirtualBox was not involved.
Hmmm, I think I will then try to avoid loading it at startup, and only run it when required…

Which should be this: https://bugs.freedesktop.org/show_bug.cgi?id=93822

(In reply to Adrien Clerc from comment #8)
> I saw that in the log, but I was hoping that VirtualBox was not involved.
> Hmmm, I think I will then try to avoid loading it at startup, and only run
> it when required…

It likely has nothing to do with vbox.

I pushed:
0c82312f3f15538f4e6ceda2a82caee8fbac4501
51f1385b90c1ad30896bd62b1ff97aa4edb1a163
ca40ba855c9e3f19f2715fd8a1ced5128359d3d9
to the stable branch.

But it remains to decide which kernel versions need that.

(In reply to Jiri Slaby from comment #11)
> I pushed:
> 0c82312f3f15538f4e6ceda2a82caee8fbac4501
> 51f1385b90c1ad30896bd62b1ff97aa4edb1a163
> ca40ba855c9e3f19f2715fd8a1ced5128359d3d9
> to the stable branch.

Thanks!

> But it remains to decide which kernel versions need that.

I remember this bug on my machine, and the symptom appears to have been seen only since the 4.4 kernel. Maybe the problem was already there, but the new code path triggers it. Let's see whether we have a similar report on Leap 4.1.x kernel.

Many thanks for the quick identification. So now, I'll wait eagerly for the fix to be pushed :)

Linux kernel 4.5 hit Tumbleweed some days ago. It seems my problem is gone now. I'll close this bug in a few days if no new comment appears.

A few days have passed. Closing as resolved.

openSUSE-SU-2016:1008-1: An update that solves 15 vulnerabilities and has 26 fixes is now available.
Category: security (important)
Bug References: 814440,884701,949936,951440,951542,951626,951638,953527,954018,954404,954405,954876,958439,958463,958504,959709,960561,960563,960710,961263,961500,961509,962257,962866,962977,963746,963765,963767,963931,965125,966137,966179,966259,966437,966684,966693,968018,969356,969582,970845,971125
CVE References: CVE-2015-1339,CVE-2015-7799,CVE-2015-7872,CVE-2015-7884,CVE-2015-8104,CVE-2015-8709,CVE-2015-8767,CVE-2015-8785,CVE-2015-8787,CVE-2015-8812,CVE-2016-0723,CVE-2016-2069,CVE-2016-2184,CVE-2016-2383,CVE-2016-2384
Sources used:
openSUSE Leap 42.1 (src): kernel-debug-4.1.20-11.1, kernel-default-4.1.20-11.1, kernel-docs-4.1.20-11.3, kernel-ec2-4.1.20-11.1, kernel-obs-build-4.1.20-11.2, kernel-obs-qa-4.1.20-11.1, kernel-obs-qa-xen-4.1.20-11.1, kernel-pae-4.1.20-11.1, kernel-pv-4.1.20-11.1, kernel-source-4.1.20-11.1, kernel-syms-4.1.20-11.1, kernel-vanilla-4.1.20-11.1, kernel-xen-4.1.20-11.1

On the latest Leap (4.1.21-14-xen), running the Xen kernel, we get deadlocks when switching consoles.

Locked processes (and the whole GUI):
/sbin/agetty
/sbin/showconsole

Kernel BUG dump:
Jun 08 18:20:10 linux-6956 kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 08 18:20:10 linux-6956 kernel: IP: [< (null)>] (null)
Jun 08 18:20:10 linux-6956 kernel: PGD f4160067 PUD f4195067 PMD 0
Jun 08 18:20:10 linux-6956 kernel: Oops: 0010 [#1] SMP
Jun 08 18:20:10 linux-6956 kernel: Modules linked in: bnep bluetooth rfkill fuse ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs ipmi_ssif xfs libcrc32c blktap blktap2 pciback 8250_fintek coretemp crct10dif_pclmul crc32_pclmul iTCO_wdt dm_mod iTCO_vendor_support usbbk battery aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper pcspkr lpc_ich mfd_core cryptd i2c_i801 joydev ipmi_si 8250 ipmi_msghandler serial_core xen_scsibk tpm_crb tpm_tis video tpm acpi_pad fan thermal igb ptp pps_core ie31200_edac mei_me button mei edac_core shpchp processor thermal_sys hwmon blkbk blkback_pagemap domctl netbk xenbus_be gntdev evtchn btrfs xor hid_generic usbhid raid6_pq raid1 md_mod ast syscopyarea sysfillrect crc32c_intel sysimgblt i2c_algo_bit
Jun 08 18:20:10 linux-6956 kernel: drm_kms_helper ttm xhci_pci ehci_pci ehci_hcd xhci_hcd drm i2c_core usbcore usb_common sg
Jun 08 18:20:10 linux-6956 kernel: CPU: 2 PID: 1693 Comm: Xorg Not tainted 4.1.21-14-xen #1
Jun 08 18:20:10 linux-6956 kernel: Hardware name: Silicon Mechanics Rackform iServ R135.v5.1/X10SLM+-LN4F, BIOS 3.0 04/24/2015
Jun 08 18:20:10 linux-6956 kernel: task: ffff8801d6d6e650 ti: ffff8801d39b8000 task.ti: ffff8801d39b8000
Jun 08 18:20:10 linux-6956 kernel: RIP: e030:[<0000000000000000>] [< (null)>] (null)
Jun 08 18:20:10 linux-6956 kernel: RSP: e02b:ffff8801d39bb640 EFLAGS: 00010206
Jun 08 18:20:10 linux-6956 kernel: RAX: 4000000001000000 RBX: ffff8801e9593940 RCX: 0000000000000002
Jun 08 18:20:10 linux-6956 kernel: RDX: 4000000001000000 RSI: 0000000000000000 RDI: ffff8801e9593940
Jun 08 18:20:10 linux-6956 kernel: RBP: ffff8801e58d1000 R08: 00000000000f6fff R09: 000000000000003c
Jun 08 18:20:10 linux-6956 kernel: R10: 0000000000007ff0 R11: ffffffffa008e1b9 R12: ffff8801e9593940
Jun 08 18:20:10 linux-6956 kernel: R13: 0000000000000000 R14: 00000000000007e9 R15: 0000000000000000
Jun 08 18:20:10 linux-6956 kernel: FS: 00007f05657148c0(0000) GS:ffff8801e4680000(0000) knlGS:ffff8801e4680000
Jun 08 18:20:10 linux-6956 kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 08 18:20:10 linux-6956 kernel: CR2: 0000000000000000 CR3: 00000000fb230000 CR4: 0000000000042660
Jun 08 18:20:10 linux-6956 kernel: Stack:
Jun 08 18:20:10 linux-6956 kernel: ffffffff8013c1ed 0000000000000001 ffffffff8004eeb9 00000000f6005000
Jun 08 18:20:10 linux-6956 kernel: ffffc90002800000 00000000f67ee000 ffff8801e9593940 ffff8801e58d1000
Jun 08 18:20:10 linux-6956 kernel: 0000000000000000 4000000001000000 00000000000007e9 0000000000000000
Jun 08 18:20:10 linux-6956 kernel: Call Trace:
Jun 08 18:20:10 linux-6956 kernel: Inexact backtrace:
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff8013c1ed>] ? free_pages_prepare+0x1dd/0x2d0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff8004eeb9>] ? iomem_map_sanity_check+0x89/0xd0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff8013d176>] ? free_hot_cold_page+0x26/0x1a0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa0091bd2>] ? ttm_put_pages+0x152/0x1c0 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa0089384>] ? ttm_mem_global_free_zone+0x24/0x80 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa0091c99>] ? ttm_pool_unpopulate+0x59/0x70 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008a1ed>] ? ttm_tt_destroy+0x5d/0x70 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008e8c1>] ? ttm_bo_move_memcpy+0x361/0x630 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008a265>] ? ttm_tt_init+0x65/0xb0 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff801718ad>] ? free_vmap_area_noflush+0x2d/0x60
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008bf86>] ? ttm_bo_handle_move_mem+0x256/0x5b0 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008ca11>] ? ttm_bo_mem_space+0x181/0x350 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa008d0c8>] ? ttm_bo_validate+0x1e8/0x200 [ttm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff800aa3a3>] ? hrtimer_try_to_cancel+0x43/0xf0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa020e4ca>] ? ast_bo_pin+0x7a/0xa0 [ast]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa020c077>] ? ast_crtc_do_set_base.isra.14.constprop.24+0xe7/0x390 [ast]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00eea17>] ? _object_find+0x67/0xb0 [drm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00b1896>] ? drm_crtc_helper_set_config+0x7d6/0xae0 [drm_kms_helper]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff8001377e>] ? __switch_to+0x22e/0x940
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00f06d8>] ? drm_mode_set_config_internal+0x68/0x100 [drm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00bc280>] ? restore_fbdev_mode+0xc0/0xe0 [drm_kms_helper]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00be130>] ? drm_fb_helper_restore_fbdev_mode_unlocked+0x20/0x60 [drm_kms_helper]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00be192>] ? drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff80386d2e>] ? fb_set_var+0x15e/0x3b0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff8037e10b>] ? fbcon_blank+0x1cb/0x2b0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff803fac05>] ? do_unblank_screen+0xa5/0x1c0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff803f0e03>] ? complete_change_console+0x53/0xe0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff803f1ddc>] ? vt_ioctl+0xf4c/0x10f0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffffa00e59ba>] ? drm_ioctl+0x17a/0x590 [drm]
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff803e4647>] ? tty_ioctl+0x207/0xd30
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff801644b5>] ? handle_mm_fault+0xdc5/0x1920
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff801a41df>] ? do_vfs_ioctl+0x2ff/0x520
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff80054327>] ? recalc_sigpending+0x17/0x50
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff80054cfd>] ? __set_task_blocked+0x2d/0x70
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff800e10cc>] ? __audit_syscall_entry+0xac/0xf0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff80022bfb>] ? syscall_trace_enter_phase1+0xfb/0x160
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff801a4481>] ? SyS_ioctl+0x81/0xa0
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff805f205d>] ? system_call_fastpath+0x16/0x76
Jun 08 18:20:10 linux-6956 kernel: [<ffffffff805f2010>] ? __entry_text_start+0x8/0x8
Jun 08 18:20:10 linux-6956 kernel: Code: Bad RIP value.
Jun 08 18:20:10 linux-6956 kernel: RIP [< (null)>] (null)
Jun 08 18:20:10 linux-6956 kernel: RSP <ffff8801d39bb640>
Jun 08 18:20:10 linux-6956 kernel: CR2: 0000000000000000
Jun 08 18:20:10 linux-6956 kernel: ---[ end trace 80cd0c2700e7d36c ]---
------------
linux-6956:~ # xl info
host : linux-6956
release : 4.1.21-14-xen
version : #1 SMP Sun Apr 17 07:27:45 UTC 2016 (fc187c1)
machine : x86_64
nr_cpus : 8
max_cpu_id : 7
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 2
cpu_mhz : 3400

It does not seem to happen on the non-xen kernel.

Does this warrant reopening the issue, or a new issue?

(In reply to Vanja Bucic from comment #18)
> It does not seem to happen on non-xen kernel.
>
> Does this warrant reopening the issue, or a new issue?

Different hardware (ast), a different code path, a different symptom. An absolutely different problem. Please open another bug report with more hardware details.
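
For anyone reading the first oops in this report (the i915 one in intel_fb_obj_invalidate, not the later ast/Xen trace), the fault address can be cross-checked against the register dump. By my reading of the Code: line, the bytes `4c 8b 67 08` decode to `mov 0x8(%rdi),%r12` and the marked bytes `<41> 8b 44 24 60` to `mov 0x60(%r12),%eax`, so a NULL pointer loaded into R12 would fault at exactly 0x60. The sketch below only checks that arithmetic; the helper names `parse_registers` and `explains_fault` are made up for illustration, and the register values are copied from the dump above.

```python
import re

# Register values copied from the i915 oops quoted in this report.
OOPS_REGISTERS = (
    "RAX: ffff88032c71e240 RBX: ffff8802fb63be00 RCX: ffff88012ae9dfc0 "
    "RDX: ffff880036d3af40 RSI: 0000000000000000 RDI: ffff8802fb63be00 "
    "R10: ffff8800aae9db80 R11: ffff8800aae9d780 R12: 0000000000000000 "
    "CR2: 0000000000000060"
)

# Displacement of the faulting load, taken from the "Code:" line
# ("<41> 8b 44 24 60" read as mov 0x60(%r12),%eax). This decoding is
# my own reading of the bytes, not something stated in the report.
FAULT_DISPLACEMENT = 0x60


def parse_registers(dump: str) -> dict[str, int]:
    """Hypothetical helper: map three-letter register names to their values."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{2}): ([0-9a-f]{16})\b", dump)
    return {name: int(value, 16) for name, value in pairs}


def explains_fault(regs: dict[str, int], base: str, displacement: int) -> bool:
    """Hypothetical helper: does base register + displacement equal CR2?"""
    return regs[base] + displacement == regs["CR2"]


regs = parse_registers(OOPS_REGISTERS)
print("R12 + 0x60 == CR2:", explains_fault(regs, "R12", FAULT_DISPLACEMENT))
print("RDI == RBX:", regs["RDI"] == regs["RBX"])
```

If that reading is right, the pointer fetched from offset 8 of the function's first argument (RDI, also saved in RBX) was NULL. That would fit the reference-count warning from drm_framebuffer_reference logged just before the oops, though the report itself does not spell this out.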