Bug 1201681 - kernel oops with 5.3.18-150300.59.81.1
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Kernel
Version: Leap 15.3
Hardware: x86-64 Linux
Importance: P5 - None : Major with 5 votes
Target Milestone: ---
Assignee: Borislav Petkov
QA Contact: E-mail List
Reported: 2022-07-20 08:35 UTC by Robert Simai
Modified: 2022-08-01 11:31 UTC

Description Robert Simai 2022-07-20 08:35:32 UTC
Installed the most recent 

kernel-default-optional-5.3.18-150300.59.81.1.x86_64
kernel-default-extra-5.3.18-150300.59.81.1.x86_64
kernel-default-5.3.18-150300.59.81.1.x86_64

today and rebooted, which was followed by a kernel oops during the graphical login. The oops is reproducible, and although the system may partially work afterwards, it reports soft lockups on one CPU until it hangs completely.

The system works fine with kernels up to and including 5.3.18-150300.59.76-default, so I reverted to that version.

Here's the oops; please let me know if you need anything else:

Jul 20 10:05:12 localhost kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Jul 20 10:05:12 localhost kernel: #PF: supervisor instruction fetch in kernel mode
Jul 20 10:05:12 localhost kernel: #PF: error_code(0x0010) - not-present page
Jul 20 10:05:12 localhost kernel: PGD 0 P4D 0 
Jul 20 10:05:12 localhost kernel: Oops: 0010 [#1] SMP PTI
Jul 20 10:05:12 localhost kernel: CPU: 3 PID: 2448 Comm: plasmashell Tainted: G           OE  X  N 5.3.18-150300.59.81-default #1 SLE15-SP3
Jul 20 10:05:12 localhost kernel: Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.3.5 08/06/2017
Jul 20 10:05:12 localhost kernel: RIP: 0010:0x0
Jul 20 10:05:12 localhost kernel: Code: Bad RIP value.
Jul 20 10:05:12 localhost kernel: RSP: 0018:fffffe00000aeee0 EFLAGS: 00010046
Jul 20 10:05:12 localhost kernel: RAX: 0000000000000001 RBX: ffffa09cd5ac0000 RCX: 0000000000000048
Jul 20 10:05:12 localhost kernel: RDX: 0000000000000000 RSI: ffffffff8b2018af RDI: 0000000000000000
Jul 20 10:05:12 localhost kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jul 20 10:05:12 localhost kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jul 20 10:05:12 localhost kernel: R13: 0000000000000000 R14: 0000000fa8388005 R15: 0000000000000001
Jul 20 10:05:12 localhost kernel: FS:  00007fd560875640(0000) GS:ffffa09cd5ac0000(0000) knlGS:0000000000000000
Jul 20 10:05:12 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 10:05:12 localhost kernel: CR2: ffffffffffffffd6 CR3: 0000000fa8388005 CR4: 00000000003706e0
Jul 20 10:05:12 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 20 10:05:12 localhost kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 20 10:05:12 localhost kernel: Call Trace:
Jul 20 10:05:12 localhost kernel:  <NMI>
Jul 20 10:05:12 localhost kernel:  ? end_repeat_nmi+0x7/0x6d
Jul 20 10:05:12 localhost kernel:  ? lookup_fast+0x7a/0x2b0
Jul 20 10:05:12 localhost kernel:  ? lookup_fast+0x7a/0x2b0
Jul 20 10:05:12 localhost kernel:  ? lookup_fast+0x7a/0x2b0
Jul 20 10:05:12 localhost kernel:  </NMI>
Jul 20 10:05:12 localhost kernel:  ? walk_component+0x48/0x300
Jul 20 10:05:12 localhost kernel:  ? link_path_walk.part.33+0x68/0x510
Jul 20 10:05:12 localhost kernel:  ? path_lookupat+0x6e/0x210
Jul 20 10:05:12 localhost kernel:  ? filename_lookup+0xb6/0x190
Jul 20 10:05:12 localhost kernel:  ? filename_lookup+0xf2/0x190
Jul 20 10:05:12 localhost kernel:  ? vfs_statx+0x73/0xe0
Jul 20 10:05:12 localhost kernel:  ? vfs_statx+0x73/0xe0
Jul 20 10:05:12 localhost kernel:  ? __do_sys_newstat+0x39/0x70
Jul 20 10:05:12 localhost kernel:  ? do_syscall_64+0x5b/0x1e0
Jul 20 10:05:12 localhost kernel:  ? entry_SYSCALL_64_after_hwframe+0x67/0xcc
Jul 20 10:05:12 localhost kernel: Modules linked in: xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv>
Jul 20 10:05:12 localhost kernel:  snd_hda_core snd_hwdep nls_iso8859_1 snd_soc_core nls_cp437 vfat fat dell_wmi aesni_intel snd_compress dell_smbios crypto_simd snd_pcm_dmaengine cryptd dcdbas(X) glue_helper snd_pcm sparse_keymap wmi_bmof >
Jul 20 10:05:12 localhost kernel: Supported: No, Unsupported modules are loaded
Jul 20 10:05:12 localhost kernel: CR2: 0000000000000000
Jul 20 10:05:12 localhost kernel: ---[ end trace 2224b8a28ae2b3e6 ]---
Jul 20 10:05:12 localhost kernel: RIP: 0010:0x0
Jul 20 10:05:12 localhost kernel: Code: Bad RIP value.
Jul 20 10:05:12 localhost kernel: RSP: 0018:fffffe00000aeee0 EFLAGS: 00010046
Jul 20 10:05:12 localhost kernel: RAX: 0000000000000001 RBX: ffffa09cd5ac0000 RCX: 0000000000000048
Jul 20 10:05:12 localhost kernel: RDX: 0000000000000000 RSI: ffffffff8b2018af RDI: 0000000000000000
Jul 20 10:05:12 localhost kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jul 20 10:05:12 localhost kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jul 20 10:05:12 localhost kernel: R13: 0000000000000000 R14: 0000000fa8388005 R15: 0000000000000001
Jul 20 10:05:12 localhost kernel: FS:  00007fd560875640(0000) GS:ffffa09cd5ac0000(0000) knlGS:0000000000000000
Jul 20 10:05:12 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 10:05:12 localhost kernel: CR2: ffffffffffffffd6 CR3: 0000000fa8388005 CR4: 00000000003706e0
Jul 20 10:05:12 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 20 10:05:12 localhost kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
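
For readers triaging similar reports: a "supervisor instruction fetch" page fault with RIP at 0x0 means the CPU tried to execute code at address 0, which typically points to a call or jump through a NULL function pointer. The following is a minimal Python sketch that pulls those fields out of journal lines like the ones above. The function name `summarize_oops` and the regular expressions are assumptions made for illustration, not part of any kernel tooling, and the sample is a shortened copy of the log above.

```python
import re

# Shortened copy of the oops above; only the lines the sketch looks at.
SAMPLE = """\
Jul 20 10:05:12 localhost kernel: #PF: supervisor instruction fetch in kernel mode
Jul 20 10:05:12 localhost kernel: RIP: 0010:0x0
Jul 20 10:05:12 localhost kernel:  <NMI>
Jul 20 10:05:12 localhost kernel:  ? end_repeat_nmi+0x7/0x6d
Jul 20 10:05:12 localhost kernel:  ? lookup_fast+0x7a/0x2b0
Jul 20 10:05:12 localhost kernel:  </NMI>
Jul 20 10:05:12 localhost kernel:  ? walk_component+0x48/0x300
"""

# Strips the journal prefix ("<date> <host> kernel: ") from a line.
PREFIX = re.compile(r"^.*? kernel: ")
# Matches a call-trace entry such as " ? lookup_fast+0x7a/0x2b0".
FRAME = re.compile(r"^\s*\??\s*([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Matches "RIP: 0010:0x0" and captures the part after the code segment.
RIP = re.compile(r"^RIP: [0-9a-f]{4}:(\S+)")


def summarize_oops(text):
    """Hypothetical helper: extract RIP, fault type and trace symbols."""
    rip = None
    fetch_fault = False
    frames = []
    for line in text.splitlines():
        msg = PREFIX.sub("", line, count=1)
        if "instruction fetch" in msg:
            fetch_fault = True
        m = RIP.match(msg)
        if m and rip is None:
            rip = m.group(1)  # keep the first RIP line only
        m = FRAME.match(msg)
        if m and m.group(1) not in frames:
            frames.append(m.group(1))  # de-duplicate repeated entries
    return {
        "rip": rip,
        "fetch_fault": fetch_fault,
        "frames": frames,
        # Instruction fetch from address 0: likely a NULL function pointer.
        "null_call": fetch_fault and rip == "0x0",
    }


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Note that every entry in this trace is prefixed with "?", which the kernel uses for addresses found on the stack that it could not verify as part of the actual call chain, so the symbol list is a hint rather than a reliable backtrace.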
Comment 1 Takashi Iwai 2022-07-20 09:03:24 UTC
See bug 1201664, bug 1201644 and bug 1201672.
Comment 2 Takashi Iwai 2022-07-20 09:04:41 UTC
Use the spectre_v2=off boot option as a temporary workaround.
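
To confirm that the option actually took effect after a reboot, something like the following Python sketch could be used. It reads `/proc/cmdline` and the kernel's vulnerability status files under `/sys/devices/system/cpu/vulnerabilities/`. The helper names `has_boot_option` and `read_text` are made up for this example, and the `retbleed` file is only present on kernels that report that status.

```python
from pathlib import Path


def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole word on the command line."""
    return option in cmdline.split()


def read_text(path):
    """Read a small text file, returning None if it does not exist."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    print("spectre_v2=off on command line:",
          has_boot_option(cmdline, "spectre_v2=off"))
    for name in ("spectre_v2", "retbleed"):
        status = read_text(f"/sys/devices/system/cpu/vulnerabilities/{name}")
        print(f"{name}:", status if status is not None else "(not reported by this kernel)")
```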
Comment 3 Borislav Petkov 2022-07-20 17:25:08 UTC
Ok, new kernel here:

https://download.opensuse.org/repositories/home:/bpetkov:/15sp3/pool/

Pls test.

Thx.
Comment 4 Felix Miata 2022-07-21 04:49:26 UTC
(In reply to Takashi Iwai from comment #1)
> See bug 1201664, bug 1201644 and bug 1201672.

No can do: "You are not authorized to access bug #1201672."
Comment 5 Robert Simai 2022-07-21 13:15:24 UTC
@Borislav, grabbed 

kernel-default-5.3.18-150300.1.1.g7226005.x86_64.rpm
kernel-default-extra-5.3.18-150300.1.1.g7226005.x86_64.rpm
kernel-default-optional-5.3.18-150300.1.1.g7226005.x86_64.rpm

from your repo and booted

localhost:~ # uname -a
Linux localhost 5.3.18-150300.1.g7226005-default #1 SMP Wed Jul 20 14:39:51 UTC 2022 (7226005) x86_64 x86_64 x86_64 GNU/Linux

and finally can't see the issue anymore. No oops, no lockups so far.

Is there any reason why you've chosen a lower package version? Is that only for testing?
Comment 6 Takashi Iwai 2022-07-21 13:35:06 UTC
The package version (actually the build counter) is reset for each new OBS project.  It's a long-standing issue of OBS kernel packaging (for non-official builds).
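
To illustrate why the test build sorts lower than the official one, here is a deliberately simplified Python sketch that compares only the numeric segments of the two version strings from this report. It is not RPM's real comparison algorithm, and `numeric_segments` is a name invented for this example.

```python
import re


def numeric_segments(version):
    """Split on '.' and '-' and keep the purely numeric parts.

    Non-numeric parts such as the git tag "g7226005" are skipped.
    This is a simplification, not RPM's actual version comparison.
    """
    return [int(part) for part in re.split(r"[.-]", version) if part.isdigit()]


official = "5.3.18-150300.59.81.1"        # official update from the description
test_build = "5.3.18-150300.1.1.g7226005"  # test build from comment 5

if __name__ == "__main__":
    print(numeric_segments(official))
    print(numeric_segments(test_build))
    # The lists first differ at the segment after 150300 (59 vs. 1),
    # so the test build compares as lower.
    print(numeric_segments(test_build) < numeric_segments(official))
```

The first differing segment is the one after 150300, which is where the reset counter shows up: 1 in the test build against 59 in the official package.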
Comment 7 Borislav Petkov 2022-07-21 14:29:34 UTC
(In reply to Robert Simai from comment #5)
> localhost:~ # uname -a
> Linux localhost 5.3.18-150300.1.g7226005-default #1 SMP Wed Jul 20 14:39:51
> UTC 2022 (7226005) x86_64 x86_64 x86_64 GNU/Linux
> 
> and finally can't see the issue anymore. No oops, no lockups so far.

Cool, thanks for testing. That looks good.

Lemme get this submitted properly now...
Comment 8 Borislav Petkov 2022-08-01 11:31:40 UTC
Ok, all retbleed fallout in the SP3 kernel should be fixed. All fixes are in the KOTD kernels and will soon be in the official submission.

Closing.