Bug 968259

Summary: kernel crash observed in OBS when building gcc5-testsuite
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Dominique Leuenberger <dimstar>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED DUPLICATE
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: mhocko, tiwai
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Dominique Leuenberger 2016-02-25 14:59:38 UTC
While building gcc5-testsuite in OBS, we regularly see this kernel crash:


[ 5017s] Running /home/abuild/rpmbuild/BUILD/gcc-5.3.1-r231346/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
[ 5017s] [ 5007.423905] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 5017s] [ 5007.424436] IP: [<          (null)>]           (null)
[ 5017s] [ 5007.424436] PGD 0 
[ 5017s] [ 5007.424436] Oops: 0010 [#1] PREEMPT SMP 
[ 5017s] [ 5007.424436] Modules linked in: ata_generic ata_piix nls_iso8859_1 nls_cp437 vfat fat virtio_rng virtio_blk virtio_pci virtio_ring virtio nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack btrfs xor raid6_pq reiserfs squashfs fuse dm_snapshot dm_bufio dm_mod binfmt_misc loop sg
[ 5017s] [ 5007.424436] CPU: 0 PID: 30354 Comm: cc1 Not tainted 4.4.0-3-default #1
[ 5017s] [ 5007.424436] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
[ 5017s] [ 5007.424436] task: ffff88007fb00240 ti: ffff880235384000 task.ti: ffff880235384000
[ 5017s] [ 5007.424436] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[ 5017s] [ 5007.424436] RSP: 0018:ffff880235387d00  EFLAGS: 00010047
[ 5017s] [ 5007.424436] RAX: ffff88017ed299a0 RBX: ffff88023ffec6c0 RCX: 0000000000000000
[ 5017s] [ 5007.424436] RDX: ffff88017fcbc000 RSI: ffff88023ffec6c0 RDI: 0000000000000003
[ 5017s] [ 5007.424436] RBP: ffff880235387d58 R08: 000000000001a288 R09: 0000000000000282
[ 5017s] [ 5007.424436] R10: 0000000000000000 R11: ffffffffffffffe2 R12: 000000000000000e
[ 5017s] [ 5007.424436] R13: ffffea0000cd0e40 R14: ffff8800bb2529a8 R15: 0000000000000001
[ 5017s] [ 5007.424436] FS:  00007f08a2c5a840(0000) GS:ffff88023fc00000(0000) knlGS:00000000f3970b40
[ 5017s] [ 5007.424436] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5017s] [ 5007.424436] CR2: 0000000000000000 CR3: 0000000001e0b000 CR4: 00000000000406f0
[ 5017s] [ 5007.424436] Stack:
[ 5017s] [ 5007.424436]  ffff88017ed299a0 ffff880200000003 0000000000000216 ffff8800bb253000
[ 5017s] [ 5007.424436]  ffffea0002513860 ffffea0005501b60 ffff8800bb253000 ffffea0008d24140
[ 5017s] [ 5007.424436]  00000000000001fe ffff8800bb252010 ffff8800bb253000 ffff880235387d90
[ 5017s] [ 5007.424436] Call Trace:
[ 5017s] [ 5007.424436] Inexact backtrace:
[ 5017s] [ 5007.424436] 
[ 5017s] [ 5007.424436]  [<ffffffff811c716d>] free_pages_and_swap_cache+0x7d/0x90
[ 5017s] [ 5007.424436]  [<ffffffff811b06a6>] tlb_flush_mmu_free+0x36/0x60
[ 5017s] [ 5007.424436]  [<ffffffff811b184c>] tlb_finish_mmu+0x1c/0x50
[ 5017s] [ 5007.424436]  [<ffffffff811bc677>] exit_mmap+0xc7/0x150
[ 5017s] [ 5007.424436]  [<ffffffff8107958d>] mmput+0x4d/0x120
[ 5017s] [ 5007.424436]  [<ffffffff8107f0c7>] do_exit+0x267/0xac0
[ 5017s] [ 5007.424436]  [<ffffffff8112358b>] ? __audit_syscall_entry+0xab/0xf0
[ 5017s] [ 5007.424436]  [<ffffffff8100319b>] ? do_audit_syscall_entry+0x4b/0x70
[ 5017s] [ 5007.424436]  [<ffffffff8107f9a3>] do_group_exit+0x43/0xb0
[ 5017s] [ 5007.424436]  [<ffffffff8107fa24>] SyS_exit_group+0x14/0x20
[ 5017s] [ 5007.424436]  [<ffffffff816a9476>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 5017s] [ 5007.424436] Code:  Bad RIP value.
[ 5017s] [ 5007.424436] RIP  [<          (null)>]           (null)
[ 5017s] [ 5007.424436]  RSP <ffff880235387d00>
[ 5017s] [ 5007.424436] CR2: 0000000000000000
[ 5017s] [ 5007.424436] ---[ end trace 0a90586faf4ca841 ]---
[ 5017s] [ 5007.424436] Fixing recursive fault but reboot is needed!
[ 5017s] [ 5007.424436] BUG: scheduling while atomic: cc1/30354/0x00000002
[ 5017s] [ 5007.424436] Modules linked in: ata_generic ata_piix nls_iso8859_1 nls_cp437 vfat fat virtio_rng virtio_blk virtio_pci virtio_ring virtio nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack btrfs xor raid6_pq reiserfs squashfs fuse dm_snapshot dm_bufio dm_mod binfmt_misc loop sg
[ 5017s] [ 5007.424436] CPU: 0 PID: 30354 Comm: cc1 Tainted: G      D         4.4.0-3-default #1
[ 5017s] [ 5007.424436] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
[ 5017s] [ 5007.424436]  0000000000000000 ffff8802353879f0 ffffffff8137ea39 ffff88023fc16e80
[ 5017s] [ 5007.424436]  ffff880235387a00 ffffffff8109f7db ffff880235387a50 ffffffff816a4d47
[ 5017s] [ 5007.424436]  ffffffff81183ba0 0000000000000000 ffff88007fb00240 ffff880235388000
[ 5017s] [ 5007.424436] Call Trace:
[ 5017s] [ 5007.424436]  [<ffffffff8101a095>] try_stack_unwind+0x175/0x190
[ 5017s] [ 5007.424436]  [<ffffffff81018fe9>] dump_trace+0x69/0x3a0
[ 5017s] [ 5007.424436]  [<ffffffff8101a0fb>] show_trace_log_lvl+0x4b/0x60
[ 5017s] [ 5007.424436]  [<ffffffff8101942c>] show_stack_log_lvl+0x10c/0x180
[ 5017s] [ 5007.424436]  [<ffffffff8101a195>] show_stack+0x25/0x50
[ 5017s] [ 5007.424436]  [<ffffffff8137ea39>] dump_stack+0x4b/0x72
[ 5017s] [ 5007.424436]  [<ffffffff8109f7db>] __schedule_bug+0x4b/0x60
[ 5017s] [ 5007.424436]  [<ffffffff816a4d47>] thread_return+0x3d6/0x73f
[ 5017s] [ 5007.424436]  [<ffffffff816a50ec>] schedule+0x3c/0x90
[ 5017s] [ 5007.424436]  [<ffffffff8107f723>] do_exit+0x8c3/0xac0
[ 5017s] [ 5007.424436]  [<ffffffff81019b01>] oops_end+0xa1/0xd0
[ 5017s] [ 5007.424436]  [<ffffffff81065e3b>] no_context+0x10b/0x360
[ 5017s] [ 5007.424436]  [<ffffffff81066110>] __bad_area_nosemaphore+0x80/0x1f0
[ 5017s] [ 5007.424436]  [<ffffffff81066293>] bad_area_nosemaphore+0x13/0x20
[ 5017s] [ 5007.424436]  [<ffffffff81066559>] __do_page_fault+0xb9/0x410
[ 5017s] [ 5007.424436]  [<ffffffff81066917>] trace_do_page_fault+0x37/0xc0
[ 5017s] [ 5007.424436]  [<ffffffff8105f1d9>] do_async_page_fault+0x19/0x70
[ 5017s] [ 5007.424436]  [<ffffffff816ab748>] async_page_fault+0x28/0x30
[ 5017s] [ 5007.424436] DWARF2 unwinder stuck at async_page_fault+0x28/0x30
[ 5017s] [ 5007.424436] 
[ 5017s] [ 5007.424436] Leftover inexact backtrace:
[ 5017s] [ 5007.424436] 
[ 5017s] [ 5007.424436]  [<ffffffff811c716d>] free_pages_and_swap_cache+0x7d/0x90
[ 5017s] [ 5007.424436]  [<ffffffff811b06a6>] tlb_flush_mmu_free+0x36/0x60
[ 5017s] [ 5007.424436]  [<ffffffff811b184c>] tlb_finish_mmu+0x1c/0x50
[ 5017s] [ 5007.424436]  [<ffffffff811bc677>] exit_mmap+0xc7/0x150
[ 5017s] [ 5007.424436]  [<ffffffff8107958d>] mmput+0x4d/0x120
[ 5017s] [ 5007.424436]  [<ffffffff8107f0c7>] do_exit+0x267/0xac0
[ 5017s] [ 5007.424436]  [<ffffffff8112358b>] ? __audit_syscall_entry+0xab/0xf0
[ 5017s] [ 5007.424436]  [<ffffffff8100319b>] ? do_audit_syscall_entry+0x4b/0x70
[ 5017s] [ 5007.424436]  [<ffffffff8107f9a3>] do_group_exit+0x43/0xb0
[ 5017s] [ 5007.424436]  [<ffffffff8107fa24>] SyS_exit_group+0x14/0x20
[ 5017s] [ 5007.424436]  [<ffffffff816a9476>] entry_SYSCALL_64_fastpath+0x16/0x75
[10458s] qemu: terminating on signal 15 from pid 1231
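
For comparing this trace against other reports, the backtrace frames can be pulled out of the build log mechanically. The following is only a minimal sketch: the helper name `parse_frames` and the regular expression are illustrative assumptions, not part of OBS or any kernel tooling, and the pattern is written only against the frame format seen in the log above.

```python
import re

# Matches one backtrace frame such as
#   "[ 5017s] [ 5007.424436]  [<ffffffff8107958d>] mmput+0x4d/0x120"
# A "?" before the symbol marks a frame the unwinder considers unreliable.
# NOTE: this pattern is an assumption derived from the log in this report.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (symbol, offset, reliable) tuples for each frame in log_text.

    Hypothetical helper: lines without a symbol+offset frame (register
    dumps, the "(null)" RIP line, and so on) are skipped.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((
                match.group("symbol"),
                int(match.group("offset"), 16),
                match.group("unreliable") is None,
            ))
    return frames


# A few lines copied from the log above.
sample = """\
[ 5017s] [ 5007.424436] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[ 5017s] [ 5007.424436]  [<ffffffff811c716d>] free_pages_and_swap_cache+0x7d/0x90
[ 5017s] [ 5007.424436]  [<ffffffff8107f0c7>] do_exit+0x267/0xac0
[ 5017s] [ 5007.424436]  [<ffffffff8112358b>] ? __audit_syscall_entry+0xab/0xf0
"""

for symbol, offset, reliable in parse_frames(sample):
    marker = "" if reliable else " (unreliable)"
    print(f"{symbol}+{offset:#x}{marker}")
```

Filtering on the `reliable` flag leaves the exit path (`SyS_exit_group` down to `free_pages_and_swap_cache`) that preceded the jump to the NULL address.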
Comment 1 Michal Hocko 2016-02-25 15:29:55 UTC
Could you try to capture the kernel dump?
Comment 2 Dominique Leuenberger 2016-02-25 15:33:10 UTC
(In reply to Michal Hocko from comment #1)
> Could you try to capture the kernel dump?

I can try to simulate it in a qemu-kvm setup equal to the one used by OBS. No guarantees though :)
Comment 3 Takashi Iwai 2016-02-25 16:06:15 UTC
Already reported in bug 967412. Let's track it there.

*** This bug has been marked as a duplicate of bug 967412 ***