Bug 813889

Summary: BUG: scheduling while atomic:
Product: [openSUSE] openSUSE 12.3
Reporter: Dion Kant <g.w.kant>
Component: Kernel
Assignee: Neil Brown <nfbrown>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P0 - Crit Sit
CC: nfbrown
Version: Final
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 12.3
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Dion Kant 2013-04-06 22:27:28 UTC
User-Agent:       Mozilla/5.0 (X11; Linux i686 on x86_64; rv:20.0) Gecko/20100101 Firefox/20.0


I have seen this on two different machines. It starts shortly after booting. This is with a Xen kernel (3.7.10-1.1-xen).

2013-04-06T19:45:55.217991+02:00 dom0-gcan systemd[1]: Started Load Kernel Modules.
2013-04-06T19:45:55.217994+02:00 dom0-gcan systemd[1]: Starting Apply Kernel Variables...
2013-04-06T19:45:55.217996+02:00 dom0-gcan kernel: [    5.564002] Call Trace:
2013-04-06T19:45:55.218000+02:00 dom0-gcan systemd[1]: Mounted FUSE Control File System.
2013-04-06T19:45:55.218003+02:00 dom0-gcan kernel: [    5.564007]  [<ffffffff81004818>] dump_trace+0x88/0x300
2013-04-06T19:45:55.218004+02:00 dom0-gcan kernel: [    5.564011]  [<ffffffff8158b033>] dump_stack+0x69/0x6f
2013-04-06T19:45:55.218004+02:00 dom0-gcan systemd[1]: Mounted Configuration File System.
2013-04-06T19:45:55.218008+02:00 dom0-gcan kernel: [    5.564015]  [<ffffffff8158cf34>] __schedule_bug+0x48/0x54
2013-04-06T19:45:55.218010+02:00 dom0-gcan kernel: [    5.564019]  [<ffffffff81596814>] thread_return+0x450/0x45c
2013-04-06T19:45:55.218010+02:00 dom0-gcan kernel: [    5.564023]  [<ffffffff8100be70>] cpu_idle+0xd0/0xe0
2013-04-06T19:45:55.218011+02:00 dom0-gcan kernel: [    5.564027]  [<ffffffff81ac8bc8>] start_kernel+0x3b8/0x3c3
2013-04-06T19:45:55.218012+02:00 dom0-gcan kernel: [    5.564030]  [<ffffffff81ac8436>] x86_64_start_kernel+0x105/0x114
2013-04-06T19:45:55.218013+02:00 dom0-gcan kernel: [    5.565075] BUG: scheduling while atomic: swapper/0/0/0x00000002

However, on a different machine with a desktop kernel, I have seen "BUG: scheduling while atomic: ext4lazyinit".

I have a feeling that this may be related to systemd. Since 12.3 I can no longer disable systemd, which is what I did before (12.2 and earlier) when I ran into instabilities related to systemd. However, I may be wrong.


Reproducible: Sometimes

Steps to Reproduce:
1. Reboot the machine
2. Check syslog for the BUG
Actual Results:  
I have one type of SuperMicro hardware (multiple similar machines) on which the installer dies early in the final installation step (formatting disks and installing RPMs).

The SuperMicro machine running the Xen kernel has survived so far, but with bursts of heavy logging.

Depending on the machine

Expected Results:  
A stable system.
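
For step 2 of the steps to reproduce, a minimal sketch of how the syslog could be scanned for the BUG line. The function name find_atomic_bugs and the default log path are assumptions for illustration, not part of the original report. Reading the three trailing fields as task name, PID and preempt count is based on the shape of the line logged above.

```python
import re
import sys

# Matches the report line seen in the log above, e.g.
# "BUG: scheduling while atomic: swapper/0/0/0x00000002".
# The task name may itself contain "/", so it is matched lazily and the
# last two "/"-separated fields are taken as PID and preempt count.
BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<task>\S+?)/(?P<pid>\d+)/(?P<preempt>0x[0-9a-fA-F]+)"
)


def find_atomic_bugs(lines):
    """Return (task, pid, preempt_count) for every matching log line."""
    hits = []
    for line in lines:
        match = BUG_RE.search(line)
        if match:
            hits.append(
                (
                    match.group("task"),
                    int(match.group("pid")),
                    int(match.group("preempt"), 16),
                )
            )
    return hits


if __name__ == "__main__":
    # Assumed default path; pass the real syslog file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for task, pid, preempt in find_atomic_bugs(log):
            print(f"task={task} pid={pid} preempt_count={preempt:#x}")
```

Because systemd and kernel messages are interleaved in the log, matching only on the BUG text avoids depending on the order of the surrounding call-trace lines.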
Comment 1 Dion Kant 2013-04-09 19:14:58 UTC
I isolated the problem a bit further and I now suspect this is related to Linux software RAID. This could be a duplicate of https://bugzilla.novell.com/show_bug.cgi?id=812316

I'll report more details in that bug report.

*** This bug has been marked as a duplicate of bug 812316 ***
Comment 2 Neil Brown 2013-04-09 23:32:04 UTC
Based on the info pasted into bug 812316, I think this is fixed by upstream kernel commit c8dc9c654794a765ca61baed07f84ed8aaa7ca

I'll arrange to get it into a future update.
Comment 3 Dion Kant 2013-04-11 16:15:03 UTC
I applied both ee0b0244030434cdda26777bfb98962447e080cd and c8dc9c654794a765ca61baed07f84ed8aaa7ca8c on top of 3.7.10-1.1, and with these it looks stable in initial testing.
Comment 4 Neil Brown 2013-04-29 05:10:26 UTC
Thanks for the confirmation.
I've added those patches to our 12.3 kernel tree so they'll appear in future maintenance releases.
Comment 5 Swamp Workflow Management 2013-06-10 10:16:21 UTC
openSUSE-SU-2013:0951-1: An update that solves two vulnerabilities and has 6 fixes is now available.

Category: security (critical)
Bug References: 803931,813889,815745,818327,818497,819519,819789,820048
CVE References: CVE-2013-0290,CVE-2013-2094
Sources used:
openSUSE 12.3 (src):    kernel-docs-3.7.10-1.11.1, kernel-source-3.7.10-1.11.1, kernel-syms-3.7.10-1.11.1
Comment 6 Dion Kant 2013-08-25 13:15:10 UTC
Neil,

I think I ran into the same issue as described by commit 5026d7a9b2f3eb1f9bda66c18ac6bc3036ec9020 from H. Peter Anvin:

md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place

I think it is related to this bug, and therefore this bug is not resolved.
Will this issue be taken care of in a (future) maintenance update?

With the current 12.3 kernel I am getting a panic on a system with a SAS controller with SATA drives.
Comment 7 Dion Kant 2013-08-25 13:57:39 UTC
*** Bug 836572 has been marked as a duplicate of this bug. ***
Comment 8 Neil Brown 2013-08-25 23:27:55 UTC
Thanks for reporting that.
I've added that patch to our 12.3 kernel tree so it will be in the next update (and in the kotd in a day or so).
Comment 9 Swamp Workflow Management 2013-12-30 20:09:17 UTC
openSUSE-SU-2013:1971-1: An update that solves 34 vulnerabilities and has 19 fixes is now available.

Category: security (moderate)
Bug References: 799516,801341,802347,804198,807153,807188,807471,808827,809906,810144,810473,811882,812116,813733,813889,814211,814336,814510,815256,815320,816668,816708,817651,818053,818561,821612,821735,822575,822579,823267,823342,823517,823633,823797,824171,824295,826102,826350,826374,827749,827750,828119,828191,828714,829539,831058,831956,832615,833321,833585,834647,837258,838346
CVE References: CVE-2013-0914,CVE-2013-1059,CVE-2013-1819,CVE-2013-1929,CVE-2013-1979,CVE-2013-2141,CVE-2013-2148,CVE-2013-2164,CVE-2013-2206,CVE-2013-2232,CVE-2013-2234,CVE-2013-2237,CVE-2013-2546,CVE-2013-2547,CVE-2013-2548,CVE-2013-2634,CVE-2013-2635,CVE-2013-2851,CVE-2013-2852,CVE-2013-3222,CVE-2013-3223,CVE-2013-3224,CVE-2013-3226,CVE-2013-3227,CVE-2013-3228,CVE-2013-3229,CVE-2013-3230,CVE-2013-3231,CVE-2013-3232,CVE-2013-3233,CVE-2013-3234,CVE-2013-3235,CVE-2013-3301,CVE-2013-4162
Sources used:
openSUSE 12.3 (src):    kernel-docs-3.7.10-1.24.1, kernel-source-3.7.10-1.24.1, kernel-syms-3.7.10-1.24.1