|
Bugzilla – Full Text Bug Listing |
| Summary: | Kernel crash - BUG: scheduling while atomic | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.2 | Reporter: | Daniele Tombolini <kailed> |
| Component: | Kernel | Assignee: | Jeff Mahoney <jeffm> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P2 - High | CC: | angie, erwinl, forgotten_-yQj4fdAjs, forgotten_N1m2whZ-xl, harbrink, jeffm, Joachim.Reichelt, jose.lpa, lavrinenko_alex, lbickley, lchiquitto, lsteeger, meissner, petr.m, revealed, sebastien.rohaut, valerio.bontempi |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | 32bit | ||
| OS: | openSUSE 11.2 | ||
| Whiteboard: | maint:released:11.2:30542 | ||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
dmesg output
dmesg with kotd (2.6.31.11-0.0.0.15.4478caa) |
||
|
Description
Daniele Tombolini
2010-01-04 19:52:47 UTC
Hello! I got this exact same issue. 11.2 and: 2.6.31.8-0.1-desktop i386 GNU/Linux. Thanks. Greetings, R

Same problem here with openSUSE 11.2 and the new desktop kernel 2.6.31.8-0.1-desktop, except that there is no crash. The system is running, but I don't know what the impact of this error is.

Ah sorry, mine is not crashing either, but I am receiving the same stack trace with the same [<xy>] .. letters and numbers on each line.

Ok, on my notebook I get a hard lockup when xorg starts, and the power button is the only way out. Laptop does not crash, but I did not test for too long, just a few minutes.. Enough to be critical.

Same for me, but no crash/hangup. Simply the messages on boot.

Same for me, but no crash/hangup, higher CPU load and high swap activity.

Well, there are two bugs.
1) the well known "atomic" -> it seems not so serious
2) #568307 - wireless issue (rt2860 driver) --> crash

Hi there, I am having the same trouble .. but since my search seemed so stupid and got no results, I created an extra bug report for this. Sorry for that. See bug #570316. During an automated installation the system will freeze after the main package installation, but run fine after manually shutting it down. Bye and thanks, Angie.

*** Bug 570316 has been marked as a duplicate of this bug. ***

The scheduling while atomic issue comes from my patch to override ACPI tables from the initramfs. I've disabled them in the repo until I can come up with a workaround.

*** Bug 568244 has been marked as a duplicate of this bug. ***

Same here, same kernel, but x86_64.

Yep, it will occur on any machine with ACPI and preemption enabled, which means both the i386 and x86_64 desktop kernels.

I've updated the 11.2 branch with an updated patch set for this. A kernel containing at least the following entry will be needed for testing:

-------------------------------------------------------------------
Thu Jan 14 19:50:46 CET 2010 - jeffm@suse.de

- patches.suse/add-initramfs-file_read_write: Build fix

http://ftp.suse.com/pub/projects/kernel/kotd/openSUSE-11.2/

*** Bug 568638 has been marked as a duplicate of this bug. ***

*** Bug 568801 has been marked as a duplicate of this bug. ***

Hello there, sorry, I have to say that I cannot test the KOTD, because it would require kernel-source and other rpms to be installed on my system. I'm getting failed dependencies. Greetings,

kernel-source is in the src/ dir, but you don't need it anyway. All you need is kernel-$flavor.rpm unless you're doing kernel debugging and development.

Sorry, I would if I could. Hopefully one of the others here can test some?

Hi, tried 2.6.31.11 from KOTD, x86_64, desktop. Here is the result:

[ 0.118766] BUG: scheduling while atomic: swapper/0/0x10000002
[ 0.118773] Modules linked in:
[ 0.118776] Pid: 0, comm: swapper Not tainted 2.6.31.11-0.0.0.15.4478caa-desktop #1
[ 0.118778] Call Trace:
[ 0.118794] [<ffffffff81011a19>] try_stack_unwind+0x189/0x1b0
[ 0.118798] [<ffffffff8101025d>] dump_trace+0xad/0x3a0
[ 0.118802] [<ffffffff81011524>] show_trace_log_lvl+0x64/0x90
[ 0.118805] [<ffffffff81011573>] show_trace+0x23/0x40
[ 0.118810] [<ffffffff81552cc2>] dump_stack+0x81/0x9e
[ 0.118815] [<ffffffff81056f32>] __schedule_bug+0x92/0xa0
[ 0.118819] [<ffffffff81553bff>] thread_return+0x2a7/0x3c8
[ 0.118823] [<ffffffff81060dc8>] __cond_resched+0x38/0x80
[ 0.118826] [<ffffffff81553ebd>] _cond_resched+0x4d/0x60
[ 0.118831] [<ffffffff8130575e>] acpi_ps_complete_op+0x2c2/0x2eb
[ 0.118835] [<ffffffff81305c89>] acpi_ps_parse_loop+0x371/0x3d0
[ 0.118838] [<ffffffff81304976>] acpi_ps_parse_aml+0x119/0x404
[ 0.118842] [<ffffffff81303528>] acpi_ns_one_complete_parse+0x144/0x175
[ 0.118845] [<ffffffff813035b3>] acpi_ns_parse_table+0x5a/0xb3
[ 0.118849] [<ffffffff812ff0f3>] acpi_ns_load_table+0x87/0x138
[ 0.118852] [<ffffffff813087fe>] acpi_tb_load_namespace+0x80/0x163
[ 0.118856] [<ffffffff813088fe>] acpi_load_tables+0x1d/0x5c
[ 0.118861] [<ffffffff81a071e1>] acpi_early_init+0x85/0x12e
[ 0.118866] [<ffffffff819d363e>] start_kernel+0x3c4/0x3e6
[ 0.118870] [<ffffffff819d268d>] x86_64_start_reservations+0x134/0x14f
[ 0.118873] [<ffffffff819d2803>] x86_64_start_kernel+0x15b/0x17e

Sorry... But I don't have any crashes (even in 2.6.31.8).

Created attachment 336995
dmesg with kotd (2.6.31.11-0.0.0.15.4478caa)
Jeff, kernel 2.6.31.11-0.0.0.15.4478caa is definitely an improvement. With the latest official update I was getting at least a dozen "scheduling while atomic" call traces during boot. With KOTD I get only one.
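For anyone who wants to compare kernels the same way (a dozen reports with the official update versus one with the KOTD), here is a minimal sketch that counts the reports in saved dmesg output. It is not part of any fix. The function name `find_atomic_bugs` is made up for illustration, and the reading of the three fields as task name, PID and preempt count is an assumption based on the message format seen in this report.

```python
import re

# Matches report headers such as:
#   [ 0.118766] BUG: scheduling while atomic: swapper/0/0x10000002
# Assumption: the three slash-separated fields are task name, PID and
# the preempt count in hex, as the messages in this report suggest.
BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>\S+)/(?P<pid>\d+)/(?P<count>0x[0-9a-fA-F]+)"
)


def find_atomic_bugs(dmesg_text):
    """Return one (comm, pid, preempt_count) tuple per report found."""
    reports = []
    for match in BUG_RE.finditer(dmesg_text):
        reports.append(
            (
                match.group("comm"),
                int(match.group("pid")),
                int(match.group("count"), 16),
            )
        )
    return reports


if __name__ == "__main__":
    sample = (
        "[ 0.118766] BUG: scheduling while atomic: swapper/0/0x10000002\n"
        "[ 0.118773] Modules linked in:\n"
    )
    reports = find_atomic_bugs(sample)
    print(len(reports), "report(s) found")
    for comm, pid, count in reports:
        print(f"  task={comm} pid={pid} preempt_count={count:#010x}")
```

Feeding it the output of `dmesg` from two different kernels gives a quick before/after count without reading the traces by hand.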
Strange. I wonder why I wasn't running into that. Fortunately this part should be easier to work around. I guess I just need to load the table into memory and not actually do anything with it, instead of loading the table into the ACPI stack.

Hi there, which of the kernel packages shall I install now? My original packages had been:

kernel-desktop-2.6.31.5-0.1.1.i586.rpm
preload-kmp-desktop-1.1_2.6.31.5_0.1-6.8.1.i586.rpm

So I downloaded kernel-desktop.rpm and kernel-desktop-base.rpm, installed them, rebooted and got a kernel oops, with the machine freezing completely shortly after the boot process started.

f5f7d-desktop #1
[0.172216] Call Trace:
[0.172281] [<c020883a>] try_stack_unwind+0x17a/0x1a0
[0.172350] [<c020746c>] dump_trace+0x6c/0x130
[0.172417] [<c02083e8>] show_trace_log_lvl+0x58/0x80
[0.172485] [<c0208436>] show_trace+0x26/0x40
[0.172553] [<c06936e3>] dump_stack+0x79/0x91
[0.172619] [<c0693756>] panic+0x5b/0x145
[0.172686] [<c0255605>] do_exit+0x2d5/0x350
[0.172983] [<c0697fb7>] oops_end+0xb7/0x110
[0.173054] [<c022fba4>] no_context+0xd4/0xf0
[0.173121] [<c022fcc5>] __bad_area_nosemaphore+0x105/0x1d0
[0.173189] [<c022fdaf>] bad_area_nosemaphore+0x1f/0x40
[0.173256] [<c0699bbb>] do_page_fault+0x39b/0x440
[0.173324] [<c069722b>] error_code+0x73/0x78
[0.173391] [<c02017ac>] initramfs_file_write+0xec/0x190
[0.173458] [<c0201899>] initramfs_write+0x49/0x90
[0.173525] [<c0981e73>] do_copy+0x32/0xd7
[0.173591] [<c0980f6b>] flush_buffer+0x81/0xb8
[0.173657] [<c09a774d>] gunzip+0x374/0x428
[0.173723] [<c098147b>] unpack_to_rootfs+0x29f/0x3aa
[0.173790] [<c0981e13>] populate_rootfs+0x59/0x87
[0.173857] [<c097fcd3>] start_kernel+0x38d/0x3ae
[0.173984] [<c097f007>] i386_start_kernel+0x87/0x9f

Please note: this trace was entered manually, since I had no chance to copy and paste it. Bye, Angie.

Ugh. Ok. It looks like it will probably be a better idea to test kernels out of my build service project instead of committing to the repo. This worked fine for me. For now, just back out to the last update kernel.

I've got the same problem ("atomic" bug without crash) with kernel 2.6.31.8-0.1-desktop. Kernel 2.6.31.8-0.1-default is OK.
Ok, thanks for the feedback. We really don't need any more "me too" posts, though. Please only post new comments if you have new information. The cause of the messages is well known. It affects -desktop because -desktop has preemption enabled. It affects versions that have the generic ACPI table override code in them because, well, that's the code that's causing the messages.

Here's the thing. Outside of the oops in comment #22, which is a real crash, this problem is _not_ crashing systems. It's a warning during boot. It looks like a real oops, but it's not a real oops. Here's why: the code paths are calling cond_resched(), which was added to a number of places to improve latency. Part of it checks to ensure that the code is not running with a spinlock held, in interrupt context, or under some other conditions. It issues the message everyone's been reporting when it detects that. Generally, it's a good thing. For the code that we're using here, though, init hasn't even been started yet, so it _can't_ actually schedule. The warnings are cosmetic even if they're really ugly.

(In reply to comment #22)
> Hi there,
>
> Which of the kernel packages shall I install now? My original packages had been:

Maybe it's a good idea for you to go back to the last known usable configuration? I tried the kernel too, and got a freeze. I burned kernel 2.6.31.8-0.1-desktop to a CD and used rpm -e (kotdkernel) and rpm -Uhv 2.6.31.8-0.1-desktop to get back to a working configuration. The freeze in my case is probably due to a difficult kernel choice in general, which caused incompatibilities. Greetings, R

Good morning, well, my nice little machine has an additional kernel installed from the kernel:head repository. I chose the pae kernel here, although I am fairly sure I won't need it with my 1.5 GB of memory.

kernel-pae-2.6.32-41.1.i586

Nevertheless, the other machines I install are not supposed to use that repository as well, so I just try to make sure that the other machines do _not_ install the desktop kernel. One machine I installed chose the default kernel, and it has not crashed yet. I am installing via autoyast and have a script that will check for a specific kernel version (2.6.31.8) and will just try to downgrade the kernel as soon as it is detected. I will now test the default kernel with my own machine and see whether the problems are gone with the normal kernels (default / pae / xen?). I will come back with real feedback, promised! Bye for now, Angie.

Hi there, I installed the following packages:

kernel-xen + kernel-xen-base: booted fine, no problems
kernel-pae + kernel-pae-base: booted fine, no problems
kernel-default + kernel-default-base: booted fine, no problems

No errors were reported, except this one (for each version):

Could not load /lib/modules/kernel-<version>/systemtap/preloadtrace.ko

I guess that's intended?

The machine I installed yesterday had "kernel-desktop-2.6.31.5-0.1.1.i586" installed fine and was running fine. Today a user logged in, started applications (KDE4 + Kontact) and the machine got an oops ... so I had to reboot the system. Having learned from the problem, I started top on that machine since it was quite slow, even for a nearly empty KDE4 session .. I could see the preload_trace process using 99 percent of the CPU. That machine only has 512MB RAM and started swapping right after Kontact had been started, which _seemed_ to have caused the kernel oops together with the frozen system. Shortly after booting the machine I used my chance and installed kernel-default, which solved the problem completely: the machine is usable, even with the 2.6.31.8 kernel.

I would like to know how openSUSE determines which kernel flavor is needed, since I had a big desktop of nearly the same age with 1GB RAM, and it installed the default kernel. But on the smaller desktop it installed the desktop flavor ... For me the solution will be to get rid of the installed desktop kernel and replace it with the default kernel. But that is the tricky part during an automated installation. Bye and thanks for your help, Angie.

The BUG: scheduling while atomic: swapper/0/0x10000002 is fixed in 2.6.31.11-0.0.0.17.0f2b876-desktop. No more messages.

Daniele: I believe Jeff is waiting to hear from you. See comment 13.

Uh!, sorry... For me, fixed in kernel-desktop-2.6.31.11-bnc540589.0

Resetting needinfo.

Hi, kernel-desktop-2.6.31.12 from kotd, fixed for me. Thank you.

*** Bug 574910 has been marked as a duplicate of this bug. ***

Hi again, is there any chance that a patch will be put into the update repositories, or do I need to fetch the kotd and install it? At the moment one of the machines that has an older kernel installed ("2.6.31.5-0.1-default") gets kernel oopses quite often, whereas a machine with nearly the same setup seems to be quite stable. I would like to test a newer kernel, but would love to have it from the update repository. The kernel 2.6.31.11 had already been okay for my personal machine .. Bye and many thanks, Angie.

*** Bug 575615 has been marked as a duplicate of this bug. ***

http://download.opensuse.org/update/11.2-test/ has the next update kernel (2.6.31.12) checked in, feel free to test if it fixes this issue.

Hi there, the kernel seems to be stable now, had no trouble with the swapper yet. I am quite happy now. Thanks a bundle, Angie.

I've been running 2.6.31.12-0.1-desktop for several days now -- no troubles.
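The check Angie describes for her autoyast installs (detect kernel version 2.6.31.8, then move the machine off the desktop kernel) could look roughly like the sketch below. This is not her actual script, which was not posted here. The names `parse_release` and `needs_replacement` are hypothetical, and the release string layout `<version>-<build>-<flavor>` is assumed from the kernel names quoted in this report, such as 2.6.31.8-0.1-desktop.

```python
import re

# Assumed layout of a release string, based on names quoted in this
# report: <dotted version>-<build>-<flavor>, e.g. 2.6.31.8-0.1-desktop
RELEASE_RE = re.compile(
    r"^(?P<version>\d+(?:\.\d+)+)-(?P<build>.+)-(?P<flavor>[a-z]+)$"
)

# The kernel version the thread reports the warnings against.
AFFECTED_VERSION = (2, 6, 31, 8)


def parse_release(release):
    """Split a release string into (version tuple, flavor)."""
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return version, match.group("flavor")


def needs_replacement(release):
    """True for the -desktop flavor of the affected version only."""
    version, flavor = parse_release(release)
    return flavor == "desktop" and version == AFFECTED_VERSION


if __name__ == "__main__":
    for release in ("2.6.31.8-0.1-desktop",
                    "2.6.31.8-0.1-default",
                    "2.6.31.12-0.1-desktop"):
        print(release, "->", needs_replacement(release))
```

On a running system the string would come from `uname -r`. What to do when the check is true (install kernel-default, remove kernel-desktop) is left out on purpose, since that part depends on the installation setup.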
Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop
Products: openSUSE 11.2 (debug, i586, x86_64)

Got the same in 11.3M1:

[ 0.016206] Pid: 0, comm: swapper Not tainted 2.6.32-3-desktop #1
[ 0.016269] Call Trace:
[ 0.016340] [<ffffffff81006219>] dump_trace+0x79/0x340
[ 0.016406] [<ffffffff814b7393>] dump_stack+0x69/0x6f
[ 0.016471] [<ffffffff814b81e1>] thread_return+0x367/0x386
[ 0.016537] [<ffffffff81045e45>] __cond_resched+0x25/0x40
[ 0.016601] [<ffffffff814b832d>] _cond_resched+0x2d/0x40
[ 0.016666] [<ffffffff810d3c1e>] generic_perform_write+0x14e/0x200
[ 0.016732] [<ffffffff810d3d3e>] generic_file_buffered_write+0x6e/0xd0
[ 0.016798] [<ffffffff810d4397>] __generic_file_aio_write+0x247/0x460
[ 0.016863] [<ffffffff810d461d>] generic_file_aio_write+0x6d/0xe0
[ 0.016929] [<ffffffff8111ae62>] do_sync_write+0xe2/0x120
[ 0.017002] [<ffffffff8111b138>] vfs_write+0xb8/0x1a0
[ 0.017066] [<ffffffff8111bbda>] sys_write+0x5a/0x110
[ 0.017131] [<ffffffff81b3530a>] do_copy+0x84/0xb0
[ 0.017194] [<ffffffff81b34d4c>] flush_buffer+0x7d/0xa4
[ 0.017259] [<ffffffff81b58b8d>] gunzip+0x411/0x4bf
[ 0.017323] [<ffffffff81b35176>] unpack_to_rootfs+0x2d3/0x3e3
[ 0.017388] [<ffffffff81b35b26>] populate_rootfs+0x5b/0x10a
[ 0.017452] [<ffffffff81b33dea>] start_kernel+0x302/0x318
[ 0.017516] [<ffffffff81b333f3>] x86_64_start_kernel+0xe5/0xe9
[ 0.025030] BUG: scheduling while atomic: swapper/0/0x10000002
[ 0.025094] Modules linked in:
[ 0.025182] Pid: 0, comm: swapper Not tainted 2.6.32-3-desktop #1

11.3M2 will contain a 2.6.33-rc based kernel that doesn't have the DSDT in initramfs patches. Closing this one as fixed. |