Bugzilla – Bug 660464
complete system freeze regression
Last modified: 2018-07-03 20:35:09 UTC
openQA testing has shown a complete system freeze early in boot, or sometimes after install, on 32-bit installs.

http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0963
http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0964
http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0964-lxde

How To Reproduce:
1. qemu-kvm -m 1000 -cdrom factory/iso/openSUSE-NET-i586-Build0964-Media.iso
2. (maybe optional) at the boot prompt add nohz=off
3. optionally use F3 to select text mode to see console messages
4. press return to boot

Actual Results:
Boot will often stop after printing " >>> openSUSE installation program v3.5.7... <<< Starting udev..."

Expected Results:
should work like yesterday's version

Reproducible: Sometimes

- sometimes x86_64 versions also showed this problem
- also happens in VirtualBox
- the statuser values in the test log show that it is busy-looping
Now I have seen a kernel panic: http://openqa.opensuse.org/opensuse/permanent/bug/bug660464-2.jpg
So maybe it is actually a kernel problem that only later started to be randomly triggered by something else?
http://www.linuxquestions.org/questions/slackware-14/current-randomly-timed-kernel-oops-on-bootup-of-two-test-boxen-852843/ discusses the very same bug. It appears to be a bug in the kernel's SCSI passthrough, triggered by udev-165 using an additional SCSI command. Tests with today's openSUSE-GNOME-LiveCD-i686-Build0988-Media.iso on KVM had it failing in 15 of 20 tries. nohz=off is not required for that.
Can you re-capture the oops but boot with panic_on_oops=1 so we can see the primary oops?
Created attachment 407345 [details]
serial console log with Oops+backtrace

used console=ttyS0 instead
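For anyone repeating the KVM runs, here is a rough sketch of how the boots could be scripted and tallied. It assumes the guest is booted with console=ttyS0 so that kernel messages reach the serial port, that each run's output has been captured as text, and that a crash shows up as one of the usual kernel strings ("Oops", "Kernel panic", "BUG: unable to handle"). The function names and the -serial stdio option are additions for illustration; only the qemu-kvm -m 1000 -cdrom invocation comes from the original report.

```python
import re

# Strings the kernel typically prints on a crash. Matching on them is an
# assumption about what the captured serial console logs contain.
CRASH_PATTERN = re.compile(r"Oops|Kernel panic|BUG: unable to handle")


def qemu_command(iso_path, memory_mb=1000):
    """Build the qemu-kvm command from the report, with the guest's
    serial port redirected to stdio so the console log can be captured."""
    return ["qemu-kvm", "-m", str(memory_mb), "-cdrom", iso_path,
            "-serial", "stdio"]


def boot_crashed(console_log):
    """Return True if the captured console text contains a crash marker."""
    return CRASH_PATTERN.search(console_log) is not None


def failure_rate(console_logs):
    """Return (number of crashed boots, total boots) for a set of logs."""
    logs = list(console_logs)
    failures = sum(boot_crashed(log) for log in logs)
    return failures, len(logs)
```

The argument list from qemu_command() can be passed to subprocess.run() with a timeout, and the captured output fed to failure_rate(). A freeze that prints no oops (the "Starting udev..." hang) is not detected by the pattern and would have to be counted from the timeout instead.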
I had a similar panic on my laptop (Amilo Pro 2010) with 2.6.37-rc7, but that went away when using 2.6.37 from Kernel:/HEAD so there might already be a fix.
Created attachment 407787 [details]
serial console log with Oops+backtrace from 2.6.37-default

log has one successful boot and one oops after reset, so on KVM the bug might still be there with final 2.6.37
According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can reproduce it too
Tejun has a working patch: http://marc.info/?l=linux-hotplug&m=129536338129945&w=2
(In reply to comment #7)
> According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can
> reproduce it too

The crashes I could reproduce were cured by

patches.fixes/sched-cgroup-use-exit-hook-to-avoid-use-after-free-crash

which is the patch in this thread, with another hunk to prevent the exit hook from messing with a failed fork child on its way to the grave, and thereby making autogroup diddle freed memory.
ok, so the other bug is fixed by #8 - if someone could push it to master asap I would be grateful
(In reply to comment #7)
> According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can
> reproduce it too

No, according to that thread, Mike could produce /an/ Oops. Not /this/ Oops.

I've applied the patch from comment #8 to the repo and have forced an update to Kernel:HEAD for testing. Please try a kernel with the following changelog entry and report back.

ata: Fix panics with ata_id (bnc#660464).
No more i586 crashes on openQA in over 20 test runs since this went into Factory. Cannot yet tell about i686 LiveCDs, since none have been built so far. But it looks good.
Thanks. I'll close as FIXED. Please re-open if the LiveCDs fail.
Bug has not been seen again. Not even on LiveCDs.
Created attachment 412674 [details]
Default kernel log

After updating from 11.3 to 11.4-M6 (x86_64), my system (laptop hp-compaq 6720s) totally freezes on boot. It happens almost always (roughly 9 times out of 10). On the console I saw only "Creating device nodes with udev", and that's all. I also saw this problem on 11.3 with newer kernels (2.6.36, 2.6.37).
Created attachment 412675 [details]
"Failsafe" parameters (apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 x11failsafe vga=0x317)
Created attachment 412676 [details]
Default kernel log + nomodeset - udev floods the logs
Created attachment 412677 [details]
System log after successful boot (udev flood again)
The bug is still here (see above).
11.4 RC1 - bug is still here
Sorry, this is a different bug, so please track it under a different bug number. Your problem is hardware-specific, or it wouldn't go away with kernel parameters.