Bugzilla – Bug 202079
Kernel panic during install on K6 and K6-2 processors (VIA Apollo chipsets) 10.2 Alpha 3
Last modified: 2006-10-10 18:45:31 UTC
While booting the installation CD on a K6/200 with a VIA VT82C595/97 [Apollo VP2/97] chipset, and a K6-2/500 with a VIA VT82C598 [Apollo MVP3] chipset, the kernel panics shortly after freeing unused memory. This happens even when booting using the "safe settings" entry in the boot menu. The K6/200 screen goes crazy, but I was able to copy down the screen contents of the K6-2/500 which I am including as an attachment. I am also attaching the output of lspci -vv for both computers from the very well behaved SUSE-10.1 installs now running on them. (The same install CD works very nicely on Pentium 2, Pentium 4, and AMD64-X2 systems, to save time answering the obvious question :-)
Created attachment 97228 [details] Panic Info copied from screen This is the information I copied from the screen of the K6-2/500 after the kernel panic
Created attachment 97229 [details] Output of lspci -vv on k6-2/500 This is the output of lspci -vv on the K6-2/500
Created attachment 97230 [details] Output of lspci -vv on k6/200 This is output of lspci -vv from the K6/200
Next alpha should solve this. If not, please reopen.
I have the same problem with an AMD Sempron 2400+ 1.66 GHz CPU. Tested on openSUSE 10.2 Alpha 4, so basically this was NOT fixed. The kernel panics when I try to install openSUSE Linux, in both normal and failsafe modes. Look at the attached screenshot.
Created attachment 98189 [details] openSUSE-10.2-kernel-crash (tested on alpha4, x86)
*** Bug 204252 has been marked as a duplicate of this bug. ***
10.2 Alpha 4 - Same behaviour on a Pentium-1 133MHz laptop (Toshiba Tecra 510CDT). I hope comment #4 refers to Alpha 5 - I will be very pleased to try it.
*** Bug 203466 has been marked as a duplicate of this bug. ***
OK, but since this bug is a blocker, and I am not sure it is fixed for Alpha 5, I would like to know more about it. What part of the kernel is buggy? Who works on it? Is this bug connected to the recent transition to SMP kernels?
The same happens for me on an _installed_ system (Thinkpad T42p) that I upgraded via yast2 system upgrade to Factory as of yesterday. It did not happen on the same machine with Factory as of Sep 11. It seems the kernel alone is not causing the problem, because downgrading the kernel to the one from Sep 11 does not work around it. The problem happens with both the default and the xen kernel. If you already have some idea what may cause this problem, please post it here, because it might help me develop a temporary workaround other than downgrading the whole system and thus no longer being able to provide further information.
BTW the crash happens just before or when ide-core is normally loaded from the initrd. No message about ide-core is seen, so it seems that either init from the initrd is not run at all or it does not work correctly. ide-core is included in the initrd.
I think it has something to do with SMP-based kernels (which are now the default). Perhaps there is some small bug in the chipset/BIOS that tells Linux "I am a multiprocessor-capable system", and when the Linux kernel uses the opcodes for standard multiprocessor systems, the system dies and the kernel panics. I can't prove that, but it is my feeling. I can't prove it because I can't install the Linux system and then downgrade the kernel to a uniprocessor one, but I can do it the other way around: I will try an older distro and install an SMP kernel on one of the "buggy" systems.
Also, please change the topic, because this bug happens not just on AMD K6 systems, but also on some AMD Sempron and Pentium I systems. But I have no idea of the correct name as of yet.
btw: my buggy system uses AMIBIOS.
(In reply to comment #14)
> Also, please change to topic, because this bug happens not just on AMD K6
> systems, but also on some AMD Sempron and Pentium I systems.
>
> But I have no idea of the correct name as of yet.

There are several bugs talking about something similar. Could somebody with knowledge please look at them and decide which one is a copy of the other: 202605, 204252 and 204647. The issue is that, because some of them were reported under VMware and Parallels, the maintainers just close those tickets, which leaves the underlying issue open. This WILL result in a LOT of problems when people close tickets just because the word Parallels or VMware is used. If maintainers don't look closer, this will become a second zen disaster.
After seeing this problem on my AMD K6-2 system using 10.2 alpha 4, I tried to go back to alpha 3 (which used to run, though I had other issues [bug #203233]). Now the alpha 3 install crashes with essentially the same symptoms as described above for alpha 4 - on hardware unchanged from last month when it didn't crash. The same machine still runs 10.1 final perfectly happily (as much as that means for 10.1 ;-) ). BTW - tried burning a new CD1 (after md5 sum check) to no avail. Any suggestions please as to how alpha 4 has managed this!? Can anyone else replicate the problem?
Richard: I guess it was just good luck that it worked for you the first time with an older version.

All: On my system the following happens when adding notsc to the kernel command line:
- System installs acpi_p clocksource instead of tsc (obviously).
- System does _not_ crash any longer (it used to crash at this point without notsc).
- System does initialize Synaptics TouchPad and IBM TrackPoint.
- System stalls indefinitely at this point. It does _not_ crash, just stalls. I could still reboot by pressing Ctrl+Alt+Delete.

Could others who suffer from this bug please also try notsc on their kernel command line, just to verify whether it is likely that we are talking about the same bug?
I just verified that notsc does _not_ fix the problem with my K6-2/500, which has an Award BIOS, copyright 2/12/99 (no usable APM or ACPI, so I have disabled it). I suspect that the illegal instruction is actually in "init", and not the kernel, and comment 11 strengthens my suspicions.
At least my case is fixed by the patch I submitted in bug #206368. Since I cannot be sure that all cases are caused by this problem I opened this as a new bug. If it turns out that all cases are caused by this, make this bug a duplicate of bug #206368 since this contains the fix.
Mark: notsc did not fix the problem for me either; it just hid it.
Further inspection revealed that my problem, and thus the patch in bug #206368, is unrelated to this bug although the symptoms look much the same. So please ignore all my comments in this bug; they will just confuse you.
Robert - tried alpha 4 with "notsc" on K6/2-500: acpi_p is installed but the system crashes as before. Also tried in conjunction with "acpi=off"; then the clock source is "pit", and it still crashes. This is the same as using "safe settings" of course. Also tried on a Pentium-I system as per comment #8, with exactly the same behaviour. For completeness I repeated the tests on both machines with alpha 3; both show the same behaviour as for alpha 4. I still don't know how I managed to get the install under way with alpha 3, though thinking about it I seem to remember it failing on the first try (probably the symptoms we are discussing), but I just hit reset and repeated, and somehow it worked. Very frustrating!
Richard: Yes, most likely some race condition. You can forget all that tsc stuff now because, as said in comment #21, it just hid my problem and, as said in comment #22, my problem was definitely a different one.
K6/2-550 on MVP4 also kernel panics on alpha4.
Hi guys! So how is this progressing? Since I believe there is a bug in the SMP kernel (the default in 10.2), I recommend the following: on all problematic systems running SUSE 10.0 or 10.1, try installing the SMP kernel to see if this is the problem.
Just installed 2.6.16.13-4-smp on my K6-2/500 which is very happily running 10.1, and it works perfectly:

markgray@soyo:~> uname -a
Linux soyo 2.6.16.13-4-smp #1 SMP Wed May 3 04:53:23 UTC 2006 i586 i586 i386 GNU/Linux
[end quote]

Fedora Core 6 Test 3 install and rescue both also refuse to boot on this machine, but fail with different symptoms, so it could be a bug introduced by a later kernel version. Fedora Core 6 Test 3 uses kernel-2.6.17-1.2630.fc6.i586.rpm and openSUSE 10.2 Alpha 4 uses 2.6.18_rc6_git3 apparently.
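For others repeating this check on a working 10.0 or 10.1 install, here is a minimal Python sketch of the same test: it looks for the "SMP" token that appears in the version field of the uname -a output quoted above. The function name is_smp_build is made up for illustration, and the sketch assumes the kernel marks SMP builds with that token, as the quoted output does.

```python
import platform


def is_smp_build(version: str) -> bool:
    """Return True if a uname version string carries the 'SMP' token."""
    return "SMP" in version.split()


if __name__ == "__main__":
    u = platform.uname()
    # u.release is e.g. "2.6.16.13-4-smp"; u.version is e.g. "#1 SMP Wed May 3 ..."
    print(f"release={u.release} smp_build={is_smp_build(u.version)}")
```

This only tells you how the running kernel was built; it says nothing about whether that kernel would survive the 10.2 installer.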
If you add "nosmp" to the kernel boot line, does the oops go away?
No -- an identical kernel panic still occurs. It might be informative if the other K6, K6-2, K6-3 and Athlon panic sufferers were to identify the chipset and hardware on their computers. While the bug might be caused by the kernel's requirement for instructions or features these processors lack, a bug in the VIA chipset handling or the various elderly BIOSes used is also possible. The kernel and installation system have been depending more and more on ACPI functions doing what they are supposed to do, it seems to me :-) (Will be away until after the weekend -- back online Saturday sometime.)
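To make it easier to collect that hardware information from a working install on the same machine, here is a minimal Python sketch that summarizes /proc/cpuinfo. The function names are made up for illustration, and the cmov flag is reported only as one example of a feature that some of these older processors lack; nothing in this bug establishes that cmov is involved.

```python
from pathlib import Path


def parse_cpuinfo(text: str) -> dict:
    """Parse the first processor block of /proc/cpuinfo-style text into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def summarize(info: dict) -> str:
    """One-line summary suitable for pasting into a bug comment."""
    flags = info.get("flags", "").split()
    return (
        f"vendor={info.get('vendor_id', '?')} "
        f"family={info.get('cpu family', '?')} "
        f"model name={info.get('model name', '?')} "
        f"cmov={'yes' if 'cmov' in flags else 'no'}"
    )


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print(summarize(parse_cpuinfo(path.read_text())))
```

Chipset details still need lspci -vv as in the attachments; this only covers the processor side.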
Can people who have this problem, after testing the "nosmp" boot issue, also try the kernel-vanilla package and see if the issue occurs there too?
1) "nosmp" - didn't help. Are there other kernel options that might help, besides those already used in "Install--Safe Settings" in 10.2? "Install--Safe Settings" produces the *same* errors as the normal install.
2) I can't install the "kernel-vanilla" package on SUSE 10.2, because I can't install 10.2 itself. Any ideas?
Actually the same kernel problem also occurs on Virtual PC 5.1, and maybe even on Microsoft Virtual PC 2004. (This is freeware now, so if you have Windows, please try openSUSE 10.2 on it.)
If you install the kernel-vanilla package on the 10.1 release, does it cause the same problem? We really need to track down if this is a SuSE specific issue, or a kernel.org issue as soon as possible. That is what the kernel-vanilla package is for.
Bad news for SUSE: the standard vanilla kernel from kernel.org, latest stable 2.6.18, WORKS (!!) on my problematic machine! It was configured with SMP functionality on a single-CPU machine, i586. So the problem is just with the SUSE kernel.
No, that's not a bad thing for us, that means we have a better chance to fix it properly :) Let me go build a different kernel that might help out with some testing, and I'll post the location of where it is in a few hours when it finishes building...
Ok, can anyone try out the kernel at: ftp://ftp.suse.com/pub/people/gregkh/202079 and let me know if that works better or not? Rumor also has it that for some people the kernel-of-the-day has fixed their problems. Any truth to that matter or not?
*** Bug 204647 has been marked as a duplicate of this bug. ***
*** Bug 202605 has been marked as a duplicate of this bug. ***
After taking the kernel image from the i586 RPM at ftp://ftp.suse.com/pub/people/gregkh/202079/ and using it to replace the non-working kernel on the Alpha4 CD1 ISO, I was able to boot under Parallels without having a kernel panic (as noted under bug #202605).

As a description of what I've done: I have been taking the kernel and initrd from the factory boot/i386/loader, and rebuilding the CD1 ISO using makeSUSEdvd, using the factory kernel/initrd as a replacement for those supplied on the alpha4 ISO. To date, every kernel/initrd combination taken from factory has failed with a kernel panic no matter what kernel options were passed.

In my latest test, I extracted the kernel image from the RPM and used that as the replacement kernel. Since the kernel RPM doesn't include a matching initrd, and I didn't try to build one, there is a version mismatch between the kernel image (2.6.18-6) and the initrd (2.6.18-4). Even so, the system does boot beyond the point at which it was throwing up the kernel panic, although the installation system complains that it can't find the installation CD and so activates the manual setup system in text mode. After this, I am able to pick the installation language as English and the English (UK) keymap. At this point, because no modules were loaded, and none can be loaded, the installation can proceed no further.

As a summary, it appears that this kernel build has solved the kernel panic under Parallels.
David, Wonderful, thank you so much for testing this (and yes, what you did with the initrd was fine, sorry I had forgotten about that.) The test kernel that I provided was identical to our normal kernel, with the exception that all of the Xen modified code was removed. This proves that the problem is in the Xen set of patches added to our kernel, and now the Xen team can work on tracking this down and fixing it. Reassigning to the Xen team for them to figure out before the Beta release.
Are you telling me that the default kernel, besides being able to run on x86, is also able to run as Xen host *and* guest? Or is just the Xen host functionality included? I think in SUSE 10.0 there were 3 separate kernels... - I have not yet tested the new kernel. Just woke up.
Oh yes! The new test kernel runs excellently on my old problematic system!
I hope that Alpha 5 will include a fixed kernel. Is this progressing?
Alexey, the Xen team is currently working on it now.
Oops, Alexey, sorry, but no, I do not think the Alpha5 kernel will be fixed for this, as that kernel version just was checked in a few hours ago, without a fix for this. Sorry, but the proper people now know about this. Hopefully by the next release it can be fixed.
*** Bug 206153 has been marked as a duplicate of this bug. ***
*** Bug 206513 has been marked as a duplicate of this bug. ***
*** Bug 206768 has been marked as a duplicate of this bug. ***
*** Bug 204749 has been marked as a duplicate of this bug. ***
*** Bug 204147 has been marked as a duplicate of this bug. ***
This is either a binutils bug (incorrect use of multi-byte NOPs), for which a fix has meanwhile been imported from mainline, or incorrect assembler options. In either case, yesterday's kernel of the day no longer exhibits this problem.
Well, alpha5 doesn't produce any errors! Very good! People, please test on your AMD K6 boxes...
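For anyone who wants to look at a kernel or init binary for this themselves: the multi-byte NOP forms are encoded with the two-byte opcode 0F 1F, which processors and emulators that predate that encoding reject as an illegal instruction. Below is a rough Python sketch that lists the offsets where that byte pair occurs in a buffer. It is a naive byte scan, not a disassembler, so it will also report matches that sit inside other instructions or data, and the function name is made up for illustration.

```python
def find_long_nop_candidates(code: bytes) -> list:
    """Return offsets where the byte pair 0F 1F (multi-byte NOP opcode) appears.

    Naive scan: matches are candidates only and need checking with a
    real disassembler before drawing conclusions.
    """
    offsets = []
    start = 0
    while True:
        pos = code.find(b"\x0f\x1f", start)
        if pos == -1:
            return offsets
        offsets.append(pos)
        start = pos + 1


if __name__ == "__main__":
    # Hand-written sample bytes: one-byte NOPs (0x90) around two 0F 1F forms.
    sample = b"\x90\x0f\x1f\x00\x90\x0f\x1f\x40\x00"
    print(find_long_nop_candidates(sample))
```

A compressed kernel image would have to be unpacked first; scanning the compressed file directly tells you nothing.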
Not possible here. My two installation targets don't support DVD. No one should have to download that much just to test if an installer kernel panic is fixed. My recent attempts from factory ftp got nowhere, not even able to find a HD to install to.
Just failed to install openSUSE-10.2-Alpha5-DVD-i386.iso under vmware 5.5.2. I remembered that I had killed timidity a few days ago because I had some problems with a busy /dev/rtc which was in use by timidity. So I started "timidity -iA -B2,8 -Os" and now I get a vmware warning, but Alpha5 is now booting! Very strange!?
Install of 10.2 alpha 5 seems to have been successful on my K6-2/500 box that exhibited these problems (see bug #204252). Like Felix I had the problem that the machine had no DVD drive, so I loop-mounted the DVD ISO to extract the contents on another machine, booted the target with the mini CD and accessed the DVD data via NFS. Worked beautifully! Slightly OT - why can't I NFS export the loop-mounted ISO directly? (I had to copy the files out to a real directory structure before it was visible.) Is there a way to do it without copying? Next I will try on the old P133 laptop referred to in my comment #8 above...
Installation on a Parallels VM (#202605) works. Performing an upgrade from alpha2 (the last alpha to install on Parallels) required editing /etc/fstab and /boot/grub/device.map, as the drives are no longer identified as IDE, but as SCSI.