Bugzilla – Full Text Bug Listing

| Summary: | System crashes after launching Yast under Xen kernel | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 10.3 | Reporter: | Henry Laurent <laurent.henry> |
| Component: | Xen | Assignee: | Jan Beulich <jbeulich> |
| Status: | RESOLVED NORESPONSE | QA Contact: | Jiri Srain <jsrain> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | carnold, laurent.henry, marcus |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | i386 | ||
| OS: | openSUSE 10.3 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |

Attachments:
- save_y2logs output
- xend log
- dmesg with the working kernel
- dmesg with the kernel producing the kernel
- xen boot.msg file
- boot.msg with default kernel

Description
Henry Laurent
2007-12-04 13:51:47 UTC
Created attachment 185816 [details]
save_y2logs output
Originally, which YaST module did you start that led to the crash? (Was this really just the menu in ncurses?) What version of YaST do you have installed now (best provide rpm -qa)?

- The crash occurs as soon as I type "yast" at the root prompt. I have the feeling it could be a kernel problem involving bad memory access; I can't imagine how yast alone could freeze the whole system.
- rpm -qa|grep yast

yast2-storage-lib-2.15.27-4
yast2-xml-2.15.0-55
yast2-control-center-qt-2.15.4-12
yast2-ncurses-2.15.27-16
yast2-2.15.58-12
yast2-country-2.15.20-7
yast2-sound-2.15.11-18
yast2-firewall-2.15.8-8
yast2-runlevel-2.15.3-19
yast2-x11-2.15.11-22
yast2-fingerprint-reader-2.15.2-27
yast2-kerberos-client-2.15.7-32
yast2-ldap-client-2.15.12-37
yast2-users-2.15.38-7
yast2-inetd-2.15.1-41
autoyast2-installation-2.15.17-17
autoyast2-2.15.17-17
yast2-restore-2.15.4-22
yast2-online-update-frontend-2.15.24-0.1
yast2-repair-2.15.8-0.1
yast2-backup-2.15.5-0.1
yast2-schema-2.15.0-123
yast2-trans-stats-2.15.0-32
yast2-transfer-2.14.0-107
yast2-hardware-detection-2.15.8-36
yast2-perl-bindings-2.15.3-29
yast2-qt-2.15.16-19
yast2-control-center-2.15.4-12
yast2-mouse-2.15.1-81
yast2-printer-2.15.6-4
yast2-vm-2.16.1-48
yast2-bluetooth-2.15.4-17
yast2-irda-2.15.1-94
yast2-pam-2.14.0-128
yast2-scanner-2.15.5-42
yast2-sysconfig-2.15.3-58
yast2-network-2.15.81-2
yast2-ntp-client-2.15.12-7
yast2-tv-2.15.7-23
yast2-installation-2.15.54-4
yast2-samba-client-2.15.11-33
yast2-packager-2.15.81-4
yast2-update-2.15.23-21
yast2-iscsi-client-2.15.2-39
yast2-metapackage-handler-0.7.1-9
yast2-registration-2.15.3-15
yast2-iscsi-client-2.15.2-39
yast2-metapackage-handler-0.7.1-9
yast2-registration-2.15.3-15
yast2-sudo-2.15.3-86
yast2-bootloader-2.15.29-2
yast2-add-on-2.15.17-4
yast2-online-update-2.15.24-0.1
yast2-core-2.15.13-0.1
yast2-profile-manager-2.15.1-0.1
yast2-theme-openSUSE-2.15.14-4
yast2-slp-2.15.0-31
yast2-pkg-bindings-2.15.51-4
yast2-ldap-2.15.1-83
yast2-apparmor-2.1-26
yast2-nfs-client-2.15.0-25
yast2-support-2.15.3-14
yast2-nis-client-2.15.3-21
yast2-security-2.15.1-23
yast2-mail-2.15.23-2
yast2-samba-server-2.15.7-57
yast2-storage-2.15.27-4
yast2-tune-2.15.7-20
yast2-trans-fr-2.15.16-2.1

"yast" just starts the menu; not even hardware probing should be involved there.

Something is definitely going wrong: with any xen kernel it crashes, while if I reboot with the default 32-bit kernel all is fine. A memory management problem with these kernels?

Please attach the kernel logs (dmesg, etc) and xend.log so that we may better understand what is happening on your system.

*** Bug 346178 has been marked as a duplicate of this bug. ***

Created attachment 189700 [details]
xend log
Created attachment 189701 [details]
dmesg with the working kernel
Created attachment 189705 [details]
dmesg with the kernel producing the kernel
I am noticing something really weird under the xen kernels: ntp is going crazy. I am not sure it has anything to do with the actual trouble, but on this hardware it also occurs only with the xen kernels. In /var/log/messages it shows up as:

Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4696602488 shadow=1100000067915 offset=4136832
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4696519541 shadow=1100000067915 offset=4219705
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4704805144 shadow=1100000067915 offset=4228629
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4704781345 shadow=1100000067915 offset=4252501
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4706074107 shadow=1100000067915 offset=4498243
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4706189123 shadow=1100000067915 offset=4528109
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4706248675 shadow=1100000067915 offset=4552807
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4706455918 shadow=1100000067915 offset=4578997
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4706791654 shadow=1100000067915 offset=4661999
Jan 8 10:36:55 xen1 kernel: printk: 5 messages suppressed.
Jan 8 10:36:55 xen1 kernel: Timer ISR/1: Time went backwards: delta=-26763989925 delta_cpu=4716010075 shadow=1104000078321 off=712004995 processed=1131476065650 cpu_processed=1099996065650
Jan 8 10:36:55 xen1 kernel: 0: 1131472065650
Jan 8 10:36:55 xen1 kernel: 1: 1099996065650
Jan 8 10:36:55 xen1 kernel: clocksource/1: Time went backwards: delta=-26765288604 shadow=1104000078321 offset=712224591
Jan 8 10:36:55 xen1 kernel: clocksource/1: Time went backwards: delta=-26771352533 shadow=1104000078321 offset=712269391
Jan 8 10:36:55 xen1 kernel: clocksource/1: Time went backwards: delta=-26771343265 shadow=1104000078321 offset=712278173
Jan 8 10:36:55 xen1 kernel: clocksource/1: Time went backwards: delta=-26771317593 shadow=1104000078321 offset=712303921

About comment #11, this sounds like bug 279062 found and fixed in sles10sp1. The same fix has been taken for 10.3 but is not yet available in the maintenance channel for download.

I am getting a 'permission denied' for that bug. In any case, it will be difficult to check whether there is a link with what I am experiencing, since launching yast crashes the system.

The messages in #11 should be gone with 2.6.22.16-0.1 - please try that kernel.

Update from 2.6.22.13-0.3 to 2.6.22.17-0.1 done. I am still "segfault-ing" when launching yast, and only with xen kernels.

Can you make a statement regarding the 'time went backwards' messages with the new kernel? As to the seg-faulting - without you providing more detail on it (e.g. messages printed by the kernel or Xen, if any) and with the understanding that you are not using the PAE kernel flavor (for which a possibly similar problem was found) I'm afraid there's not much else we can do. Oh, perhaps your list of loaded modules might also provide some hint.

About the ntp issue: the time reported by ntp is correct now, and I don't find any such messages anymore.
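
To judge whether a new kernel really made the 'Time went backwards' messages go away, it can help to summarize them per CPU instead of eyeballing the log. The following is a minimal sketch, assuming the message layout quoted earlier in this thread. The function name `summarize_backwards` and the sample text are illustrative only, and since the log does not state the unit of `delta`, the values are reported raw:

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... kernel: clocksource/1: Time went backwards: delta=-4696602488 shadow=... offset=...
# Only the clocksource form is handled; the longer "Timer ISR" lines are skipped.
PATTERN = re.compile(
    r"clocksource/(?P<cpu>\d+): Time went backwards: delta=(?P<delta>-?\d+)"
)


def summarize_backwards(log_text):
    """Return {cpu: {"count": n, "worst_delta": most negative delta seen}}."""
    summary = defaultdict(lambda: {"count": 0, "worst_delta": 0})
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if not match:
            continue
        entry = summary[int(match.group("cpu"))]
        entry["count"] += 1
        entry["worst_delta"] = min(entry["worst_delta"], int(match.group("delta")))
    return dict(summary)


# Three lines copied from the log excerpt in this report.
SAMPLE = """\
Jan 8 10:36:28 xen1 kernel: clocksource/1: Time went backwards: delta=-4696602488 shadow=1100000067915 offset=4136832
Jan 8 10:36:55 xen1 kernel: printk: 5 messages suppressed.
Jan 8 10:36:55 xen1 kernel: clocksource/1: Time went backwards: delta=-26765288604 shadow=1104000078321 offset=712224591
"""

if __name__ == "__main__":
    for cpu, stats in sorted(summarize_backwards(SAMPLE).items()):
        print(f"cpu {cpu}: {stats['count']} messages, worst delta {stats['worst_delta']}")
```

Running the same function over /var/log/messages before and after a kernel update would give a quick before/after comparison; an empty result after the update would match the "no such messages anymore" observation.
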
About the segfault: the problem is exactly the same as I mentioned in my first posts. Under the xen and xen-pae kernels (the same happens for both), when logged in as root, as soon as I type the yast command my system instantly freezes and all I can do is a manual power-off. All I can see on the screen is the following message:

#yast sbin/yast: line 386: 4075 Erreur de segmentation $ybindir/y2base menu ncurses $NCTHREADS

(It's in French; 'erreur de segmentation' means segfault.) I can't find any relevant log about the crash, and I am open to any manipulation needed.

Output of lsmod (Linux xen1 2.6.22.17-0.1-xen #1 SMP 2008/02/10 20:01:04 UTC i686 i686 i386 GNU/Linux):

Module Size Used by
af_packet 29064 0
bridge 53528 1
netbk 78420 0 [permanent]
netloop 10752 0
blkbk 25504 0 [permanent]
blktap 118696 2 [permanent]
xenbus_be 8064 3 netbk,blkbk,blktap
iptable_filter 6912 0
ip_tables 16324 1 iptable_filter
ip6_tables 17476 0
x_tables 18308 2 ip_tables,ip6_tables
microcode 8072 0
firmware_class 13568 1 microcode
edd 12996 0
apparmor 40736 0
ext3 131848 1
jbd 68276 1 ext3
mbcache 12292 1 ext3
loop 21892 0
dm_mod 56880 0
ide_cd 40324 0
cdrom 37148 1 ide_cd
pata_serverworks 13824 0
ata_generic 11524 0
libata 139472 2 pata_serverworks,ata_generic
thermal 18440 0
processor 27808 1 thermal
button 12304 0
parport_pc 40764 0
serverworks 11400 0 [permanent]
generic 8836 0 [permanent]
8250_pnp 13568 0
shpchp 34836 0
e100 38924 0
i2c_piix4 12300 0
8250 31384 1 8250_pnp
mii 9344 1 e100
pci_hotplug 33216 1 shpchp
ide_core 123972 3 ide_cd,serverworks,generic
i2c_core 27520 1 i2c_piix4
parport 37960 1 parport_pc
serial_core 24704 1 8250
serio_raw 10756 0
rtc_cmos 12448 0
rtc_core 23304 1 rtc_cmos
sworks_agp 13984 0
agpgart 37428 1 sworks_agp
rtc_lib 7040 1 rtc_core
sg 36908 0
reiserfs 232500 1
sd_mod 30976 4
usbhid 41556 0
hid 29184 1 usbhid
ff_memless 9352 1 usbhid
aic7xxx 157732 3
scsi_transport_spi 26880 1 aic7xxx
scsi_mod 140504 5 libata,sg,sd_mod,aic7xxx,scsi_transport_spi
ohci_hcd 24068 0
usbcore 124908 3 usbhid,ohci_hcd
xenblk 20976 0
xennet 29960 0

The same with the working kernel (2.6.22.17-0.1-default):

Module Size Used by
iptable_filter 6912 0
ip_tables 16324 1 iptable_filter
ip6_tables 17476 0
x_tables 18308 2 ip_tables,ip6_tables
microcode 15372 0
firmware_class 13568 1 microcode
apparmor 40736 0
ext3 131848 1
jbd 68148 1 ext3
mbcache 12292 1 ext3
loop 21636 0
dm_mod 56880 0
e100 38156 0
parport_pc 40892 0
mii 9344 1 e100
rtc_cmos 12064 0
parport 37832 1 parport_pc
button 12560 0
sworks_agp 13344 0
shpchp 35092 0
rtc_core 23048 1 rtc_cmos
agpgart 35764 1 sworks_agp
rtc_lib 7040 1 rtc_core
serio_raw 10756 0
pci_hotplug 33216 1 shpchp
i2c_piix4 12556 0
i2c_core 27520 1 i2c_piix4
sr_mod 19492 0
sg 37036 0
cdrom 37020 1 sr_mod
usbhid 41300 0
hid 29184 1 usbhid
ff_memless 9352 1 usbhid
ohci_hcd 23684 0
sd_mod 31104 4
usbcore 124268 3 usbhid,ohci_hcd
edd 12996 0
reiserfs 233140 1
fan 9220 0
aic7xxx 157348 3
scsi_transport_spi 27008 1 aic7xxx
pata_serverworks 13824 0
libata 139216 1 pata_serverworks
scsi_mod 140376 6 sr_mod,sg,sd_mod,aic7xxx,scsi_transport_spi,libata
thermal 20872 0
processor 40876 1 thermal

I think preventing at least sworks_agp, pata_serverworks, and the two non-Xen modules not loaded in -default at all (ata_generic and generic) from loading might be a reasonable first step. For these last two modules it'd be especially interesting to know why they get loaded in -xen, but not in -default. And please be so kind as to re-attach /var/log/boot.msg for -default and -xen with the kernel version you just installed.

Hi, I've got exactly the same issue. I have reinstalled 3 times and added additional physical RAM, but yast still segfaults. I got a little further when installing a minimal system with no desktop: I could start yast in Xen, but it crashed when installing something. With either Gnome or KDE installed it crashes as soon as you type "yast" as root over an ssh connection.

/sbin/yast: line 386: 4247 Segmentation fault $ybindir/y2base menu ncurses $NCTHREADS

The server is a 32-bit Dell PowerEdge SC430 with 1.5 GB RAM running a clean install of openSUSE 10.3. I'm happy to provide more data, install anything, or possibly provide remote access; currently the box is unused.

Thanks,
Marcus

Yes, getting remote access might help, unless we're able to duplicate this in-house (which is currently being attempted). Since I wouldn't immediately have the cycles to do debugging this way, I'll get back to that offer once I know what our lab folks say on it. Of course, if you want to do some debugging of this meanwhile - what we would minimally need to get would be register state and backtrace from gdb at the point of the SEGV.

The other things to do (independently) would be to
- collect Xen *and* kernel messages over serial (to grab a possible kernel crash's printout), or (if that doesn't provide anything)
- check whether SysRq still works at the point of the hang, and if so, collect SysRq-p and SysRq-t output (again over serial), or (if that doesn't work)
- collect Xen's response to sending 'd' over the serial line (after switching input to Xen).

Created attachment 197497 [details]
xen boot.msg file
Created attachment 197498 [details]
boot.msg with default kernel
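
The comparison suggested earlier in this thread (which modules load under -xen but not under -default) is easy to get wrong by eye with lists this long. Below is a minimal sketch, assuming two saved `lsmod` outputs in the usual "Module Size Used by" layout. The names `parse_lsmod` and `diff_modules` are illustrative, and the two samples are short excerpts, not the full lists from this report:

```python
def parse_lsmod(text):
    """Return the set of module names found in `lsmod` output."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header row.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def diff_modules(xen_text, default_text):
    """Return (only_in_xen, only_in_default) as sorted lists of names."""
    xen = parse_lsmod(xen_text)
    default = parse_lsmod(default_text)
    return sorted(xen - default), sorted(default - xen)


# Excerpts only; paste the complete lsmod outputs for a real comparison.
XEN_SAMPLE = """\
Module Size Used by
ata_generic 11524 0
libata 139472 2 pata_serverworks,ata_generic
generic 8836 0 [permanent]
e100 38924 0
"""

DEFAULT_SAMPLE = """\
Module Size Used by
e100 38156 0
fan 9220 0
libata 139216 1 pata_serverworks
"""

if __name__ == "__main__":
    only_xen, only_default = diff_modules(XEN_SAMPLE, DEFAULT_SAMPLE)
    print("only in -xen:    ", ", ".join(only_xen))
    print("only in -default:", ", ".join(only_default))
```

On the full lists, the -xen side will also include the Xen backend and frontend drivers themselves (netbk, blkbk, xenblk, and so on), so the interesting entries are the remaining hardware drivers.
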
- I've blacklisted or rmmod'ed numerous modules, the ones you mentioned and a few more (xenblk and xennet), and it is still crashing. There are some I can't remove yet (generic is always reported as "busy").
- I also prevented the xend, xendomains and avahi processes from launching, without any more success.
- It took me about 20 tries with crashes and reboots to do all this. 3 times among these, yast launched fine (I succeeded in editing runlevels and searching for online updates). I am not able to reproduce what makes it work sometimes; it seems to be random behavior.
- I've uploaded the 2 boot.msg files for the current kernel.
- About redirecting kernel messages to a console: why not, but I have no idea how to do so.

PS: regarding Marcus's message, my own server is a 32-bit Dell PowerEdge 2500.

>- About console redirecting of kernel messages, why not, but i have no idea how
>to do so.

This requires collecting messages over serial, and the 'xencons=xvc' (or 'xencons=ttyS') kernel (not Xen) boot option. Also, the other information requested in the last paragraph of comment #23 would also apply to your system; without getting an understanding of the kind of crash/hang I don't think there's much we can do.

Closing NORESPONSE, due to missing information for more than 21 days. Please feel free to reopen and provide the requested information.
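
For anyone reopening this with the gdb data requested earlier in the thread (register state and backtrace at the point of the SEGV), a wrapper along these lines may help. It is only a sketch under stated assumptions: `Y2BASE` is an assumed location for the `y2base` binary (the report only shows `$ybindir/y2base`), `build_gdb_command` and `capture_crash` are hypothetical helpers, and the `$NCTHREADS` argument is left out because its value is not shown in the report. It relies on gdb's `-batch`, `-ex`, and `--args` options:

```python
import subprocess

# ASSUMPTION: the report only shows "$ybindir/y2base"; adjust to the real path.
Y2BASE = "/usr/lib/YaST2/bin/y2base"


def build_gdb_command(binary, args):
    """Build a gdb batch invocation that runs the program, then prints a
    backtrace and the register state once it stops (e.g. on SIGSEGV)."""
    return [
        "gdb", "-batch",
        "-ex", "run",
        "-ex", "bt",
        "-ex", "info registers",
        "--args", binary, *args,
    ]


def capture_crash(binary, args, log_path):
    """Run the program under gdb and write everything gdb prints to log_path."""
    with open(log_path, "w") as log:
        return subprocess.call(
            build_gdb_command(binary, args),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
```

A call such as `capture_crash(Y2BASE, ["menu", "ncurses"], "y2base-gdb.log")` would then be run as root under the -xen kernel. Two caveats: with output redirected, the ncurses menu will not be visible, and because the machine reportedly freezes, the log may never reach the local disk, so writing it to a remote location or capturing it over the serial line is probably safer.
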