|
Bugzilla – Full Text Bug Listing |
| Summary: | XEN locks the machine during boot | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.2 | Reporter: | Birger Kollstrand <birger.kollstrand> |
| Component: | Xen | Assignee: | Jason Douglas <jdouglas> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P2 - High | CC: | carnold, jbeulich |
| Version: | RC 1 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.2 | ||
| Whiteboard: | maint:released:11.2:29469 | ||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | HW information; XEN logs; boot.omsg; dmesg.txt; xmdmesg.txt; Xorg.0.log; boot.msg; /Var/log/messages from UI crash point; Xorg log from crashpoint.; xm dmesg; The requested boot.msg |
||
|
Description
Birger Kollstrand
2009-10-18 19:29:04 UTC
Created attachment 322997 [details]
HW information
Created attachment 322998 [details]
XEN logs
We will need to capture what is happening at boot time. Please edit the /boot/grub/menu.lst file and add the following to the xen.gz line.
kernel /boot/xen.gz loglvl=all guest_loglvl=all
When the Xen kernel fails to boot, reboot into the native kernel and attach /var/log/boot.omsg
This may not provide enough information, in which case you will need to attach a serial cable from this machine to another to capture the output. To do this, your menu.lst entry should look something like the example below:
###Don't change this comment - YaST2 identifier: Original name: xen###
title XEN
root (hd0,1)
kernel /boot/xen.gz loglvl=all guest_loglvl=all console=com1 com1=115200,8n1
module /boot/vmlinuz-xen root=/dev/sda2 vga=0x31a resume=/dev/sda1 splash=silent showopts console=ttyS0,115200
module /boot/initrd-xen
You can use a utility like minicom on the other machine to capture the output.
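The menu.lst edit described above can be sketched in Python. This is only an illustration: the helper name `add_xen_options` and the sample entry are hypothetical, and the sketch assumes the hypervisor line is the `kernel` line whose image path ends in `xen.gz`, as in the example entry.

```python
def add_xen_options(menu_text, options):
    """Append options to each 'kernel ...xen.gz' line unless already present."""
    result = []
    for line in menu_text.splitlines():
        tokens = line.split()
        # Only the hypervisor line is touched; 'module' lines (dom0 kernel
        # and initrd) are left exactly as they are.
        if len(tokens) >= 2 and tokens[0] == "kernel" and tokens[1].endswith("xen.gz"):
            missing = [opt for opt in options if opt not in tokens[2:]]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        result.append(line)
    return "\n".join(result)


# Hypothetical sample entry, modelled on the example above.
sample = """title XEN
root (hd0,1)
kernel /boot/xen.gz
module /boot/vmlinuz-xen root=/dev/sda2
module /boot/initrd-xen"""

print(add_xen_options(sample, ["loglvl=all", "guest_loglvl=all"]))
```

Running the helper a second time on its own output leaves the text unchanged, so it does not duplicate options that are already on the line.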
Created attachment 323119 [details]
boot.omsg
The boot now behaved slightly differently.
I ended up with a garbled screen as earlier, but I could press Ctrl-Alt-F1 and get a console. I logged in and out and switched back to the graphical view to try to restart X.
Now it got stuck again.
This is the menu.lst:
title Xen -- openSUSE 11.2 RC 1 - 2.6.31.3-1
root (hd1,0)
kernel /boot/xen.gz loglvl=all guest_loglvl=all
module /boot/vmlinuz-2.6.31.3-1-xen root=/dev/md0 splash=silent quiet showopts
module /boot/initrd-2.6.31.3-1-xen
(In reply to comment #4)
> The boot now behaved slightly differently.
> I ended up with a garbled screen as earlier, but I could press Ctrl-Alt-F1 and
> get a console.
If you can get to the command line console using Ctrl-Alt-F1, attach the boot.msg and the Xorg.0.log file and also the output of 'dmesg' and 'xm dmesg'. It sounds like you may be experiencing a graphics driver problem.
The posted boot.omsg doesn't provide any helpful information. In addition to what Charles asked for - in case the important part of the information didn't make it to the kernel log, please also check the kernel console (Ctrl-Alt-F10) for any extra messages, and if there are any that aren't in either of the logs, take a snapshot. (Also, Re #4: Please attach individual small or medium size text only files as plain text rather than gzipped or even tarred up blobs, as that's easier to look at.) Btw., if it's just the GUI that doesn't come up, I wouldn't view this as a blocker. Please clarify.
Clarification: It was not only that the GUI did not come up. It locked completely. No SSH access and no switching to terminals.
I'll add the requested data as soon as possible, but I am working late this week so it is hard to estimate when I can address it.
Created attachment 323335 [details]
dmesg.txt
Created attachment 323337 [details]
xmdmesg.txt
Created attachment 323338 [details]
Xorg.0.log
Created attachment 323339 [details]
boot.msg
As long as I only go to the console and do not try to return to the desktop, it does not seem to lock. So this seems to be X in connection with Xen, as there is no problem while running the stock kernel. Please categorize as you find sensible. I downgraded it to Critical because it does crash the system, but probably only in limited configurations. (This definitely stops me testing in this area, which is Xen management with YaST, but I have enough other areas to work on :-) )
Crap, forgot the NEEDINFO tick.....
Sorry, but the logs you posted still don't show any (severe) problem. I therefore assume they were taken before the machine crashed/hung, but we need the logs from after the hang/crash occurred; as that information may not get written to disk, you may have no choice other than attaching a serial cable to log Xen and kernel messages that way. Btw., did you try to configure X without using acceleration (in which case drm kernel modules should not get loaded)? That may get you around the problem, if you need a workaround...
I'll try the X way first :-)
"configure X without using acceleration"
How do I do that without any /etc/X11/xorg.conf file? Isn't that one of the good new features?
I found out how to get the X stuff done. Nice help on IRC :-)
Yes, I'm now running OK with that xorg.conf file.
Created attachment 323981 [details]
/Var/log/messages from UI crash point
Created attachment 323982 [details]
Xorg log from crashpoint.
I discovered that I do have SSH access to the machine after it crashes. Only the keyboard, mouse and monitor are locked.
This is the procedure I followed to get the logs:
1. booted with the XEN kernel
2. switched to a console with Ctrl-Alt-F1
3. started logging /var/log/messages and /var/log/Xorg.0.log via SSH from another machine
4. switched back with Ctrl-Alt-F7.
5. monitor, mouse and keyboard locked immediately. I noticed in the attached logging that at the same time the machine did this:
Oct 23 23:00:07 corot kernel: [ 423.179200] [drm] Loading RS780/RS880 CP Microcode
Oct 23 23:00:07 corot kernel: [ 423.179316] [drm] Loading RS780/RS880 PFP Microcode
Oct 23 23:00:07 corot kernel: [ 423.194235] [drm] Resetting GPU
May I ask that you provide the information asked for prior to clearing the needinfo state? There's still no indication of a crash in the log you provided, and there's also still no indication that you obtained the log via serial.
Of course, you may indeed ask. Although I cannot see what would be output on the serial line that is not in the logs obtained via SSH? The kernel is still operating after the screen/mouse/keyboard stops working. This will definitely take some time as the motherboard does not have a serial port. I will check to see if there is a pinout that can be used.
Oh, I see - the different entries of yours contradict one another regarding how much of the machine doesn't work anymore. With just the human interfaces locked I agree that you're not to expect more output over serial, except for possible Xen (hypervisor) messages, which you didn't provide either. So after the local access to the machine stopped working, we'd need besides the /var/log/messages fragment also the respective "xm dmesg" output, with "loglvl=all guest_loglvl=all" added to the Xen command line in /boot/grub/menu.lst. Additionally, with the machine having 8 GB of RAM, it would be worth trying to add "mem=4G" to the Xen command line to possibly circumvent the problem. Finally, would you mind clarifying whether the kernel messages referred to in #18 also appear when starting the GUI on the native kernel?
Created attachment 324781 [details]
xm dmesg
Regarding #18: It is the same with the standard kernel.
Regarding using mem=4G: No difference in behaviour.
"xm dmesg" output added.
I'm planning on upgrading to RC2 tomorrow. I'll post if there is any change.
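Since "mem=4G" is meant for the Xen command line (the xen.gz line) and not for the dom0 kernel line, it can be worth double-checking where the option actually ended up. A minimal sketch, assuming a menu.lst layout like the ones shown in this report; the helper name `find_option` and both sample entries are hypothetical and are not the reporter's actual file.

```python
def find_option(menu_text, prefix):
    """Return (directive, image) for each boot line carrying an option that starts with prefix."""
    places = []
    for line in menu_text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] not in ("kernel", "module"):
            continue
        # tokens[1] is the image path; everything after it is an option.
        if any(tok.startswith(prefix) for tok in tokens[2:]):
            places.append((tokens[0], tokens[1]))
    return places


# Hypothetical entry with mem=4G on the dom0 kernel (module) line.
on_module = """title Xen -- openSUSE 11.2 RC 1 - 2.6.31.3-1
root (hd1,0)
kernel /boot/xen.gz loglvl=all guest_loglvl=all
module /boot/vmlinuz-2.6.31.3-1-xen root=/dev/md0 mem=4G splash=silent quiet showopts
module /boot/initrd-2.6.31.3-1-xen"""

# The same entry with mem=4G moved to the hypervisor (xen.gz) line.
on_xen = on_module.replace(" mem=4G", "").replace(
    "guest_loglvl=all", "guest_loglvl=all mem=4G")

print(find_option(on_module, "mem="))
print(find_option(on_xen, "mem="))
```

If the reported image is the vmlinuz module rather than /boot/xen.gz, the option went to the kernel and not to Xen.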
With that I think we'll need to have a way to reproduce this internally in order to find ways to analyze/debug the problem. Along with reporting the results of using RC2, could you please also get us a native kernel's boot logs?
Same problem with RC2.
But....
How did I not see this before:
top - 18:55:44 up 5 min, 2 users, load average: 0.78, 0.27, 0.12
Tasks: 115 total, 2 running, 113 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 33.1%sy, 0.0%ni, 66.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 7891800k total, 489296k used, 7402504k free, 16084k buffers
Swap: 4192944k total, 0k used, 4192944k free, 110588k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4137 root 20 0 649m 32m 6980 R 100 0.4 0:22.68 Xorg
4447 root 20 0 8532 1156 848 R 0 0.0 0:00.09 top
1 root 20 0 8072 744 624 S 0 0.0 0:00.23 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
4 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/0
100% on Xorg......
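Spotting a runaway process like this can also be done mechanically. The following is a rough sketch, assuming the default top column order shown above (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); the function name `busy_processes` and the threshold are hypothetical choices.

```python
def busy_processes(top_lines, threshold=90.0):
    """Return (pid, command, cpu) for process rows whose %CPU is at or above threshold."""
    busy = []
    for line in top_lines:
        fields = line.split()
        # Process rows have at least 12 columns and start with a numeric PID;
        # the summary and header lines of top do not, so they are skipped.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        cpu = float(fields[8])  # %CPU is the ninth column
        if cpu >= threshold:
            busy.append((int(fields[0]), " ".join(fields[11:]), cpu))
    return busy


rows = [
    "Tasks: 115 total, 2 running, 113 sleeping, 0 stopped, 0 zombie",
    "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND",
    "4137 root 20 0 649m 32m 6980 R 100 0.4 0:22.68 Xorg",
    "4447 root 20 0 8532 1156 848 R 0 0.0 0:00.09 top",
]

print(busy_processes(rows))
```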
Then I see heaps of these in /var/log/messages after killing Xorg brutally with kill -9.
Nov 4 18:58:57 linux-889s kernel: [ 547.139514] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov 4 18:58:57 linux-889s kernel: [ 547.280594] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov 4 18:58:57 linux-889s kernel: [ 547.423976] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov 4 18:58:57 linux-889s kernel: [ 547.563065] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov 4 18:58:57 linux-889s kernel: [ 547.706390] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov 4 18:58:58 linux-889s kernel: [ 547.846369] [drm] wait idle failed status : 0xA0003030 0x00000003
And xorg.log:
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
Can this in any way help?
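To quantify how often such messages repeat, the syslog prefix and kernel timestamp can be stripped and the remaining text counted. This is a minimal sketch, assuming the line layout seen in the /var/log/messages excerpts above; the function name `count_kernel_messages` is hypothetical.

```python
import re
from collections import Counter

# Matches everything up to and including the kernel timestamp, e.g.
# "Nov 4 18:58:57 linux-889s kernel: [ 547.139514] "
PREFIX = re.compile(r"^.*? kernel: \[\s*\d+\.\d+\]\s*")


def count_kernel_messages(lines):
    """Count kernel messages by their text, ignoring date, host and timestamp."""
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line)
        if match:
            counts[line[match.end():].strip()] += 1
    return counts


log = [
    "Nov 4 18:58:57 linux-889s kernel: [ 547.139514] [drm] wait idle failed status : 0xA0003030 0x00000003",
    "Nov 4 18:58:57 linux-889s kernel: [ 547.280594] [drm] wait idle failed status : 0xA0003030 0x00000003",
    "Nov 4 18:58:58 linux-889s kernel: [ 547.846369] [drm] wait idle failed status : 0xA0003030 0x00000003",
]

for message, count in count_kernel_messages(log).most_common():
    print(count, message)
```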
Created attachment 325555 [details]
The requested boot.msg
(In reply to comment #22)
> Created an attachment (id=324781) [details]
> xm dmesg
>
> Regarding #18: It is the same with the standard kernel.
>
> Regarding using mem=4G: No difference in behaviour.
>
> "xm dmesg" output added.
"xm dmesg" output provided here does not relate to the "mem=4G" test, and since bug 552492 indicates that this option indeed fixes a similar issue reported there I'd like you to clarify that you indeed passed this option to Xen (and not by mistake to the kernel).
My mistake. You were absolutely correct. X now works with the mem=4G.
552492
It might be an idea to provide more than the parameter to bug reporters :-) I'm not a Xen expert, just testing the YaST GUI, so this was a bit over my head.
I'm just putting this as a duplicate of 552492. Please contact me if you need any help in testing a patch.
*** This bug has been marked as a duplicate of bug 552492 ***
Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop
Products: openSUSE 11.2 (debug, i586, x86_64) |