Bug 548010 - XEN locks the machine during boot
Summary: XEN locks the machine during boot
Status: RESOLVED DUPLICATE of bug 552492
Alias: None
Product: openSUSE 11.2
Classification: openSUSE
Component: Xen
Version: RC 1
Hardware: x86-64 openSUSE 11.2
Importance: P2 - High : Critical
Target Milestone: ---
Assignee: Jason Douglas
QA Contact: E-mail List
URL:
Whiteboard: maint:released:11.2:29469
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-18 19:29 UTC by Birger Kollstrand
Modified: 2018-07-03 20:17 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
HW information (127.18 KB, application/x-gzip)
2009-10-18 19:29 UTC, Birger Kollstrand
XEN logs (1.56 KB, application/x-gzip)
2009-10-18 19:30 UTC, Birger Kollstrand
boot.omsg (11.36 KB, application/x-gzip)
2009-10-19 20:36 UTC, Birger Kollstrand
dmesg.txt (43.17 KB, text/plain)
2009-10-20 20:35 UTC, Birger Kollstrand
xmdmesg.txt (10.41 KB, text/plain)
2009-10-20 20:35 UTC, Birger Kollstrand
Xorg.0.log (57.20 KB, text/plain)
2009-10-20 20:37 UTC, Birger Kollstrand
boot.msg (42.73 KB, text/plain)
2009-10-20 20:38 UTC, Birger Kollstrand
/Var/log/messages from UI crash point (1.24 KB, text/plain)
2009-10-23 21:15 UTC, Birger Kollstrand
Xorg log from crashpoint. (10.55 KB, text/plain)
2009-10-23 21:19 UTC, Birger Kollstrand
xm dmesg (10.41 KB, text/plain)
2009-10-29 22:44 UTC, Birger Kollstrand
The requested boot.msg (50.79 KB, text/plain)
2009-11-04 18:19 UTC, Birger Kollstrand

Description Birger Kollstrand 2009-10-18 19:29:04 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; nb-NO; rv:1.9.1.3) Gecko/20090909 SUSE/3.5.3-3.2 Firefox/3.5.3

Installed XEN via Yast.
- booted in to the XEN kernel 
- the machine locks up during the boot.
- Desktop boot is working

Reproducible: Always

Steps to Reproduce:
1. Installed XEN via Yast
2. Reboot in to the XEN kernel
3. Machine locks up
Actual Results:  
XEN crashes the machine

Expected Results:  
Boot up with XEN kernel
Comment 1 Birger Kollstrand 2009-10-18 19:29:53 UTC
Created attachment 322997 [details]
HW information
Comment 2 Birger Kollstrand 2009-10-18 19:30:45 UTC
Created attachment 322998 [details]
XEN logs
Comment 3 Charles Arnold 2009-10-19 16:50:16 UTC
We will need to capture what is happening at boot time. Please edit the /boot/grub/menu.lst file and add the following to the xen.gz line.

kernel /boot/xen.gz loglvl=all guest_loglvl=all

When the Xen kernel fails to boot, reboot into the native kernel and attach /var/log/boot.omsg

This may not provide enough information, in which case you will need to attach a serial cable from this machine to another to capture the output. To do this, your menu.lst entry should look something like the example below:

###Don't change this comment - YaST2 identifier: Original name: xen###
title XEN
    root (hd0,1)
    kernel /boot/xen.gz loglvl=all guest_loglvl=all console=com1 com1=115200,8n1
    module /boot/vmlinuz-xen root=/dev/sda2 vga=0x31a resume=/dev/sda1  splash=silent showopts console=ttyS0,115200
    module /boot/initrd-xen


You can use a utility like minicom on the other machine to capture the output.
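
For anyone who prefers to script the menu.lst change rather than edit it by hand, the following is a minimal Python sketch, not an official tool. It assumes the GRUB legacy layout shown above, where the hypervisor is the `kernel ... xen.gz` line and the dom0 kernel and initrd are `module` lines. The function name `add_xen_options` and the sample entry are hypothetical and exist only for illustration.

```python
def add_xen_options(menu_lst, options):
    """Append any missing options to each 'kernel .../xen.gz' line.

    Assumption: GRUB legacy syntax, one directive per line.
    'module' lines (dom0 kernel, initrd) are left untouched.
    """
    out = []
    for line in menu_lst.splitlines():
        tokens = line.split()
        is_xen_line = (
            len(tokens) >= 2
            and tokens[0] == "kernel"
            and tokens[1].endswith("xen.gz")
        )
        if is_xen_line:
            # Skip options that are already present so the edit is repeatable.
            missing = [opt for opt in options if opt not in tokens[2:]]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out)


# Hypothetical sample entry, modelled on the example above.
SAMPLE_ENTRY = """title XEN
    root (hd0,1)
    kernel /boot/xen.gz
    module /boot/vmlinuz-xen root=/dev/sda2
    module /boot/initrd-xen"""

print(add_xen_options(SAMPLE_ENTRY, ["loglvl=all", "guest_loglvl=all"]))
```

The sketch only transforms text. Writing the result back to /boot/grub/menu.lst, and keeping a backup of the original file, is left to the reader.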
Comment 4 Birger Kollstrand 2009-10-19 20:36:23 UTC
Created attachment 323119 [details]
boot.omsg

The boot now behaved slightly differently.
I ended up with a garbled screen as earlier, but I could do ctrl-alt-F1 and get a console. I logged in and out and switched back to the gfx view to try to restart X.
Now it got stuck again.

This is the menu.lst:
title Xen -- openSUSE 11.2 RC 1 - 2.6.31.3-1
    root (hd1,0)
    kernel /boot/xen.gz loglvl=all guest_loglvl=all
    module /boot/vmlinuz-2.6.31.3-1-xen root=/dev/md0 splash=silent quiet showopts
    module /boot/initrd-2.6.31.3-1-xen
Comment 5 Charles Arnold 2009-10-19 22:55:24 UTC
(In reply to comment #4)
> The boot now behaved slightly differently.
> I ended up with a garbled screen as earlier, but I could do ctrl-alt-F1 and
> get a console. 

If you can get to the command line console using ctrl-alt-F1, attach the boot.msg and the Xorg.0.log file and also the output of 'dmesg' and 'xm dmesg'.  It sounds like you may be experiencing a graphics driver problem.
Comment 6 Jan Beulich 2009-10-20 07:24:24 UTC
The posted boot.omsg doesn't provide any helpful information. In addition to what Charles asked for - in case the important part of the information didn't make it to the kernel log, please also check the kernel console (Ctrl-Alt-F10) for any extra messages, and if there are any that aren't in either of the logs, take a snapshot.

(Also, Re #4: Please attach individual small or medium size text only files as plain text rather than gzipped or even tarred up blobs, as that's easier to look at.)

Btw., if it's just the GUI that doesn't come up, I wouldn't view this as a blocker. Please clarify.
Comment 7 Birger Kollstrand 2009-10-20 11:34:00 UTC
Clarification: It was not only that the GUI did not come up. It locked completely. No SSH access and no switching to terminals.
I'll add the requested data as soon as possible, but I am working late this week so it is hard to estimate when I can address it.
Comment 8 Birger Kollstrand 2009-10-20 20:35:15 UTC
Created attachment 323335 [details]
dmesg.txt
Comment 9 Birger Kollstrand 2009-10-20 20:35:55 UTC
Created attachment 323337 [details]
xmdmesg.txt
Comment 10 Birger Kollstrand 2009-10-20 20:37:22 UTC
Created attachment 323338 [details]
Xorg.0.log
Comment 11 Birger Kollstrand 2009-10-20 20:38:52 UTC
Created attachment 323339 [details]
boot.msg
Comment 12 Birger Kollstrand 2009-10-20 20:44:36 UTC
As long as I only go to console and do not try to return to the desktop, then it does not seem to lock.

So it seems like this is X in connection with Xen, as there is no problem while running the stock kernel.

Please categorize as you find sensible. I downgraded it to Critical because it does crash the system, but probably only in limited configurations. (This definitely stops me testing in this area, which is Xen management with Yast, but I have enough other areas to work on :-) )
Comment 13 Birger Kollstrand 2009-10-20 20:45:23 UTC
Crap, forgot the NEEDINFO tick.....
Comment 14 Jan Beulich 2009-10-21 09:27:03 UTC
Sorry, but the logs you posted still don't show any (severe) problem. I therefore assume they were taken before the machine crashed/hung, but we need the logs from after the hang/crash occurred; as that information may not get written to disk, you may have no choice other than attaching a serial cable to log Xen and kernel messages that way.

Btw., did you try to configure X without using acceleration (in which case drm kernel modules should not get loaded)? That may get you around the problem, if you need a workaround...
Comment 15 Birger Kollstrand 2009-10-22 17:15:58 UTC
I'll try the X way first :-)

"configure X without using acceleration"

How do I do that without any /etc/X11/xorg.conf file? Is that not one of the good new features?
Comment 16 Birger Kollstrand 2009-10-23 19:32:40 UTC
I found out how to get the X stuff done. Nice help on IRC :-)

Yes, I'm now running ok with that xorg.conf file.
Comment 17 Birger Kollstrand 2009-10-23 21:15:23 UTC
Created attachment 323981 [details]
/Var/log/messages from UI crash point
Comment 18 Birger Kollstrand 2009-10-23 21:19:44 UTC
Created attachment 323982 [details]
Xorg log from crashpoint.

I discovered that I do have ssh access to the machine after it crashes. Only keyboard, mouse and monitor are locked.
This is the procedure I did to get the logs:
1. booted with the XEN kernel
2. switched to ctrl-alt-F1
3. started logging via SSH from another machine on /var/log/messages and /var/log/Xorg.0.log
4. switched back with ctrl-alt-F7.
5. monitor, mouse and keyboard locked immediately. I noticed in the attached logging that at the same time the machine did this:
Oct 23 23:00:07 corot kernel: [  423.179200] [drm] Loading RS780/RS880 CP Microcode
Oct 23 23:00:07 corot kernel: [  423.179316] [drm] Loading RS780/RS880 PFP Microcode
Oct 23 23:00:07 corot kernel: [  423.194235] [drm] Resetting GPU
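
To pull only the graphics-related kernel lines out of a larger /var/log/messages capture like the one above, a small filter can help. This is a rough Python sketch that assumes the syslog line format shown in this comment (`... kernel: [ seconds] [drm] message`). The name `drm_events` is made up for illustration.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   Oct 23 23:00:07 corot kernel: [  423.194235] [drm] Resetting GPU
DRM_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] \[drm\] (.*)$")


def drm_events(log_text):
    """Return (seconds_since_boot, message) for every [drm] kernel line."""
    events = []
    for line in log_text.splitlines():
        match = DRM_LINE.search(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events
```

Running it over the log captured via SSH while switching back with ctrl-alt-F7 should list the DRM messages in order, which makes it easier to line them up with the moment the display locked.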
Comment 19 Jan Beulich 2009-10-26 14:28:53 UTC
May I ask that you provide the information asked for prior to clearing the needinfo state? There's still no indication of a crash in the log you provided, and there's also still no indication that you obtained the log via serial.
Comment 20 Birger Kollstrand 2009-10-26 18:46:02 UTC
Of course, you may indeed ask.

However, I cannot see what would be output on the serial line that is not already in the logs obtained via ssh. The kernel is still operating after the screen/mouse/keyboard stops working.

This will definitely take some time as the motherboard does not have a serial port. I will check to see if there is a pinout that can be used.
Comment 21 Jan Beulich 2009-10-27 08:38:06 UTC
Oh, I see - the different entries of yours contradict one another regarding how much of the machine doesn't work anymore. With just the human interfaces locked I agree that you're not to expect more output over serial, except for eventual Xen (hypervisor) messages which you didn't also provide. So after the local access to the machine stopped working, we'd need besides the /var/log/messages fragment also the respective "xm dmesg" output, with "loglvl=all guest_loglvl=all" added to the Xen command line in /boot/grub/menu.lst.

Additionally, with the machine having 8 GB of RAM, it would be worth trying to add "mem=4G" to the Xen command line to possibly circumvent the problem.

Finally, would you mind clarifying whether the kernel messages referred to in #18 also appear when starting the GUI on the native kernel?
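
Because an option such as "mem=4G" only reaches the hypervisor when it sits on the xen.gz line, it can be worth checking where it actually ended up in menu.lst. The Python sketch below is one way to do that, assuming the GRUB legacy layout used elsewhere in this report (`kernel .../xen.gz` for Xen, `module .../vmlinuz-...` for the dom0 kernel). The function name `find_option` and the returned labels are hypothetical.

```python
def find_option(menu_lst, option):
    """Report where `option` appears in a menu.lst text.

    Returns a list with one label per matching line:
    'xen' for the hypervisor line, 'dom0-kernel' for the
    Linux kernel module line, 'other' for anything else.
    """
    placements = []
    for line in menu_lst.splitlines():
        tokens = line.split()
        # tokens[0] is the directive, tokens[1] the image path.
        if option not in tokens[2:]:
            continue
        if tokens[0] == "kernel" and tokens[1].endswith("xen.gz"):
            placements.append("xen")
        elif tokens[0] == "module" and "vmlinuz" in tokens[1]:
            placements.append("dom0-kernel")
        else:
            placements.append("other")
    return placements
```

A result of `['xen']` would mean the option is on the Xen command line as intended. `['dom0-kernel']` would mean it was handed to the Linux kernel instead, which is a different setting with a different effect.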
Comment 22 Birger Kollstrand 2009-10-29 22:44:45 UTC
Created attachment 324781 [details]
xm dmesg

Regarding #18: It is the same with the standard kernel.

Regarding using mem=4G: No difference in behaviour.

"xm dmesg" output added.

I'm planning on upgrading to RC2 tomorrow. I'll post if there is any change.
Comment 23 Jan Beulich 2009-10-30 12:03:07 UTC
With that I think we'll need to have a way to reproduce this internally in order to find ways to analyze/debug the problem.
Comment 25 Jan Beulich 2009-11-02 16:07:37 UTC
Along with reporting the results of using RC2, could you please also get us a native kernel's boot logs?
Comment 26 Birger Kollstrand 2009-11-04 18:14:21 UTC
Same problem with RC2.

But....
How did I not see this before:
top - 18:55:44 up 5 min,  2 users,  load average: 0.78, 0.27, 0.12
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us, 33.1%sy,  0.0%ni, 66.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.1%st
Mem:   7891800k total,   489296k used,  7402504k free,    16084k buffers
Swap:  4192944k total,        0k used,  4192944k free,   110588k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4137 root      20   0  649m  32m 6980 R  100  0.4   0:22.68 Xorg
 4447 root      20   0  8532 1156  848 R    0  0.0   0:00.09 top
    1 root      20   0  8072  744  624 S    0  0.0   0:00.23 init
    2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.00 migration/0
    4 root      15  -5     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    5 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0

100% on Xorg......
Then I see heaps of these in /var/log/messages after killing Xorg brutally with kill -9.
Nov  4 18:58:57 linux-889s kernel: [  547.139514] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov  4 18:58:57 linux-889s kernel: [  547.280594] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov  4 18:58:57 linux-889s kernel: [  547.423976] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov  4 18:58:57 linux-889s kernel: [  547.563065] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov  4 18:58:57 linux-889s kernel: [  547.706390] [drm] wait idle failed status : 0xA0003030 0x00000003
Nov  4 18:58:58 linux-889s kernel: [  547.846369] [drm] wait idle failed status : 0xA0003030 0x00000003

And xorg.log:
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!
(WW) RADEONHD(0): DRMCPIdle: DRM CP IDLE returned BUSY!


Can this help in any way?
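
To put a number on "heaps of these", repeated kernel messages can be grouped and counted. The following Python sketch assumes the same syslog format as the excerpt above. `summarize_repeats` is an illustrative name, not part of any existing tool.

```python
import re

# Assumed line shape, as in the /var/log/messages excerpt above:
#   Nov  4 18:58:57 linux-889s kernel: [  547.139514] [drm] wait idle failed ...
STAMPED = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (.*)$")


def summarize_repeats(log_text):
    """Map each kernel message to (count, first_seen, last_seen).

    Times are seconds since boot, read from the bracketed kernel timestamp.
    """
    summary = {}
    for line in log_text.splitlines():
        match = STAMPED.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        message = match.group(2)
        count, first, _ = summary.get(message, (0, stamp, stamp))
        summary[message] = (count + 1, first, stamp)
    return summary
```

For the six lines quoted above, the timestamps run from 547.139514 to 547.846369, so the message repeats roughly every 0.14 seconds.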
Comment 27 Birger Kollstrand 2009-11-04 18:19:32 UTC
Created attachment 325555 [details]
The requested boot.msg
Comment 28 Jan Beulich 2009-11-05 09:34:21 UTC
(In reply to comment #22)
> Created an attachment (id=324781) [details]
> xm dmesg
> 
> Regarding #18: It is the same with the standard kernel.
> 
> Regarding using mem=4G: No difference in behaviour.
> 
> "xm dmesg" output added.

"xm dmesg" output provided here does not relate to the "mem=4G" test, and since bug 552492 indicates that this option indeed fixes a similar issue reported there I'd like you to clarify that you indeed passed this option to Xen (and not by mistake to the kernel).
Comment 29 Birger Kollstrand 2009-11-05 21:59:55 UTC
My mistake. You were absolutely correct. X now works with the mem=4G.
552492
It might be an idea to provide more than just the parameter to bug reporters :-)
I'm not a XEN expert, just testing the Yast GUI so this was a bit over my head.

I'm just putting this as a duplicate of 552492.

Please contact me if you need any help in testing a patch.

*** This bug has been marked as a duplicate of bug 552492 ***
Comment 30 Swamp Workflow Management 2010-01-04 10:52:42 UTC
Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop
Products:
openSUSE 11.2 (debug, i586, x86_64)