Bug 576681 - Installation hangs at Loading basic drivers
Summary: Installation hangs at Loading basic drivers
Status: VERIFIED FIXED
Duplicates: 566419 574463 576626
Alias: None
Product: openSUSE 11.3
Classification: openSUSE
Component: Installation
Version: Milestone 3
Hardware: i586 openSUSE 11.3
Importance: P2 - High : Critical with 1 vote
Target Milestone: ---
Assignee: Jiri Bohac
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-03 20:06 UTC by Forgotten User vs5edErKRK
Modified: 2016-04-15 10:45 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Screenshot of the hang, showing 2 cycles of the repeat (257.18 KB, image/jpeg)
2010-02-05 11:59 UTC, Paul Hands
Configuration of VirtualBox (4.33 KB, application/xml)
2010-03-13 12:11 UTC, Vojtech Zeisek
Log of VirtualBox (52.18 KB, text/plain)
2010-03-13 12:12 UTC, Vojtech Zeisek
Xorg.0.log file from OpenSUSE 11.3 MS3 / running in VirtualBox 3.1.4 (13.72 KB, text/x-log)
2010-03-19 16:07 UTC, Vadim Plessky
dmesg output from OpenSUSE 11.3 booted in VirtualBox 3.1.4 (24.62 KB, text/plain)
2010-03-19 16:15 UTC, Vadim Plessky
logs from tty4 from NET-i586-Build0518 (87.62 KB, image/png)
2010-03-24 14:40 UTC, Michal Seben
logs from serial console from NET-i586-Build0518 (35.02 KB, text/plain)
2010-03-24 14:41 UTC, Michal Seben
Serial console output from NET-i586-Build0515 on VirtualBox 3.1.4 (33.41 KB, text/x-log)
2010-03-24 16:05 UTC, Larry Finger
screenshot (175.47 KB, image/png)
2010-03-27 11:41 UTC, Rastislav Krupansky
debugging patch (2.95 KB, patch)
2010-03-31 22:54 UTC, Jiri Bohac
Serial console log from attempting to boot build0518-pcpu-debug-1.iso (146.51 KB, text/plain)
2010-04-01 04:38 UTC, Larry Finger
Serial console log from attempting to boot build0518-pcpu-debug-1.iso on virtualbox-ose (158.08 KB, text/plain)
2010-04-01 07:49 UTC, Michal Seben
Serial console log from successful boot of Build0518-pcpu-debug-1.iso (152.01 KB, text/plain)
2010-04-01 20:01 UTC, Larry Finger

Description Forgotten User vs5edErKRK 2010-02-03 20:06:24 UTC
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; sv-SE; rv:1.9.2) Gecko/20100115 Firefox/3.6 (.NET CLR 3.5.30729)

At installation under VirtualBox the processor goes up to 100% CPU and installation halts at Loading basic drivers.

Reproducible: Always

Steps to Reproduce:
1. Boot from DVD
2. Choose installation
3. Press Esc after the kernel is loaded.
Actual Results:  
Installation halts

Expected Results:  
Installation should have continued
Comment 1 Steffen Winterfeldt 2010-02-04 10:50:36 UTC
probably kernel issue
Comment 2 Steffen Winterfeldt 2010-02-04 10:52:40 UTC
*** Bug 576626 has been marked as a duplicate of this bug. ***
Comment 3 Steffen Winterfeldt 2010-02-04 11:03:28 UTC
FWIW, I always get a kernel oops with recent kernels in vbox (3.1.2) unless
I enable ioapic in vbox.
Comment 4 Forgotten User vs5edErKRK 2010-02-04 11:14:30 UTC
Enabling ioapic doesn't help for me. Same problem.
Comment 5 Paul Hands 2010-02-04 12:02:29 UTC
Just for completeness, I can confirm that ioapic (or *any* of the other system settings in VBox) makes no difference to the bug.  Neither do any of the kernel boot-time options, such as the safe settings.  I tested them all.
Comment 6 Steffen Winterfeldt 2010-02-04 12:12:07 UTC
Can you check on consoles 3 and 4 what the last thing is it does?
Comment 7 Paul Hands 2010-02-04 16:47:02 UTC
(In reply to comment #6)
> Can you check on consoles 3 and 4 what the last thing is it does?

How do I do that in VBox?  If I try the Ctrl-Alt-Fx key combination, it's not trapped by VBox, and is passed to the host, which switched consoles quite happily :-)
Comment 8 Steffen Winterfeldt 2010-02-04 17:08:53 UTC
You're looking for 'host-key'+Fn (Right Ctrl, I think).
Comment 9 Jeff Mahoney 2010-02-04 17:10:24 UTC
Virtualbox is notorious for having buggy hardware emulation. Unless this report can be reproduced on real hardware (or more reputable VM), this is way, way low priority.
Comment 10 Paul Hands 2010-02-04 18:16:46 UTC
(In reply to comment #8)
> You're looking for 'host-key'+Fn (Right Ctrl, I think).
Thanks, I learned something new!

I got a lot more from the alternate consoles.....


Startup...
Loading basic drivers...[   13.477542] ide-cd driver 5.00
[   13.550907] ide-gd driver 1.18
[   13.685910] st: Version 20081215, fixed bufsize 32768, s/g segs 256
[   14.300451] NET: Registered protocol family 17
[   16.582221] NET: Registered protocol family 10
[   77.313486] BUG: soft lockup - CPU#0 stuck for 61s! [insmod:377]
[   77.313797] Modules linked in: ipv6(+) af_packet st ide_gd_mod ide_cd_mod sr_mod cdrom sg ide_pci_generic piix ide_core ata_generic ata_piix ahci parport_pc floppy parport battery ac rtc_cmos rtc_core rtc_lib thermal button processor thermal_sys pcnet32 hwmon libata edd squashfs loop
[   77.313797]
[   77.313797] Pid: 377, comm insmod Not tainted (2.6.32-3-default #1) VirtualBox
[   77.313797] EIP: 0060:[<c08e9ef5>] EFLAGS: 00010286 CPU: 0
[   77.313797] EIP is at pcpu_populate_chunk+0x95/0x410
[   77.313797] EAX: 00000000 EBX: 00000800 ECX: 00000200 EDX: 00000000
[   77.313797] ESI: 00000000 EDI: fec00000 EBP: c0e360a0 ESP: c2731e88
[   77.313797]  DS: 007b ES: 007b GS: 0033 SS: 0060
[   77.313797] CR0: 8005003b CR2: fec00000 CR3: 0271b000 CR4: 000006d0
[   77.313797] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   77.313797] DR6: 00000000 DR7: 00000000
[   77.313797] Call Trace:
[   77.313797] [<c08ea88b>] pcpu_alloc+0x30b/0x3f0
[   77.313797] [<c0b56f76>] snmp_mib_init+0x16/0x60
[   77.313797] [<f828f9b3>] snmp6_alloc_dev+0x53/0x80 [ipv6]
[   77.313797] [<f829064e>] ipv6_add_dev+0xce/0x2b0 [ipv6]
[   77.313797] [<f7c2b2c6>] addrconf_init+0x3e/0x12e [ipv6]
[   77.313797] [<f7c2b178>] inet6_init+0x178/0x27e [ipv6]
[   77.313797] [<c080112f>] do_one_initcall+0x2f/0x190
[   77.313797] [<c0873fd4>] sys_init_module+0xb4/0x230
[   77.313797] [<c0803075>] syscall_call+0x7/0xb
[   77.313797] [<b780070e>] 0xb780070e


This then repeats ad infinitum....
Please don't put too much faith in the accuracy of the above - I transcribed it by hand!
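
For anyone who wants to pull the essentials out of a trace like the one above, here is a minimal Python sketch. It assumes the call-trace lines follow the `[<address>] symbol+0xOFF/0xLEN [module]` layout seen in the transcription; the helper name `parse_call_trace` and the shortened `SAMPLE` excerpt are illustrative only and are not part of any kernel tooling.

```python
import re

# Matches call-trace entries such as "[<c08ea88b>] pcpu_alloc+0x30b/0x3f0",
# with an optional trailing "[module]" tag.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_call_trace(log_text):
    """Return (symbol, module) pairs in the order they appear in the log.

    module is None for frames that carry no "[module]" tag.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("sym"), match.group("mod")))
    return frames


# Shortened excerpt of the transcription above (illustrative).
SAMPLE = """\
[   77.313797] EIP is at pcpu_populate_chunk+0x95/0x410
[   77.313797] Call Trace:
[   77.313797] [<c08ea88b>] pcpu_alloc+0x30b/0x3f0
[   77.313797] [<c0b56f76>] snmp_mib_init+0x16/0x60
[   77.313797] [<f829064e>] ipv6_add_dev+0xce/0x2b0 [ipv6]
[   77.313797] [<c080112f>] do_one_initcall+0x2f/0x190
"""

if __name__ == "__main__":
    for symbol, module in parse_call_trace(SAMPLE):
        print(f"{symbol:24s} {module or 'kernel'}")
```

Read from the bottom up, the frames show module initialisation (`do_one_initcall`) reaching the ipv6 code, which then calls into the per-CPU allocator, where the CPU gets stuck.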
Comment 11 Steffen Winterfeldt 2010-02-05 10:03:03 UTC
You could have just made a screenshot. :-)
Comment 12 Paul Hands 2010-02-05 11:57:33 UTC
(In reply to comment #11)
> You could have just made a screenshot. :-)

Oops, now I feel like an idiot :-).

I made a screenshot to transcribe.

It's now attached.

Paul
Comment 13 Paul Hands 2010-02-05 11:59:11 UTC
Created attachment 341021 [details]
Screenshot of the hang, showing 2 cycles of the repeat
Comment 14 Jeff Mahoney 2010-02-16 22:23:30 UTC
That was the only iteration of the 2.6.32 kernel for openSUSE 11.3. Can you reproduce with a kernel from http://download.opensuse.org/repositories/Kernel:/HEAD/openSUSE_Factory/ ?
Comment 15 Paul Hands 2010-02-16 22:48:39 UTC
(In reply to comment #14)
> That was the only iteration of the 2.6.32 kernel for openSUSE 11.3. Can you
> reproduce with a kernel from
> http://download.opensuse.org/repositories/Kernel:/HEAD/openSUSE_Factory/ ?


Hi Jeff,

I'm not sure how to apply one of those kernel rpms to the iso image I have, but I'd love to learn.  Any pointers?

Paul
Comment 16 Paul Hands 2010-02-18 16:58:52 UTC
I tried the M2 iso.  Same result.  I guess it's the same kernel?
Comment 17 Forgotten User vs5edErKRK 2010-02-23 15:15:32 UTC
Workaround: Boot with less than 700 MB RAM see https://bugzilla.novell.com/show_bug.cgi?id=574463
Comment 18 Rastislav Krupansky 2010-03-04 20:39:32 UTC
(In reply to comment #17)
> Workaround: Boot with less than 700 MB RAM see
> https://bugzilla.novell.com/show_bug.cgi?id=574463

No, the workaround still doesn't work. Milestone 3 has the same issue :-(
I can confirm it on the live CDs.
Btw, a similar issue also occurs in VMware (at least in my case)

To Jeff: kernel is 2.6.33-5
Comment 19 Jeff Mahoney 2010-03-04 21:18:57 UTC
(In reply to comment #15)
> I'm not sure how to apply one of those kernel rpms to the iso image I have, but
> I'd love to learn.  Any pointers?

Yes, but it needs an already running system. There's an 'install-initrd' package that allows you to regenerate the initrd to match the kernel package you want to use. Then you can replace the kernel and initrd on the install media with it and re-burn it.

It will probably be easier to not bother with the live cd for this stage of testing. You're not getting far enough into the boot cycle for it to matter.

Instead, download the internet install iso. It's a lot smaller and it's easier to drop in the replacement kernel and initrd and remake the iso than it is to rebuild the entire live cd with a new kernel. Also I only know how to do the first part. :)
Comment 20 Jeff Mahoney 2010-03-04 21:19:55 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > Workaround: Boot with less than 700 MB RAM see
> > https://bugzilla.novell.com/show_bug.cgi?id=574463
> 
> No, workaround doesn´t still work. Milestone 3 has the same issue :-(
> I can confirm it on livecd´s.
> Btw, a similar issue is in VMware also (at least in my case)

If you're able to reproduce with VB and VMware, can you also reproduce with kvm? I ask because it's possible to start KVM with -kernel and -initrd parameters that will make testing loads faster.
Comment 21 Rastislav Krupansky 2010-03-05 04:44:24 UTC
(In reply to comment #20)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > Workaround: Boot with less than 700 MB RAM see
> > > https://bugzilla.novell.com/show_bug.cgi?id=574463
> > 
> > No, workaround doesn´t still work. Milestone 3 has the same issue :-(
> > I can confirm it on livecd´s.
> > Btw, a similar issue is in VMware also (at least in my case)
> 
> If you're able to reproduce with VB and VMware, can you also reproduce with
> kvm? I ask because it's possible to start KVM with -kernel and -initrd
> parameters that will make testing loads faster.

Sorry, I'd try it, but I use VB and VMware on Microsoft Windows and I have no experience with KVM.
Can someone else test in KVM?
Comment 22 Vojtech Zeisek 2010-03-13 12:11:43 UTC
Created attachment 348312 [details]
Configuration of VirtualBox
Comment 23 Vojtech Zeisek 2010-03-13 12:12:58 UTC
Created attachment 348313 [details]
Log of VirtualBox
Comment 24 Vojtech Zeisek 2010-03-13 12:14:24 UTC
I have the same problem. Changes in settings did not help. I am trying to run the openSUSE 11.3 M3 NET install in VirtualBox. I have one processor for it, an 8 GB disk, 1400 MB RAM, an ISO image of the CD and direct access to the computer's CD-ROM. When I run it I choose 'Install' and press Esc to see messages. When it gets to the stage 'Loading basic drivers...', it stops there. It is still running (=consuming CPU time), but booting does not continue.
Comment 25 Vadim Plessky 2010-03-16 09:50:02 UTC
I have reported two bugs which may have the same cause as this one.
Both are reported against real machines, not VirtualBox.

* System doesn't start in graphics mode after installation from KDE Live CD.
https://bugzilla.novell.com/show_bug.cgi?id=581636

Model: Notebook ASUS M51Ta
CPU: AMD Turion Ultra ZM84
RAM: 4GB
HDD: 320 GB
Video: ATI 3650
(system has built-in ATI 3200 on chipset and discrete graphics ATI3650)

* OpenSUSE 11.3 doesn't install, KDE Live session boots 
https://bugzilla.novell.com/show_bug.cgi?id=582982

Model: HP Tablet PC Tx1100 (notebook with Tablet PC capabilities)
Video: NV17 [GeForce4 420 Go 32M]
RAM: 512MB
CPU: Pentium M 1000 MHz


Hope those bugs can be also considered to be fixed for OpenSUSE 11.3 MS4

I have all hardware listed in my hands, ready for testing.
Comment 26 Michal Seben 2010-03-19 12:18:10 UTC
*** Bug 574463 has been marked as a duplicate of this bug. ***
Comment 27 Michal Seben 2010-03-19 12:22:50 UTC
two interesting posts from dup bnc#574463 :

Tejas Guruswamy :
>Confirmed that booting with ipv6.disable=1 results in a successful boot.
>
>With this new information, while trawling I found this message on the lkml
>
>http://lkml.org/lkml/2010/1/10/84
>http://bugzilla.kernel.org/show_bug.cgi?id=15042


Larry Finger :
>The only relevance to nmi_watchdog is that vboxdrv turns off that watchdog when
>loading a VM. Perhaps a real machine and other kinds of VM work around a race
>condition through the use of the watchdog, but VB has this disabled. I'm
>grasping at straws here.
>
>The first reference you quote above talks about a race between ipv6.ko
>loading/initializing and the starting of sshd. Does the kernel on the NET
>install iso actually start sshd? If so, perhaps a sleep 10 at the start of the
>sshd script would avoid the problem.
>
>I checked the log while booting with 700 MB VM. In that case, the system does
>not try to load ipv6. There must be some kind of free RAM test that skips the
>load. Without IPV6, the load succeeds.
Comment 28 Michal Seben 2010-03-19 12:26:13 UTC
*** Bug 566419 has been marked as a duplicate of this bug. ***
Comment 29 Paul Hands 2010-03-19 13:30:46 UTC
I added ipv6.disable=1 to the Milestone 3 boot command line with a VirtualBox VM with 1024 MB RAM.  The problem did not appear!   Looks like a good workaround.
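
To double-check that the option really reached the kernel, a small Python sketch like the following could be run inside the booted guest. It assumes the standard Linux `/proc/cmdline` file is readable; the helper names `parse_cmdline` and `ipv6_disabled` are made up for this illustration.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def ipv6_disabled(cmdline):
    """True if the command line carries the ipv6.disable=1 workaround."""
    return parse_cmdline(cmdline).get("ipv6.disable") == "1"


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as handle:
        print("ipv6.disable=1 present:", ipv6_disabled(handle.read()))
```

This only confirms what was passed at boot; it says nothing about whether the ipv6 code is built in or loaded as a module.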
Comment 30 Jeff Mahoney 2010-03-19 15:13:53 UTC
Thanks for the research. Assigning..
Comment 31 Vadim Plessky 2010-03-19 16:03:16 UTC
Hello all,

Nice to see so much attention to this bug.

Just launched OpenSUSE 11.3 Milestone 3 in VirtualBox 3.1.4 (Windows Vista host) with ipv6.disable=1 boot parameter in GRUB.

First - startup stalls at "Loading HAL daemon"
and goes nowhere until I press Ctrl-C.
Then OpenSUSE loads successfully to the login prompt, and I can log in.
This is in Terminal 1.

BUT:  Graphics screen is not visible (black)

The good news is that Ethernet connectivity is working.
So I was able to get Xorg.0.log and upload it to an FTP server on the host.
Comment 32 Vadim Plessky 2010-03-19 16:07:08 UTC
Created attachment 349531 [details]
Xorg.0.log file from OpenSUSE 11.3 MS3 / running in VirtualBox 3.1.4


What is visible from Xorg.0.log file is that mouse driver is not loaded

(II) config/hal: Adding input device VirtualBox Guest Service
(II) LoadModule: "vboxmouse"
(II) Loading /usr/lib/xorg/modules/input/vboxmouse_drv.so
(II) Module vboxmouse: vendor="Sun Microsystems Inc."
	compiled for 0.0.0, module version = 1.0.0
	Module class: X.Org XInput Driver
	ABI class: X.Org XInput driver, version 4.0
(EE) module ABI major version (4) doesn't match the server's version (7)
(II) UnloadModule: "vboxmouse"
(II) Unloading /usr/lib/xorg/modules/input/vboxmouse_drv.so
(EE) Failed to load module "vboxmouse" (module requirement mismatch, 0)
(EE) No input driver matching `vboxmouse'
(EE) config/hal: NewInputDeviceRequest failed (15)

It explains why mouse is not visible on X screen.
But why X itself (KDM4 login?) is not visible?..
Comment 33 Vadim Plessky 2010-03-19 16:15:12 UTC
Created attachment 349534 [details]
dmesg output from OpenSUSE 11.3 booted in VirtualBox 3.1.4


OpenSUSE 11.3 booted in VirtualBox 3.1.4 (Windows Vista host)

ipv6.disable=1 boot parameter added in GRUB before start.

For the record:
dmesg output confirms that IPv6 has been disabled
...
[   88.153872] IPv6: Loaded, but administratively disabled, reboot required to enable
Comment 34 Jiri Bohac 2010-03-19 18:21:25 UTC
(In reply to comment #13)
> Created an attachment (id=341021) [details]
> Screenshot of the hang, showing 2 cycles of the repeat

This looks similar to the symptoms discussed in http://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg37042.html, but that one was believed to be a PPC problem (?!) -> CCing Tejun Heo anyway. 

This bug also reminded me that I was planning to make CONFIG_IPV6=y for all kernel flavours (Bug #561611). Did that just now. If this really were some kind of race between IPv6 initialization and the socket syscall (as suggested by Bug #574463, Comment 9), having IPV6=y will work around this.

Bug #574463, Comment 10:
> I checked the log while booting with 700 MB VM. In that case, the system does
> not try to load ipv6. There must be some kind of free RAM test that skips the
> load. Without IPV6, the load succeeds.

Huh? I really doubt we have any check like that in the installation system. I will have a look.



I will dig into this more next week.
Comment 35 Michal Seben 2010-03-20 14:35:35 UTC
(In reply to comment #32)
> What is visible from Xorg.0.log file is that mouse driver is not loaded
> 
> (II) config/hal: Adding input device VirtualBox Guest Service
> (II) LoadModule: "vboxmouse"
> (II) Loading /usr/lib/xorg/modules/input/vboxmouse_drv.so
> (II) Module vboxmouse: vendor="Sun Microsystems Inc."
>     compiled for 0.0.0, module version = 1.0.0
>     Module class: X.Org XInput Driver
>     ABI class: X.Org XInput driver, version 4.0
> (EE) module ABI major version (4) doesn't match the server's version (7)
> (II) UnloadModule: "vboxmouse"
> (II) Unloading /usr/lib/xorg/modules/input/vboxmouse_drv.so
> (EE) Failed to load module "vboxmouse" (module requirement mismatch, 0)
> (EE) No input driver matching `vboxmouse'
> (EE) config/hal: NewInputDeviceRequest failed (15)
> 
> It explains why mouse is not visible on X screen.
> But why X itself (KDM4 login?) is not visible?..

Hi Vadim !
these bugs with X and vbox mouse driver should be fixed in Milestone 4 see bnc#584085 and bnc#587980
Comment 36 Paul Hands 2010-03-20 17:15:47 UTC
I did a bit more characterisation on these, as I think we may be mixing 2 different issues in the same bug ID.


The original hang issue :

Adding ipv6.disable to the installer boot options works just fine.  I tried this with a couple of different installations, above and below 700MB RAM.  With ipv6 disabled, all installations succeed.  Subsequent reboots do NOT need the ipv6 boot option - they work anyway.  I also tried with ipv6 support enabled and disabled in YaST2 - no difference.  The system boots and runs X quite happily, so it seems that it is only the installer that needs ipv6.disable.  Perhaps someone else can confirm?



The newer failing to start X issue :

I see this in Virtualbox 3.1.4 only if I install the VBox Guest Additions.  I think this is a separate issue, and should perhaps be pushed over to a different bug?  

I saw errors in the kernel module loads for the VBox additions - the vboxvfs module fails at modprobe with.....

FATAL : Error inserting vboxvfs (/lib/modules/2.6.33-5-default/misc/vboxvfs.ko): Invalid module format

Running startx after that produces the X hang described by others.
Comment 37 Larry Finger 2010-03-20 22:07:42 UTC
The problem with ipv6 hanging the boot process is only with the kernel on the CD. It is believed that the "secret" is having ipv6 built in, rather than loaded as a module. As you say, it has nothing to do with X issues.
Comment 38 Vadim Plessky 2010-03-21 13:22:26 UTC
I just tried to boot OpenSUSE 11.3 Milestone2 (3?) on live system:
- ASUS M51Ta
- AMD Turion Ultra CPU
- ATI 3650 video

with ipv6.disable=1 option, and without it.
I installed it to hard disk some time before, so it's available for testing - but doesn't work in X.

X doesn't start.
Looking into Xorg.0.log, it seems that X is using radeonhd video driver.
And even correctly detects screen resolution 1440x900.
Final result is the same as with VirtualBox - graphics screen (Terminal 7) is black/blank.

My original bug report for this problem is:
https://bugzilla.novell.com/show_bug.cgi?id=581636

And it seems, for now, that problem for installation in VirtualBox is different from ASUS M51Ta/ATI 3650.

When I look at Terminal 8 after startup, I see in the on-screen log that X was also trying to load the fglrx driver, but failed.
Is it possible to use the fglrx driver with Kernel 2.6.33/OpenSUSE 11.3 MS3?
If yes - how can I install it from the command prompt or Yast2 running in text mode?

Then at least it can be checked whether it is a radeonhd driver problem or something else.
Comment 39 Larry Finger 2010-03-23 20:59:20 UTC
I just downloaded Build 515 of the NET install iso for i586. It also fails to boot in a VirtualBox VM unless ipv6.disable=1 is added to the boot line.

With ipv6 compiled into the kernel rather than as a module, there is a difference.

In console 4, I see the following lines:

NET: Registered protocol family 17
NET: Registered protocol family 10

Next there is a long pause with no output, ~30 seconds or so. Then there is a kernel oops in insmod. I do not have a complete dump as it scrolls, but the traceback is as follows:

pcpu_populate_chunk+0x91/0x3e0
pcpu_alloc+0x25a/0x370
snmp_mib_init+0x22/0x70
ipv6_add_dev+0x13d/0x390
addr_conf_init+0x3e/0x12d
ipv6_init+0x178/0x280

The EIP is at memset+0x10/0x20. This output repeats roughly every 66 seconds until the system is stopped.
Comment 40 Tejun Heo 2010-03-23 23:03:14 UTC
Can you please set up a serial or net console and capture the full log?
Comment 41 Vadim Plessky 2010-03-24 06:57:11 UTC
Can Novell test latest builds of 11.3 and ensure that soon-to-be-released Milestone 4 would run and install in VirtualBox VM?
It would be a shame if it again fails to install.
 
Testing it (for basic Install process) in XEN and KVM would be also a plus.
And it even doesn't require additional hardware.

I have 4 computers around where I can test OpenSUSE Live CD, and I can do test install on 2 of those computers.
So far OpenSUSE 11.3 was booting ok, without any problems, only on MSI U90 netbook (Intel Atom, built-in Intel graphics, 1GB RAM)
Comment 42 Stephan Kulow 2010-03-24 08:42:38 UTC
Hi, at http://download.opensuse.org/factory/iso/ you can always find the latest live CDs to test. Feel free to send your findings to the opensuse-factory@opensuse.org mailing list
Comment 43 Vadim Plessky 2010-03-24 09:40:05 UTC
Hi Stephan,

Nice to hear you after a long time (when I was away from Linux/KDE)

I hope to have some time for testing tomorrow, so would download latest Live CDs and provide feedback.
Comment 44 Michal Seben 2010-03-24 14:39:15 UTC
I tested openSUSE-NET-i586-Build0518-Media.iso (built 24-Mar-2010 07:44).
I also set up a serial console, but it doesn't contain the same logs as tty4, so I also made screenshots from tty4. The result is similar to Comment 39 and Comment 10.

-----
I also booted with ipv6.disable=1 and ran some tests (which may be interesting):

/#lsmod | grep ipv6
ipv6 279574 0
- according to Comment 34, shouldn't the ipv6 module be built into the kernel?

/#uname -a
Linux 10.0.2.15 2.6.33-6-default #1 SMP 2010-02-25 20:06:12 +0100 i686 i686 i386 GNU/Linux
- this looks too old to me (*2010-02-25*); I can't check rpm -qi or --changelog, as the rpm database is not available during NET install ...

------
check date from changes in buildservice:
# Fri *Mar 19 17:33:27 CET 2010* - jbohac@suse.cz
#
# - set CONFIG_IPV6=y for all flavours (bnc#561611)
------

so I suppose the kernel with the fix is still not on the NET install CD
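
The module check above can be scripted. The sketch below is only an illustration: it assumes the usual `lsmod` column layout (name, size, use count), and the helper name `loaded_modules` is made up. Code that is built into the kernel does not show up in `lsmod`, so finding `ipv6` there suggests the running kernel still has it as a loadable module.

```python
def loaded_modules(lsmod_output):
    """Map module name -> (size, use_count) from lsmod-style text."""
    modules = {}
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip the "Module Size Used by" header and any malformed line.
        if len(fields) < 3 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        modules[fields[0]] = (int(fields[1]), int(fields[2]))
    return modules


if __name__ == "__main__":
    # Output line taken from the test above, with the usual header added.
    sample = "Module Size Used by\nipv6 279574 0\n"
    mods = loaded_modules(sample)
    if "ipv6" in mods:
        print("ipv6 is loaded as a module, so this kernel lacks CONFIG_IPV6=y")
    else:
        print("ipv6 is not listed as a module (built in, or not loaded)")
```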
Comment 45 Michal Seben 2010-03-24 14:40:39 UTC
Created attachment 350310 [details]
logs from tty4 from NET-i586-Build0518
Comment 46 Michal Seben 2010-03-24 14:41:20 UTC
Created attachment 350311 [details]
logs from serial console from NET-i586-Build0518
Comment 47 Larry Finger 2010-03-24 16:05:19 UTC
Created attachment 350338 [details]
Serial console output from NET-i586- Build0515 on VirtualBox 3.1.4

This log shows a locked CPU on a single-cpu VM.
Comment 48 Larry Finger 2010-03-25 20:38:09 UTC
The problem is still present in M4. Using "ipv6.disable=1" on the Boot Options line works.
Comment 49 Jiri Bohac 2010-03-26 17:38:49 UTC
For some reason, the memset in pcpu_pcpu_populate_chunk() hangs when executing
the rep stos instruction.

The memset is zeroing a chunk of memory starting at fec00000. This might as
well be a virtualbox issue, I can't otherwise explain how the system could hang
in the middle of a single instruction, while still processing timer interrupts.

I have VB 3.1.4, on an x86_64 system, and unfortunately, I was not able to
reproduce the problem here. I am preparing a debug kernel/CD image with lots of
debugging information printed by the percpu allocator to see what is
happening.
Comment 50 Larry Finger 2010-03-26 19:15:04 UTC
I could not find a routine named pcpu_pcpu_populate_chunk(). I assume you meant pcpu_populate_chunk().

Please let me know where to get that CD image to test.

I will also point the VB folks to this thread and ask them if they know of such problems.
Comment 51 Larry Finger 2010-03-26 19:47:22 UTC
Is your VB the OSE version? I'm running the pre-compiled variety as I need pass-thru for USB devices.
Comment 52 Rastislav Krupansky 2010-03-26 20:46:31 UTC
I have installed the newest VirtualBox 3.1.6 (but on windows system), Milestone 4 and workaround from most annoying bugs "ipv6.disable=1" does not work for me :-(
Comment 53 Larry Finger 2010-03-26 22:26:37 UTC
Obviously, VB on Windows is a different beast. I also have 3.1.6, but on an x86_64 openSUSE 11.2 host. It still boots with the workaround.

When it starts booting, press the ESC key (if you see the splash screen), then press the special key + F4. For me the special key is the right CTRL. Note what you see here when it hangs and/or crashes.
Comment 54 Forgotten User vs5edErKRK 2010-03-27 07:17:07 UTC
ipv6.disable=1 works here on a Windows system with VB 3.1.6. Loading basic drivers takes some time...
Comment 55 Rastislav Krupansky 2010-03-27 11:41:45 UTC
Created attachment 350990 [details]
screenshot

(In reply to comment #53)
> Obviously, VB on Windows is a different beast. I also have 3.1.6, but on an
> x86_64 openSUSE 11.2 host. It still boots with the workaround.
> 
> When it starts booting, press the ESC key (if you see the splash scree), then
> press the special key + f4. For me the special key is the right CTRL. Note what
> you see here when it hangs and/or crashes.

The attached screenshot shows the full output that I can see.
Comment 56 Vadim Plessky 2010-03-27 12:42:43 UTC
Tried to load OpenSUSE 11.3 MS4 Live CD in VMware Server (ver.2.0.2, latest available) hosted on Windows Vista.
It is available as free download at 
http://downloads.vmware.com/d/info/datacenter_downloads/vmware_server/2_0

Tested both GNOME and KDE Live CDs.
Same result - system starts loading, then switches to graphics mode (GDM or KDM startup?) and I see a black graphics screen.
And no response to mouse or keyboard.
Added ipv6.disable as boot option - no change.

And it seems VMware doesn't allow/pass-through to change to Terminal 1 via Ctrl+Alt+F1. At least it doesn't work here. Any tips how to overcome it would be appreciated.
As an alternative - it should be possible to capture log at TTY3.
Question is how to connect to TTY3 in VM from host system?
VMware Server I have is installed on Windows Vista Host.
I am not familiar with VMware as VirtualBox suits better my needs.
But for a case like this it can be a good testing option.
Comment 57 Larry Finger 2010-03-27 16:53:56 UTC
@Vadim: I tried to obtain VMware to test for what you did. Their kernel module will not build on 2.6.33 or later kernels, and I refuse to debug their code. I might consider fixing it if it only failed with 2.6.34-rcX, but they seem to be lazy!

@Rastislav: I think you are trying to boot the Live CD. The "ipv6.disable=1" trick works for the NET install CD and the DVD. The top line of the picture you posted shows that "ipv6.disable=1" is not a valid identifier. In addition, the NET install CD does not use the Kiwi CD boot system. It boots a normal kernel.
Comment 58 Rastislav Krupansky 2010-03-27 19:05:19 UTC
(In reply to comment #57)

> @Rastislav: I think you are trying to boot the Live CD. The "ipv6.disable=1"
> trick works for the NET install CD and the DVD. The top line of the picture you
> posted shows that "ipv6.disable=1" is not a valid identifier. In addition, the
> NET install CD does not use the Kiwi CD boot system. It boots a normal kernel.

Yes, you are right, I'm trying the LiveCD. I thought the workaround works for the LiveCD also.

@Vadim: You can post your experiences with VMware into bug 574857. This bug is filed for VMware and tracked by VMware people.
Comment 59 Vadim Plessky 2010-03-28 19:47:32 UTC
(In reply to comment #35)
> (In reply to comment #32)
> > What is visible from Xorg.0.log file is that mouse driver is not loaded
> > 
> > (II) config/hal: Adding input device VirtualBox Guest Service
> > (II) LoadModule: "vboxmouse"
> > (II) Loading /usr/lib/xorg/modules/input/vboxmouse_drv.so
> > (II) Module vboxmouse: vendor="Sun Microsystems Inc."
> >     compiled for 0.0.0, module version = 1.0.0
> >     Module class: X.Org XInput Driver
> >     ABI class: X.Org XInput driver, version 4.0
> > (EE) module ABI major version (4) doesn't match the server's version (7)
> > (II) UnloadModule: "vboxmouse"
> > (II) Unloading /usr/lib/xorg/modules/input/vboxmouse_drv.so
> > (EE) Failed to load module "vboxmouse" (module requirement mismatch, 0)
> > (EE) No input driver matching `vboxmouse'
> > (EE) config/hal: NewInputDeviceRequest failed (15)
> > 
> > It explains why mouse is not visible on X screen.
> > But why X itself (KDM4 login?) is not visible?..
> 
> Hi Vadim !
> these bugs with X and vbox mouse driver should be fixed in Milestone 4 see
> bnc#584085 and bnc#587980

Hi Michal!

I confirm that bug with X and 'vboxmouse' has been fixed in OpenSUSE 11.3 Milestone4.
Installation of KDE Live CD has been done in VirtualBox 3.1.6 (hosted in Windows Vista).
No additional tuning, like ipv6.disable=1, was necessary.

------------------------
X.Org X Server 1.7.6
Release Date: 2010-03-17
X Protocol Version 11, Revision 0
Build Operating System: openSUSE SUSE LINUX
Current Operating System: Linux linux-yd5w 2.6.33-6-default #1 SMP 2010-02-25 20:06:12 +0100 i686

...

(II) config/hal: Adding input device VirtualBox Guest Service
(II) LoadModule: "vboxmouse"
(II) Loading /usr/lib/xorg/modules/input/vboxmouse_drv.so
(II) Module vboxmouse: vendor="Sun Microsystems Inc."
	compiled for 0.0.0, module version = 1.0.0
	Module class: X.Org XInput Driver
	ABI class: X.Org XInput driver, version 7.0
(**) Load address of symbol "VBOXMOUSE" is 0xb4b45200
(**) VirtualBox Guest Service: always reports core events
(**) VirtualBox Guest Service: Device: "/dev/vboxguest"
(II) XINPUT: Adding extended input device "VirtualBox Guest Service" (type: MOUSE)
(**) VirtualBox Guest Service: (accel) keeping acceleration scheme 1
(**) VirtualBox Guest Service: (accel) acceleration profile 0
(II) VirtualBox Guest Service: On.
(II) VirtualBox Guest Service: Off.
(II) Open ACPI successful (/var/run/acpid.socket)
(II) VirtualBox Guest Service: On.
------------------------

In order to verify it, I had to kill existing installation of 11.3 Milestone3 (start from MS4 Live CD, and make complete new install to HDD)

BUT:
bug 574857 - Can't start and install live cd in VMware
is still present in Milestone 4.
Comment 60 Jiri Bohac 2010-03-31 22:54:38 UTC
Created attachment 351845 [details]
debugging patch

Larry & Michal:
to see what the pcpu allocator is doing, I made this patch.
I have replaced the kernel on the build0518 iso image with a kernel containing this patch. Please test with the modified iso found at http://labs.suse.cz/jbohac/bug576681/build0518-pcpu-debug-1.iso

I need the full kernel messages, so please set up a serial console:

- in VB, go to "serial ports", select the "Port 1" tab, check "Enable Serial Port", set the mode to "Host pipe", check "Create pipe" and put something like /tmp/vboxconsole in the "port/file path"

- start the virtual machine

- do something like "socat UNIX-CONNECT:/tmp/vboxconsole  - | tee /tmp/log"

- in the VM boot menu, select "Installation" and put "console=ttyS0 ignore_loglevel" in the "Boot options"

Then, please provide the full log from /tmp/log.
Thanks!
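
If socat is not at hand, the capture step can be approximated in Python. This is a rough sketch, assuming the VirtualBox "Host pipe" is a Unix domain socket on a Linux host (which is what the `UNIX-CONNECT` form above implies); the function names `copy_stream` and `capture_console` are made up, and the paths are the ones suggested above.

```python
import socket
import sys


def copy_stream(sock, sinks, bufsize=4096):
    """Copy bytes from sock to every sink until EOF; return the byte count."""
    total = 0
    while True:
        data = sock.recv(bufsize)
        if not data:  # empty read means the peer closed the connection
            return total
        for sink in sinks:
            sink.write(data)
            sink.flush()
        total += len(data)


def capture_console(pipe_path, log_path):
    """Connect to the host pipe, echo it to stdout and save it to log_path."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock, \
            open(log_path, "wb") as log:
        sock.connect(pipe_path)
        return copy_stream(sock, [sys.stdout.buffer, log])


if __name__ == "__main__":
    # Start the VM first so that VirtualBox has created the pipe.
    capture_console("/tmp/vboxconsole", "/tmp/log")
```

Flushing after every chunk keeps the log file current even if the VM has to be killed after the hang.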

Vadim & Rastislav: I think you are seeing an unrelated bug.
Comment 61 Jiri Bohac 2010-03-31 23:12:00 UTC
(In reply to comment #51)
> Is your VB the OSE version? I'm running the pre-compiled variety as I need
> pass-thru for USB devices.

I was using OSE, yes. But now I tried with the pre-compiled version and still could not reproduce the problem :(
Comment 62 Larry Finger 2010-04-01 04:38:15 UTC
Created attachment 351867 [details]
Serial console log from attempting to boot build0518-pcpu-debug-1.iso

Captured all the pcpu debug info. Nothing was output after the CPU hung.
Comment 63 Michal Seben 2010-04-01 07:49:46 UTC
Created attachment 351924 [details]
Serial console log from attempting to boot build0518-pcpu-debug-1.iso on virtualbox-ose

Another console log. It looks the same as Larry's, but I am attaching it just to be sure.
I am using virtualbox-ose 3.1.6. To reproduce this issue I have to disable hardware virtualization and set Video Memory to 128 MB (or see https://bugzilla.novell.com/show_bug.cgi?id=574463#c3 and https://bugzilla.novell.com/show_bug.cgi?id=574463#c4)
Comment 64 Larry Finger 2010-04-01 20:01:46 UTC
Created attachment 352105 [details]
Serial console log from successful boot of Build0518-pcpu-debug-1.iso

This log was with the RAM assigned to the VM reduced to 700 MB. The system does boot this way.

A couple of things I noted:

(1) The allocation that fails is the first one that uses pcpu_alloc_pages and pcpu_map_pages.

(2) Likely related to (1), but this is the first call to use memory in the "vmalloc" region. All others use "lowmem".
Comment 65 Tejun Heo 2010-04-02 01:18:54 UTC
[    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
...
[   44.781407] * pcpu debug:going to memset: chunk=e8e71140, cpu=0, off=8832, size=64, addr=fec00000

2.6.34-rcX has a random memory corruption bug which is showing up as
various boot failures.  Yinghai has a patch.

    http://thread.gmane.org/gmane.linux.kernel/963616/focus=964914

-rc3 has the fix which got committed to suse kernel repo a couple of days ago.  It should soon appear on Factory.

Thanks.
Comment 66 Larry Finger 2010-04-02 02:47:28 UTC
Was that bug in 2.6.33? I thought it came in with 2.6.34.

OK, M5 should be OK.
Comment 67 Jiri Bohac 2010-04-06 19:14:51 UTC
I can now reproduce the problem as well. After more debugging I see that the machine is stuck in an endless loop of page faults.

The page fault is triggered by the memset at fec00000. The page fault handler considers it "spurious" (a stale TLB entry), so the kernel does nothing, the STOS instruction of memset is restarted, and the page fault triggers again.

The reason code for the page fault is 3, that is a protection fault during a write operation.
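
For reference, the reason code can be decoded bit by bit. The sketch below uses the architectural x86 page-fault error code layout; the constant and function names are chosen for illustration only.

```python
# x86 page-fault error code bits, as pushed by the CPU on a #PF.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access,      1: write access
PF_USER = 1 << 2   # 0: kernel mode,      1: user mode
PF_RSVD = 1 << 3   # reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # fault caused by an instruction fetch


def decode_pf_error(code):
    """Return a human-readable list describing a #PF error code."""
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return parts


print(decode_pf_error(3))
# ['protection violation', 'write', 'kernel mode']
```

So code 3 means the CPU reported the page as present but not permitting the access, on a write, from kernel mode.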

Looking at the PMD entry and PTE of the fec00000 page, the page is set to be writeable, so I don't understand why this happens. The i386 specification says that the TLB should be flushed automatically after a PF trap, and that is why the PF handler does nothing if it believes the PF was "spurious".

So, this could either be a VB bug (because it is VB that emulates the paging, traps, etc in the guest), or there is some other reason why a page protection fault can happen besides the permission bits in the PTE/PMD entry.
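
The loop described above can be illustrated with a toy model. This is not the kernel's code: `looks_spurious`, `count_faults`, the `host_always_faults` flag and the `max_retries` bound are all hypothetical and exist only so the example terminates. Only the two PTE bits and the error code bits are the real x86 values.

```python
PTE_PRESENT = 0x1  # x86 PTE bit 0
PTE_RW = 0x2       # x86 PTE bit 1
PF_PROT = 0x1      # error code bit 0: protection violation
PF_WRITE = 0x2     # error code bit 1: write access


def looks_spurious(error_code, pte):
    """Guest-side check: would the current PTE allow the faulting access?"""
    if not pte & PTE_PRESENT:
        return False
    if error_code & PF_WRITE and not pte & PTE_RW:
        return False
    return True  # PTE permits the access, so blame a stale TLB entry


def count_faults(pte, host_always_faults, max_retries=5):
    """Retry one store and count the faults taken (bounded for the demo)."""
    faults = 0
    while host_always_faults and faults < max_retries:
        faults += 1
        error_code = PF_PROT | PF_WRITE  # reason code 3
        if not looks_spurious(error_code, pte):
            raise RuntimeError("genuine fault: the kernel would handle it")
        # Spurious: the handler returns without changing anything and
        # the store instruction is simply restarted.
    return faults


print(count_faults(PTE_PRESENT | PTE_RW, host_always_faults=True))   # 5
print(count_faults(PTE_PRESENT | PTE_RW, host_always_faults=False))  # 0
```

In the model, as long as whatever delivers the fault keeps doing so regardless of the PTE contents, the guest never makes progress. On the real machine there is no retry bound, hence the hang.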

(In reply to comment #65)
> [    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
> ...
> [   44.781407] * pcpu debug:going to memset: chunk=e8e71140, cpu=0, off=8832,
> size=64, addr=fec00000

Yes, I also thought this was the reason at first, but I think the IOAPIC address refers to a physical address, while the allocated memory that memset faults on is at virtual address fec00000, right? 
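
To check other serial logs for the same coincidence, something like the following sketch could be used. It assumes the message formats shown in comment 65 (the "pcpu debug" line comes from the debugging patch); `numeric_collisions` is a made-up helper name. Note that it only compares the numbers: the IOAPIC address is physical and the memset address is virtual, so a match does not by itself prove that the two refer to the same memory.

```python
import re

# Formats as they appear in the serial console logs quoted in this bug.
IOAPIC_RE = re.compile(r"ACPI: IOAPIC \(id\[0x[0-9a-f]+\] address\[0x([0-9a-f]+)\]")
MEMSET_RE = re.compile(r"pcpu debug:going to memset: .*?addr=([0-9a-f]+)")


def numeric_collisions(log_text):
    """Return addresses that appear both as an IOAPIC base and a memset target."""
    ioapic = {int(a, 16) for a in IOAPIC_RE.findall(log_text)}
    memset = {int(a, 16) for a in MEMSET_RE.findall(log_text)}
    return sorted(ioapic & memset)


SAMPLE = """\
[    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[   44.781407] * pcpu debug:going to memset: chunk=e8e71140, cpu=0, off=8832, size=64, addr=fec00000
"""

print([hex(a) for a in numeric_collisions(SAMPLE)])  # ['0xfec00000']
```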

> 2.6.34-rcX has a random memory corruption bug which is showing up as
> various boot failures.  Yinghai has a patch.
> 
>     http://thread.gmane.org/gmane.linux.kernel/963616/focus=964914

This looks pretty deterministic; it fails at exactly the same place for several people.


> -rc3 has the fix which got committed to suse kernel repo a couple of days ago. 
> It should soon appear on Factory.

Also, this bug is probably going to stop appearing with the new kernel in Factory, because I recently switched IPv6 to be compiled-in. 
Most likely, this bug is not related to IPv6 at all and it is just a coincidence that the order in which the install CD image loads kernel modules makes IPv6 be the first one to need a new allocation of pcpu data and trigger this bug. With IPv6 compiled in, this order is going to change and the bug will either be triggered by something else or will not show at all.

But even if this bug disappears, I think it is worth finding out what the cause was, before it causes other headaches in a different situation.

More debugging soon; I currently have some more urgent bugs to deal with.
Comment 68 Larry Finger 2010-04-06 20:41:25 UTC
I posted your analysis on the VB developers ML, along with the question "Does VB handle these PFs correctly?" I'll post any answers here.
Comment 69 Tejun Heo 2010-04-06 21:32:15 UTC
(In reply to comment #67)
> Yes, I also thought this was the reason at first, but I think the IOAPIC
> address refers to a physical address, while the allocated memory that memset
> faults on is at virtual address fec00000, right? 

Indeed, but it still gets on my nerves that they share the same address.

> > 2.6.34-rcX has a random memory corruption bug which is showing up as
> > various boot failures.  Yinghai has a patch.
> > 
> >     http://thread.gmane.org/gmane.linux.kernel/963616/focus=964914
> 
> This looks pretty deterministic, It fails at exactly the same place for more
> people.

On the same configuration, this will cause the same failure.  Given that virtual envs end up being the same on multiple configurations, I still think this could be the culprit.  Maybe it's a good idea to test w/ -rc3 with ipv6 compiled as module or -rc2 w/ only Yinghai's fix applied?

Thanks.
Comment 70 Larry Finger 2010-04-06 22:03:16 UTC
I would prefer another test with Yinghai's fix applied. That way we can determine if it is a VB problem, or the random memory corruption bug. As the problem has not shown up with real machines, I suspect a VB problem, but I would like to know.
Comment 71 Michal Seben 2010-04-08 14:02:16 UTC
We got a response from Sander van Leeuwen on the vbox-dev mailing list (to Larry's question):
http://vbox.innotek.de/pipermail/vbox-dev/2010-April/002514.html
------------
I've found and fixed the problem. It's an edge case where the guest writes
to a non-present guest page in the same region our hypervisor is located.
That caused a #PF storm inside the guest, because the hypervisor range
conflict check was performed too late.

Sander

-----------
I couldn't find the fix in svn, so I asked for the revision, but to me this looks fixed :)
Comment 72 Larry Finger 2010-04-08 15:21:14 UTC
Thanks to Jiri's analysis, VB had enough info to find their bug. It is not clear when this fix will propagate through, but I agree that the issue looks fixed. All we need is one last test. I am not in a trusting mood. ;)
Comment 73 Michal Seben 2010-04-09 11:26:46 UTC
I used the suggested patch, http://www.virtualbox.org/changeset/28090,
and build0518-pcpu-debug-1.iso boots nicely,
so I am closing this as fixed.

thanks to all !
Comment 74 Paul Hands 2010-04-14 22:08:17 UTC
Just confirming: the bug is gone in M5. The install on VirtualBox gets past the basic drivers stage and proceeds normally.

Good job, guys.
Comment 75 Larry Finger 2010-04-14 22:51:42 UTC
I confirm that this bug does not appear in M5; however, I think this is an accident of the memory sizes of the new kernel, rather than a fix.

The real problem comes from the VirtualBox code. There is a fix in the svn repository for the OSE version of VB; however, that fix will not be in the version I use until 3.1.8. It is definitely not in the 3.1.6 version that I'm using.
Comment 76 Bernhard Wiedemann 2016-04-15 10:45:02 UTC
This is an autogenerated message for OBS integration:
This bug (576681) was mentioned in
https://build.opensuse.org/request/show/37563 Factory / virtualbox-ose