Bug 757858 - Installer using default kernel begins system probing hangs at "Search for Linux partitions"
Status: RESOLVED DUPLICATE of bug 773058
Alias: None
Product: openSUSE 12.2
Classification: openSUSE
Component: Kernel
Version: Milestone 3
Hardware: x86-64 openSUSE 12.2
Priority: P2 - High, Severity: Critical
Target Milestone: Final
Assignee: Thomas Renninger
QA Contact: E-mail List
Reported: 2012-04-18 17:40 UTC by Roman Bysh
Modified: 2012-10-31 08:55 UTC


Attachments
inxi -F readout of my setup (1.42 KB, text/plain), 2012-04-18 17:40 UTC, Roman Bysh
ACPIDUMP (189.98 KB, application/octet-stream), 2012-06-26 17:51 UTC, Roman Bysh
Info about kernel provided with 12.2 M1 (350 bytes, text/plain), 2012-06-27 17:39 UTC, Roman Bysh
Photo of screen (38.88 KB, image/jpeg), 2012-08-06 17:06 UTC, Roman Bysh
yast logs while expert partitioner was scanning volumes (1.31 MB, application/x-bzip), 2012-10-17 03:16 UTC, Tom C
yast logs after expert partitioner fully loaded (1.32 MB, application/x-bzip), 2012-10-17 03:17 UTC, Tom C
Latest yast2logs (5.05 MB, application/x-bzip), 2012-10-17 18:01 UTC, Roman Bysh
Today's dmesg (59.59 KB, text/plain), 2012-10-17 18:02 UTC, Roman Bysh

Description Roman Bysh 2012-04-18 17:40:32 UTC
Created attachment 486758
inxi -F readout of my setup

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0

From 12.2 M1 through M3, the installer using the default kernel always hangs at "System Probing" when it tries to "Search for Linux partitions".

The only way around this was to add acpi=off to Boot Options 
or select "NO ACPI" or "Safe Setting".

Reproducible: Always

Steps to Reproduce:
1. Select default kernel 
2. Installer gets to System Probing --> Search for Linux partitions
3. Installer hangs
4. Press reset button to restart installation and select "NO ACPI".
Actual Results:  
The installer using the default kernel always hangs at "System Probing" and "Search for Linux partitions".

I have 12.1 installed on the first 4 partitions.

Expected Results:  
The installer should not hang when System Probing --> Search for Linux partitions.


When I tested the 11.3 through 12.1 milestones with the default kernel, they never hung when "Search for Linux partitions" was performed.
Comment 1 Roman Bysh 2012-04-18 18:02:22 UTC
I tried this installation in VirtualBox and the installer never had any problems.
However, it did not have to deal with an openSUSE 12.1 installation on the first 4 partitions.

If I had an empty drive then the installation would proceed in a normal fashion.
Comment 2 Kun Kun Zhang 2012-04-19 03:05:22 UTC
Hi, could you please have a look at this? I am not sure whether it is right to assign it to you. Feel free to reassign it. Thank you.
Comment 3 Roman Bysh 2012-05-31 23:45:12 UTC
Update

The problem still exists in 12.2 Beta 1. Adding "acpi=off" to Boot options or choosing "NO ACPI" from Kernel options is a way of getting around the installer stalling.

Again, I double-checked with the 11.4 and 12.1 DVDs. They DO NOT stall when "Searching for Linux partitions". Thank you.
Comment 4 Roman Bysh 2012-06-01 16:09:13 UTC
Follow Up

I'm using an Asus P5Q mobo w/ 4GB RAM and an Intel Quad-Core Q6600 CPU
Comment 5 Thomas Renninger 2012-06-26 09:51:28 UTC
acpi=off disables so much that the cause can be nearly anything.
Please try:
pci=noacpi

Also have a look at bug #765378 (best try out the things described in comment #8 to see whether it might be CPU related).
There it is a CPU-specific problem. Your issue might be totally unrelated, but could also be a duplicate (even though the CPUs seem to be different).

Is it possible to send a screenshot when booting with vga=normal boot param?

This could take a while, but you could try to increase the ACPI debug output with these boot parameters:
acpi.debug_level=0x20F acpi.debug_layer=0xFFFFFFFF log_buf_len=64M vga=normal
This only makes sense if you can also send a screenshot.

Another interesting piece of info (if you have already installed the system via failsafe settings) would be the output of the command "acpidump >/tmp/acpidump".
Please attach if possible.
Comment 6 Roman Bysh 2012-06-26 16:50:37 UTC
I will try using "pci=noacpi"

However, I do NOT have problems with shutdown. Just that it hangs when searching for Linux partitions when installing.

I've been using this P5Q mobo and Q6600 CPU for the entire 11.x series including all milestones. The same with 12.1 and milestones without it hanging.

What has changed in 12.2 that causes it to hang?

Sorry. I cannot provide a screenshot. I do not have a camera.

In the meantime, I will reinstall using the safe settings and post the acpidump.
Comment 7 Roman Bysh 2012-06-26 17:51:29 UTC
Created attachment 496469
ACPIDUMP

Please review the ACPIDUMP
Comment 8 Roman Bysh 2012-06-26 17:54:06 UTC
Reinstalled using Safe Settings and uploaded the acpidump as per request.
Comment 9 Thomas Renninger 2012-06-27 15:37:25 UTC
> Sorry. I cannot provide a screenshot. I do not have a camera.
Ok, it's somewhat difficult then without any output of the affected system.

You say, this happens since:
Since 12.2 M1 - M3

Do you still remember the most recent kernel that worked?
That way I have a chance to look at the relevant patches which came in during this time frame.

I have to read up on quite a few mails on the ACPI list. Hopefully I see something related. If not, I may try to remove the latest ACPI 5.0 patches; they are typically rather separated, so this may work out without too much work.
Stay tuned, this may take time.

If pci=noacpi works, most/all power management features should work and this may be a sufficient workaround for you for some time.
If pci=noacpi works, please also try pci=nocrs.
Setting needinfo flag for pci=noacpi test.
Comment 10 Roman Bysh 2012-06-27 16:44:36 UTC
The problem only exists during the 'first time installation' of openSUSE on a 500 GB SATA II Western Digital drive. The problem is that I have 12.1 on the first
three primary partitions and 12.2 Beta 2 on the last 2 partitions in the extended partition.

Go back and look at the kernel provided with 12.1 DVD_x86-64 and the ACPI and compare it with what you have now on 12.2 Beta 2.

Is there a command that I can use with the default installation to debug or write a log file to my USB drive?

This info would be very helpful to you. Yes?
Comment 11 Roman Bysh 2012-06-27 17:39:57 UTC
Created attachment 496635
Info about kernel provided with 12.2 M1

I have uploaded info about the kernel provided with 12.2 Milestone 1
Comment 12 Roman Bysh 2012-06-27 18:07:30 UTC
Thomas

At the end of (In reply to comment #9)
> > Sorry. I cannot provide a screenshot. I do not have a camera.
> Ok, it's somewhat difficult then without any output of the affected system.
> 
> You say, this happens since:
> Since 12.2 M1 - M3
> 
> Do you still remember a as recent as possible working kernel?
> Like that I have a chance to look at relevant patches which came in in this
> time frame.
> 
> I have to read up quite some mails on the acpi list. Hopefully I see something
> related. If not I may try to remove the latest ACPI 5.0 patches, they are
> typically rather separated and this may work out without too much work.
> Stay tuned, this may take time.
> 
> If pci=noacpi works, most/all power management features should work and this
> may be a sufficient workaround for you for some time.

> If pci=noacpi works, please also try pci=nocrs.
> Setting needinfo flag for pci=noacpi test.


What is the exact command for "set needinfo flag for pci=noacpi test"?
Comment 13 Roman Bysh 2012-06-27 18:17:09 UTC
Can dmesg help?
Comment 14 Thomas Renninger 2012-06-28 07:46:02 UTC
> What is the exact command for "set needinfo flag for pci=noacpi test"?
These are 2 different things:
  -> set needinfo flag for this bug. If you have a look at the state of the bug,
     it shows that it's up to you to provide some information. This makes
     handling easier if one has to deal with quite a lot of bugs.

  -> pci=noacpi
     Sorry, I should have been more clear about that.
     This is a boot parameter. You can add additional boot parameters at the
     point when the SUSE DVD got booted and you choose between:
      - Boot from Harddisk
      - Installation
      - ...
     There is a field where you can add boot parameters, there add:
     pci=noacpi
     and choose the Installation (as you've done before).
     If pci=noacpi helps, you should be able to use the system without any bad
     side effects.
     Switching off ACPI in general (acpi=off, safe settings) will result in
     missing some important hardware features.
     It looks like interrupt or resource settings provided by ACPI prevent one
     device (the disk?) from being initialized properly.
     If the kernel is able to set it up without ACPI info (pci=noacpi)
     everything should be just fine.
     It would still be interesting to find out why this happens, but you can at
     least work with the system.
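
For anyone following along: a quick way to confirm that a parameter such as pci=noacpi really reached the kernel is to look at /proc/cmdline from a console. This is only a minimal sketch, assuming a POSIX shell is available; the helper name has_boot_param is made up for illustration and is not part of any openSUSE tool.

```sh
# has_boot_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline; passing it explicitly
# makes the helper easy to try out. The function name is hypothetical.
has_boot_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (commented out): report whether pci=noacpi is active.
# has_boot_param pci=noacpi && echo "pci=noacpi is active"
```

Because the match is on the whole word, a mistyped "pci-noacpi" would not be reported as "pci=noacpi".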
Comment 15 Roman Bysh 2012-06-28 15:11:16 UTC
Follow Up

I tried "pci-noacpi" and "pci-nocrs" and they both hang at 60 percent 
"Searching for Linux partitions".
Comment 16 Roman Bysh 2012-06-28 15:14:07 UTC
Correction

I typed a hyphen instead of an equal sign in comments.

pci=noacpi and pci=nocrs
Comment 17 Roman Bysh 2012-07-13 16:34:21 UTC
I just ran the installation of openSUSE 12.2 RC1 and the problem still exists.

My P5Q mobo uses ACPI 2.0.


Note
pci=noacpi and pci=nocrs do not work. 

The installer still hangs at 60 percent "Search for Linux partitions".

Any update on the patches? In the meantime, it looks like this will not be resolved for 12.2.

Is there a way that I can be testing this after openSUSE 12.2 is released?
Comment 18 Andreas Jaeger 2012-07-13 20:28:24 UTC
It looks like you provided the info, let's remove the NEEDINFO flag.
Comment 19 Thomas Renninger 2012-07-16 08:02:38 UTC
One specific ACPI hang has been fixed, but the patch did not make it into RC1 in time:
[Bug 44171] BUG: unable to handle kernel NULL pointer dereference at acpi_ns_check_object_type
https://bugzilla.kernel.org/show_bug.cgi?id=44171

It's hard to say whether it matches your machine without some info from the crash.
You can boot with vga=normal, then you might see the kernel crashing with some backtrace info?

If this function shows up there somewhere: acpi_ns_check_object_type()
it will get fixed with the next release.
Otherwise, could you write down the backtrace (the last executed functions before the crash happens) if possible? It would also be interesting to know what kind of crash it is (NULL pointer dereference, ...) or whether things just hang.
Comment 20 Roman Bysh 2012-07-16 16:58:15 UTC
Okay. I'll give it a try.
Comment 21 Roman Bysh 2012-07-25 15:57:00 UTC
I'm getting access to a camera and will follow up with a screenshot.

I have watched the boot up and it flickers intensely.
Comment 22 Thomas Renninger 2012-08-02 14:22:03 UTC
> I have watched the boot up and it flickers intensely.
Does that mean it's readable/photographable?

Try with "Text mode" if you boot from install DVD.
And/or add vga=normal as boot parameter.

Hm, possibly the nomodeset boot param helps. But this one depends on which graphics card you have:
radeon.modeset=0   (ATI graphics card)
nouveau.modeset=0  (NVidia graphics card)
may help.
Comment 23 Roman Bysh 2012-08-02 15:54:19 UTC
It should be readable. I will follow up.
Comment 24 Roman Bysh 2012-08-06 17:06:56 UTC
Created attachment 501279
Photo of screen

You may have to scale the photo and invert the colors to read it.

As I said before it's very blurred. I'll see if I can borrow a different camera.
Comment 25 Roman Bysh 2012-08-06 17:41:25 UTC
Try the sharpen filter in Gimp. Or if you have Adobe it could make it more legible.

I don't understand why this problem is happening with the 500 GB Western Digital Blue drive? It's such a common drive.

My P5Q is using ACPI 2.0. Check the WD specs for the drive and Asus for the P5Q.

One of your patches to the kernel for 12.2 Milestone 1 and up broke something.
Comment 26 Tom C 2012-09-10 19:24:10 UTC
This bug now exists in the final release. I am unable to install and it hangs in the exact same place as described here. I have a Gigabyte UDR3 motherboard and an Intel Q9300 processor. I have two 640GB Western Digital Black HDDs in a RAID 1 configuration using Intel Matrix (ICH10R), running Windows and used for storage. I have a third drive (same specs), which is not a RAID volume, where openSUSE is installed. It has been happily running 11.4 there since it was released. Now I am unable to install 12.2 due to this problem. Please fix it soon!
Comment 27 Roman Bysh 2012-09-10 20:22:50 UTC
For now I am typing in the Boot Options:

acpi=off
Comment 28 Thomas Renninger 2012-09-11 12:18:27 UTC
> Please fix it soon!
Tom: I need some data to be able to fix this. This is a very platform specific bug.

> You may have to scale the photo and invert the colors to read it.
I cannot see anything on this screen shot. It looks like some ACPI paths show up there, but I cannot read it. If possible boot with vga=normal parameter and also try to scroll up a bit and photograph the beginning of the error messages (and some non-error messages to be able to get an idea at which stage of booting this happens).
Comment 29 Thomas Renninger 2012-09-11 14:09:17 UTC
If you have installed with acpi=off, you should also try the very latest kernel to see whether the issue has already been addressed in mainline:
http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64
Best download the -default flavor and install it via:
rpm -ivh kernel-default-xy.rpm
That way the old kernel is not overwritten and you can still switch to the original 12.2 kernel easily.
Comment 30 Roman Bysh 2012-09-11 17:51:38 UTC
As mentioned before, the problem happens during the installation. Once I'm up and running, I remove the command from the kernel options and I'm okay.

How am I to resolve this? Create a new DVD?

Can you create a DVD ISO without the patch you applied before the first 12.2 Milestone was created? That is the only way to resolve this problem.

I just put in a new 500 GB Western Digital "Black" drive and the BIOS on the 
P5Q motherboard has been updated to v2209. The problem still exists.

I have not had any problems with any openSUSE milestone or final release
from openSUSE 10.3 to 12.1.

That's a lot of good information for you to work on. One of your patches pre-12.2 created this problem.

The P5Q motherboard and WD 500 GB drive are very Linux friendly. I've had no problems with any distro. And I've tried them all.
Comment 31 Roman Bysh 2012-09-11 17:54:10 UTC
Tom

I thought you were also addressing my problem.
Comment 32 Roman Bysh 2012-09-11 17:58:51 UTC
If I create a short movie of this would it help? The words flicker so fast that a single shot cannot catch this problem.
Comment 33 Roman Bysh 2012-09-11 18:03:45 UTC
Another option is to create a new DVD ISO with the latest kernel. Save it to your home repository for us to download and see if it hangs in the same place. 
Any thoughts?
Comment 34 Thomas Renninger 2012-09-12 13:39:02 UTC
> As mentioned before the problem happens during the installation? Once I'm up
> and running, I remove the command from the kernel options and I'm okay.
Ah ok, you mentioned that in comment #10, that's an important detail I've overlooked.

In fact this is really strange...
This does not have anything to do with the disk or partition layout.
ACPI may be involved when the sata/disk driver is loaded, but once the HW is working, ACPI is out of the game.
It looks like the install (userspace) system is trying to load an ACPI driver manually which otherwise would not be loaded, or is playing in /sys or /proc, which triggers ACPI code to be interpreted by the kernel.
What is strange is that this seems to happen in parallel with the disk being scanned for other Linux/OS installations.
Does it always hang at exactly the same point?
If there is a thermal or other HW event happening..., this would explain why ACPI is involved again, but this should not always happen at exactly the same stage.

One test could be to blacklist all kinds of ACPI drivers for the installation via:
brokenmodules=video,acpi-cpufreq,processor,thermal,button,wmi,battery

You find more here in an installed system (adjust the kernel version):
find /lib/modules/3.0.4-0.0.0.1.a432f18-default/kernel/drivers/acpi/ |grep "\.ko"

Best add every ACPI driver you can find to brokenmodules=.
The guy who is most involved in the installer (linuxrc) is on vacation until today. I will try to reach him as soon as he's back. But I fear he also won't have much of a clue, as the problem (ACPI, HW related) and the time at which it is happening (when all HW should have been initialized already) are really strange.
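
The find command above can be turned into a ready-made boot parameter. This is only a minimal sketch, assuming a POSIX shell with find, sed, sort and paste; the function name acpi_brokenmodules is hypothetical, and whether linuxrc expects module names with hyphens or underscores is not verified here.

```sh
# acpi_brokenmodules DIR
# Prints a brokenmodules= boot parameter naming every kernel module file
# (*.ko, possibly compressed) found below DIR. The function name is
# hypothetical; DIR would normally be something like
# /lib/modules/$(uname -r)/kernel/drivers/acpi
acpi_brokenmodules() {
    dir=$1
    list=$(find "$dir" -type f -name '*.ko*' 2>/dev/null |
        sed -e 's|.*/||' -e 's|\.ko.*$||' |   # keep only the module name
        sort -u |
        paste -sd, -)                          # join the names with commas
    printf 'brokenmodules=%s\n' "$list"
}
```

Called as acpi_brokenmodules /lib/modules/$(uname -r)/kernel/drivers/acpi in an installed system, it prints a single line that can be typed into the Boot Options field, which avoids typos when copying a long list by hand.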
Comment 35 Roman Bysh 2012-09-12 16:39:31 UTC
Once I add:

brokenmodules=video,acpicpufreq,processor,thermal,button,wmi,battery

Which file should be checked? Can I create a log file and what will it reveal?
Comment 36 Roman Bysh 2012-09-12 17:26:46 UTC
>It looks like the install (userspace) system is trying to load an ACPI driver >manually which otherwise would not be loaded or is playing in /sys or /proc >which triggers ACPI code to be interpreted by the kernel. Strange is that this >seem to happen in parallel when the disk is scanned for other Linux/OS >installations. Does it always hang at exactly the same point?

Yes. It always hangs in the same place


I tried "brokenmodules=video,acpicpufreq,processor,thermal,button,wmi,battery" and it hangs at the same spot.

I'm going to try the rest that were not listed from the ACPI folder.
Comment 37 Roman Bysh 2012-09-12 18:08:18 UTC
Update

I've added everything from the ACPI folder to Boot Options.

brokenmodules=video,acpicpufreq,processor,thermal,button,wmi,battery,ac,acpi_ipmi,acpi_memhotplug,acpi_pad,bgrt,container,ec_sys,fan,pci_slot,sbs,sbshc

It hangs in the same spot.
Comment 38 Thomas Renninger 2012-09-12 18:48:41 UTC
If the machine is connected via LAN and does not totally hang (Caps Lock LED, IRQs etc. still work),
you can try to install with these boot parameters:
usessh=1 sshpassword=mypass

You should then see a message with the host name and that you can log in via ssh and start the installation by typing yast.
Follow the instruction(s)...
With some luck the machine is not totally dead and you can log in via ssh root@... again when it hangs, copy away dmesg (via scp), and look around a bit to see why yast may hang:
top and /var/log/YaST2 are very important then. There should even be a binary which collects all yast logs and bzips them (y2logs or similar).
Comment 39 Thomas Renninger 2012-09-13 11:44:55 UTC
After talking with the install guys:
If we are lucky, you may be able to reproduce this in the installed system by:
   - removing the acpi=off parameter which has been "inherited" by explicitly
     adding acpi=off to the install kernel boot parameters
   - Boot and start:
       - hwinfo
          -> does it cause the hang?
       - yast2 disk
          -> this should initialize disk HW similar to what happens at the
             installation stage when the machine hangs
If you can reproduce this in an installed system, it should be much easier to retrieve debug info, some logs or to further try to limit the issue (e.g. if it's hwinfo which causes the hang, each device can be probed separately, e.g. hwinfo --disk, ...).
Comment 40 Roman Bysh 2012-09-13 16:51:25 UTC
Can you please put this in step format?

1. 
2. etc...
3. etc...
Comment 41 Thomas Renninger 2012-09-14 08:08:30 UTC
Sure:
1. Try to reproduce in an installed system with acpi=off removed
================================================================

1.1 Boot the installed system with acpi=off removed from grub2 configs
    Not exactly sure where yast puts the acpi=off param when installed with
    this param with grub2.
    Could be:
    /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash=silent acpi=off"
    then remove acpi=off and run:
    grub2-mkconfig -o /boot/grub2/grub.cfg
    or just remove it from /boot/grub2/grub.cfg to give it a try

1.2 When booted run:
      1.2.1 hwinfo
            -> does it cause the hang?
            If yes, try to find out on which HW probing it hangs (hwinfo --disk?)
      1.2.2 yast2 disk
            -> this should initialize disk HW similar to what happens at the
            installation stage when the machine hangs

Try to find out more if it hangs (jump to 3.)
If this does not trigger the issue, continue with 2.:

2. Try to obtain info by starting an ssh installation
=====================================================

2.1 Make sure LAN cable is connected and the machine gets an IP via
    dhcp (not sure, but I guess static IP can also be passed later)

2.2 Put in the install DVD and pass these extra boot parameters:
    usessh=1 sshpassword=mypass

2.3 Follow the instructions on the screen after the kernel booted.
    Should be something like: "Log into the system via root@host_xy and start
    yast" (password is mypass or whatever you pass via sshpassword= boot
    param).

2.4 Once you run into the issue again, does the machine still ping and can
    you log in again? If not, the kernel is in an inconsistent state and we
    cannot do much more.
    But if you can log in, try to obtain as much info as you can, see 3.

3. Get logs
===========

Important logs are:
  - dmesg
  - If yast is involved, /var/log/YaST2 (best by calling
    save_y2logs /tmp/yast2logs.tar.bz2)
  - run top and see if a process is hanging
  - ...
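
Step 1.1 above can be sketched as a small helper. This assumes the parameter really lives on the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub, which (as noted above) is not certain; the function name strip_acpi_off is made up for illustration.

```sh
# strip_acpi_off FILE
# Removes the acpi=off token from the GRUB_CMDLINE_LINUX_DEFAULT line of
# FILE and keeps the previous version as FILE.bak.
# The function name is hypothetical.
strip_acpi_off() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Only the GRUB_CMDLINE_LINUX_DEFAULT line is touched.
    sed -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ *acpi=off//' "$file.bak" > "$file"
}

# Example (commented out, needs root):
# strip_acpi_off /etc/default/grub && grub2-mkconfig -o /boot/grub2/grub.cfg
```

Keeping the .bak copy makes it easy to put acpi=off back if the system no longer boots without it.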
Comment 42 Roman Bysh 2012-09-14 18:09:04 UTC
(In reply to comment #41)
> Sure:
> 1) Try to reproduce in an installed system with acpi=off removed
> ================================================================
> 
> 1.1 Boot the installed system with acpi=off removed from grub2 configs
>     Not exactly sure where yast puts the acpi=off param when installed with
>     this param with grub2.
>     Could be:
>     /etc/default/grub
>     GRUB_CMDLINE_LINUX_DEFAULT="quiet splash=silent acpi=off"
>     then remove acpi=off and run:
>     grub2-mkconfig -o /boot/grub2/grub.cfg
>     or just remove it from /boot/grub2/grub.cfg to give it a try
> 
> 1.2 When booted run:
>       1.2.1 hwinfo
>             -> does it cause the hang?
>             If yes, try to find out on which HW probing it hangs (hwinfo disk?)
>       1.2.2 yast2 disk
>             -> this should initialize disk HW similar to what happens at the
>             installation stage when the machine hangs
> 
Response

No problem with hwinfo.

What has changed with the kernel in 12.1 compared to 12.2?

What if I were to try out the live cd?


OTOH I might be delayed due to the fact that I've hit my monthly maximum.
Comment 43 Roman Bysh 2012-09-14 18:14:20 UTC
Update

We found an error in Grub2's device map that causes a serious hang.
Open /boot/grub2/device.map and remove the line referring to fd0 and /dev/fd0.

However, I don't think Grub2 has anything to do with this issue due to the fact that it comes after the "System Probing".
Comment 44 Roman Bysh 2012-09-14 18:19:05 UTC
I'm wondering why you want me to deviate via the ssh route trying an alternate method for installation when the problem lies within the "System Probing".

It sits for a while at hard disk and then hangs probing for Linux partitions.
Comment 45 Roman Bysh 2012-09-14 18:27:17 UTC
Could there be an issue with an interrupt? Please look at 12.1 versus 12.2.
Besides upgrading drivers and memory issues with the kernel, could there be a patch (interrupt?) that is preventing the installer from going past the same hang point?

I know there are many other people that are having the same problem. 
I'm encouraging more users to weigh in on this issue.
Comment 46 Roman Bysh 2012-09-14 18:30:50 UTC
I would encourage openSUSE to engage more in testing the installer using real hardware versus testing with KVM. Virtual testing misses a lot.

Any one else using the P5Q mobo in Germany?
Comment 47 Thomas Renninger 2012-09-17 09:33:04 UTC
> Any one else using the P5Q mobo in Germany?
Yep, this is probably a very machine/BIOS specific issue.

> I'm wondering why you want me to deviate via the ssh route trying an alternate
> method for installation when the problem lies within the "System Probing".
This is the only way to be able to retrieve logs/debugging output when the system hangs during installation. It's not really an alternative installation method. It's just that an ssh daemon is started before the installation starts.
If the system does not hang totally (irqs are still delivered and network, etc. still works to some extent), you can then log in, examine why it hung and copy away logs over the network.

> It sits for a while at hard disk and then hangs probing for Linux partitions.
I cannot see that this has to do with Linux partition probing. It must be something HW related, because of ACPI being involved. Possibly yast tries to configure software raid or similar at this point.

I need more info. A clear picture of the hang, yast logs, dmesg, top, ... when the system hangs or whatever. As said, if the machine's network still responds when the system hangs, please start a ssh installation and try to copy this info over network via scp.
Comment 48 Tom C 2012-09-17 22:44:51 UTC
Hi Roman and Thomas, I wasn't able to install with acpi=none. When I do this, the kernel loads but then it's no longer able to recognize my DVD drive. It prompts me to enter disk 1. I went through all the options trying to get it to recognize the drive again but no go. I am using an IDE DVD drive, maybe this won't work without ACPI?

In any case, I decided to try it again without acpi=none just to make sure it would still read the drive. It did, and it hung at the same spot, "searching for Linux partitions". Then I got a phone call so I just walked away and left it there. I was on the phone for about half an hour and when I came back, to my surprise, it had gone past it! So for me it wasn't completely hung, it just takes a really long time.

It's in the process of installing now. Once I get the install done, maybe I can try these other steps to collect the data.
Comment 49 Michael Shields 2012-09-28 19:06:44 UTC
I battled with what I believe to be this issue last night. I think that my struggle resulted in one potentially important symptom that I didn't see in my review of the comments for this issue. (If it is, my apologies.) What I noticed is that while it was "hung" in "Search for Linux partitions", it was actually making many repeated I/O requests of my IDE DVD drive. All the requests heated up the IDE DVD drive. Like Tom C, the install eventually (after _many_ minutes) did move beyond the "hang". But unlike Tom C, I was not able to continue past the prompt for the distribution disk; attempting to do so resulted in a "no medium in drive" error (but of course it was in the drive).

The acpi=off boot option did not work around this issue for me. My only "workaround" was to go back to 12.1, which I have done (life must go on). So unfortunately, I won't be able to provide any additional info. :-( But +1, IMHO.
Comment 50 Thomas Renninger 2012-10-01 16:06:28 UTC
What is urgently needed are yast logs when this is happening.
One can only roughly guess which yast installer part is involved and what it currently is doing.
Yast logs everything in detail to /var/log/YaST2 or even better, there is a tiny executable which collects all needed data:
save_y2logs /tmp/yast2logs.tar.bz2
The yast logs can either be copied via scp if usessh=1 sshpassword=xxx have been passed as boot parameters (see the description of ssh installation in comment #41, 2.), or maybe one can mount a USB stick during installation (CTRL-ALT F1 or F2... should bring you to a prompt/console), copy the yast logs onto it and attach them to this bug.
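
Collecting the logs in one go could look roughly like this. It is only a sketch, assuming a POSIX shell on the console; the function name collect_hang_logs is hypothetical, and save_y2logs is only called if it is actually present.

```sh
# collect_hang_logs DEST
# Saves whatever diagnostics are available into the directory DEST.
# The function name is hypothetical. Each step is best effort, so one
# missing tool does not stop the others.
collect_hang_logs() {
    dest=$1
    mkdir -p "$dest" || return 1
    # Kernel ring buffer (may need root).
    dmesg > "$dest/dmesg.txt" 2>/dev/null || true
    # One batch-mode snapshot of the process list.
    top -b -n 1 > "$dest/top.txt" 2>/dev/null || true
    # YaST logs, using the collector named above if it is installed.
    if command -v save_y2logs >/dev/null 2>&1; then
        save_y2logs "$dest/yast2logs.tar.bz2" || true
    fi
}
```

For example, with a USB stick mounted at /mnt (the mount point is an assumption), collect_hang_logs /mnt/bug757858 would leave everything in one directory that can be attached here.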

Michael: What kind of HW (mainboard) is this?
Hm, maybe you can attach dmidecode (from 12.1), there the HW/BIOS version details are listed.
Comment 51 Thomas Renninger 2012-10-06 23:19:26 UTC
I may have found something.
It's about D3hot runtime power management for SATA devices.
It was introduced by mainline commit 9ee4f3933930abf5cc34f8e9 in kernel 3.3.
It got disabled by default in later kernels after 12.2 with
    mainline git commit 0c8d32c2
    libata: forbid port runtime pm by default, fixing regression

Unfortunately the changelog does not mention what kind of regressions are fixed.
It is possible to disable this feature at runtime.
Best is to disable runtime power management for all SATA devices for testing whether this is the problem:

By default the value should be "auto":
cat /sys/devices/pci0000:00/*/ata*/power/control

This disables runtime power management for these devices (yes "on" disables it...):
for file in /sys/devices/pci0000:00/*/ata*/power/control;do echo on >$file;done

You can do that while Yast has already started, but before you reach the "Search for Linux partitions" part, by switching to a console:
Hitting CTRL-ALT-F1 switches to the main console.
Hit ALT-<right cursor> to move to the next console until you get a prompt.
Now execute above line:
for file in /sys/devices/pci0000:00/*/ata*/power/control;do echo on >$file;done

But this only makes sense if you have SATA DVD drives.
Michael (Shields) mentioned in comment #49:
> All the requests heated up the IDE DVD drive.
Is this really an IDE or could it be a SATA drive?
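
The one-line loop above can also be written so that it shows what it changed, which makes it easier to report back the previous values. This is a sketch under the assumption that the sysfs layout is the one given above; the function name force_runtime_pm_on is made up for illustration.

```sh
# force_runtime_pm_on [BASE]
# Writes "on" to every ata*/power/control file below BASE, which keeps the
# ports permanently powered (runtime power management disabled), and prints
# the old and new value of each file. BASE defaults to the sysfs path used
# above. The function name is hypothetical.
force_runtime_pm_on() {
    base=${1:-/sys/devices/pci0000:00}
    for file in "$base"/*/ata*/power/control; do
        [ -w "$file" ] || continue    # also skips an unmatched glob
        old=$(cat "$file")
        echo on > "$file"
        printf '%s: %s -> %s\n' "$file" "$old" "$(cat "$file")"
    done
}
```

If every line already reads "on -> on", runtime power management was not active in the first place and this is unlikely to be the cause.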
Comment 52 Roman Bysh 2012-10-16 18:04:29 UTC
Follow Up

I passed the parameters "usessh=1 sshpassword=mypass" in Boot Options.
Followed by the message:

login using 'ssh -X root@192.xxx.x.xx'
use yast.ssh to start the installation

After typing in ssh -X root@192.xxx.x.xx and pressing the <Enter> key, it sits there and does nothing.
Comment 53 Roman Bysh 2012-10-16 19:17:43 UTC
Follow Up

When using the ssh option for installation and I am prompted to use:

     ssh -X root@192.168.0.12

I never see a login prompt such as " # ". If I press ALT+F2, I then see a prompt.
However, if I switch to another console prompt and I try ssh -X root@ I get a message that it is already running.
Comment 54 Tom C 2012-10-17 02:57:31 UTC
(In reply to comment #51)
> 
> By default the value should be "auto":
> cat /sys/devices/pci0000:00/*/ata*/power/control
> 

I did this on my machine, and the values are already set to ON for all. If I am reading your response correctly, this means it's already disabled. I am still having the problem.


> 
> But this makes only sense if you have SATA DVD drives.
> Michael (Shields) mentioned in comment #49:
> > All the requests heated up the IDE DVD drive.
> Is this really an IDE or could it be a SATA drive?
>

In my machine the DVD is definitely an IDE drive. The hard disks are SATA.
Comment 55 Tom C 2012-10-17 03:11:47 UTC
(In reply to comment #50)
> What is urgently needed are yast logs when this is happening.
> Yast logs everything in detail to /var/log/Yast2 or even better, there is a
> tiny executable which collects all needed data:
> save_y2logs /tmp/yast2logs.tar.bz2
>

I didn't redo the install, however the same slowness also occurs in yast when you bring up the 'partitioner' feature. It sits there for a VERY long time displaying the message "Volumes are being detected".  I assume this is the very same problem that happens during install. 

So I used the save_y2logs command above a few times while it was sitting there running and I was waiting for it. Then I waited it out, and ran it one final time after it had brought up the expert partitioner window. Hopefully this is what you need, so I removed NEEDINFO flag on this bug. If this doesn't do it, please put the flag back and let me know if there is something else I can help you collect. 

yast2logs-4.tar.bz was the most recent one I made BEFORE it finished detecting the volumes. yast2logs-final.tar.bz was made after it passed that step and loaded the main window.
Comment 56 Tom C 2012-10-17 03:16:56 UTC
Created attachment 509788 [details]
yast logs while expert partitioner was scanning volumes
Comment 57 Tom C 2012-10-17 03:17:52 UTC
Created attachment 509790 [details]
yast logs after expert partitioner fully loaded
Comment 58 Thomas Renninger 2012-10-17 15:43:18 UTC
> I didn't redo the install, however the same slowness also occurs in yast when
> you bring up the 'partitioner' feature. It sits there for a VERY long time
> displaying the message "Volumes are being detected".  I assume this is the very
> same problem that happens during install.
Perfect!
That makes things *much* easier to test/verify.

Things seem to get stuck in a blkid call issued by yast.
After talking with installation and Yast partitioner guys, there may already be a duplicate (bug #757368).

It seems that a (non-existent?) floppy drive is being probed.
You can test this by not loading the floppy driver.
During installation pass this grub/kernel boot parameter:
BrokenModules=floppy
or easier when you can reproduce this at runtime, make sure the floppy driver is not loaded:
Add this line to /etc/modprobe.d/99-local.conf:
blacklist floppy
and remove the module manually if it already got loaded:
rmmod floppy
and try again to start the yast partitioner.

Unfortunately kernel logs are not included in the yast logs you posted.
But if this happens you should see quite some kernel messages pointing to the floppy driver in dmesg (command) or /var/log/messages similar to what got reported in the bug I mentioned above.

If you can verify that this is about the floppy driver, I am going to poke the guys in the other bug that this is more urgent than they think it is...
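
The runtime steps above can be sketched as two small shell helpers. This is only an illustration: the function names blacklist_module and module_loaded and their optional file arguments are hypothetical, while the config path, the blacklist line and the rmmod call are the ones given above.

```shell
# Append "blacklist <module>" to a modprobe config file unless it is already there.
# $1 = module name, $2 = config file (defaults to the file named above)
blacklist_module() {
    mod="$1"
    conf="${2:-/etc/modprobe.d/99-local.conf}"
    grep -qx "blacklist $mod" "$conf" 2>/dev/null || echo "blacklist $mod" >> "$conf"
}

# Succeed if the module is currently loaded.
# $1 = module name, $2 = module list to read (defaults to /proc/modules)
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}
```

As root, this would be used as: blacklist_module floppy; if module_loaded floppy; then rmmod floppy; fi. Afterwards, dmesg | grep -i floppy shows whether the kernel logged any floppy driver messages.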
Comment 59 Thomas Renninger 2012-10-17 15:47:34 UTC
Hm, one guy mentioned:
> Fortunately, the issue appears to be fixed in 12.2 Beta 2.
Still it would be great if you give the "not loading floppy driver" a try to be able to further limit the issue. It should be much easier to test things now.

Please also have a look at /var/log/messages (best attach it) if this still happens.
Comment 60 Roman Bysh 2012-10-17 17:56:41 UTC
Follow Up

I had noticed that my motherboard was getting very dirty and the CPU fan was completely blocked up. Everything was completely cleaned including controller pins.

I then reburned a BIOS upgrade and the problem went away. The installer no longer hangs.
Comment 61 Roman Bysh 2012-10-17 18:01:17 UTC
Created attachment 509901 [details]
Latest yast2logs

Today's yast2 logs.
Comment 62 Roman Bysh 2012-10-17 18:02:28 UTC
Created attachment 509902 [details]
Today's dmesg

Today's dmesg
Comment 63 Roman Bysh 2012-10-17 18:20:29 UTC
I recently had downgraded the BIOS to a Hackintosh BIOS v2.2208. After figuring out how to run the ssh installation, I had noticed that the installer did not hang when "Searching for Linux partitions". 

So, I rebooted and tried installing using the default values and again the installer did not hang. Thinking that the BIOS had something to do with it, I upgraded the BIOS back to the Asus BIOS v.2.209.

An installation was restarted and the installer still did not hang.

I'm satisfied that this has finally been resolved and the ASUS P5Q mobo is  openSUSE-friendly. Works right out of the box.
Comment 64 Roman Bysh 2012-10-17 18:22:38 UTC
Thomas

Could an overheating CPU cause this problem?
Comment 65 Tom C 2012-10-18 01:32:30 UTC
(In reply to comment #58)
>
> make sure the floppy driver is not loaded:
> Add this line to /etc/modprobe.d/99-local.conf:
> blacklist floppy
> and remove the module manually if it already got loaded:
> rmmod floppy
> and try again to start the yast partitioner.
> 

Hi Thomas, I did this, and the problem is gone! The partitioner now goes through the search VERY quickly. For the record, there is no floppy disk drive attached to this system.
Comment 66 Thomas Renninger 2012-10-18 09:44:27 UTC
Wow..., let me summarize this:

Roman:
> Thinking that the BIOS had something to do with it
Yep, definitely. This is something that only happens on specific platforms/BIOSes. The fact that "acpi=off" works already indicates this (ACPI is the HW interface/configuration from BIOS to OS). It also only happens on very specific platforms; otherwise we would have been spammed with bugs, or rather, 12.2 would never have been released with such a bug.

> Could an overheating CPU cause this problem?
No. What happens when the CPU overheats is HW specific, typically the OS is told to restrict CPU frequency. But CPU power would still be enough for (nearly) the same IO (disk, DVD,...) performance.
There are other techniques, e.g. throttling (or even shutdown, ...). There the CPU is even slower, but it would not get stuck like that and always at the same point.

I strongly suspect that it had to do with the floppy probing, because of the almost exactly 10-minute timeout seen in the yast logs. There exists a similar report that very much fits this description.
Besides the BIOS there must also have been a kernel modification, as this did not happen with older kernels, right? If you could not run into the issue anymore with the same BIOS, maybe you ran "zypper update" and the kernel got fixed for this special case again via stable kernel updates.

Tom:
So you did not modify the BIOS and the issue is solved by not using the floppy driver?
Can you double check that you have the latest 12.2 kernel running (zypper update)? I already went through the floppy driver changes, but could not find what may have caused this regression.
But there are 2 recent patches in mainline that we do not have in 12.2, one from Jiri Kosina from SUSE:

commit 070ad7e793dc6ff753ee682ef7790b3373b471f6
Author: Jiri Kosina <jkosina@suse.cz>
Date:   Fri May 18 13:50:25 2012 +0200

    floppy: convert to delayed work and single-thread wq

and an on-top fix on that from Linus:

commit dab058fd5ff834cb3b9de1d930ce731a605eb0c6
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Jul 3 15:51:22 2012 -0700

    floppy: cancel any pending fd_timeouts before adding a new one

If the problem still exists in the latest 12.2 maintenance kernel, I can build a 12.2 kernel with these two patches. If they help and Jiri tells me it makes sense, they could be provided with the next maintenance update kernel.
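
A stall like the roughly 10-minute one mentioned above can be located in a log with a small gap finder. This is a sketch under assumptions: it expects every relevant line to start with a "YYYY-MM-DD HH:MM:SS" timestamp (the real yast log layout may differ), it does not handle gaps across midnight, and the name find_log_gaps and the 300-second default are made up for illustration.

```shell
# Print every log line that follows a pause of at least N seconds.
# $1 = log file, $2 = minimum gap in seconds (default 300)
find_log_gaps() {
    awk -v min="${2:-300}" '
        # Only consider lines whose second field looks like HH:MM:SS.
        $2 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            split($2, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen && now - prev >= min)
                printf "%d s gap before: %s\n", now - prev, $0
            prev = now
            seen = 1
        }' "$1"
}
```

For example, find_log_gaps y2log 500 would list only the lines preceded by more than about eight minutes of silence; the file name y2log is a placeholder for whichever log is unpacked from the attachment.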
Comment 67 Tom C 2012-10-18 15:59:22 UTC
>
> So you did not modify the BIOS and the issue is solved by not using the floppy
> driver?
>

Correct, I did NOT modify BIOS. I only disabled the floppy driver. 

>
> Can you double check that you have the latest 12.2 kernel running (zypper
> update). 
>

I will have to check later on when I get home from work to see if there are any kernel updates applied, but I don't think so.
Comment 68 Axel Keller 2012-10-30 21:53:11 UTC
I just installed version 12.2 on three older computers.

The installation on the first one worked fine; on the last two the search for Linux partitions hung. Even though only one of those two has a floppy drive, one was configured in the BIOS for both. After clearing all floppy disk entries in the BIOS, the installations worked fine.

Thank you for this hint.
Comment 69 Thomas Renninger 2012-10-31 08:55:22 UTC
Thanks.
Unfortunately this cannot be changed anymore on the install media.
Would be great if you can spread this info (disable floppy in BIOS or use broken_modules=floppy boot parameter for installation on affected (older?) systems) if you see anyone else out there having this problem.

I could imagine with Jiri's latest changes (and a "timeout fix" on-top from Linus sounds promising) this may not happen anymore with latest kernels. Up to him whether it's worth investigating further and find a possible fix for 12.2.

*** This bug has been marked as a duplicate of bug 773058 ***