Bug 1008444

Summary: Installer hangs at script 90linux-distro
Product: [openSUSE] openSUSE Distribution
Reporter: Dieter Jurzitza <dieter.jurzitza>
Component: Installation
Assignee: Michael Chang <mchang>
Status: RESOLVED FIXED
QA Contact: Jiri Srain <jsrain>
Severity: Critical
Priority: P2 - High
CC: dieter.jurzitza, igonzalezsosa, mchang, mlin
Version: Leap 42.2
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
  Requested information with regard to the system setup
  os-prober debug package
  os-prober.log as requested ....

Description Dieter Jurzitza 2016-11-04 05:48:44 UTC
When installing openSUSE Leap 42.2 (latest version from the openSUSE server), the following happens:

At about 75% of the installation process, the following shows up in the process table:

/bin/sh /usr/lib/os-probes/mounted/90linux-distro /dev/sdb6 /var/lib/os-prober/mount ext2

From then on nothing happens and the installation makes no progress. After waiting for a long time I killed this process; I had to do so over and over again until the installation finally finished.

System: two hard disks, /dev/sda and /dev/sdb, "traditional" setup with MBR. /dev/sdb6 is a Linux partition, ext4-formatted (the home partition of openSUSE 13.1). Leap 42.2 is installed on /dev/sda.

Luckily, the boot setup was not broken by this bug.

If you need feedback on this, please hurry; I have only limited access to the computer in question during the next weeks.
As this bug breaks installation for regular users, I'd consider it critical.


Partition scheme:
Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048   2056191   2054144  1003M 83 Linux
/dev/sda2         2056192 976773119 974716928 464.8G  f W95 Ext'd (LBA)
/dev/sda5         2058240 131604479 129546240  61.8G 83 Linux
/dev/sda6       131606528 954132479 822525952 392.2G 83 Linux
/dev/sda7       954134528 976773119  22638592  10.8G 82 Linux swap / Solaris

Device     Boot      Start        End    Sectors   Size Id Type
/dev/sdb1             2048    2056191    2054144  1003M 83 Linux
/dev/sdb2          2056192 1953523711 1951467520 930.5G  f W95 Ext'd (LBA)
/dev/sdb5          2058240   65802239   63744000  30.4G 83 Linux
/dev/sdb6         65804288  131604479   65800192  31.4G 83 Linux
/dev/sdb7        131606528 1937053695 1805447168 860.9G 83 Linux
/dev/sdb8       1937055744 1953488895   16433152   7.9G 82 Linux swap / Solaris
Comment 1 Ludwig Nussel 2016-11-04 17:06:48 UTC
please attach yast logs
https://en.opensuse.org/openSUSE:Bugreport_YaST
Comment 2 Dieter Jurzitza 2016-11-05 14:52:27 UTC
I'll provide this information on monday!
Comment 3 Dieter Jurzitza 2016-11-07 05:44:39 UTC
Created attachment 700837 [details]
Requested information with regard to the system setup

Please find the y2logs attached, as requested by Mr. Nussel. Please note that every reference to the real hostname in use today has been removed and replaced by <MYHOSTNAME><MYDOMAIN>.
Comment 4 Dieter Jurzitza 2016-11-07 05:51:29 UTC
By the way, this morning zypper up hung during the kernel installation:

 2847 pts/0    S+     0:03 zypper up
 2849 pts/0    Z+     0:00 [tar] <defunct>
 2851 pts/0    S+     0:00 /usr/bin/python /usr/lib/zypp/plugins/commit/btrfs-defrag-plugin.py
 2852 pts/0    S+     0:00 /usr/bin/python /usr/lib/zypp/plugins/commit/snapper.py
 2854 ?        Sl     0:00 /usr/sbin/snapperd
 2869 ?        R      0:00 /usr/bin/xterm
 2871 pts/1    Ss     0:00 bash
 2971 pts/0    S+     0:05 rpm --root / --dbpath /var/lib/rpm -i --percent --noglob --force --nodeps -- /var/cache/zypp/packages/repo-oss/sus
 2974 ?        Sl     0:46 /usr/lib64/firefox/plugin-container -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.j
 3003 ?        S      0:00 [kworker/2:0]
 3009 pts/0    S+     0:00 /bin/sh /var/tmp/rpm-tmp.lk1gif 2
 3108 pts/0    S+     0:00 /bin/sh /usr/lib/os-probes/50mounted-tests /dev/sdb6
 3115 ?        Rs     4:26 grub2-mount /dev/sdb6 /var/lib/os-prober/mount
 3153 pts/0    S+     0:00 /bin/sh /usr/lib/os-probes/mounted/90linux-distro /dev/sdb6 /var/lib/os-prober/mount ext2
 3155 pts/0    S+     0:00 /bin/sh /usr/lib/os-probes/mounted/90linux-distro /dev/sdb6 /var/lib/os-prober/mount ext2
 3156 ?        S      0:00 [kworker/1:1]


This is apparently a related problem; perhaps it tells you something. The snapper problem mentioned in the other bug seems related as well. Thank you for looking into this.
Comment 5 Dieter Jurzitza 2016-11-07 05:58:40 UTC
And, by the way, this morning my boot setup got destroyed - I have to check how I can boot from 13.1 again ... :-(
Comment 6 Ludwig Nussel 2016-11-07 12:57:20 UTC
Looks like there's something wrong with os-prober reading sdb6. Maybe you could put a set -x in those scripts to see what they are doing. /usr/lib/os-probes/mounted/90linux-distro doesn't look that complicated; it shouldn't hang.

Reassign to maintainer of os-prober.
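
Note: the tracing suggested above can also be done without editing the scripts, by running a probe under sh -x. The following is a minimal sketch only. The helper name trace_probe and the 60-second limit are assumptions made for illustration and are not part of os-prober; timeout is the coreutils utility.

```shell
#!/bin/sh
# trace_probe is a hypothetical helper, not part of os-prober.
# Usage: trace_probe LOGFILE SCRIPT [ARGS...]
trace_probe() {
    log=$1
    shift
    # sh -x prints every command to stderr before running it.
    # timeout (coreutils) stops the script after 60 seconds, an
    # arbitrary limit chosen here, and then exits with status 124.
    timeout 60 sh -x "$@" >"$log" 2>&1
}
```

For example, trace_probe /tmp/90linux-distro.trace /usr/lib/os-probes/mounted/90linux-distro /dev/sdb6 /var/lib/os-prober/mount ext2 would write the trace to /tmp/90linux-distro.trace, assuming the partition is already mounted at that mount point.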
Comment 7 Imobach Gonzalez Sosa 2016-11-07 13:15:18 UTC
Reassigning to Michael Chang. Could you please have a look?
Comment 8 Michael Chang 2016-11-08 04:53:30 UTC
Created attachment 701037 [details]
os-prober debug package

Hi Dieter,

Please test this debug package, which has tracing (-x) enabled. It also removes the wildcard search, because we have stumbled across a similar issue before, where that search took far too long to finish on mount points provided by grub2-mount.

Please check whether it solves the problem, and also attach the trace output of running the command 'os-prober'. Thanks.
Comment 9 Dieter Jurzitza 2016-11-08 06:59:30 UTC
Hi Michael,
could you kindly specify which commands to issue in order to test?
Just a short feedback,
thank you
take care




Dieter Jurzitza
Comment 10 Michael Chang 2016-11-08 07:46:25 UTC
Hi Dieter,

Please issue the commands

  # rpm -Uvh os-prober-1.61-21.1.x86_64.rpm

  # (time os-prober) > /tmp/os-prober.log 2>&1
 
from the command line to gather the output. And please attach the os-prober.log to this bugzilla.

Thanks.
Comment 11 Dieter Jurzitza 2016-11-08 10:29:17 UTC
Created attachment 701074 [details]
os-prober.log as requested ....
Comment 12 Dieter Jurzitza 2016-11-08 10:32:43 UTC
Hi Michael,
apparently you were on the right track: the command finished fairly quickly, and even a call to mkinitrd ran through without stalling.
Thank you for looking into this,
take care




Dieter Jurzitza
Comment 14 Ludwig Nussel 2016-11-08 16:53:41 UTC
not a ship stopper but worth an update nevertheless
Comment 15 Michael Chang 2016-11-09 04:45:43 UTC
I think it's about time to remove the "unknown" Linux detection; this is not the first time we have had trouble with it. Even if it eventually works, the test costs us too much time for basically nothing, so it is really not worth it IMHO.

There is a parallel issue: the FUSE / grub2-mount / grub2 fs-module stack turns out to be slow in readdir() or similar operations, and it may be worth investigating where exactly the bottleneck is. Still, that cannot be guaranteed to solve the problem. The reason we use grub2-mount instead of a kernel mount is that we don't want to leave any kernel module behind after os-prober has run, but the tradeoff is the degraded performance of a FUSE mount (plus, the grub2 fs modules are not meant to be optimized for I/O performance IMHO).

So my conclusion is to remove the time-consuming "unknown" Linux test, given that we are using grub2-mount. If a distro is supposed to be probed, we should put it on the list explicitly.
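
Note: the approach described above, probing only for explicitly listed distributions and dropping the wildcard fallback, could look roughly like the sketch below. This is not the actual patch. The function name detect_distro and the short list of release files are assumptions chosen for illustration.

```shell
#!/bin/sh
# detect_distro is a hypothetical stand-in for the probe logic.
# It tests a fixed list of release files below DIR and never expands
# a wildcard, so an unrecognized file system costs only a few lookups.
detect_distro() {
    dir=$1
    # Illustrative list only; a real probe would cover more distros.
    for f in etc/os-release etc/SuSE-release etc/debian_version etc/redhat-release; do
        if [ -e "$dir/$f" ]; then
            echo "found $f"
            return 0
        fi
    done
    # No "unknown Linux distribution" fallback: report nothing.
    return 1
}
```

With this shape, a /home partition such as /dev/sdb6 is rejected after four existence checks instead of a directory scan through the FUSE mount.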
Comment 16 Dieter Jurzitza 2016-11-09 05:56:03 UTC
By the way,
today I mounted /dev/sdb6 manually on /mnt on my system, modified the (old) script slightly and tried to execute it (an excerpt of what used to reside in /usr/lib/os-probes/mounted/90linux-distro):

#!/bin/sh
# The partition was mounted manually at /mnt
export dir=/mnt
# Wildcard test taken from 90linux-distro: look for a dynamic loader
if (ls "$dir"/lib*/ld*.so* || ls "$dir"/usr/lib*/ld*.so*) >/dev/null 2>/dev/null; then
        short="Linux"
        long="unknown Linux distribution"
fi
# Prints an empty line when no loader was found
echo "$short"

The command ls "$dir"/lib*/ld*.so* fails badly. I can do ls "$dir"/lib*, I can do ls "$dir"/lib*/ld*, but adding the ".so*" fails, even when escaping the ".". This is not because the directory /mnt/lib is empty or nonexistent, but because the command ls /mnt*/lib*.* does not work at all.

Could it be that we are facing a different problem here, or am I doing something wrong? Perhaps a "hang" due to wrong usage of "*" and ".*" here?

Thank you for looking into this,
take care




Dieter Jurzitza
Comment 17 Dieter Jurzitza 2016-11-09 07:02:02 UTC
Sorry, please forget about #16, my fault - there are no such files, hence there is nothing to be found.
Sorry for the noise,
take care




Dieter Jurzitza
Comment 18 Michael Chang 2016-11-09 07:18:10 UTC
(In reply to Dieter Jurzitza from comment #16)

> Could it probably be that we are facing another problem here or do I do
> something wrong? Probably a "hang" due to a wrong usage of "*" and ".*" here?

Are you testing with grub2-mount? I don't think the /lib*/ld*.so* syntax is wrong, as it is regular bash wildcard expansion. It could be that the FUSE-based grub2-mount and its file system modules handle the requests for listing file system nodes badly.

BTW, why do you have a /lib directory in your /home partition? I'm just wondering. :)

Thanks for information.
Michael
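
Note: one way to check whether the wildcard expansion itself is slow on a given mount is to time it, once on a kernel mount and once on a grub2-mount mount point. The following is a minimal sketch under that assumption; time_glob is a hypothetical helper name and the measurement has only whole-second resolution.

```shell
#!/bin/sh
# time_glob is a hypothetical helper, not part of os-prober.
# It prints how many whole seconds the wildcard test from
# 90linux-distro takes against the directory DIR.
time_glob() {
    dir=$1
    start=$(date +%s)
    # Same pattern as the probe; output and errors are discarded and
    # a failed match is ignored, only the elapsed time matters here.
    (ls "$dir"/lib*/ld*.so* || ls "$dir"/usr/lib*/ld*.so*) >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}
```

Running time_glob /mnt after mount /dev/sdb6 /mnt, and time_glob /var/lib/os-prober/mount after grub2-mount /dev/sdb6 /var/lib/os-prober/mount, would then give two numbers to compare.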
Comment 19 Michael Chang 2016-11-09 07:20:58 UTC
(In reply to Dieter Jurzitza from comment #17)
> Sorry, please forget about #16, my fault - there are no such files, hence
> there is nothing to be found.
> Sorry for the noise,
> take care

OK. Are you saying that "ls "$dir"/lib*/ld*.so*" hangs even without those files?

Thanks,
Michael
Comment 20 Dieter Jurzitza 2016-11-09 08:29:30 UTC
Hello Michael,
1.) I mounted "regularly" using mount /dev/sdb6 /mnt
2.) The command did not hang. There were simply no files that matched, so there was a corresponding error message.

Nevertheless: thanks again for looking into this, sorry again for the noise,
take care




Dieter Jurzitza
Comment 21 Michael Chang 2016-11-15 09:43:08 UTC
srid#440345 to openSUSE:Factory/os-prober.
Comment 22 Michael Chang 2016-11-16 03:51:01 UTC
srid#124084 to SLE12/Leap maintenance update. Set status to resolved fixed for maintenance QA verification.

Thanks all.
Comment 23 Swamp Workflow Management 2017-05-06 22:09:01 UTC
openSUSE-RU-2017:1189-1: An update that has two recommended fixes can now be installed.

Category: recommended (low)
Bug References: 1008444,997465
CVE References: 
Sources used:
openSUSE Leap 42.2 (src):    os-prober-1.61-20.3.1
openSUSE Leap 42.1 (src):    os-prober-1.61-21.1
Comment 24 Bernhard Wiedemann 2017-05-22 12:01:07 UTC
This is an autogenerated message for OBS integration:
This bug (1008444) was mentioned in
https://build.opensuse.org/request/show/497303 42.3 / os-prober
Comment 25 Swamp Workflow Management 2017-05-25 16:09:25 UTC
SUSE-RU-2017:1410-1: An update that has two recommended fixes can now be installed.

Category: recommended (low)
Bug References: 1008444,997465
CVE References: 
Sources used:
SUSE Linux Enterprise Server for Raspberry Pi 12-SP2 (src):    os-prober-1.61-29.1
SUSE Linux Enterprise Server 12-SP2 (src):    os-prober-1.61-29.1
SUSE Linux Enterprise Server 12-SP1 (src):    os-prober-1.61-29.1
SUSE Linux Enterprise Desktop 12-SP2 (src):    os-prober-1.61-29.1
SUSE Linux Enterprise Desktop 12-SP1 (src):    os-prober-1.61-29.1