Bug 433105 - no upgradable partition found when going from clean 11.0 to 11.1 beta2 x86
Summary: no upgradable partition found when going from clean 11.0 to 11.1 beta2 x86
Status: RESOLVED FIXED
Duplicates: 432994
Alias: None
Product: openSUSE 11.1
Classification: openSUSE
Component: Installation
Version: Final
Hardware: i386 openSUSE 11.0
Priority: P3 - Medium
Severity: Critical
Target Milestone: ---
Assignee: Steffen Winterfeldt
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-10-07 16:05 UTC by andreas bittner
Modified: 2018-07-03 19:46 UTC
CC List: 4 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
upgrade yast2 logs when going from 11.0 to 11.1 beta2 (62.91 KB, application/octet-stream)
2008-10-07 16:07 UTC, andreas bittner
opensuse 11.0 logs on the same hardware (788.19 KB, application/octet-stream)
2008-10-07 16:20 UTC, andreas bittner
sl110 install screenshot (293.19 KB, image/png)
2008-10-13 07:38 UTC, Tejun Heo
hwinfo.storage (270.10 KB, text/plain)
2008-10-15 01:45 UTC, Tejun Heo

Description andreas bittner 2008-10-07 16:05:55 UTC
hi there,



just installed a clean 11.0 x86 on my testsystem, applied all the current patches via "yast2 online_update", rebooted this system.

then i booted from downloaded x86 iso dvd media of 11.1 beta2.

selected upgrade and the installer tried to find the previous 11.0 product, but it didnt find anything.

in raw mode i listed all the partitions and tried to point to the partition where the / (root) mountpoint should have been and the root files should be located but to no avail. no success.

11.1 beta2 is completely unable to upgrade a clean 11.0 installation.

too bad :(
will attach the yast2 logfiles in the next step after creating this initial entry.

cheers.
Comment 1 andreas bittner 2008-10-07 16:07:12 UTC
Created attachment 244026
upgrade yast2 logs when going from 11.0 to 11.1 beta2

here are the promised logfiles:

upgrade yast2 logs when going from 11.0 to 11.1 beta2
Comment 2 andreas bittner 2008-10-07 16:18:44 UTC
some additional info:

mount, fdisk -l and fstab as being shown on the normally working 11.0 x86 on this test machine:


/dev/sda6 on / type ext3 (rw,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/sda8 on /home type ext3 (rw,acl,user_xattr)
/dev/sda1 on /windows/C type fuseblk (rw,noexec,nosuid,nodev,allow_other,default_permissions,blksize=4096)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
securityfs on /sys/kernel/security type securityfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)



Disk /dev/sda: 80.0 GB, 80060424192 bytes
255 heads, 63 sectors/track, 9733 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0001f7f1

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        2550    20482843+   7  HPFS/NTFS
/dev/sda2            2551        2812     2104515   82  Linux swap / Solaris
/dev/sda3   *        2813        9733    55592932+   f  W95 Ext'd (LBA)
/dev/sda5            2813        2838      208813+  83  Linux
/dev/sda6            2839        4144    10490413+  83  Linux
/dev/sda7            4145        5450    10490413+  83  Linux
/dev/sda8            5451        9733    34403166   83  Linux




/dev/disk/by-id/scsi-SATA_SAMSUNG_SV0802N0652J1FW929822-part6 /                    ext3       acl,user_xattr        1 1
/dev/disk/by-id/scsi-SATA_SAMSUNG_SV0802N0652J1FW929822-part8 /home                ext3       acl,user_xattr        1 2
/dev/disk/by-id/scsi-SATA_SAMSUNG_SV0802N0652J1FW929822-part2 swap                 swap       defaults              0 0
/dev/disk/by-id/scsi-SATA_SAMSUNG_SV0802N0652J1FW929822-part1 /windows/C           ntfs-3g    users,gid=users,fmask=133,dmask=022,locale=en_US.UTF-8 0 0
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
Comment 3 andreas bittner 2008-10-07 16:20:48 UTC
Created attachment 244030
opensuse 11.0 logs on the same hardware

here are a set of logfiles from the same hardware, from a simple 11.0 x86 clean install and having done the yast2 online_update patches


just in case anyone needs it to compare it to the 11.1 beta2 x86 log stuff.

regards.
Comment 4 Steffen Winterfeldt 2008-10-08 09:46:52 UTC
No idea; a strange thing is that the devices are back to hdX when we
just switched to sdX. But what do I know.
Comment 5 Arvin Schnell 2008-10-08 10:02:35 UTC
The switch from sda to hda is the cause of the problem. 

In /etc/fstab by-id is used (scsi-SATA_SAMSUNG_SV0802N0652J1FW929822)
and the switch to hda causes these persistent names to not exist
anymore. udev problem.
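
The failure mode described above can be checked mechanically. What follows is only a minimal Python sketch, not anything the installer actually runs: it assumes the usual whitespace-separated fstab layout, and the function name `missing_fstab_devices` and its `exists` parameter are hypothetical names chosen for illustration. It reports fstab entries whose device node (for example a /dev/disk/by-id/... name) is not present on the running system.

```python
import os


def missing_fstab_devices(fstab_text, exists=os.path.exists):
    """Return (device, mountpoint) pairs whose /dev/... node is absent.

    `exists` is injectable so the check can be exercised without real
    device nodes; by default it queries the running system.
    """
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mountpoint = fields[0], fields[1]
        # Only block-device paths matter here; proc, sysfs etc. are skipped.
        if device.startswith("/dev/") and not exists(device):
            missing.append((device, mountpoint))
    return missing
```

Run against the fstab of the installed 11.0 system from inside the installer environment, a check like this would presumably list the scsi-SATA_... entries once the disk shows up as hda instead of sda.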
Comment 6 andreas bittner 2008-10-08 10:19:46 UTC
erm, wasnt there a switch from hdx to sdx just some opensuse releases ago (some kernel 2.6.x ago) which already caused some trouble for people when upgrading from older opensuse versions to newer ones?

and now there is a change back from sdx to hdx? very weird.

whats happening? :)


btw: the hardware of the testbox is physically parallel ata cables connecting to the harddrive and the cd/dvd rom drive.

the mainboard is asus "a8n-sli premium"

i wonder why this behaviour didnt appear on 11.1 beta1, but was introduced in 11.1 beta2? are there any ata driver subsystem changes in the current 2.6 kernels yet again?

cheers.
Comment 7 Kay Sievers 2008-10-08 12:10:50 UTC
Looks like the problem in:
  https://bugzilla.novell.com/show_bug.cgi?id=432994


Tejun, how can this happen? Why do we load IDE drivers?
Comment 8 Tejun Heo 2008-10-08 21:54:01 UTC
Hmmm... I don't know.  It's either the supported.conf thing or some changes in initrd.  Will check.
Comment 9 Kay Sievers 2008-10-08 22:11:44 UTC
Ah, I see.

Hannes, can we stop including all the old IDE drivers in initramfs?
  * Tue Sep 09 2008 hare@suse.de
  - Fix dhcp network detection (bnc#415438)
  - parse 'ip route' lines correctly (bnc#414191)
  - Always include all ATA and SCSI drivers

We have no control over which driver gets loaded first if we include them all and don't blacklist.
Comment 10 Tejun Heo 2008-10-08 22:16:03 UTC
Let's track this on bnc #432994.
Comment 11 Tejun Heo 2008-10-08 22:21:09 UTC
(cc'ing Alexander Graf)
Kay, we do have control.  Module loading order is determined by modules.order which is generated according to the linking order and can also be edited afterwards.  There are two changes here.

1. Initial udev start behavior changed between SL110 and SL111.  In SL111, the initial udev start now pulls in all the modules instead of the linuxrc.

2. initrds on the installed system now contain all the ATA/SCSI drivers but they don't contain all the IDE ones by default, right, Alex?  So, this is not the root cause of the problem.  The IDE driver must have been included because it was used during installation.
Comment 12 Kay Sievers 2008-10-08 22:29:09 UTC
Udev has not changed with regard to module loading.

But this is in initramfs now:
  for i in $(find $root_dir/lib/modules/$kernel_version/kernel/drivers/{ata,ide,scsi,s390/block,s390/scsi} -name "*.ko"); ...

The "ide" must go.
Comment 13 Tejun Heo 2008-10-08 22:35:44 UTC
I think the solution should be...

1. Check why the installation initrd behavior has changed.  linuxrc now tries to load modules which are already loaded by udev.  Something went wrong there.  Either we stop linuxrc from trying to load drivers at all or stop udev from loading drivers before it.

2. Regarding IDE drivers, I don't mind either way.  It doesn't really matter whether they're included in the initrd or not but we definitely need to put all libata ones before ide, which can be trivially achieved by moving ide/ below ata/ in drivers/Makefile.  I'll do it.  This change will also mask #1 but we still need to fix it.

Thanks.
Comment 14 Tejun Heo 2008-10-08 22:36:26 UTC
BTW, who do we bug for installation initrd?
Comment 15 Kay Sievers 2008-10-08 22:41:02 UTC
This is all caused by including IDE drivers unconditionally in initramfs, which we never did before. I guess nothing else has changed. What else are you looking for?
Comment 16 Tejun Heo 2008-10-08 22:49:40 UTC
No no, there are two different problems as stated above.  We've always included all IDE and libata drivers in the initrd of the installation media.  linuxrc took care of the driver priorities there but now udev loads all the drivers and prefers IDE ones over libata ones according to modules.order.

The second problem is the inclusion of all IDE ones into the initrd of the installed system, which can be solved by either excluding IDE drivers from initrd or making modprobe prefer libata ones over IDE ones.  That's Alex and Hannes's decision and regardless of that I'm putting IDE behind libata as that's what the current situation is and solves other problems too.  But we still need to find what changed in the installation media.  Something changed there and udev stole linuxrc's job.
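
The ordering requirement discussed here can be expressed as a small check. This is a hedged Python sketch, assuming modules.order lists one module path per line relative to the kernel module directory (e.g. kernel/drivers/ata/libata.ko); the function name `libata_preferred` is hypothetical, and the check is an illustration rather than part of the committed patch.

```python
def libata_preferred(modules_order_text):
    """Return True if every drivers/ata entry precedes every drivers/ide entry."""
    first_ide = None  # index of the first IDE module seen
    last_ata = None   # index of the last libata module seen
    for index, line in enumerate(modules_order_text.splitlines()):
        path = line.strip()
        if path.startswith("kernel/drivers/ide/"):
            if first_ide is None:
                first_ide = index
        elif path.startswith("kernel/drivers/ata/"):
            last_ata = index
    # With only one family (or neither) present there is nothing to conflict.
    if first_ide is None or last_ata is None:
        return True
    return last_ata < first_ide
```

For a modules.order in which the ide/ entries come first, the function returns False; once ide/ is listed after ata/, as the change to drivers/Makefile is meant to achieve, it returns True.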
Comment 17 Tejun Heo 2008-10-08 22:57:19 UTC
Patch to prefer libata drivers over ide ones committed to HEAD.

| - patches.drivers/libata-prefer-over-ide: libata: prefer libata
|   drivers over ide ones (bnc#433105).
Comment 18 Felix Miata 2008-10-08 23:00:03 UTC
(In reply to comment #10 from Tejun Heo)
> Let's track this on bnc #432994.

That bug is not generally available. :-(
Comment 19 Tejun Heo 2008-10-08 23:05:02 UTC
(In reply to comment #18 from Felix Miata)
> (In reply to comment #10 from Tejun Heo)
> > Let's track this on bnc #432994.
> 
> That bug is not generally available. :-(

Heh.. forget about that.  This one now contains much more info so I think we should stick with this one and close the other one as duplicate.
Comment 20 Kay Sievers 2008-10-08 23:08:37 UTC
*** Bug 432994 has been marked as a duplicate of this bug. ***
Comment 21 Hannes Reinecke 2008-10-09 07:43:15 UTC
Updated mkinitrd rpm submitted to autobuild.
Comment 22 Tejun Heo 2008-10-09 07:47:39 UTC
Hannes, what has been updated?  Do we know what changed with the installation media?
Comment 23 Hannes Reinecke 2008-10-09 08:13:37 UTC
We're now copying all libata drivers only if the root device is in fact on a libata driver. Other installations continue as before, just loading (i.e. adding) the required drivers.
Comment 24 Tejun Heo 2008-10-09 08:50:18 UTC
I don't think that fixes anything by itself.  It alone wouldn't solve this bug even.  The installation media will choose ide driver over libata one and mkinitrd during installation will include the ide driver as that's what was used during installation.

We still need to know what changed with linuxrc and the initial udev of the installation media.  It could have some effect on how linuxrc parameters work.  As I don't know who's responsible for that, I'm reopening and asking for reassignment.

** Summary of the problem with installation media: Till SL110, linuxrc was responsible for loading drivers.  udev was started before linuxrc but it didn't load any module but something changed and on SL111 the initial udev sucks in all drivers and linuxrc comes after that and still tries to load the drivers which are already loaded.  Is this change intended?  If so, can we stop linuxrc from repeating to try to load drivers?

Thanks.
Comment 25 Kay Sievers 2008-10-09 08:57:37 UTC
(In reply to comment #24 from Tejun Heo)
> I don't think that fixes anything by itself.  It alone wouldn't solve this bug
> even.  The installation media will choose ide driver over libata one and
> mkinitrd during installation will include the ide driver as that's what was
> used during installation.
> 
> We still need to know what changed with linuxrc and the initial udev of the
> installation media.  It could have some effect on how linuxrc parameters work. 
> As I don't know who's responsible for that, I'm reopening and asking for
> reassignment.
> 
> ** Summary of the problem with installation media: Till SL110, linuxrc was
> responsible for loading drivers.  udev was started before linuxrc but it didn't
> load any module but something changed and on SL111 the initial udev sucks in
> all drivers and linuxrc comes after that and still tries to load the drivers
> which are already loaded.  Is this change intended?  If so, can we stop linuxrc
> from repeating to try to load drivers?

Steffen?
Comment 26 andreas bittner 2008-10-09 09:00:00 UTC
is it possible to give a rough explanation on how to rebuild / patch a beta2
iso dvd installation media with this patch/resolution so that i could try and
see how things are behaving?

as this bug is pretty essential and i am testing upgrade scenarios i basically
cant use beta2 for upgrades at all at the moment. i could be of more help if i
would know where to get the patched bits and could try to use them.

also: it would be nice to know the time frame / milestone estimate when this
patch will be available.

thanks & regards.
Comment 27 Tejun Heo 2008-10-09 09:03:23 UTC
I can build kISO with the updated kernel for you.  Are you on 32bit or 64bit?
Comment 28 andreas bittner 2008-10-09 09:17:57 UTC
i'm on x86 / 32bit. thanks.
Comment 29 Steffen Winterfeldt 2008-10-09 10:03:49 UTC
Well, udevd is responsible for module loading for quite some time now.
I had patched the udev scripts in 11.0 not to do it because there wasn't
sufficient testing time. With 11.1 I removed that patch.

In fact I was asked by several people why udevd doesn't load modules in
11.0, so there you go.

You can switch back to the old behavior with 'linuxrc.debug=-udev.mods'.

AFAIK linuxrc does not try to load the drivers again. Why do you think that?
Comment 30 Tejun Heo 2008-10-13 07:38:48 UTC
Created attachment 245060
sl110 install screenshot

Ah... glad to hear the change was intentional.  In fact, pata_hpt* drivers need such change as they need all the pata_hpt* drivers to be loaded instead of the first one which seems to fit.  I thought linuxrc was trying to load modules because of the messages linuxrc printed out.  I thought it was trying to load all the modules it printed out.  It apparently only loads ide-generic, which BTW is the wrong thing to do as the proper driver is already bound to the device.  Maybe linuxrc thinks that it needs to load ide-generic as it didn't load anything for the storage controller?

Thanks.
Comment 31 Andreas Jaeger 2008-10-13 10:46:32 UTC
Not sure who should handle this.  Before assigning it back to the screening list, please tell us.

Steffen, comment #30 looks like this is for you.
Comment 32 Steffen Winterfeldt 2008-10-13 12:37:48 UTC
Hm, the lone ide controller entry looks a bit fishy to me. I do not
see it with my vmware setup.

Tejun, can you run 'hwinfo --storage --log=foo' on that machine and
attach the log?

Also, loading ide-generic _after_ the usual ata drivers should not
make a difference, should it?
Comment 33 Tejun Heo 2008-10-15 01:45:03 UTC
Created attachment 245552
hwinfo.storage

Here's the log.  And loading ide-generic can do a lot of damage on certain systems where the ATA controller has two interfaces - a native one and an SFF TF compatible one.  On such systems, enabling the native one doesn't completely disable the legacy one, and the native one also doesn't claim the legacy ports using its PCI BARs.  ide-generic will attach to the legacy ports in legacy mode not knowing that it's trying to drive the same hardware the native driver is already attached to.  So, loading ide-generic by default is a really bad idea.

Thanks.
Comment 34 Tejun Heo 2008-10-15 01:45:34 UTC
Oops, clearing NEEDINFO.
Comment 35 Steffen Winterfeldt 2008-10-15 15:58:28 UTC
Uhm, there was indeed a reference to ide-generic in the code. Removed it.
Comment 36 andreas bittner 2008-10-27 16:47:14 UTC
doing upgrade from clean 11.0 x86 dvd, to opensuse 11.1 beta3 seems to work now.

i got other bugs after the upgrade though ;(.

this bug might be settled now.
regards.
Comment 37 Tejun Heo 2008-10-28 01:18:23 UTC
Okay, great.  Sorry about not producing the kISO.  My VPN was down for quite some time and couldn't access the repos.  Thanks.
Comment 38 Philippe Duchenne 2009-04-11 21:41:06 UTC
I'm not 100% sure my problem is related to this report, but the symptoms are very close, so I would like to re-open this bug.

I'm trying to update my openSUSE 11.0 to 11.1 (final), but I got an error saying the /var partition cannot be mounted (and a URL to http://support.novell.com/techcenter/sdb/en/2003/03/fhassel_update_not_possible.html ).

After I select the update installation mode, the root partition is properly suggested (/dev/sda2).  If I click on 'Show all partitions', I can see the other ones, but with an 'unknown Linux' type and unknown architecture.  When I click on 'Next' (/dev/sda2 selected), I got the above error.
I tried the tip from the support page (above URL), but it didn't help.

I've posted many details 
  cat /etc/fstab
  ls -l /dev/disk/by-id
  fdisk -l
and others, on the openSUSE forum: http://forums.opensuse.org/install-boot-login/412332-update-fails-partitions-cannot-mounted.html
Comment 39 Tejun Heo 2009-04-11 22:28:52 UTC
Hmmm... the disk being a genuine SCSI device, I doubt the problem you're seeing is the same as this one.  Can you please open a new bug report and attach the following information?

1. /var/log/boot.msg and /etc/fstab from 11.0 (please attach them instead of inlining or linking).

2. After booting 11.1 installation media, switch to vt2 (ctrl-alt-f2).  Plug in a usb stick and mount it under /mnt (you can see which /dev/sdX node it got assigned by running dmesg).  Run the following commands.

   * save_y2logs /mnt/y2logs.tar.gz; cp /var/log/boot.msg /mnt; dmesg > /mnt/dmesg.out; umount /mnt

and attach the resulting files in the bug report.

I'm setting the status of this one to resolved again.  Keeping different issues in separate bug reports makes things easier later on.

Thanks.
Comment 40 Philippe Duchenne 2009-04-12 18:29:03 UTC
@Tejun Heo
(Replying here for traceability)

Ok, Thanks.  I've opened bug #494240