Bug 330705

Summary: upgrading from 10.2 to 10.3 with separate /var/ partition and new libata dev-names trashes whole system/upgrade process
Product: [openSUSE] openSUSE 11.0
Reporter: andreas bittner <abittner>
Component: YaST2
Assignee: Lukas Ocilka <locilka>
Status: RESOLVED DUPLICATE
QA Contact: Jiri Srain <jsrain>
Severity: Enhancement
Priority: P5 - None
CC: aschnell, jnelson-suse, locilka, reisenweber, suse-beta
Version: Alpha 2
Target Milestone: ---
Hardware: i386
OS: openSUSE 10.3
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: firststep: cleaninstall opensuse 10.2 textmode (from x86 dvd)
second step: upgrade 10.2 clean system to opensuse 10.3 - gruberror - y2logs attached
10.2 to 10.3 upgrade scenario with manually edition fstab lines
changed fstab to device-id entries - y2log from 10.2 to 10.3
yast2 logdirectory from 10.1 to 10.3 upgrade - grub error even when using var with label
grub errors - upgrading clean opensuse 10.2 to 11.0 goldmaster

Description andreas bittner 2007-10-04 11:25:15 UTC
hello there,


i have apparently found out that an upgrade from opensuse 10.2 (x86, goldmaster dvd standard install, no additional packages except for the standard offerings for kde install, all the latest online_update patches applied) to opensuse 10.3 x86 goldmaster dvd kills the whole system.

in opensuse 10.2 the normal ide/ata/pata/atapi devicenames were still /dev/hdX for me.

when going to opensuse 10.3 they change to /dev/sdaX on my systems (parallel ata harddisk and dvd rom drive)

the yast update program doesnt handle these new devicenames properly: it forgets about the separate /var/ partition if you have one and doesnt mount it (it doesnt ask for the new /dev/sdaX name for it).

please do read my thread on the opensuse-testing mailinglist and what i have found out, its already documented there.

http://lists.opensuse.org/opensuse-testing/2007-10/msg00005.html

i think this is a pretty major bug, if not even a showstopper. i have now for many years always been using separate partitions for /var/ (because of the logs for example, and other stuff like cache, etc...), so if an upgrade from 10.2 to 10.3 fails due to such a basic partition, its pretty severe.


thank you.
andy.
Comment 1 andreas bittner 2007-10-04 16:28:08 UTC
Created attachment 176380
firststep: cleaninstall opensuse 10.2 textmode (from x86 dvd)

testcase: cleaninstall of x86 opensuse 10.2 from download dvd media.

y2logs after setup is done with all steps.
Comment 2 andreas bittner 2007-10-04 16:30:24 UTC
Created attachment 176381
second step: upgrade 10.2 clean system to opensuse 10.3 - gruberror - y2logs attached

second step: upgrade 10.2 clean system to opensuse 10.3 - gruberror - y2logs attached

the system from step one gets updated/upgraded by booting from x86 media of opensuse 10.3 download dvd version.


see my bug description and the opensuse-testing discussion.

so here are the y2log logfiles after yast/grub fails at the very end with the grub/bootloader installation.

the system is hosed from there on, no booting any more.
Comment 3 andreas bittner 2007-10-04 16:35:38 UTC
so i have attached two files here, y2log directory of another test on my testsystem.

thistime i made a quicker and plain textmode installation.


installed x86 10.2 from dvd media, selected minimal textmode system for installation. 


i have created several partitions, first = swap, second = /boot
third = /

hda4 = extended for the whole rest of the hdd
hda5=usr
hda6=var
and hda7=opt

all inside the hda4 extended one.
filesystem was ext3 for everything


10.2 installation completed all right.

after that, i tried my upgrade of this minimalistic 10.2 system with the opensuse 10.3 x86 dvd download media.


everything is still as i have described. the yast/install-module doesnt detect the old 10.2 installation, you have to check show all partitions, to be able to see /dev/hda3 for the old / (root) partition

then it asks exactly for two more partitions so that i point /dev/hdaX to /dev/sdaX 

it only asks for the old /usr and the old /opt partitions to point to the new name. 

and i suppose, that this is the big mistake.

the system upgrades the 10.2 packages to 10.3 ones without any visible problems.
it tries to install the bootmanager and so forth, but fails there with grub.


end of story.
Comment 5 Thomas Fehr 2007-10-08 10:39:33 UTC
Lukas, I think there is some problem in current update code.
If /var is on a separate partition it seems to take wrong conclusions:

From y2log:
RootPart.ycp:613 There are only 0 files in /var/lib/hardware/, translation needn't work!

For the device name translations to be able to be detected correctly,
/var/lib/hardware of the system to update needs to be accessible, so if there is
a separate partition for /var, one needs to not only mount root device but also device for /var before Storage::GetTranslatedDevices() gets called. I fixed a 
similar problem the week before 10.3 GM in my code that imports an existing 
fstab, maybe you could use a similar strategy (or we could even share the code).

See functions findExistingFstab and mountVar in file custom_part_lib.ycp of 
yast2-storage.
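
The mount order described above can be illustrated with a short Python sketch. It is not the yast2-storage code (that lives in findExistingFstab and mountVar in custom_part_lib.ycp and is not reproduced here); the names `FstabEntry`, `parse_fstab` and `mounts_needed_for_hardware_data` are hypothetical and only show the idea: read the old system's fstab and work out which entries must be mounted, in order, before /var/lib/hardware can be read.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    # Field names follow fstab(5) without the "fs_" prefix.
    spec: str      # device name, LABEL=..., UUID=..., ...
    file: str      # mount point
    vfstype: str   # filesystem type
    mntops: str    # mount options


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab text, skipping blank lines, comments and short lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        entries.append(FstabEntry(*fields[:4]))
    return entries


def mounts_needed_for_hardware_data(entries: List[FstabEntry]) -> List[FstabEntry]:
    """Return the entries to mount, parent first, so that
    /var/lib/hardware of the old system becomes readable."""
    wanted = ("/", "/var", "/var/lib", "/var/lib/hardware")
    by_mountpoint = {(e.file.rstrip("/") or "/"): e for e in entries}
    return [by_mountpoint[m] for m in wanted if m in by_mountpoint]
```

For the layout from comment 3 (root on hda3, /var on hda6) this would return the root entry followed by the /var entry, so both get mounted before any device-name translation is attempted.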
Comment 6 Christian Boltz 2007-10-08 11:37:00 UTC
Mounting /var is a good idea, but you never know if it really works - some people (like me) have interesting disk layouts.

(In reply to comment #5 from Thomas Fehr)
> From y2log:
> RootPart.ycp:613 There are only 0 files in /var/lib/hardware/, translation
> needn't work!

As I already proposed in bug 299667, please let YaST display an error message if it can't find files in /var/lib/hardware, like
    Can't find any files in /var/lib/hardware.
    This means you'll have to modify /etc/fstab, /etc/cryptotab, 
    /boot/grub/menu.lst and /boot/grub/device.map (or lilo.conf if you use 
    lilo) to switch them to the new /dev/sd* naming scheme.
        [Retry] [Ignore]
(retry should check /var/lib/hardware again, maybe the user mounted the partition manually)

Of course, this doesn't solve the problem, but at least the user is aware of it and has a chance to work around it.
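
The proposed [Retry] [Ignore] behaviour could look roughly like the following Python sketch. It is only an illustration of the proposal, not YaST code: `count_hardware_files`, `check_hardware_data` and the `ask_user` callback are hypothetical names, and counting files recursively is an assumption (the log message does not say how YaST counts them).

```python
import os


def count_hardware_files(target_root: str) -> int:
    """Count regular files below <target_root>/var/lib/hardware.
    A missing directory simply counts as 0 files."""
    hw_dir = os.path.join(target_root, "var", "lib", "hardware")
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(hw_dir):
        total += len(filenames)
    return total


def check_hardware_data(target_root: str, ask_user) -> bool:
    """Return True once hardware data is found.

    ask_user(message) is expected to return "retry" or "ignore".
    On "retry" the directory is checked again (the user may have
    mounted /var manually in the meantime); anything else gives up.
    """
    message = (
        "Can't find any files in /var/lib/hardware.\n"
        "You will have to adjust /etc/fstab and the bootloader "
        "configuration to the new /dev/sd* names yourself."
    )
    while count_hardware_files(target_root) == 0:
        if ask_user(message) != "retry":
            return False
    return True
```

The caller decides what "ignore" means; per comment 7, aborting the update at that point would be one option.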
Comment 7 Thomas Fehr 2007-10-08 11:49:16 UTC
The disk layout you call "interesting" is simply unsupported and unrelated 
in any way to this bug. To my knowledge, there will be no changes to support 
your stuff, you will be on your own regarding any update of such a system. 

Nevertheless, displaying a warning if there is no /var/lib/hardware found 
during update might be useful. IMHO the only choice if there is no 
/var/lib/hardware found, would be to abort update completely, but this is
Lukas' code, so his decision.
Comment 8 andreas bittner 2007-10-08 13:15:28 UTC
hello there,

i dont quite understand the discussion about this bug.

how come the upgrade fails if the /var/ is on a separate partition, but only when coming from oldstyle ide names /dev/hdX

i did an install on sata-only hardware already with clean opensuse 10.2 using dev/sdX names and still /var/ being on a separate partition.

opensuse 10.3 upgrade from this sata-only 10.2 system just works fine.
Comment 9 Thomas Fehr 2007-10-08 13:27:39 UTC
Of course the problem only exists on systems where the disk device names differ
between the old and the updated system. To be able to determine the mapping between new
and old device names, the data below /var/lib/hardware of the old system is
needed.
Comment 10 Lukas Ocilka 2007-10-09 09:13:20 UTC
update/rootpart.ycp:274 Selected root partition: /dev/sda3 $[`arch:"i386", `arch_valid:false, `fs:`ext3, `fstype:"Linux native", `label:"", `name:"openSUSE 10.2", `valid:false]
RootPart.ycp:1157 mount partitions: /dev/sda3
FileSystems.ycp:75 SuggestMPoints init:["/", "/usr", "/var", "/opt", "/boot", "/home", "/srv", "/tmp", "/local", ""]
Update.ycp:490 SuSERelease::ReleaseInformation: openSUSE 10.2

Fstab describing /var partition $["file":"/var", "freq":1, "mntops":"acl,user_xattr", "passno":2, "spec":"/dev/hda6", "vfstype":"ext3"]

Anyway, mounting the /var partition has been solved by Arvin already (reassigning). It seems we need to clean the code up :( Which means - rewrite.
Comment 11 andreas bittner 2007-10-09 14:05:54 UTC
my main concern still is:

will there be a new iso image for installation to fix this behaviour while upgrading with the download media, or how is the normal user supposed to handle a situation like this with the current set of 10.3 media?

will there be a workaround / bestpractices posted on the opensuse wiki or something? is there a real way to fix the upgrade procedure so that everything works correctly?

thanks.
Comment 12 Lukas Ocilka 2007-10-09 14:11:43 UTC
A workaround already exists (I'm thinking about writing an article ;) ),

  * boot the system you want to upgrade
  * run `yast2 disk`
  * change mountpoints not to use device names (/dev/hda) but LABEL or
    device ID, or anything else (but not device name)
  * reboot to the installation

  Does it work?

Well, maybe changing the mountpoints (/etc/fstab) doesn't work when those devices are already mounted :( but anyway, changing /etc/fstab directly on a system could work.
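
Editing /etc/fstab directly, as suggested above, amounts to replacing the first field of the affected lines with something that does not depend on the kernel's device name, for example a `LABEL=...` or `UUID=...` spec or a /dev/disk/by-id/ path. A minimal Python sketch of that rewrite follows; `rewrite_fstab_specs` is a hypothetical helper, and the replacement values have to come from the real system. Keep a backup of the original file before writing anything back.

```python
def rewrite_fstab_specs(fstab_text: str, replacements: dict) -> str:
    """Replace the first field (the device spec) of matching fstab lines.

    replacements maps an old spec such as "/dev/hda6" to a new one such
    as "UUID=..." or "LABEL=...". Comments, blank lines and unmatched
    entries are passed through unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        if stripped and not stripped.startswith("#") and fields[0] in replacements:
            # Keep everything after the old spec (spacing, mount point,
            # options) exactly as it was.
            rest = line.lstrip()[len(fields[0]):]
            line = replacements[fields[0]] + rest
        out.append(line)
    return "\n".join(out) + "\n"
```

Note that the new spec must be something the system can actually resolve; an arbitrary made-up path in the first field does not identify a partition.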
Comment 13 andreas bittner 2007-10-09 21:12:27 UTC
change mountpoints to "LABEL"? meaning what exactly? i have changed the fstab lines from /dev/hdaX to something like "/fantasydev/hdaX"

is that what you meant?

still, i did a clean opensuse 10.2 install from dvd, then changed those lines in fstab, and only then inserted the 10.3 dvd media and tried the upgrade from there...


10.3 installer still didnt find any older opensuse version, i had to manually checkbox show all partitions, but then it already displayed me the /dev/sdaX partitions there and found the older 10.2 rootpartition.

once again it only asked me for the usr and opt partition that i should correct /fantasydev/hdaX to /dev/sdaX....

it never asked for the /var/ partition


i will attach the latest logfiles from this scenario. the testmachine was on a different hardware though thistime, but once again on normal pata devices.

Comment 14 andreas bittner 2007-10-09 21:15:59 UTC
Created attachment 177225
10.2 to 10.3 upgrade scenario with manually edition fstab lines

as advised:

10.2 to 10.3 upgrade scenario with manually edited fstab lines, changed from old pata-style /dev/hdaX to "something else but device names", so i changed them to something like "/fantasydev/hdaX".

see comment #13
https://bugzilla.novell.com/show_bug.cgi?id=330705#c13

thanks.
Comment 16 Lukas Ocilka 2007-10-15 07:20:07 UTC
I agree that it is not nice but we can't support this scenario.
It's nearly impossible to find the disk with the /var partition after it has been renamed. There is almost no connection between the old and the new disk name. The only possibility seems to be:

1.) Boot the old system
2.) Run partitioner (yast2 disk)
3.) Change the mounting method from "device name" to "label", "device ID" or
    just anything else than "device name" (hda6=var)

See also this SDB article:
http://support.novell.com/techcenter/sdb/en/2003/03/fhassel_update_not_possible.html

The update code, of course, mounts all the partitions and can also work with a separate /var partition; however, it doesn't work with multiple disks. Something can be done for 11.0.
Comment 17 andreas bittner 2007-10-15 15:51:56 UTC
hi again,

you cant be serious about this, right?

this is how i see this bug from an enduser point of view:

your upgrade mechanism doesnt ask/handle the partitions found in fstab properly.

why on earth would your installer ask for usr and opt just fine and mount them, and simply disregard the partition for /var... it never asks for /var...


and besides: how hard can it be to analyze fstab and if you have found partitions on a physical drive and mapped them from oldname /dev/hdaX to /dev/sdaX already, then simply map all the partitions that are listed in fstab....

why does yast installer ask for /opt and /usr just fine, but never asks to point to the proper location from old /dev/hda6 = /var to /dev/sda6...

why not simply exchanging/offering all the device rootnames once a device has been identified and corrected from an oldname to newname?

so if i have given yast the information that my / (root) partition is on /dev/sda3 now which was listed in my old fstab as /dev/hda3, then yast knows that what was once /dev/hda is now /dev/sda....

and everything that exists in /dev/hda now is on /dev/sda...

this is the most logical and basic thing there is.

as sda and hda are physical disk names, and the numbers after the rootname of the physical disks are partition numbers...

how on earth could this be a problem for the installer/upgrade mechanism, and how can suse lower the importance of this bug to "enhancement"?

this is simply crazy and to freak out. i dont get it why i still am buying suse retail releases on and on....

i am really freaking out about this i am sorry, but i have to write this down, cos i still think i am only dreaming all this mess: you people mess up a simple upgrade process from a clean and simple installation of your very previous version of opensuse and dont support an upgrade just because customers use partitions for mountpoints your installer/tools offer?

you gotta be kidding me. there is nothing so special or magic about a /var partition, but nevertheless suse messes up handling it properly, just because, who knows why, some of the devs decided to switch the naming convention all of a sudden for classical hardware.


and while thinking about this "problem" i have explained why i fail to see the "big problem" with finding the partitions.... if you found just one of the new names (on a physical disk/device), then you have found them all... simply replace all of the fstab entries with the old device name to the new devicename the customer/user once has identified/supplied, or the installer has properly found. this cant be that hard. or am i missing a point or is there more to this whole confusion with the naming convention change?

why not asking the user just to point to all of the partitions that are listed in the customers fstab file, instead of omitting the var partition just because of who knows why.


i really really dont get it. i really think i do gotta change to some other distribution for sure and advise others to do so too if this is the real philosophy of suse/novell.
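
The prefix substitution proposed in this comment (once /dev/hda3 is known to be /dev/sda3, treat every /dev/hdaN as /dev/sdaN) can be written down as a small Python sketch. This is the reporter's idea put into code, not what YaST does; `infer_disk_mapping` and `translate` are hypothetical names. It assumes a single renamed disk with unchanged partition numbers; with several disks the kernel may order them differently, so one confirmed partition says nothing about the other disks.

```python
import re

# Matches names like /dev/hda3 or /dev/sda10: a disk name plus a partition number.
_PART = re.compile(r"^(/dev/[a-z]+)(\d+)$")


def infer_disk_mapping(old_part: str, new_part: str):
    """From one confirmed pair (e.g. /dev/hda3 is now /dev/sda3),
    return the disk-level pair ("/dev/hda", "/dev/sda")."""
    old, new = _PART.match(old_part), _PART.match(new_part)
    if not old or not new or old.group(2) != new.group(2):
        raise ValueError("cannot infer a disk mapping from this pair")
    return old.group(1), new.group(1)


def translate(spec: str, disk_mapping: dict) -> str:
    """Translate an fstab spec using a {old_disk: new_disk} mapping.
    Specs on unknown disks, LABEL=/UUID= specs etc. are returned unchanged."""
    m = _PART.match(spec)
    if m and m.group(1) in disk_mapping:
        return disk_mapping[m.group(1)] + m.group(2)
    return spec
```

With the mapping inferred from the root partition, `translate("/dev/hda6", ...)` gives `/dev/sda6` for the /var entry, which is the behaviour asked for here.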
Comment 18 Lukas Ocilka 2007-10-15 15:53:53 UTC
I'm just writing an article on how to upgrade such a system (with a separate /var partition).
Comment 19 Lukas Ocilka 2007-10-15 15:57:43 UTC
BTW: this affects the installation media and can't be changed for the already released ones, that's why I've changed it to enhancement and that's why I'm spending a lot of time to make it better.

I'm sorry, but for openSUSE 10.3, the only way is to offer you a workaround (that article). And yes, your wishes and ideas will probably be implemented for openSUSE 11.0 but it can't be done earlier.
Comment 20 andreas bittner 2007-10-15 16:11:38 UTC
well when i first reported this bug/behaviour and tried to track down the real problem, i even thought this needed to be a blocker/showstopper, as it fails on systems with a simple partition for var from one consecutive opensuse release to the very next.


i still remember, there was once an suse/opensuse release not so long ago, which got itself a remastered iso or something because of the huge bugs found inside the first goldmaster. i think it was in the early days of the zypp/updater stuff and all that...


i really had thought this bug/behaviour was of similar importance, as its really not rocketscience to have a separate var partition, but this is killing the upgrade mechanism from 10.2 to 10.3.


opensuse 10.3 can see the harddrive, it can see the partitions, it can read files, it can analyze fstab and it can ask the enduser. so where is the problem with accessing var properly and mounting it just fine like the other partitions?

thanks.
Comment 21 Lukas Ocilka 2007-10-15 16:19:05 UTC
Remastering makes sense if no other workaround exists.
But an easy workaround really exists, I'll post it soon.
Comment 22 andreas bittner 2007-10-15 16:24:41 UTC
ok so here i tried again:

clean textmode install of opensuse 10.2
after installation "yast2 disk" changed from dev-names to device-id....

the fstab showed something like samsung-name-of-harddisk-here-part1 and so forth...


rebooted with opensuse 10.3 dvd media. installer found / (root) partition by itself thistime (show all partitions also showed new-style names /dev/sdaX already)

it upgraded the packages all right.

at the end grub still complains/crashes again with error 23, error while parsing number.... quit.....


so basically nothing has changed?

hopefully i have some y2logs after this step on the harddisks and can post them.
Comment 23 Lukas Ocilka 2007-10-15 16:27:37 UTC
Basically it seems to be the Bootloader issue which means: not-installation/upgrade issue anymore (it's really different).

Please, file a separate bugreport.
Comment 24 Lukas Ocilka 2007-10-15 16:37:36 UTC
Howto work in progress:
http://en.opensuse.org/How_To_Upgrade_System_with_Separate_/var_Partition
Comment 25 andreas bittner 2007-10-15 16:40:34 UTC
Created attachment 178580
changed fstab to device-id entries - y2log from 10.2 to 10.3

changed the fstab lines with "yast2 disk" from /dev/hdaX to device-id lines

to for example
/dev/disk/by-id/ata-SAMSUNG_xxxxxxx-part1

upgrade process from 10.2 clean to 10.3 still crashes at the end with grub error while trying to install/upgrade the bootloader parts.
Comment 26 Christian Boltz 2007-10-15 21:25:58 UTC
(In reply to comment #24 from Lukas Ocilka)
> Howto work in progress:
> http://en.opensuse.org/How_To_Upgrade_System_with_Separate_/var_Partition

I just added a link to this bugreport and to your HowTo to http://en.opensuse.org/Bugs:Most_Annoying_Bugs_10.3

Sidenote: In your HowTo, YaST-Disk-Overview.png is displayed as broken image. It results in a 404 error on files.opensuse.org.
Comment 27 Lukas Ocilka 2007-10-16 16:00:55 UTC
(In reply to comment #26 from Christian Boltz)
> Sidenote: In your HowTo, YaST-Disk-Overview.png is displayed as broken image.
> It results in a 404 error on files.opensuse.org.

Well, I just uploaded a PNG file and that's the result. Seems to be some kind of wiki bug or ...?
Comment 28 Lukas Ocilka 2007-10-25 12:05:56 UTC
*** Bug 332392 has been marked as a duplicate of this bug. ***
Comment 29 andreas bittner 2008-06-02 21:08:06 UTC
hi there,

back to this bug again, i did an opensuse 10.1 to 10.3 upgrade.


the workaround at
http://en.opensuse.org/How_To_Upgrade_System_with_Separate_/var_Partition

is not complete / faulty. did you guys actually try what you have written and documented there? :(

first of all, labeling a mounted partition doesnt work at all, i get constant error "-3024" or such from the yast "disk" module when trying to set the label for the /var partition and change the mounting style still in the old 10.1 system.


second the screenshot
http://files.opensuse.org/opensuse/en/7/78/YaST-Disk-FstabOptions.png

clearly displays the radiobutton at "mount device by name", although the device label field is filled out and the whole workaround is supposed to mount the /var by label, so the radiobutton "volume label" needs to be selected ofcourse, otherwise it doesnt do anything at all to fstab.

and then there is the issue that the volume label apparently cant be changed/written on a mounted partition, and booting into the old system uses all the partitions. so you need to go to "init S" / single user mode and dismount the disks; as /var is an essential partition for many processes, programs and daemons have to be ended first so that /var can be cleanly dismounted and the yast disk module is able to write its label.


finally, the whole process (upgrade to 10.3) didnt really work after all, the label stuff not helping that much.

the packages get installed/upgraded, so far so good, but the bootloader installation at the end fails utterly.

i tried to look around in /etc and in /boot and all kinds of places where there could be some leftovers and problems, but during the last steps where the 10.3 upgrade tries to write the grub bootloader it gives many errors.

i will upload the yast2 log directory. please do analyze the logs and fix this grub bug. i still havent managed to start this messed system.

grub complains about missing bootup-screens (thats not really a problem), then the textmode grub startsup but then it gives all kinds of errors with old and new /dev/sda and /dev/hda errors and all that stuff and never boots anything.

i tried to bootup some rescue / live cd opensuse and tried to look around and fix some stuff, but to no avail, im not that much of a grandmaster in fixing opensuse upgrade messes :(

thanks.
Comment 30 andreas bittner 2008-06-02 21:09:58 UTC
Created attachment 219640
yast2 logdirectory from 10.1 to 10.3 upgrade - grub error even when using var with label

here come the yast2 logs from the opensuse 10.1 to 10.3 upgrade when using a disklabel for /var partition.


regards.
Comment 31 Lukas Ocilka 2008-06-04 14:43:33 UTC
Andreas, you're right. I've tried to change the LABEL for /var partition and it failed. Anyway, using UUID worked and all disks are found even if I've changed their order.
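
For anyone switching the old system's fstab to UUIDs by hand, the `UUID=...` specs can be collected from the output of the `blkid` tool. The Python sketch below parses text of the form `/dev/hda6: UUID="..." TYPE="ext3"`; that line format is an assumption about the installed blkid version, `uuid_specs` is a hypothetical helper, and the UUID in the usage note is made up.

```python
import re

_LINE = re.compile(r"^(?P<dev>\S+):\s+(?P<attrs>.*)$")
_ATTR = re.compile(r'(\w+)="([^"]*)"')


def uuid_specs(blkid_output: str) -> dict:
    """Map device names to "UUID=<uuid>" specs.

    Devices for which no UUID attribute is reported are left out.
    """
    specs = {}
    for line in blkid_output.splitlines():
        m = _LINE.match(line.strip())
        if not m:
            continue
        attrs = dict(_ATTR.findall(m.group("attrs")))
        if "UUID" in attrs:
            specs[m.group("dev")] = "UUID=" + attrs["UUID"]
    return specs
```

Given a line such as `/dev/hda6: UUID="0f3c1a2e-1111-2222-3333-444455556666" TYPE="ext3"`, the result maps `/dev/hda6` to `UUID=0f3c1a2e-1111-2222-3333-444455556666`, which can then be used as the first field of the /var line in /etc/fstab.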
Comment 32 Lukas Ocilka 2008-06-04 14:43:55 UTC
PS: Wiki has been already updated...
Comment 33 andreas bittner 2008-06-04 14:48:38 UTC
ok so if the uuid works does that mean that the last grub-step also works fine with using uuid instead of label after all?

the only problem now is that this other system i tried to upgrade is messed up. how can i force a reinstall of the grub system and the creation of the initrd and all this from the last step properly?


even from inside a chroot on the system itself didnt work out :(

thanks.
Comment 34 Lukas Ocilka 2008-06-04 15:02:04 UTC
(In reply to comment #33 from andreas bittner)
> ok so if the uuid works does that mean that the last grub-step also works fine
> with using uuid instead of label after all?

Not tested yet.

For the second question, try to ask in bug #396635.
Comment 35 andreas bittner 2008-06-26 13:43:01 UTC
Created attachment 224566
grub errors - upgrading clean opensuse 10.2 to 11.0 goldmaster

did a cleaninstall of 10.2 today with separate /boot / /usr and /var partitions.

and then upgraded to 11.0 goldmaster.


install went fine apparently most of the time, it detected and renamed the old /dev/hdaX partition names to /dev/sdaX

but at the end there were some errors, grub related i think, so that the grub menu still displays the old opensuse 10.2 kernel/entry selected as default after the first initial reboot after 11.0 setup/upgrade.

apparently there is still something wrong with the grub configuration related to these changing partition names :(

was there a separate followup bug to this issue here that was handling the grub/bootloader errors?

here are the logs.
Comment 36 Lukas Ocilka 2008-06-26 13:55:47 UTC
See bug 404114, your GRUB error has been reported there.
Comment 37 Lukas Ocilka 2008-06-26 13:57:18 UTC
The rest is going to be marked as duplicate of bug #270908
(Being solved as a feature)

*** This bug has been marked as a duplicate of bug 304979 ***
Comment 38 Lukas Ocilka 2008-06-26 13:58:02 UTC

*** This bug has been marked as a duplicate of bug 270908 ***