Bug 331604 - update of 10.2 with raid1 / and /home fails on first reboot
Status: RESOLVED FIXED
Duplicates: 309040
Alias: None
Product: openSUSE 10.3
Classification: openSUSE
Component: Installation
Version: Final
Hardware: x86-64 openSUSE 10.3
Priority: P5 - None  Severity: Critical
Target Milestone: ---
Assignee: Jozef Uhliarik
QA Contact: Jiri Srain
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-06 23:21 UTC by Joe Morris
Modified: 2008-01-29 15:14 UTC
CC List: 3 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
mdadm.conf (242 bytes, text/plain)
2007-10-25 12:34 UTC, Joe Morris
contents of /var/log/Yast2 (3.75 MB, application/x-bzip2)
2007-10-25 12:51 UTC, Joe Morris
current mdadm.conf (183 bytes, text/plain)
2007-10-26 01:00 UTC, Joe Morris

Description Joe Morris 2007-10-06 23:21:28 UTC
Because of the libata change in device names, IDE drives are now called sda instead of hda, etc., but /etc/mdadm.conf is not updated on an upgrade from 10.2 (apparently fstab is), so on the first reboot to continue the install, /home did not assemble because mdadm was still looking for hda and hdc.  fsck failed, so I ended up in the rescue system.  Correcting mdadm.conf to use sda and sdb fixed the problem.  Looks like another tweak is needed because of libata.
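As a minimal sketch of the manual fix described above (run against a sample file rather than the real /etc/mdadm.conf; the file contents here are illustrative, not taken from this bug):

```shell
# Sample mdadm.conf with pre-libata IDE device names (illustrative only):
cat > /tmp/mdadm.conf.sample <<'EOF'
DEVICE /dev/hda3 /dev/hdc3
ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/hda3,/dev/hdc3
EOF
# libata renamed hda -> sda and hdc -> sdb, so rewrite the stale names:
sed -i -e 's|/dev/hda|/dev/sda|g' -e 's|/dev/hdc|/dev/sdb|g' /tmp/mdadm.conf.sample
cat /tmp/mdadm.conf.sample
```

As later comments note, replacing the DEVICE line entirely with the special string "partitions" avoids the renaming problem altogether.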
Comment 1 Bernhard Bender 2007-10-10 14:50:44 UTC
I found a similar problem during the upgrade 10.2 -> 10.3.

My /etc/mdadm.conf apparently does not contain references to /dev/hd* or /dev/sd*, but /etc/fstab did.
fstab was not updated.
I should mention that I have a setup with IDE (P-ATA) drives: a non-RAID boot partition (/boot on hda1), partitions hd[ab]2 and hd[ab]4 as a RAID-1, and LVM on top of that.
Therefore, hd[ab] is only mentioned in fstab for /boot and swap.

I have /boot on /dev/hda1 (now /dev/sda1), so installation of the new kernel and GRUB failed. So the system ended up in the fsck rescue mode after first reboot during installation.


I was able to edit fstab and the GRUB config from the rescue system, so now everything works well.

However, the installation should really check for this.
Comment 2 Richard Creighton 2007-10-17 19:55:40 UTC
Bug 309040 and Bug 304657 were recently closed for a similar problem, and the one remaining issue, the IDE-SATA renaming, was transferred to this bug.   During the troubleshooting, I found that if the IDE drive is left online, you can expect the MBR of the new installation to be written to the IDE drive while the installed files, /boot and /grub are written to the SATA devices.   The SATA device starts out as sdaN, but at the time of the reboot/MBR update the IDE drive becomes sdaN and the SATA drive becomes sdbN.   I guess you can see the problem that results.  If the IDE drive contains Windows, then Windows gets hosed; in my case, it was an older version of SuSE not part of the MD RAID installation.   So part of the problem is the timing of the renaming, and the fact that the bootloader is written after the renaming takes effect, onto devices that were modified with new installation files under the old names.   It gets worse, but this should give you a start.
Comment 3 Cyril Hrubis 2007-10-18 11:38:24 UTC
*** Bug 309040 has been marked as a duplicate of this bug. ***
Comment 4 Thomas Fehr 2007-10-22 08:34:09 UTC
Problems with raid and bootloader on 10.3 are well known but this is
not my responsibility. Reassigning back.
Comment 6 Lukas Ocilka 2007-10-25 11:21:01 UTC
The update module doesn't change the mdadm configuration (it works only with the already-started RAID, e.g., the /dev/md0 device), but Storage does (I hope I'm not wrong).
Comment 7 Lukas Ocilka 2007-10-25 11:34:31 UTC
I see, not a storage but probably bootloader issue.
Comment 8 Thomas Fehr 2007-10-25 11:51:02 UTC
I also doubt this is due to device names being in mdadm.conf; there should be
none. libstorage has put a "DEVICE partitions" line into mdadm.conf since SL
10.1, so normally there should be no references to partition device names in
mdadm.conf. In comment #1 the user explicitly stated that there were no
references to devices in mdadm.conf.

The only reason for device names being part of mdadm.conf would be if the
update hit an mdadm.conf from a system older than SL 10.1, or if the user
manually edited mdadm.conf.

So far there is no code that touches mdadm.conf during update, so it can
happen that someone still has a pre-10.1 mdadm.conf on their system even
after updating to 10.1, 10.2 and now 10.3. Most of the problems with booting
from MD RAID were due to the bootloader not handling /boot on MD RAID, not
device names in mdadm.conf, but the issue of an mdadm.conf from a pretty old
installation still existing could also be hit.

Please attach your mdadm.conf.                              
Please tell us which SuSE Linux version you installed first, and whether you
manually edited mdadm.conf. Additionally, please attach the complete contents
of the directory /var/log/YaST2.
Comment 9 Joe Morris 2007-10-25 12:34:45 UTC
Created attachment 180547 [details]
mdadm.conf

That has been a long time now.  IIRC, I originally installed 9.1 x86_64 on this machine, but without RAID.  I added another drive probably around the same time I updated to 9.2.  This upgrade is the only time I have had to manually mess with /etc/mdadm.conf.  AFAIR, mdadm used to auto-generate its own config, meaning I do not think there was an /etc/mdadm.conf back then.  I skipped 10.0.  I think I first noticed this file around 9.3, but I am no longer sure.
Comment 10 Joe Morris 2007-10-25 12:42:27 UTC
The contents of the YaST2 logs are too big to attach, >100 MB.  If you want, I could try to split them up over several attachments, or you could be more specific.  I have y2log through y2log-9 at 10 MB each.  Just let me know and I will do what I can.
Comment 11 Joe Morris 2007-10-25 12:51:17 UTC
Created attachment 180551 [details]
contents of /var/log/Yast2

I would have never believed it could compress that much.  I think this might work.
Comment 12 Thomas Fehr 2007-10-25 13:09:16 UTC
Thanks for the information. So you really do still have a pre-10.1 mdadm.conf
on your system.

I have meanwhile added code that changes mdadm.conf to use "DEVICE partitions"
during update if devices have been renamed. So the mdadm-related portion of
the bug is fixed now. I abstain from closing it since, as far as I know, there
are still issues with the bootloader and MD devices.
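The actual YaST/libstorage change is not shown in this bug; as a rough sketch under that assumption, the rewrite Comment 12 describes amounts to something like the following, run here against a sample file with illustrative contents:

```shell
# Illustrative sketch only -- not the real YaST code. Sample pre-10.1 config:
cat > /tmp/mdadm.conf.old <<'EOF'
DEVICE /dev/hda3 /dev/hdc3
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=0123:4567:89ab:cdef devices=/dev/hda3,/dev/hdc3
EOF
# Make the config name-independent: scan /proc/partitions instead of naming
# explicit devices, and drop the per-array devices= attribute:
sed -i -e 's/^DEVICE .*/DEVICE partitions/' \
       -e 's/ devices=[^ ]*//' /tmp/mdadm.conf.old
cat /tmp/mdadm.conf.old
```

With the arrays identified by UUID only, a libata rename of the underlying disks no longer breaks assembly.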
Comment 13 Lukas Ocilka 2007-10-25 13:18:39 UTC
Thanks, Thomas.
Comment 14 Joe Morris 2007-10-25 23:17:15 UTC
I have some questions.  Should I then update my /etc/mdadm.conf?  I ask partly because I plan to upgrade our office server, and I know it has 4 RAID-1 partitions, created under 9.3.  That machine had a clean install of 10.2 on a new root (i.e. md2), while 9.3 was on md0 (/home is on md3 and a backup on md1).  I have been waiting to see what happened with this bug before upgrading and causing myself extra work.

Since my home system is my testbed for the office server, should I remove the devices line in my mdadm.conf?  Should the first line be "DEV partitions", "DEVICE partitions", or, as man mdadm mentions, "DEVICES partitions"?  Running mdadm --examine --scan gives me the same basic output as my present mdadm.conf minus the DEV line and the devices= lines.  Would a correct mdadm.conf basically be the output of mdadm --examine --scan preceded by a "DEVICE(S?) partitions" line?

I will test this change here first before making any changes at our office, and will wait for more feedback before trying it, to make sure I don't create more problems.  Thanks.

BTW, is Comment 12 correct, or the man page?  Is it DEVICE or DEVICES?
Comment 15 Joe Morris 2007-10-26 01:00:40 UTC
Created attachment 180669 [details]
current mdadm.conf

I couldn't wait and decided the knowledge gained was worth the risk, so I went ahead and tried the attached new mdadm.conf.  It booted fine and seems to work with no problems.  I will check and do the same at work, in hopes that when I upgrade to 10.3 there (from 10.2, like here at home), I will not experience the problem.  BTW, here at home I have only 2 IDE hard disks, now sda and sdb instead of hda and hdc.  At work, I have 2 IDE and 2 SATA.  They are currently hda, hdc, sda, sdb.  What will be what with 10.3?
Comment 16 Thomas Fehr 2007-10-26 08:39:53 UTC
For the DEVICE line, AFAIK DEV and DEVICES are also fine.
If you have real device names in the DEVICE line, you should replace them with
the string "partitions". This makes mdadm scan /proc/partitions for devices
to examine. If you use the special string "partitions" instead of device names
in the DEVICE line of mdadm.conf, changing disk names will not affect your
RAID setup. YaST2 has used "DEVICE partitions" since SL 10.1, so only systems
older than that should still contain DEVICE lines with real partition names.
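Putting this advice together, a name-independent mdadm.conf would look something like the following (the UUID placeholders are illustrative, not values from this bug):

```
DEVICE partitions
ARRAY /dev/md0 level=raid1 UUID=<uuid-of-md0>
ARRAY /dev/md1 level=raid1 UUID=<uuid-of-md1>
```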

Regarding disk renaming, I would expect the following mapping from 10.2 to 10.3:
hda->sda
hdc->sdb
sda->sdc
sdb->sdd
Comment 17 Joe Morris 2007-10-26 09:01:41 UTC
I checked my mdadm.conf at work.  It had "DEVICE partitions", so I should be safe to upgrade.  I also checked man mdadm.conf, and "DEVICE partitions" is correct, so I made the change to mine here at home.  I also noted the one at work did not have num-devices=2, so I removed that from mine here as well.  It booted fine and I will leave it this way.  I will let you know if I hit this bug when I upgrade the office.  If not, my problem may just have been an outdated mdadm.conf.  I do know md0 somehow assembled, just not md1, which is /home (which agrees with Comment 6).  Thanks for the education!
Comment 18 Joe Morris 2007-10-27 11:20:39 UTC
I ended up doing a new install on my md0 RAID-1 partition (which used to have 9.3), keeping my /home (md3) and backup (md1); md2 has 10.2 on it.  I specifically chose to install GRUB in the MBR, and it auto-chose sda, which was correct.  Everything installed without a single glitch: all packages, GRUB, etc.  Both RAID-1 volumes were mounted where I told it to.  As far as I am concerned, my problem was indeed an old-style mdadm.conf and not really libata.  I think Comment 12 is correct: this bug has been found and fixed.  I would also change this to FIXED, but I will let those more knowledgeable make that determination.  It worked as I had hoped; my glitch was just that, and it has been fixed.  Thanks much for all your help and insight!
Comment 20 Jozef Uhliarik 2008-01-29 15:14:54 UTC
It was fixed by Thomas Fehr.

Thanks, Thomas