Bug 720919 - two RAID arrays created, md1 and md127??
Status: RESOLVED INVALID
Alias: None
Product: openSUSE 12.1
Classification: openSUSE
Component: Installation
Version: Factory
Hardware: i686 Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assignee: Neil Brown
QA Contact: Jiri Srain
Depends on: 721905
Reported: 2011-09-28 14:05 UTC by Per Jessen
Modified: 2012-01-12 05:57 UTC

Description Per Jessen 2011-09-28 14:05:09 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-GB; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10

Factory install - I created two RAID1 arrays, and completed the installation. According to /etc/mdadm.conf they are md0 and md1, but at boot-up I get md1 and md127. This also causes a problem at shutdown (md0 can't be found).

Reproducible: Always
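
A quick way to compare the configured names with what the kernel actually assembled (a sketch using standard mdadm commands; not output captured from this system):

  grep '^ARRAY' /etc/mdadm.conf     # the names the config file expects
  cat /proc/mdstat                  # the names the running kernel is using
  mdadm --detail --scan             # ARRAY lines for the arrays as currently assembled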
Comment 1 Per Jessen 2011-10-07 20:52:45 UTC
I don't quite understand why this was left waiting, whereas bug#721905 was picked up about five days later, but never mind.
Comment 2 Neil Brown 2011-10-07 21:09:40 UTC
I don't understand either - maybe the screening team is overworked.

This sounds like it could be the same bug as the one causing bug 721905.  Does the fix for that also fix this?  Is that what you meant by setting the 'blocks' flag?

Thanks for the report anyway.
Comment 3 Per Jessen 2011-10-09 14:10:08 UTC
> This sounds like it could be the same bug as the one causing bug 721905.  Does
> the fix for that also fix this?  Is that what you meant by setting the 'blocks' flag?

No, that was unintentional, I only meant to set "depends".
Comment 4 Andreas Jaeger 2011-10-27 10:40:48 UTC
Per, does this still occur with RC1?
Comment 5 Per Jessen 2011-10-27 10:51:33 UTC
I'll do another installation and check.
Comment 6 Per Jessen 2011-10-27 11:34:58 UTC
Using current Factory:
At installation time, my two existing arrays (root, swap) were auto-detected as md126 and md127.  That didn't look quite right, so I stopped them both and did "Rescan devices" (YaST Partitioner).  I was a tiny bit surprised to see both arrays started again, but at least now as md0 and md1.
It looks like a clean install (without existing arrays) should work fine, but it can't be quite right that two existing arrays turn up as md126 and md127?
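
Roughly the command-line equivalent of those YaST steps, as a sketch (it assumes the pre-existing arrays came up as md126/md127 and that /etc/mdadm.conf already lists them under their intended md0/md1 names):

  mdadm --stop /dev/md126
  mdadm --stop /dev/md127
  mdadm --assemble --scan      # re-assemble everything listed in mdadm.conf under its configured name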
Comment 7 Neil Brown 2011-11-02 23:50:01 UTC
I think you are saying that this is fixed now, so I'll close the bug.
If I misunderstand, please re-open.

Names like md126 and md127 shouldn't really be a problem.  We mount things by UUID, and the correct UUID will be found whatever name the md device is given.
The recent fix that I mentioned (for bug 721905) makes a small change so that you should get unexpected names less often.
Comment 8 Per Jessen 2011-11-03 07:14:53 UTC
Yes, I think it's solved now. The only other thing I might just try is another install on a machine with existing arrays.  If anything turns up, I'll reopen.
Comment 9 Per Jessen 2012-01-04 11:27:20 UTC
I have a new system that I'm having problems booting, possibly due to the names md126 and md127 being assigned:

[    5.831593] md/raid1:md126: not clean -- starting background reconstruction
[    5.832664] md/raid1:md126: active with 2 out of 2 mirrors
[    5.833685] md126: detected capacity change from 0 to 478611404800
[    5.836701] EXT4-fs (md0): re-mounted. Opts: (null)
[    5.846195]  md126: unknown partition table
[    5.849866] md: bind<sdb2>
[    5.855468] md/raid1:md127: active with 2 out of 2 mirrors
[    5.856569] md127: detected capacity change from 0 to 4300760064
[    5.901294]  md127: unknown partition table

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4]
md126 : active (auto-read-only) raid1 sdb3[1] sda3[0]
      467393950 blocks super 1.2 [2/2] [UU]
        resync=PENDING

md127 : active (auto-read-only) raid1 sdb2[1] sda2[0]
      4199961 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      16787776 blocks [2/2] [UU]

The system is 12.1+updates, and I have tried all manner of deleting and recreating md1 and md2. The only way I can get it to boot is by not using md1 and md2 (although md2 is an LVM PV).  I have trouble getting hold of any logs when it fails to boot properly, as it is a remote system that I have no physical access to (and no ssh access when it doesn't boot).
Comment 10 Neil Brown 2012-01-04 22:05:39 UTC
Can you please report:
- the output of "mdadm --examine" for each member device
- /etc/mdadm.conf from the root filesystem
- /etc/mdadm.conf from the initrd
  (zcat /boot/initrd | cpio -idv etc/mdadm.conf)
  (see the sketch below)
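
A sketch of one way to collect all three in one go (the sdX device names are taken from the /proc/mdstat output above, and /boot/initrd is assumed to be the usual openSUSE location of the current initrd):

  for dev in /dev/sd[ab][123]; do mdadm --examine "$dev"; done
  cat /etc/mdadm.conf
  mkdir -p /tmp/ird && cd /tmp/ird && zcat /boot/initrd | cpio -idv etc/mdadm.conf && cat etc/mdadm.conf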

Thanks.
Comment 11 Per Jessen 2012-01-05 08:11:42 UTC
An update - yesterday evening I started seeing CPU hardware errors (machine checks), so we had the system board replaced.  The system still only boots from md0, but now at least I get md1 and md2 at boot-up.  Probably just a coincidence.

kzinti:~ # mdadm --examine /dev/md0
/dev/md0:
   MBR Magic : aa55
kzinti:~ # mdadm --examine /dev/md1
mdadm: No md superblock detected on /dev/md1.
kzinti:~ # mdadm --examine /dev/md2
mdadm: No md superblock detected on /dev/md2.

# cat /etc/mdadm.conf
DEVICE containers partitions
ARRAY /dev/md0 UUID=9e4b6d12:ba620e07:776c2c25:004bd7b2
ARRAY /dev/md/1 metadata=1.2 UUID=ed3f2d50:68312fbe:575bfc02:58733cd8 name=kzinti:1
ARRAY /dev/md/2 metadata=1.2 UUID=1e51ddd4:d53438ab:9367c94e:25a15b0f name=kzinti:2

I uncommented the last two lines when I was trying to figure out what was going on, but leaving them in does not seem to make a difference.

mdadm.conf from initrd:
AUTO -all
ARRAY /dev/md0 metadata=0.90 UUID=9e4b6d12:ba620e07:776c2c25:004bd7b2

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4]
md1 : active (auto-read-only) raid1 sdb2[1] sda2[0]
      4199961 blocks super 1.2 [2/2] [UU]

md2 : active (auto-read-only) raid1 sdb3[1] sda3[0]
      467393950 blocks super 1.2 [2/2] [UU]
        resync=PENDING

md0 : active raid1 sdb1[1] sda1[0]
      16787776 blocks [2/2] [UU]

What is (perhaps) also odd - md2 is an LVM PV:

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md2
  VG Name               kzinti
  PV Size               445.74 GiB / not usable 3.40 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              114109
  Free PE               62909
  Allocated PE          51200
  PV UUID               7Jfzm9-ZjIy-9r3k-SaJI-NwkH-ZQpc-GKtUQX

It has just one LV for the time being:

# lvdisplay
  --- Logical volume ---
  LV Name                /dev/kzinti/opensuse
  VG Name                kzinti
  LV UUID                DmQXRk-wCu5-2jtk-soZ2-2rmg-AX6W-e8VWsM
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                200.00 GiB
  Current LE             51200
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

"LV Status     NOT available" is the odd bit - I can easily activate it myself with "vgchange -a y", but that should happen automatically at boot-up.
Comment 12 Per Jessen 2012-01-05 13:12:50 UTC
Everything looked fine in the above, so I reconfigured the system to resume using md1 and md2 on startup (md1 for swap, md2 via an LVM LV).

With the LV listed in /etc/fstab, the boot fails, but it looks like the RAID issue has in fact disappeared, i.e. the three RAID devices are numbered as expected.

Unless the "mdadm --examine" above (no superblock?) is an issue that needs to be dealt, I think we can close this again. Sorry about the noise.
Comment 13 Neil Brown 2012-01-12 05:57:07 UTC
"mdadm --examine" is meant to be applied to the member devices, not the arrays.
So the fact that "no superblock" is reported when it is run on an array is not a concern.
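
For example (a sketch; sda1 and md0 are just the names from the mdstat output above):

  mdadm --examine /dev/sda1    # reads the md superblock on a member device
  mdadm --detail /dev/md0      # reports on an assembled array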

The mdadm.conf files look correct ... but then it all seems to be working now so I'm not sure what that tells us.

mdadm will assign md devices like md127 and md126 when it isn't sure that the array really belongs to "this" machine.  That is determined by the 'homehost', which e.g. "mdadm --examine /dev/sdb3" will report.

If it finds correct details in /etc/mdadm.conf it should not worry about 'homehost' but should trust the arrays and give them their 'proper' names.
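
For example, something along these lines (a sketch; 'kzinti' and the sda3/sdb3 members of md2 are taken from the output earlier in this bug, and the array must be stopped, i.e. not mounted or in use, before re-assembling with --update):

  mdadm --examine /dev/sdb3 | grep Name     # shows e.g. "Name : kzinti:2" - the part before the colon is the homehost
  mdadm --stop /dev/md2
  mdadm --assemble /dev/md2 --update=homehost --homehost=kzinti /dev/sda3 /dev/sdb3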

Maybe there is some sort of ordering problem with systemd - it would need to start LVM after 'mdadm' has assembled the arrays ... or udev would need to give the devices to LVM as they appear ... I'm not sure if it does that.

If you still have problems I suggest trying to boot with sysvinit rather than systemd.

According to http://www.novell.com/linux/releasenotes/x86_64/openSUSE/12.1/#12
this is achieved by pressing 'F5'. 
If this makes a difference, consider filing a bug against systemd.

Thanks.