Bugzilla – Bug 805415
raid1: started degraded after crash
Last modified: 2013-03-12 02:46:29 UTC
Hi Neil, when my system crashes, after a fresh boot, my raid1 is started degraded (I'm not sure if it happens every time):

# journalctl | grep \\.md
Feb 23 22:26:37 bellona.site boot.md[1538]: Starting MD RAID
mdadm: /dev/md1 has been started with 1 drive (out of 2).
Feb 23 22:26:37 bellona.site boot.md[1538]: ..done

# cat /proc/mdstat
Personalities : [raid0] [raid1]
md1 : active raid1 sda2[1]
      48794496 blocks super 1.2 [2/1] [_U]

I have to manually call mdadm -a:

# mdadm -a /dev/sdb2
mdadm: added /dev/sdb2
bellona:~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md1 : active raid1 sdb2[2] sda2[1]
      48794496 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.0% (6528/48794496) finish=248.5min speed=3264K/sec

fdisk output:
/dev/sda2          411648    98066431    48827392   83  Linux
/dev/sdb2            2048    97656831    48827392   83  Linux

# mdadm --misc --detail /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Sat Sep  8 20:18:43 2012
     Raid Level : raid1
     Array Size : 48794496 (46.53 GiB 49.97 GB)
  Used Dev Size : 48794496 (46.53 GiB 49.97 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Feb 23 23:11:34 2013
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 1% complete

           Name : bellona:1
           UUID : 3bad2815:4e6bfc83:b113591c:22fd63f8
         Events : 12658

    Number   Major   Minor   RaidDevice State
       2     259   917504        0      spare rebuilding   /dev/sdb2
       1     259   262144        1      active sync   /dev/sda2

# rpm -q mdadm
mdadm-3.2.6-4.1.x86_64

# uname -a
Linux bellona.site 3.8.0-rc7-next-20130218_64+ #1768 SMP Mon Feb 18 10:08:51 CET 2013 x86_64 x86_64 x86_64 GNU/Linux
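As an aside, the degraded state is visible directly in the "[2/1] [_U]" status line of /proc/mdstat: the first number is the configured device count, the second the active count, and "_" marks a missing slot. A minimal, hypothetical Python sketch of spotting this (the function name `degraded_arrays` is mine, and the sample text is copied from the output above):

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status line shows missing members.

    Looks for lines like "[2/1] [_U]": total/active device counts followed
    by a per-slot map in which '_' marks an absent device.
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        m = re.match(r"(md\d+) : ", line)
        if m:
            current = m.group(1)  # remember which array the next lines describe
        m = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if m and current:
            total, active = int(m.group(1)), int(m.group(2))
            if active < total or "_" in m.group(3):
                degraded.append(current)
    return degraded

sample = """\
Personalities : [raid0] [raid1]
md1 : active raid1 sda2[1]
      48794496 blocks super 1.2 [2/1] [_U]
"""
print(degraded_arrays(sample))  # md1 is reported as degraded
```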
Can I get the kernel logs from when it was booting? I think this is most likely caused by some sort of race where one of the devices is being held busy by udev in some way while mdadm is trying to assemble the array, so it only manages to get one device. Hopefully the kernel logs will have more hints.
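To illustrate the "held busy" hypothesis (this is only a hypothetical diagnostic sketch, not the actual fix; `pids_holding_open` is a name I made up): on Linux one can scan /proc/*/fd to see which processes still have a member device open at the moment mdadm assembles the array.

```python
import os

def pids_holding_open(path):
    """Return PIDs of processes that have `path` open (Linux-only).

    Walks /proc/<pid>/fd and resolves each descriptor's symlink;
    processes we lack permission to inspect are silently skipped.
    """
    target = os.path.realpath(path)
    holders = []
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        fd_dir = "/proc/%s/fd" % pid
        try:
            for fd in os.listdir(fd_dir):
                try:
                    if os.path.realpath(os.path.join(fd_dir, fd)) == target:
                        holders.append(int(pid))
                        break
                except OSError:
                    pass
        except OSError:
            pass
    return holders

# e.g. pids_holding_open("/dev/sdb2") run from the boot script, right
# before mdadm assembles, would show whether udev still holds the device
```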
Created attachment 526376 [details]
/var/log/message excerpt

Yeah, sure.
Nothing much useful there, unfortunately. It does confirm that sda2 is definitely visible and working before mdadm runs, but it doesn't show why mdadm didn't use it. Does

journalctl | grep mdadm

show anything? If/when it happens again, could you collect the output of

mdadm -E /dev/sd[ab]2

before adding the missing device back in? Is it always the same device that is missing?
(In reply to comment #3)
> does
> journalctl | grep mdadm
> show anything?

I tried that when it happened and there was nothing for mdadm, nor for \\.md.

> If/when it happens again, could you collect the output of
> mdadm -E /dev/sd[ab]2
> before adding the missing device back in.

It hasn't happened again yet, and I tried to reproduce it several times with dd if=/dev/urandom of=file followed by echo c >/proc/sysrq-trigger. As soon as it recurs, I will provide that info.
I'm pretty sure this is the same problem as bug 793954. You'll find a work-around there. I'll hopefully come up with a proper fix soon.

*** This bug has been marked as a duplicate of bug 793954 ***