Bug 731457

Summary: systemd hangs when processing swap on lvm
Product: [openSUSE] openSUSE 12.1
Reporter: Marcus Schaefer <ms>
Component: Basesystem
Assignee: Frederic Crozat <fcrozat>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: cschum, mmarek, rjschwei, shshyukriev
Version: Final
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Bug Depends on:
Bug Blocks: 735634
Attachments: debug dmesg output
             udev rule and sane udev exit

Description Marcus Schaefer 2011-11-18 20:13:26 UTC
I build an appliance which creates a swap partition on LVM on first boot.
systemd hangs when trying to activate the swap space. Later, when I simply
call swapon -a, the swap is there and works. The fstab looks like this:

   /dev/systemVG/LVSwap swap swap defaults 0 0

I will attach the dmesg output from systemd started with 

   --log-level=debug --log-target=kmsg

How to reproduce this:

The appliance exists as debugging install iso in my NFS home directory

   ~ms/LimeJeOS-openSUSE-12.1.x86_64-1.12.1.iso

to run an installation call the following:

1. qemu-img create /tmp/mydisk 4G
2. qemu-kvm -hda /tmp/mydisk \
            -cdrom LimeJeOS-openSUSE-12.1.x86_64-1.12.1.iso -boot d

During the installation the system will stop with a shell right
before run-init/systemd is called. You can check the environment here,
and if you simply type 'exit' the system will boot and you will see
the timeout happening.

This is really important for us, for the Studio team and for the preload
team to be able to work. The problem does not happen if the swap space is
not on LVM or if sysvinit is used.
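
The symptom described above (swap listed in fstab but not activated until swapon -a is run by hand) can be checked from a shell. The following is only a minimal sketch: list_fstab_swaps and inactive_swaps are hypothetical helper names, not part of kiwi or systemd, and the file paths are passed in so the check can be tried against copies of /etc/fstab and /proc/swaps.

```shell
#!/bin/sh
# Sketch: report fstab swap entries that are not currently active.
# list_fstab_swaps and inactive_swaps are hypothetical helper names.

# Print the device field of every swap entry in an fstab-style file.
list_fstab_swaps() {
    # $1: path to the fstab file; comment lines are skipped
    awk '$1 !~ /^#/ && NF >= 3 && $3 == "swap" { print $1 }' "$1"
}

# Print fstab swap devices that do not appear in a /proc/swaps-style file.
inactive_swaps() {
    # $1: fstab path, $2: swaps path (normally /proc/swaps)
    list_fstab_swaps "$1" | while read -r dev; do
        # /proc/swaps may list the device under its resolved name
        # (for example /dev/dm-3), so compare against both spellings.
        real=$(readlink -f "$dev" 2>/dev/null || printf '%s' "$dev")
        if ! awk -v d="$dev" -v r="$real" \
            'NR > 1 && ($1 == d || $1 == r) { found = 1 } END { exit !found }' "$2"
        then
            printf '%s\n' "$dev"
        fi
    done
}
```

Running inactive_swaps /etc/fstab /proc/swaps on the affected first boot would be expected to print /dev/systemVG/LVSwap, and to print nothing once swapon -a has been called.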
Comment 1 Marcus Schaefer 2011-11-18 20:14:11 UTC
Created attachment 462985 [details]
debug dmesg output
Comment 2 Marcus Schaefer 2011-11-20 23:00:59 UTC
I also tested whether the problem goes away when I use the real device
name /dev/dm-3 instead of /dev/systemVG/LVSwap in my test, but that
didn't help either; systemd just sits and waits.
Comment 3 Cornelius Schumacher 2011-11-22 10:29:01 UTC
This bites us in Studio as well. It would be great if we could get this fixed soon.
Comment 4 Frederic Crozat 2011-11-29 10:07:58 UTC
Could you test with the package from home:fcrozat:systemd / systemd?

It contains LVM-related fixes.
Comment 5 Marcus Schaefer 2011-11-29 10:11:46 UTC
The test image I built for you in my NFS home dir was built with systemd
from your home repo. So did you add changes there since I opened the report?
If yes, I will rebuild the image, but if not, I fear those changes
did not help.
Comment 6 Frederic Crozat 2011-11-29 13:30:57 UTC
The changes I'm talking about were made on 2011-11-10 18:17:05 (I'm not 100% sure when the packages were ready, but I guess it should be safe).
Comment 7 Marcus Schaefer 2011-12-01 15:48:14 UTC
Sorry, still the same problem with the latest systemd version from your
home project: systemd-37-298.1.x86_64

Have you tried with my debug image, as I wrote in comment #1?
You can reliably reproduce the problem there.
Comment 8 Frederic Crozat 2011-12-01 16:02:28 UTC
I'll try to reproduce it here.
Comment 9 Frederic Crozat 2011-12-05 18:10:02 UTC
I can confirm the bug with your image. Strangely, it works fine on the second boot. It looks like the kiwi initrd might leave the system in a "strange" state, causing systemd (and/or udevd) to not propagate lvm events.

I'll investigate further
Comment 10 Marcus Schaefer 2011-12-06 13:39:32 UTC
Yes, that's what I found out as well, but I wasn't able to identify any difference in state between the first boot (kiwi initrd) and any subsequent boot with the
SUSE (mkinitrd) initrd. That's also the reason why I invoke a debug shell
right before systemd is started: to give you the chance to debug the environment.

Thanks for your effort
Comment 11 Frederic Crozat 2011-12-07 11:23:26 UTC
ok, I've compared the situation with dracut (Fedora initramfs) and it looks like we are missing some bits (which are explained in udev 168 release notes) :

"The running udev daemon can now cleanly shut down with:
  udevadm control --exit

Udev in initramfs should clean the state of the udev database
with: udevadm info --cleanup-db which will remove all state left
behind from events/rules in initramfs. If initramfs uses
--cleanup-db and device-mapper/LVM, the rules in initramfs need
to add OPTIONS+="db_persist" for all dm devices. This will
prevent removal of the udev database for these devices.
"

so, I think (I didn't test, since I can't easily rebuild the kiwi initrd) we need to:
- ensure we stop udev "safely", i.e. using:
    udevadm control --exit
    udevadm info --cleanup-db
- flag dm/lvm devices in the udev database as "persistent" when running under kiwi, by adding an additional udev rule, as dracut does in 11-dm.rules:
    SUBSYSTEM!="block", GOTO="dm_end"
    KERNEL!="dm-[0-9]*", GOTO="dm_end"
    ACTION!="add|change", GOTO="dm_end"
    OPTIONS+="db_persist"
    LABEL="dm_end"

it looks like we should do this for our "regular" initrd too.
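
The two steps proposed in this comment could be scripted roughly as follows. This is an untested sketch of the proposal only, not the kiwi patch attached later: install_dm_persist_rule and stop_udev_cleanly are hypothetical helper names, and the rules directory is an argument because the report does not say where the kiwi initrd keeps its rules.

```shell
#!/bin/sh
# Sketch of the initramfs-side steps proposed above.
# install_dm_persist_rule and stop_udev_cleanly are hypothetical names.

# Write the db_persist rule (contents as quoted from dracut) to a rules dir.
install_dm_persist_rule() {
    # $1: udev rules directory inside the initrd (an assumption; adjust as needed)
    cat > "$1/11-dm.rules" <<'EOF'
SUBSYSTEM!="block", GOTO="dm_end"
KERNEL!="dm-[0-9]*", GOTO="dm_end"
ACTION!="add|change", GOTO="dm_end"
OPTIONS+="db_persist"
LABEL="dm_end"
EOF
}

# Ask udevd to exit, then drop database entries not marked db_persist.
stop_udev_cleanly() {
    udevadm control --exit &&
    udevadm info --cleanup-db
}
```

The rule has to be in place before udevd processes the dm devices in the initrd; stop_udev_cleanly would then run right before control is handed to the real root.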
Comment 12 Marcus Schaefer 2011-12-07 20:38:43 UTC
I made the suggested changes and gave it a test, but the result was still the
same. I will attach my patch; maybe you can see an error there.
Comment 13 Marcus Schaefer 2011-12-07 20:39:45 UTC
Created attachment 466382 [details]
udev rule and sane udev exit

anything wrong here ?
Comment 14 Frederic Crozat 2011-12-08 10:17:35 UTC
The patch looks good, but I have no guarantee that it is supposed to fix the bug :(

could you attach the kiwi file you are using, so I'll try to rebuild the appliance myself locally to test various things ?
Comment 15 Marcus Schaefer 2011-12-08 10:32:41 UTC
if you have kiwi and kiwi-templates installed you can simply type:

   kiwi --build suse-12.1-JeOS -d /tmp/mytest --lvm --type oem

That will build the lvm-enabled oem appliance, including the oem installation iso.
You can patch your local kiwi with the attached patch. It's the same one I used
for testing.
Comment 16 Frederic Crozat 2011-12-08 10:42:39 UTC
thanks, will do.. adding Greg as cc, since he is our udev maintainer ATM, so he might have a clue to this too ;)
Comment 17 Marcus Schaefer 2011-12-08 10:45:56 UTC
If you need a special build with a debug shell or something, just tell me.
I can build that for you so you don't have to spend time on appliance
creation.
Comment 18 Frederic Crozat 2011-12-08 13:43:55 UTC
still debugging the issue. After discussing with Kay Sievers, changing the way we quit udev in kiwi won't fix the issue (unrelated to this problem).

Moreover, systemd has no knowledge of lvm (only device-mapper). So, I'm guessing we might need to wait for boot.lvm to complete before mounting swap in systemd (I had to do similar fix for cryptsetup and fsck).

I'll test this hypothesis..
Comment 19 Frederic Crozat 2011-12-08 17:20:12 UTC
no change by adding a dependency on boot.lvm before running mount command.

In fact, comparing output from :
udevadm info --query=all --name /dev/kiwiVG/LVSwap

between the initial boot (when the kiwi initrd is started) and a "normal" boot shows a lot of differences, mostly in the device symlinks, which are not in the udev database on the initial boot. That would explain why systemd doesn't "react" to the dependency on this device.

I'm more and more convinced the bug is in the udev handling between the kiwi initrd, the "standard" initrd and the boot after the initrd.

Let's see if I can find more info on this..
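
The comparison described in this comment can be made repeatable by saving the udevadm output from each boot and listing the symlinks that only one of them has. A minimal sketch, assuming the output was saved to two files; udev_symlinks and missing_symlinks are hypothetical helper names. In udevadm info --query=all output, the lines starting with "S:" are the device's symlinks.

```shell
#!/bin/sh
# Sketch: compare the symlinks recorded in two saved udevadm snapshots.
# Capture each snapshot with, for example:
#   udevadm info --query=all --name /dev/kiwiVG/LVSwap > first-boot.txt
# udev_symlinks and missing_symlinks are hypothetical helper names.

# Print the sorted "S:" (symlink) entries of one saved snapshot.
udev_symlinks() {
    sed -n 's/^S: //p' "$1" | sort
}

# Print symlinks present in the reference snapshot ($2) but absent from $1.
missing_symlinks() {
    a=$(mktemp) && b=$(mktemp) || return 1
    udev_symlinks "$1" > "$a"
    udev_symlinks "$2" > "$b"
    # comm -13 keeps only the lines unique to the second file
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling missing_symlinks first-boot.txt normal-boot.txt would list the symlinks that the udev database lacks on the first boot.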
Comment 20 Frederic Crozat 2011-12-08 18:10:47 UTC
Running udevadm trigger --action=change --sysname-match=dm-*
correctly fills the udev database (but since I'm doing that after logging in, it is too late for udev to catch up).

Maybe we should do that in the "udev trigger" service file, but only when booting after kiwi?
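
For anyone repeating this experiment by hand, the re-trigger can be wrapped as below. This is only a sketch: retrigger_dm_events is a hypothetical name, and the udevadm settle call and its default timeout are additions for illustration, not something used in this report. The pattern is quoted so the shell cannot expand dm-* against files in the current directory.

```shell
#!/bin/sh
# Sketch: replay "change" events for all device-mapper nodes, then wait.
# retrigger_dm_events is a hypothetical helper name.

retrigger_dm_events() {
    # $1: settle timeout in seconds (30 is an arbitrary illustrative default)
    udevadm trigger --action=change --sysname-match='dm-*' &&
    # Block until udev has processed the queued events or the timeout expires
    udevadm settle --timeout="${1:-30}"
}
```

Run as root, retrigger_dm_events 10 would replay the events and wait at most ten seconds for udev to finish handling them.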
Comment 21 Frederic Crozat 2011-12-08 18:15:06 UTC
some info there too : http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=593625#25
Comment 22 Marcus Schaefer 2011-12-09 09:15:54 UTC
Hmm, sorry but it has no effect in my test. I changed the following
in the kiwi prepared root tree:

cat /lib/systemd/system/udev-trigger.service
[Unit]
Description=udev Coldplug all Devices
Wants=udev.service
After=udev-kernel.socket udev-control.socket
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/udevadm trigger --type=subsystems --action=add ; /sbin/udevadm trigger --type=devices --action=add ; udevadm trigger --action=change --sysname-match=dm-*

I built the image with this modification and gave it a try... no change.
Comment 23 Frederic Crozat 2011-12-09 13:32:56 UTC
ok, found the bug: udev now stores its db in /run/udev (and one part in /dev/.udev), so both need to be moved from the initrd to the running system (this only applies to "recent" distros, like Fedora 16 and openSUSE 12.1):

/run must be mounted as tmpfs, just before starting udevd:
# mount run tmpfs
mount -t tmpfs -o mode=0755,nodev,nosuid tmpfs /run

then, before killing udev :
mount --move /run /mnt/run

We should probably not kill udev the way we do ATM but it is too risky to change that now, better to postpone this for 12.2.
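
The ordering described in this comment can be summarised as two small functions. This is a sketch under assumptions, not the change that went into kiwi: mount_initrd_run and move_run_to_root are hypothetical names, the mkdir calls are added for safety, and /mnt as the new root is taken from the commands quoted above. It covers only /run; the part of the database under /dev/.udev is not handled here.

```shell
#!/bin/sh
# Sketch: keep udev's /run database alive across the initrd handover.
# mount_initrd_run and move_run_to_root are hypothetical helper names.

# Step 1, before udevd is started in the initrd: give it a tmpfs /run.
mount_initrd_run() {
    mkdir -p /run &&
    mount -t tmpfs -o mode=0755,nodev,nosuid tmpfs /run
}

# Step 2, before udev is killed: move /run into the new root.
move_run_to_root() {
    # $1: mount point of the new root (defaults to /mnt, as in the comment)
    newroot=${1:-/mnt}
    mkdir -p "$newroot/run" &&
    mount --move /run "$newroot/run"
}
```

With this ordering the database that udevd filled in the initrd is still present under /run/udev when systemd starts on the real root, which is what it needs in order to see the LVM swap device.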
Comment 24 Marcus Schaefer 2011-12-09 14:34:10 UTC
I can verify that this fixed the problem and submitted new kiwi packages

Thanks much for all your help
Comment 25 Swamp Workflow Management 2012-05-30 14:12:03 UTC
openSUSE-RU-2012:0668-1: An update that has 16 recommended fixes can now be installed.

Category: recommended (low)
Bug References: 728885,729251,729315,729636,729857,730763,731457,732247,736491,740033,740073,743159,745548,747898,752259,754344
CVE References: 
Sources used:
openSUSE 12.1 (src):    kiwi-4.98.35-1.4.1