Bug 783152

Summary: zfs-fuse won't start at boot time but starts properly if started manually
Product: [openSUSE] openSUSE 12.2
Reporter: Christian Hoffmann <kaspernasebaer>
Component: Basesystem
Assignee: Forgotten User cAXlJ_FoSf <forgotten_cAXlJ_FoSf>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: fcrozat, kaspernasebaer, suse-beta
Version: Final
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 12.2
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: Output of systemctl show zfs-fuse

Description Christian Hoffmann 2012-10-02 09:24:07 UTC

My system has a 3.2 GHz CPU and an SSD holding all system partitions.
There is a zfs pool consisting of four normal HDs. zfs-fuse is set
active for runlevels 2, 3 and 5 in the YaST runlevel editor.

After booting, the zfs filesystem is not available because the zfs-fuse
daemon is not started. If started manually with "# rczfs-fuse start",
no errors occur and the daemon starts up correctly.



Reproducible: Always

Steps to Reproduce:
1. Install the zfs-fuse package
2. Create a pool on a physical device, e.g. /dev/sdb
3. Enable the service in the runlevel editor
4. Reboot twice

Actual Results:  
Service does not start automatically at boot. All zfs-fuse related data is not available.

Expected Results:  
service starts at boot

The problem is related to the system init process and the
zfs-fuse initscript. 

The /etc/init.d/zfs-fuse script is writing its pidfile into /var/run... before
the tmpfs is mounted there. Later, the pidfile is hidden by the mount.

The system log shows this message:

WARNING: /var/run/zfs/zfs-fuse.pid already exists; aborting.

Changing the pidfile to "/run/zfs-fuse.pid" inside
the initscript gets me a bit further, but startup still does not work:

Starting zfs-fuse daemonconnect: No such file or directory
Please make sure that the zfs-fuse daemon is running.
internal error: failed to initialize ZFS library
 ..done

Adding $local_fs and $time to the "# Required-Start:" line is a
workaround which works for me but does not appear to be the ultimate
solution.
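
For reference, a sketch of where this workaround sits in the LSB header of /etc/init.d/zfs-fuse. Only the Required-Start line reflects the change described above. The Provides value and the presence of $remote_fs are assumptions for illustration, the other tags are left out, and the packaged script may differ.

```sh
### BEGIN INIT INFO
# Provides:          zfs-fuse
# Required-Start:    $remote_fs $local_fs $time
# (remaining tags unchanged and omitted in this sketch)
### END INIT INFO
```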
Comment 1 Forgotten User cAXlJ_FoSf 2012-10-07 12:12:44 UTC
(In reply to comment #0)
> Service does not start automatically at boot. All zfs-fuse related data is not
> available.
> 
> Expected Results:  
> service starts at boot
> 
> The problem is related to the system init process and the
> zfs-fuse initscript. 
> 
> The /etc/init.d/zfs-fuse script is writing its pidfile into /var/run... before
> the tmpfs is mounted there. Later, the pidfile is hidden by the mount.
> 
> The system log shows this message:
> 
> WARNING: /var/run/zfs/zfs-fuse.pid already exists; aborting.

I can reproduce this when using systemd, but it seems to work fine with sysvinit, so this seems to be some systemd quirk and does indeed look like a problem with the /var/run tmpfs.

> Changing the pidfile to "/run/zfs-fuse.pid" inside
> the initscript gets me a bit further, but startup still does not work:
> 
> Starting zfs-fuse daemonconnect: No such file or directory
> Please make sure that the zfs-fuse daemon is running.
> internal error: failed to initialize ZFS library
>  ..done

This error seems to come from the "zfs mount -a" in the init script when no fuse daemon is running. That can only get executed, though, if startproc returned success and the pidfile was successfully read. Have you changed the PIDFILE= line, or just modified the startproc argument in the script?

> Adding $local_fs and $time to the "# Required-Start:" line is a
> workaround which works for me but does not appear to be the ultimate
> solution.

I don't see anything wrong with the dependencies: the $remote_fs facility implies $local_fs, and that should mean that all filesystems need to be mounted before zfs-fuse can be started. CC'ing the systemd maintainer for clarification.
Comment 2 Christian Hoffmann 2012-10-08 06:24:19 UTC
(In reply to comment #1)
> (In reply to comment #0)

> This error seems to come from the "zfs mount -a" in the init script when no
> fuse daemon is running. That can only get executed, though, if startproc
> returned success and the pidfile was successfully read. Have you changed the
> PIDFILE= line, or just modified the startproc argument in the script?

I have changed the PIDFILE= line inside the startup script. 

My impression is that the tmpfs actually only gets mounted during zfs-fuse
startup, and therefore a pidfile was sometimes written to the
mountpoint directory and later "overmounted", which sometimes led to
the confusing error message that a "pidfile already exists" at startup.

> 
> > Adding $local_fs and $time to the "# Required-Start:" line is a
> > workaround which works for me but does not appear to be the ultimate
> > solution.
> 
> I don't see anything wrong with the dependencies: the $remote_fs facility
> implies $local_fs, and that should mean that all filesystems need to be mounted
> before zfs-fuse can be started. CC'ing the systemd maintainer for clarification.

Maybe zfs-fuse is considered as part of $local_fs and the argument is therefore ignored? (Sorry, I am not very familiar with systemd...)
Comment 3 Frederic Crozat 2012-10-09 14:12:36 UTC
Could you attach the output of "systemctl show zfs-fuse.service"?

In openSUSE 12.3, /var/run will be a symlink to /run, which is mounted as tmpfs in the initrd, so it shouldn't be a problem (in the future).
Comment 4 Christian Hoffmann 2012-10-10 09:17:44 UTC
Created attachment 508893 [details]
Output of systemctl show zfs-fuse
Comment 5 Frederic Crozat 2012-10-10 12:42:40 UTC
After some tests: the zfs-fuse init script uses the S runlevel by default, which is not supported under systemd (which is why enabling it with the YaST tool or chkconfig doesn't really enable it). The runlevel should probably be changed to S 0 3 in the initscript.

Moreover, after this fix, zfs-fuse doesn't start, because /var/run/zfs doesn't exist (since /var/run is a tmpfs, /var/run/zfs doesn't exist at boot). Either the initscript should create the directory or (preferably) a tmpfiles.d snippet should be shipped in the package in /usr/lib/tmpfiles.d (see man tmpfiles.d for the syntax) to create /var/run/zfs at startup.
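
A minimal sketch of the tmpfiles.d approach, written as a shell step that generates the snippet. The file name zfs-fuse.conf, the DESTDIR staging variable and the 0755 root:root ownership are assumptions for illustration. The single "d" line follows the tmpfiles.d format (type, path, mode, user, group, age).

```sh
#!/bin/sh
# Sketch only: stage a tmpfiles.d snippet that recreates /var/run/zfs at boot.
# DESTDIR and the file name zfs-fuse.conf are hypothetical.
DESTDIR="${DESTDIR:-$(mktemp -d)}"
conf="$DESTDIR/usr/lib/tmpfiles.d/zfs-fuse.conf"

mkdir -p "$(dirname "$conf")"

# tmpfiles.d fields: type path mode user group age
cat > "$conf" <<'EOF'
d /var/run/zfs 0755 root root -
EOF

echo "wrote $conf"
```

On an installed system, running "systemd-tmpfiles --create" should apply such a snippet without a reboot, which makes it easy to check that the directory appears.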

You might want to integrate the work done by the Fedora people to properly integrate zfs-fuse into systemd (.service and so on): http://pkgs.fedoraproject.org/cgit/zfs-fuse.git/tree/
Comment 6 Forgotten User cAXlJ_FoSf 2012-10-11 09:57:36 UTC
(In reply to comment #5)
> After some tests: the zfs-fuse init script uses the S runlevel by default,
> which is not supported under systemd (which is why enabling it with the YaST
> tool or chkconfig doesn't really enable it). The runlevel should probably be
> changed to S 0 3 in the initscript.

Yes, that should rather be S 1 2 3 4 5; it must never get stopped on runlevel changes. I'll fix that.

> Moreover, after this fix, zfs-fuse doesn't start, because /var/run/zfs doesn't
> exist (since /var/run is a tmpfs, /var/run/zfs doesn't exist at boot).
> Either the initscript should create the directory or (preferably) a
> tmpfiles.d snippet should be shipped in the package in /usr/lib/tmpfiles.d
> (see man tmpfiles.d for the syntax) to create /var/run/zfs at startup.

No, the zfs-fuse daemon will create /var/run/zfs itself when it tries to create a socket there, before writing the pidfile. As a matter of fact, it works fine with sysvinit, where /var/run is also a tmpfs. I even tried an explicit mkdir in the init script, but it still fails.

> You might want to integrate the work done by the Fedora people to properly
> integrate zfs-fuse into systemd (.service and so on):
> http://pkgs.fedoraproject.org/cgit/zfs-fuse.git/tree/

Actually, all it does is call the renamed sysv init script, so there's no advantage in copying that.
Comment 7 Christian Boltz 2012-10-11 13:04:50 UTC
(In reply to comment #6)
> Yes, that should rather be S 1 2 3 4 5; it must never get stopped on runlevel
> changes. I'll fix that.

Please do not include runlevel 4 - this runlevel is never used, and adding it will probably cause warnings from insserv.
Comment 8 Frederic Crozat 2012-10-11 14:56:52 UTC
I would suggest adding "# PIDFile: /var/run/zfs/zfs-fuse.pid" to the initscript
header. This will help systemd correctly detect whether the daemon is running.

It looks like startproc is finishing before /var/run/zfs/zfs-fuse.pid is
written, which is breaking the initscript. You should add the "-w" option to
startproc, which will ensure it waits for the zfs-fuse parent process to
terminate (and therefore for the PID to be written) before the rest of the
initscript is executed. In my test, this seems to fix the issue.
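
To illustrate the race: the rest of the initscript must not run until the pidfile exists. On openSUSE, adding "-w" to startproc is what achieves that here. The sketch below is only a portable approximation that polls for the pidfile instead. The function name wait_for_pidfile, the 10 second default timeout and the demo paths are made up for illustration and are not part of the zfs-fuse initscript.

```sh
#!/bin/sh
# wait_for_pidfile PIDFILE [TIMEOUT_SECONDS]
# Polls once per second until PIDFILE exists and is non-empty.
# Returns 0 once the file is there, non-zero if the timeout expires first.
wait_for_pidfile() {
    pidfile=$1
    timeout=${2:-10}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        [ -s "$pidfile" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check after the final sleep.
    [ -s "$pidfile" ]
}

# Demo: a background job stands in for a daemon that writes its pidfile late.
demo_dir=$(mktemp -d)
( sleep 1; echo "$$" > "$demo_dir/demo.pid" ) &

if wait_for_pidfile "$demo_dir/demo.pid" 5; then
    echo "pidfile appeared, safe to continue"
else
    echo "pidfile did not appear in time" >&2
fi
wait
```

In an initscript, a call like this would sit between starting the daemon and anything that needs it, such as "zfs mount -a".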
Comment 9 Forgotten User cAXlJ_FoSf 2012-10-11 21:16:20 UTC
(In reply to comment #8)
> I would suggest adding "# PIDFile: /var/run/zfs/zfs-fuse.pid" to the initscript
> header. This will help systemd correctly detect whether the daemon is running.
> 
> It looks like startproc is finishing before /var/run/zfs/zfs-fuse.pid is
> written, which is breaking the initscript. You should add the "-w" option to
> startproc, which will ensure it waits for the zfs-fuse parent process to
> terminate (and therefore for the PID to be written) before the rest of the
> initscript is executed. In my test, this seems to fix the issue.

Thanks, I've added both and done some testing; the latter indeed makes startup reliable with systemd. I'll push updates for 12.1/12.2 soon.
Comment 10 Bernhard Wiedemann 2012-10-11 22:00:07 UTC
This is an autogenerated message for OBS integration:
This bug (783152) was mentioned in
https://build.opensuse.org/request/show/137911 Factory / zfs-fuse
Comment 11 Benjamin Brunner 2012-10-17 12:21:32 UTC
Update released for 12.1 and 12.2. Resolved fixed.
Comment 12 Swamp Workflow Management 2012-10-17 13:08:57 UTC
openSUSE-RU-2012:1356-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 783152
CVE References: 
Sources used:
openSUSE 12.2 (src):    zfs-fuse-0.7.0-11.4.1
openSUSE 12.1 (src):    zfs-fuse-0.7.0-7.4.1