Bug 804544

Summary: libzypp: doesn't handle symlinks in export pathname on NFSv4 properly
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Jeff Mahoney <jeffm>
Component: libzypp
Assignee: E-mail List <zypp-maintainers>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P1 - Urgent
Version: 13.1 Beta 1
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: Development
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
zypper.log
Example code to confirm that an NFS4 mount is expected

Description Jeff Mahoney 2013-02-19 21:56:28 UTC
Created attachment 525356
zypper.log

On my home server, I have a bunch of ISOs loop mounted under /data/tftpboot/images/<release>/<arch>

I've bind-mounted /data as /nfs4/data for export. Also in /nfs4 is a symlink from 'tftpboot' to 'data/tftpboot'. I have the same symlink in /, linking /tftpboot to /data/tftpboot.

If I mount server:/tftpboot/opensuse12.3/x86_64 with NFSv3, it works as expected and I can access the image. If I mount it with NFSv4, it also works as expected and I can access the image.

However, when I use that image as a zypper repository, it fails with:
2013-02-19 16:15:13 <1> sled1(31707) [zypp] MediaHandler.cc(checkAttached):604 Looking for media(nfs<192.168.1.254:/tftpboot/images/opensuse12.2/x86_64>)attached(*/var/adm/mount/AP_0xesP9Zs)
2013-02-19 16:15:13 <1> sled1(31707) [zypp++] MediaHandler.cc(checkAttached):611 MountEntries: {
[...]
2013-02-19 16:15:13 <1> sled1(31707) [zypp++] MediaHandler.cc(checkAttached):611   192.168.1.254:/data/tftpboot/images/opensuse12.2/x86_64 on /var/adm/mount/AP_0xesP9Zs type nfs4 (ro,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=300,retrans=2,sec=sys,clientaddr=192.168.1.2,local_lock=none,addr=192.168.1.254)
2013-02-19 16:15:13 <1> sled1(31707) [zypp++] MediaHandler.cc(checkAttached):611 }
2013-02-19 16:15:13 <2> sled1(31707) [zypp] MediaHandler.cc(checkAttached):622 Attached media not in mount table ...

I tracked it here: zypp/media/MediaHandler.cc:576
            if( ref.mediaSource->equals( media) &&
                ref.attachPoint->path == Pathname(e->dir))

It turns out that the NFSv4 client mounts the root of the NFSv4 export and then walks the requested path, component by component, to do the final mount. Any symbolic links in the path are necessarily expanded, and the expanded path is what gets reported as the mount source.

So rather than seeing the path I requested:
192.168.1.254:/tftpboot/images/opensuse12.2/x86_64
I see the resolved one:
192.168.1.254:/data/tftpboot/images/opensuse12.2/x86_64

.. and the ref.mediaSource->equals(media) call fails.

I asked Neil Brown, one of our resident NFS experts, about this and he said that there's no easy way to resolve the requested pathname to the one actually used, other than manually mounting the NFSv4 root and doing the same pathname resolution. Even then, NFSv4 referrals may mean that the mount attempt was referred to a different server entirely, so the server reported may differ from the one requested.

I don't know the history of manually checking the result of the mount command after success, but with NFSv4 it may make sense to limit it to the file system type and mount point or to trust a successful mount command.
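
To make the suggestion concrete, here is a minimal sketch of such a relaxed check. The types and names (MountEntry, AttachedMedia, matchesAttached) are hypothetical stand-ins for illustration and are not libzypp's actual API. It also assumes that NFSv4 mounts are reported with the type "nfs4", as in the log above.

```cpp
#include <string>

// Hypothetical, simplified stand-in for one mount table entry.
struct MountEntry {
    std::string src;   // e.g. "192.168.1.254:/data/tftpboot/images/opensuse12.2/x86_64"
    std::string dir;   // mount point
    std::string type;  // file system type, e.g. "nfs" or "nfs4"
};

// Hypothetical record of what was requested when the media was attached.
struct AttachedMedia {
    std::string requestedSrc;  // server:/path as given by the user
    std::string attachPoint;   // where it was mounted
};

// Returns true if the mount entry is accepted as the attached media.
bool matchesAttached(const AttachedMedia& media, const MountEntry& entry) {
    // The mount point must always match.
    if (entry.dir != media.attachPoint)
        return false;
    // Assumption: for nfs4 the reported source may legitimately differ
    // (expanded symlinks, referrals), so it is not compared.
    if (entry.type == "nfs4")
        return true;
    // Everything else keeps the strict source comparison.
    return entry.src == media.requestedSrc;
}
```

With the values from the log, an entry of type nfs4 on /var/adm/mount/AP_0xesP9Zs would be accepted even though its source is /data/tftpboot/... rather than /tftpboot/..., while the same entry with type nfs would still be rejected.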

Full zypper.log attached.
Comment 1 Jeff Mahoney 2013-09-19 20:50:48 UTC
This issue is still in openSUSE 13.1 Beta1.

Incidentally, mounting the NFS root and following the path may be a sane way to handle nfs4 mounting. It is exactly what the kernel does. On the server, everything exported via NFS4 has to be in the same namespace.
Comment 2 Jeff Mahoney 2013-09-20 15:18:58 UTC
OTOH, are there cases in which the kernel will have a successful mount call end up with an unusable file system? Is isAttached even needed anymore?
Comment 3 Jeff Mahoney 2013-09-20 16:31:31 UTC
Created attachment 559318
Example code to confirm that an NFS4 mount is expected

NFS4 involves referrals and submounts for requests that cross remote file system boundaries.

e.g. on my server, I have:

/nfs4 (root fs)
/nfs4/data (btrfs)
/nfs4/data/tftpboot/images/openSUSE13.1/x86_64 (loop mounted iso9660)

On the client, these appear like this:
starscream:/ on /exports/starscream type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=2001:470:1f07:5f7:eda7:3faf:37eb:39c6,local_lock=none,addr=2001:470:1f07:5f7::1)
starscream:/data on /exports/starscream/data type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,port=0,timeo=600,retrans=2,sec=sys,clientaddr=2001:470:1f07:5f7:eda7:3faf:37eb:39c6,local_lock=none,addr=2001:470:1f07:5f7::1)
starscream:/data/tftpboot/images/opensuse13.1/x86_64 on /exports/starscream/data/tftpboot/images/opensuse13.1/x86_64 type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,port=0,timeo=600,retrans=2,sec=sys,clientaddr=2001:470:1f07:5f7:eda7:3faf:37eb:39c6,local_lock=none,addr=2001:470:1f07:5f7::1)

Any of those transitions could have been referred to a different server or could have followed a server-side symbolic link. The entry in /proc/mounts will reflect the "real" pathname and server that were followed. When libzypp encounters it, it assumes another file system has been mounted instead and fails the confirmation.
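
As a rough illustration of the lookup involved, the sketch below searches mount table text in /proc/mounts format (source, mount point, type, options, ...) for the entry on a given mount point. MountLine and findByMountPoint are hypothetical names, not libzypp code, and the sketch does not decode escaped characters such as \040 in paths.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the first three fields of a /proc/mounts line.
struct MountLine {
    std::string src;
    std::string dir;
    std::string type;
};

// Scans mount table text and stores the entry mounted on mountPoint in 'out'.
// Returns false if no entry uses that mount point.
bool findByMountPoint(const std::string& table,
                      const std::string& mountPoint,
                      MountLine& out) {
    std::istringstream lines(table);
    std::string line;
    bool found = false;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        MountLine m;
        if (fields >> m.src >> m.dir >> m.type && m.dir == mountPoint) {
            out = m;       // a later entry on the same mount point shadows earlier ones
            found = true;
        }
    }
    return found;
}
```

Looking up the attach point this way finds the nfs4 entry, but its src field holds the resolved server and path, so a string comparison against the requested source does not match.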

If the server/path pair doesn't match the requested ones, we can confirm that we have the right file system by mounting the root of the original server and then following the path ourselves. By comparing the st_dev and st_ino of the directories, we can confirm that the right file system has been located.
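
The comparison step can be sketched as follows. This is not the attached example code. sameDirectory is a hypothetical helper, and the sketch assumes the client reports the same device number for both views of the remote file system. Mounting the server root and walking the path are left out.

```cpp
#include <sys/stat.h>
#include <string>

// Returns true if both paths resolve to the same directory,
// i.e. they have the same device number and inode number.
bool sameDirectory(const std::string& a, const std::string& b) {
    struct stat sa{};
    struct stat sb{};
    // stat() follows symbolic links, which is what we want here.
    if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0)
        return false;
    if (!S_ISDIR(sa.st_mode) || !S_ISDIR(sb.st_mode))
        return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

The idea would be to call it with the attach point and with the requested path as reached below a temporary mount of the server's NFSv4 root. If it returns true, the mount is the one that was asked for.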
Comment 4 Michael Andres 2013-09-23 07:47:10 UTC
I'll have a look at it. 

AFAIR isAttached was originally written to detect whether devices (CD/HD) are already mounted in order to reuse the existing mounts. It is probably not really needed for NFS. I'll check how the code mutated throughout the past years.
Comment 5 Michael Andres 2013-09-27 11:19:20 UTC
fixed for libzypp-13.7.1