Bug 1151044 - NFS exports from latest TW cannot be mounted from older systems
Summary: NFS exports from latest TW cannot be mounted from older systems
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Network
Version: Current
Hardware: Other Other
Importance: P5 - None : Critical
Target Milestone: ---
Assignee: Neil Brown
QA Contact: E-mail List
Reported: 2019-09-17 10:47 UTC by Peter Sütterlin
Modified: 2020-03-27 04:43 UTC

jslaby: needinfo? (paka)


Description Peter Sütterlin 2019-09-17 10:47:41 UTC
Since the latest Tumbleweed update to 20190909, NFS shares exported by the upgraded system cannot be mounted by older systems.

I have a home NFS server running Leap 42.3 that I recently re-installed (new partition) with TW 20190909.  The main client machine is running TW 20180525 (no updates because 3rd party software doesn't like newer ffmpeg stuff).

The client cannot mount the export from the new installation.  It can see exports from the server machine using 'showmount -e', but trying to mount the export results in a
  mount.nfs: access denied by server while mounting 192.168.1.36:/export/Video

From my laptop (also running TW 20190909) I can mount the export without problems.

I've also set up an export on my laptop, but neither the old TW nor the Leap 42.3 installation can mount the share; they get the same error.

The Leap machine runs nfs-client 1.3.0, the old TW has 2.1.1.
The current TW has 2.3.3.
Comment 1 patrick shanahan 2019-09-17 12:04:54 UTC
I experienced the same.  Was later able to mount after updating nfs-client on faulty client.  Sorry, do not recall the old version.
Comment 2 Jiri Slaby 2019-09-17 16:29:14 UTC
(In reply to patrick shanahan from comment #1)
> Sorry, do not recall the old version.

Do:
# grep nfs-client /var/log/zypp/history
Comment 3 patrick shanahan 2019-09-17 17:23:29 UTC
here is most recent:

# 2019-07-16 19:20:30 nfs-client-2.1.1-8.5.x86_64.rpm installed ok
2019-07-16 19:20:30|install|nfs-client|2.1.1-8.5|x86_64||Tumbleweed.OSS|3e2cb8cbbe65f13d11c97930464b4578645bec08a3702ccf04554f92d98975f0|
2019-08-19 10:58:33|install|yast2-nfs-client|4.2.0-1.1|noarch||Tumbleweed.OSS|db0c73b099f176a50a2167e75109b2809b8085fbb02ef942ffc4468984ae5c22|
# 2019-09-09 08:48:42 nfs-client-2.1.1-8.6.x86_64.rpm installed ok
2019-09-09 08:48:42|install|nfs-client|2.1.1-8.6|x86_64||Tumbleweed.OSS|5bb2119ae786dd5c0a862a393542d0da6cbcd2ec24414005c2f00d3b96b387ce|
2019-09-09 08:52:52|install|yast2-nfs-client|4.2.2-1.1|noarch||Tumbleweed.OSS|a066f464bb706926cea89e0d7d9dce27c8814093602e066ac7db120651b7eca4|
# 2019-09-13 08:06:16 nfs-client-2.3.3-9.1.x86_64.rpm installed ok
2019-09-13 08:06:16|install|nfs-client|2.3.3-9.1|x86_64||Tumbleweed.OSS|5cadb0ae9ff2c9996c08d6a9409a5a7a8ef7a791a9e1a1744202161e6ed238fe|
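
The pipe-separated entries above can also be filtered with a short script instead of grep. This is only a minimal sketch in Python: it assumes the first four fields are timestamp, action, package name and version (inferred from the lines shown here, not from the zypp documentation), and the function name `package_versions` is hypothetical.

```python
def package_versions(history_text, package):
    """Return (timestamp, version) for each install entry of `package`.

    Assumes the layout seen in the excerpt above:
    timestamp|action|name|version|arch|...
    """
    result = []
    for line in history_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' comment lines seen in the excerpt.
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        if len(fields) < 5:
            continue
        timestamp, action, name, version = fields[:4]
        # Exact name match, so yast2-nfs-client is not reported.
        if action == "install" and name == package:
            result.append((timestamp, version))
    return result


# Sample input copied from the history excerpt in this comment.
sample = """\
# 2019-09-09 08:48:42 nfs-client-2.1.1-8.6.x86_64.rpm installed ok
2019-09-09 08:48:42|install|nfs-client|2.1.1-8.6|x86_64||Tumbleweed.OSS|5bb2119ae786dd5c0a862a393542d0da6cbcd2ec24414005c2f00d3b96b387ce|
2019-09-09 08:52:52|install|yast2-nfs-client|4.2.2-1.1|noarch||Tumbleweed.OSS|a066f464bb706926cea89e0d7d9dce27c8814093602e066ac7db120651b7eca4|
# 2019-09-13 08:06:16 nfs-client-2.3.3-9.1.x86_64.rpm installed ok
2019-09-13 08:06:16|install|nfs-client|2.3.3-9.1|x86_64||Tumbleweed.OSS|5cadb0ae9ff2c9996c08d6a9409a5a7a8ef7a791a9e1a1744202161e6ed238fe|
"""

for timestamp, version in package_versions(sample, "nfs-client"):
    print(timestamp, version)
```

Unlike a plain grep for "nfs-client", the exact name comparison leaves out the yast2-nfs-client entries.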
Comment 4 Neil Brown 2019-09-17 22:19:29 UTC
- Please report the output of "mount -v ...mountpoint", i.e. add '-v' to the mount command.
- Are you expecting NFSv3 or NFSv4?
- Does /etc/exports on the server contain "fsid=0"?  Maybe provide the whole /etc/exports if that's ok.

I have no idea what might cause this, but I'm hoping something in the above might give me a clue.
Comment 5 Peter Sütterlin 2019-09-18 07:34:39 UTC
speedy:~ # showmount -e 192.168.1.36
Export list for 192.168.1.36:
/home/Extern 192.168.1.0/24
speedy:~ # mount -v -v -v 192.168.1.36:/home/Extern /mnt
mount.nfs: timeout set for Wed Sep 18 08:12:38 2019
mount.nfs: trying text-based options 'vers=4,addr=192.168.1.36,clientaddr=192.168.1.33'
mount.nfs: mount(2): Permission denied
mount.nfs: access denied by server while mounting 192.168.1.36:/home/Extern

I've tried setting 'debug=all' for nfsd in /etc/nfs.conf.
I do see some startup messages, but *nothing* when I try to connect/mount.
However, now the mount call (on the Leap 42.3 machine) hangs instead of returning with an error.  
No idea what to make of that.

What is "fsid=0"?  I've never ever used that in >20 years.....
So no, it's not in exports, that just has
/home/Extern    192.168.1.0/24(rw,no_root_squash,async)

Similar for the other case, there it's /export/Video.  In both cases top is not /, but a separate partition (/home and /extern; both are XFS partitions)

I have tried with fsid=0, but that doesn't change things here
Comment 6 patrick shanahan 2019-09-18 12:26:21 UTC
08:12 crash2: ~ # mount -v /mnt/nfs/ExT4/
mount.nfs: trying text-based options 'bg,retry=500,vers=4.2,addr=192.168.1.3,clientaddr=192.168.1.8'
08:12 crash2: ~ # mount -v /mnt/nfs/ExT4/
mount.nfs: trying text-based options 'bg,retry=500,vers=4.2,addr=192.168.1.3,clientaddr=192.168.1.8'


The version is set to the highest available, but I expect NFSv4.

/etc/exports
/mnt/ExT4       192.168.1.0/24(rw,root_squash,sync,no_subtree_check)


the mount works now since updating nfs-client, so adding "fsid=0" will not prove anything...

fwiw: subject drive/export is usb3 ntfs formatted external drive
Comment 7 Peter Sütterlin 2019-09-18 12:49:30 UTC
Some additional info:
I just downgraded my (TW 20190909) laptop to
  nfs-client-2.1.1-8.6.x86_64
  nfs-kernel-server-2.1.1-8.6.x86_64
  nfsidmap-0.27-2.5.x86_64
from the history tree of snapshot 20190907 (which in turn removed libnfsidmap1)

With unchanged export settings etc. the Leap 42.3 machine can now mount the share without issues.
Comment 8 Neil Brown 2019-09-18 21:52:54 UTC
There is some issue with firewalls and NFS in the latest update.
That would not explain "Permission denied", but might explain the mount
call hanging.  So if you try again, please disable any firewall and see if that makes a difference.

Error message concerning the failure would either come from mountd or from the kernel.
It doesn't seem possible to enable mountd debugging with /etc/nfs.conf.
You would need to edit /usr/lib/systemd/system/nfs-mountd.service and add a 
  --debug all

to the ExecStart line.
Then 
  systemctl daemon-reload
  systemctl restart nfs-mountd

For kernel messages,
   rpcdebug -m nfsd -s all

Then try the mount.

Kernel messages will appear in the output of 'dmesg'.
mountd messages should appear in 
  journalctl -u nfs-mountd.service --since -1day

You can turn off debugging by reverting the change to nfs-mountd.service and restarting the service.  For the kernel
  rpcdebug -m nfsd -c all


If you see "fsid=0" as an export option on one directory, it becomes the directory you get when you mount "/" over NFSv4.  This is an internal implementation detail that is now best forgotten, but some installers still set it and some installations might have it left over from the past.  It could give confusing results - possibly even "Permission denied" errors.
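
To check quickly whether any export still carries a leftover "fsid=0", something like the following could be used. It is a minimal sketch in Python, assuming the simple one-line-per-export layout quoted in this bug (path, then clients with parenthesised options); it does not handle quoted paths or line continuations, and the function name `exports_with_fsid0` is hypothetical.

```python
import re


def exports_with_fsid0(exports_text):
    """Return the export paths whose option list contains fsid=0."""
    flagged = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        path = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        # Options are the comma-separated items inside each (...) group.
        for options in re.findall(r"\(([^)]*)\)", rest):
            if "fsid=0" in [opt.strip() for opt in options.split(",")]:
                flagged.append(path)
                break
    return flagged


# The first line is the export from comment 5; the second is a made-up
# entry added only to show a positive match.
example = """\
/home/Extern    192.168.1.0/24(rw,no_root_squash,async)
/export         192.168.1.0/24(rw,fsid=0,sync)
"""

print(exports_with_fsid0(example))
```

On a real server the text would come from reading /etc/exports rather than from the inline example.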
Comment 9 Peter Sütterlin 2019-09-18 22:43:21 UTC
Firewall is already disabled, that was my first guess.  

So I went ahead and added the option in the service file.
But now the 42.3 machine could mount the share without problems.
I had however updated to 20190917 in the meantime.

I'll try again at home, that one is still at 0909.  But I won't be there before 1-2 days from now....
Comment 10 Peter Sütterlin 2019-09-19 10:52:08 UTC
Hmm, I just rebooted the laptop to a snapshot of 0909, before the update, when I had issues.

This one now also can be mounted from the 42.3 machine.  
Would it be correct to assume that this suggests it is some setting in /var that had been corrected by the updated system, and now also works for the snapshot (as var is not part of the snapshot)?
I'll see if I can compare the laptop with the machine at home...
Comment 11 Peter Sütterlin 2019-09-22 11:17:39 UTC
So, back at the home computer.  I still cannot mount the export from that one.

But I remembered that (on the laptop) I did 'touch /etc/nfs.conf.local' to get rid of that annoying log message.  So first thing I tried also at the home machine was stopping nfs-server, touching the file and starting nfs-server again.

And now I can mount it from the old TW installation! (I don't have another 42.3 at home but I'm quite sure this would work, too).  Nothing else changed.

Just for confirmation, I removed the file on the laptop and (re)started the nfs-server, now the old TW cannot mount the export anymore....

Why that 'missing' file blocks mounts from older versions but allows mounts from recent ones I leave for others to solve....
Comment 12 Neil Brown 2019-09-23 00:16:18 UTC
> So first thing I tried also at the home machine was stopping nfs-server, touching the file and starting nfs-server again.
> 
> And now I can mount it from the old TW installation!

Thanks for the report.  Despite the fact that this seems like strong evidence, I suspect that it is actually misleading.
The existence or non-existence of that file will affect the appearance of the warning, but it won't affect anything else.  I'm quite certain of that.
So there must be something else happening - something we cannot yet see.

I wonder if it is just the restarting of nfs-server that makes the difference.
Maybe there is some race, or something similar, that causes nfs-server to sometimes start correctly, and to sometimes fail.  Perhaps your testing of the failure case just happened to line up with the file-missing case.
Comment 13 Neil Brown 2019-09-23 01:06:12 UTC
> but it won't affect anything else.  I'm quite certain of that.

Ok, I was wrong.

The error message causes the syslog code to open a socket for sending messages to the logging daemon.  It gets file descriptor 3.  It leaves it open for later use.

mountd then calls "closeall(3)" to close any unwanted file descriptors.  This is bad, but not immediately problematic.

mountd then opens some files in /proc/net/rpc from which it gets notification about mount requests.  These are opened on file descriptors 3, 4 and 5.

Then mountd does:
	xlog(L_NOTICE, "Version " VERSION " starting");
The syslog code tries to send this message to the syslog daemon on file descriptor 3, which it left open.  But file descriptor 3 is no longer a socket, so this fails.
So the syslog code closes fd 3, opens a socket again, and sends the message.  Logging works now, but one of those files in /proc/net/rpc is no longer open, so mountd misses out on some notifications and mounting doesn't work.

I'll create a proper fix.
The workaround that you found, creating /etc/nfs.conf.local as an empty file, is valid.
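
The descriptor mix-up described above can be replayed outside mountd. The following is a minimal sketch in Python with stand-ins only: a Unix datagram socket plays the syslog socket, /dev/null plays one of the /proc/net/rpc files, and an fstat check stands in for the failed send. The function name `simulate_fd_reuse` is hypothetical and none of this is the nfs-utils or syslog code.

```python
import os
import socket
import stat


def simulate_fd_reuse():
    """Replay the sequence with stand-in descriptors and report what happened."""
    # 1. The "logger" opens a socket and remembers only the descriptor number.
    log_sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    log_fd = log_sock.detach()

    # 2. The "daemon" closes descriptors it considers unwanted, including
    #    the one the logger still believes it owns.
    os.close(log_fd)

    # 3. The daemon opens its notification channel.  open() hands out the
    #    lowest free descriptor, so the channel reuses the logger's number.
    channel_fd = os.open(os.devnull, os.O_RDONLY)
    fd_reused = channel_fd == log_fd

    # 4. The logger wants to send a message, finds that its descriptor is
    #    no longer a socket, closes it and opens a fresh socket.
    #    That close() really closes the daemon's channel.
    if not stat.S_ISSOCK(os.fstat(log_fd).st_mode):
        os.close(log_fd)
    new_sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

    # 5. The number the daemon kept for its channel now refers to the
    #    logger's new socket, so the channel is gone.
    channel_lost = stat.S_ISSOCK(os.fstat(channel_fd).st_mode)

    new_sock.close()
    return {"fd_reused": fd_reused, "channel_lost": channel_lost}


print(simulate_fd_reuse())
```

Both flags are expected to be True as long as nothing else opens or closes descriptors between the steps. The empty /etc/nfs.conf.local avoids the problem only because no message is logged before the descriptors are closed, so step 1 never happens early.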
Comment 14 Swamp Workflow Management 2019-09-23 02:50:13 UTC
This is an autogenerated message for OBS integration:
This bug (1151044) was mentioned in
https://build.opensuse.org/request/show/732555 Factory / nfs-utils
Comment 15 Neil Brown 2020-03-27 04:43:58 UTC
Fix has landed, so closing.