|
Bugzilla – Full Text Bug Listing |
| Summary: | NFS exports from latest TW cannot be mounted from older systems | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Peter Sütterlin <P.Suetterlin> |
| Component: | Network | Assignee: | Neil Brown <nfbrown> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | hpj, jslaby, paka, sebastian.kuhne |
| Version: | Current | Flags: | jslaby: needinfo?(paka) |
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| See Also: | https://bugzilla.opensuse.org/show_bug.cgi?id=1150807 http://bugzilla.opensuse.org/show_bug.cgi?id=1160035 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Peter Sütterlin
2019-09-17 10:47:41 UTC
I experienced the same. Was later able to mount after updating nfs-client on faulty client. Sorry, do not recall the old version.

(In reply to patrick shanahan from comment #1)
> Sorry, do not recall the old version.

Do:
# grep nfs-client /var/log/zypp/history

here is most recent:
# 2019-07-16 19:20:30 nfs-client-2.1.1-8.5.x86_64.rpm installed ok
2019-07-16 19:20:30|install|nfs-client|2.1.1-8.5|x86_64||Tumbleweed.OSS|3e2cb8cbbe65f13d11c97930464b4578645bec08a3702ccf04554f92d98975f0|
2019-08-19 10:58:33|install|yast2-nfs-client|4.2.0-1.1|noarch||Tumbleweed.OSS|db0c73b099f176a50a2167e75109b2809b8085fbb02ef942ffc4468984ae5c22|
# 2019-09-09 08:48:42 nfs-client-2.1.1-8.6.x86_64.rpm installed ok
2019-09-09 08:48:42|install|nfs-client|2.1.1-8.6|x86_64||Tumbleweed.OSS|5bb2119ae786dd5c0a862a393542d0da6cbcd2ec24414005c2f00d3b96b387ce|
2019-09-09 08:52:52|install|yast2-nfs-client|4.2.2-1.1|noarch||Tumbleweed.OSS|a066f464bb706926cea89e0d7d9dce27c8814093602e066ac7db120651b7eca4|
# 2019-09-13 08:06:16 nfs-client-2.3.3-9.1.x86_64.rpm installed ok
2019-09-13 08:06:16|install|nfs-client|2.3.3-9.1|x86_64||Tumbleweed.OSS|5cadb0ae9ff2c9996c08d6a9409a5a7a8ef7a791a9e1a1744202161e6ed238fe|

- Please report the output of "mount -v ...mountpoint", i.e. add '-v' to the mount command.
- Are you expecting NFSv3 or NFSv4?
- Does /etc/exports on the server contain "fsid=0"? Maybe provide the whole /etc/exports if that's ok.

I have no idea what might cause this, but I'm hoping something in the above might give me a clue.

speedy:~ # showmount -e 192.168.1.36
Export list for 192.168.1.36:
/home/Extern 192.168.1.0/24
speedy:~ # mount -v -v -v 192.168.1.36:/home/Extern /mnt
mount.nfs: timeout set for Wed Sep 18 08:12:38 2019
mount.nfs: trying text-based options 'vers=4,addr=192.168.1.36,clientaddr=192.168.1.33'
mount.nfs: mount(2): Permission denied
mount.nfs: access denied by server while mounting 192.168.1.36:/home/Extern

I've tried setting 'debug=all' for nfsd in /etc/nfs.conf.
I do see some startup messages, but *nothing* when I try to connect/mount. However, now the mount call (on the Leap 42.3 machine) hangs instead of returning with an error. No idea what to make of that.

What is "fsid=0"? I've never ever used that in >20 years..... So no, it's not in exports, that just has
/home/Extern 192.168.1.0/24(rw,no_root_squash,async)
Similar for the other case, there it's /export/Video. In both cases top is not /, but a separate partition (/home and /extern; both are XFS partitions).
I have tried with fsid=0, but that doesn't change things here.

08:12 crash2: ~ # mount -v /mnt/nfs/ExT4/
mount.nfs: trying text-based options 'bg,retry=500,vers=4.2,addr=192.168.1.3,clientaddr=192.168.1.8'
08:12 crash2: ~ # mount -v /mnt/nfs/ExT4/
mount.nfs: trying text-based options 'bg,retry=500,vers=4.2,addr=192.168.1.3,clientaddr=192.168.1.8'

is set to highest but expect NFSv4

/etc/exports
/mnt/ExT4 192.168.1.0/24(rw,root_squash,sync,no_subtree_check)

the mount works now since updating nfs-client, so adding "fsid=0" will not prove anything...
fwiw: subject drive/export is usb3 ntfs formatted external drive

Some additional info: I just downgraded my (TW 20190909) laptop to
nfs-client-2.1.1-8.6.x86_64
nfs-kernel-server-2.1.1-8.6.x86_64
nfsidmap-0.27-2.5.x86_64
from the history tree of snapshot 20190907 (which in turn removed libnfsidmap1).
With unchanged export settings etc. the Leap 42.3 machine can now mount the share without issues.

There is some issue with firewalls and NFS in the latest update. That would not explain "Permission denied", but might explain the mount call hanging. So if you try again, please disable any firewall and see if that makes a difference.

Error messages concerning the failure would come either from mountd or from the kernel.
It doesn't seem possible to enable mountd debugging with /etc/nfs.conf. You would need to edit /usr/lib/systemd/system/nfs-mountd.service and add "--debug all" to the ExecStart line.
Then
systemctl daemon-reload
systemctl restart nfs-mountd

For kernel messages,
rpcdebug -m nfsd -s all

Then try the mount. Kernel messages will appear in the output of 'dmesg'; mountd messages should appear in
journalctl -u nfs-mountd.service --since -1day

You can turn off debugging by reverting the change to nfs-mountd.service and restarting the service. For the kernel
rpcdebug -m nfsd -c all

If you see "fsid=0" as an export option on one directory, it becomes the directory you get when you mount "/" over NFSv4. This is an internal implementation detail that is now best forgotten, but some installers still set it and some installations might have it left over from the past. It could give confusing results - possibly even "Permission denied" errors.

Firewall is already disabled, that was my first guess.
So I went ahead and added the option in the service file. But now the 42.3 machine could mount the share without problems. I had however updated to 20190917 in the meantime.
I'll try again at home, that one is still at 0909. But I won't be there before 1-2 days from now....

Hmm, I just rebooted the laptop to a snapshot of 0909, before the update, when I had issues. This one now also can be mounted from the 42.3 machine.
Would it be correct to assume that this suggests it is some setting in /var that had been corrected by the updated system, and now also works for the snapshot (as var is not part of the snapshot)?
I'll see if I can compare the laptop with the machine at home...

So, back at the home computer. I still cannot mount the export from that one.
But I remembered that (on the laptop) I did 'touch /etc/nfs.conf.local' to get rid of that annoying log message.
So first thing I tried also at the home machine was stopping nfs-server, touching the file and starting nfs-server again.
And now I can mount it from the old TW installation! (I don't have another 42.3 at home but I'm quite sure this would work, too).
Nothing else changed.
Just for confirmation, I removed the file on the laptop and (re)started the nfs-server, now the old TW cannot mount the export anymore....
Why that 'missing' file blocks mounts from older versions but allows mounts from recent ones I leave for others to solve....

> So first thing I tried also at the home machine was stopping nfs-server, touching the file and starting nfs-server again.
>
> And now I can mount it from the old TW installation!
Thanks for the report. Despite the fact that this seems like strong evidence, I suspect that it is actually misleading.
The existence or non-existence of that file will affect the appearance of the warning, but it won't affect anything else. I'm quite certain of that.
So there must be something else happening - something we cannot yet see.
I wonder if it is just the restarting of nfs-server that makes the difference.
Maybe there is some race, or something similar, that causes nfs-server to sometimes start correctly, and to sometimes fail. Your testing of the failure case may just have happened to line up with the file-missing case.
> but it won't affect anything else. I'm quite certain of that.
Ok, I was wrong.
The error message causes the syslog code to open a socket for sending messages to the logging daemon. It gets file descriptor 3. It leaves it open for later use.
mountd then calls "closeall(3)" to close any unwanted file descriptors. This also closes the syslog socket on file descriptor 3 without the syslog code knowing about it. This is bad, but not immediately problematic.
mountd then opens some files in /proc/net/rpc from which it gets notification about mount requests. These are opened on file descriptors 3, 4, 5.
Then mountd does:
xlog(L_NOTICE, "Version " VERSION " starting");
The syslog code tries to send this message to the syslog daemon on file descriptor 3 - which it believes is still open. But file descriptor 3 is not a socket any longer so this fails.
So the syslog code closes fd 3 and opens a socket again, and sends the message. Logging works now, but one of those files in /proc/net/rpc is no longer open so mountd misses out on some notifications, so mounting doesn't work.
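To make the sequence above easier to follow, here is a minimal sketch in Python (not the nfs-utils code) that replays the same descriptor reuse on a POSIX system. The function names `is_socket` and `simulate_fd_reuse` are made up for illustration, and `/dev/null` (via `os.devnull`) stands in for the channel files under /proc/net/rpc. It relies on the kernel handing out the lowest free descriptor number, and assumes a single-threaded process.

```python
import os
import socket
import stat


def is_socket(fd):
    """Return True if the descriptor currently refers to a socket."""
    return stat.S_ISSOCK(os.fstat(fd).st_mode)


def simulate_fd_reuse():
    """Replay the logger/daemon descriptor clash described above."""
    # 1. The logging code opens a socket and remembers only the raw number.
    log_fd = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM).detach()

    # 2. The daemon closes "unwanted" descriptors, including the logger's.
    os.close(log_fd)

    # 3. The daemon opens its channel file (os.devnull is a stand-in here).
    #    The lowest free number is the one the logger still remembers.
    channel_fd = os.open(os.devnull, os.O_RDONLY)
    reused = channel_fd == log_fd
    if not reused:
        # Some other descriptor was free; nothing to demonstrate.
        os.close(channel_fd)
        return {"reused": False, "channel_lost": False}

    # 4. The logger wants to send a message and finds its descriptor is no
    #    longer a socket, so it closes it and opens a fresh socket.
    channel_lost = False
    if not is_socket(log_fd):
        os.close(log_fd)  # this actually closes the daemon's channel file
        new_sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        # The daemon still thinks channel_fd is its channel file.
        try:
            channel_lost = is_socket(channel_fd)
        except OSError:
            channel_lost = True  # descriptor is not open at all any more
        new_sock.close()
    return {"reused": reused, "channel_lost": channel_lost}
```

Calling `simulate_fd_reuse()` should report both `reused` and `channel_lost` as true: logging recovers, while the descriptor the daemon holds no longer refers to its channel file.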
I'll create a proper fix.
The workaround that you found of creating /etc/nfs.conf.local as an empty file is a valid workaround.
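For anyone who wants to script the workaround, a rough sketch follows. It assumes the service is called nfs-server, as in the comments above, and that it is run as root; the helper name `apply_workaround` and its parameters are made up for illustration.

```python
import subprocess
from pathlib import Path


def apply_workaround(conf_local="/etc/nfs.conf.local", restart=True):
    """Create the empty file if it is missing, then restart nfs-server.

    Returns True if the file had to be created.
    """
    path = Path(conf_local)
    created = not path.exists()
    path.touch(exist_ok=True)  # an existing file keeps its contents
    if restart:
        # Assumption: the unit name is nfs-server, as used in this report.
        subprocess.run(["systemctl", "restart", "nfs-server"], check=True)
    return created
```

Passing `restart=False` only creates the file, which is handy for trying the helper against a scratch path first.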
This is an autogenerated message for OBS integration:
This bug (1151044) was mentioned in
https://build.opensuse.org/request/show/732555 Factory / nfs-utils

Fix has landed, so closing. |