Bug 817651

Summary: Kernel 3.7 and newer breaks rpc.gssd -n and thus update of nfs-client package for openSUSE 12.3 needed
Product: [openSUSE] openSUSE 12.3 Reporter: Forgotten User fbKqKvv6Lf <forgotten_fbKqKvv6Lf>
Component: Other Assignee: Neil Brown <nfbrown>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: forgotten_fbKqKvv6Lf, meissner, nfbrown
Version: Final   
Target Milestone: ---   
Hardware: All   
OS: openSUSE 12.3   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: nfs rpms for testing
debug log following instructions of previous comment
New rpms to test
test results from testing rpms in attachment 541877
Test results following installation of kernel in comment 16
Test results following installation of kernel in comment 16

Description Forgotten User fbKqKvv6Lf 2013-04-29 13:36:15 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0

In my organization I need to use kerberized NFSv4 mounts without machine credentials. This works by running rpc.gssd with the -n option, which makes rpc.gssd use the credentials cache in /tmp/krb5cc_0 when doing the mount instead of machine credentials (which I don't have and cannot get). This functionality is broken in kernel 3.7 and newer, whereas 3.6.11 and earlier work as expected. The bug entered the distribution with the move from openSUSE 12.2 (kernel 3.4) to openSUSE 12.3 (kernel 3.7).



Reproducible: Always

Steps to Reproduce:
Basic steps to reproduce the problem:
# kinit user (this creates /tmp/krb5cc_0)
# rpc.gssd -f -n -vvvv
# mount -t nfs4 -o sec=krb5 server.example.org:/home /mnt

Detailed steps to reproduce are documented here:
http://forums.opensuse.org/english/get-technical-help-here/network-internet/476695-how-pass-n-flag-rpc-gssd.html
Actual Results:  
mount -vvv -t nfs -o sec=krb5,proto=tcp,vers=4 server.example.org:/home /mnt
mount.nfs: timeout set for Mon Apr 29 14:26:38 2013
mount.nfs: trying text-based options 'sec=krb5,proto=tcp,vers=4,addr=w.x.y.z,clientaddr=a.b.c.d'
mount.nfs: mount(2): Permission denied
mount.nfs: access denied by server while mounting server.example.org:/home

and in the background:

rpc.gssd -fvvvvvvvvv -n
beginning poll
handling gssd upcall (/var/lib/nfs/rpc_pipefs/nfs/clntd)
handle_gssd_upcall: 'mech=krb5 uid=0 service=* enctypes=18,17,16,23,3,1,2 '
handling krb5 upcall (/var/lib/nfs/rpc_pipefs/nfs/clntd)
process_krb5_upcall: service is '*'
Full hostname for 'w.x.y.z' is 'w.x.y.z'
Name or service not known while getting full hostname for 'a.b.c.d'
ERROR: gssd_refresh_krb5_machine_credential: no usable keytab entry found in keytab /etc/krb5.keytab for connection with host w.x.y.z
ERROR: No credentials found for connection to server w.x.y.z
doing error downcall
Closing 'gssd' pipe for /var/lib/nfs/rpc_pipefs/nfs/clntd
destroying client /var/lib/nfs/rpc_pipefs/nfs/clntd


Expected Results:  
I expect the NFS4 mount to succeed when rpc.gssd is started with the -n flag and valid kerberos credentials are available.

The mount command above works when using openSUSE 12.2 or earlier and fails on openSUSE 12.3. More details on the problem can be found on the kernel mailing list, here:
http://permalink.gmane.org/gmane.linux.nfs/54851
http://www.spinics.net/lists/linux-nfs/msg35306.html

It seems that it was decided the bug was in nfs-utils (nfs-client package in openSUSE) and a fix was prepared:
http://permalink.gmane.org/gmane.linux.nfs/55586

I would like to request for this patch to be incorporated in openSUSE 12.3.
Comment 1 Neil Brown 2013-05-07 01:07:07 UTC
The patches that you identified do not seem to contain any fix for this issue.  In fact it looks like it hasn't been fixed at all.
That link contains the text:

>   o  Dropped the patch adding the "-c" option to rpc.gssd.
>      This issue will be revisited soon.

I wonder if the "-c" option was supposed to fix it, but has been dropped for now.

Does adding 
   -k  /tmp/krb5cc_0
to the gssd command line fix the problem?  There seems to be a suggestion that it should, but I'm not very familiar with this stuff.
Comment 2 Forgotten User fbKqKvv6Lf 2013-05-07 07:47:23 UTC
No, this does not fix the problem. The output of rpc.gssd with the extra "-k  /tmp/krb5cc_0" argument is the same as without it, with the exception of one line:

ERROR: gssd_refresh_krb5_machine_credential: no usable keytab entry found in keytab /tmp/krb5cc_0 for connection with host w.x.y.z


For completeness' sake, I list the ticket cache below.
# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: user@EXAMPLE.LOCAL

Valid starting     Expires            Service principal
05/07/13 09:28:04  05/07/13 19:27:26  krbtgt/EXAMPLE.LOCAL@EXAMPLE.LOCAL
        renew until 05/08/13 09:28:04

Regarding your remark, Neil: I think there is more information in the thread that the post I linked to came from. I only linked to the final post, from which some intermediate information seems to be missing. The full thread is available here:
http://comments.gmane.org/gmane.linux.nfs/55377 
It seems that the following function calls were updated:
gssd_find_existing_krb5_ccache
gssd_setup_krb5_user_gss_ccache

Hopefully that does fix this issue.
Comment 3 Neil Brown 2013-05-08 06:33:10 UTC
The changes to gssd_find_existing_krb5_ccache and gssd_setup_krb5_user_gss_ccache
are just cosmetic.  They don't change any functionality.

I think that using "-k /tmp/krb5cc_0" didn't work because the default principal is "user@..." and gssd is expecting one of "root", "nfs", "host".

Can you use one of those names for the credential instead?

In case you cannot, I'll attach nfs rpms which just add "user" to the list of names to try.  This is not a real fix but would confirm that my understanding of the problem is correct.
Comment 4 Neil Brown 2013-05-08 06:33:41 UTC
Created attachment 538304 [details]
nfs rpms for testing
Comment 5 Neil Brown 2013-05-27 23:12:52 UTC
Hi,
 did you have a chance to test the RPM from comment #4?
Comment 6 Forgotten User fbKqKvv6Lf 2013-05-28 16:32:06 UTC
(In reply to comment #5)
> Hi,
>  did you have a chance to test the RPM from comment #4?

I can confirm that the RPM you provided does not resolve the issue. Unfortunately I can also not tell you why not. Same error again (less obfuscated this time):

handling gssd upcall (/var/lib/nfs/rpc_pipefs/nfs/clntf)
handle_gssd_upcall: 'mech=krb5 uid=0 service=* enctypes=18,17,16,23,3,1,2 '
handling krb5 upcall (/var/lib/nfs/rpc_pipefs/nfs/clntf)
process_krb5_upcall: service is '*'
Full hostname for 'pstor002.domain.local' is 'pstor002.domain.local'
Name or service not known while getting full hostname for 'hppc134.DOMAIN.LOCAL'
ERROR: gssd_refresh_krb5_machine_credential: no usable keytab entry found in keytab /etc/krb5.keytab for connection with host pstor002.domain.local
ERROR: No credentials found for connection to server pstor002.domain.local
doing error downcall
destroying client /var/lib/nfs/rpc_pipefs/nfs/clntf


It also fails with the -k flag using either root's or the user's keytab:
ERROR: gssd_refresh_krb5_machine_credential: no usable keytab entry found in keytab /tmp/krb5cc_1000 for connection with host pstor002.domain.local
ERROR: gssd_refresh_krb5_machine_credential: no usable keytab entry found in keytab /tmp/krb5cc_0 for connection with host pstor002.domain.local


I have a bit more time this week to do some debugging, so I would be happy to try out any further suggestions.
Comment 7 Neil Brown 2013-05-29 04:57:18 UTC
Thanks for testing.

The problem here seems to be that a hostname lookup of "hppc134.DOMAIN.LOCAL" is failing.
This is the hostname for the client machine as reported by "uname -n".
So presumably
  ping `uname -n`

doesn't work?  

You should check that "hppc134.DOMAIN.LOCAL" appears in /etc/hosts.  On my 12.3 machine it is there with IP address "127.0.0.2".

Please fix that up so the above 'ping' command works, then try NFS again.
If it works with the test rpm, try reverting to the distro rpm and try again.
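
The lookup can also be checked without ping. The following is only a sketch: it assumes `getent` (part of glibc) is installed, and the helper name `resolves` is made up for illustration.

```sh
#!/bin/sh
# Check whether a hostname resolves through the system resolver
# (/etc/hosts, DNS, ...). Returns 0 if it resolves, non-zero otherwise.
# "resolves" is a hypothetical helper name, not part of nfs-utils.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

name=$(uname -n)
if resolves "$name"; then
    echo "$name resolves"
else
    echo "$name does not resolve; check /etc/hosts"
fi
```

If the second message is printed, gssd will most likely log the same "Name or service not known" error seen above.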

Thanks.
Comment 8 Forgotten User fbKqKvv6Lf 2013-05-29 10:14:31 UTC
Created attachment 541721 [details]
debug log following instructions of previous comment

I have resolved the error (or warning?) message "Name or service not known while getting full hostname for" by assigning the hostname to the loopback IP from YaST. This indeed adds the entry in /etc/hosts:
127.0.0.2       hppc134.DOMAIN.LOCAL hppc134

I then tested with and without the patches in attachment 538304 [details], and with and without explicitly specifying the keytab. (I should stress that with openSUSE kernel versions <3.7 it "just works".) The results are in the attached txt file.

I can see the effect of the patch as an additional entry besides root,nfs,host,etc. but it still does not work.
Comment 9 Neil Brown 2013-05-30 00:48:26 UTC
Thanks.  That is very helpful.  I think I'm now close to understanding what needs to be done to fix the problem.  Hopefully I'll get back to you by early next week.
Comment 10 Neil Brown 2013-05-30 06:01:50 UTC
Created attachment 541877 [details]
New rpms to test

... or maybe late this week :-)

Please install and test these.
There is a new flag "-N" for "gssd" to address your particular case.  See "man gssd".
I've also added a GSSD_OPTIONS setting to /etc/sysconfig/nfs to make it easier to set the -n and -N flags (you will need both).
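
As a rough sketch, the setting could look like the fragment below. The variable name and file come from the test package; the value "-n -N" is the combination needed for this particular case, and the comments paraphrase the man page rather than quote it.

```sh
# /etc/sysconfig/nfs (fragment)
# Extra options passed to rpc.gssd:
#   -n  do not use machine credentials for accesses by UID 0
#   -N  use UID 0's credentials in place of machine credentials
GSSD_OPTIONS="-n -N"
```

The NFS client services need to be restarted after changing this for gssd to pick up the new flags.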

You don't need the extra setting in /etc/hosts any more.

Thanks.
Comment 11 Forgotten User fbKqKvv6Lf 2013-05-30 16:32:40 UTC
Created attachment 541970 [details]
test results from testing rpms in attachment 541877 [details]

Yesssss. It works! It finally works!

Admittedly I had a few startup issues, but it seems to work now. At first it did not work, but after a reboot, editing the GSSD_OPTIONS setting to say "-n -N", and manually starting the NFS daemon, I could finally issue a mount. Then a few seconds later my whole desktop UI (KDE4 plasma-desktop) froze (I could still alt-tab to a terminal) until I acquired a fresh Kerberos ticket for my uid (rather than just root). I know this freezing is the intended behavior in the (newer?) Linux kernels in combination with kerberized NFSv4 mounts once a process encounters an expired ticket (so as not to crash the process), but I am not sure what it was in plasma-desktop that was causing the freeze. The second time around the freeze did not occur. I still get a fair number of warning messages, which I have attached as debug output, but overall things are working again!! Thanks.

And thank you also for being so kind as to provide me with my much needed GSSD_OPTIONS in /etc/sysconfig/nfs, I really needed that, even before the whole kernel >=3.7 problems as I documented here:
http://forums.opensuse.org/english/get-technical-help-here/network-internet/476695-how-pass-n-flag-rpc-gssd.html
In fact it is strange it was missing in the first place.

Nevertheless I consider these 2 bugs fixed in one go! Thanks!!!
Hopefully these patches will also be adopted by other Linux distributions with kernels >=3.7; the latest version of Ubuntu had the same problem for instance. Debian as well.

Now the next step is to get automounts working based on kerberized NFSv4 + LDAP and SSSD. :)
Comment 12 Neil Brown 2013-06-02 19:51:34 UTC
I've submitted an update request for 12.3, and have posted the "-N" patch upstream in the hope it will be accepted.
So closing this bug now.  Thanks for your patience.
Comment 13 Bernhard Wiedemann 2013-06-02 20:00:35 UTC
This is an autogenerated message for OBS integration:
This bug (817651) was mentioned in
https://build.opensuse.org/request/show/177207 Maintenance /
Comment 14 Neil Brown 2013-06-04 20:14:35 UTC
Hi again,
I've been discussing this issue upstream and it seems they would like to fix it in the kernel instead of in gssd (though I'll be leaving the gssd -N option in place).

Would you be willing to test an alternate kernel if I provided one?
If so, which flavour do you use?  Desktop? default?
The output of "uname -r" is probably the best way to tell.
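
If it helps, the flavour can be pulled out of the release string as sketched below. This assumes the usual "<version>-<release>-<flavour>" layout of openSUSE kernel release strings; the function name `kernel_flavour` and the sample string in the comment are only illustrative.

```sh
#!/bin/sh
# Extract the flavour suffix from a kernel release string, assuming the
# "<version>-<release>-<flavour>" layout, e.g. "3.7.10-1.16-desktop".
# If the string contains no hyphen, it is printed unchanged.
kernel_flavour() {
    printf '%s\n' "${1##*-}"
}

kernel_flavour "$(uname -r)"
```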

Thanks.
Comment 15 Forgotten User fbKqKvv6Lf 2013-06-05 04:16:29 UTC
Yes, I've been following the discussion on http://news.gmane.org/gmane.linux.nfs
You are correct to assume that our administrator does not hand out keytabs. I have given the link to this bug report as well as the NFS mailing list to our network architect; perhaps he has time to comment on the matter. I believe the problem is that when keytabs are handed out, there is a risk of invalidating all other users' credentials somehow, although I do not pretend to understand exactly how that works.

In the meantime I'm happy to test any kernel fixes. I use kernel-desktop.
Comment 16 Neil Brown 2013-06-05 15:15:09 UTC
I tried to upload an rpm, but bugzilla doesn't like files that big..

So I got the build service to do it for me :-)
Please find rpms at 
https://build.opensuse.org/package/binaries?package=kernel-desktop&project=home%3Aneilbrown%3Akernel&repository=standard
Comment 17 Forgotten User fbKqKvv6Lf 2013-06-07 06:46:02 UTC
Created attachment 543267 [details]
Test results following installation of kernel in comment 16

Yes that kernel works, although I still need the "-n" flag, which means that the nfs client utilities still need a patch which adds the possibility of specifying GSSD_OPTIONS in /etc/sysconfig/nfs

I ran a few tests (8 in total) with different combinations of issuing the mount statement and rpc.gssd flags while having (or not having) Kerberos credentials for user and/or root. One of the things that puzzles me:
- given a valid user-mountable entry in /etc/fstab
- why can't I mount that entry as user without a valid kerberos ticket for root? I can browse as a user even when the root credentials are destroyed (and thus root can no longer browse), then why do I need it for mounting as user? For some reason it only considers '/tmp/krb5cc_0' when I mount as user, when it should also be looking at '/tmp/krb5cc_1000' since it was user uid=1000 that issued the mount command. (Test 8)
- I can mount as user when root has a ticket (and it will say '/tmp/krb5cc_1000' is valid). (Test 7)
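
To see which caches exist at the time of the mount, something like the following sketch can be used. It assumes the default file-based cache naming /tmp/krb5cc_<uid> that appears in this report; the helper name `ccache_for_uid` is made up.

```sh
#!/bin/sh
# Default file-based credential cache path for a UID, following the
# /tmp/krb5cc_<uid> naming seen in this report (an assumption; other
# cache types and locations are possible).
ccache_for_uid() {
    printf '/tmp/krb5cc_%s\n' "$1"
}

# Report whether the caches for root (0) and the test user (1000) exist.
for uid in 0 1000; do
    cache=$(ccache_for_uid "$uid")
    if [ -r "$cache" ]; then
        echo "$cache: present"
    else
        echo "$cache: missing or unreadable"
    fi
done
```

A present file does not mean the ticket is still valid; `klist -s -c <cache>` should set its exit status accordingly if a stricter check is wanted.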
Comment 18 Forgotten User fbKqKvv6Lf 2013-06-07 11:57:45 UTC
Created attachment 543303 [details]
Test results following installation of kernel in comment 16

Fixed the extension of the previous attachment.
Comment 19 Neil Brown 2013-06-11 22:06:22 UTC
Thanks for all the tests.  Your results agree with my understanding of how it should work.

When you run mount as a non-privileged user, it relies on the fact that /sbin/mount.nfs is "setuid" to root, so the actual mount system call runs as root and consequently uses root's privileges to communicate with the NFS server.

To allow the mount to succeed without root having any credentials, we would need the mount system call to authenticate to the server using credentials based on the 'real-uid' rather than the 'effective-uid'....  I wonder if that is a good idea.
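
The two IDs can be inspected from a shell. This is only an illustration of real versus effective UID, not of what mount.nfs does internally.

```sh
#!/bin/sh
# Print the real and effective UID of the current process.
# In an ordinary shell the two are equal. In a setuid-root helper such
# as mount.nfs started by UID 1000 they would differ: real=1000, effective=0.
real_uid=$(id -ru)
effective_uid=$(id -u)
echo "real=$real_uid effective=$effective_uid"
```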
Comment 20 Neil Brown 2013-06-12 00:17:03 UTC
BTW I've committed the kernel patch to our 12.3 kernel tree so the next kernel update will include it.
Comment 21 Forgotten User fbKqKvv6Lf 2013-06-12 00:28:25 UTC
Great! Thank you for this and all the effort you put in.

Regarding your Comment #19: I think it would be a great benefit to the user if it were possible to mount without the root user needing to have Kerberos credentials. I think it is inconsistent that you may browse a mounted filesystem without root credentials but may not mount it, even though it is set as mountable by user in fstab. It just doesn't make sense to me. Do you think you can discuss this with the other NFS maintainers?

Also consider the case where automount has been set up. When a user logs in (after a fresh reboot and without a valid Kerberos ticket for root), his or her home would need to be automatically mounted. Would this work then? The user authenticates with his or her login credentials and would acquire a valid ticket, but can the automount process then use this to do the mount, or would it fail on account of root not having valid credentials? Or would the ticket somehow (by means of the PAM authentication process) be 'issued' to root as well? This is confusing to me and perhaps you can give it some thought. Meanwhile I'm going to try and test this out next week (set up automount).
Comment 22 Swamp Workflow Management 2013-06-14 09:09:22 UTC
openSUSE-SU-2013:1016-1: An update that solves one vulnerability and has two fixes is now available.

Category: security (moderate)
Bug References: 809226,813464,817651
CVE References: CVE-2013-1923
Sources used:
openSUSE 12.3 (src):    nfs-utils-1.2.7-2.6.1
Comment 23 Forgotten User fbKqKvv6Lf 2013-12-09 10:03:32 UTC
It seems that the "-N" flag has disappeared from the nfs-utils supplied with 13.1, and the "mount.nfs: access denied by server" messages are back because of it.

If I enter 
man 8 rpc.gssd
on openSUSE 12.3 with (I think) nfs-utils-1.2.7-2.18.1
I see the "-N" flag documented: 

     -N     With NFSv4, some requests to the server need to authenticated as coming from "the machine" rather than from any particular user.  These requests will normally be authenticated using the "machine credentials" even if -n is set.  Adding -N causes these requests to use the credentials of UID 0 in place of the machine credentials.

But on openSUSE 13.1 with nfs-utils-1.2.8-4.5.1
man 8 rpc.gssd
mentions nothing about the "-N" flag, and it seems I do need it in order to "abuse" my root's kerberos credentials as machine credentials.
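
For anyone who wants to check their own installation, here is a small sketch. The function name `documents_N` is made up, and it assumes `man` and `col` are installed and that the man page lists each option at the start of a line.

```sh
#!/bin/sh
# Succeeds if the text on stdin contains an option entry that starts
# with -N (case-sensitive, so -n does not match).
documents_N() {
    grep -Eq '^[[:space:]]*-N([[:space:]]|$)'
}

# col -b strips any overstrike formatting from the man output.
if command -v man >/dev/null 2>&1 &&
   man 8 rpc.gssd 2>/dev/null | col -b | documents_N; then
    echo "-N is documented"
else
    echo "-N is not documented (or the man page is unavailable)"
fi
```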

I guess this warrants opening a new bug report against openSUSE 13.1, but I thought I'd mention it here since this bug report shows up in search engine results.
Comment 24 Swamp Workflow Management 2013-12-30 20:11:53 UTC
openSUSE-SU-2013:1971-1: An update that solves 34 vulnerabilities and has 19 fixes is now available.

Category: security (moderate)
Bug References: 799516,801341,802347,804198,807153,807188,807471,808827,809906,810144,810473,811882,812116,813733,813889,814211,814336,814510,815256,815320,816668,816708,817651,818053,818561,821612,821735,822575,822579,823267,823342,823517,823633,823797,824171,824295,826102,826350,826374,827749,827750,828119,828191,828714,829539,831058,831956,832615,833321,833585,834647,837258,838346
CVE References: CVE-2013-0914,CVE-2013-1059,CVE-2013-1819,CVE-2013-1929,CVE-2013-1979,CVE-2013-2141,CVE-2013-2148,CVE-2013-2164,CVE-2013-2206,CVE-2013-2232,CVE-2013-2234,CVE-2013-2237,CVE-2013-2546,CVE-2013-2547,CVE-2013-2548,CVE-2013-2634,CVE-2013-2635,CVE-2013-2851,CVE-2013-2852,CVE-2013-3222,CVE-2013-3223,CVE-2013-3224,CVE-2013-3226,CVE-2013-3227,CVE-2013-3228,CVE-2013-3229,CVE-2013-3230,CVE-2013-3231,CVE-2013-3232,CVE-2013-3233,CVE-2013-3234,CVE-2013-3235,CVE-2013-3301,CVE-2013-4162
Sources used:
openSUSE 12.3 (src):    kernel-docs-3.7.10-1.24.1, kernel-source-3.7.10-1.24.1, kernel-syms-3.7.10-1.24.1