Bug 620066 - Cleaning up of kerberos credentials by SSH with kerberized NFS leads to excessive log spam by rpc.gssd
Status: RESOLVED FIXED
Alias: None
Product: openSUSE 11.2
Classification: openSUSE
Component: Basesystem
Version: Final
Hardware: x86-64 openSUSE 11.2
Importance: P5 - None : Critical with 10 votes
Target Milestone: ---
Assignee: Neil Brown
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-06 12:05 UTC by Mika Fischer
Modified: 2011-07-13 15:05 UTC
CC List: 4 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Upstream patch (5.01 KB, patch)
2010-07-30 06:02 UTC, Forgotten User b5BnQSUi71

Description Mika Fischer 2010-07-06 12:05:34 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.6) Gecko/20100628 Ubuntu/10.04 (lucid) Firefox/3.6.6

SSH by default deletes Kerberos credentials when a user logs out.

If the user left a program running (for instance via screen), and if Kerberos credentials are needed to access the home directories (kerberized NFS), rpc.gssd will fail to obtain Kerberos credentials.

The problem is that it then logs an excessive number of warnings to this effect in the syslog (about 1100 warnings per second), which quickly fills up the hard drive.

Reproducible: Always

Steps to Reproduce:
1. Log in (via SSH) to host that mounts home directory via kerberized NFS
2. Start screen with some process accessing the home dir inside
3. Detach screen
4. Close SSH session
5. Wait for rpc.gssd credentials cache to expire
Actual Results:  
When the process still running on the target host tries to access the home directory, rpc.gssd will try and fail to obtain Kerberos credentials for the user. It will then spam the syslog with the following warning:
----
<date> <hostname> rpc.gssd[<pid>]: WARNING: Failed to create krb5 context for user with uid <uid> for server <other hostname>
----
This is repeated ad infinitum until the offending process is killed manually. The logfile otherwise quickly fills up the partition.
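
To gauge how badly a host is affected, the warnings can be counted per user and server. The following is only a minimal sketch: the regular expression is built from the warning text quoted above, and the function name count_gssd_warnings is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the warning text quoted in this report.
# The uid and the server name are captured for grouping.
WARNING_RE = re.compile(
    r"rpc\.gssd\[\d+\]: WARNING: Failed to create krb5 context "
    r"for user with uid (\d+) for server (\S+)"
)


def count_gssd_warnings(lines):
    """Return a Counter mapping (uid, server) to the number of matching warnings."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

Passing an open log file (for example /var/log/messages, if that is where syslog writes on the host) to count_gssd_warnings and printing the result of most_common() shows which uid is responsible for the flood.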

Expected Results:  
Maybe one warning or no warning at all should be emitted (the latter is the case for *expired* credentials). See also https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/293705 for the case of expired credentials.
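
The expected behaviour (at most one warning per user and server within some interval) could look roughly like the sketch below. This is an illustration of the idea only, not the upstream patch and not rpc.gssd code; the class name RepeatSuppressor and the 60 second interval are assumptions.

```python
import time


class RepeatSuppressor:
    """Let a given message key through at most once per `interval` seconds."""

    def __init__(self, interval=60.0, clock=time.monotonic):
        self.interval = interval      # assumed value, chosen for illustration
        self.clock = clock            # injectable so the behaviour can be tested
        self._last_emitted = {}       # key -> time the message was last logged

    def should_log(self, key):
        """Return True if a message for `key` should be logged now."""
        now = self.clock()
        last = self._last_emitted.get(key)
        if last is not None and now - last < self.interval:
            return False              # same key seen recently: suppress
        self._last_emitted[key] = now
        return True
```

A caller would use a key such as (uid, server) and only write the warning when should_log(key) returns True, so a process retrying 1100 times per second produces one log line per interval instead.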

A fix for this should probably also be coordinated with upstream.
Comment 1 Forgotten User b5BnQSUi71 2010-07-29 06:45:48 UTC
Thanks for the bug report. This issue has already been fixed upstream.

I have built an updated package including the fix and made it available here:
   http://www.suse.com/~sjayaraman/test-pkgs/nfs-utils/
  (syncing could take a few hours)

You would need to update your nfs-kernel-server package from the above location. Please report back if this fixes the issue for you.
Comment 2 Forgotten User b5BnQSUi71 2010-07-30 06:02:42 UTC
Created attachment 379436
Upstream patch

Attaching the upstream patch that fixes the problem for completeness.
Comment 4 Mika Fischer 2010-08-03 15:45:37 UTC
After testing the packages I can confirm that they fix the problem for us.

Do you recommend that we deploy them on all our 11.2 hosts or should we wait for an official update?

Also, this probably should be fixed in 11.3. However there we have a similar but slightly different behaviour. The error message does not come from rpc.gssd but from the kernel itself. It is however caused by the same circumstances and also spams the log so quickly that there's a good chance of filling up the /var partition.

The error message in this case is (on the NFS client):
kernel: [1301515.320931] Error: state manager failed on NFSv4 server <NFS server hostname> with error 13

Error 13 probably means NFSERR_ACCES, which in turn probably means that the process does not have permission to access the file because the credentials cache was removed when the user logged out.
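
The numeric code can be checked quickly. The sketch below assumes the number in the kernel message is a Linux errno value, which the message itself does not confirm; the helper name describe_errno is made up for illustration. (The NFSv4 protocol also assigns 13 to its access-denied status, NFS4ERR_ACCESS, so both readings point to a permission problem.)

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# Look up the code seen in the kernel message.
name, message = describe_errno(13)
print(name, message)
```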

Do you want me to open a separate bug report for this?
Comment 5 Forgotten User b5BnQSUi71 2010-08-03 15:57:28 UTC
(In reply to comment #4)
> After testing the packages I can confirm that they fix the problem for us.

Thanks for confirming.

> Do you recommend that we deploy them on all our 11.2 hosts or should we wait
> for an official update?

You should wait for an official update.
 
> Also, this probably should be fixed in 11.3. However there we have a similar
> but slightly different behaviour. The error message does not come from rpc.gssd
> but from the kernel itself. It is however caused by the same circumstances and
> also spams the log so quickly that there's a good chance of filling up the /var
> partition.
> 
> 
> Do you want me to open a separate bug report for this?

Yes, it sounds different from this one. Please open a separate bugzilla for that issue.
Comment 7 Neil Brown 2010-08-11 06:22:43 UTC
I have submitted an update for 11.2 containing this patch, but I'm not confident that an update will be released in any great hurry.

This bug is fixed in 11.3. It might be appropriate to upgrade to 11.3, or
just get the nfs-utils package from there.

The update request id is 45345
Comment 8 Forgotten User xRcrmyYBVX 2011-07-04 07:51:14 UTC
Hello everyone,

As Mika Fischer described, this bug also exists in openSUSE 11.3/11.4, except that the error message is:
--------
kernel: [<timestamp>] Error: state manager failed on NFSv4 server <IP_of_nfs_server> with error 13
--------

Otherwise, the description is still exactly valid:
- SSH login
- start user job, which accesses kerberized nfs user home
- SSH logout
- Kerberos cache expires
- /var/log/messages is spammed with ~1000 errors PER SECOND!
- /var partition out of space!

So, I could not find the corresponding bug report for 11.3/11.4. Is there one yet?
Comment 9 Forgotten User xRcrmyYBVX 2011-07-13 15:05:24 UTC
I have reported the above as a separate bug at https://bugzilla.novell.com/show_bug.cgi?id=705470