Bugzilla – Full Text Bug Listing
| Summary: | NFS hang followed by kernel BUG at net/sunrpc/rpcb_clnt.c:290 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.0 | Reporter: | Bob Vickers <R.Vickers> |
| Component: | Kernel | Assignee: | Forgotten User b5BnQSUi71 <forgotten_b5BnQSUi71> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | ||
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.0 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | kernel log messages | ||
I forgot to mention: system info is
2.6.25.11-0.1-default #1 SMP 2008-07-13 20:48:28 +0200 x86_64 x86_64 x86_64 GNU/Linux

Bob

This NFS hang has now occurred on four different computers and is causing serious headaches. The kernel bug only occurred once, so possibly that is less significant. Here is some extra information; if there is anything else I can provide please let me know.
(1) The NFS hang has happened on 4 different client computers, all 64-bit machines running SuSE 11.0. Current kernel is kernel-default-2.6.25.11-0.1
(2) So far it has not happened on 32-bit clients, nor on SuSE 10.x clients
(3) Typically only one client at a time is affected, other clients continue to work
(4) When it happens, anybody with an NFS-mounted home directory gets completely stuck, but root can still log in. Other network protocols (e.g. ssh) still work.
(5) Two different NFS servers have been involved, both are running SuSE 10.2 and have been running very smoothly for a year. When the problem occurs it typically only affects communication between one client and one server: the client can still use file systems mounted on the other server.
(6) The problem can be resolved by typing
strace df
umount -fl filesystem
mount -o udp filesystem
and repeating this for each hung file system.
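The workaround in (6) could be scripted roughly as follows. This is only a sketch, not something taken from the report: the function name `remount_udp` and the `RUN` variable are made up for illustration, and it assumes each mount point has an /etc/fstab entry so that `mount -o udp <dir>` can find the server and export. By default it only prints the commands; the hung mount points still have to be identified by hand first (e.g. with `strace df`).

```shell
#!/bin/sh
# Sketch only: apply the umount/mount workaround to each mount point given.
# remount_udp and RUN are hypothetical names, not part of any package.
# Default is a dry run that prints the commands; set RUN=1 to execute them.

remount_udp() {
    for fs in "$@"; do
        for cmd in "umount -fl" "mount -o udp"; do
            if [ "${RUN:-0}" = "1" ]; then
                # Unquoted $cmd is deliberate: split into command and options.
                $cmd "$fs"
            else
                echo "$cmd $fs"
            fi
        done
    done
}

# Example (dry run): remount_udp /rmt/engels/data
```

Running it as a dry run first makes it easy to check the list of file systems before anything is forcibly unmounted.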
I noticed today that a new version of nfs-client (version 1.1.2-9.2) has just been released. The description of the bug fix doesn't exactly match my problem, because for me UDP works whereas TCP is flaky, but I have installed it anyway and rebooted. I'll see what happens over the next few days.

Bob

The NFS hang happened again today with newer software on the client:
nfs-client-1.1.2-9.2
kernel-default-2.6.25.16-0.1

To summarise the history again:
(1) The hangs occurred on four different clients a few weeks ago within the space of a few days.
(2) In each case remounting the file systems as UDP solved the problem and the machines remained stable.
(3) Recently all the machines needed rebooting to install the latest kernel, so the NFS file systems went back to default protocol settings.
(4) The problem has reappeared.

Note that the kernel BUG in the problem summary has only occurred once, so it is possibly a red herring.

*** This bug has been marked as a duplicate of bug 415607 ***
Created attachment 233002
kernel log messages

Various processes started hanging, and in particular df hung when it got to some NFS file systems. I forcibly unmounted the file systems concerned (which released the hung processes), but was unable to mount them again. Mount failed with the message

# mount -o mountproto=tcp /rmt/engels/data
mount.nfs: internal error

and then I noticed that the logfile included a kernel bug check:

"kernel BUG at net/sunrpc/rpcb_clnt.c:290!"

I have attached a file containing all the kernel messages. Note that the bug check came after I first started seeing problems.