|
Bugzilla – Full Text Bug Listing |
| Summary: | getcwd(2) returns bogus path | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.3 | Reporter: | Harald Koenig <koenig> |
| Component: | Basesystem | Assignee: | Leonardo Chiquitto <lchiquitto> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P3 - Medium | CC: | forgotten_sLJ7K2dvxj, lchiquitto |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 11.3 | ||
| See Also: | https://bugzilla.novell.com/show_bug.cgi?id=565151 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
Hi Harald, thanks for the bug report. I'm afraid this is a known problem (please see bug #565151).

The summary is:

The problem only happens when AutoFS is restarted. Running processes, with $CWD set to an automounted directory, will get "truncated" results from getcwd().

Here's the status for our supported openSUSE releases:

openSUSE 11.3: fixed.

openSUSE 11.2: not fixed yet. All the code is there, but we need to update the AutoFS init script and sysconfig to enable the feature that fixes it. I will do it the next time we update AutoFS.

openSUSE 11.1: not fixed. Unfortunately it's not easy to fix. It's basically a new feature that requires Kernel changes plus an AutoFS version update. I'm sorry but, considering that the problem is not critical for most use cases, I think this is a WONTFIX for openSUSE 11.1.

Harald, considering the previous comment, are you OK with closing it as "won't fix"?

(In reply to comment #2)
> Harald, considering the previous comment, are you OK with closing it as "won't
> fix"?

ACK, 11.1 is more or less end-of-life anyway ;)

but one comment reading your interesting information: I did not find any evidence that autofs on my PC got restarted. I've checked /var/log/messages etc. unfortunately I did not keep the output of ps before rebooting, so I'll have to wait for the next time this will happen to do more checks (or just cross fingers ;) -- ticket closed for now...

thanks for the info!

Thanks! Closing as WONTFIX (for 11.1, already fixed for 11.3).

(In reply to comment #3)
> I did not find any evidence that autofs on my PC got restarted.

FYI a quick update: I just had the chance to restart autofs on a suse 11.1 system (did not want to test this on my PC right now -- one never knows... ;-)

restarting autofs on that system shows this msg in syslog with the old PID of automount:

    "umount_autofs_indirect: ask umount returned busy /home"

I find the same message in my own PC's syslog file:

    Aug 3 15:18:09 atuin pm-suspend[29642]: Entering suspend. In case of problems, please check /var/log/pm-suspend.log
    Aug 3 15:18:10 atuin automount[5602]: umount_autofs_indirect: ask umount returned busy /home

so you're totally right: the restart of autofs got triggered by a test of suspend2ram for my PC, and it all was about restarting autofs.

thanks again!

(In reply to comment #1)
> Hi Harald, thanks for the bug report. I'm afraid this is a known problem
> (please see bug #565151).
>
> The summary is:
>
> The problem only happens when AutoFS is restarted. Running processes, with $CWD
> set to an automounted directory, will get "truncated" results from getcwd().
>
> Here's the status for our supported openSUSE releases:
>
> openSUSE 11.3: fixed.

RUMORS!!!

actually today I did the same suspend/resume test with my desktop PC, now running opensuse 11.3 -- and surprise: I slipped into the same bogus behaviour as last year with opensuse 11.1!!

    atuin > acroread
    ERROR: Cannot determine current directory.
    atuin > pwd
    /home/koenig/dir
    atuin > /bin/pwd ; echo
    koenigdir
    atuin > strace -e getcwd /bin/pwd
    getcwd("koenigdir", 4096) = 9

please (also?!) note the missing slash between my home dir name "koenig" and the subdir name "dir"! the 2.6.27 kernel from opensuse 11.1 at least did still print that slash, which is now missing too ;-)

    atuin > uname -a
    Linux atuin 2.6.34.7-0.7-default #1 SMP 2010-12-13 11:13:53 +0100 x86_64 x86_64 x86_64 GNU/Linux
    atuin > rpm -q autofs
    autofs-5.0.5-7.2.x86_64

Harald -- now offline for a reboot... :-(((

Please attach /etc/sysconfig/autofs here.

(In reply to comment #7)
> Please attach /etc/sysconfig/autofs here.

    DEFAULT_BROWSE_MODE=no

that's all.... autofs gets all its data via NIS:

    atuin koenig > grep auto /etc/nsswitch.conf
    automount: nis files
    atuin koenig > ypmatch /home auto.master
    auto.home -rw,grpid,hard,intr,nodevs,nosuid
    atuin koenig > ypmatch koenig auto.home
    atuin:/net/atuin/fs1/home/&

> > Please attach /etc/sysconfig/autofs here.
>
> DEFAULT_BROWSE_MODE=no
>
> that's all.... autofs gets all its data via NIS:
That explains why you're still seeing the problem. You need to add the following line to /etc/sysconfig/autofs:
USE_MISC_DEVICE="yes"
This is set by default in the sysconfig file shipped with the package, but you removed it for some reason. This means AutoFS is *not* using the misc device (/dev/autofs), the feature that resolves this bug.
Although the original bug is fixed (if you have the option explicitly set to "yes"), your comments have made me realize we still have a bug in our init script: if $USE_MISC_DEVICE is not defined, it should be interpreted as "yes" by default (currently this is not the case and that's why you hit the bug again).
I'll report this in a new bug and fix it in openSUSE Factory.
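The intended defaulting behaviour can be sketched in shell as follows. This is only an illustration, not the actual openSUSE init script: the function name `misc_device_enabled` is hypothetical, and only the variable name USE_MISC_DEVICE and the file /etc/sysconfig/autofs are taken from this report.

```shell
#!/bin/sh
# Hypothetical helper (not from the real init script): decide whether the
# misc device should be used, treating an unset or empty USE_MISC_DEVICE
# as "yes". A real init script would source /etc/sysconfig/autofs first.
misc_device_enabled() {
    # ${VAR:-yes} expands to "yes" when VAR is unset or empty.
    case "${USE_MISC_DEVICE:-yes}" in
        yes) return 0 ;;
        *)   return 1 ;;
    esac
}

if misc_device_enabled; then
    echo "misc device: enabled"
else
    echo "misc device: disabled"
fi
```

With a sysconfig file that contains only DEFAULT_BROWSE_MODE=no, as in the comment above, this check would report the misc device as enabled, because the missing variable falls back to "yes".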
(In reply to comment #9)
> That explains why you're still seeing the problem. You need to add the
> following line to /etc/sysconfig/autofs:
>
> USE_MISC_DEVICE="yes"

ACK! with USE_MISC_DEVICE="yes" and "rcautofs restart" there are no longer getcwd() problems after suspend2ram!

thanks a lot for your quick help (and the fix in #684997 -- *please* feed this change into updates for 11.4 and 11.3, too!)
|
recently we updated the kernel from 2.6.27.45-0.1.1 to 2.6.27.48-0.1.1; my $HOME is mounted via autofs. now, for the first time, I notice the following strange problem with some shell scripts:

in one (of many open ;) xterm and the bash within it, /bin/pwd now returns "koenig/dir" instead of the "/home/koenig/dir" returned in other bash instances which are in the same dir (note esp. the missing leading / in the broken /bin/pwd output!). in other xterm/bash instances this does not happen at all, while it's 100% reproducible in that one bash.

"strace /bin/pwd" shows that the bad string comes directly from getcwd(2):

    good bash: getcwd("/home/koenig/dir", 4096) = 17
    bad bash:  getcwd("koenig/dir", 4096) = 11

a "cd $PWD" or "cd subdir" fixes the problem for this shell, but not for the parent process/shell (same for interactive shells/tests):

    $ bash -c "/bin/pwd"
    koenig/dir
    $ bash -c "cd $PWD ; /bin/pwd"
    /home/koenig/dir
    $ bash -c "cd subdir ; /bin/pwd"
    /home/koenig/dir/subdir
    $ /bin/pwd
    koenig/dir

more facts:

"stat ." shows identical output in shells with good and broken pwd output.

that directory is quite old; it was not removed/recreated/renamed/whatsoever in the last decade.

/home is ext3fs

this happened twice today in two different directories! in both cases the first error was from acroread (it claims "ERROR: Cannot determine current directory."). for the first problem I just did "cd $PWD" (thinking about some "rm dir ; mkdir dir" problem); for the second occurrence I started some more testing to see what's going on.

we installed that new kernel on July 29, so it has only been running for 2 office days, with at least 2 instances of this new(?) behaviour.

autofs did not change recently (install date Mar 16 2009)

/proc/self/cwd shows the same broken information for that bash process:

    lrwxrwxrwx 1 koenig s+c 0 Aug 3 17:55 cwd -> koenig/dir

dmesg doesn't show any error or (to me) significant output

a closer look at the strace output shows more weird differences: the st_ino=... return values of all stat() and fstat() calls differ. surprisingly, the strace of the "bad" /bin/pwd shows the correct st_ino values (compared with "ls -i file" and "stat file" in both a broken and a good bash instance -- all show the same inode numbers!)

any idea what's going on here? any problem in the new kernel, like some weird memory corruption or similar? I'll run a full fsck on that partition overnight -- just in case....
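The symptom described in this report (a working directory string without a leading slash) and the "cd $PWD" workaround can be sketched as a small shell check. This is an illustration only: the function names `is_absolute_path` and `check_cwd` are hypothetical, and the sketch assumes /bin/pwd prints whatever getcwd() returns, as it did in the strace output of this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path string starts with "/".
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical check: ask /bin/pwd (assumed to reflect getcwd()) for the
# current directory and compare against the shell's own idea in $PWD.
check_cwd() {
    kernel_cwd=$(/bin/pwd 2>/dev/null)
    if is_absolute_path "$kernel_cwd"; then
        echo "ok: $kernel_cwd"
    else
        echo "bogus cwd: '$kernel_cwd' (shell thinks: $PWD)"
        # Workaround from the report: re-enter the directory by its
        # absolute path. This only repairs the current shell process.
        cd "$PWD" || return 1
    fi
}
```

Calling `check_cwd` in an affected shell would take the second branch for a result such as "koenig/dir". As noted in the report, the workaround does not repair the parent process.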