Bug 466484 (shutdown_problem) - on shutdown the home-partition does not clearly unmount.
Summary: on shutdown the home-partition does not clearly unmount.
Status: RESOLVED FIXED
Duplicates: 462585 465029
Alias: shutdown_problem
Product: openSUSE 11.1
Classification: openSUSE
Component: Other
Version: Final
Hardware: x86-64 openSUSE 11.1
Priority: P4 - Low; Severity: Critical
Target Milestone: ---
Assignee: Dr. Werner Fink
QA Contact: E-mail List
URL:
Whiteboard: maint:released:11.1:22284 maint:relea...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-15 17:00 UTC by Sven Zielke
Modified: 2010-01-29 16:13 UTC (History)
13 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
sysvinit.rpm for i586 and higher (477.25 KB, application/x-rpm)
2009-01-16 13:21 UTC, Dr. Werner Fink
Details
aaa_base.rpm for i586 and higher (142.72 KB, application/x-rpm)
2009-01-16 13:23 UTC, Dr. Werner Fink
Details
sysvinit.rpm for x86_64 (492.95 KB, application/x-rpm)
2009-01-16 15:55 UTC, Dr. Werner Fink
Details
aaa_base.rpm for x86_64 (142.79 KB, application/x-rpm)
2009-01-16 15:56 UTC, Dr. Werner Fink
Details
Screenshot (1.70 MB, image/jpeg)
2009-01-16 16:29 UTC, Sven Zielke
Details
Booting fails on x86_64 (59.43 KB, text/plain)
2009-01-19 12:18 UTC, Sven Zielke
Details
aaa_base-11.1-10007.12.i586.rpm (142.60 KB, application/x-rpm)
2009-01-20 11:52 UTC, Dr. Werner Fink
Details
sysvinit-2.86-186.14.i586.rpm (476.97 KB, application/x-rpm)
2009-01-20 11:53 UTC, Dr. Werner Fink
Details
aaa_base-11.1-10007.12.x86_64.rpm (142.67 KB, application/x-rpm)
2009-01-20 11:54 UTC, Dr. Werner Fink
Details
sysvinit-2.86-186.14.x86_64.rpm (492.68 KB, application/x-rpm)
2009-01-20 11:55 UTC, Dr. Werner Fink
Details
This is the "debug" patch I added to boot.localfs to try the backtrace as suggested by Sven Zielke (1.53 KB, patch)
2009-01-22 23:18 UTC, Diego Ercolani
Details | Diff
"debug" patch for boot.localfs (1.87 KB, patch)
2009-01-23 15:23 UTC, Diego Ercolani
Details | Diff
dump generated by attachment #267289 (203.37 KB, text/plain)
2009-01-23 15:29 UTC, Diego Ercolani
Details
sysvinit-2.86-186.14.i586.rpm (477.22 KB, application/x-rpm)
2009-01-26 12:09 UTC, Dr. Werner Fink
Details
sysvinit-2.86-186.14.x86_64.rpm (493.09 KB, application/x-rpm)
2009-01-26 12:10 UTC, Dr. Werner Fink
Details
shutdown freeze log (31.99 KB, text/plain)
2009-02-01 12:19 UTC, Diego Ercolani
Details
"debug" patch for boot.localfs (2.12 KB, patch)
2009-02-23 13:19 UTC, Diego Ercolani
Details | Diff
last shutdown hangup (74.50 KB, application/octet-stream)
2009-02-24 12:45 UTC, Diego Ercolani
Details
Modified mkill that documents its work (20.18 KB, text/plain)
2009-03-10 01:03 UTC, Matthias Hopf
Details
lsof output during shutdown (32.87 KB, text/plain)
2009-03-10 01:03 UTC, Matthias Hopf
Details
mkill log output (83.39 KB, text/plain)
2009-03-10 01:04 UTC, Matthias Hopf
Details
Bug fix for mkill (413 bytes, patch)
2009-03-10 01:05 UTC, Matthias Hopf
Details | Diff
sysvinit-2.86-186.16.i586.rpm (477.71 KB, application/x-rpm)
2009-03-10 12:23 UTC, Dr. Werner Fink
Details
sysvinit-2.86-186.16.x86_64.rpm (493.93 KB, application/x-rpm)
2009-03-10 12:25 UTC, Dr. Werner Fink
Details
boot.omsg (4.03 KB, text/plain)
2009-04-01 19:32 UTC, Eberhard Harbrink
Details
kernel session log for a session without hang but without umount of rootfs (46.92 KB, text/plain)
2009-04-01 19:36 UTC, Diego Ercolani
Details
modification of boot.localfs and halt scripts (1000 bytes, patch)
2009-04-02 23:08 UTC, Diego Ercolani
Details | Diff
kernel session log for a session without hang but without umount of rootfs (halt complainted about "/" is busy during remount,ro) (46.92 KB, text/plain)
2009-04-02 23:10 UTC, Diego Ercolani
Details
log generated for the same session as (id=283822) by script modifications as (id=283820) (1.39 KB, text/plain)
2009-04-02 23:12 UTC, Diego Ercolani
Details
kernel session log for a session without hang but without umount of rootfs (halt complainted about "/" is busy during remount,ro) (46.22 KB, text/plain)
2009-04-06 22:41 UTC, Diego Ercolani
Details
log generated for the same session as (id=284364) by script modifications as (id=283820) (4.44 KB, text/plain)
2009-04-06 22:43 UTC, Diego Ercolani
Details
/etc/init.d/{boot.localfs,halt} /var/log/boot.{omsg,faill.msg} as requested in comment #128 (24.98 KB, application/x-gzip)
2009-04-07 22:48 UTC, Diego Ercolani
Details
/etc/init.d/.depend.* /etc/inittab (2.49 KB, application/x-gzip)
2009-04-09 19:11 UTC, Eberhard Harbrink
Details
ps axu (5.00 KB, text/plain)
2009-04-09 19:18 UTC, Eberhard Harbrink
Details

Description Sven Zielke 2009-01-15 17:00:36 UTC
Sorry for my English!
Applications often still access the /home partition (if it exists) at shutdown, so unmounting it fails.
A solution seems to be moving the following block

echo "Sending all processes the TERM signal..."
killall5 -15
echo -e "$rc_done_up"

# wait between last SIGTERM and the next SIGKILL
rc_wait /sbin/blogd /sbin/splash

echo "Sending all processes the KILL signal..."
killall5 -9
echo -e "$rc_done_up"


between

test -s /etc/init.d/.depend.halt  || RUN_PARALLEL="no"
type -p startpar &> /dev/null     || RUN_PARALLEL="no"
startpar -v      &> /dev/null     || RUN_PARALLEL="no"


and

#
# set back system boot configuration
#
if test "$RUN_PARALLEL" = "yes" ; then

    startopt="-p4 -t 30 -T 3"
    eval $(startpar $startopt -M halt)
    unset failed_service skipped_service
Comment 1 Sven Zielke 2009-01-15 17:03:09 UTC
Oh I forgot: The filename for editing is /etc/init.d/halt
Here is the original thread in German:
http://www.linux-club.de/viewtopic.php?f=4&t=99992&start=20&st=0&sk=t&sd=a&sid=9b6315c409a8ed6b549ff96166609ab9
Comment 2 Dr. Werner Fink 2009-01-16 13:16:31 UTC
The major problem with this approach is that udevd is now killed
and some of the boot scripts lose udevd's event handling for the
kernel events they cause.  This is the reason why we now have
another solution for this.  But this requires a new sysvinit package
which includes two new tools:
 
  /sbin/mkill   - Send a signal to processes making an active mount point busy
  /sbin/vhangup - Cause a virtual hangup on the specified terminals

which are used in a changed /etc/init.d/boot.localfs of the package
aaa_base.
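To illustrate what "making a mount point busy" means, here is a minimal sketch that lists the processes whose working directory lies on or below a given directory. It is not the mkill implementation: the helper name pids_using_dir is hypothetical, it only looks at /proc/PID/cwd (open files and memory mappings are ignored), and it assumes a mounted /proc.

```shell
#!/bin/sh
# pids_using_dir DIR: print the PID of every process whose current working
# directory is DIR or a path below it. Hypothetical helper for illustration;
# it is not how mkill is implemented.
pids_using_dir() {
    dir=${1%/}                       # strip a trailing slash ("/" becomes "")
    for p in /proc/[0-9]*; do
        # readlink fails for processes we may not inspect; skip those
        cwd=$(readlink "$p/cwd" 2>/dev/null) || continue
        case "$cwd/" in
            "$dir"/*) echo "${p#/proc/}" ;;
        esac
    done
}
```

Run as root, `pids_using_dir /home` would show the candidates that need a signal before /home can be unmounted.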
Comment 3 Dr. Werner Fink 2009-01-16 13:21:56 UTC
Created attachment 265643 [details]
sysvinit.rpm for i586 and higher

New sysvinit with the /sbin/mkill and /sbin/vhangup binary.
Please install this *before* installing the next attachment.
Comment 4 Dr. Werner Fink 2009-01-16 13:23:26 UTC
Created attachment 265645 [details]
aaa_base.rpm for i586 and higher

The aaa_base package with the /etc/init.d/boot.localfs script using mkill(8)
and /etc/init.d/halt using vhangup(8). Please test whether this works
for you.
Comment 5 Dr. Werner Fink 2009-01-16 13:25:24 UTC
Please try out the above attachments of comment #3 and comment #4 by
installing first attachment #265643 [details] and then attachment #265645 [details].
Does this work for you?
Comment 6 Sven Zielke 2009-01-16 13:58:35 UTC
I could not install the package sysvinit:

sudo rpm -ivh sys*.rpm
Preparing...                ########################################### [100%]
        file /bin/fsync from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /bin/mountpoint from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /bin/usleep from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/blogd from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/blogger from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/checkproc from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/detectups from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/halt from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/init from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/isserial from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/killall5 from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/killproc from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/powerd from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/runlevel from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/showconsole from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/shutdown from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/start-stop-daemon from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/startpar from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/startproc from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /sbin/sulogin from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /usr/bin/last from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
        file /usr/bin/utmpdump from install of sysvinit-2.86-189.11.i586 conflicts with file from package sysvinit-2.86-186.13.x86_64
Comment 7 Dr. Werner Fink 2009-01-16 14:29:50 UTC
Do *not* use `-ihv' but `-Uhv' please ... -i is for install and -U for update
and you want the latter.
Comment 8 Sven Zielke 2009-01-16 15:38:40 UTC
System does not boot with these packages. Maybe wrong architecture? (i586 instead of x86_64)
/sys/class seems to be missing on boot.
Comment 9 Dr. Werner Fink 2009-01-16 15:55:13 UTC
Created attachment 265689 [details]
sysvinit.rpm for x86_64

this is for x86_64
Comment 10 Dr. Werner Fink 2009-01-16 15:56:27 UTC
Created attachment 265691 [details]
aaa_base.rpm for x86_64

this is for x86_64
Comment 11 Dr. Werner Fink 2009-01-16 15:57:44 UTC
Please retry with the correct architecture ... does the /sys/class error
happen again?
Comment 12 Sven Zielke 2009-01-16 16:29:58 UTC
Created attachment 265705 [details]
Screenshot

The system still does not boot; it hangs (see screenshot), but I could restart with Ctrl+Alt+Del.
Comment 13 Diego Ercolani 2009-01-16 19:49:27 UTC
I installed attachment #265643 [details] and then attachment #265645 [details]. (I read afterwards that for x86_64 the packages from comment 9 and comment 10 have to be used, but the ones I installed also seem to be for x86_64.)
I am adding this comment for tracking purposes. I will then see whether it solves the shutdown hang.

That's my session dump:

rpm -Uhv /home/diego/Desktop/sysvinit.rpm 
Preparing...                ########################################### [100%]
   1:sysvinit               ########################################### [100%]
Scanning scripts ...                                                          
Resolve dependencies ...                                                      
Install symlinks in /lib/mkinitrd/setup ...                                   
Install symlinks in /lib/mkinitrd/boot ...                                    
Scanning scripts ...                                                          
Resolve dependencies ...                                                      
Install symlinks in /lib/mkinitrd/setup ...                                   
Install symlinks in /lib/mkinitrd/boot ...                                    
casaregno:~ # rpm -Uhv /home/diego/Desktop/aaa_base.rpm 
Preparing...                ########################################### [100%]
   1:aaa_base               ########################################### [100%]
insserv: Script jexec is broken: incomplete LSB comment.                      
insserv: missing `Required-Stop:'  entry: please add even if empty.           
Updating etc/sysconfig/language...                                            
Updating etc/sysconfig/backup...                                              
Updating etc/sysconfig/boot...                                                
Updating etc/sysconfig/kernel...                                              
Updating etc/sysconfig/suseconfig...                                          
Updating etc/sysconfig/clock...                                               
Updating etc/sysconfig/proxy...                                               
Updating etc/sysconfig/windowmanager...                                       
Updating etc/sysconfig/sysctl...                                              
Updating etc/sysconfig/cron...                                                
Updating etc/sysconfig/news...                                                
Updating etc/sysconfig/shutdown...                                            
Updating etc/passwd...unchanged                                               
Updating etc/group...unchanged                                                
Updating etc/shadow...unchanged                                               
insserv: Script jexec is broken: incomplete LSB comment.                      
insserv: missing `Required-Stop:'  entry: please add even if empty.           
casaregno:~ # uname -a                                                        
Linux casaregno 2.6.27.7-9-default #1 SMP 2008-12-04 18:10:04 +0100 x86_64 x86_64 x86_64 GNU/Linux
casaregno:~ # rpm -Uhv /home/diego/Desktop/sysvinit                                               
sysvinit(2).rpm  sysvinit.rpm                                                                     
casaregno:~ # rpm -Uhv /home/diego/Desktop/sysvinit\(2\).rpm                                      
Preparing...                ########################################### [100%]                    
        package sysvinit-2.86-189.11.x86_64 is already installed
Comment 14 Dr. Werner Fink 2009-01-19 09:39:00 UTC
To check which architecture an rpm is built for, please type e.g.

    rpm --queryformat '%{NAME} for %{ARCH}\n' -qp sysvinit.rpm

this gives for the first two packages:

 /suse/werner> rpm --queryformat '%{NAME} for %{ARCH}\n' -qp sysvinit.rpm
 sysvinit for i586
 /suse/werner> rpm --queryformat '%{NAME} for %{ARCH}\n' -qp aaa_base.rpm
 aaa_base for i586

and for the second two packages:

 /suse/werner> rpm --queryformat '%{NAME} for %{ARCH}\n' -qp sysvinit.rpm
 sysvinit for x86_64
 /suse/werner> rpm --queryformat '%{NAME} for %{ARCH}\n' -qp aaa_base.rpm
 aaa_base for x86_64

The next point: to overwrite an installed package that has the same
version and the same release number, the option --force can be used:

  rpm -Uhv sysvinit.rpm --force
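The architecture check can be scripted before installing. The following is a minimal sketch: the helper names arch_matches and check_rpm are hypothetical, and the strict comparison against `uname -m` is a simplifying assumption (an i586 package is fine on an i686 machine, which this sketch would reject).

```shell
#!/bin/sh
# arch_matches PKG_ARCH SYS_ARCH: succeed if the package architecture is
# noarch or identical to the system architecture (simplified rule).
arch_matches() {
    pkg_arch=$1 sys_arch=$2
    [ "$pkg_arch" = noarch ] || [ "$pkg_arch" = "$sys_arch" ]
}

# check_rpm FILE: query the architecture of a package file and compare it
# with the running machine. Returns 2 if rpm cannot read the file.
check_rpm() {
    pkg_arch=$(rpm --queryformat '%{ARCH}' -qp "$1") || return 2
    if arch_matches "$pkg_arch" "$(uname -m)"; then
        echo "$1: $pkg_arch, ok"
    else
        echo "$1: $pkg_arch does not match $(uname -m)" >&2
        return 1
    fi
}
```

For example, `check_rpm sysvinit.rpm && rpm -Uhv sysvinit.rpm` would refuse the i586 package on an x86_64 system.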
Comment 15 Diego Ercolani 2009-01-19 09:45:48 UTC
I have to report that the new sysvinit and aaa_base packages don't solve the shutdown problem. I'm not sure that the problem is related to some process blocking the umount; could it be some unclean cleanup of a kernel module?
Comment 16 Dr. Werner Fink 2009-01-19 09:48:46 UTC
(In reply to comment #12)

This is very strange: to me this looks like the mingettys are respawning
too fast, but I'm not able to read the text in attachment #265705 [details].  You should
check whether you have really installed the correct sysvinit package (x86_64).
For this you may use single user mode (be aware that you then have the system console rather than a virtual console, that is, Ctrl-C does not work and you have to
mount the partitions by hand) or the openSuSE DVD with the repair menu entry.
Comment 17 Dr. Werner Fink 2009-01-19 09:55:35 UTC
(In reply to comment #15)

Diego? Does this mean that your system boots without problems after
installing the two packages?  Do you have the mkill binary around,
that is that

   type -p mkill

should print /sbin/mkill, and mkill should be used within
/etc/init.d/boot.localfs to stop all processes making the mount
points busy.
Comment 18 Sven Zielke 2009-01-19 10:12:06 UTC
Okay, I tried with a fresh installation in VirtualBox.
The system boots and seems to unmount cleanly.

The output of type -p mkill is empty!
Comment 19 Dr. Werner Fink 2009-01-19 10:30:52 UTC
You have to be root for

         type -p mkill

otherwise you will not see mkill at /sbin/mkill
Comment 20 Sven Zielke 2009-01-19 12:18:44 UTC
Created attachment 265992 [details]
Booting fails on x86_64

My system still does not boot with these packages. I made sure that I installed the correct architecture. It seems that sysfs could not be mounted on boot; see the attached log.
Comment 21 Dr. Werner Fink 2009-01-19 12:59:20 UTC
What does your /etc/init.d/boot.local script do?  AFAICS from your log
the /sys file system seems to be mounted. But the message

   mount: /sys not mounted already, or bad option

leads me to guess that there is an error. Or it could be that you're
missing a module or are running a wrong kernel, as /sys/kernel/security
cannot be mounted.  The only difference in mounting /sys between the old
/etc/init.d/boot

 [...]
 echo -n "Mounting procfs at /proc"
 mount -n -t proc proc /proc
 rc_status -v -r

 echo -n "Mounting sysfs at /sys"
 mount -n -t sysfs sysfs /sys
 rc_status -v -r
 [...]

and the new /etc/init.d/boot is

 [...]
 if test ! -d /proc/1 ; then
     echo -n "Mounting procfs at /proc"
     mount -n -t proc proc /proc
     rc_status -v -r
 fi

 if test ! -d /sys/block ; then
     echo -n "Mounting sysfs at /sys"
     mount -n -t sysfs sysfs /sys
     rc_status -v -r
 fi
 [...]

... this may fail if you have a mount point /sys
with a directory named block in it.
Comment 22 Sven Zielke 2009-01-19 13:50:43 UTC
cat /etc/init.d/boot.local
#! /bin/sh
#
# Copyright (c) 2002 SuSE Linux AG Nuernberg, Germany.  All rights reserved.
#
# Author: Werner Fink <werner@suse.de>, 1996
#         Burchard Steinbild, 1996
#
# /etc/init.d/boot.local
#
# script with local commands to be executed from init on system startup
#
# Here you should add things, that should happen directly after booting
# before we're going to the first run level.
#

/bin/echo min_power > /sys/class/scsi_host/host0/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host1/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host2/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host3/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host4/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host5/link_power_management_policy
/bin/echo min_power > /sys/class/scsi_host/host6/link_power_management_policy
/bin/echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
/bin/echo 1 > /sys/module/snd_ac97_codec/parameters/power_save
/sbin/modprobe saa7134-alsa
/sbin/modprobe lirc_dev

Nothing very special, some options from powertop and some modules...

Okay I could start from Live-CD and have a look if there is a /sys/block.

cat /etc/fstab
/dev/disk/by-id/scsi-SATA_WDC_WD6400AAKS-_WD-WMASY2641335-part6 swap                 swap       defaults              0 0
/dev/disk/by-id/scsi-SATA_WDC_WD6400AAKS-_WD-WMASY2641335-part5 /                    ext3        defaults              1 1
/dev/disk/by-id/scsi-SATA_WDC_WD6400AAKS-_WD-WMASY2641335-part7 /home                ext3        defaults              1 2
/media/sda9          ext3        defaults              1 2
/dev/disk/by-id/scsi-SATA_WDC_WD6400AAKS-_WD-WMASY2641335-part1 /windows/C           ntfs-3g    uid=1000,exec,users,gid=users,fmask=133,dmask=022,locale=de_DE.UTF-8 0 0
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
/dev/fd0             /media/floppy        auto       noauto,user,sync      0 0
/dev/disk/by-id/scsi-SATA_SAMSUNG_HD501LJS0MUJ1EQ164247-part1 /media/data          ext3       defaults              1 2
Comment 23 Sven Zielke 2009-01-19 14:24:35 UTC
Indeed there was an empty folder called /sys/block. I deleted it and now it works. Partitions were cleanly unmounted on reboot.

type -p mkill
/sbin/mkill

Maybe the init.script should check if the /sys/block-directory is empty or not.
There is also a directory called "kernel" but this seems to be needed by aaa-base?
Comment 24 Dr. Werner Fink 2009-01-20 10:15:38 UTC
Hmmm ... the file systems /proc and /sys are virtual file systems and
indeed they exist only in memory, and only when a directory or file
is opened by a user space application.  If /proc and /sys are
not mounted the mount point should be empty ... we could replace the
simple test for the directories by something like

  test "$(stat -f -c '%T' /proc)" = proc || mount -n -t proc proc /proc
  test "$(stat -f -c '%T' /sys)" = sysfs || mount -n -t sysfs sysfs /sys

as this would avoid buggy mount points.
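The same check can be wrapped in a small function so that a failing stat (for example a missing mount point) is handled explicitly. This is only a sketch: the name is_fstype is hypothetical, and it assumes GNU stat, whose `-f -c '%T'` prints the file system type in human-readable form.

```shell
#!/bin/sh
# is_fstype PATH TYPE: succeed only if PATH exists and lives on a file
# system whose type, as reported by GNU stat, equals TYPE.
is_fstype() {
    actual=$(stat -f -c '%T' "$1" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# Intended use in a boot script (shown as comments, needs root):
#   is_fstype /proc proc  || mount -n -t proc  proc  /proc
#   is_fstype /sys  sysfs || mount -n -t sysfs sysfs /sys
```

Unlike the test for /sys/block, a stray directory inside the unmounted /sys mount point does not fool this check, because the type of the underlying file system is compared rather than the directory contents.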

Rudi? What do you think? AFAICS on openSuSE 11.1 and SLES11 we
have /bin/stat, and with this we could do it very simply.
Comment 25 Dr. Werner Fink 2009-01-20 11:52:17 UTC
Created attachment 266171 [details]
aaa_base-11.1-10007.12.i586.rpm
Comment 26 Dr. Werner Fink 2009-01-20 11:53:31 UTC
Created attachment 266172 [details]
sysvinit-2.86-186.14.i586.rpm
Comment 27 Dr. Werner Fink 2009-01-20 11:54:44 UTC
Created attachment 266173 [details]
aaa_base-11.1-10007.12.x86_64.rpm
Comment 28 Dr. Werner Fink 2009-01-20 11:55:34 UTC
Created attachment 266174 [details]
sysvinit-2.86-186.14.x86_64.rpm
Comment 29 Dr. Werner Fink 2009-01-20 11:57:23 UTC
Diego? Do those packages work for you?
Comment 30 Diego Ercolani 2009-01-21 14:32:29 UTC
It seems to work, but I have to do some more tests.
Comment 31 Dr. Werner Fink 2009-01-21 14:35:26 UTC
*** Bug 462585 has been marked as a duplicate of this bug. ***
Comment 32 Dr. Werner Fink 2009-01-21 14:36:42 UTC
Anja? A SWAMPID is required for both packages, sysvinit and aaa_base.
Comment 33 Sven Zielke 2009-01-21 15:42:27 UTC
My brother tested the new packages on 3 i586-installations and it seems to work there, too!
Comment 34 Diego Ercolani 2009-01-22 13:38:10 UTC
Hello. The problem seems to persist even with the new packages.
I think we need a sort of "magic key" to press when we are in freeze mode, to get a sort of machine status dump... Is that possible?
Comment 35 Sven Zielke 2009-01-22 13:52:38 UTC
Hello Diego,

you could edit the file /etc/init.d/boot.localfs
and look for the line
echo "Unmounting file systems"

Add the following two lines:

date >> /var/log/boot.fail.msg
lsof | grep /home >> /var/log/boot.fail.msg
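A plain `grep /home` also matches unrelated paths such as /homework. A slightly stricter variant could filter on the NAME column instead. This is a minimal sketch with hypothetical helper names (busy_on, log_busy); it assumes NAME is the last whitespace-separated column, so file names containing spaces or a trailing "(deleted)" marker are not handled.

```shell
#!/bin/sh
# busy_on MOUNTPOINT: read lsof output on stdin and print only the lines
# whose last column is the mount point itself or a path below it.
busy_on() {
    awk -v mp="${1%/}" '$NF == mp || index($NF, mp "/") == 1'
}

# log_busy MOUNTPOINT LOGFILE: append a timestamp and the matching lsof
# lines to LOGFILE.
log_busy() {
    {
        date
        lsof 2>/dev/null | busy_on "$1"
    } >> "$2"
}
```

In boot.localfs this would be called as `log_busy /home /var/log/boot.fail.msg` just before the unmount step.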
Comment 36 Dr. Werner Fink 2009-01-22 13:56:43 UTC
You may read /usr/src/linux/Documentation/sysrq.txt from kernel-source rpm.
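In short, the magic SysRq key has to be enabled before the hang is reproduced. The sketch below checks the current setting; the helper name sysrq_enabled is hypothetical, and the key combinations in the comments are the ones described in sysrq.txt.

```shell
#!/bin/sh
# sysrq_enabled [FILE]: succeed if the SysRq setting read from FILE
# (default /proc/sys/kernel/sysrq) is a non-zero value.
sysrq_enabled() {
    val=$(cat "${1:-/proc/sys/kernel/sysrq}" 2>/dev/null) || return 1
    [ "${val:-0}" -ne 0 ] 2>/dev/null
}

# As root, before shutting down:
#   echo 1 > /proc/sys/kernel/sysrq     # enable all SysRq functions
# During the freeze, on the console:
#   Alt+SysRq+t   dump the task list to the kernel log
#   Alt+SysRq+w   dump tasks in uninterruptible (blocked) state
#   Alt+SysRq+s, then u, then b   sync, remount read-only, reboot
```

The dumps go to the kernel log, so they are only useful if the console is visible or the log still reaches the disk.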
Comment 37 Diego Ercolani 2009-01-22 23:18:19 UTC
Created attachment 267063 [details]
This is the "debug" patch I added to boot.localfs to try the backtrace as suggested by  Sven Zielke 

This is the output that the "patch" produced when boot.localfs didn't unmount the directories:

The shutdown procedure froze (as reported by set -x on the console) at a line beginning with "mkill -TERM" followed by several mounted paths (honestly, I didn't write down the screen dump).

-----------boot.fail.msg----------------
Thu Jan 22 23:38:46 CET 2009
------
BACKTRACE
-----
  Traceback: 0
  Functions: 
------
mtab
----
/dev/hda11 on / type reiserfs (rw)
/proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/hda12 on /home type xfs (rw)
/dev/hda8 on /data1 type reiserfs (rw)
/dev/hda7 on /suse10.2 type reiserfs (rw)
/dev/hda1 on /windows/C type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda2 on /windows/D type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda3 on /windows/E type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda5 on /windows/F type fuseblk (rw,noexec,nosuid,nodev,allow_other,default_permissions,blksize=4096)
/dev/hda9 on /windows/G type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda10 on /windows/H type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/mapper/dati-multimedia on /mnt/hdb1 type xfs (rw,noexec,nosuid,nodev)
/dev/mapper/dati-distribuzione on /mnt/hdb2 type reiserfs (rw,noexec,nosuid,nodev)
securityfs on /sys/kernel/security type securityfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
------
lsof
----
COMMAND    PID  USER   FD      TYPE             DEVICE    SIZE/OFF   NODE NAME
init         1  root  cwd       DIR               3,11         688      2 /
init         1  root  rtd       DIR               3,11         688      2 /
init         1  root  txt       REG               3,11      838176  68779 /sbin/init
init         1  root   10u     FIFO               0,14         0t0   1763 /dev/initctl
kthreadd     2  root  cwd       DIR               3,11         688      2 /
kthreadd     2  root  rtd       DIR               3,11         688      2 /
kthreadd     2  root  txt   unknown                                       /proc/2/exe
migration    3  root  cwd       DIR               3,11         688      2 /
migration    3  root  rtd       DIR               3,11         688      2 /
migration    3  root  txt   unknown                                       /proc/3/exe
ksoftirqd    4  root  cwd       DIR               3,11         688      2 /
ksoftirqd    4  root  rtd       DIR               3,11         688      2 /
ksoftirqd    4  root  txt   unknown                                       /proc/4/exe
events/0     5  root  cwd       DIR               3,11         688      2 /
events/0     5  root  rtd       DIR               3,11         688      2 /
events/0     5  root  txt   unknown                                       /proc/5/exe
khelper      6  root  cwd       DIR               3,11         688      2 /
khelper      6  root  rtd       DIR               3,11         688      2 /
khelper      6  root  txt   unknown                                       /proc/6/exe
kintegrit    7  root  cwd       DIR               3,11         688      2 /
kintegrit    7  root  rtd       DIR               3,11         688      2 /
kintegrit    7  root  txt   unknown                                       /proc/7/exe
kblockd/0    8  root  cwd       DIR               3,11         688      2 /
kblockd/0    8  root  rtd       DIR               3,11         688      2 /
kblockd/0    8  root  txt   unknown                                       /proc/8/exe
kacpid       9  root  cwd       DIR               3,11         688      2 /
kacpid       9  root  rtd       DIR               3,11         688      2 /
kacpid       9  root  txt   unknown                                       /proc/9/exe
kacpi_not   10  root  cwd       DIR               3,11         688      2 /
kacpi_not   10  root  rtd       DIR               3,11         688      2 /
kacpi_not   10  root  txt   unknown                                       /proc/10/exe
cqueue      11  root  cwd       DIR               3,11         688      2 /
cqueue      11  root  rtd       DIR               3,11         688      2 /
cqueue      11  root  txt   unknown                                       /proc/11/exe
kseriod     12  root  cwd       DIR               3,11         688      2 /
kseriod     12  root  rtd       DIR               3,11         688      2 /
kseriod     12  root  txt   unknown                                       /proc/12/exe
kondemand   13  root  cwd       DIR               3,11         688      2 /
kondemand   13  root  rtd       DIR               3,11         688      2 /
kondemand   13  root  txt   unknown                                       /proc/13/exe
pdflush     14  root  cwd       DIR               3,11         688      2 /
pdflush     14  root  rtd       DIR               3,11         688      2 /
pdflush     14  root  txt   unknown                                       /proc/14/exe
pdflush     15  root  cwd       DIR               3,11         688      2 /
pdflush     15  root  rtd       DIR               3,11         688      2 /
pdflush     15  root  txt   unknown                                       /proc/15/exe
kswapd0     16  root  cwd       DIR               3,11         688      2 /
kswapd0     16  root  rtd       DIR               3,11         688      2 /
kswapd0     16  root  txt   unknown                                       /proc/16/exe
aio/0       17  root  cwd       DIR               3,11         688      2 /
aio/0       17  root  rtd       DIR               3,11         688      2 /
aio/0       17  root  txt   unknown                                       /proc/17/exe
kpsmoused   18  root  cwd       DIR               3,11         688      2 /
kpsmoused   18  root  rtd       DIR               3,11         688      2 /
kpsmoused   18  root  txt   unknown                                       /proc/18/exe
ata/0       57  root  cwd       DIR               3,11         688      2 /
ata/0       57  root  rtd       DIR               3,11         688      2 /
ata/0       57  root  txt   unknown                                       /proc/57/exe
ata_aux     58  root  cwd       DIR               3,11         688      2 /
ata_aux     58  root  rtd       DIR               3,11         688      2 /
ata_aux     58  root  txt   unknown                                       /proc/58/exe
scsi_eh_0   60  root  cwd       DIR               3,11         688      2 /
scsi_eh_0   60  root  rtd       DIR               3,11         688      2 /
scsi_eh_0   60  root  txt   unknown                                       /proc/60/exe
scsi_eh_1   61  root  cwd       DIR               3,11         688      2 /
scsi_eh_1   61  root  rtd       DIR               3,11         688      2 /
scsi_eh_1   61  root  txt   unknown                                       /proc/61/exe
scsi_eh_2   76  root  cwd       DIR               3,11         688      2 /
scsi_eh_2   76  root  rtd       DIR               3,11         688      2 /
scsi_eh_2   76  root  txt   unknown                                       /proc/76/exe
ksuspend_  187  root  cwd       DIR               3,11         688      2 /
ksuspend_  187  root  rtd       DIR               3,11         688      2 /
ksuspend_  187  root  txt   unknown                                       /proc/187/exe
khubd      188  root  cwd       DIR               3,11         688      2 /
khubd      188  root  rtd       DIR               3,11         688      2 /
khubd      188  root  txt   unknown                                       /proc/188/exe
reiserfs/  554  root  cwd       DIR               3,11         688      2 /
reiserfs/  554  root  rtd       DIR               3,11         688      2 /
reiserfs/  554  root  txt   unknown                                       /proc/554/exe
udevd      627  root  cwd       DIR               3,11         688      2 /
udevd      627  root  rtd       DIR               3,11         688      2 /
udevd      627  root  txt       REG               3,11      101544  52774 /sbin/udevd
udevd      627  root  mem       REG               3,11       47784 220323 /lib64/libnss_files-2.9.so
udevd      627  root  mem       REG               3,11       43744  23292 /lib64/libnss_nis-2.9.so
udevd      627  root  mem       REG               3,11       89232 211603 /lib64/libnsl-2.9.so
udevd      627  root  mem       REG               3,11       31792 220058 /lib64/libnss_compat-2.9.so
udevd      627  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
udevd      627  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
udevd      627  root  mem       REG               3,11      113904  38855 /lib64/libselinux.so.1
udevd      627  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
udevd      627  root    0u      CHR                1,3         0t0   1830 /dev/null
udevd      627  root    1u      CHR                1,3         0t0   1830 /dev/null
udevd      627  root    2u      CHR                1,3         0t0   1830 /dev/null
udevd      627  root    3r      DIR               3,11        1440   3099 /etc/init.d/boot.d
udevd      627  root    4r      DIR               0,10           0      1 inotify
udevd      627  root    5u     unix 0xffff88003787fc00         0t0   1884 socket
udevd      627  root    6u     sock                0,4         0t0   1885 can't identify protocol
udevd      627  root    7r     FIFO                0,6         0t0   1886 pipe
udevd      627  root    8w     FIFO                0,6         0t0   1886 pipe
kgameport 1021  root  cwd       DIR               3,11         688      2 /
kgameport 1021  root  rtd       DIR               3,11         688      2 /
kgameport 1021  root  txt   unknown                                       /proc/1021/exe
khpsbpkt  1293  root  cwd       DIR               3,11         688      2 /
khpsbpkt  1293  root  rtd       DIR               3,11         688      2 /
khpsbpkt  1293  root  txt   unknown                                       /proc/1293/exe
knodemgrd 1310  root  cwd       DIR               3,11         688      2 /
knodemgrd 1310  root  rtd       DIR               3,11         688      2 /
knodemgrd 1310  root  txt   unknown                                       /proc/1310/exe
saa7133[0 1380  root  cwd       DIR               3,11         688      2 /
saa7133[0 1380  root  rtd       DIR               3,11         688      2 /
saa7133[0 1380  root  txt   unknown                                       /proc/1380/exe
kstriped  1464  root  cwd       DIR               3,11         688      2 /
kstriped  1464  root  rtd       DIR               3,11         688      2 /
kstriped  1464  root  txt   unknown                                       /proc/1464/exe
kdmflush  1479  root  cwd       DIR               3,11         688      2 /
kdmflush  1479  root  rtd       DIR               3,11         688      2 /
kdmflush  1479  root  txt   unknown                                       /proc/1479/exe
kdmflush  1488  root  cwd       DIR               3,11         688      2 /
kdmflush  1488  root  rtd       DIR               3,11         688      2 /
kdmflush  1488  root  txt   unknown                                       /proc/1488/exe
xfs_mru_c 1547  root  cwd       DIR               3,11         688      2 /
xfs_mru_c 1547  root  rtd       DIR               3,11         688      2 /
xfs_mru_c 1547  root  txt   unknown                                       /proc/1547/exe
xfslogd/0 1548  root  cwd       DIR               3,11         688      2 /
xfslogd/0 1548  root  rtd       DIR               3,11         688      2 /
xfslogd/0 1548  root  txt   unknown                                       /proc/1548/exe
xfsdatad/ 1549  root  cwd       DIR               3,11         688      2 /
xfsdatad/ 1549  root  rtd       DIR               3,11         688      2 /
xfsdatad/ 1549  root  txt   unknown                                       /proc/1549/exe
xfsbufd   1550  root  cwd       DIR               3,11         688      2 /
xfsbufd   1550  root  rtd       DIR               3,11         688      2 /
xfsbufd   1550  root  txt   unknown                                       /proc/1550/exe
xfsaild   1552  root  cwd       DIR               3,11         688      2 /
xfsaild   1552  root  rtd       DIR               3,11         688      2 /
xfsaild   1552  root  txt   unknown                                       /proc/1552/exe
xfssyncd  1553  root  cwd       DIR               3,11         688      2 /
xfssyncd  1553  root  rtd       DIR               3,11         688      2 /
xfssyncd  1553  root  txt   unknown                                       /proc/1553/exe
mount.ntf 1575  root  cwd       DIR               3,11         688      2 /
mount.ntf 1575  root  rtd       DIR               3,11         688      2 /
mount.ntf 1575  root  txt       REG               3,11       40400  41621 /bin/ntfs-3g
mount.ntf 1575  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
mount.ntf 1575  root  mem       REG               3,11      130284  23297 /lib64/libpthread-2.9.so
mount.ntf 1575  root  mem       REG               3,11      273120  56571 /lib64/libntfs-3g.so.40.0.0
mount.ntf 1575  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
mount.ntf 1575  root  mem       REG               3,11      256444 233499 /usr/lib/locale/it_IT.utf8/LC_CTYPE
mount.ntf 1575  root  mem       REG               3,11      952254 233500 /usr/lib/locale/it_IT.utf8/LC_COLLATE
mount.ntf 1575  root  mem       REG               3,11          54  31176 /usr/lib/locale/it_IT.utf8/LC_NUMERIC
mount.ntf 1575  root  mem       REG               3,11        2426 199831 /usr/lib/locale/it_IT.utf8/LC_TIME
mount.ntf 1575  root  mem       REG               3,11         294  31077 /usr/lib/locale/it_IT.utf8/LC_MONETARY
mount.ntf 1575  root  mem       REG               3,11          54 233494 /usr/lib/locale/it_IT.utf8/LC_MESSAGES/SYS_LC_MESSAGES
mount.ntf 1575  root  mem       REG               3,11          34  31219 /usr/lib/locale/it_IT.utf8/LC_PAPER
mount.ntf 1575  root  mem       REG               3,11          62 233491 /usr/lib/locale/it_IT.utf8/LC_NAME
mount.ntf 1575  root  mem       REG               3,11         127 220321 /usr/lib/locale/it_IT.utf8/LC_ADDRESS
mount.ntf 1575  root  mem       REG               3,11          49  31066 /usr/lib/locale/it_IT.utf8/LC_TELEPHONE
mount.ntf 1575  root  mem       REG               3,11          23  31223 /usr/lib/locale/it_IT.utf8/LC_MEASUREMENT
mount.ntf 1575  root  mem       REG               3,11       26050  30680 /usr/lib64/gconv/gconv-modules.cache
mount.ntf 1575  root  mem       REG               3,11         343  29123 /usr/lib/locale/it_IT.utf8/LC_IDENTIFICATION
mount.ntf 1575  root    0u      CHR                1,3         0t0   1830 /dev/null
mount.ntf 1575  root    1u      CHR                1,3         0t0   1830 /dev/null
mount.ntf 1575  root    2u      CHR                1,3         0t0   1830 /dev/null
mount.ntf 1575  root    3r      DIR               3,11        1440   3099 /etc/init.d/boot.d
mount.ntf 1575  root    4u      BLK                3,5 0x27115f400   1487 /dev/hda5
mount.ntf 1575  root    5u      CHR             10,229         0t0   5152 /dev/fuse
xfsbufd   1576  root  cwd       DIR               3,11         688      2 /
xfsbufd   1576  root  rtd       DIR               3,11         688      2 /
xfsbufd   1576  root  txt   unknown                                       /proc/1576/exe
xfsaild   1577  root  cwd       DIR               3,11         688      2 /
xfsaild   1577  root  rtd       DIR               3,11         688      2 /
xfsaild   1577  root  txt   unknown                                       /proc/1577/exe
xfssyncd  1578  root  cwd       DIR               3,11         688      2 /
xfssyncd  1578  root  rtd       DIR               3,11         688      2 /
xfssyncd  1578  root  txt   unknown                                       /proc/1578/exe
console-k 2062  root  cwd       DIR               3,11         688      2 /
console-k 2062  root  rtd       DIR               3,11         688      2 /
console-k 2062  root  txt       REG               3,11      140224 154371 /usr/sbin/console-kit-daemon
console-k 2062  root  mem       REG               3,11       96744 263920 /lib64/libgcc_s.so.1
console-k 2062  root  mem       REG               3,11      170240 220093 /lib64/libexpat.so.1.5.2
console-k 2062  root  mem       REG               3,11      194816  39927 /usr/lib64/libpcre.so.0.0.1
console-k 2062  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
console-k 2062  root  mem       REG               3,11      130284  23297 /lib64/libpthread-2.9.so
console-k 2062  root  mem       REG               3,11      106040 243740 /usr/lib64/libpolkit.so.2.0.0
console-k 2062  root  mem       REG               3,11      803112 186719 /usr/lib64/libglib-2.0.so.0.1800.2
console-k 2062  root  mem       REG               3,11       36008 220367 /lib64/librt-2.9.so
console-k 2062  root  mem       REG               3,11       18984  38389 /usr/lib64/libgthread-2.0.so.0.1800.2
console-k 2062  root  mem       REG               3,11      277928 186965 /usr/lib64/libgobject-2.0.so.0.1800.2
console-k 2062  root  mem       REG               3,11      253488 138799 /lib64/libdbus-1.so.3.4.0
console-k 2062  root  mem       REG               3,11       89232 211603 /lib64/libnsl-2.9.so
console-k 2062  root  mem       REG               3,11      135848 105580 /usr/lib64/libdbus-glib-1.so.2.1.0
console-k 2062  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
console-k 2062  root    0u      CHR                1,3         0t0   1830 /dev/null
console-k 2062  root    1u      CHR                1,3         0t0   1830 /dev/null
console-k 2062  root    2u      CHR                1,3         0t0   1830 /dev/null
console-k 2062  root    3r      DIR               3,11        2520   3105 /etc/init.d/rc5.d
console-k 2062  root    4r     FIFO                0,6         0t0   5888 pipe
console-k 2062  root    5u      CHR                1,3         0t0   1830 /dev/null
console-k 2062  root    6r      DIR               0,10           0      1 inotify
console-k 2062  root    7w     FIFO                0,6         0t0   5888 pipe
console-k 2062  root    8r     FIFO                0,6         0t0   5889 pipe
console-k 2062  root    9w     FIFO                0,6         0t0   5889 pipe
console-k 2062  root   12r      DIR               0,10           0      1 inotify
console-k 2062  root   14r      DIR               3,11          88  76582 /etc/ConsoleKit/run-session.d
console-k 2062  root   15r     FIFO                0,6         0t0  11086 pipe
console-k 2062  root   16w     FIFO                0,6         0t0  11086 pipe
console-k 2062  root   17r      DIR               3,11          48  76592 /usr/lib/ConsoleKit/run-session.d
console-k 2062  root   18r      DIR               3,11          88  76582 /etc/ConsoleKit/run-session.d
console-k 2062  root   19r      DIR               3,11          48  76592 /usr/lib/ConsoleKit/run-session.d
console-k 2062  root   20r      DIR               3,11          88  76582 /etc/ConsoleKit/run-session.d
console-k 2062  root   21r      DIR               3,11          48  76592 /usr/lib/ConsoleKit/run-session.d
console-k 2062  root   22r      DIR               3,11          88  76582 /etc/ConsoleKit/run-session.d
console-k 2062  root   23r      DIR               3,11          48  76592 /usr/lib/ConsoleKit/run-session.d
kauditd   2720  root  cwd       DIR               3,11         688      2 /
kauditd   2720  root  rtd       DIR               3,11         688      2 /
kauditd   2720  root  txt   unknown                                       /proc/2720/exe
gam_serve 3978 diego  cwd       DIR               3,12       12288    131 /home/diego
gam_serve 3978 diego  rtd       DIR               3,11         688      2 /
gam_serve 3978 diego  txt       REG               3,11      343918  92261 /usr/lib64/gam_server
gam_serve 3978 diego  mem       REG               3,11      194816  39927 /usr/lib64/libpcre.so.0.0.1
gam_serve 3978 diego  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
gam_serve 3978 diego  mem       REG               3,11      803112 186719 /usr/lib64/libglib-2.0.so.0.1800.2
gam_serve 3978 diego  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
gam_serve 3978 diego  mem       REG               3,11      217016 199483 /var/run/nscd/passwd
gam_serve 3978 diego  mem       REG               3,11       26050  30680 /usr/lib64/gconv/gconv-modules.cache
gam_serve 3978 diego    0r      CHR                1,3         0t0   1830 /dev/null
gam_serve 3978 diego    1w      CHR                1,3         0t0   1830 /dev/null
gam_serve 3978 diego    2w      CHR                1,3         0t0   1830 /dev/null
gam_serve 3978 diego    3r      DIR               0,10           0      1 inotify
gam_serve 3978 diego    4u     unix 0xffff88006b0759c0         0t0  12136 socket
gam_serve 3978 diego    5r     FIFO                0,6         0t0  12137 pipe
gam_serve 3978 diego    6w     FIFO                0,6         0t0  12137 pipe
em28xx-wo 4036  root  cwd       DIR               3,11         688      2 /
em28xx-wo 4036  root  rtd       DIR               3,11         688      2 /
em28xx-wo 4036  root  txt   unknown                                       /proc/4036/exe
rc        4885  root  cwd       DIR               3,11         688      2 /
rc        4885  root  rtd       DIR               3,11         688      2 /
rc        4885  root  txt       REG               3,11      715072 215022 /bin/bash
rc        4885  root  mem       REG               3,11      293936 188701 /lib64/libncurses.so.5.6
rc        4885  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
rc        4885  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
rc        4885  root  mem       REG               3,11      263568  35119 /lib64/libreadline.so.5.2
rc        4885  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
rc        4885  root    0u      CHR                5,1         0t0   1790 /dev/console
rc        4885  root    1u      CHR                5,1         0t0   1790 /dev/console
rc        4885  root    2u      CHR                5,1         0t0   1790 /dev/console
rc        4885  root  255r      REG               3,11        9374 131764 /etc/init.d/rc
S01halt   5796  root  cwd       DIR               3,11         688      2 /
S01halt   5796  root  rtd       DIR               3,11         688      2 /
S01halt   5796  root  txt       REG               3,11      715072 215022 /bin/bash
S01halt   5796  root  mem       REG               3,11      293936 188701 /lib64/libncurses.so.5.6
S01halt   5796  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
S01halt   5796  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
S01halt   5796  root  mem       REG               3,11      263568  35119 /lib64/libreadline.so.5.2
S01halt   5796  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
S01halt   5796  root    0u      CHR                5,1         0t0   1790 /dev/console
S01halt   5796  root    1u      CHR                5,1         0t0   1790 /dev/console
S01halt   5796  root    2u      CHR                5,1         0t0   1790 /dev/console
S01halt   5796  root    3r     FIFO                0,6         0t0  19698 pipe
S01halt   5796  root  255r      REG               3,11        5994 131759 /etc/init.d/halt
startpar  5829  root  cwd       DIR               3,11         688      2 /
startpar  5829  root  rtd       DIR               3,11         688      2 /
startpar  5829  root  txt       REG               3,11       27464  47214 /sbin/startpar
startpar  5829  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
startpar  5829  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
startpar  5829  root    0u      CHR                5,1         0t0   1790 /dev/console
startpar  5829  root    1w     FIFO                0,6         0t0  19698 pipe
startpar  5829  root    2u      CHR                5,1         0t0   1790 /dev/console
startpar  5829  root    3r      DIR               3,11        1440   3099 /etc/init.d/boot.d
startpar  5829  root    4r     FIFO                0,6         0t0  19699 pipe
startpar  5829  root    5w     FIFO                0,6         0t0  19699 pipe
startpar  5829  root    6u      CHR                5,2         0t0   1832 /dev/ptmx
boot.loca 5970  root  cwd       DIR               3,11         688      2 /
boot.loca 5970  root  rtd       DIR               3,11         688      2 /
boot.loca 5970  root  txt       REG               3,11      715072 215022 /bin/bash
boot.loca 5970  root  mem       REG               3,11      293936 188701 /lib64/libncurses.so.5.6
boot.loca 5970  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
boot.loca 5970  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
boot.loca 5970  root  mem       REG               3,11      263568  35119 /lib64/libreadline.so.5.2
boot.loca 5970  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
boot.loca 5970  root    0u      CHR                5,1         0t0   1790 /dev/console
boot.loca 5970  root    1u      CHR              136,0         0t0      2 /dev/pts/0
boot.loca 5970  root    2u      CHR              136,0         0t0      2 /dev/pts/0
boot.loca 5970  root    3r      DIR               3,11        1440   3099 /etc/init.d/boot.d
boot.loca 5970  root  255r      REG               3,11        9778 230893 /etc/init.d/boot.localfs
lsof      5980  root  cwd       DIR               3,11         688      2 /
lsof      5980  root  rtd       DIR               3,11         688      2 /
lsof      5980  root  txt       REG               3,11      127416  35202 /usr/bin/lsof
lsof      5980  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
lsof      5980  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
lsof      5980  root  mem       REG               3,11      113904  38855 /lib64/libselinux.so.1
lsof      5980  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
lsof      5980  root    0u      CHR                5,1         0t0   1790 /dev/console
lsof      5980  root    1w      REG               3,11       33932  88408 /var/log/boot.fail.msg
lsof      5980  root    2u      CHR              136,0         0t0      2 /dev/pts/0
lsof      5980  root    3r      DIR                0,3           0      1 /proc
lsof      5980  root    4r      DIR                0,3           0  19868 /proc/5980/fd
lsof      5980  root    5w     FIFO                0,6         0t0  19872 pipe
lsof      5980  root    6r     FIFO                0,6         0t0  19873 pipe
lsof      5981  root  cwd       DIR               3,11         688      2 /
lsof      5981  root  rtd       DIR               3,11         688      2 /
lsof      5981  root  txt       REG               3,11      127416  35202 /usr/bin/lsof
lsof      5981  root  mem       REG               3,11       14872 211599 /lib64/libdl-2.9.so
lsof      5981  root  mem       REG               3,11     1406248  23271 /lib64/libc-2.9.so
lsof      5981  root  mem       REG               3,11      113904  38855 /lib64/libselinux.so.1
lsof      5981  root  mem       REG               3,11      127896  23264 /lib64/ld-2.9.so
lsof      5981  root    4r     FIFO                0,6         0t0  19872 pipe
lsof      5981  root    7w     FIFO                0,6         0t0  19873 pipe
------
lsmod
----
Module                  Size  Used by
zl10353                 8368  1 
em28xx_dvb             20092  0 
dvb_core               87948  1 em28xx_dvb
em28xx_audio            9036  0 
tuner_xc3028            6264  1 
tvp5150                18712  0 
em28xx                413988  2 em28xx_dvb,em28xx_audio
ip6t_LOG                7180  7 
xt_tcpudp               3608  2 
xt_pkttype              2152  3 
ipt_LOG                 6812  11 
xt_limit                3180  18 
binfmt_misc            10260  1 
ip6t_REJECT             6024  3 
nf_conntrack_ipv6      24840  4 
ip6table_raw            2456  1 
xt_NOTRACK              2152  4 
ipt_REJECT              3480  3 
xt_state                2568  14 
iptable_raw             2760  1 
iptable_filter          3400  1 
ip6table_mangle         3128  0 
nf_conntrack_netbios_ns     2840  0 
nf_conntrack_ipv4      12792  10 
nf_conntrack           80480  5 nf_conntrack_ipv6,xt_NOTRACK,xt_state,nf_conntrack_netbios_ns,nf_conntrack_ipv4
ip_tables              19464  2 iptable_raw,iptable_filter
ip6table_filter         3240  1 
ip6_tables             21048  4 ip6t_LOG,ip6table_raw,ip6table_mangle,ip6table_filter
x_tables               23376  11 ip6t_LOG,xt_tcpudp,xt_pkttype,ipt_LOG,xt_limit,ip6t_REJECT,xt_NOTRACK,ipt_REJECT,xt_state,ip_tables,ip6_tables
ipv6                  293608  11 ip6t_REJECT,nf_conntrack_ipv6,ip6table_mangle
cpufreq_conservative     8272  0 
cpufreq_userspace       4204  0 
cpufreq_powersave       2248  0 
powernow_k8            15580  0 
fuse                   61088  2 
nls_iso8859_1           5352  5 
nls_cp437               7064  5 
vfat                   11864  5 
fat                    54376  1 vfat
xfs                   545312  2 
loop                   17924  0 
dm_mod                 73952  5 
saa7134_alsa           14464  0 
tda827x                10892  1 
tda8290                14956  1 
tuner                  26220  0 
saa7134               158020  1 saa7134_alsa
sg                     35344  0 
osst                   52928  0 
ir_common              43340  1 saa7134
compat_ioctl32          8536  1 saa7134
videodev               35328  4 em28xx,tuner,saa7134,compat_ioctl32
st                     38892  2 
v4l1_compat            14220  2 em28xx,videodev
ohci1394               31380  0 
v4l2_common            12600  2 tuner,saa7134
videobuf_dma_sg        14332  2 saa7134_alsa,saa7134
videobuf_core          20748  2 saa7134,videobuf_dma_sg
ieee1394               98880  1 ohci1394
tveeprom               13708  1 saa7134
nvidia               5662024  0 
rtc_cmos               13960  0 
snd_pcm                95440  2 em28xx_audio,saa7134_alsa
ppdev                   8208  0 
isp1760                20776  0 
shpchp                 32244  0 
rtc_core               22420  1 rtc_cmos
snd_timer              26664  1 snd_pcm
ide_cd_mod             33984  0 
pci_hotplug            31864  1 shpchp
button                  8328  0 
rtc_lib                 3560  1 rtc_core
parport_pc             40392  0 
ns558                   6264  0 
snd                    74632  4 em28xx_audio,saa7134_alsa,snd_pcm,snd_timer
gameport               13640  2 ns558
cdrom                  36200  1 ide_cd_mod
i2c_nforce2             8624  0 
parport                41568  2 ppdev,parport_pc
forcedeth              60312  0 
k8temp                  5352  0 
pcspkr                  3064  0 
snd_page_alloc          9816  1 snd_pcm
i2c_core               35280  12 zl10353,tuner_xc3028,tvp5150,em28xx,tda827x,tda8290,tuner,saa7134,v4l2_common,tveeprom,nvidia,i2c_nforce2
soundcore               8816  1 snd
floppy                 63240  0 
ide_disk               14872  14 
ehci_hcd               55348  0 
ohci_hcd               36548  0 
usbcore               198656  7 em28xx_dvb,em28xx_audio,em28xx,isp1760,ehci_hcd,ohci_hcd
advansys               79600  0 
edd                    10272  0 
reiserfs              241392  4 
fan                     6016  0 
ide_pci_generic         4652  0 
ata_generic             6044  0 
pata_amd               13692  0 
sata_nv                26480  0 
libata                183376  3 ata_generic,pata_amd,sata_nv
scsi_mod              179144  5 sg,osst,st,advansys,libata
dock                   14564  1 libata
amd74xx                 7152  12 
ide_core              118012  4 ide_cd_mod,ide_disk,ide_pci_generic,amd74xx
thermal                24232  0 
processor              49904  2 powernow_k8,thermal
thermal_sys            14336  3 fan,thermal,processor
hwmon                   4040  2 k8temp,thermal_sys
------------END-------
Comment 38 Sven Zielke 2009-01-23 10:10:15 UTC
Hello Diego,

it should be "lsof | grep /home" instead of plain "lsof", to shorten the log to the most important entries. Nevertheless, the gamin server is blocking the unmount of your /home. That was my problem, too, before I installed these packages.
"gam_serve 3978 diego  cwd       DIR               3,12       12288    131 /home/diego"
I cannot say why this still happens to you; maybe Dr. Werner Fink has an idea.
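
A plain "grep /home" also matches paths that merely contain "/home" somewhere. A slightly stricter filter can be sketched as below; the helper name holders_of is hypothetical, and it assumes the NAME column is the last field and contains no spaces.

```shell
# Print COMMAND, PID, USER and NAME for every lsof line whose NAME
# is the given mount point or a path below it.
# Usage: lsof | holders_of /home
holders_of() {
    awk -v mp="$1" '
        # NAME is assumed to be the last whitespace-separated column.
        $NF == mp || index($NF, mp "/") == 1 { print $1, $2, $3, $NF }
    '
}
```

Applied to the lsof dump above, this keeps only the gam_server line with cwd /home/diego.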
Comment 39 Dr. Werner Fink 2009-01-23 10:40:33 UTC
IMHO this has nothing to do with the HOME partition. It looks like

        mkill -TERM

stops a daemon or service process which shouldn't be stopped on
Diego's system.  Diego?  Could you please change the above line
into

        strace mkill -TERM

then we may see which user space process is the problem.  Besides
this, it seems that you're using NTFS file systems together with the
ntfs-3g tools from gnome.  And the mount.ntfs processes seem to hang
around as daemons?

As a possible solution you could add

         ntfs,ntfs-3g

to the line

         typeset -r tmpfs=tmpfs,ramfs,hugetlbfs,mqueue

of /etc/init.d/boot.localfs
Comment 40 Diego Ercolani 2009-01-23 15:23:45 UTC
Created attachment 267289 [details]
"debug" patch for boot.localfs

I rewrote the patch to gather more information; here it is.
Comment 41 Diego Ercolani 2009-01-23 15:29:34 UTC
Created attachment 267291 [details]
dump generated by attachment #267289 [details]

This is the log file generated by attachment #267289.
Next to the date I wrote "ok" if the dump refers to a successful shutdown
and "failed" if it refers to a frozen shutdown.

The dump comes from another machine, a fresh 11.1 install with #266174 and #266173.
Comment 42 Dr. Werner Fink 2009-01-23 15:36:46 UTC
Have you tried adding ntfs,ntfs-3g to tmpfs, like

     typeset -r tmpfs=tmpfs,ramfs,hugetlbfs,mqueue,ntfs,ntfs-3g

in /etc/init.d/boot.localfs?
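
For anyone who prefers to script this change rather than edit the file by hand, here is a minimal sketch. The helper name add_fs_types is hypothetical; it assumes GNU sed (for -i) and that the file contains the "typeset -r tmpfs=..." line on a line of its own. A copy of the file is kept with the suffix .orig so the change can be reverted.

```shell
# Append extra file system types to the "typeset -r tmpfs=..." line.
# Usage: add_fs_types FILE TYPE[,TYPE...]
add_fs_types() {
    file=$1
    extra=$2
    # Do nothing if the types were already appended.
    grep -q "^[[:space:]]*typeset -r tmpfs=.*,$extra\$" "$file" && return 0
    # Keep a backup of the unmodified file.
    cp -p "$file" "$file.orig" || return 1
    # Extend only the line that defines the tmpfs list.
    sed -i "s/^\([[:space:]]*typeset -r tmpfs=[^[:space:]]*\)[[:space:]]*\$/\1,$extra/" "$file"
}
```

For example: add_fs_types /etc/init.d/boot.localfs ntfs,ntfs-3g (as root).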
Comment 43 Diego Ercolani 2009-01-23 15:38:56 UTC
I'll do it this weekend. This is another PC.
Comment 44 Diego Ercolani 2009-01-23 15:44:21 UTC
I forgot to mention two things about my last attachment.
As you can see in my patch, I mark the end of the strace with the tag
"------------END 2st phase-------". For the successful shutdown it correctly appears in the log file (strangely, because of mkill itself); for the failed shutdown it does not appear. So it seems that mkill is what freezes, especially since the debug line that "set -x" should show does not appear on the console either.
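
The marker technique described here can be sketched as a small wrapper. The function name run_marked and the exact marker text are hypothetical and not taken from the attached patch. The point is that a BEGIN marker without a matching END marker in the log shows that the wrapped command never returned.

```shell
# Run a command with BEGIN/END markers written around its output.
# Usage: run_marked LOGFILE LABEL COMMAND [ARGS...]
run_marked() {
    log=$1
    label=$2
    shift 2
    echo "------------BEGIN $label-------" >> "$log"
    # Flush to disk so the BEGIN marker survives a freeze.
    sync
    status=0
    "$@" >> "$log" 2>&1 || status=$?
    echo "------------END $label (exit $status)-------" >> "$log"
    sync
    return $status
}
```

For example: run_marked /var/log/boot.fail.msg lsmod lsmod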
Comment 45 Dr. Werner Fink 2009-01-23 15:50:18 UTC
Nothing strange ... if mkill terminates a user space daemon which
serves an NTFS file system as a user space driver, the system hangs.
Comment 46 Diego Ercolani 2009-01-23 15:59:44 UTC
But, as you can see, on this computer (attachment #267291 [details]) I don't have any NTFS file system.
Comment 47 Dr. Werner Fink 2009-01-26 10:21:10 UTC
In comment #37 I see a running user space daemon /bin/ntfs-3g .. maybe this
is not for an NTFS but for a vfat. Nevertheless it is a user space daemon
which provides a driver for Windows file systems.  And this is killed
by mkill (which should not happen here).
Comment 48 Diego Ercolani 2009-01-26 10:57:38 UTC
Yes, on that machine I added the modifications you asked for in comment #39. But regarding my comment #41, I should repeat that the machine in comment #41 does not have any vfat/ntfs partition.
Comment 49 Dr. Werner Fink 2009-01-26 11:12:05 UTC
There is also a user space daemon gvfs-fuse which provides a file system
driver in user space.
Comment 50 Dr. Werner Fink 2009-01-26 11:23:21 UTC
Please provide the full content of the pid directory of such
a gvfs-fuse and ntfs-3g daemon (you have to be root to do this),
e.g. with this

 for p in $(fuser /dev/fuse 2>/dev/null); do
    find /proc/$p -maxdepth 2 -type l -ls
 done

This should find all processes which provide user space file
system drivers.
Comment 51 Diego Ercolani 2009-01-26 12:00:06 UTC
Yes, gvfs uses fuse... but honestly I don't know what installed gvfs, as I use KDE:
gvfs-fuse-daemon on /home/diego/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=diego)

 70931    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/0 -> /dev/null
 70932    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/1 -> /dev/null
 70933    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/2 -> /dev/null
 70934    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/3 -> /dev/fuse
 70935    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/4 -> socket:[14916]
 70936    0 lr-x------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/5 -> pipe:[14918]
 70937    0 l-wx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/6 -> pipe:[14918]
 70938    0 lrwx------   1 diego users       64 Jan 26 12:30 /proc/4906/fd/7 -> socket:[14919]
 70929    0 lrwxrwxrwx   1 diego users        0 Jan 26 12:30 /proc/4906/cwd -> /
 70928    0 lrwxrwxrwx   1 diego users        0 Jan 26 12:30 /proc/4906/root -> /
 22139    0 lrwxrwxrwx   1 diego users        0 Jan 26 09:15 /proc/4906/exe -> /usr/lib64/gvfs/gvfs-fuse-daemon
Comment 52 Dr. Werner Fink 2009-01-26 12:09:13 UTC
Created attachment 267579 [details]
sysvinit-2.86-186.14.i586.rpm

sysvinit with mkill which skips all processes using /dev/fuse
Comment 53 Dr. Werner Fink 2009-01-26 12:10:32 UTC
Created attachment 267581 [details]
sysvinit-2.86-186.14.x86_64.rpm

sysvinit with mkill which skips all processes using /dev/fuse
Comment 54 Dr. Werner Fink 2009-01-26 12:13:09 UTC
I've modified and also verified that the mkill(8) utility from the
sysvinit of attachment #267579 and attachment #267581 will
not touch the user space daemons providing a file system driver.
Please test whether this helps in your case.
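
The check itself, "does this process hold /dev/fuse open?", can be illustrated in shell as follows. This is only a sketch of the idea and not the code in mkill(8). The function name pids_using and the optional PROC_ROOT argument are hypothetical. Reading other users' /proc/PID/fd entries requires root.

```shell
# List the PIDs that hold an open file descriptor on a device node.
# Usage: pids_using DEVICE [PROC_ROOT]
pids_using() {
    dev=$1
    root=${2:-/proc}
    for fd in "$root"/[0-9]*/fd/*; do
        # Every entry below /proc/PID/fd is a symbolic link.
        [ -L "$fd" ] || continue
        if [ "$(readlink "$fd")" = "$dev" ]; then
            pid=${fd#"$root"/}
            echo "${pid%%/*}"
        fi
    done | sort -un
}
```

Running pids_using /dev/fuse prints the processes a kill loop would have to leave alone.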
Comment 55 Diego Ercolani 2009-01-26 13:36:01 UTC
Shall I remove the comment #42 modifications after applying new sysvinit?
Comment 56 Dr. Werner Fink 2009-01-26 13:40:27 UTC
yes
Comment 59 Diego Ercolani 2009-01-28 09:11:34 UTC
I think we caught the problem. It seems the workstations are shutting down correctly.
Comment 60 Dr. Werner Fink 2009-01-28 12:23:00 UTC
OK strike.  Anja, we need a SWAMPID for this to get out updates for
aaa_base and sysvinit for openSUSE 11.1.  This is because all users of
file systems which are driven by user space daemons are affected by
this problem.
Comment 61 Dirk Mueller 2009-01-28 13:27:41 UTC
I agree that we want to fix this for 11.1. I read comment #57 as "submit to factory first", as an additional measure to protect against regressions.
Comment 62 Dr. Werner Fink 2009-01-28 13:33:21 UTC
Both aaa_base and sysvinit are submitted to SLES11 and Factory ;)
Comment 63 Dirk Mueller 2009-01-28 14:20:24 UTC
Added to the planned updates. If there are no regressions, let's revisit this and push out the update in the middle of next month.
Comment 64 Swamp Workflow Management 2009-01-28 16:02:16 UTC
The SWAMPID for this issue is 22283.
Please submit the patch and patchinfo file using this ID.
(https://swamp.suse.de/webswamp/wf/22283)
Comment 65 Dr. Werner Fink 2009-01-28 16:21:12 UTC
I've submitted both aaa_base *and* sysvinit as *both* packages are required.

This includes also the fixes for several bugs for aaa_base bug #426270,
bug #463477, bug #466718, bug #458940, bug #463175, bug #457093, bug #457984,
bug #422010, bug #445646, bug #441053, and bug #442753 ...
Comment 66 andreas bittner 2009-01-30 00:41:09 UTC
When will these patches land on online_update?

My system, upgraded from 11.0 to 11.1, is still affected on every reboot cycle, with loads of fsck runs on boot.

thanks and regards.
Comment 67 Diego Ercolani 2009-02-01 12:19:22 UTC
Created attachment 269096 [details]
shutdown freeze log

I'm sorry to report that we have the issue again... I left the log capture active during shutdown, and again mkill froze....
Unfortunately I don't have the complete log of the failed shutdown,
only the "mount", "lsmod" and "lsof" output taken before the stop procedure in /etc/init.d/boot.localfs. On the console, the procedure again stopped during the mkill invocation.
Comment 68 Ruediger Oertel 2009-02-16 11:56:51 UTC
*** Bug 465029 has been marked as a duplicate of this bug. ***
Comment 69 Swamp Workflow Management 2009-02-16 12:47:09 UTC
Update released for: aaa_base, sysvinit
Products:
openSUSE 11.1 (debug, i586, ppc, x86_64)
Comment 70 Swamp Workflow Management 2009-02-16 13:07:12 UTC
The SWAMPID for this issue is 22528.
Please submit the patch and patchinfo file using this ID.
(https://swamp.suse.de/webswamp/wf/22528)
Comment 71 Dr. Werner Fink 2009-02-16 13:10:59 UTC
Diego: Please check if the packages sysvinit and aaa_base are up to date.
The last changelog entry of sysvinit is:

 Fri Feb  6 00:36:27 CET 2009 - ro@suse.de
 - fix build (move static int loop before first usage) 

 Tue Jan 27 16:00:03 CET 2009 - werner@suse.de
 - Do not terminate udevd with mkill(8)
 - Do not terminate udevd with killall5(8)
 - Avoid chrashing startpar due recursion caused by loops

 Mon Jan 26 12:02:43 CET 2009 - werner@suse.de
 - Do not kill fuse user space processes with mkill(8) (bnc#466484)
 - Minimize fuse patch for killall5(8) by using readlinkat(2)

and those of aaa_base

 Mon Jan 26 11:25:36 CET 2009 - coolo@suse.de
 - removing the timeout, there is no good timeout value (bnc#426270)

 Fri Jan 23 12:19:31 CET 2009 - coolo@suse.de
 - wait for udev to settle the modprobe events (bnc#426270)

... AFAICS these are the latest changes for openSuSE 11.1.

If mkill still hangs with these changes, then the order of the mounts
could be wrong.  With the current mkill the udev daemon process and
all processes using /dev/fuse for user space driven file systems
are not touched anymore.  This applies to process 1685 listed in your log,
which serves the file system for /dev/hda5 aka /windows/F.
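
The rule that processes holding /dev/fuse are left alone can be pictured
with a small shell sketch.  This is only an illustration and not the mkill
source: the function name has_fuse_open and its optional second argument
(an alternative proc root, handy for trying it on a fake tree) are made up
here.  It resolves each entry below /proc/PID/fd with readlink(1) and
reports whether one of them points at /dev/fuse.

```shell
# Illustrative sketch only; has_fuse_open is a hypothetical helper,
# not part of sysvinit.
# Returns 0 if the process has /dev/fuse open, 1 otherwise.
#   $1: PID to inspect
#   $2: proc root (assumed default /proc; overridable for experiments)
has_fuse_open() {
    local pid="$1" proc="${2:-/proc}" fd
    for fd in "$proc/$pid"/fd/*; do
        # Each fd entry is a symlink to the file the process has open.
        if [ "$(readlink "$fd" 2>/dev/null)" = /dev/fuse ]; then
            return 0
        fi
    done
    return 1
}
```

With the log above, something like `has_fuse_open 1685 && echo "skip"` would
be expected to print "skip" for the process serving /windows/F.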
Comment 72 andreas bittner 2009-02-17 09:29:24 UTC
my 11.1 x86 with the latest online_update patches applied doesn't suffer from unclean shutdowns and fsck sessions during startup any more.

the bug seems to be gone now.
thank you.

at last.

p.s. how about better quality control, learning from past mistakes and trying to avoid such new bugs that get introduced with each new opensuse release or even intermediate patches during lifecycle. thanks.
Comment 73 Dr. Werner Fink 2009-02-17 10:31:59 UTC
Andreas (Jaeger)? ... the last question is for you.

Andreas (Bittner)? ... were you a beta tester?
Such bugs can only be detected with the help of beta testers with
various and exotic system setups.

Diego?  Have you verified which versions of aaa_base and
sysvinit are installed on your system?  Besides this, the
command line

     mkill -0 /windows/* | xargs ps -x

will show you all processes which keep your windows mounts
busy.
Comment 74 Andreas Jaeger 2009-02-17 14:06:47 UTC
Werner, andreas: Please discuss the quality of openSUSE on the opensuse-testing or opensuse-factory mailing lists.  Ideas how the openSUSE community can do better are appreciated.
Comment 75 Dr. Werner Fink 2009-02-20 15:54:29 UTC
Diego? Please read comment #73
Comment 76 Diego Ercolani 2009-02-20 16:15:24 UTC
Sorry:
sysvinit-2.86-186.15.1
aaa_base-11.1-10007.12.1

Out of about 20 shutdowns I had a couple of freezes.

As for the mkill -0 command, I ran it on the fuse mounted filesystems (~/.gvfs and /sys/fs/fuse/connections), but it seems no process is using them.

On this machine (which froze once today during the shutdown procedure) I don't have any windows partition.
Comment 77 Dr. Werner Fink 2009-02-20 16:21:40 UTC
Then please add the boot file messages of this system ... this is because
the log in attachment  #269096 [details] definitely shows windows partitions:

 /dev/hda11 on / type reiserfs (rw)
 /proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw)
 debugfs on /sys/kernel/debug type debugfs (rw)
 udev on /dev type tmpfs (rw)
 devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
 /dev/hda12 on /home type xfs (rw)
 /dev/hda8 on /data1 type reiserfs (rw)
 /dev/hda7 on /suse10.2 type reiserfs (rw)
 /dev/hda1 on /windows/C type vfat  (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
 /dev/hda2 on /windows/D type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
 /dev/hda3 on /windows/E type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
 /dev/hda5 on /windows/F type fuseblk (rw,noexec,nosuid,nodev,allow_other,default_permissions,blksize=4096)
 /dev/hda9 on /windows/G type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
 /dev/hda10 on /windows/H type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
 /dev/mapper/dati-multimedia on /mnt/hdb1 type xfs (rw,noexec,nosuid,nodev)
 /dev/mapper/dati-distribuzione on /mnt/hdb2 type reiserfs (rw,noexec,nosuid,nodev)
 securityfs on /sys/kernel/security type securityfs (rw)
 none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

and the process 1685 (mount.ntf) is providing /windows/F.
Comment 78 Diego Ercolani 2009-02-21 00:36:47 UTC
Yes, the dump you've shown is from another machine where vfat and ntfs filesystems are mounted (that system is also an OpenSuSE 11.1 x86_64 with the same patch level for sysvinit and aaa_base):

/dev/hda1 on /windows/C type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda2 on /windows/D type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda3 on /windows/E type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda5 on /windows/F type fuseblk (rw,noexec,nosuid,nodev,allow_other,default_permissions,blksize=4096)
/dev/hda9 on /windows/G type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)
/dev/hda10 on /windows/H type vfat (rw,noexec,nosuid,nodev,gid=100,umask=0002,utf8=true)

Even on that system,
mkill -0 /windows/*, mkill -0 ~/.gvfs and mkill -0 /sys/fs/fuse/connections don't return anything.
Comment 79 Dr. Werner Fink 2009-02-23 12:28:24 UTC
Strange .. then please add strace before the mkill which freezes on your
system; the line

        mkill -TERM $ulist

in /etc/init.d/boot.localfs becomes

        strace -s 80 mkill -TERM $ulist

I'd like to see the famous last few lines of strace.
Comment 80 Diego Ercolani 2009-02-23 12:57:03 UTC
I reapplied the "debug patch" #267289, with the strace addition and the mkill -0 on top.
The next time a dirty shutdown happens I'll post the dump.
Comment 81 Dr. Werner Fink 2009-02-23 12:59:38 UTC
What does `dirty shutdown' exactly mean?
Comment 82 Diego Ercolani 2009-02-23 13:17:17 UTC
Sorry, it's only because when the shutdown procedure hangs, I have to force a machine shutdown by pressing and holding the PC power button.
Comment 83 Diego Ercolani 2009-02-23 13:19:00 UTC
Created attachment 274659 [details]
"debug" patch for boot.localfs
Comment 84 Diego Ercolani 2009-02-24 12:45:37 UTC
Created attachment 274963 [details]
last shutdown hangup

This is the log of the last shutdown hang.

I have a small question about the shutdown procedure:
1. What happens if I mount (outside the fstab) a windows share: mount -t cifs -o username=name //cifsserver.domain/share /mnt/sambamount
2. What happens if I mount a device over another device:
   mount /dev/mapper/dati-samba /mnt/samba
   mount /dev/mapper/anotherthing /mnt/samba

Are these exceptions to what is handled by boot.localfs?
Comment 85 Dr. Werner Fink 2009-02-24 13:02:00 UTC
In the normal case, 1) should be handled, as /etc/init.d/boot.localfs
knows about cifs because it is part of the netfs variable.  To avoid
a deadlock, such file systems are ignored.

The second case 2) does not seem to be the problem, but it could be a problem
when you mount local devices into a remote file system ... I've no
clue what happens here ... but the mtab shows:

 /dev/sda5 on / type reiserfs (rw,acl,user_xattr)
 /proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw)
 debugfs on /sys/kernel/debug type debugfs (rw)
 udev on /dev type tmpfs (rw)
 devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
 /dev/mapper/dati-home on /home type xfs (rw)
 /dev/mapper/dati-samba on /mnt/samba type xfs (rw)
 /dev/mapper/dati-vmware on /mnt/vmware type reiserfs (rw)
 securityfs on /sys/kernel/security type securityfs (rw)
 none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
 gvfs-fuse-daemon on /home/SSIS/diego.ercolani/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=SSIS+diego.ercolani)


... it seems that terminating pid 4472 is the reason why your
system hangs.  The pid belongs to the process

   /usr/lib64/gam_server

and seems to be connected to

  /usr/lib64/gvfs/gvfs-fuse-daemon

as it uses the same user id 10000.  Please run

       rpm -qf /usr/lib64/gam_server

to determine which package this daemon belongs to.
Comment 86 Diego Ercolani 2009-02-24 13:25:42 UTC
Yes:
rpm -qf /usr/lib64/gam_server -> gamin-0.1.10-0.pm.1
rpm -qi gamin
Name        : gamin                        Relocations: (not relocatable)
Version     : 0.1.10                            Vendor: packman.links2linux.de
Release     : 0.pm.1                        Build Date: Sat Jan  3 12:24:50 2009
Install Date: Mon Jan  5 09:19:47 2009         Build Host: pmbs
Group       : Development/Libraries         Source RPM: gamin-0.1.10-0.pm.1.src.rpm
Size        : 777547                           License: LGPL
Signature   : DSA/SHA1, Sat Jan  3 12:25:45 2009, Key ID f899f20d9a795806
Packager    : Detlef Reichelt <detlef@links2linux.de>
URL         : http://www.gnome.org/~veillard/gamin/
Summary     : Library providing the FAM File Alteration Monitor API
Description :
This C library provides an API and ABI compatible file alteration
monitor mechanism compatible with FAM but not dependent on a system wide
daemon.
Distribution: openSUSE 11.1 (x86_64)
Comment 87 Dr. Werner Fink 2009-02-24 13:38:11 UTC
I cannot find this package on opensuse.org ... does this daemon have
a signal handler for SIGTERM which calls inotify_rm_watch(2) for
its inotify file descriptor which ... I guess ... is set on its cwd, which
is /home/SSIS/diego.ercolani/Documents?

Or do you know any other reason why your system becomes (temporarily)
deadlocked when the gam_server is terminated?  Could it be that
/home/SSIS/diego.ercolani/Documents is provided via samba, so
that at this point we no longer have any network around and terminating
the gam_server triggers a (long) network timeout?  If so, it would be
better to terminate the gam_server *before* shutting down the samba
connection and the network.
Comment 88 Diego Ercolani 2009-02-24 13:59:35 UTC
You can't find it because it is from the Packman repository, I guess. I think it was installed through dependencies.
From a konsole I killed it:

kill `pgrep gam_server`
The system doesn't hang, but in the process table I find a new instance of gam_server.

I issued a "ps afuxwwww" to find out which process launches gam_server, but it seems to be launched at the same level as init.

It isn't possible to erase gamin, as it is needed by:
rpm -e gamin
error: Failed dependencies:
        libfam.so.0()(64bit) is needed by (installed) libgio-fam-2.18.2-5.1.x86_64
        libfam.so.0()(64bit) is needed by (installed) kdelibs3-3.5.10-21.11.x86_64
        libfam.so.0()(64bit) is needed by (installed) gnome-vfs2-2.24.0-4.1.x86_64
        libfam.so.0()(64bit) is needed by (installed) libkipi5-4.1.3-4.6.x86_64
        libfam.so.0()(64bit) is needed by (installed) libkde4-4.2.0-102.1.x86_64
Comment 89 Dr. Werner Fink 2009-02-24 14:05:39 UTC
On openSuSE the libfam.so.0 is provided by fam-2.7.0-130.1

  Name        : fam                          Relocations: (not relocatable)
  Version     : 2.7.0                             Vendor: SUSE LINUX Products GmbH, Nuernberg, Germany
  Release     : 130.29                        Build Date: Fri Feb 20 17:22:58 2009
  Install Date: Sun Feb 22 13:51:26 2009         Build Host: eisler
  Group       : System/Daemons                Source RPM: fam-2.7.0-130.29.src.rpm
  Size        : 84562                            License: GPL v2 or later; LGPL v2.1 or later
  Signature   : RSA/8, Fri Feb 20 17:23:21 2009, Key ID e3a5c360307e3d54
  URL         : http://oss.sgi.com/projects/fam/
  Summary     : File Alteration Monitoring Daemon
  Description :
  Fam is a file alteration monitoring service. With it, you can receive
  signals when files are created or changed.
  
  This package provides libfam, which is used by KDE and GNOME. It also
  provides a tool for the console called fileschanged.
  
  To use fam notifications (it can reduce the network load on NFS
  servers, especially if they host user home directories) you need to run
  the fam daemon, which can be found in the fam-server package.
  
  
  
  Authors:
  --------
      Bruce Karsh
      Bob Miller
      SGI corp.
  
      Author of fileschanged command line tool:
      Ben Asselstine <bda@panix.com>
  Distribution: SUSE:Factory:Head
Comment 90 Dr. Werner Fink 2009-02-24 14:09:32 UTC
In other words try

  rpm -e gamin --force
  killall -9 gam_server
  zypper install --name fam

... does this work for you?
Comment 91 Diego Ercolani 2009-02-24 14:18:34 UTC
rpm --nodeps -e gamin
kill `pgrep gam_server` (I'm more graceful)
zypper in --name fam -y
(fam-2.7.0-130.1)

Work done.

Let's see what happens now
but I think we have to investigate the gamin problem....
I sent a mail to the gamin packager...
Comment 92 Dr. Werner Fink 2009-02-27 13:24:30 UTC
Hi Matthias
Comment 93 Detlef Reichelt 2009-03-01 10:39:37 UTC
Hi,

I'm the packager of gamin, and I couldn't reproduce it on my systems. I've heard that sometimes it is not gam_server that hangs, but pulseaudio or artsd. So it sounds like a general problem of openSUSE 11.1. The shutdown process seems too fast... ;)

I don't want to use fam, because it often hangs and kills thunar/pcmanfm. Gamin is absolutely stable in use, so I decided to switch.

I'm back at home in July (!), could somebody else help to fix the gamin.rpm?
Comment 94 Matthias Hopf 2009-03-02 17:23:32 UTC
For the record:

I can reproduce the hang at home, even with a patched mkill from Werner. Will try to debug this at home.

This is not necessarily the same issue, but we won't know until we analyze it.
Comment 95 Matthias Hopf 2009-03-10 01:03:20 UTC
Created attachment 278280 [details]
Modified mkill that documents its work

If something like this still happens on another machine - this is the
instrumented version of mkill I used for debugging.
Comment 96 Matthias Hopf 2009-03-10 01:03:52 UTC
Created attachment 278281 [details]
lsof output during shutdown
Comment 97 Matthias Hopf 2009-03-10 01:04:17 UTC
Created attachment 278282 [details]
mkill log output
Comment 98 Matthias Hopf 2009-03-10 01:05:00 UTC
Created attachment 278284 [details]
Bug fix for mkill

This fixes the issues for me.

Basically, the mount point comparison function was dead wrong. On my system,
everything that had a file open *starting* with /d or /u (that includes /dev
and /usr) was killed.
Including "/bin/bash /etc/init.d/halt", and "/bin/bash
/etc/init.d/boot.localfs" itself.
Comment 99 Matthias Hopf 2009-03-10 01:07:12 UTC
This is a severe issue. Also affects SLED11.

Bug 467906 might be a dup. Though I somewhat doubt that in the meantime.
Comment 100 Dr. Werner Fink 2009-03-10 12:23:56 UTC
Created attachment 278423 [details]
sysvinit-2.86-186.16.i586.rpm

sysvinit with updated mkill utility
Comment 101 Dr. Werner Fink 2009-03-10 12:25:04 UTC
Created attachment 278425 [details]
sysvinit-2.86-186.16.x86_64.rpm

sysvinit with updated mkill
Comment 102 Dr. Werner Fink 2009-03-10 12:26:41 UTC
Diego? Does the new sysvinit rpm with the fixed mkill help for you?
Comment 103 Diego Ercolani 2009-03-11 10:14:32 UTC
Hello,
my last change was to replace gamin with standard fam (as in my comment #91). With this change, the shutdown process doesn't "freeze"; as Werner supposed, gamin (gam_server) probably causes a sort of deadlock when killed .... I don't know.

BUT

even though the shutdown process doesn't hang, the new problem is that during shutdown mkill leaves some file descriptors open, and then the next time the system comes up it complains about a dirty filesystem and fsck is run....

This issue continues with the new (#101) sysvinit as well.

But what about you, are you seeing these issues too?
Comment 104 Dr. Werner Fink 2009-03-11 11:00:32 UTC
mkill only opens /proc/mounts to get all active mount points and /proc to
read the directory, then it uses readlink(2), open(2) and opendir(3)
to determine which running program makes a mount point busy. And
running

   strace -e open,readlink,close mkill -0 /dev

does not show a file descriptor leak.  Besides this, mkill does not stop
/sbin/udevd nor programs which have /dev/fuse open.  The latter because
if those were terminated, the underlying fuse file system would become dirty.

Could it be that there is a program which opens its own fuse device,
which the program itself creates with makedev(3) and mknod(2)?  Or
could it be that there is a program which uses a /dev/fuse within a
chroot environment ... maybe in combination with a network based
file system (samba, NFS) and a local file system?

Nevertheless the new mkill sorts kill(pid,SIGTERM) in the reverse order
of the mount points found in /proc/mounts.

Or does the order of your mounts imply that some of the mount points remain
busy? To see this you should compare /etc/mtab and /proc/mounts.
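
To picture what "reverse order of the mount points" means, and to make the
comparison of the two files easier, here is a minimal shell sketch.  It is
not taken from mkill; the function name mount_points_reversed is
hypothetical.  It prints the second field (the mount point) of a file in
/proc/mounts format, last line first, so that nested mounts come before
their parents.

```shell
# Illustrative only: print the mount points of a /proc/mounts-style file
# in reverse order, i.e. the most recently mounted one first.
#   $1: file to read (assumed default /proc/mounts)
mount_points_reversed() {
    awk '{ mp[NR] = $2 }
         END { for (i = NR; i >= 1; i--) print mp[i] }' "${1:-/proc/mounts}"
}
```

Running it once on /proc/mounts and once on /etc/mtab and comparing the two
lists should show mounts that only one of the files knows about.  Escapes
such as \040 in mount point names are printed as they are, not decoded.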

I'll add Magnus, who may be able to explain what happens with this
gvfs-fuse-daemon, which seems to keep /home/SSIS/diego.ercolani/.gvfs
or /home busy or dirty.
Comment 105 Magnus Boman 2009-03-11 19:59:43 UTC
I'm actually just an external contributor helping out with package updates (which, I suppose, is where you found my name).
I'm adding HPJ instead, who's the real maintainer of gvfs
Comment 106 Swamp Workflow Management 2009-03-13 12:16:14 UTC
Update released for: aaa_base, sysvinit
Products:
openSUSE 11.1 (debug, i586, ppc, x86_64)
Comment 107 Eberhard Harbrink 2009-03-18 21:37:49 UTC
Seems the latest update broke the shutdown process.
At shutdown the system stops at

Turning off swap files
Sending all processes the TERM signal ...
/etc/init.d/rc: line 317:   5907 killed    $link            start
Master Resource Control: runlevel 0 has been reached
Failed services in runlevel 0:                          smartd lm_sensors
Skipped services in runlevel 0:                         SuSEfirewall2_setup
INIT: no processes left in this runlevel

Here the system stops. It fails to unmount the drives and to shut down.

Could you please increase the priority of this bug, since at the moment the situation is quite problematic?
Comment 108 Andreas Jaeger 2009-03-20 07:21:45 UTC
Michael, please see: http://en.opensuse.org/Bugs/Definitions

This is not a BLOCKER according to our definition - and please do not change priority.  Werner is on vacation this week and will answer this once he's back for sure - and change the priority himself.
Comment 109 Dr. Werner Fink 2009-04-01 11:45:48 UTC
Diego?  Please could you add the line

       killproc -TERM /usr/sbin/console-kit-daemon
before

     killproc -p $DBUS_DAEMON_PID -TERM $DBUS_DAEMON_BIN

in /etc/init.d/dbus ... compare with bug #491063

Then please make sure that you have really installed aaa_base-11.2-1.6
and sysvinit-2.86-186.17.1 ...  then you may add

   mkill -0 $ulist | xargs -r ps u
   mkill -0 $ulist | xargs -n 1 -r | while read p; do ls -Gl /proc/$p/fd; done
   sleep 10

before

   mkill -TERM $ulist

in /etc/init.d/boot.localfs ... with this we may see what exactly happens
compare with bug #486710

@Eberhard ... Are you using RUN_PARALLEL=no in /etc/sysconfig/boot
              or are you using PROMPT_FOR_CONFIRM=yes ??
Comment 110 Diego Ercolani 2009-04-01 15:12:38 UTC
Uhm....

in my current situation (sysvinit-2.86-186.17.1,aaa_base-11.1-10007.15.1),
shutdown doesn't hang, but the filesystem ("/") doesn't unmount cleanly, so an fsck is run when the system starts up.
Comment 111 Dr. Werner Fink 2009-04-01 15:38:28 UTC
... OK ... then I'd like to know which process or which forgotten
mount makes ``/'' busy.  Please attach /var/log/boot.omsg, maybe
this helps to see what happens in the last famous seconds.  Also a

       cat /proc/mounts
       fuser -m /

after the umount in /etc/init.d/boot.localfs would show what's
going on there.
Comment 112 Eberhard Harbrink 2009-04-01 19:32:56 UTC
Created attachment 283526 [details]
boot.omsg
Comment 113 Diego Ercolani 2009-04-01 19:36:13 UTC
Created attachment 283527 [details]
kernel session log for a session without hang but without umount of rootfs

Here it is the boot.omsg that you requested;
cat /proc/mounts returned:
rootfs / rootfs rw 0 0
udev /dev tmpfs rw,mode=755 0 0
/dev/hda11 / reiserfs rw 0 0
/proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
debugfs /sys/kernel/debug debugfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
securityfs /sys/kernel/security securityfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0

fuser returned:
1 rce 2rc 3rc 4rc 5rc 6rc 7rc 8rc 9rc 10rc 11rc 12rc 13rc 14rc 16rc 17rc 18rc 57rc 58rc 60rc 61rc 76rc 187rc 188rc 567rc 638rce 891rc 1165rc 1297rc 1300rc 1486rc 1579rc 1592rc 1601rc 1660rc 1661rc 1662rc 2173rce 2132rc 4288rc 4289rc 4767rc 4898rce 5803rce 5821rce 5977rce
Comment 114 Eberhard Harbrink 2009-04-01 19:42:06 UTC
PROMPT_FOR_CONFIRM=no
RUN_PARALLEL was yes, changed to no, but that changed nothing

aaa_base 11.1-100007.15.1-x86_64
sysvinit 2.86-186.17.1-x86_64
from the update repository

I also attached my boot.omsg. Maybe it is of any importance that I run xfs.
Comment 115 Dr. Werner Fink 2009-04-02 10:41:28 UTC
@Eberhard

I also use xfs here and do not have any problems here.
Please run as root

    rpm -V aaa_base sysvinit

and report the result.

@Diego

The root file system is remounted read-only in /etc/init.d/halt,
that is, it is *not* touched in /etc/init.d/boot.localfs
... AFAICS from your comment #113 there is nothing which makes
the file system busy after the killall5 is done in /etc/init.d/halt
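
Whether the remount in halt actually took effect can be read from
/proc/mounts.  A minimal sketch, assuming the usual /proc/mounts layout
(device, mount point, type, options); the function name root_is_readonly is
made up for this example and is not part of any init script.  Since the
list in comment #113 has two entries for "/" (rootfs and the real device),
the sketch looks at the last one.

```shell
# Illustrative sketch; root_is_readonly is a hypothetical name.
# Returns 0 if the last "/" entry of a /proc/mounts-style file carries
# the "ro" option, 1 otherwise.
#   $1: file to read (assumed default /proc/mounts)
root_is_readonly() {
    awk '
        $2 == "/" { opts = $4 }   # later entries override the rootfs line
        END {
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") exit 0
            exit 1
        }
    ' "${1:-/proc/mounts}"
}
```

Placed right after the remount, `root_is_readonly || echo "/ is still rw"`
would flag the sessions in which the remount did not succeed.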
Comment 116 Diego Ercolani 2009-04-02 11:08:11 UTC
Werner: the problem is that quite often, when I start up the machine, it complains about a "dirty shutdown".
This doesn't happen every time; it happens about one time in three.
Comment 117 Dr. Werner Fink 2009-04-02 11:35:54 UTC
Hmmm ... strange, please have a look into /etc/init.d/halt at lines
163 up to 184 ... was this piece of code reached when your system complained
about "dirty shutdown" on the next boot?

You may remove the  `2> /dev/null' from the remount to see what happens
and you may add a `usleep 100000' or `sleep 1' before the remount.
Or maybe sometimes a daemon is in D state during the remount; you
may also add

     fuser -a -m /

before the remount.
Comment 118 Dr. Werner Fink 2009-04-02 11:52:06 UTC
... or use

  fuser -a -m /  2>/dev/null | xargs -r ps u

and/or

  fuser -a -m /  2>/dev/null | \
  xargs -n 1 -r | while read p; do ls -Gl /proc/$p/fd; done

we may then see what's going on there.
Comment 119 Eberhard Harbrink 2009-04-02 18:45:45 UTC
# rpm -V aaa_base sysvinit
S.5....T  c /etc/inittab
S.5....T  c /etc/mailcap
Comment 120 Diego Ercolani 2009-04-02 23:08:22 UTC
Created attachment 283820 [details]
modification of boot.localfs  and halt scripts
Comment 121 Diego Ercolani 2009-04-02 23:10:54 UTC
Created attachment 283822 [details]
kernel session log for a session without hang but without umount of rootfs (halt complained about "/" is busy during remount,ro)
Comment 122 Diego Ercolani 2009-04-02 23:12:34 UTC
Created attachment 283824 [details]
log generated for the same session as (id=283822) by script modifications as (id=283820)
Comment 123 Dr. Werner Fink 2009-04-03 09:42:36 UTC
AFAICS from attachment #283824 [details] there is nothing busy :((
Neither /dev/console nor /dev/initctl belongs to the root fs.
Comment 124 Dr. Werner Fink 2009-04-06 16:34:13 UTC
Besides the problem reported by Diego, is there anyone who still has the reported
problem *after* the update of aaa_base (10007.15.1), sysvinit (186.17.1),
and applying the ConsoleKit changes from bug #491063?
Comment 125 Eberhard Harbrink 2009-04-06 19:58:19 UTC
Confirmed! I still have the problem although I applied the fix on consolekit.
Comment 126 Diego Ercolani 2009-04-06 22:41:58 UTC
Created attachment 284364 [details]
kernel session log for a session without hang but without umount of rootfs (halt complained about "/" is busy during remount,ro)

Today I saw the same issue I reported some days ago.
Comment 127 Diego Ercolani 2009-04-06 22:43:12 UTC
Created attachment 284365 [details]
log generated for the same session as (id=284364) by script modifications as (id=283820)
Comment 128 Dr. Werner Fink 2009-04-07 10:30:32 UTC
@Eberhard: Are you running AOE+squashfs+aufs ... that is KIWI's AOE/NBD feature?
If yes, this is highly experimental and does not belong to this bug, this problem
is covered by  bug #491890.

@Diego: Please attach your boot.localfs and your halt script.
Comment 129 Eberhard Harbrink 2009-04-07 18:01:04 UTC
No, I'm not running it.
Comment 130 Diego Ercolani 2009-04-07 22:48:44 UTC
Created attachment 284642 [details]
/etc/init.d/{boot.localfs,halt} /var/log/boot.{omsg,fail.msg} as requested in comment #128

As requested, here is a new "session" dump.
The included halt script is the latest modification I made:
if "mount -o remount,ro /" fails, control is handed to the shell.

In the attached boot.fail.msg, mount -o remount failed, so I captured some dumps with:
lsof
fuser -va /
ps axuwwwwww
and finally a strace -f mount -o remount,ro / >>/var/log/boot.fail.msg 2>&1

I know, the failure of the remount,ro in the last "strace" was probably caused by the strace itself, but I don't know how to avoid it.

After the last strace, I issued a logout and mount -o remount,ro / ran successfully, so the session had a graceful shutdown.

A small note:
After exiting to the shell, I first issued a "ps auxwwww" without redirecting its output to /var/log, and I noticed that there was a zombie process (it was bacula-sd). After that I tried to remount the root read-only without success, and then I issued some diagnostic commands (ps, lsof, fuser), redirecting their output to /var/log/boot.fail.msg. Then I closed the session with a logout, and the halt script successfully retried the remount,ro.
Examining the log in the next session, I noticed that no bacula-sd process appears in the output of the ps axuwwwww command that I redirected to /var/log/boot.fail.msg.... Is it possible that after a while the zombie process exited and left the / filesystem free to be remounted RO?
Comment 131 Dr. Werner Fink 2009-04-08 09:32:40 UTC
Add maintainer of bacula to CC list (hmmm ... seems to be dropped in factory).
Comment 132 Dr. Werner Fink 2009-04-08 09:50:40 UTC
@Eberhard:  AFAICS from comment #107 your system is losing the root file
            system ... the question arises: *why* does this happen on your
            system?  There must be a difference between your system and the
            systems of e.g. Diego or my own ones here.  Do you have
            your own kernel or are you using the standard kernel of 11.1?
            The next point is your mount points; please show us the output
            of `cat /proc/mounts'.
Comment 133 Dr. Werner Fink 2009-04-08 10:11:15 UTC
@Diego:

Do the bacula daemons have threads and are real daemons?
Please run

          ps -Leo pid,ppid,sid,lwp,stat,comm | grep bacula

to see more.

What happens if you disable bacula, that is

          insserv -r bacula-dir bacula-fd bacula-sd
          /etc/init.d/bacula-dir stop
          /etc/init.d/bacula-fd stop
          /etc/init.d/bacula-sd stop

After this, please check whether there is any zombie process around.
Does shutdown work now?
Comment 134 Dr. Werner Fink 2009-04-08 10:26:44 UTC
@Diego:  Which package provides /etc/init.d/spindown? Please run

                rpm -qf /etc/init.d/spindown

this because the boot.omsg shows:

 Shutting down the Bacula Storage daemonShutting down java.binfmt_misc done
 Shutting down irqbalance done
 Shutting down service kdmdone
 /etc/init.d/spindown: line 48: log_daemon_msg: command not found
 /etc/init.d/spindown: line 50: log_end_msg: command not found
Comment 135 Eberhard Harbrink 2009-04-08 18:25:41 UTC
I use the default kernel, now version 2.6.27.21-0.1.2-x86_64 from the update repository. Before it was 2.6.27.19 and nothing changed with the update.

# cat /proc/mounts
rootfs / rootfs rw 0 0
udev /dev tmpfs rw,mode=755 0 0
/dev/sda6 / xfs rw,noquota 0 0
/proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
debugfs /sys/kernel/debug debugfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
/dev/sdb5 /usr/lib xfs rw,noquota 0 0
/dev/sdb6 /usr/lib64 xfs rw,attr2,noquota 0 0
/dev/sdb7 /opt xfs rw,attr2,noquota 0 0
/dev/sdb8 /home xfs rw,attr2,noquota 0 0
/dev/sda7 /windows/L vfat rw,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sdb9 /windows/N vfat rw,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sdb10 /windows/O vfat rw,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sda5 /windows/D vfat rw,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sda1 /windows/C fuseblk rw,nosuid,nodev,noexec,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096 0 0
/dev/sda9 /windows/M vfat rw,nosuid,nodev,noexec,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,utf8 0 0
/dev/sda8 /windows/P fuseblk rw,nosuid,nodev,noexec,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096 0 0
/dev/sdb11 /windows/Q fuseblk rw,nosuid,nodev,noexec,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
securityfs /sys/kernel/security securityfs rw 0 0
/proc /var/lib/ntp/proc proc ro 0 0
Comment 136 Eberhard Harbrink 2009-04-08 19:05:11 UTC
Could it be that there are still files needed from /usr/lib or /usr/lib64 when these file systems are already unmounted? That would be funny, since I have been running this fs layout for a long time and I never had these problems before.
Comment 137 Diego Ercolani 2009-04-08 22:10:48 UTC
reference comment #133:

ps -Leo pid,ppid,sid,lwp,stat,comm | grep bacula
 3187     1  3187  3187 Ssl  bacula-fd
 3187     1  3187  3190 Ssl  bacula-fd
 3188     1  3188  3188 Ssl  bacula-sd
 3188     1  3188  3201 Ssl  bacula-sd
 3383     1  3383  3383 Ssl  bacula-dir
 3383     1  3383  3518 Ssl  bacula-dir
 3383     1  3383  3519 Ssl  bacula-dir

for the zombiness....

rcbacula-dir stop
rcbacula-fd stop
rcbacula-sd stop

ps axuwww | grep bacula
root      3188  0.0  0.0      0     0 ?        Zsl  17:58   0:02 [bacula-sd] <defunct>

It seems that bacula-sd remains a zombie until the tape drive (DDS3-SCSI) ejects its tape, so it could be a sort of I/O freeze....
But I can tell you that the shutdown procedure always takes less time than the tape drive needs to rewind and eject the tape..... and often the shutdown procedure successfully remounts the root filesystem read-only.

My 2 ¢... I read that bacula has been removed from the factory.... I think it is one of the best pieces of software ever written, and I think it is a bad idea to remove it from the distribution.

comment #134:
Yes, spindownd also comes from the Packman repository. From a quick Google search, I understand that "log_daemon_msg" and "log_end_msg" are logging functions that belong to Linux Standard Base 3.0-3 (functions that are defined in /lib/lsb/init-functions). The error message you refer to doesn't seem to leave the daemon in a dirty state after rcspindaemon stop. The init script also refers to another function, "status_of_proc", which isn't defined in opensuse 11.1 /lib/lsb/init-tools:
extract of /etc/init.d/spindown:
[...]
. /lib/lsb/init-functions
[...]
case "$1" in
    "start")
        log_daemon_msg "Starting disk spindown daemon" "spindownd"
        start_daemon -p $PIDFILE $DAEMON -d -s $STATUSPATH -c $CONFPATH -p $PIDFILE
        log_end_msg $?

        exit $?
        ;;

    "stop")
        log_daemon_msg "Stopping disk spindown daemon" "spindownd"
        killproc -p $PIDFILE $DAEMON
        log_end_msg $?

        exit $?
        ;;

    "status")
        if status_of_proc -p $PIDFILE $DAEMON spindown; then
            echo -n
        else
            exit 1
        fi

        killproc -p $PIDFILE $DAEMON -PIPE
        status

        exit 0
        ;;
[...]
Comment 138 Dr. Werner Fink 2009-04-09 09:59:51 UTC
@Anna: Do you know why bacula was dropped from Factory?

@Diego: you should file a feature request on openSuSE.org to re-enable bacula;
maybe Anna knows more about bacula ... nevertheless you should make sure that
/etc/init.d/spindown does not spin down the disks on stop. Besides this, have
you seen the problem with the bacula services disabled?  IMHO a busy
tape should cause a `D' but not a `Z' state (uninterruptible, not defunct,
process).  A zombie is a terminated process which has not been reaped by its
parent.  The default parent of a real daemon is process 1 aka /sbin/init ...
the question is why init takes so long to reap this specific process?
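
To tell the two states apart on a running system, the STAT column of ps can be filtered. This is only a minimal sketch; the helper name list_stuck is made up for illustration and is not part of any package:

```shell
#!/bin/sh
# list_stuck (hypothetical helper): read "pid ppid stat comm" lines on stdin
# and keep the header plus every process whose state starts with D
# (uninterruptible sleep) or Z (zombie, terminated but not yet reaped).
list_stuck() {
    awk 'NR == 1 || $3 ~ /^[DZ]/'
}

# Typical use on a live system (not executed here):
#   ps -eo pid,ppid,stat,comm | list_stuck
```

The PPID column then shows which parent has not yet reaped a zombie.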

@Eberhard for comment #136 ... AFAIK there is no process which requires /usr
or any other mount point below /usr after boot.localfs has unmounted it.  If
you find one, please report it.  Please also attach the files

  /etc/init.d/.depend.boot
  /etc/init.d/.depend.halt
  /etc/init.d/.depend.start
  /etc/init.d/.depend.stop

and the file

  /etc/inittab

you may also add a simple line

   bash

after the line which remounts the root file system readonly
in /etc/init.d/halt to get a shell for debugging.
Comment 139 Eberhard Harbrink 2009-04-09 19:11:20 UTC
Created attachment 285112 [details]
/etc/init.d/.depend.* /etc/inittab
Comment 140 Eberhard Harbrink 2009-04-09 19:18:33 UTC
Created attachment 285113 [details]
ps axu

In /etc/init.d/halt, execution gets exactly as far as line 168
rc_wait /sbin/blogd /sbin/splash
and there it stops. It doesn't get beyond this line.
If I enter bash before this line I get a shell. All volumes are still mounted and
mount -no remount,ro / works without throwing an error message. In the ps axu output I can't see any offending process, but since my eye isn't very trained, I am attaching the output of ps axu.
Comment 141 Anna Maresova 2009-04-10 09:46:03 UTC
Werner: Bacula was not completely dropped from the distribution, I just moved it to Contrib (and AFAIK, the Contrib repo will be a part of the default installation in the next openSUSE release). I will try to update it here and there and I will fix the bugs, but I hope we find some external maintainer sooner or later.

And why? First, I am tired of maintaining our huge FORTIFY_SOURCE patch (see #354872) and upstream is not interested in really fixing the issue. Second, I have never found time to package it properly (i.e. make it work with all the databases) - I created the package several years ago as a quick hack for our internal IT and I do not think it really deserves to be a part of Factory. If anyone wishes to do better, I will gladly let him. As I have a lot of work I consider much more important, I do not think I will do better in a reasonable time.
Comment 142 Diego Ercolani 2009-04-10 10:40:28 UTC
@Werner:
 Yes, it seems that "sometimes", while bacula-sd is a zombie, the rootfs is locked in some manner and then the remount fails. In this case, when the tape is ejected, bacula-sd leaves the zombie state and then it is possible to remount the rootfs.

The workaround could be something like this:

i=0
while ! mount -no remount,ro / && [ "$i" -lt 50 ]; do sync; i=$((i+1)); sleep 1; done
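
The same bounded-retry idea can be written as a small reusable function. This is only a sketch; the name retry_cmd and its argument order are assumptions made for illustration, not something that exists in /etc/init.d/halt:

```shell
#!/bin/sh
# retry_cmd MAX DELAY CMD... (hypothetical helper):
# run CMD until it succeeds, trying at most MAX times and
# flushing buffers and sleeping DELAY seconds after each failure.
retry_cmd() {
    max=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        if "$@"; then
            return 0          # command succeeded, stop retrying
        fi
        sync                  # flush dirty buffers before the next attempt
        i=$((i + 1))
        sleep "$delay"
    done
    return 1                  # gave up after MAX failed attempts
}

# Possible use in a halt script (not executed here):
#   retry_cmd 50 1 mount -no remount,ro / || echo "remount failed" >&2
```

The non-zero return after the last attempt lets the caller decide what to do if the root file system stays busy.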
Comment 143 Eberhard Harbrink 2009-04-10 11:02:23 UTC
@Werner: my problem seems unrelated to Diego's. Maybe I better open a separate bug-report?
Comment 144 Dr. Werner Fink 2009-04-14 09:35:27 UTC
Yes that would be fine.
Comment 145 Dr. Werner Fink 2009-04-14 11:48:59 UTC
(In reply to comment #142)

Diego? What happens if you force an eject within the boot script of bacula-sd?
That is, a line

        eject /dev/tape

before the line

        killproc -TERM $BACULA_SD_BIN

within /etc/init.d/bacula-sd ... and if this does not work, try
moving this line after the terminating killproc line.  Clearly you
should check whether /dev/tape exists and points to the real physical
device like /dev/st0 or /dev/nst0 ...
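
The check for /dev/tape could look roughly like this. It is a sketch only; the helper name tape_target is hypothetical, and readlink -f is assumed to be available (GNU coreutils):

```shell
#!/bin/sh
# tape_target LINK (hypothetical helper): print the device node that LINK
# finally resolves to, or fail with a message if LINK does not exist.
tape_target() {
    if [ ! -e "$1" ]; then
        echo "$1 does not exist" >&2
        return 1
    fi
    readlink -f "$1"
}

# Possible use in the stop branch of the init script (not executed here):
#   dev=$(tape_target /dev/tape) && eject "$dev"
```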

(In reply to comment #140)

Eberhard?  Maybe you could try before the line with rc_wait to
do an

         echo $BASH_VERSION
         echo $SECONDS

and then

         set -x
         rc_wait

to see if bash has a problem with increasing $SECONDS on your
system or with the rc_wait() shell function. One question I have: Why do
the processes mounting /windows/C, /windows/P, and /windows/Q exist
at this point?  IMHO these processes should be gone after boot.localfs
has done its job, even if the file systems are fuse based.
Comment 146 Diego Ercolani 2009-04-15 10:49:52 UTC
I solved the problem with bacula-sd by setting:

Offline On Unmount = no

in bacula-sd.conf
and with this line bacula doesn't send an offline to the tape when it shuts down.
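
For context, this directive belongs to a Device resource of the storage daemon. The fragment below is only a sketch: the Name, Media Type and Archive Device values are placeholders assumed for illustration, not taken from the actual configuration.

```
# bacula-sd.conf (fragment; all values except the last line are placeholders)
Device {
  Name = "DDS3-Drive"            # hypothetical device name
  Media Type = DDS-3             # hypothetical media type
  Archive Device = /dev/nst0     # assumed non-rewinding tape device
  Offline On Unmount = no        # do not take the tape offline on unmount
}
```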
Comment 147 Diego Ercolani 2009-04-15 10:51:37 UTC
    I solved the problem with bacula-sd by setting:

    Offline On Unmount = no

    in bacula-sd.conf
    and with this line bacula doesn't send an offline to the tape when it
    shuts down.
    But my question is:
    the halt process isn't problem-proof: if some process hasn't freed the
    rootfs, the system doesn't shut down correctly
Comment 148 Eberhard Harbrink 2009-04-15 19:40:27 UTC
I found that my problem may be related to https://bugzilla.novell.com/show_bug.cgi?id=486710 , so I will append my answers there
Comment 149 Dr. Werner Fink 2010-01-29 16:13:59 UTC
This problem seems to be solved on openSuSE 11.2 ...