Bug 927776

Summary: System failed to start
Product: [openSUSE] openSUSE Distribution
Reporter: Richard Weinberger <richard>
Component: Basesystem
Assignee: Cristian Rodríguez <crrodriguez>
Status: RESOLVED NORESPONSE
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: chcao, coolo, crrodriguez, forgotten_DV81ZEWZkN, fstrba, qantas94heavy, richard, systemd-maintainers, thomas.blume
Version: 13.2
Flags: thomas.blume: needinfo? (richard)
Target Milestone: 13.2
Hardware: Other
OS: openSUSE 13.2
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Richard Weinberger 2015-04-20 07:26:11 UTC
Hi!

My laptop failed to start, I'd like to find out why.
After a hard reset it worked again.
So far it happened only once.

See:
http://git.infradead.org/~rw/os132_boot_fail.jpg
http://git.infradead.org/~rw/os132_boot_fail.txt

From the bootlogs it seems like dbus did not start
and hence systemd was hosed.

The interesting lines are:
Apr 16 13:51:01 sandpuppy systemd[1]: Failed to get initial list of names: Connection timed out
Apr 16 13:51:01 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.

Is this a known issue?
How can this happen? If it happens again, how can I debug it further?
I got neither a getty nor an X11 session.

Thanks,
//richard
Comment 1 Richard Weinberger 2015-04-20 07:27:49 UTC
BTW: A very sad discussion on this already happened on the openSUSE mailing list:
http://lists.opensuse.org/opensuse/2015-04/msg00538.html
Comment 2 Dr. Werner Fink 2015-04-20 08:50:47 UTC
This bug is for openSUSE 12.3 which is out of support ... your message http://lists.opensuse.org/opensuse/2015-04/msg00538.html is for openSUSE 13.2

From the log http://git.infradead.org/~rw/os132_boot_fail.txt I'd like to guess that your raid is in trouble which is not systemd but a hardware problem.

Btw: What are those `BKA Überwachungseinheit' NetworkManager connections?
Comment 3 Richard Weinberger 2015-04-20 08:56:54 UTC
(In reply to Dr. Werner Fink from comment #2)
> This bug is for openSUSE 12.3 which is out of support ... your message

Sorry, it is 13.2. Looks like I messed up the version field.

> http://lists.opensuse.org/opensuse/2015-04/msg00538.html is for openSUSE 13.2
> 
> From the log http://git.infradead.org/~rw/os132_boot_fail.txt I'd like to
> guess that your raid is in trouble which is not systemd but a hardware
> problem.

My laptop does not have raid. Why do you think it is a raid issue?

> Btw: What are those `BKA Überwachungseinheit' NetworkManager connections?

Shhh!
Comment 4 Dr. Werner Fink 2015-04-20 09:15:38 UTC
(In reply to Richard Weinberger from comment #3)

> My laptop does not have raid. Why do you think it is a raid issue?

Read your logs; IMHO something is dying:

Apr 16 13:51:02 sandpuppy dbus-daemon[1570]: Unknown username "srvGeoClue" in message bus configuration file
Apr 16 13:51:02 sandpuppy systemd[1]: Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.
Apr 16 13:51:02 sandpuppy systemd[1]: Started Modem Manager.
Apr 16 13:51:02 sandpuppy systemd[1]: Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.
Apr 16 13:51:02 sandpuppy systemd[1]: Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.
Apr 16 13:51:02 sandpuppy systemd-logind[1599]: New seat seat0.

[...]

Apr 16 13:51:01 sandpuppy logd[1601]: [1601]: WARN: Cannot open config file [/etc/logd.cf]

[...]

Apr 16 13:52:04 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: <info> wpa_supplicant stopped
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: <info> (wlan0): supplicant interface state: inactive -> down
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: <info> (wlan0): device state change: disconnected -> unavailable (reason 'supplicant-failed') [30 20 10]
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: <info> (wlan0): deactivating device (reason 'supplicant-failed') [10]
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: <warn> (wlan0): add_pending_action (2): 'waiting for supplicant' already pending
Apr 16 13:52:04 sandpuppy NetworkManager[1563]: file devices/nm-device.c: line 6324 (nm_device_add_pending_action): should not be reached
Apr 16 13:52:04 sandpuppy dbus[1570]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service'
Apr 16 13:52:05 sandpuppy systemd[1]: Failed to start WPA Supplicant daemon.
Apr 16 13:52:05 sandpuppy systemd[1]: Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.

[...]
Apr 16 13:52:05 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.

[repeating very often]

Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-raid\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed out.
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/disk/by-id/raid-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2d30b0cc15df95447da89877446394e48b\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-30b0cc15df95447da89877446394e48b-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-dm\x2dname\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed out.
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/disk/by-id/dm-name-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-dm\x2d0.device/stop timed out.
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/dm-0.
Apr 16 13:54:53 sandpuppy systemd[1]: Job sys-devices-virtual-block-dm\x2d0.device/stop timed out.
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /sys/devices/virtual/block/dm-0.
Apr 16 13:54:53 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.
Comment 5 Thomas Blume 2015-04-20 09:21:02 UTC
(In reply to Richard Weinberger from comment #3)
> (In reply to Dr. Werner Fink from comment #2)
> > This bug is for openSUSE 12.3 which is out of support ... your message
> 
> Sorry, it is 13.2. Looks like I messed up the version field.
> 
> > http://lists.opensuse.org/opensuse/2015-04/msg00538.html is for openSUSE 13.2
> > 
> > From the log http://git.infradead.org/~rw/os132_boot_fail.txt I'd like to
> > guess that your raid is in trouble which is not systemd but a hardware
> > problem.
> 
> My laptop does not have raid. Why do you think it is a raid issue?


Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-raid\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed out.
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/disk/by-id/raid-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This looks like an encrypted raid device.

Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2d30b0cc15df95447da89877446394e48b\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed
Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-30b0cc15df95447da89877446394e48b-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.

The logs show that it times out on shutdown.
Maybe there was already a problem decrypting the device at startup?
Comment 6 Richard Weinberger 2015-04-20 09:35:23 UTC
(In reply to Thomas Blume from comment #5)
> (In reply to Richard Weinberger from comment #3)
> > (In reply to Dr. Werner Fink from comment #2)
> > > This bug is for openSUSE 12.3 which is out of support ... your message
> > 
> > Sorry, it is 13.2. Looks like I messed up the version field.
> > 
> > > http://lists.opensuse.org/opensuse/2015-04/msg00538.html is for openSUSE 13.2
> > > 
> > > From the log http://git.infradead.org/~rw/os132_boot_fail.txt I'd like to
> > > guess that your raid is in trouble which is not systemd but a hardware
> > > problem.
> > 
> > My laptop does not have raid. Why do you think it is a raid issue?
> 
> 
> Apr 16 13:54:53 sandpuppy systemd[1]: Job
> dev-disk-by\x2did-raid\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.
> device/stop timed out.
> Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping
> /dev/disk/by-id/raid-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> This looks like an encrypted raid device.
> 
> Apr 16 13:54:53 sandpuppy systemd[1]: Job
> dev-disk-by\x2did-
> dm\x2duuid\x2dCRYPT\x2dLUKS1\x2d30b0cc15df95447da89877446394e48b\x2dcr_ata\x2
> dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed
> Apr 16 13:54:53 sandpuppy systemd[1]: Timed out stoppping
> /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-30b0cc15df95447da89877446394e48b-cr_ata-
> TOSHIBA_MK5061GSY_X2I2Y00CF-part2.
> 
> The logs show that it times out on shutdown.
> Maybe there was already a problem decrypting the device at startup?

Hmmm.

sandpuppy:~ # cat /proc/mdstat 
Personalities : 
unused devices: <none>
sandpuppy:~ # ls -la /dev/disk/by-id/raid-*
lrwxrwxrwx 1 root root 10 Apr 16 23:52 /dev/disk/by-id/raid-cr_ata-TOSHIBA_MK5061GSY_X2I2Y00CF-part2 -> ../../dm-0
lrwxrwxrwx 1 root root 10 Apr 16 23:52 /dev/disk/by-id/raid-system-home -> ../../dm-3
lrwxrwxrwx 1 root root 10 Apr 16 23:52 /dev/disk/by-id/raid-system-root -> ../../dm-2
lrwxrwxrwx 1 root root 10 Apr 16 23:52 /dev/disk/by-id/raid-system-swap -> ../../dm-1

It is definitely not a RAID. My whole disk is encrypted but the root filesystem was mounted correctly.
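
The check above can be condensed into a small sketch. It only tests whether an mdstat-style file lists any md array; the function name `has_md_arrays` is hypothetical and not part of any tool, and the pattern assumes the usual `mdN : active ...` line format of /proc/mdstat.

```shell
#!/bin/sh
# has_md_arrays: succeed if the given mdstat-style file lists at least one
# md array (a line such as "md0 : active raid1 sda1[0] sdb1[1]").
# The file argument defaults to /proc/mdstat.
# Assumption: array lines start with "md" followed by " :".
has_md_arrays() {
    grep -Eq '^md[^[:space:]]* *:' "${1:-/proc/mdstat}"
}

# Usage:
#   if has_md_arrays; then echo "md RAID present"; else echo "no md RAID"; fi
#   ls -l /dev/disk/by-id/raid-*    # then check where each link points
```

With the mdstat output shown above the function fails (no array lines), which matches the observation that the raid-* links resolve to dm-N device-mapper nodes rather than md devices.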
Comment 7 Dr. Werner Fink 2015-04-20 09:46:37 UTC
(In reply to Richard Weinberger from comment #6)

Nevertheless something goes wrong, which is caused by the final decrypted partition even if it is mounted.  See the log you've attached.  There are files which are not found.  Run

         rpm -qVa

to see what is missing and what is modified.
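
Since the verification output can be long, a minimal sketch for narrowing it down to the entries reported as missing may help. The function name `list_missing` is hypothetical, and the sketch assumes that paths contain no whitespace.

```shell
#!/bin/sh
# list_missing: read `rpm -Va` style output on stdin and print only the
# paths that rpm reports as "missing".
# Assumption: paths contain no whitespace, so the path is the last field.
list_missing() {
    awk '$1 == "missing" { print $NF }'
}

# Usage (run as root so that all files can be read):
#   rpm -Va 2>/dev/null | list_missing
```

Modified configuration files (lines such as `S.5....T.  c ...`) are filtered out on purpose, so only paths that are absent from the file system remain.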
Comment 8 Richard Weinberger 2015-04-20 10:52:07 UTC
(In reply to Werner Fink from comment #7)
> (In reply to Richard Weinberger from comment #6)
> 
> Nevertheless something goes wrong, which is caused by the final decrypted
> partition even if it is mounted.  See the log you've attached.  There are
> files which are not found.  Run
> 
>          rpm -qVa
> 
> to see what is missing and what is modified.

See:
---cut---
S.5....T.  c /etc/splashy/config.xml
.......T.  c /etc/sssd/sssd.conf
.M.....T.    /usr/share/fonts/jmk/fonts.dir
.M.....T.    /usr/share/fonts/jmk/fonts.scale
S.5....T.  c /etc/postfix/main.cf
S.5....T.  c /etc/postfix/master.cf
missing     /usr/lib64/libreoffice/share/autocorr/acor_de-AT.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_de-BE.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_de-CH.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_de-LI.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_de-LU.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-AR.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-BO.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-CL.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-CO.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-CR.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-CU.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-DO.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-EC.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-GT.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-HN.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-MX.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-NI.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-PA.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-PE.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-PR.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-PY.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-SV.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-US.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-UY.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_es-VE.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_fr-BE.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_fr-CA.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_fr-CH.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_fr-LU.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_fr-MC.dat
missing     /usr/lib64/libreoffice/share/autocorr/acor_it-CH.dat
S.5....T.  c /etc/rsyslog.d/remote.conf
.......T.    /usr/include/X11/extensions/windowswm.h
.......T.    /usr/include/X11/extensions/windowswmstr.h
.......T.    /usr/lib64/pkgconfig/windowswmproto.pc
.M.....T.    /boot/boot.readme
.M.......    /boot/.vmlinuz-3.16.7-7-vanilla.hmac
.M.....T.    /boot/System.map-3.16.7-7-vanilla
.M.......    /boot/config-3.16.7-7-vanilla
.M.......    /boot/symvers-3.16.7-7-vanilla.gz
.M.......    /boot/sysctl.conf-3.16.7-7-vanilla
.M.......    /boot/vmlinux-3.16.7-7-vanilla.gz
.M.......    /boot/vmlinuz-3.16.7-7-vanilla
S.5....T.  c /etc/default/grub
.M.......    /usr/local/Brother/Printer/MFC7360N/inf
.M...UGT.    /usr/local/Brother/Printer/MFC7360N/inf/brMFC7360Nrc
.M...UG..    /var/spool/lpd/MFC7360N
missing     /usr/share/YaST2/theme/current/wizard
missing     /usr/share/YaST2/theme/current/wizard/arr_down.png
missing     /usr/share/YaST2/theme/current/wizard/arr_left.png
missing     /usr/share/YaST2/theme/current/wizard/arr_right.png
missing     /usr/share/YaST2/theme/current/wizard/arr_up.png
missing     /usr/share/YaST2/theme/current/wizard/background.png
missing     /usr/share/YaST2/theme/current/wizard/branch-closed.png
missing     /usr/share/YaST2/theme/current/wizard/branch-end.png
missing     /usr/share/YaST2/theme/current/wizard/branch-more.png
missing     /usr/share/YaST2/theme/current/wizard/branch-open.png
missing     /usr/share/YaST2/theme/current/wizard/checkbox-off.png
missing     /usr/share/YaST2/theme/current/wizard/checkbox-on.png
missing     /usr/share/YaST2/theme/current/wizard/header-background.png
missing     /usr/share/YaST2/theme/current/wizard/header-logo.png
missing     /usr/share/YaST2/theme/current/wizard/inst_arr_down.png
missing     /usr/share/YaST2/theme/current/wizard/inst_arr_left.png
missing     /usr/share/YaST2/theme/current/wizard/inst_arr_right.png
missing     /usr/share/YaST2/theme/current/wizard/inst_arr_up.png
missing     /usr/share/YaST2/theme/current/wizard/inst_checkbox-off-disabled.png
missing     /usr/share/YaST2/theme/current/wizard/inst_checkbox-off.png
missing     /usr/share/YaST2/theme/current/wizard/inst_checkbox-on-disabled.png
missing     /usr/share/YaST2/theme/current/wizard/inst_checkbox-on.png
missing     /usr/share/YaST2/theme/current/wizard/inst_radio-button-checked.png
missing     /usr/share/YaST2/theme/current/wizard/inst_radio-button-unchecked.png
missing     /usr/share/YaST2/theme/current/wizard/installation.qss
missing     /usr/share/YaST2/theme/current/wizard/installation_richtext.css
missing     /usr/share/YaST2/theme/current/wizard/installation_slim.qss
missing     /usr/share/YaST2/theme/current/wizard/logo.png
missing     /usr/share/YaST2/theme/current/wizard/richtext.css
missing     /usr/share/YaST2/theme/current/wizard/separator.png
missing     /usr/share/YaST2/theme/current/wizard/spin_down.png
missing     /usr/share/YaST2/theme/current/wizard/spin_up.png
missing     /usr/share/YaST2/theme/current/wizard/step-current.png
missing     /usr/share/YaST2/theme/current/wizard/step-done.png
missing     /usr/share/YaST2/theme/current/wizard/step-todo.png
missing     /usr/share/YaST2/theme/current/wizard/style.qss
missing     /usr/share/YaST2/theme/current/wizard/style_slim.qss
missing     /usr/share/YaST2/theme/current/wizard/vline.png
.M.....T.    /boot/memtest.bin
....L....    /usr/lib64/browser-plugins/javaplugin.so
S.5....T.  c /etc/suspend.conf
.......T.    /lib/firmware/qlogic/sd7220.fw
....L....    /usr/share/java/xml-commons-apis.jar
S.5....T.  c /etc/sane.d/dll.conf
S.5....T.  c /etc/speech-dispatcher/speechd.conf
S.5....T.  c /etc/sysconfig/SuSEfirewall2
S.5....T.  c /etc/plymouth/plymouthd.conf
.........    /usr/share/kde4/apps/libkdcraw/profiles/srgb-d65.icm (replaced)
.M.......    /var/lib/PackageKit/transactions.db
S.5....T.  c /var/lib/nfs/etab
S.5....T.  c /var/lib/nfs/rmtab
S.5....T.  c /etc/php5/apache2/php.ini
....L....  c /etc/pam.d/common-account
....L....  c /etc/pam.d/common-auth
....L....  c /etc/pam.d/common-password
....L....  c /etc/pam.d/common-session
.......T.    /usr/lib64/gconv/gconv-modules.cache
S.5....T.  c /etc/ppp/chap-secrets
S.5....T.  c /etc/ppp/pap-secrets
.M.......    /boot/symtypes-3.16.7-7-vanilla.gz
S.5....T.  c /etc/ppp/peers/wvdial
S.5....T.  c /etc/wvdial.conf
S.5....T.  c /etc/tuned/active_profile
.M.......    /etc/cups
S.5....T.  c /etc/dnsmasq.conf
S.5....T.  c /etc/X11/xorg.conf.d/50-synaptics.conf
S.5....T.  c /etc/libvirt/qemu.conf
S.5....T.  c /etc/apache2/default-server.conf
S.5....T.  c /etc/apache2/mod_userdir.conf
/usr/sbin/suexec2: cannot verify root:root 0755 - not listed in /etc/permissions
S.5....T.  c /etc/default/passwd
....L....    /usr/share/java/jaxp_transform_impl.jar
.........    /usr/share/kde4/apps/kipi/data/kipi-icon.svg (replaced)
.........    /usr/share/kde4/apps/kipi/data/kipi-logo.svg (replaced)
.........    /usr/share/kde4/servicetypes/kipiplugin.desktop (replaced)
.........    /usr/bin/vnc_inetd_httpd (replaced)
.........    /usr/bin/vncpasswd (replaced)
.........    /usr/bin/vncpasswd.arg (replaced)
.........    /usr/bin/vncserver (replaced)
.........  d /usr/share/man/man1/vncpasswd.1.gz (replaced)
.........  d /usr/share/man/man1/vncserver.1.gz (replaced)
.........    /usr/share/vnc/classes/VncViewer.jar (replaced)
.........    /usr/share/vnc/classes/index.vnc (replaced)
S.5....T.  c /etc/dhcpd.conf
S.5....T.  c /etc/ipsec.conf
S.5....T.  c /etc/ipsec.secrets
S.5....T.  c /etc/apache2/conf.d/collectd.conf
.........    /usr/lib64/audacious/Output/pulse_audio.so (replaced)
....L....    /usr/share/java/jaxp_parser_impl.jar
.......T.  c /etc/login.defs
S.5....T.  c /etc/pam.d/login
....L....  d /usr/share/man/man1/ftp.1.gz
S.5....T.  c /etc/pulse/client.conf
missing     /opt/brother/scanner/brscan-skey/script
missing     /opt/brother/scanner/brscan-skey/script/brscan_scantoemail-0.2.4-0
....L....    /usr/share/java/xml-commons-resolver.jar
S.5....T.  c /etc/collectd.conf
S.5....T.  c /etc/modprobe.d/99-local.conf
S.5....T.  c /etc/systemd/journald.conf
S.5....T.  c /usr/share/kde4/config/kdm/kdmrc
S.5....T.  c /etc/squid/squid.conf
/usr/sbin/basic_pam_auth: cannot verify root:shadow 2750 - not listed in /etc/permissions
/usr/sbin/pinger: cannot verify root:squid 0750 - not listed in /etc/permissions
/var/cache/squid/: cannot verify squid:root 0750 - not listed in /etc/permissions
/var/log/squid/: cannot verify squid:root 0750 - not listed in /etc/permissions
S.5....T.  c /etc/xinetd.d/vnc
S.5....T.  c /etc/fonts/conf.d/10-rendering-options.conf
S.5....T.  c /etc/fonts/conf.d/58-family-prefer-local.conf
S.5....T.  c /etc/libvirt/libvirtd.conf
....L....    /var/lib/libvirt
S.5....T.  c /etc/ntp.conf
missing     /usr/lib/virtualbox/ExtensionPacks
....L....    /usr/lib64/browser-plugins/javaplugin.so
.....U...    /var/cache/cups
S.5....T.  c /etc/samba/smb.conf
---cut---

Does not look that bad. :)
Comment 9 Dr. Werner Fink 2015-04-20 11:07:06 UTC
(In reply to Richard Weinberger from comment #8)

The ``missing'' entries are not OK! Besides this, the log is showing

 Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.

That is, the file

  /usr/lib/systemd/system/cups.socket

is missing, at least in the failing boot case.  Also, booting a vanilla kernel without deep kernel and user space knowledge is a BadIdea[tm], as there are patches in the kernel which are required by the user space tools, besides bug fixes backported from upstream as well as fixes for laptops and encryption.
Comment 10 Richard Weinberger 2015-04-20 11:13:13 UTC
(In reply to Dr. Werner Fink from comment #9)
> (In reply to Richard Weinberger from comment #8)
> 
> The ``missing'' entries are not OK! Besides this, the log is showing

rpm found only some non-critical files missing:
1. UI stuff in /usr/share/YaST2/theme/current/wizard/
2. Office stuff in /usr/lib64/libreoffice/share/autocorr/

>  Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket
> failed to load: No such file or directory.
> 
> That is, the file
> 
>   /usr/lib/systemd/system/cups.socket
> 
> is missing, at least in the failing boot case.  Also, booting a vanilla
> kernel without deep kernel and user space knowledge is a BadIdea[tm], as
> there are patches in the kernel which are required by the user space tools,
> besides bug fixes backported from upstream as well as fixes for laptops and
> encryption.

Wait, just for the record, my system will not boot if a minor service like cups
is unable to start?

What kernel patches are needed to run openSUSE?
I have been following mainline for years now without issues.
Other kernel developers who work for SUSE may also want to know.
Comment 11 Dr. Werner Fink 2015-04-20 11:36:02 UTC
(In reply to Richard Weinberger from comment #10)

> Wait, just for the record, my system will not boot if a minor service like cups
> is unable to start?

No ... but if, on some boot attempts, a regular file isn't found after ``/'' aka the root file system is mounted, then this is a hint that the file system is corrupted.  Or you have not specified the correct passphrase (e.g. transposed letters), as a similar passphrase may cause trouble for some encryption schemes. That is, the decrypted file system in use looks only similar to the real encrypted one.

> What kernel patches are needed to run openSUSE?
> I have been following mainline for years now without issues.
> Other kernel developers who work for SUSE may also want to know.

Sorry but this is not a teaching portal. Please install the official kernel for openSUSE 13.2 together with all updates and retest.
Comment 12 Richard Weinberger 2015-04-20 11:40:36 UTC
(In reply to Dr. Werner Fink from comment #11)
> (In reply to Richard Weinberger from comment #10)
> 
> > Wait, just for the record, my system will not boot if a minor service like cups
> > is unable to start?
> 
> No ... but if, on some boot attempts, a regular file isn't found after ``/''
> aka the root file system is mounted, then this is a hint that the file
> system is corrupted.  Or you have not specified the correct passphrase (e.g.
> transposed letters), as a similar passphrase may cause trouble for some
> encryption schemes. That is, the decrypted file system in use looks only
> similar to the real encrypted one.

What file exactly? Which packages should contain it?
I'm asking because rpm did not report it as missing.

> > What kernel patches are needed to run openSUSE?
> > I have been following mainline for years now without issues.
> > Other kernel developers who work for SUSE may also want to know.
> 
> Sorry but this is not a teaching portal. Please install the official kernel
> for openSUSE 13.2 together with all updates and retest.

No Werner, you tell me now what patches are missing.
BTW, run git log on the kernel and watch out for my name...
Comment 13 Thomas Blume 2015-04-20 11:48:34 UTC
(In reply to Richard Weinberger from comment #12)
> (In reply to Dr. Werner Fink from comment #11)
> > No ... but if, on some boot attempts, a regular file isn't found after ``/''
> > aka the root file system is mounted, then this is a hint that the file
> > system is corrupted.  Or you have not specified the correct passphrase (e.g.
> > transposed letters), as a similar passphrase may cause trouble for some
> > encryption schemes. That is, the decrypted file system in use looks only
> > similar to the real encrypted one.
> 
> What file exactly? Which packages should contain it?
> I'm asking because rpm did not report it as missing.

It's also possible that lvm is to blame for missing/hanging devices.
There are known problems when lvmetad is enabled.
See bug 871704 for details.
If so, can you try with lvmetad disabled?
Comment 14 Dr. Werner Fink 2015-04-20 11:52:46 UTC
(In reply to Richard Weinberger from comment #12)

> What file exactly? Which packages should contain it?
> I'm asking because rpm did not report it as missing.

As you can see from http://git.infradead.org/~rw/os132_boot_fail.txt the file was missing at least once ... that is, in that failure case.

Or you have installed the wrong cups package.

> No Werner, you tell me now what patches are missing.
> BTW, run git log on the kernel and watch out for my name...

The repository

 osc ls openSUSE:13.2 kernel-source | grep patch
 apply-patches
 patches.addon.tar.bz2
 patches.apparmor.tar.bz2
 patches.arch.tar.bz2
 patches.drivers.tar.bz2
 patches.fixes.tar.bz2
 patches.kabi.tar.bz2
 patches.kernel.org.tar.bz2
 patches.rpmify.tar.bz2
 patches.rt.tar.bz2
 patches.suse.tar.bz2
 patches.trace.tar.bz2
 patches.xen.tar.bz2

is your friend:

 for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
 do
    osc cat openSUSE:13.2 kernel-source $p | tar tfj -
 done

... or for thinkpads ...

 for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
 do
    osc cat openSUSE:13.2 kernel-source $p | tar tfj -
 done | grep -i think

which shows here

 patches.arch/acpi_thinkpad_introduce_acpi_root_table_boot_param.patch

or for dm ...

 for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
 do
    osc cat openSUSE:13.2 kernel-source $p | tar tfj -
 done | grep -i /dm

I see

 patches.fixes/dm-mpath-reattach-dh
 patches.fixes/dm-release-map_lock-before-set_disk_ro
 patches.fixes/dm-table-switch-to-readonly
 patches.suse/dm-emulate-blkrrpart-ioctl
 patches.suse/dm-mpath-accept-failed-paths
 patches.suse/dm-mpath-detach-existing-hardware-handler
 patches.suse/dm-mpath-leastpending-path-update
 patches.suse/dm-mpath-no-activate-for-offlined-paths
 patches.suse/dm-mpath-no-partitions-feature

[...]
Comment 15 Richard Weinberger 2015-04-20 11:54:16 UTC
(In reply to Thomas Blume from comment #13)
> (In reply to Richard Weinberger from comment #12)
> > (In reply to Dr. Werner Fink from comment #11)
> > > No ... but if, on some boot attempts, a regular file isn't found after ``/''
> > > aka the root file system is mounted, then this is a hint that the file
> > > system is corrupted.  Or you have not specified the correct passphrase (e.g.
> > > transposed letters), as a similar passphrase may cause trouble for some
> > > encryption schemes. That is, the decrypted file system in use looks only
> > > similar to the real encrypted one.
> > 
> > What file exactly? Which packages should contain it?
> > I'm asking because rpm did not report it as missing.
> 
> It's also possible that lvm is to blame for missing/hanging devices.
> There are known problems when lvmetad is enabled.
> See bug 871704 for details.
> If so, can you try with lvmetad disabled?

The problem is that the issue happened only once.
It is not a persistent failure.

BTW: In my /etc/lvm/lvm.conf use_lvmetad is 0.

But as a failed boot is a nasty issue, I'm serious about finding
the root cause. :-)
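
For anyone who wants to repeat the lvmetad check, a minimal sketch follows. The function name `lvmetad_setting` is hypothetical, and the sketch assumes the setting appears as a plain `use_lvmetad = N` line in the file.

```shell
#!/bin/sh
# lvmetad_setting: print the value assigned to use_lvmetad in an lvm.conf
# style file (default: /etc/lvm/lvm.conf). Commented-out lines are skipped
# because the pattern only matches lines that begin with the key.
lvmetad_setting() {
    sed -n 's/^[[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        "${1:-/etc/lvm/lvm.conf}"
}

# Usage:
#   lvmetad_setting                 # inspect the system configuration
#   lvmetad_setting /tmp/lvm.conf   # inspect a copy
```

A printed value of 0 corresponds to lvmetad being disabled, as stated above.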
Comment 16 Richard Weinberger 2015-04-20 12:15:42 UTC
(In reply to Dr. Werner Fink from comment #14)
> (In reply to Richard Weinberger from comment #12)
> 
> > What file exactly? Which packages should contain it?
> > I'm asking because rpm did not report it as missing.
> 
> As you can see from http://git.infradead.org/~rw/os132_boot_fail.txt the
> file was missing at least once ... that is, in that failure case.
> 
> Or you have installed the wrong cups package.

So a broken cups randomly renders my system unusable?

> > No Werner, you tell me now what patches are missing.
> > BTW, run git log on the kernel and watch out for my name...
> 
> The repository
> 
>  osc ls openSUSE:13.2 kernel-source | grep patch
>  apply-patches
>  patches.addon.tar.bz2
>  patches.apparmor.tar.bz2
>  patches.arch.tar.bz2
>  patches.drivers.tar.bz2
>  patches.fixes.tar.bz2
>  patches.kabi.tar.bz2
>  patches.kernel.org.tar.bz2
>  patches.rpmify.tar.bz2
>  patches.rt.tar.bz2
>  patches.suse.tar.bz2
>  patches.trace.tar.bz2
>  patches.xen.tar.bz2
> 
> is your friend:
> 
>  for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
>  do
>     osc cat openSUSE:13.2 kernel-source $p | tar tfj -
>  done
> 
> ... or for thinkpads ...
> 
>  for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
>  do
>     osc cat openSUSE:13.2 kernel-source $p | tar tfj -
>  done | grep -i think
> 
> which shows here
> 
>  patches.arch/acpi_thinkpad_introduce_acpi_root_table_boot_param.patch

How would that patch affect my issue?
Did you read the bug entry related to that patch?
https://bugzilla.kernel.org/show_bug.cgi?id=8246

> or for dm ...
> 
>  for p in $(osc ls openSUSE:13.2 kernel-source | grep ^patches)
>  do
>     osc cat openSUSE:13.2 kernel-source $p | tar tfj -
>  done | grep -i /dm
> 
> I see
> 
>  patches.fixes/dm-mpath-reattach-dh

I'm not using multipath.

>  patches.fixes/dm-release-map_lock-before-set_disk_ro

This fixes a kernel lockup, my kernel was fine. systemd locked up.

>  patches.fixes/dm-table-switch-to-readonly


I'm not using multipath.

>  patches.suse/dm-emulate-blkrrpart-ioctl

Completely unrelated.

>  patches.suse/dm-mpath-accept-failed-paths
>  patches.suse/dm-mpath-detach-existing-hardware-handler
>  patches.suse/dm-mpath-leastpending-path-update
>  patches.suse/dm-mpath-no-activate-for-offlined-paths
>  patches.suse/dm-mpath-no-partitions-feature

Again, I'm not using multipath.

> [...]

Please don't point me to random patches.
If there is something missing in Linus's tree to run openSUSE, please tell.
Comment 17 Dr. Werner Fink 2015-04-20 12:32:01 UTC
(In reply to Richard Weinberger from comment #16)

You have missed my point: I want to know if the bug happens with the original kernel of openSUSE 13.2 and packages of openSUSE 13.2.  Then I can decide whether this is a bug in systemd or not.  Currently I doubt that this is a systemd bug at all.  Nor do I know if this could be a side effect of a similar but incorrect passphrase for the decryption.

> Please don't point me to random patches.
> If there is something missing in Linus's tree to run openSUSE, please tell.

Richard,

the only things you have provided are the log file, which indeed shows DM, the information that the system is a thinkpad, and a strange boot error which no one can reproduce here and which no one has seen in any other bug report. What do you expect in such a case?
Comment 18 Richard Weinberger 2015-04-20 12:44:59 UTC
(In reply to Dr. Werner Fink from comment #17)
> (In reply to Richard Weinberger from comment #16)
> 
> You have missed my point: I want to know if the bug happens with the
> original kernel of openSUSE 13.2 and packages of openSUSE 13.2.  Then I can
> decide whether this is a bug in systemd or not.  Currently I doubt that
> this is a systemd bug at all.  Nor do I know if this could be a side
> effect of a similar but incorrect passphrase for the decryption.

I never said that it is a systemd bug.

Let's sum up the facts we know.
1. The passphrase was correct as my root filesystem was mounted correctly.

2. For reasons I don't know, dbus did not start.
   The topmost errors in the log are:
Apr 16 13:51:01 sandpuppy systemd[1]: Failed to get initial list of names: Connection timed out
Apr 16 13:51:01 sandpuppy dbus[1570]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Apr 16 13:51:01 sandpuppy ModemManager[1567]: <warn>  failed to create PolicyKit authority: 'Error initializing authority: Error calling StartServiceByName for org.freedesktop.PolicyKit1: Time
Apr 16 13:51:01 sandpuppy dbus[1570]: [system] Failed to activate service 'org.freedesktop.PolicyKit1': timed out
Apr 16 13:51:01 sandpuppy dbus[1570]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Apr 16 13:51:02 sandpuppy dbus[1570]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'

And *much* later we face:
Apr 16 13:54:53 sandpuppy systemd[1]: Job dev-disk-by\x2did-raid\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF\x2dpart2.device/stop timed out.

If dbus was not running, the whole IPC between systemd and its helpers did not work. This would also explain
the timeout on dev-disk-by\x2did-raid\x2dcr_ata\x2dTOSHIBA_MK5061GSY_X2I2Y00CF...

3. The kernel did not lock up or panic. I was able to switch between ttys, but there was no getty.

4. Something bad happened to cups because we face this line:
Apr 16 13:51:02 sandpuppy systemd[1]: Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.

   Again, if dbus is down, the IPC is down and maybe systemd was unable to start cups and hence no cups.socket for us.

> > Please don't point me to random patches.
> > If there is something missing in Linus's tree to run openSUSE, please tell.
> 
> Richard,
> 
> the only things you have provided are the log file, which indeed shows DM,
> the information that the system is a ThinkPad, and a strange boot error
> which no one can reproduce here and which no one has seen in any other bug
> report. What do you expect in such a case?

As the issue happened only once, the log and the screenshot are all I have.
Please see my list of facts above. The missing device is not the topmost error
and can be explained by a dead dbus.

So, the interesting question is, what was wrong with dbus?
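
To make the "topmost error" argument easier to check against the full log, here is a minimal sketch that reports the first error-like line per process, in log order. The function name and the keyword list are my own assumptions for illustration, not something systemd or journald provides, and the keywords will miss errors that are phrased differently.

```python
import re

# Hypothetical keyword list for "something went wrong"; extend as needed.
ERROR_PATTERN = re.compile(r"failed|timed out|cannot", re.IGNORECASE)

# Classic syslog prefix: "Apr 16 13:51:01 host process[pid]: message"
LINE_PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[^\[:\s]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)


def first_errors(lines):
    """Return {process: (timestamp, message)} for the first error-like
    line of each process, keeping the order in which they appear."""
    seen = {}
    for line in lines:
        match = LINE_PATTERN.match(line)
        if not match or not ERROR_PATTERN.search(match.group("msg")):
            continue
        # setdefault keeps only the first hit per process.
        seen.setdefault(match.group("proc"), (match.group("stamp"), match.group("msg")))
    return seen
```

Feeding it the lines of os132_boot_fail.txt (for example via `open(path).read().splitlines()`) should list the systemd "Failed to get initial list of names" line ahead of the later device timeout, which is the ordering the facts above rely on.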
Comment 19 Dr. Werner Fink 2015-04-20 13:02:25 UTC
(In reply to Richard Weinberger from comment #18)

> I never said that it is a systemd bug.

Currently most errors reported by systemd are assigned to systemd. That is, the bug dispatcher or reporter isn't able to distinguish between reporting and causing an error.

> Let's sum up the facts we know.
> 1. The passphrase was correct as my root filesystem was mounted correctly.

Believe me: in rare cases a passphrase error does not prevent a file system from being mounted.  AFAICR I have seen this in the past on a virtual test system.  In that case the file system check recovered the subsequently broken ext3 file system.

> 2. For reasons I don't know dbus did not start. 
>   The top most errors on the log are:

Adding dbus-1 maintainers to carbon copy.
Comment 20 Richard Weinberger 2015-04-20 13:12:38 UTC
(In reply to Dr. Werner Fink from comment #19)
> (In reply to Richard Weinberger from comment #18)
> 
> > I never said that it is a systemd bug.
> 
> Currently most errors reported by systemd are assigned to systemd. That
> is, the bug dispatcher or reporter isn't able to distinguish between
> reporting and causing an error.
> 
> > Let's sum up the facts we know.
> > 1. The passphrase was correct as my root filesystem was mounted correctly.
> 
> Believe me: in rare cases a passphrase error does not prevent a file
> system from being mounted.  AFAICR I have seen this in the past on a
> virtual test system.  In that case the file system check recovered the
> subsequently broken ext3 file system.

Hmm, your idea is that instead of passphrase P I entered passphrase P'.
P' was able to open the LUKS disk, but as it was not identical to P the blocks
got decrypted incorrectly and caused bad files?

I don't think that this can happen with LUKS as the hash would not match.
A false positive hash _and_ mountable (bad) filesystems sounds very unlikely
to me. 
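
To illustrate why I consider this unlikely, here is a toy model of the check. It is a deliberately simplified sketch and not cryptsetup's code: real LUKS wraps the master key with a proper cipher and anti-forensic splitting, while this model uses XOR, and all names in it are made up for illustration. The one idea it keeps is that the header stores a digest of the master key, so a key candidate recovered with a wrong passphrase is rejected before any block is decrypted.

```python
import hashlib
import os


def derive(secret, salt, length=32, iterations=1000):
    """PBKDF2-HMAC-SHA256; iteration count kept low for the example."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=length)


def xor(a, b):
    """Toy stand-in for the real keyslot cipher (assumption, not LUKS)."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_header(passphrase, master_key):
    """Wrap the master key under the passphrase and store a digest of it."""
    slot_salt, digest_salt = os.urandom(16), os.urandom(16)
    return {
        "slot_salt": slot_salt,
        "wrapped_key": xor(master_key, derive(passphrase, slot_salt)),
        "digest_salt": digest_salt,
        "mk_digest": derive(master_key, digest_salt),
    }


def unlock(header, passphrase):
    """Return the master key, or None if the digest does not match."""
    candidate = xor(header["wrapped_key"], derive(passphrase, header["slot_salt"]))
    if derive(candidate, header["digest_salt"]) != header["mk_digest"]:
        return None  # wrong passphrase: the candidate key is never used
    return candidate
```

In this model P' only unlocks the volume if the digest of its key candidate collides with the stored one, and a volume opened that way would then decrypt with a wrong key, which makes a mountable filesystem on top of it even less plausible.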

> > 2. For reasons I don't know dbus did not start. 
> >   The top most errors on the log are:
> 
> Adding dbus-1 maintainers to carbon copy.

Thx!
Comment 21 Cristian Rodríguez 2015-06-01 14:40:44 UTC
Ok Richard, so you are running a self-built kernel, right? Ensure that the REQUIREMENTS section of http://cgit.freedesktop.org/systemd/systemd/tree/README is met by your build. In particular, it looks like your kernel does not have CONFIG_FHANDLE set.
Comment 22 Richard Weinberger 2015-06-01 14:51:02 UTC
(In reply to Cristian Rodríguez from comment #21)
> Ok Richard.. so you are running a self-built kernel right ? ensure
> http://cgit.freedesktop.org/systemd/systemd/tree/README section REQUIREMENTS
> is met by your build. In particular it looks like your kernel does not have
> CONFIG_FHANDLE set.

Yeah, the kernel is self-built (it's my day job)...
But CONFIG_FHANDLE *is* set.
In fact this config option is default because of me:
https://git.kernel.org/linus/15bae280e412e79d74912c7ae6b6a002444edb1f
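
For anyone who wants to verify this on their own build, a minimal sketch of such a check follows. The helper names are hypothetical, the list of required options has to be taken from the systemd README (only CONFIG_FHANDLE is named in this bug), and the text is assumed to be in the usual kernel .config format.

```python
import re


def option_states(config_text):
    """Map CONFIG_* names to their value ('y', 'm', a string, ...),
    or 'n' for lines of the form '# CONFIG_FOO is not set'."""
    states = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if match:
            states[match.group(1)] = match.group(2)
            continue
        match = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if match:
            states[match.group(1)] = "n"
    return states


def missing_options(config_text, required):
    """Return the required options that are neither built in nor modular.
    Options absent from the file count as missing."""
    states = option_states(config_text)
    return [opt for opt in required if states.get(opt, "n") not in ("y", "m")]


# Usage sketch (the path is an assumption; use your build's .config):
# with open("/usr/src/linux/.config") as handle:
#     print(missing_options(handle.read(), ["CONFIG_FHANDLE"]))
```

An empty list means every option passed in is enabled in that config file; it says nothing about options that were not listed.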
Comment 23 Cristian Rodríguez 2015-06-01 15:42:59 UTC
(In reply to Richard Weinberger from comment #22)
> (In reply to Cristian Rodríguez from comment #21)
> > Ok Richard.. so you are running a self-built kernel right ? ensure
> > http://cgit.freedesktop.org/systemd/systemd/tree/README section REQUIREMENTS
> > is met by your build. In particular it looks like your kernel does not have
> > CONFIG_FHANDLE set.
> 
> Yeah, the kernel is self-built (it's my day job)...
> But CONFIG_FHANDLE *is* set.
> In fact this config option is default because of me:
> https://git.kernel.org/linus/15bae280e412e79d74912c7ae6b6a002444edb1f

ok.. let's try something different then:

follow http://freedesktop.org/wiki/Software/systemd/Debugging/
section Early Debug Shell.

when pid 1 is looping too fast...

follow these instructions

http://git.infradead.org/~rw/os132_boot_fail.txt

Also, your boot log shows that the VirtualBox kernel drivers are being loaded, which are known to be, let's say, "buggy". Try disabling the vboxdrv service before continuing debugging.
Comment 24 Cristian Rodríguez 2015-06-01 15:44:24 UTC
(In reply to Cristian Rodríguez from comment #23)
> (In reply to Richard Weinberger from comment #22)
> > (In reply to Cristian Rodríguez from comment #21)
> > > Ok Richard.. so you are running a self-built kernel right ? ensure
> > > http://cgit.freedesktop.org/systemd/systemd/tree/README section REQUIREMENTS
> > > is met by your build. In particular it looks like your kernel does not have
> > > CONFIG_FHANDLE set.
> > 
> > Yeah, the kernel is self-built (it's my day job)...
> > But CONFIG_FHANDLE *is* set.
> > In fact this config option is default because of me:
> > https://git.kernel.org/linus/15bae280e412e79d74912c7ae6b6a002444edb1f
> 
> ok.. let's try something different then:
> 
> follow http://freedesktop.org/wiki/Software/systemd/Debugging/
> section Early Debug Shell.
> 
> when pid 1 is looping too fast...
> 
> follow these instructions
> 
> http://git.infradead.org/~rw/os132_boot_fail.txt
> 
> Also, your boot log shows that the VirtualBox kernel drivers are being
> loaded, which are known to be, let's say, "buggy". Try disabling the
> vboxdrv service before continuing debugging.

Oops, wrong instruction link. Here it is: http://lists.freedesktop.org/archives/systemd-devel/2015-February/028541.html
Comment 25 Cristian Rodríguez 2015-06-01 15:54:13 UTC
(In reply to Cristian Rodríguez from comment #24)
> (In reply to Cristian Rodríguez from comment #23)
> > (In reply to Richard Weinberger from comment #22)
> > > (In reply to Cristian Rodríguez from comment #21)
> > > > Ok Richard.. so you are running a self-built kernel right ? ensure
> > > > http://cgit.freedesktop.org/systemd/systemd/tree/README section REQUIREMENTS
> > > > is met by your build. In particular it looks like your kernel does not have
> > > > CONFIG_FHANDLE set.
> > > 
> > > Yeah, the kernel is self-built (it's my day job)...
> > > But CONFIG_FHANDLE *is* set.
> > > In fact this config option is default because of me:
> > > https://git.kernel.org/linus/15bae280e412e79d74912c7ae6b6a002444edb1f
> > 
> > ok.. let's try something different then:
> > 
> > follow http://freedesktop.org/wiki/Software/systemd/Debugging/
> > section Early Debug Shell.
> > 
> > when pid 1 is looping too fast...
> > 
> > follow these instructions
> > 
> > http://git.infradead.org/~rw/os132_boot_fail.txt
> > 
> > Also, your boot log shows that the VirtualBox kernel drivers are being
> > loaded, which are known to be, let's say, "buggy". Try disabling the
> > vboxdrv service before continuing debugging.
> 
> oops, wrong instruction link.. here it is
> http://lists.freedesktop.org/archives/systemd-devel/2015-February/028541.html


It could be a coincidence that

Apr 16 13:51:01 sandpuppy kernel: vboxdrv: Found 4 processor cores.
Apr 16 13:51:01 sandpuppy kernel: vboxdrv: fAsync=0 offMin=0x2ac offMax=0x7a7e
Apr 16 13:51:01 sandpuppy kernel: vboxdrv: TSC mode is 'synchronous', kernel timer mode is 'normal'.
Apr 16 13:51:01 sandpuppy kernel: vboxdrv: Successfully loaded version 4.3.20 (interface 0x001a0008).
Apr 16 13:51:01 sandpuppy kernel: vboxpci: IOMMU not found (not registered)

--> things go berserk right after the VirtualBox drivers are loaded:


Apr 16 13:51:01 sandpuppy systemd[1]: Failed to get initial list of names: Connection timed out
Apr 16 13:51:01 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.
Apr 16 13:51:01 sandpuppy systemd[1]: Looping too fast. Throttling execution a little.

That could be misleading so let's rule that out first...
Comment 26 Stephan Kulow 2016-01-08 09:06:49 UTC
Cristian, please take this bug. If Richard can find out something, fine - if not, close it as WORKSFORME. But *never* reassign it back to the screening team.
Comment 27 Karl Cheng 2017-11-16 11:46:14 UTC
Hi, thanks for reporting this issue.

As there has been no response for over a year, this issue will be closed.

If you can still reproduce the bug, please add a comment (reopen if possible) or open a new bug. Thank you!