Bug 878853

Summary: No output upon reboot
Product: [openSUSE] openSUSE 13.1
Reporter: Richard Weinberger <richard>
Component: Basesystem
Assignee: Thomas Blume <thomas.blume>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: olivpass, thomas.blume, vojtech
Version: Final
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 13.1
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments:
  minimal output upon reboot
  upstream-fix-hanging-ssh-sessions-at-shutdown.patch
  the systemd core dump "/usr/lib/systemd/systemd --system --deserialize 18"

Description Richard Weinberger 2014-05-20 13:54:43 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0

If I reboot my 13.1 VM via SSH the SSH-Session immediately freezes.
os131-template:~ # systemctl reboot
Timeout, server 10.0.1.28 not responding.

Also, no output at all is visible on the VGA console.
Interestingly, I still see the login prompt from getty and the cursor blinks, but
no input is possible.
After 2 minutes the kernel reboots.

It looks like systemd (?) disables various services too early.

Reproducible: Always

Steps to Reproduce:
1. Install an openSUSE 13.1 minimal (no X, just server)
2. Reboot
Comment 1 Dr. Werner Fink 2014-05-21 07:34:22 UTC
No, it does not. If you would like to see output, you have to press ESC on the keyboard to make plymouth show the boot messages. I have several VMs here, one with 13.1 and some more with the latest factory and SLES12.

To avoid the freeze you may install systemd from

   http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/
Comment 2 Richard Weinberger 2014-05-21 07:41:20 UTC
As this is a minimal installation, no plymouth is installed.
And how does your resolution explain the fact that my SSH connection also times out?

What special systemd is this in
http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/?
Comment 3 Richard Weinberger 2014-05-21 07:42:56 UTC
-> REOPEN
Comment 4 Dr. Werner Fink 2014-05-21 07:44:54 UTC
Stop this!
Comment 5 Richard Weinberger 2014-05-21 07:47:18 UTC
Excuse me, stop what?
Comment 6 Dr. Werner Fink 2014-05-21 07:51:15 UTC
I've explained exactly what happens and how to solve the freeze. In other words: there is no need to reopen this bug.
Comment 7 Richard Weinberger 2014-05-21 08:03:19 UTC
I apologize for reopening the bug.

But as I wrote, there is no plymouth installed as it is the minimal
installation. Hitting ESC does not work.
All I see is a getty prompt without output.

Also, why does the SSH connection time out?
The expected behavior is a message like:
---
Broadcast message from root@(none) (pts/1) (Wed May 21 07:59:41 2014):

The system is going down for reboot NOW!
---
followed by sshd closing the TCP connection.

Finally, I don't know every one of the millions of openSUSE repos.
What special systemd is this in
http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/?
If it fixes a bug why isn't it in update?
Comment 8 Dr. Werner Fink 2014-05-21 08:13:00 UTC
(In reply to comment #7)

If you want to have the messages on the console you may modify

   LogTarget=

in /etc/systemd/system.conf by removing the '#' and using console as the target, but be aware that you get what you have requested.  Using plymouth is the other way to see boot messages, but once plymouth is stopped, no further messages will pollute the console.

The freeze is/was caused by the order in which the network and processes are stopped. With http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/ this has changed. Nevertheless, the result is the same: the system reboots.
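
As an illustration of the LogTarget= change described above, here is a minimal Python sketch that uncomments the LogTarget= line in the text of a system.conf and sets it to console. The function name and the sample text are hypothetical, and the sketch works on a string rather than editing /etc/systemd/system.conf in place.

```python
import re

# Matches a LogTarget= line whether or not it is commented out with '#'.
LOG_TARGET = re.compile(r"^[ \t]*#?[ \t]*LogTarget=.*$", re.MULTILINE)

def set_log_target(conf_text: str, target: str = "console") -> str:
    """Return conf_text with its first LogTarget= line set to `target`."""
    new_line = f"LogTarget={target}"
    if LOG_TARGET.search(conf_text):
        return LOG_TARGET.sub(new_line, conf_text, count=1)
    # No LogTarget= line at all: append one. This assumes [Manager] is the
    # only section in the file, so the end of the file is inside it.
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"

# Hypothetical sample content; the commented default value is an assumption.
sample = "[Manager]\n#LogLevel=info\n#LogTarget=journal-or-kmsg\n"
print(set_log_target(sample))
```

The change only affects where systemd itself logs, so a reboot is the simplest way to check the effect on the console.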
Comment 9 Richard Weinberger 2014-05-21 08:51:32 UTC
Created attachment 591288 [details]
minimal output upon reboot
Comment 10 Richard Weinberger 2014-05-21 08:52:25 UTC
Thanks for the info.
Setting LogTarget=console gives me only minimally more output.
Please see the attached image. After I took the screenshot, it hung for 2 more
minutes.
This happens on a completely freshly installed openSUSE 13.1.
I've double-checked it.
If it helps, I can upload my VM image plus detailed instructions on how to reproduce.

Upgrading to systemd-210 from Base:/System solves most problems.
1. reboot no longer hangs
2. sshd closes the connection upon shutdown/reboot
3. I see all output on the console (after setting LogTarget=console)

But I still get no messages like:
---
Broadcast message from root@(none) (pts/1) (Wed May 21 07:59:41 2014):

The system is going down for reboot NOW!
---

Has this feature been removed?

Last but not least, when will the fixed systemd from Base:/System hit opensuse 13.1 as update?
Comment 11 Dr. Werner Fink 2014-05-21 08:59:31 UTC
(In reply to comment #10)

The broadcast messages are not delivered in the case of an immediate reboot; compare with the upstream source code of systemd.

As for an update, I'm not sure, as this may cause other bug reports due to changed behaviour, and this has happened very often in the past.
Comment 12 Richard Weinberger 2014-05-21 09:07:48 UTC
So, the issue is _not_ fixed in openSUSE 13.1 unless one installs manually
an updated version of systemd from http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/?
Sorry, but I have a hard time understanding why you marked this bug as resolved.

Please consider an update. I'm sure that I'm not the only one who is annoyed
by the said issues.
Comment 13 Vojtech Pavlik 2014-05-21 09:29:56 UTC
Dear Werner,

I don't think it's at all correct to close this bug as WORKSFORME, since in an openSUSE 13.1 distribution with all updates installed, it CANNOT WORK FOR YOU.

In fact, every 13.1 install I did recently was broken in exactly the same way.

Timing out on reboot for a long time. Not closing SSH connections before rebooting. And unless you fiddle with the debugging facilities of systemd, e.g. by adding a debug shell, there is also no indication of what is going on.

It's highly annoying.

Yes, installing systemd from Base:System:13.1 fixes that, and that's what I have done on all the machines even before stumbling upon this bug report. But so does installing Ubuntu instead. openSUSE 13.1 is still broken and every single user is bitten by these bugs in 13.1's systemd. An update is sorely needed.

Vojtech
Comment 14 Dr. Werner Fink 2014-05-21 09:54:09 UTC
(In reply to comment #13)

If you're willing to handle and solve all bugs caused by such an update, I'm willing to do this update.  Now the question is: are you willing to help here?  It should be noted that I'm really busy with SLES-12. I've tried to make systemd-210 work on 13.1, but I cannot and will not guarantee this (out of the box, systemd-210 would not boot on 13.1).
Comment 15 Vojtech Pavlik 2014-05-21 11:58:48 UTC
I understand you're worried that the update might break more than it fixes.

While that reflects on the overall state of systemd, I can also say that I haven't observed any ill effects after updating to version 209 or 210 from Base:System:13.1.

If you feel uncomfortable doing a wholesale upgrade, I can also point out the specific commit that fixes this problem in the 208->209 changes. I found it some time ago and would need to look it up again, but I can do that.

I'm certainly willing to help, and I expect that Richard is, too.

But no, I'm not interested in taking over the maintainership of the systemd package for 13.1 after this update as you suggest. ;)
Comment 16 Richard Weinberger 2014-05-21 12:12:05 UTC
Of course I'm willing to help.
If there is anything I can do, please tell me.
Comment 17 Vojtech Pavlik 2014-05-22 09:04:12 UTC
This is the upstream bug:

https://bugs.freedesktop.org/show_bug.cgi?id=70593

And this is a commit that I believe fixes it:

http://cgit.freedesktop.org/systemd/systemd/commit/?id=63966da

I haven't tried building a package with the fix to test.

By my testing, 209 and 210 aren't affected. According to the freedesktop bugzilla, 211 and 212 see a similar issue again.
Comment 18 Thomas Blume 2014-05-23 13:35:15 UTC
building test packages.
Comment 19 Thomas Blume 2014-05-26 06:47:20 UTC
Test packages are available at:

https://build.opensuse.org/package/binaries/home:tsaupe:branches:openSUSE:13.1:Update/systemd?repository=standard


Can you please test?
Comment 20 Richard Weinberger 2014-05-26 06:57:05 UTC
Hmmm, the link to the download repo gives me an HTTP 404 :(
http://download.opensuse.org/repositories/home:/tsaupe:/branches:/openSUSE:/13.1:/Update/standard
Comment 21 Richard Weinberger 2014-06-16 08:53:27 UTC
The download link is still 404.
I cannot test.
Comment 22 Thomas Blume 2014-06-27 09:07:30 UTC
Hm, my repository got deleted, not sure why.
I have rebuilt it now.
Please fetch the packages from:

https://build.opensuse.org/package/binaries/home:tsaupe:branches:openSUSE:13.1:Update/systemd?repository=standard
Comment 23 Thomas Blume 2015-01-26 16:22:44 UTC
(In reply to Thomas Blume from comment #22)
> Hm, my repository got deleted, not sure why.
> I have rebuilt it now.
> Please fetch the packages from:
> 
> https://build.opensuse.org/package/binaries/home:tsaupe:branches:openSUSE:13.1:Update/systemd?repository=standard

Tested locally, but I still got hanging ssh sessions.
I am currently implementing the patches from:

https://bugzilla.redhat.com/show_bug.cgi?id=626477 comment#38.
Let's see whether they fix the issue.
Comment 24 Thomas Blume 2015-01-27 11:34:45 UTC
Created attachment 621005 [details]
upstream-fix-hanging-ssh-sessions-at-shutdown.patch

ported patchset from upstream
Comment 25 Thomas Blume 2015-01-27 11:40:03 UTC
The attached patch fixes the issue on my test machine.
However, the changes are quite large and I'd really like some more confirmation before submitting.
Test packages are available here:

https://build.opensuse.org/package/binaries/home:tsaupe:branches:openSUSE:13.1:Update/systemd?repository=standard

Can you please test and report feedback?
Comment 26 Thomas Blume 2015-01-27 12:01:30 UTC
Sorry, wrong link in my previous comment, please use this one:

http://download.opensuse.org/repositories/home:/tsaupe:/branches:/openSUSE:/13.1:/Update/standard/
Comment 27 Richard Weinberger 2015-01-29 10:36:49 UTC
Your patch fixes the issue that ssh connections hang upon reboot. :-)
But if I issue a reboot from ssh I still don't see any output on the tty console.
Getty is just there and suddenly the box reboots.

Thanks,
//richard
Comment 28 Thomas Blume 2015-02-12 08:36:54 UTC
(In reply to Richard Weinberger from comment #27)
> Your patch fixes the issue that ssh connections hang upon reboot. :-)
> But if I issue a reboot from ssh I still don't see any output on the tty
> console.
> Getty is just there and suddenly the box reboots.

I could reproduce the issue.
Actually the reboot delay is caused by a user session timing out:

-->--
[   84.068384] systemd[1]: Got D-Bus request: org.freedesktop.systemd1.Agent.Released() on /org/freedesktop/systemd1/agent
[   84.068928] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
[  171.493705] systemd[1]: user@0.service stopping timed out. Killing.
[  171.496599] systemd[1]: user@0.service changed stop-sigterm -> stop-sigkill
[  171.496688] systemd[1]: Received SIGCHLD from PID 1554 (systemd).
[  171.496739] systemd[1]: Got SIGCHLD for process 1554 (systemd)
[  171.496876] systemd[1]: Child 1554 died (code=killed, status=9/KILL)
[  171.496883] systemd[1]: Child 1554 belongs to user@0.service
[  171.496908] systemd[1]: user@0.service: main process exited, code=killed, status=9/KILL
--<--

I've added some upstream patches that address this issue.
Please give my new test packages at:

http://download.opensuse.org/repositories/home:/tsaupe:/branches:/openSUSE:/13.1:/Update/standard/

a try.
There is still not much log output on reboot, but this is a configuration issue.
If you want more boot log output, just remove the "quiet" option from the boot parameters.
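
To make the delay in the log excerpt above concrete, the following Python sketch parses the bracketed timestamps and reports the largest gap between consecutive lines. The helper name is illustrative and not part of any systemd tooling; the sample lines are copied from the excerpt.

```python
import re

# Leading "[  seconds.micros]" timestamp as printed in the excerpt.
TS = re.compile(r"^\[\s*(\d+\.\d+)\]")

def largest_gap(lines):
    """Return (gap_in_seconds, line_after_gap) for the biggest timestamp jump."""
    stamps = []
    for line in lines:
        m = TS.match(line)
        if m:
            stamps.append((float(m.group(1)), line))
    best = (0.0, None)
    for (t0, _), (t1, line) in zip(stamps, stamps[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, line)
    return best

log = [
    "[   84.068384] systemd[1]: Got D-Bus request: org.freedesktop.systemd1.Agent.Released() on /org/freedesktop/systemd1/agent",
    "[   84.068928] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local",
    "[  171.493705] systemd[1]: user@0.service stopping timed out. Killing.",
    "[  171.496599] systemd[1]: user@0.service changed stop-sigterm -> stop-sigkill",
]

gap, line = largest_gap(log)
print(f"{gap:.1f} s before: {line}")
```

On these lines the largest jump, roughly 87 seconds, sits directly before the user@0.service stop timeout, which matches the observed reboot delay.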
Comment 29 Benjamin Brunner 2015-02-13 12:05:54 UTC
Thomas, can we already release the update for systemd or should we better wait for a follow-up fix?
Comment 30 Thomas Blume 2015-02-13 13:13:56 UTC
(In reply to Benjamin Brunner from comment #29)
> Thomas, can we already release the update for systemd or should we better
> wait for a follow-up fix?

The fixes are independent. If you want to release the ssh hang fix first, go ahead.
Comment 31 Swamp Workflow Management 2015-02-16 13:05:32 UTC
openSUSE-RU-2015:0289-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 878853
CVE References: 
Sources used:
openSUSE 13.1 (src):    systemd-208-28.1, systemd-mini-208-28.1, systemd-rpm-macros-2-28.1
Comment 32 Oliver Mössinger 2015-02-18 09:57:24 UTC
Hi,

after installing this update on our VMware virtual hosts we get systemd core dumps! The following is from the /var/log/messages log:

2015-02-17T02:24:33.445884+01:00 monsrv systemd[1]: Caught <SEGV>, dumped core as pid 30432.
2015-02-17T02:24:33.446251+01:00 monsrv systemd[1]: Freezing execution.
2015-02-17T02:24:33.444830+01:00 monsrv kernel: [10420206.631824] systemd[1]: segfault at 7f3ebd73a9d8 ip 00007f3ebd73a9d8 sp 00007fffe40b69a8 error 15 in libc-2.18.so[7f3ebd73a000+2000]
2015-02-17T02:30:26.963818+01:00 monsrv /usr/sbin/cron[1625]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T02:30:26.964212+01:00 monsrv dbus[575]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
2015-02-17T02:30:26.964556+01:00 monsrv /usr/sbin/cron[1624]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T02:15:01.502588+01:00 monsrv systemd-logind[12830]: Failed to store session release timer fd
2015-02-17T02:30:26.964855+01:00 monsrv systemd-logind[12830]: Failed to start session scope session-32876.scope: Activation of org.freedesktop.systemd1 timed out org.freedesktop.DBus.Error.TimedOut
2015-02-17T02:30:51.990166+01:00 monsrv systemd-logind[12830]: Failed to start session scope session-32877.scope: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. org.freedesktop.DBus.Error.NoReply
2015-02-17T02:40:26.207648+01:00 monsrv systemd-logind[12830]: Failed to start session scope session-32878.scope: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. org.freedesktop.DBus.Error.NoReply
2015-02-17T02:40:26.208416+01:00 monsrv /usr/sbin/cron[8030]: pam_systemd(crond:session): Failed to create session: Input/output error
2015-02-17T02:45:26.560109+01:00 monsrv systemd-logind[12830]: Failed to start session scope session-32879.scope: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. org.freedesktop.DBus.Error.NoReply
2015-02-17T02:45:26.560465+01:00 monsrv /usr/sbin/cron[11081]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T02:45:26.560969+01:00 monsrv /usr/sbin/cron[11080]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T02:45:51.568899+01:00 monsrv systemd-logind[12830]: Failed to start session scope session-32880.scope: Activation of org.freedesktop.systemd1 timed out org.freedesktop.DBus.Error.TimedOut
2015-02-17T03:00:26.823095+01:00 monsrv /usr/sbin/cron[20726]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T03:00:26.823395+01:00 monsrv /usr/sbin/cron[20727]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2015-02-17T03:00:26.823707+01:00 monsrv /usr/sbin/cron[20725]: pam_systemd(crond:session): Failed to create session: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
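
A log like the one above is easier to read once the failures that follow the crash are grouped by program. The Python sketch below does that under stated assumptions: the line layout (ISO timestamp, host, program[pid], message) is inferred from the excerpt, the function name is made up for illustration, and only a few of the shorter lines are used as sample input.

```python
import re
from collections import Counter
from datetime import datetime

# "<ISO timestamp> <host> <program>[<pid>]: <message>"
LINE = re.compile(r"^(\S+) (\S+) ([^\[\s]+)\[(\d+)\]: (.*)$")

def failures_after_crash(lines):
    """Return (crash_time, Counter of 'Failed' messages per program after it)."""
    crash_time = None
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # e.g. kernel lines without a [pid]
        stamp, _host, prog, _pid, msg = m.groups()
        when = datetime.fromisoformat(stamp)
        if crash_time is None:
            if "Caught <SEGV>" in msg:
                crash_time = when
            continue
        # The log is not strictly ordered, so compare timestamps as well.
        if when > crash_time and "Failed" in msg:
            counts[prog] += 1
    return crash_time, counts

sample = [
    "2015-02-17T02:24:33.445884+01:00 monsrv systemd[1]: Caught <SEGV>, dumped core as pid 30432.",
    "2015-02-17T02:24:33.446251+01:00 monsrv systemd[1]: Freezing execution.",
    "2015-02-17T02:30:26.964212+01:00 monsrv dbus[575]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out",
    "2015-02-17T02:15:01.502588+01:00 monsrv systemd-logind[12830]: Failed to store session release timer fd",
    "2015-02-17T02:40:26.208416+01:00 monsrv /usr/sbin/cron[8030]: pam_systemd(crond:session): Failed to create session: Input/output error",
]

crash, counts = failures_after_crash(sample)
print(crash, dict(counts))
```

The 02:15:01 systemd-logind line is deliberately left out of the count because it predates the crash, even though it appears later in the file.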
Comment 33 Oliver Mössinger 2015-02-18 10:00:21 UTC
Created attachment 623656 [details]
the systemd core dump "/usr/lib/systemd/systemd --system --deserialize 18"

the missing core dump
Comment 34 Oliver Mössinger 2015-02-18 10:15:07 UTC
The patch I installed is this: openSUSE-2015-149

http://lists.opensuse.org/opensuse-updates/2015-02/msg00065.html
Comment 35 Thomas Blume 2015-02-18 10:54:54 UTC
(In reply to Oliver Mössinger from comment #34)
> The patch I installed is this: openSUSE-2015-149
> 
> http://lists.opensuse.org/opensuse-updates/2015-02/msg00065.html

This issue is already being handled in bug 918226.
Comment 36 Swamp Workflow Management 2015-02-22 19:04:58 UTC
openSUSE-RU-2015:0347-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 878853,918226
CVE References: 
Sources used:
openSUSE 13.1 (src):    systemd-208-32.1, systemd-mini-208-32.1, systemd-rpm-macros-2-32.1
Comment 37 Thomas Blume 2015-05-05 06:37:49 UTC
Update has been released, closing