|
Bugzilla – Full Text Bug Listing |
| Summary: | Long delay on shutdown stopping User Manager for 0 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Andrei Borzenkov <arvidjaar> |
| Component: | Basesystem | Assignee: | systemd maintainers <systemd-maintainers> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | crrodriguez, fbui, lmuelle, nfbrown, seife, thomas.blume, werner |
| Version: | 13.1 Beta 1 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | openSUSE 13.1 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
Suggested fix in SR#199892 (patch from Olivier Brunel) This is an autogenerated message for OBS integration: This bug (841544) was mentioned in https://build.opensuse.org/request/show/200302 Factory / systemd *** Bug 842003 has been marked as a duplicate of this bug. *** Still not fixed. Apparently patch just changes relative timings. The real problem is, SIGRTMIN+24 gets either lost or ignored when user instance is requested to stop. Here is log in good case: Sep 26 08:39:39 linux-1a7f systemd[1507]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Sep 26 08:39:39 linux-1a7f systemd[1507]: Failed to open private bus connection: Failed to connect to socket /run/user/0/dbus/user_bus_socket: No such file or directory Sep 26 08:39:39 linux-1a7f systemd[1507]: Starting Default. Sep 26 08:39:39 linux-1a7f systemd[1507]: Reached target Default. Sep 26 08:39:39 linux-1a7f systemd[1507]: Startup finished in 37ms. Sep 26 08:39:42 linux-1a7f systemd[1507]: Time has been changed Sep 26 08:39:50 linux-1a7f systemd[1507]: Stopping Default. Sep 26 08:39:50 linux-1a7f systemd[1507]: Stopped target Default. Sep 26 08:39:50 linux-1a7f systemd[1507]: Starting Shutdown. Sep 26 08:39:50 linux-1a7f systemd[1507]: Reached target Shutdown. Sep 26 08:39:50 linux-1a7f systemd[1507]: Starting Exit the Session... and in bad case Sep 26 08:41:23 linux-1a7f systemd[1508]: pam_unix(systemd-user:session): session opened for user root by (uid=0) Sep 26 08:41:23 linux-1a7f systemd[1508]: Failed to open private bus connection: Failed to connect to socket /run/user/0/dbus/user_bus_socket: No such file or directory Sep 26 08:41:23 linux-1a7f systemd[1508]: Starting Default. Sep 26 08:41:23 linux-1a7f systemd[1508]: Reached target Default. Sep 26 08:41:23 linux-1a7f systemd[1508]: Startup finished in 38ms. Sep 26 08:41:25 linux-1a7f systemd[1508]: Time has been changed Sep 26 08:45:27 linux-1a7f systemd[1508]: Stopping Default. Sep 26 08:45:27 linux-1a7f systemd[1508]: Stopped target Default. Sep 26 08:45:27 linux-1a7f systemd[1508]: Starting Shutdown. Sep 26 08:45:27 linux-1a7f systemd[1508]: Reached target Shutdown. Sep 26 08:45:27 linux-1a7f systemd[1508]: Starting Exit the Session... Sep 26 08:45:27 linux-1a7f systemd[1508]: Started Exit the Session. Sep 26 08:45:27 linux-1a7f systemd[1508]: Starting Exit the Session. Sep 26 08:45:27 linux-1a7f systemd[1508]: Reached target Exit the Session. Service "Exit the Sessions" sends SIGRTMIN+24 to systemd which should cause it to stop. But in the second case it did not get or react to it, and continue to run. Any attempt increase debug level makes this problem disappear. patch has to be replaced by: http://cgit.freedesktop.org/systemd/systemd/commit/?id=7400b9d2e99938d17b281d7df43680eade18666e upstream fix. already in my branch, I'll submit it soon (In reply to comment #4) > The real > problem is, SIGRTMIN+24 gets either lost or ignored it is never received by systemd user instance. Confirmed by adding additional logging immediately after receiveing SIGRTMIN+24. I managed to reproduce it with user instance debug level. /bin/kill -RTMIN+24 gets killed itself by SIGTERM. Jan 26 11:53:58 linux-1a7f systemd[1942]: Received SIGTERM from PID 1 (systemd). Jan 26 11:53:58 linux-1a7f systemd[1942]: Activating special unit exit.target Jan 26 11:53:58 linux-1a7f systemd[1942]: Trying to enqueue job exit.target/start/replace Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job exit.target/start as 3 Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job systemd-exit.service/start as 4 Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job shutdown.target/start as 5 Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job default.target/stop as 7 Jan 26 11:53:58 linux-1a7f systemd[1942]: Enqueued job exit.target/start as 3 Jan 26 11:53:58 linux-1a7f systemd[1942]: Stopping Default. Jan 26 11:53:58 linux-1a7f systemd[1942]: default.target changed active -> dead Jan 26 11:53:58 linux-1a7f systemd[1942]: Job default.target/stop finished, result=done Jan 26 11:53:58 linux-1a7f systemd[1942]: Stopped target Default. Jan 26 11:53:58 linux-1a7f systemd[1942]: Starting Shutdown. Jan 26 11:53:58 linux-1a7f systemd[1942]: shutdown.target changed dead -> active Jan 26 11:53:58 linux-1a7f systemd[1942]: Job shutdown.target/start finished, result=done Jan 26 11:53:59 linux-1a7f systemd[1942]: Reached target Shutdown. Jan 26 11:53:59 linux-1a7f systemd[1942]: Starting Exit the Session... Jan 26 11:53:59 linux-1a7f systemd[1942]: About to execute: /usr/bin/kill -s 58 $MANAGERPID Jan 26 11:53:59 linux-1a7f systemd[1942]: Forked /usr/bin/kill as 1951 Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service changed dead -> start Jan 26 11:53:59 linux-1a7f systemd[1942]: Set up jobs progress timerfd. Jan 26 11:53:59 linux-1a7f systemd[1942]: Collecting default.target Jan 26 11:53:59 linux-1a7f systemd[1942]: Received SIGCHLD from PID 1943 ((sd-pam)). Jan 26 11:53:59 linux-1a7f systemd[1942]: Got SIGCHLD for process 1943 ((sd-pam)) Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1943 died (code=exited, status=0/SUCCESS) Jan 26 11:53:59 linux-1a7f systemd[1942]: Received SIGCHLD from PID 1951 ((kill)). Jan 26 11:53:59 linux-1a7f systemd[1942]: Got SIGCHLD for process 1951 ((kill)) Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1951 died (code=killed, status=15/TERM) Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1951 belongs to systemd-exit.service Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service: main process exited, code=killed, status=15/TERM Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service changed start -> dead Jan 26 11:53:59 linux-1a7f systemd[1942]: Job systemd-exit.service/start finished, result=done Jan 26 11:53:59 linux-1a7f systemd[1942]: Started Exit the Session. Jan 26 11:53:59 linux-1a7f systemd[1942]: Closed jobs progress timerfd. Jan 26 11:53:59 linux-1a7f systemd[1942]: Starting Exit the Session. Jan 26 11:53:59 linux-1a7f systemd[1942]: exit.target changed dead -> active Jan 26 11:53:59 linux-1a7f systemd[1942]: Job exit.target/start finished, result=done Jan 26 11:53:59 linux-1a7f systemd[1942]: Reached target Exit the Session. The problem is, user@<UID>.service has KillMode=control-group which causes systemd to traverse cgroup and send SIGTERM to every process. So it happens to find /bin/kill in the wrong moment and kills it. Even ignoring this problem it looks wrong. systemd user instance manages all started services itself. Each service started by user instance may have separate requirements how it should be stopped; sending every process SIGTERM before trying ExecStop for repective service is just wrong. Unfortunately currently systemd enforces KillMode=control-group for services that register itself with PAM. SR#215564 for Base:System with upstream fix (even though I do not quite like it). home:arvidjaar:bnc:841544:13.1 has fix for 13.1. thanks a lot ... I've to resolve a conflict in systemd.changes to be able to add the content of your SR See SR#215598 This is an autogenerated message for OBS integration: This bug (841544) was mentioned in https://build.opensuse.org/request/show/215598 Factory / systemd This still affects opensuse 13.1, should we backport the corresponding fixes ? openSUSE-RU-2016:0320-1: An update that has 146 recommended fixes can now be installed. Category: recommended (moderate) Bug References: 737690,742774,750845,818044,838475,841544,849870,852015,852021,852232,853293,854884,856389,856392,856858,857204,858864,859072,859365,860574,860937,861316,861489,863217,864745,864904,865834,866732,866933,867128,867663,867664,867840,868019,868230,868439,868931,869142,869603,872929,873432,873444,874665,875502,876587,876694,877021,877674,878525,880438,880732,881125,881559,881942,882393,882714,883565,884271,884403,885232,885288,886211,886599,886852,888178,888215,888612,889297,889357,890977,892096,892162,892300,893797,895087,896664,897799,897801,897803,898233,898240,898432,900558,901481,902240,902901,903009,903963,904214,904517,904828,905550,906709,906900,907318,907393,908476,909358,910643,911347,912030,912334,913517,916420,918118,919095,920195,921831,921898,921920,926169,927250,927457,928265,931388,932284,933365,933512,933521,933533,934077,934901,937512,937900,938908,939571,940264,941576,944132,944799,945282,947212,948458,948555,948705,949574,949683,949739,950510,951265,951663,953241,954336,954781,955635,961576 CVE References: Sources used: openSUSE 13.1 (src): systemd-210-40.1, systemd-mini-210-40.1 This issue should be fixed by the latest update release of systemd. Feel free to reopen it if you don't think it's the case. Closing. |
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0 In beta 1 every shutdown waits for timeout stopping user manager: [ OK ] Stopped LSB: Configure network interfaces and set up routing. Stopping SuSEfirewall2 phase 1... [ OK ] Stopped SuSEfirewall2 phase 1. [ *] A stop job is running for User Manager for 0 This is VM in qemu KVM, minimal text only installation. Reproducible: Always Steps to Reproduce: 1. 2. 3.