|
Bugzilla – Full Text Bug Listing |
| Summary: | Race condition in systemd startup/shutdown affecting CIFS remote mounts | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 13.1 | Reporter: | Rodney Baker <rodney.baker> |
| Component: | Network | Assignee: | systemd maintainers <systemd-maintainers> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | arvidjaar, fbui, mt, wicked-maintainers |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | openSUSE 13.1 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Rodney Baker
2014-06-20 13:28:02 UTC
(In reply to comment #0) > to /usr/lib/systemd/system/remote-fs.service. > Please show output of "rpm -qif /usr/lib/systemd/system/remote-fs.service" Sorry - correction - the file should be remote-fs.target (and this is where I added the Requires=network@eth0.service line. Name : systemd Version : 208 Release : 19.1 Architecture: x86_64 Install Date: Fri Mar 14 02:51:03 2014 Group : System/Base Size : 6235015 License : LGPL-2.1+ Signature : RSA/SHA256, Mon Mar 3 20:05:01 2014, Key ID b88b2fd43dbdc284 Source RPM : systemd-208-19.1.src.rpm Build Date : Mon Feb 24 21:37:07 2014 Build Host : build08 Relocations : (not relocatable) Packager : http://bugs.opensuse.org Vendor : openSUSE URL : http://www.freedesktop.org/wiki/Software/systemd Summary : A System and Session Manager Description : Systemd is a system and service manager, compatible with SysV and LSB init scripts for Linux. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux cgroups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic. It can work as a drop-in replacement for sysvinit. Distribution: openSUSE 13.1 My edited remote-fs.target unit file is as follows: # This file is part of systemd. # # systemd is free software; you can redistribute it and/or modify it # under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation; either version 2.1 of the License, or # (at your option) any later version. [Unit] Description=Remote File Systems Documentation=man:systemd.special(7) After=remote-fs-pre.target network.target DefaultDependencies=no Conflicts=shutdown.target Requires=network.service network@eth0.service Note that I first added Requires=network.service but this did not solve the problem - it was necessary to require the interface. I did not try network-online.target though... (In reply to comment #2) > Sorry - correction - the file should be remote-fs.target So what service mounts filesystems from /etc/samba/cifstab? remote-fs.target is pure synchronization point and does not do anything by itself. Could you test updated package. It should make legacy services to be started after network-online.target; network@xxx.service are ordered before it by default. zypper add obs://home:arvidjaar:bnc:883565/standard bnc883565 zypper refresh bnc883565 zypper dup -r bnc883565 And revert your change in remote-fs.target (should be overwritten on update anyway). OK, thanks. Please give me a day or so to test and respond. Actually I got it done quicker. Happy to report that the fix appears to have resolved the issue for me. A couple of shutdown/startup cycles and the CIFS shares are mounted and unmounted correctly each time. Thank you! @systemd-maintainers - this is fixed upstream in commit 0404c609f399b2092a3de52eef9d75b0dc12e94c which makes legacy scripts that need $network be ordered after network-online.target. network@.service is already ordered before network-onlilne.target. For factory request 238278 contains backport of this commit (it does not apply directly). For 13.1 I believe we need patch as well. (In reply to comment #7) This requires that in wicked.service there is a line Wants=network-online.target as otherwise we get into trouble with OS 13.2 and SLES12 Marius? (In reply to comment #8) > (In reply to comment #7) > > This requires that in wicked.service there is a line > > Wants=network-online.target > No, it does not. Systemd automatically adds Wants=network-online.target to legacy sysvinit services. That's the main point of upstream patch. This requires that network-online.target comes after wicked services; I assumed this is already the case? This will interfere with 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch and insserv-generator.patch Er, I'm confused. I can find no wicked.service on my machine at all. I'm talking about the soon coming 13.2 and not 13.1 Ah, OK. This report was for 13.1, hence my confusion at your response. :) (In reply to comment #10) > This will interfere with > 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch Could you explain. My patch is on top of 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch and complements it by adding compatible upstream behavior. It does not change what 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch does. > and insserv-generator.patch You are right, I missed it. I submitted SR#238415 which updates insserv-generator.patch to use network-onlilne.target for $network. (In reply to comment #14) the patch is on top of 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch but as you may have seen this patch also uses UNIT_REQUIRES_OVERRIDABLE for all required services .. therefore I wonder if your boolean does match: if (streq(m, SPECIAL_NETWORK_ONLINE_TARGET) && d == UNIT_AFTER && e == _UNIT_DEPENDENCY_INVALID) e = UNIT_WANTS; do you have traced this to see how often this does match? I'm currently testing out your new SR#238415 ... beside some other changes. For openSUSE 13.1 I have to use
%if 0%{?suse_version} <= 1310
#
# Older versions like oS 13.1 do not distinguish between
# network.target and network-online.target
#
for f in src/core/service.c src/insserv-generator/insserv-generator.c
do
sed -ri '/"network",.*SPECIAL_NETWORK_ONLINE_TARGET,/{ s/SPECIAL_NETWORK_ONLINE_TARGET/SPECIAL_NETWORK_TARGET/}' $f
done
%endif
to make it work
(In reply to comment #8) > (In reply to comment #7) > > This requires that in wicked.service there is a line > > Wants=network-online.target > > as otherwise we get into trouble with OS 13.2 and SLES12 > > Marius? Yes. Fix is in git master and will be submitted today. [https://github.com/openSUSE/wicked/commit/e6d1f904ba] This is an autogenerated message for OBS integration: This bug (883565) was mentioned in https://build.opensuse.org/request/show/238676 Factory / wicked (In reply to comment #16) > For openSUSE 13.1 I have to use > > %if 0%{?suse_version} <= 1310 > # > # Older versions like oS 13.1 do not distinguish between > # network.target and network-online.target > # > for f in src/core/service.c src/insserv-generator/insserv-generator.c > do > sed -ri '/"network",.*SPECIAL_NETWORK_ONLINE_TARGET,/{ > s/SPECIAL_NETWORK_ONLINE_TARGET/SPECIAL_NETWORK_TARGET/}' $f > done > %endif > > to make it work ?? That is what we have right now. And it is broken. I think I understand the problem. In 13.1 network@<if>.service is not part of transaction. It is started explicitly by /etc/init,d/network via "systemctl start network@<if>". This means we just replaced one race condition with another one. If job for network@<if> is queued *before* network-online.target, it works. But if network-online@target happens to be finished before, we lose. Note that upstream commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 adds explicit After=network.target to network-online.target. This should fix it - it ensures that network-online.target is started only after /etc/init.d/network is finished, which means all jobs for network@if are submitted. I added backport of this patch to my repo, which is rebuilding right now. Could you test if it works for you? (In reply to comment #17) > (In reply to comment #8) > > (In reply to comment #7) > > > > This requires that in wicked.service there is a line > > > > Wants=network-online.target > > > > as otherwise we get into trouble with OS 13.2 and SLES12 > > > > Marius? > > Yes. Fix is in git master and will be submitted today. > [https://github.com/openSUSE/wicked/commit/e6d1f904ba] Folks, that is totally wrong, please do not do it. network-online.target should be pulled in by *consumers* of network, not by *providers*. What you must do, is to order wicked.service before network-online.target, but this is automatic with commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 from upstream. (In reply to comment #21) The commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 is now part of Base:System/systemd ... nevertheless I've seen that upstream there is a systemd-networkd-wait-online.service which has [Install] WantedBy=network-online.target ... I wonder if this should go into wicked.service as this one is what the upstream service does. This is an autogenerated message for OBS integration: This bug (883565) was mentioned in https://build.opensuse.org/request/show/238832 Factory / wicked This is an autogenerated message for OBS integration: This bug (883565) was mentioned in https://build.opensuse.org/request/show/238861 Factory / wicked (In reply to comment #21) > (In reply to comment #17) > > (In reply to comment #8) > > > (In reply to comment #7) > > > > > > This requires that in wicked.service there is a line > > > > > > Wants=network-online.target > > > > > > as otherwise we get into trouble with OS 13.2 and SLES12 > > > > > > Marius? > > > > Yes. Fix is in git master and will be submitted today. > > [https://github.com/openSUSE/wicked/commit/e6d1f904ba] > > Folks, that is totally wrong, please do not do it. network-online.target should > be pulled in by *consumers* of network, not by *providers*. What you must do, > is to order wicked.service before network-online.target, but this is automatic > with commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 from upstream. So we have to remove this again. :-( Done in https://github.com/openSUSE/wicked/pull/291 wicked.service contains a Before=...network-online.target... Andrey, could you review if it is OK so far now again? https://github.com/openSUSE/wicked/blob/6a5cdf4889b29b3a1c498bf8804d6f7fd1ef6508/etc/systemd/wicked.service.in I'd like to suggest to use WantedBy=network-online.target Andrey, do we need a WantedBy instead?: [Install] WantedBy=network-online.target (In reply to comment #26) > I'd like to suggest to use WantedBy=network-online.target I'll add it when it is correct to add it. IMHO it is correct:
systemd/Upstream> grep network-online.target -rs units/
units/systemd-networkd-wait-online.service.in:Before=network.target network- online.target
units/systemd-networkd-wait-online.service.in:WantedBy=network-online.target
From manual page man:systemd.unit(5)
WantedBy=, RequiredBy=
This option may be used more than once, or a space-separated list of unit names may be given. A symbolic link is created in the .wants/ or .requires/ directory of each of the listed units when this unit is installed by systemctl enable. This has the effect that a dependency of type Wants= or Requires= is added from the listed unit to the current unit. The primary result is that the current unit will be started when the listed unit is started. See the description of Wants= and Requires= in the [Unit] section for details.
WantedBy=foo.service in a service bar.service is mostly equivalent to Alias=foo.service.wants/bar.service in the same file. In case of template units, systemctl enable must be called with an instance name, and this instance will be added to the .wants/ or .requires/ list of the listed unit. E.g. WantedBy=getty.target in a service getty@.service will result in systemctl enable getty@tty2.service creating a getty.target.wants/getty@tty2.service link to getty@.service.
(In reply to comment #22) > systemd-networkd-wait-online.service which has > [Install] > WantedBy=network-online.target > > ... I wonder if this should go into wicked.service as this one is what the > upstream service does. But this is backward! WantedBy != Wants, it is exactly reverse dependency. (In reply to comment #27) > Andrey, > > do we need a WantedBy instead?: > > [Install] > WantedBy=network-online.target Well ... upstream has two different units - service that actively configures networking and service that passively checks (or waits) for network to be configured. Service, that waits for network, is not started automatically to avoid delays on startup for everyone. That's the reason of WantedBy - to pull in service to wait for network if someone needs to wait. As I understand, there is no such difference in wicked. wicked.service is synchronous, right? When wicked.service is has finished startup, networking is configured. So it is enough to have it ordered before network-online.target as it already is. So no, I think it is not required. (In reply to comment #30) ... yes and no ... as the future of wicked will become asynchronous (AFAIK) and therefore I'd like to see this WantedBy=network-online.target to avoid trouble in future ;) Don't know how dail-in on demand will fit into this scheme but I guess if the device is up the network is somehow online. (In reply to comment #31) > (In reply to comment #30) > > ... yes and no ... as the future of wicked will become asynchronous Then it should provide second service wicked-online or similar. > (AFAIK) and > therefore I'd like to see this WantedBy=network-online.target to avoid trouble > in future ;) > How can WantedBy on *wicked* service help in this case?!? Then you will need WantedBy on service that will wait until network is online (wicked-online.service or whatever). Having WantedBy on a service that just triggers network configuration without waiting for it is pointless. That is exactly the problem we try to fix here, in this bug report. > Don't know how dail-in on demand will fit into this scheme but I guess if the > device is up the network is somehow online. This works only because network-online.target will be wanted by service_load_sysv_path() for the LSB services. Without LSB scripts any other unit depending on network-online.target would also require a Wants=network-online.target (In reply to comment #33) > This works only because network-online.target will be wanted by > service_load_sysv_path() for the LSB services. Without LSB scripts any other > unit depending on network-online.target would also require a > Wants=network-online.target I lost you here. network-online.target is by design intended to be *EXPLICITLY* pulled in by units that want to be delayed on startup until network is configured. For those cases when units are autogenerated by systemd (like network mounts) dependency is added automatically. For units that you ship manually you need to add this manually to unit file if unit really needs it. So what exactly are you arguing here? Sorry, I do not get it. If there is a unit written by a user/customer which has network-online.target in its After= line and no LSB script is enabled then this unit may fail as it could be up before network (In reply to comment #35) > If there is a unit written by a user/customer which has network-online.target > in its After= line and no LSB script is enabled then this unit may fail as it > could be up before network I'm afraid we are going in circles. As my last comment: - short version - This is a bug in respective unit and user/customer needs to fix it. - long version - network-online.target should be explicitly pulled in by units that need it. Quoting http://www.freedesktop.org/software/systemd/man/systemd.special.html: network-online.target Units that strictly require a configured network connection should pull in network-online.target (via a Wants= type dependency) and order themselves after it. And some more detailed explanation in http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ This is an autogenerated message for OBS integration: This bug (883565) was mentioned in https://build.opensuse.org/request/show/239589 Factory / wicked (In reply to comment #36) I'm aware ... nevertheless in my experience documentation will not be read but a bug created causing work load :( openSUSE-RU-2016:0320-1: An update that has 146 recommended fixes can now be installed. Category: recommended (moderate) Bug References: 737690,742774,750845,818044,838475,841544,849870,852015,852021,852232,853293,854884,856389,856392,856858,857204,858864,859072,859365,860574,860937,861316,861489,863217,864745,864904,865834,866732,866933,867128,867663,867664,867840,868019,868230,868439,868931,869142,869603,872929,873432,873444,874665,875502,876587,876694,877021,877674,878525,880438,880732,881125,881559,881942,882393,882714,883565,884271,884403,885232,885288,886211,886599,886852,888178,888215,888612,889297,889357,890977,892096,892162,892300,893797,895087,896664,897799,897801,897803,898233,898240,898432,900558,901481,902240,902901,903009,903963,904214,904517,904828,905550,906709,906900,907318,907393,908476,909358,910643,911347,912030,912334,913517,916420,918118,919095,920195,921831,921898,921920,926169,927250,927457,928265,931388,932284,933365,933512,933521,933533,934077,934901,937512,937900,938908,939571,940264,941576,944132,944799,945282,947212,948458,948555,948705,949574,949683,949739,950510,951265,951663,953241,954336,954781,955635,961576 CVE References: Sources used: openSUSE 13.1 (src): systemd-210-40.1, systemd-mini-210-40.1 This issue should be fixed by the latest update release of systemd. Feel free to reopen it if you don't think it's the case. Closing. |