Bug 846361

Summary: pppd does not reconnect (option persist)
Product: [openSUSE] openSUSE 13.1 Reporter: Ruediger Meier <sweet_f_a>
Component: NetworkAssignee: Marius Tomaschewski <mt>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: max, mt, noblehng
Version: Final   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on:    
Bug Blocks: 869831    
Attachments: logs plus network extra traces
ifup: do not redirect to systemd for ppp
ifup: do not redirect to systemd for ppp

Description Ruediger Meier 2013-10-17 09:49:28 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0

Hi,
after upgrading to 13.1 pppd does not reconnected after lcp-echo-failure happened:

2013-10-16T19:31:27.924893+02:00 tenfore pppd[5405]: No response to 2 echo-requests
2013-10-16T19:31:28.121107+02:00 tenfore pppd[5405]: Serial link appears to be disconnected.
2013-10-16T19:31:28.121745+02:00 tenfore pppd[5405]: Connect time 8532.1 minutes.
2013-10-16T19:31:28.144737+02:00 tenfore pppd[5405]: Sent 576240611 bytes, received 4264384411 bytes.
2013-10-16T19:31:28.894514+02:00 tenfore pppd[5405]: Script /etc/ppp/ip-down finished (pid 5193), status = 0x0
2013-10-16T19:31:32.227775+02:00 tenfore pppd[5405]: Connection terminated.
2013-10-16T19:31:32.245382+02:00 tenfore pppd[5405]: Sent PADT
2013-10-16T19:31:32.257248+02:00 tenfore pppd[5405]: Modem hangup
2013-10-16T19:31:32.266251+02:00 tenfore systemd[1]: Stopping ifup managed network interface dsl0...

... then it did not reconnect. My setup was working since openSUSE 11.4. On my former installations it did not sent PADT.  

maxfail 0
lcp-echo-interval 60
lcp-echo-failure 2
persist

I don't know how to reproduce. It works when I plug off/on any cables. I guess I need to trick "No response to 2 echo-requests" somehow else.

Could be that these recently added patches have something todo with this problem?:

Mon Sep  2 14:41:21 UTC 2013 - max@suse.com
- Add ppp-terminate-correct.patch, fix the pppd's hang after exit
  (bnc#693469).
-------------------------------------------------------------------
Wed Jul 24 12:54:01 UTC 2013 - max@suse.com
- ppp-send-padt.patch: Send PADT at the end of a PPPoE session.




Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Comment 1 Reinhard Max 2013-11-05 15:24:44 UTC
I've tried pppd with the two mentioned patches applied on a DSL line and could neither reproduce this by sending SIGHUP nor by unplugging the LAN cable to the DSL modem.

Please tell me more about the scenario in which this happens and if it happens every time or only sometimes. Also, does pppd exit after the disconnect or does just stop trying to reconnect?
Comment 2 Ruediger Meier 2013-11-05 22:35:54 UTC
Finally I could reproduce it, when plugging off the "DSL cable" of my modem the first time after the network was started by systemd (e.g. right after reboot).


Nov 02 15:10:43 tenfore systemd[1]: Starting ifup managed network interface dsl0...
[...]
<plug off dsl cable>
Nov 05 20:42:17 tenfore pppd[5387]: No response to 2 echo-requests
Nov 05 20:42:17 tenfore pppd[5387]: Serial link appears to be disconnected.
Nov 05 20:42:17 tenfore pppd[5387]: Connect time 4651.6 minutes.
Nov 05 20:42:17 tenfore pppd[5387]: Sent 200807548 bytes, received 3015279651 bytes.
Nov 05 20:42:17 tenfore pppd[5387]: Script /etc/ppp/ip-down finished (pid 12457), status = 0x0
Nov 05 20:42:21 tenfore pppd[5387]: Connection terminated.
Nov 05 20:42:21 tenfore systemd[1]: Stopping ifup managed network interface dsl0...
Nov 05 20:42:21 tenfore pppd[5387]: Sent PADT
Nov 05 20:42:21 tenfore pppd[5387]: Modem hangup
Nov 05 20:42:21 tenfore ifdown[12554]: dsl0      no such interface
Nov 05 20:42:21 tenfore ifdown[12635]: dsl0      no such interface
Nov 05 20:42:21 tenfore pppd[5387]: Terminating on signal 15
Nov 05 20:42:21 tenfore pppd[5387]: Exit.


When I now reconnect manually by typing
  cinternet -i dsl0 -A
then pppd survives the next cable cut.

After another rcnetwork restart (i.e. after seeing again a log line "Starting ifup managed network interface dsl0...") I can reproduce it again.

What does this "systemd managed" interface mean? How can I avoid that systemd does not "manage" my network devices?
What about these 2 ifdown lines from my log? Are they called by systemd?
Is pppd SIGTERM'ed by these ifdown calls?


BTW this is my /etc/sysconfig/network/ifcfg-dsl0
BOOTPROTO='none'
DEVICE='eth0'
MODEM_IP='10.0.0.138'
NAME='DSL Connection'
PPPD_OPTIONS='dump maxfail 0 lcp-echo-interval 20 lcp-echo-failure 2 persist'
PPPMODE='pppoe'
PROVIDER='provider0'
STARTMODE='auto'
UDI=''
USERCONTROL='yes'
VPIVCI=''
Comment 3 Reinhard Max 2013-12-06 12:55:38 UTC
Could you please try it with the ppp package from 12.3, which does not contain the two patches you mention above? It should install on 13.1 without dependency issues.

BTW, have you updated from RC1 to final in the meantime, and if so, does it still happen?
Comment 4 Ruediger Meier 2013-12-18 09:21:45 UTC
Yes it's still broken on fully updated 13.1 even without these patches.

The question is who sends SIGTERM to pppd? I've tracked that down. It's killed by by the command "ifdown ds0". Who called this? udev, systemd, smpppd?
Comment 5 Ruediger Meier 2014-01-15 13:19:45 UTC
This bug also happens on a fresh installed (non-upgraded) openSUSE 13.1 plus official updates only.
Comment 6 Reinhard Max 2014-01-15 16:05:26 UTC
(In reply to comment #4)

> The question is who sends SIGTERM to pppd? I've tracked that down. It's killed
> by by the command "ifdown ds0". Who called this? udev, systemd, smpppd?

Marius, can you shed some more light on this?
Comment 7 Ruediger Meier 2014-01-15 21:18:57 UTC
(In reply to comment #6)
> (In reply to comment #4)
> 
> > The question is who sends SIGTERM to pppd? I've tracked that down. It's killed
> > by by the command "ifdown ds0". Who called this? udev, systemd, smpppd?
> 
> Marius, can you shed some more light on this?

Ok, I increased systemd debug output. Looks like it is the damn systemd who calls "ifdown dsl0":

2014-01-15T22:14:10.766283+01:00 tenfore pppd[29591]: Hangup (SIGHUP)
2014-01-15T22:14:10.766754+01:00 tenfore pppd[29591]: Connect time 0.7 minutes.
2014-01-15T22:14:10.767066+01:00 tenfore pppd[29591]: Sent 31140 bytes, received 50526 bytes.
2014-01-15T22:14:10.768260+01:00 tenfore pppd[29591]: Script /etc/ppp/ip-down started (pid 30376)
2014-01-15T22:14:10.768839+01:00 tenfore pppd[29591]: sent [LCP TermReq id=0x2 "User request"]
2014-01-15T22:14:10.776684+01:00 tenfore pppd[29591]: rcvd [LCP TermAck id=0x2]
2014-01-15T22:14:10.777363+01:00 tenfore pppd[29591]: Connection terminated.
2014-01-15T22:14:10.777853+01:00 tenfore systemd[1]: sys-subsystem-net-devices-dsl0.device changed plugged -> dead
2014-01-15T22:14:10.778689+01:00 tenfore systemd[1]: Trying to enqueue job network@dsl0.service/stop/replace
2014-01-15T22:14:10.779185+01:00 tenfore systemd[1]: Installed new job network@dsl0.service/stop as 3713
2014-01-15T22:14:10.779632+01:00 tenfore systemd[1]: Enqueued job network@dsl0.service/stop as 3713
2014-01-15T22:14:10.780087+01:00 tenfore systemd[1]: sys-devices-virtual-net-dsl0.device changed plugged -> dead
2014-01-15T22:14:10.780538+01:00 tenfore systemd[1]: Collecting sys-devices-virtual-net-dsl0.device
2014-01-15T22:14:10.781109+01:00 tenfore systemd[1]: Stopping ifup managed network interface dsl0...
2014-01-15T22:14:10.781565+01:00 tenfore systemd[1]: About to execute: /sbin/ifdown dsl0
2014-01-15T22:14:10.782245+01:00 tenfore systemd[1]: Forked /sbin/ifdown as 30383
2014-01-15T22:14:10.782602+01:00 tenfore systemd[1]: network@dsl0.service changed running -> stop
2014-01-15T22:14:10.782975+01:00 tenfore systemd[1]: Set up jobs progress timerfd.
2014-01-15T22:14:10.783352+01:00 tenfore systemd[1]: Set up idle_pipe watch.
2014-01-15T22:14:10.784460+01:00 tenfore systemd[30383]: Executing: /sbin/ifdown dsl0
2014-01-15T22:14:10.796217+01:00 tenfore pppd[29591]: Send PPPOE Discovery V1T1 PADT session 0xc96 length 8
2014-01-15T22:14:10.796816+01:00 tenfore pppd[29591]:  dst 0:30:88:1:66:53  src 0:19:d1:99:f3:a4
2014-01-15T22:14:10.797272+01:00 tenfore pppd[29591]:  [host-uniq  97 73 00 00]
2014-01-15T22:14:10.797715+01:00 tenfore pppd[29591]: Sent PADT
2014-01-15T22:14:10.801804+01:00 tenfore systemd[1]: Accepted connection on private bus.
2014-01-15T22:14:10.802764+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/network_2eservice
2014-01-15T22:14:10.803443+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
2014-01-15T22:14:10.848339+01:00 tenfore systemd[1]: Accepted connection on private bus.
2014-01-15T22:14:10.849013+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/network_2eservice
2014-01-15T22:14:10.849800+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
2014-01-15T22:14:10.865070+01:00 tenfore systemd[1]: Accepted connection on private bus.
2014-01-15T22:14:10.866529+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/network_2eservice
2014-01-15T22:14:10.867285+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
2014-01-15T22:14:10.880654+01:00 tenfore ifdown[30383]: dsl0      no such interface
2014-01-15T22:14:10.882327+01:00 tenfore ifdown[30383]:     dsl0      no such interface
2014-01-15T22:14:10.909510+01:00 tenfore systemd[1]: Accepted connection on private bus.
2014-01-15T22:14:10.911164+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/network_2eservice
2014-01-15T22:14:10.912018+01:00 tenfore systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
2014-01-15T22:14:10.955248+01:00 tenfore SuSEfirewall2: Setting up rules from /etc/sysconfig/SuSEfirewall2 ...
2014-01-15T22:14:10.958998+01:00 tenfore pppd[29591]: Terminating on signal 15


How can I stop systemd doing anything with my network devices?
Comment 8 Marius Tomaschewski 2014-01-16 12:35:38 UTC
Make sure you have the last systemd related fix for bnc#853908 installed.

When not the above, check bnc#853845 (try out if "rpm -e avahi-autoipd"
helps).

When the problem still exists, enable ifup tracing setting DEBUG=EXTRA
in /etc/sysconfig/network/config and reboot.

In grub enable also systemd debugging, e.g. via systemd.log_level=debug
kernel option.

Then reproduce the problem and disable DEBUG=EXTRA as it may produce
a lot of data and fill the /dev fs.

The DEBUG=EXTRA writes the traces to /dev/.sysconfig/network/, please
attach a tar.bz2 of all files you find there to this bug, e.g.:
  tar cvjpf /tmp/debug-extra-traces.tar.bz2 /dev/.sysconfig/network/
Comment 9 Marius Tomaschewski 2014-01-16 12:37:52 UTC
And the systemd debug logs too please. Review the files, not that
there is some (ISP) password it them or similar -- XXX them out.
Comment 10 Ruediger Meier 2014-01-16 16:59:43 UTC
Created attachment 574675 [details]
logs plus network extra traces

It still does not work. I've attached the requested logs. messages contain two markers "root: MARK"
Comment 11 Noble Huang 2014-01-23 09:43:59 UTC
Revert this patch prevent systemd to manage the virtual network interface dsl0.
https://github.com/openSUSE/sysconfig/commit/2466b14065729f2c8a4c13aae97f9f4ac77ccca5

When remote host disconnect the pppoe connection, pppd will call /etc/ppp/ip-down to bring down the link. If the persist option of pppd is using, pppd will try to reconnect and set up the link again.

When dsl0 is managed by systemd, after /etc/ppp/ip-down execute, systemd think the device is dead and call /etc/ifdown to bring down it, which will remove dsl0 and terminate pppd. Thus cause reconnect abort.

Maybe one should filter out some interfaces based on $INTERFACETYPE in that patch.
Comment 12 Ruediger Meier 2014-01-28 00:16:21 UTC
(In reply to comment #11)
> Revert this patch prevent systemd to manage the virtual network interface dsl0.
> https://github.com/openSUSE/sysconfig/commit/2466b14065729f2c8a4c13aae97f9f4ac77ccca5

I can confirm this. Reverting 2466b140 (and followups) fixes the problem.
 
> When dsl0 is managed by systemd, after /etc/ppp/ip-down execute, systemd think
> the device is dead and call /etc/ifdown to bring down it, which will remove
> dsl0 and terminate pppd. Thus cause reconnect abort.
>
> Maybe one should filter out some interfaces based on $INTERFACETYPE in that
> patch.

Actually I don't see why it redirects to systemd at all. Choosing "traditional ifup setup" should give us _traditional_ setup without stupid managed network interface services whatever.
After "network start" nothing else should should be done automatically.
Comment 13 Marius Tomaschewski 2014-02-26 17:14:22 UTC
Created attachment 580165 [details]
ifup: do not redirect to systemd for ppp
Comment 14 Marius Tomaschewski 2014-02-26 17:42:16 UTC
Created attachment 580172 [details]
ifup: do not redirect to systemd for ppp

Does it work for you with this package?:

http://download.opensuse.org/repositories/home:/mtomaschewski:/branches:/openSUSE:/13.1:/Update/standard/
Comment 15 Marius Tomaschewski 2014-02-26 17:43:57 UTC
Still builds...  sysconfig > 0.81.5-21.1 will be visible soon.
Comment 16 Ruediger Meier 2014-03-08 13:24:42 UTC
(In reply to comment #14)
> Created an attachment (id=580172) [details]
> ifup: do not redirect to systemd for ppp

Thanks, this does work now.
Comment 17 Bernhard Wiedemann 2014-03-10 09:00:12 UTC
This is an autogenerated message for OBS integration:
This bug (846361) was mentioned in
https://build.opensuse.org/request/show/225402 13.1 / sysconfig
Comment 18 Ruediger Meier 2014-03-10 10:27:48 UTC
One thing to add.
I see that now pppd even survives "rcnetwork" stop. Don't know if this is how it should be.
Comment 19 Marius Tomaschewski 2014-03-10 12:50:05 UTC
(In reply to comment #18)
> One thing to add.
> I see that now pppd even survives "rcnetwork" stop. Don't know if this is how
> it should be.

rcnetwork calls "ifdown xxx" -- this usually stops the network@xxx service;
in ppp case it is forwarded to "/usr/sbin/smpppd-ifcfg --down" now instead.
Everything else is in the hands of the smpppd.
Comment 20 Swamp Workflow Management 2014-03-17 11:04:22 UTC
openSUSE-RU-2014:0384-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 846361
CVE References: 
Sources used:
openSUSE 13.1 (src):    sysconfig-0.81.5-22.1
Comment 21 Bernhard Wiedemann 2014-03-29 12:00:15 UTC
This is an autogenerated message for OBS integration:
This bug (846361) was mentioned in
https://build.opensuse.org/request/show/228168 13.1 / sysconfig
Comment 22 Swamp Workflow Management 2014-04-07 07:04:37 UTC
openSUSE-RU-2014:0485-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 846361,860644,866870,868323,869761
CVE References: 
Sources used:
openSUSE 13.1 (src):    sysconfig-0.81.5-26.1