Bug 849717

Summary: r8169 sometimes crashes on pm-suspend
Product: [openSUSE] openSUSE 13.1 Reporter: Ulf Lange <mopp>
Component: Network Assignee: Kristyna Streitova <kstreitova>
Status: RESOLVED WONTFIX QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: one, pgajdos
Version: RC 1   
Target Milestone: ---   
Hardware: x86-64   
OS: SUSE Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: messages

Description Ulf Lange 2013-11-09 09:37:55 UTC
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0

I use pm-suspend to do a suspend to RAM.
After a couple of suspends the r8169 module crashes after resume. The module is still loaded and the network interface is up, but when I try to ping a host from the local console I get the following messages.
Unfortunately the server is not able to connect to any network services anymore.
[239674.914204] r8169 0000:02:00.0 enp2s0: rtl_counters_cond == 1 (loop: 1000, delay: 10).
[239674.924854] r8169 0000:02:00.0 enp2s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
[239674.929128] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239674.930142] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239674.931192] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239674.932147] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239674.933105] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239675.042216] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[239675.043332] r8169 0000:02:00.0 enp2s0: rtl_phyar_cond == 1 (loop: 20, delay: 25).
[...]
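
The messages above look like the driver's register-poll timeouts (a condition still unmet after the stated number of loops). A minimal sketch for counting them in kernel log text, assuming the log is piped in on stdin; the function name is only illustrative:

```shell
# Count r8169 poll-timeout lines such as
#   "rtl_phyar_cond == 1 (loop: 20, delay: 25)."
# in kernel log text read from stdin.
count_r8169_timeouts() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'r8169 .*rtl_[a-z_]*_cond == 1 (loop:'
}
```

Possible usage: dmesg | count_r8169_timeouts. A non-zero count after a resume would suggest the NIC is in the state described here.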


Reproducible: Sometimes

Steps to Reproduce:
1. pm-suspend
2. try to access a network service


Expected Results:  
A working suspend to RAM
Comment 1 Michele Cherici 2013-11-23 10:19:58 UTC
I had this problem in the past, now I use the following workaround:
- create a new file with the line SUSPEND_MODULES="r8169"
- copy the file to /etc/pm/config.d/
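
The two steps above can be sketched as a small shell function. The file name r8169-suspend and the PM_CONFIG_DIR override are assumptions made for illustration; the comment does not name the file, and writing to /etc/pm/config.d/ needs root:

```shell
# Write a pm-utils config snippet that lists a module in SUSPEND_MODULES.
# pm-utils unloads the listed modules before suspend and reloads them
# on resume.
# Assumptions: the file name "r8169-suspend" is arbitrary, and
# PM_CONFIG_DIR is a hypothetical override for trying this out without root.
write_suspend_modules() {
    config_dir="${PM_CONFIG_DIR:-/etc/pm/config.d}"
    mkdir -p "$config_dir" || return 1
    printf 'SUSPEND_MODULES="%s"\n' "$1" > "$config_dir/r8169-suspend"
}
```

Possible usage as root: write_suspend_modules r8169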
Comment 2 Vojtech Dziewiecki 2013-11-25 11:43:22 UTC
Ulf could you please try this fix Michele mentioned and report if it works? If it does, I will add it as a maintenance update for everybody.
Comment 3 Ulf Lange 2013-11-27 06:33:48 UTC
It seems to fix the rtl_phyar_cond == 1 bug; at least it hasn't occurred since then.
Last evening I suspended the system ("Tue Nov 26 22:21:56 CET 2013: performing suspend") and it woke up again ("Tue Nov 26 23:02:51 CET 2013: Awake."). I have no idea what triggered the wake-up, but anyway, here is the next problem.

The system didn't have any network device up (except the lo device).

# rcnetwork start
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: enp2s0: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether bc:5f:f4:7b:da:d9 brd ff:ff:ff:ff:ff:ff

# systemctl status network
network.service - LSB: Configure network interfaces and set up routing
   Loaded: loaded (/usr/lib/systemd/system/network.service; enabled)
   Active: inactive (dead) since Wed 2013-11-27 07:11:57 CET; 5min ago
  Process: 18305 ExecStop=/etc/init.d/network stop (code=exited, status=0/SUCCESS)
  Process: 17722 ExecStart=/etc/init.d/network start (code=exited, status=0/SUCCESS)
 Main PID: 23918 (code=exited, status=1/FAILURE)
   CGroup: /system.slice/network.service

/var/log/messages:
2013-11-27T07:11:50.198116+01:00 testpc rcnetwork[17707]: redirecting to "systemctl  start network.service"
2013-11-27T07:11:50.199709+01:00 testpc systemd[1]: Starting LSB: Configure network interfaces and set up routing...
2013-11-27T07:11:50.298040+01:00 testpc network[17722]: Setting up network interfaces:
2013-11-27T07:11:50.394961+01:00 testpc network[17722]: lo
2013-11-27T07:11:50.395942+01:00 testpc ifup[17976]:     lo
2013-11-27T07:11:50.429276+01:00 testpc ifup[17976]:     lo
2013-11-27T07:11:50.430717+01:00 testpc ifup[17976]: IP address: 127.0.0.1/8
2013-11-27T07:11:50.430866+01:00 testpc network[17722]: lo        IP address: 127.0.0.1/8
2013-11-27T07:11:50.431939+01:00 testpc ifup[17976]:
2013-11-27T07:11:50.507764+01:00 testpc systemd[1]: Starting ifup managed network interface enp2s0...
2013-11-27T07:11:50.547503+01:00 testpc ifup[18121]: enp2s0    device: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
2013-11-27T07:11:50.548390+01:00 testpc ifup[18121]:     enp2s0    device: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
2013-11-27T07:11:50.710211+01:00 testpc kernel: [140063.626687] r8169 0000:02:00.0 enp2s0: link down
2013-11-27T07:11:50.710223+01:00 testpc kernel: [140063.626726] r8169 0000:02:00.0 enp2s0: link down
2013-11-27T07:11:50.710224+01:00 testpc kernel: [140063.626738] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready
2013-11-27T07:11:50.712926+01:00 testpc avahi-daemon[763]: Joining mDNS multicast group on interface enp2s0.IPv4 with address 192.168.178.2.
2013-11-27T07:11:50.713193+01:00 testpc avahi-daemon[763]: New relevant interface enp2s0.IPv4 for mDNS.
2013-11-27T07:11:50.713368+01:00 testpc avahi-daemon[763]: Registering new address record for 192.168.178.2 on enp2s0.IPv4.
2013-11-27T07:11:50.753134+01:00 testpc avahi-autoipd(enp2s0)[18239]: Found user 'avahi-autoipd' (UID 484) and group 'avahi-autoipd' (GID 483).
2013-11-27T07:11:50.753384+01:00 testpc avahi-autoipd(enp2s0)[18239]: Successfully called chroot().
2013-11-27T07:11:50.753540+01:00 testpc avahi-autoipd(enp2s0)[18239]: Successfully dropped root privileges.
2013-11-27T07:11:50.753691+01:00 testpc avahi-autoipd(enp2s0)[18239]: Starting with address 169.254.12.94
2013-11-27T07:11:50.753861+01:00 testpc avahi-autoipd(enp2s0)[18239]: Routable address already assigned, sleeping.
2013-11-27T07:11:52.998053+01:00 testpc kernel: [140065.913571] r8169 0000:02:00.0 enp2s0: link up
2013-11-27T07:11:52.998066+01:00 testpc kernel: [140065.913578] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
2013-11-27T07:11:54.374119+01:00 testpc avahi-daemon[763]: Joining mDNS multicast group on interface enp2s0.IPv6 with address fe80::be5f:f4ff:fe7b:dad9.
2013-11-27T07:11:54.374391+01:00 testpc avahi-daemon[763]: New relevant interface enp2s0.IPv6 for mDNS.
2013-11-27T07:11:54.374544+01:00 testpc avahi-daemon[763]: Registering new address record for fe80::be5f:f4ff:fe7b:dad9 on enp2s0.*.
2013-11-27T07:11:55.814012+01:00 testpc systemd[1]: Started ifup managed network interface enp2s0.
2013-11-27T07:11:55.828485+01:00 testpc network[17722]: ..done..doneSetting up service network  .  .  .  .  .  .  .  .  .  .  .  .  ...done
2013-11-27T07:11:55.991178+01:00 testpc network[18305]: Shutting down network interfaces:
2013-11-27T07:11:56.021220+01:00 testpc systemd[1]: Stopping ifup managed network interface enp2s0...
2013-11-27T07:11:56.059873+01:00 testpc ifdown[18589]: enp2s0    device: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
2013-11-27T07:11:56.061076+01:00 testpc ifdown[18589]:     enp2s0    device: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
2013-11-27T07:11:56.069739+01:00 testpc avahi-autoipd(enp2s0)[18239]: Got SIGTERM, quitting.
2013-11-27T07:11:56.575399+01:00 testpc avahi-daemon[763]: Interface enp2s0.IPv6 no longer relevant for mDNS.
2013-11-27T07:11:56.575702+01:00 testpc avahi-daemon[763]: Leaving mDNS multicast group on interface enp2s0.IPv6 with address fe80::be5f:f4ff:fe7b:dad9.
2013-11-27T07:11:56.575860+01:00 testpc avahi-daemon[763]: Interface enp2s0.IPv4 no longer relevant for mDNS.
2013-11-27T07:11:56.576021+01:00 testpc avahi-daemon[763]: Leaving mDNS multicast group on interface enp2s0.IPv4 with address 192.168.178.2.
2013-11-27T07:11:56.576170+01:00 testpc avahi-daemon[763]: Withdrawing address record for fe80::be5f:f4ff:fe7b:dad9 on enp2s0.
2013-11-27T07:11:56.576322+01:00 testpc avahi-daemon[763]: Withdrawing address record for 192.168.178.2 on enp2s0.
2013-11-27T07:11:56.598838+01:00 testpc systemd[1]: Stopped ifup managed network interface enp2s0.
2013-11-27T07:11:57.277497+01:00 testpc network[18305]: ..doneShutting down service network  .  .  .  .  .  .  .  .  .  .  .  ...done
2013-11-27T07:11:57.278458+01:00 testpc systemd[1]: Started LSB: Configure network interfaces and set up routing.


=> Still no network device up. For some reason the service stops the network card right after bringing it up.

I used the old ifconfig tool to configure the IP address.
# ifconfig enp2s0 192.168.178.2 up
After this command the network was up and running (of course without a default route).

# rcnetwork restart
=> Network card is down again!
Comment 4 Ulf Lange 2013-11-27 06:35:36 UTC
Created attachment 569258 [details]
messages
Comment 5 Vojtech Dziewiecki 2013-11-27 09:49:27 UTC
The r8169 module is the driver for your network card, isn't it? Maybe it fails to load again after suspend? Did you try modprobe r8169 and then rcnetwork restart?
Comment 6 Ulf Lange 2013-11-27 09:59:46 UTC
The module loads and is the right module for the NIC.
As you can see above, "ifconfig  enp2s0 192.168.178.2 up" works; only rcnetwork start fails.
Comment 7 Vojtech Dziewiecki 2013-11-27 11:13:17 UTC
I'm sorry I cannot think of a way to fix this and I cannot look into it any further.
I also want to drop pm-utils soon in favor of the systemd suspend capabilities.

It would be great if you could try uninstalling pm-utils and then running systemctl suspend, then report whether it works and, if not, what you are missing. Fixing that would make more sense now than fixing pm-utils.
Comment 8 Michele Cherici 2013-11-27 14:46:25 UTC
When will pm-utils be dropped? In 13.2?
Comment 9 Vojtech Dziewiecki 2013-12-02 12:32:14 UTC
I don't know yet. I think it would be best if in 13.2 pm-utils weren't used by default but could be explicitly installed if someone really wanted to use them.
Comment 10 Ulf Lange 2013-12-02 15:21:48 UTC
The initial problem with "rtl_phyar_cond" seems to be fixed with SUSPEND_MODULES="r8169". I still have some other issues; for example, sometimes the system will not go into suspend at all.
But I think this has nothing to do with the network.
So the bug can be closed.
Comment 12 Petr Gajdos 2014-09-25 13:37:35 UTC
(In reply to Ulf Lange from comment #10)
> So the bug can be closed.

Let's do so then.
Comment 13 Michele Cherici 2015-03-10 21:29:26 UTC
I've upgraded to 13.2 and now the SUSPEND_MODULES="r8169" workaround doesn't work anymore.
After a suspend it is not possible to bring the network back up. The strange thing is that unloading/reloading the r8169 kernel module doesn't help.
Comment 14 Petr Gajdos 2015-03-11 07:55:30 UTC
(In reply to Michele Cherici from comment #13)
> I've upgraded to 13.2 and now the SUSPEND_MODULES="r8169" workaround doesn't
> work anymore.
> After a suspend it is not possible to bring the network back up. The strange
> thing is that unloading/reloading the r8169 kernel module doesn't help.

As far as I know, pm-utils is no longer used by default for suspending. You need to file a new bug report against systemd, I guess.
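
For reference, the rough systemd counterpart of SUSPEND_MODULES is a hook script in /usr/lib/systemd/system-sleep/, which systemd-sleep runs with "pre" or "post" as the first argument. The sketch below is untested on 13.2, and since comment 13 reports that reloading the module did not help there, it may well not fix the problem. The hook file name and the MODPROBE override are assumptions for illustration only:

```shell
#!/bin/sh
# Hypothetical hook, e.g. saved as /usr/lib/systemd/system-sleep/r8169.sh
# systemd-sleep calls each hook with $1 = pre|post and $2 = the sleep type.
# MODPROBE is an assumed override for dry runs; by default the real
# modprobe is used.
r8169_sleep_hook() {
    modprobe_cmd="${MODPROBE:-modprobe}"
    case "$1" in
        pre)  $modprobe_cmd -r r8169 ;;   # unload the driver before sleeping
        post) $modprobe_cmd r8169 ;;      # load it again after waking up
    esac
}

# Only act when called with hook arguments, as systemd-sleep does.
if [ "$#" -ge 1 ]; then
    r8169_sleep_hook "$@"
fi
```

A dry run that only prints what would be executed: MODPROBE=echo r8169_sleep_hook pre suspend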