Bugzilla – Bug 334170
Too many Wakeups-from-idle per second after suspend2ram
Last modified: 2008-09-29 10:35:00 UTC
After resume from suspend to ram the cpu wakes up several thousand times per second according to powertop 1.8. (After a clean reboot the number of wakeups per second is about 200.) # uname -a Linux kpeter 2.6.22.9-0.4-default #1 SMP 2007/10/05 21:32:04 UTC i686 i686 i386 GNU/Linux powertop output: ================ Cn Avg residency P-states (frequencies) C0 (cpu running) (15,6%) 2,00 Ghz 21,7% C1 0,0ms ( 0,0%) 1,67 Ghz 0,0% C2 0,4ms (84,4%) 1333 Mhz 11,7% 1000 Mhz 66,6% Wakeups-from-idle per second : 2274,7 interval: 10,0s no ACPI power usage estimate available Top causes for wakeups: 83,0% (2346,9) <interrupt> : extra timer interrupt 5,4% (151,5) <interrupt> : HDA Intel, i915@pci:0000:00:02.0 3,5% (100,1) <interrupt> : uhci_hcd:usb1, ehci_hcd:usb5 1,9% ( 53,5) amarokapp : schedule_timeout (process_timeout) 1,3% ( 36,2) Xorg : do_setitimer (it_real_fn) 0,8% ( 22,7) compiz : schedule_timeout (process_timeout) 0,4% ( 12,5) sh : queue_delayed_work_on (delayed_work_timer_fn 0,4% ( 11,9) <kernel module> : usb_hcd_poll_rh_status (rh_timer_func) 0,4% ( 10,0) <interrupt> : uhci_hcd:usb3, tifm_7xx1, yenta 0,3% ( 9,7) <interrupt> : iwl3945 0,2% ( 6,5) postmaster : schedule_timeout (process_timeout) 0,2% ( 6,4) <interrupt> : acpi 0,2% ( 6,2) kicker : futex_wait (hrtimer_wakeup) 0,2% ( 5,1) thunderbird-bin : futex_wait (hrtimer_wakeup) 0,2% ( 4,3) kicker : schedule_timeout (process_timeout) 0,1% ( 4,0) <interrupt> : PS/2 keyboard/mouse/touchpad 0,1% ( 4,0) soffice.bin : schedule_timeout (process_timeout) 0,1% ( 3,6) oracle : schedule_timeout (process_timeout) 0,1% ( 3,0) <interrupt> : libata 0,1% ( 2,5) <interrupt> : ahci 0,1% ( 2,4) emerald : schedule_timeout (process_timeout) 0,1% ( 2,3) konsole : schedule_timeout (process_timeout) 0,1% ( 1,9) top : schedule_timeout (process_timeout) 0,1% ( 1,7) Xorg : schedule_timeout (process_timeout) 0,1% ( 1,5) s2ram : queue_delayed_work_on (delayed_work_timer_fn 0,1% ( 1,5) <kernel core> : queue_delayed_work_on (delayed_work_timer_fn 0,0% ( 1,4) firefox-bin : futex_wait (hrtimer_wakeup) top output: =========== top - 09:18:22 up 48 min, 4 users, load average: 0.34, 0.52, 0.78 Tasks: 180 total, 1 running, 179 sleeping, 0 stopped, 0 zombie Cpu(s): 0.3%us, 0.6%sy, 0.0%ni, 98.8%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Mem: 2068028k total, 1283460k used, 784568k free, 31800k buffers Swap: 2104472k total, 0k used, 2104472k free, 766108k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2815 root 15 0 426m 79m 5724 S 2 3.9 2:15.79 Xorg 7729 kerekfyp 18 0 180m 82m 20m S 0 4.1 0:57.18 firefox-bin 4889 kerekfyp 15 0 35696 21m 12m S 8 1.0 0:41.63 compiz 4939 kerekfyp 15 0 124m 40m 24m S 0 2.0 0:39.90 amarokapp 6473 kerekfyp 15 0 2328 1044 772 R 4 0.1 0:34.37 top 4900 kerekfyp 15 0 104m 19m 14m S 4 1.0 0:25.24 kicker 6403 kerekfyp 20 0 168m 46m 20m S 0 2.3 0:17.49 thunderbird-bin 4894 kerekfyp 15 0 17884 9700 7080 S 0 0.5 0:05.48 emerald 6472 kerekfyp 15 0 32208 16m 13m S 2 0.8 0:04.94 konsole 5036 kerekfyp 15 0 180m 39m 30m S 0 2.0 0:03.44 soffice.bin 6939 root 10 -5 0 0 0 S 0 0.0 0:03.42 events/1 4936 kerekfyp 15 0 69080 21m 17m S 0 1.1 0:02.94 kopete 3673 postgres 15 0 46920 1212 720 S 0 0.1 0:02.93 postmaster 2635 haldaemo 15 0 5624 3912 3176 S 0 0.2 0:02.77 hald 2577 messageb 15 0 2328 992 660 S 0 0.0 0:02.56 dbus-daemon 463 root 10 -5 0 0 0 S 0 0.0 0:01.88 kondemand/0 2968 root 15 0 28908 2204 1828 S 0 0.1 0:01.55 NetworkManager 4960 kerekfyp 15 0 33224 13m 10m S 0 0.7 0:01.50 knetworkmanager 7538 kerekfyp 15 0 33316 14m 11m S 0 0.7 0:01.49 konsole 4880 kerekfyp 15 0 36708 16m 12m S 0 0.8 0:01.46 kded 1 root 15 0 744 288 240 S 0 0.0 0:01.40 init 3920 root 15 0 93596 8580 5316 S 0 0.4 0:01.39 httpd2-prefork 4898 kerekfyp 15 0 33928 13m 10m S 0 0.7 0:01.38 kdesktop 4953 kerekfyp 15 0 48520 12m 9268 S 0 0.6 0:01.31 knotify 3656 oracle 15 0 659m 13m 12m S 0 0.7 0:01.14 oracle 6932 root 34 19 0 0 0 S 2 0.0 0:01.07 ksoftirqd/1 3654 oracle 15 0 661m 41m 38m S 0 2.1 0:00.97 oracle 3527 root 15 0 6024 1956 1420 S 0 0.1 0:00.94 cupsd 32 root 10 -5 0 0 0 S 0 0.0 0:00.93 kacpi_notify 2926 root 15 0 2644 1032 836 S 0 0.0 0:00.93 dhcdbd 4976 kerekfyp 15 0 29300 13m 11m S 0 0.7 0:00.88 opensuseupdater 3410 root 15 0 139m 1088 792 S 0 0.1 0:00.87 nscd 3526 root 15 0 4376 1696 1440 S 0 0.1 0:00.74 powersaved 2578 root 17 0 2124 696 448 S 2 0.0 0:00.72 syslog-ng 4885 kerekfyp 15 0 1584 368 300 S 0 0.0 0:00.68 kwrapper 4929 kerekfyp 15 0 34372 18m 13m S 0 0.9 0:00.59 oooqs 4828 kerekfyp 15 0 3544 480 300 S 0 0.0 0:00.53 gpg-agent
Péter, can you please provide the output of powertop before suspending the machine (e.g. when you only see 200 w/s). Thanks!
After reboot and before any suspend powertop prints: Cn Avg residency P-states (frequencies) C0 (cpu running) ( 7,2%) 2,00 Ghz 0,6% C1 0,0ms ( 0,0%) 1,67 Ghz 0,0% C2 1,4ms (92,8%) 1333 Mhz 0,0% 1000 Mhz 99,4% Wakeups-from-idle per second : 653,5 interval: 10,0s no ACPI power usage estimate available Top causes for wakeups: 35,1% (242,4) <interrupt> : extra timer interrupt 18,5% (127,6) <interrupt> : HDA Intel, i915@pci:0000:00:02.0 14,5% (100,0) <interrupt> : uhci_hcd:usb1, ehci_hcd:usb5 10,5% ( 72,8) amarokapp : schedule_timeout (process_timeout) 4,2% ( 29,2) Xorg : do_setitimer (it_real_fn) 3,3% ( 22,9) compiz : schedule_timeout (process_timeout) 1,8% ( 12,5) <kernel module> : queue_delayed_work_on (delayed_work_timer_f 1,7% ( 11,7) <kernel module> : usb_hcd_poll_rh_status (rh_timer_func) 1,6% ( 11,0) <interrupt> : libata 0,9% ( 6,5) postmaster : schedule_timeout (process_timeout) 0,9% ( 6,4) <interrupt> : iwl3945 0,9% ( 6,2) kicker : futex_wait (hrtimer_wakeup) 0,9% ( 6,0) <interrupt> : acpi 0,7% ( 4,8) soffice.bin : schedule_timeout (process_timeout) 0,6% ( 3,9) oracle : schedule_timeout (process_timeout) 0,4% ( 3,0) <kernel core> : queue_delayed_work_on (delayed_work_timer_f 0,4% ( 2,5) emerald : schedule_timeout (process_timeout) 0,3% ( 2,0) <kernel core> : clocksource_register (clocksource_watchdog) 0,3% ( 1,9) kicker : schedule_timeout (process_timeout) 0,3% ( 1,8) <interrupt> : ahci 0,2% ( 1,3) Xorg : schedule_timeout (process_timeout) 0,2% ( 1,1) kopete : schedule_timeout (process_timeout) 0,1% ( 1,0) kwrapper : do_nanosleep (hrtimer_wakeup) 0,1% ( 1,0) NetworkManager : tg3_open (tg3_timer) 0,1% ( 1,0) httpd2-prefork : schedule_timeout (process_timeout) 0,1% ( 1,0) ntpd : do_setitimer (it_real_fn) 0,1% ( 1,0) slmodemd : schedule_timeout (process_timeout) (Note: as you probably see, this and the previous output was taken while running a full kde desktop environment with compiz.) When no X application (except kdm) is running the number of extra timer interrupts decreases to ~30 before suspend and to ~300 after suspend.
Not sure what triggers the high amount of extra timer interrupts. Maybe you can investigate by killing processes (one by one) and also unloading modules in order to narrow down the cause.
I don't think that has to do with processes. I expect the timer set up got screwed somehow. In nohz mode the timer interrupts are switched off or lowered. For some reason the system seems to fall back to normal HZ mode an timer interrupts are fired (not sure whether we are at 250 or 1000 timers per second -> it normally doesn't matter anymore..). If you do a: watch -n1 cat /proc/interrupts you should see the timer interrupts increase regularly by 250 or 1000 per second after s2ram, but not before? Maybe setting different clocksources help? /sys/devices/system/clocksource/clocksource0/ However, this is rather complex code written by Ingo Molnar and Thomas Gleixner. IMO this bug should get verified in latest kernel and reported upstream, e.g. via http://bugzilla.kernel.org To verify whether the latest mainline kernel is also affected, you can take this one: ftp.suse.com/pub/projects/kernel/kotd/HEAD/x86_64/kernel-vanilla-2.6.23.1-20071013190126.x86_64.rpm (the timestamp might vary of course). Adding Pavel and Rafael, they may already know about this or might want to know about it.
If I do 'watch -n1 cat /proc/interrupts' I see the same number of timer interrupts fired per second as with powertop. Before suspend it is about 300, after suspend it is about 3000. Setting different clocksources makes no difference except that clocksource 'jiffies' locked up the system. # cat /sys/devices/system/clocksource/clocksource0/available_clocksource hpet acpi_pm jiffies tsc What about the executeable called "s2ram"? Please take a look at the first powertop output (which was taken after the suspend). There is a line contaning the word "s2ram". In the second powertop output (which was taken before the suspend) no such line exists.
This has nothing to do with a process (maybe triggered by one indirectly, no idea), this is about a wrongly programmed timer after s2ram. Hmm, you mean the s2ram executable is executed all the time after suspend? You can double check with ps aux |grep s2ram, this indeed sounds strange... Can you attach dmesg, pls. Best is you also try the latest kernel as said in comment #4.
Created attachment 178782 [details] dmesg including boot and s2ram
No, the s2ram executeable is not executed according to 'ps aux | grep s2ram'. I just wanted to point out this slight difference between the above two powertop outputs. dmesg attached. Anyway, I'll do a fresh install with latest vanilla kernel as soon as I can.
Maybe hpet=disable helps, AFAIK this one got some extra suspend/resume code and made some problems on suspend recently.
Fresh 10.3 GM install also has this issue with both vanilla and default kernel. hpet=disable kernel parameter does not change anything. See the attached outputs.
Created attachment 178840 [details] dmesg (2.6.23.1-20071013190126-vanilla)
Created attachment 178842 [details] powertop before s2ram (2.6.23.1-20071013190126-vanilla)
Created attachment 178843 [details] powertop after s2ram (2.6.23.1-20071013190126-vanilla) This output also contains that strange "s2ram" executeable which is in fact not being executed....
Yes, powertop is confused, and it happens in mainline, too. Eating too much power is ugly, but given newness of nohz subsystem, I guess this should better be solved on mainline.
Ok, lets keep the bug here so that people can see it was already reported...
There is another bug where this happens: bug #335086 This one is even more sever as you don't even have to s2ram for it. There processor.max_cstate=1 workarounds the problem.
It seems that 'echo 1 > /sys/module/processor/parameters/max_cstate' workarounds this problem too. It normalizes the number of wakeups per second but also makes the cpu run always in C0. Cn Avg residency P-states (frequencies) C0 (cpu running) (100,0%) 2,00 Ghz 0,0% C1 0,0ms ( 0,0%) 1,67 Ghz 0,0% C2 0,0ms ( 0,0%) 1333 Mhz 0,0% 1000 Mhz 100,0%
Disabling higher Cstates is "throw baby out with bathwater" situation. It will make your power consumption _big_. Anyway, it also hints that problem is with cpuidle framework. Adam?
We do not have cpuidle framework yet in 10.3 (We have them in -rt kernel only). My suspicion is that this gets introduced with nohz/clockevents/highres patches. Maybe it's the way the timer's are (re-)programmed. In this case maybe the way the resume functions of timer reprogramming for these patches..., just a guess. But as all this is complicated stuff... Peter: It will probably be some work, but I'd start to try a kernel shortly before nohz/clockevents/highres patches, not sure anymore, when exactly they came in, 2.6.21 should be ok. If this one works, a git bisect should bring you to the offending patch. If it's the above mentioned patches, open a bug on bugzilla.kernel.org and assign them to Ingo Molnar and/or Thomas Gleixner, those guys are to blame for these huge patches... If it's something else and it worked before, we have a good chance to fix it if it's a smaller patch.
Powertop won't be available, you probably can use: watch -n1 cat /proc/interrupts to prove whether things are working or not. Depending on the CONFIG_HZ value used (in SUSE 250 per default), the timer interrupt should increase by this value (250) per second. If you get some thousands per second, something goes wrong (you can also try this with the current system). If this also happens with older kernels, before nohz patches, we might have a general timer programming/C-state problem... zcat /proc/config.gz |less then search for CONFIG_HZ will show you the CONFIG_HZ value of your kernel (could also be 1000...).
I added a solution for my problem with too many wake-ups as Comment #7 of Bug #335086.
Ok, chance to fix this for 10.3 is less than 0; if it still happens with 11.0 / 11.1 alpha, please open a new bug.