Bug 714455

Summary: Opensuse 11.4 64b Kernel 2.6.37.6-0.7 bug : kworker thread at 90%
Product: [openSUSE] openSUSE 11.4 Reporter: Guy Zelck <guy.zelck>
Component: Kernel Assignee: Rafael Wysocki <rjw>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: aspiers, guy.zelck, jlee
Version: Final   
Target Milestone: ---   
Hardware: HP   
OS: openSUSE 11.4   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: PCI/ACPI: Report ASPM support to BIOS if not disabled from command line

Description Guy Zelck 2011-08-26 15:13:13 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20100101 Firefox/5.0

Hi all,

There is a known kernel bug that eats almost one CPU, which is pretty annoying (see https://bugzilla.kernel.org/show_bug.cgi?id=29722).
It appears soon after the system has been working normally. A workaround is to use the kernel parameter pcie_ports=compat.

BUT: this does not help when you resume your PC from suspend.

The kernel has been patched (quoted from the kernel bug site, if I remember correctly; just google it):

"A patch referencing this bug report has been merged in v2.6.38-8876-g036a982:

commit 8b8bae901ce23addbdcdb54fa1696fb2d049feb5
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date: Sat Mar 5 13:21:51 2011 +0100

PCI/ACPI: Report ASPM support to BIOS if not disabled from command line"

The bug has been known since Mar 5 2011, but openSUSE is stalling on releasing a patched kernel for 11.4.

Reproducible: Always

Steps to Reproduce:
Boot and wait 1 hour at the most. I did a fresh install on my new hardware (HP 8540w 64bit) and noticed this right away.
Actual Results:  
You lose roughly one CPU completely (>90% CPU used by kworker/0:0 constantly).


guylx:/var/log # ps -ef|egrep -i 'uid|kworker'
UID        PID  PPID  C STIME TTY          TIME CMD
root        11     2 68 15:05 ?        01:24:18 [kworker/0:1]
root     20503     2  0 16:31 ?        00:00:00 [kworker/u:1]
root     20610     2 62 16:34 ?        00:21:38 [kworker/0:2]
root     20826     2  0 16:38 ?        00:00:00 [kworker/u:2]
root     20984     2  1 16:42 ?        00:00:23 [kworker/2:0]
root     21572     2  0 16:51 ?        00:00:00 [kworker/1:1]
root     21692     2  0 16:53 ?        00:00:00 [kworker/3:1]
root     21863     2 94 16:56 ?        00:11:19 [kworker/0:0]
root     21941     2  0 16:58 ?        00:00:00 [kworker/u:0]
root     21977     2  0 16:59 ?        00:00:00 [kworker/2:1]
root     21983     2  0 16:59 ?        00:00:00 [kworker/3:0]
root     22248     2  0 17:03 ?        00:00:00 [kworker/1:0]
root     22289     2  0 17:04 ?        00:00:00 [kworker/3:2]
root     22312     2  0 17:04 ?        00:00:00 [kworker/u:3]
root     22329     2  0 17:05 ?        00:00:00 [kworker/2:2]
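
To spot the runaway threads in a listing like the one above without reading it by eye, the output can be filtered on the C column. The following is a minimal Python sketch, not part of the original report: the function name `busy_kworkers`, the threshold of 50, and the use of a subset of the rows above as sample data are all assumptions made for illustration. Note that the C column of `ps -ef` is an integer CPU percentage averaged over the lifetime of the process, so a thread that only recently started spinning may show a lower value than `top` would.

```python
# Sketch: flag kworker threads whose C (CPU utilisation) column in `ps -ef`
# output is at or above a threshold. The threshold of 50 is an arbitrary
# assumption chosen for illustration.

# A subset of the rows from the listing in this report.
SAMPLE_PS_OUTPUT = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root        11     2 68 15:05 ?        01:24:18 [kworker/0:1]
root     20503     2  0 16:31 ?        00:00:00 [kworker/u:1]
root     20610     2 62 16:34 ?        00:21:38 [kworker/0:2]
root     20984     2  1 16:42 ?        00:00:23 [kworker/2:0]
root     21863     2 94 16:56 ?        00:11:19 [kworker/0:0]
"""


def busy_kworkers(ps_output, threshold=50):
    """Return (pid, cpu, name) tuples for kworker threads with C >= threshold."""
    busy = []
    for line in ps_output.splitlines():
        # CMD is the 8th field and may itself contain spaces.
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        pid, cpu, cmd = int(fields[1]), fields[3], fields[7]
        if cmd.startswith("[kworker/") and cpu.isdigit() and int(cpu) >= threshold:
            busy.append((pid, int(cpu), cmd.strip("[]")))
    return busy


if __name__ == "__main__":
    for pid, cpu, name in busy_kworkers(SAMPLE_PS_OUTPUT):
        print(f"{name} (pid {pid}) at {cpu}% CPU")
```

On a live system the same function could be fed the text produced by running `ps -ef`, for example captured with `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`.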
Comment 1 Guy Zelck 2011-10-06 20:46:09 UTC
A sure way to reproduce: suspend to RAM and then resume. Bingo, one CPU less to use.
Comment 2 Adam Spiers 2012-01-24 12:52:05 UTC
I have a similar issue on openSUSE 12.1 - see bug 743101.
Comment 3 Rafael Wysocki 2012-03-15 22:33:44 UTC
(In reply to comment #2)
> I have a similar issue on openSUSE 12.1 - see bug 743101.

No, this is a different problem.
Comment 4 Rafael Wysocki 2012-03-15 22:49:29 UTC
Created attachment 481721 [details]
PCI/ACPI: Report ASPM support to BIOS if not disabled from command line

Backport of commit 8b8bae901ce23addbdcdb54fa1696fb2d049feb5 to the openSUSE-11.4 kernel code base.
Comment 5 Rafael Wysocki 2012-03-15 22:51:20 UTC
Comment #4 contains a backport of the mainline kernel change that reportedly fixes the problem at hand.  Can you please test if the problem really goes away with that patch applied or let me know if a test kernel is needed?
Comment 6 Guy Zelck 2012-03-17 15:30:10 UTC
I would prefer a test kernel, as patching and compiling a kernel is not routine for me. The laptop is my work machine, so I had better not mess it up too much.
I really appreciate that you took the effort to provide a fix, and I really want to test this.
Comment 7 Rafael Wysocki 2012-03-28 22:40:43 UTC
Sorry for the delay.

The kernel to test is available at:

http://beta.suse.com/private/rwysocki/testkernel/714455/

Please pick the flavor that's the most similar to the one you're currently using and let me know if it fixes the problem for you.
Comment 8 Guy Zelck 2012-03-31 11:49:22 UTC
Rafaël,

Thanks for preparing the kernel.
I just tested with kernel-desktop-2.6.37.6-0.12.x86_64.rpm and it's a success!!!

Did a suspend to RAM and it resumed correctly. The CPUs are quiet.

Config :
Linux guylx.work 2.6.37.6-0.12-desktop #1 SMP PREEMPT 2012-03-21 23:45:39 +0100 x86_64 x86_64 x86_64 GNU/Linux

Top :
top - 13:48:07 up 18 min,  3 users,  load average: 0.02, 0.12, 0.24
Tasks: 232 total,   1 running, 231 sleeping,   0 stopped,   0 zombie
Cpu(s):  6.3%us,  2.8%sy,  0.0%ni, 90.6%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8046632k total,  2059140k used,  5987492k free,   114132k buffers
Swap:  4194300k total,        0k used,  4194300k free,   850600k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                              
 1772 root      20   0  169m  78m  20m S   14  1.0   0:55.68 Xorg                                                                 
    1 root      20   0 12460  812  668 S    0  0.0   0:01.13 init                                                                 
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd                                                             
    3 root      20   0     0    0    0 S    0  0.0   0:00.05 ksoftirqd/0                                                          
    4 root      20   0     0    0    0 S    0  0.0   0:00.16 kworker/0:0                                                          
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0                                                          
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0                                                           
   11 root      20   0     0    0    0 S    0  0.0   0:00.54 kworker/0:1                                                          
   21 root       0 -20     0    0    0 S    0  0.0   0:00.00 cpuset                                                               
   22 root       0 -20     0    0    0 S    0  0.0   0:00.00 khelper                                                              
   23 root       0 -20     0    0    0 S    0  0.0   0:00.00 netns                                                                
   24 root      20   0     0    0    0 S    0  0.0   0:00.00 sync_supers                                                          
   25 root      20   0     0    0    0 S    0  0.0   0:00.00 bdi-default                                                          
   26 root       0 -20     0    0    0 S    0  0.0   0:00.00 kintegrityd                                                          
   27 root       0 -20     0    0    0 S    0  0.0   0:00.00 kblockd                                                              
   28 root       0 -20     0    0    0 S    0  0.0   0:00.00 kacpid                                                               
   29 root       0 -20     0    0    0 S    0  0.0   0:00.00 kacpi_notify                                                         
   30 root       0 -20     0    0    0 S    0  0.0   0:00.00 kacpi_hotplug                                                        
   31 root       0 -20     0    0    0 S    0  0.0   0:00.00 ata_sff                                                              
   32 root      20   0     0    0    0 S    0  0.0   0:00.00 khubd
Comment 9 Guy Zelck 2012-03-31 11:54:10 UTC
These were the boot parameters used :

kernel /vmlinuz-2.6.37.6-0.12-desktop root=/dev/system/root resume=/dev/system/swap splash=silent quiet showopts vga=0x31a agp=off nomodeset

I did not have to use the workaround "pcie_ports=compat".
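
To confirm that a given boot really ran without the workaround, the kernel command line can be checked programmatically. The following is a minimal Python sketch under stated assumptions: the helper names `parse_cmdline` and `uses_compat_workaround` are hypothetical, the sample string is copied from the boot entry quoted in this comment (without the `kernel <image>` prefix), and quoted parameter values containing spaces are not handled.

```python
# Sketch: check a kernel command line for the pcie_ports=compat workaround.
# Helper names are hypothetical and chosen for illustration only.


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None. Quoted values that contain
    spaces are not handled; this is a deliberate simplification.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def uses_compat_workaround(cmdline):
    """Return True if pcie_ports=compat is present on the command line."""
    return parse_cmdline(cmdline).get("pcie_ports") == "compat"


# Parameters copied from the boot entry quoted in this comment.
BOOT_PARAMS = (
    "root=/dev/system/root resume=/dev/system/swap splash=silent quiet "
    "showopts vga=0x31a agp=off nomodeset"
)

if __name__ == "__main__":
    print(uses_compat_workaround(BOOT_PARAMS))
```

On a running Linux system the command line of the current boot can be read from `/proc/cmdline` and passed to `uses_compat_workaround` in place of `BOOT_PARAMS`.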
Comment 10 Rafael Wysocki 2012-04-12 21:48:43 UTC
Sorry for the delay, I was traveling last week.

Thanks for the testing.

The patch from comment #4 has been added to the openSUSE-11.4 kernel repository and should be included in the next kernel update.
Comment 11 Guy Zelck 2012-04-12 22:13:30 UTC
Great!
Comment 12 Swamp Workflow Management 2012-06-28 08:11:26 UTC
openSUSE-SU-2012:0799-1: An update that solves 25 vulnerabilities and has 22 fixes is now available.

Category: security (moderate)
Bug References: 466279,651219,653260,655696,676204,681186,681639,683671,689860,703410,707332,711941,713430,714455,717209,717749,721366,726045,726600,729247,730118,731673,732908,737624,738644,740448,740703,740745,744658,745832,746980,747038,747660,748859,749569,750079,750959,756203,756840,757278,758243,758260,758813,759545,760902,765102,765320
CVE References: CVE-2009-4020,CVE-2010-3873,CVE-2010-4164,CVE-2010-4249,CVE-2011-1083,CVE-2011-1173,CVE-2011-2517,CVE-2011-2700,CVE-2011-2909,CVE-2011-2928,CVE-2011-3619,CVE-2011-3638,CVE-2011-4077,CVE-2011-4086,CVE-2011-4330,CVE-2012-0038,CVE-2012-0044,CVE-2012-0207,CVE-2012-1090,CVE-2012-1097,CVE-2012-1146,CVE-2012-2119,CVE-2012-2123,CVE-2012-2136,CVE-2012-2663
Sources used:
openSUSE 11.4 (src):    kernel-docs-2.6.37.6-0.20.2, kernel-source-2.6.37.6-0.20.1, kernel-syms-2.6.37.6-0.20.1, preload-1.2-6.17.1
Comment 13 Guy Zelck 2012-07-15 21:39:10 UTC
Rafael,

When the kernel packages for kernel-desktop-2.6.37.6-0.20.1 were distributed to the public, I installed them with yast2. To my disappointment, this did not solve the problem, and upon rebooting I had a runaway kernel worker thread 0:0 straight away.

I searched the openSUSE websites and found the following kernel on kernel.opensuse.org that works for me: kernel-desktop-2.6.37.6-121.1.x86_64.rpm.
Comment 14 Swamp Workflow Management 2012-11-05 09:12:26 UTC
openSUSE-SU-2012:1439-1: An update that solves 26 vulnerabilities and has 28 fixes is now available.

Category: security (moderate)
Bug References: 466279,651219,653260,655696,676204,681186,681639,683671,689860,703410,707332,711941,713430,714455,717209,717749,721366,726045,726600,729247,730118,731673,732908,734056,737624,738644,740448,740703,740745,744658,745832,746980,747038,747660,748859,749569,750079,750959,755546,756203,756840,757278,758243,758260,758813,759545,760902,765102,765320,769408,769784,769896,774285,781134
CVE References: CVE-2009-4020,CVE-2010-3873,CVE-2010-4164,CVE-2010-4249,CVE-2011-1083,CVE-2011-1173,CVE-2011-2517,CVE-2011-2700,CVE-2011-2909,CVE-2011-2928,CVE-2011-3619,CVE-2011-3638,CVE-2011-4077,CVE-2011-4086,CVE-2011-4110,CVE-2011-4330,CVE-2012-0038,CVE-2012-0044,CVE-2012-0207,CVE-2012-1090,CVE-2012-1097,CVE-2012-1146,CVE-2012-2119,CVE-2012-2123,CVE-2012-2136,CVE-2012-2663
Sources used:
openSUSE 11.4 (src):    kernel-docs-2.6.37.6-24.2, kernel-source-2.6.37.6-24.1, kernel-syms-2.6.37.6-24.1, preload-1.2-6.19.1