Bug 432809 - C1 state unsupported on modern AMD mobile CPUs
Summary: C1 state unsupported on modern AMD mobile CPUs
Status: RESOLVED INVALID
Alias: None
Product: openSUSE 11.0
Classification: openSUSE
Component: Kernel (show other bugs)
Version: Final
Hardware: Other openSUSE 11.0
: P5 - None : Normal with 1 vote (vote)
Target Milestone: ---
Assignee: Thomas Renninger
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-10-06 21:39 UTC by Jakub Jozwicki
Modified: 2009-01-30 15:28 UTC (History)
3 users (show)

See Also:
Found By: Corporate Interoperability Test
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jozwicki 2008-10-06 21:39:05 UTC
Newer laptops with AMD Turion X2 and AMD Athlon X2 mobile processors (which are
C1E capable) are overheating and have very degraded battery life due to bugs in
/usr/src/linux/drivers/acpi/processor_idle.c (not properly implemented C1 support).
C1E replaces C2 and C3 states, so BIOS usually doesn't have definitions for
_CST and PBLK in FADT - it is valid according to ACPI specification.

Here is MS info about PPM in Vista:
http://download.microsoft.com/download/0/0/b/00bba048-35e6-4e5b-a3dc-36da83cbb0d1/ProcPowerMgmt.docx

Below is patch which fixes C1 visibility, but doesn't increase battery life to Vista level.

--- cut here --
--- kernels/linux-2.6.27-rc8/drivers/acpi/processor_idle.c	2008-09-30 00:24:02.000000000 +0200
+++ linux/drivers/acpi/processor_idle.c	2008-10-06 00:24:19.000000000 +0200
@@ -501,7 +501,7 @@
 	 * ------
 	 * Invoke the current Cx state to put the processor to sleep.
 	 */
-	if (cx->type == ACPI_STATE_C2 || cx->type == ACPI_STATE_C3) {
+	if (cx->type >= ACPI_STATE_C1) {
 		current_thread_info()->status &= ~TS_POLLING;
 		/*
 		 * TS_POLLING-cleared state must be visible before we
@@ -523,12 +523,17 @@
 		 * Use the appropriate idle routine, the one that would
 		 * be used without acpi C-states.
 		 */
+		 
+		t1 = jiffies;
+		 
 		if (pm_idle_save) {
 			pm_idle_save(); /* enables IRQs */
 		} else {
 			acpi_safe_halt();
 			local_irq_enable();
 		}
+		
+		t2 = jiffies;
 
 		/*
 		 * TBD: Can't get time duration while in C1, as resumes
@@ -538,8 +543,7 @@
 		 * Note: the TSC better not stop in C1, sched_clock() will
 		 *       skew otherwise.
 		 */
-		sleep_ticks = 0xFFFFFFFF;
-
+		sleep_ticks = ticks_elapsed(t1, t2);
 		break;
 
 	case ACPI_STATE_C2:
@@ -642,12 +646,13 @@
 		return;
 	}
 	cx->usage++;
-	if ((cx->type != ACPI_STATE_C1) && (sleep_ticks > 0))
+	if (cx->type != ACPI_STATE_C1 && sleep_ticks > 0)
 		cx->time += sleep_ticks;
 
 	next_state = pr->power.state;
 
 #ifdef CONFIG_HOTPLUG_CPU
+
 	/* Don't do promotion/demotion */
 	if ((cx->type == ACPI_STATE_C1) && (num_online_cpus() > 1) &&
 	    !pr->flags.has_cst && !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED)) {
@@ -811,8 +816,11 @@
 	if (!pr)
 		return -EINVAL;
 
+	/* Newer dual-core CPUs use C1E instead of C2 and C3 and
+	 * usually do not have _CST definitions or PBLK entries.
+	 * ACPI specification allows for that so return zero here */
 	if (!pr->pblk)
-		return -ENODEV;
+		return 0;
 
 	/* if info is obtained from pblk/fadt, type equals state */
 	pr->power.states[ACPI_STATE_C2].type = ACPI_STATE_C2;
@@ -852,6 +860,11 @@
 		pr->power.states[ACPI_STATE_C1].type = ACPI_STATE_C1;
 		pr->power.states[ACPI_STATE_C1].valid = 1;
 		pr->power.states[ACPI_STATE_C1].entry_method = ACPI_CSTATE_HALT;
+		snprintf(pr->power.states[ACPI_STATE_C1].desc, ACPI_CX_DESC_LEN, "ACPI HLT");
+		if (!pr->power.states[ACPI_STATE_C1].latency)
+			pr->power.states[ACPI_STATE_C1].latency = 1;
+		if (!pr->power.states[ACPI_STATE_C1].power)
+			pr->power.states[ACPI_STATE_C1].power = 1000;
 	}
 	/* the C0 state only exists as a filler in our array */
 	pr->power.states[ACPI_STATE_C0].valid = 1;
@@ -1191,12 +1204,11 @@
 	memset(pr->power.states, 0, sizeof(pr->power.states));
 
 	result = acpi_processor_get_power_info_cst(pr);
-	if (result == -ENODEV)
-		result = acpi_processor_get_power_info_fadt(pr);
-
 	if (result)
-		return result;
+		result = acpi_processor_get_power_info_fadt(pr);
 
+	/* No valid _CST and FADT, but C1 must be supported,
+	 * so here we go */
 	acpi_processor_get_power_info_default(pr);
 
 	pr->power.count = acpi_processor_power_verify(pr);
@@ -1216,13 +1228,13 @@
 #endif
 
 	/*
-	 * if one state of type C2 or C3 is available, mark this
+	 * if one state of type C1(e), C2 or C3 is available, mark this
 	 * CPU as being "idle manageable"
 	 */
 	for (i = 1; i < ACPI_PROCESSOR_MAX_POWER; i++) {
 		if (pr->power.states[i].valid) {
 			pr->power.count = i;
-			if (pr->power.states[i].type >= ACPI_STATE_C2)
+			if (pr->power.states[i].type >= ACPI_STATE_C1)
 				pr->flags.power = 1;
 		}
 	}
@@ -1455,7 +1467,7 @@
 static int acpi_idle_enter_c1(struct cpuidle_device *dev,
 			      struct cpuidle_state *state)
 {
-	u32 t1, t2;
+	u32 t1, t2, elapsed;
 	struct acpi_processor *pr;
 	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
 
@@ -1482,7 +1494,10 @@
 
 	local_irq_enable();
 	cx->usage++;
-
+	elapsed = ticks_elapsed(t1, t2);
+	if (elapsed > 0)
+		cx->time += elapsed;
+	
 	return ticks_elapsed_in_us(t1, t2);
 }
 
-- cut here --
Comment 1 Thomas Renninger 2008-10-08 09:23:41 UTC
Have you or has this patch been posted upstream?
Discussing patches in bugzilla is ugly, best would be you post that upstream (lkml and acpi-linux) and CC me.

> but doesn't increase battery life to Vista level
Do you see power savings at all and if how much about?

I have several questions about this, IMO this should be discussed via mail first.
Comment 2 Thomas Renninger 2008-10-08 14:14:23 UTC
Jakub, I am very interested in this one.
I always was sure C1 is entered on these (should be hlt instruction if no other C-states are found?).
You may want to mail me privately, adding Joachim from AMD if you do not like to post that on lkml and linux-acpi directly.
Comment 3 Jakub Jozwicki 2008-10-08 16:59:40 UTC
I have sent an email with content:
------------------------------------
>Have you or has this patch been posted upstream?
>Discussing patches in bugzilla is ugly, best would be you post that
>upstream
>(lkml and acpi-linux) and CC me.

Hello,
For me sending my own patches to lkml doesn't have a sense.
Long time ago I had sent patches for enabling UDMA100 on Asus A6K and Acer
Aspire 3680 and these patches were ignored. I had filled bug report in SuSE
bugzilla and the same patches signed by SuSE people were accepted. If you are
not top committer you are ignored. I had the same bad experience with patches
fixing -rt kernel.

>Jakub, I am very interested in this one.
>I always was sure C1 is entered on these (should be hlt instruction >if no
other C-states are found?).

Let's assume this kind of situation: Laptop manufacturer knows that Turion
X2 or mobile Athlon X2 supports C1(e). He doesn't add _CST and FADT entries in
DSDT, since C1 is always supported on any CPU (in the theory). Now if you
look at processor_idle.c : acpi_processor_get_power_info you see that:

a) if there is no _CST we return from _get_power_info_cst with error
and enter _get_power_info_dsdt
b) if P_BLK address is zero we return from _get_power_info_dsdt with error
and exit with error the whole _get_power_info function without marking that C1
is supported.

If normal idle function is the same as C1 handling then mine patch doesn't
have any sense (except enabling visibility of C1 in Powertop). From the other
side I have checked with Windows Driver Kit for Vista and pwrtest.exe that my
CPU on Windows Server 2008 uses C1 above 90% of the time while reading
slashdot or osnews and battery life is above 3 hours. The same activity on
OpenSUSE 11.0, 11.1 beta2 - battery lives 1h 40 min. The are no miracles - either
W2K8 has some marvelous power management or power management in linux is
broken.

>You may want to mail me privately, adding Joachim from AMD if you do >not
like to post that on lkml and linux-acpi directly.

I've been trying to ask tech.support@amd.com for some technical help,
but I was told that 'Linux is not officially supported'...
Comment 4 Jakub Jozwicki 2008-10-08 18:06:56 UTC
ftp://download.intel.com/technology/IAPC/acpi/downloads/30222305.pdf

Section 3.3: C1 with I/O.

Maybe amdk8.sys uses C1 I/O halt with hardcoded IO Port address?
Comment 5 Jakub Jozwicki 2008-10-08 22:04:00 UTC
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/41256.pdf

2.4.3.1       C1 Enhanced State (C1E)
The C1 enhanced state (C1E) is a stop-grant state supported by the processor. The C1E state is characterized by
the following properties:
     • All cores are in the halt (C1) state.
     • The ACPI-defined P_LVL3 register has been accessed.
     • The IO Hub has issued a STPCLK assertion message with the appropriate SMAF for C1E entry. Note
       that [The ACPI Power State Control Registers] F3x[84:80] specify the processor clocking and voltage
       behavior in response to the C1E SMAF.
     • The processor has issued a STOP_GRANT message to the IO Hub.
General requirements for C1E:
     • The ACPI-defined C2 and C3 states must not be declared to the operating system.
         • The ACPI-defined P_LVL2_LAT should be greater than 100.
         • The ACPI-defined P_LVL3_LAT should be greater than 1000.
     • C1E should only be enabled when the platform is in ACPI power management mode.
     • C1E is not supported when CLMC is used by the IO Hub.

---------------------------


Comment 6 Jakub Jozwicki 2008-10-08 22:24:27 UTC
Linux enables HyperTransport Centralized Link Management Controller and then CPU cannot enter C1E and we do not see any power savings. Am I right?
Comment 7 Jakub Jozwicki 2008-10-08 22:40:37 UTC
No, I was wrong. MSRs are misconfigured. I will try to fix it during weekend.
Comment 8 Thomas Renninger 2008-10-08 22:53:08 UTC
AFAIK, through MSR you can only check whether C1E is supported and disable/enable it.
I remember testing power consumption when C1E was disabled via MSR (as a workaround for an old kernel) and enabled. I definitely saw a power consumption change. That would mean (and it is more or less the only possibility from OS view) that C1E worked.

> I've been trying to ask tech.support@amd.com for some technical help,
> but I was told that 'Linux is not officially supported'...
Yeah, but this is practically for every Windows only supported model.
I mean it's the support, they have to ask you this and then say good bye if the answer is Linux.

Joachim told me he has some news for C1E for me. He could add you into CC if you are interested. It is great to see people getting involved.
Comment 9 Jakub Jozwicki 2008-10-08 23:33:28 UTC
Please add me to CC. Recently I have bought a nice laptop (http://jakub007.blogspot.com/2008/09/acer-aspire-4530-laptop-ma-14-calowy.html)
and I'm going to have usable Linux on it.
Comment 10 Jakub Jozwicki 2008-10-09 19:54:27 UTC
I have modified arch/x86/kernel/cpu/amd_64.c:

static int __cpuinit early_init_amd_mc(struct cpuinfo_x86 *c)
{
	unsigned bits, ecx;

	/* Multi core CPU? */
	if (c->extended_cpuid_level < 0x80000008)
		return 0;
#ifdef CONFIG_SMP
	ecx = cpuid_ecx(0x80000008);

	c->x86_max_cores = (ecx & 0xff) + 1;

	/* CPU telling us the core id bits shift? */
	bits = (ecx >> 12) & 0xF;

	/* Otherwise recompute */
	if (bits == 0) {
		while ((1 << bits) < c->x86_max_cores)
			bits++;
	}

	c->x86_coreid_bits = bits;
	return 1;
#else
	return 1;
#endif
}

static void __cpuinit early_init_amd(struct cpuinfo_x86 *c)
{
	int mc, addr;
	unsigned long value;
	mc = early_init_amd_mc(c);
	
	/* c->x86_power is 8000_0007 edx. Bit 8 is constant TSC */
	if (c->x86_power & (1<<8))
		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);

	set_cpu_cap(c, X86_FEATURE_SYSCALL32);
	
#define MSR_C1E 0xc0010055
#define C1E_ON_CMP_HALT_BIT	(1L << 28)
#define SMI_ON_CMP_HALT_BIT	(1L << 27)
#define IO_RD			(1L << 26)
#define INTR_PND_MSG_BIT	(1L << 25)
#define IO_MSG_ADDR_MASK	((1L << 16)-1)

	if (c->x86 == 0x11) {
		rdmsrl(MSR_C1E, value);
		addr = value & IO_MSG_ADDR_MASK;
		if (addr==0)
			addr = 0x1015; // what should be here as P_LVL3?
		if (addr) {
			printk(KERN_INFO "Enabling AMD C1E power management with IO address %x\n", addr);
			value &= ~INTR_PND_MSG_BIT;
			value |= IO_RD;
			if (mc) {
				value |= C1E_ON_CMP_HALT_BIT;
				value &= ~SMI_ON_CMP_HALT_BIT;
			}
			else {
				value &= ~C1E_ON_CMP_HALT_BIT;
				value |= SMI_ON_CMP_HALT_BIT;
			}
			wrmsrl(MSR_C1E, value);
		}
		else {
			printk(KERN_ERR "Cannot enable AMD C1E power management due to invalid IO address\n");
		}
	}
}

------------------------
Addr is 0. I have checked MSR_C1E on Windows with tool from eprotek.com and this MSR is zero. Is there other way to enable C1E?
Comment 11 Jakub Jozwicki 2008-10-11 21:44:45 UTC
http://jakub007.blogspot.com/2008/10/powertop-on-linux-2.html

powertop screenshots
Comment 12 Jakub Jozwicki 2008-10-11 22:17:56 UTC
objdump -S amdk8.sys | less /hlt:

11ab0:       8b ff                   mov    %edi,%edi
11ab2:       fb                      sti
11ab3:       f4                      hlt
11ab4:       c3                      ret

amdk8.sys references hal.dll with WRITE_PORT_UCHAR...
but we can see normal C1. Does AMD know what is inside Vista driver?
Comment 13 Thomas Renninger 2009-01-30 15:27:59 UTC
First: sorry for the late answer and thanks for your input.

> Is there other way to enable C1E?
Only via BIOS setting.

Theoretically you could try to switch it on via msr writes, but you shouldn't override your BIOS settings.

You can use rdmsr and wrmsr from the msr-tools package on Linux to find out more about the C1E settings of your machine.

-> this is not a Linux issue -> closing invalid.
Comment 14 Thomas Renninger 2009-01-30 15:28:46 UTC
BTW: C1E should get enabled via SMM BIOS code when the OS switches to ACPI mode.