Bug 182151 - IBM Thinkpad T43 -- Suspend to RAM no longer works when ATI fglrx driver is loaded -- worked fine in 9.3 and 10.0
Status: RESOLVED FIXED
Duplicates: 229317
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: X11 3rd Party
Version: Final
Hardware: i686 SuSE Linux 10.1
Importance: P2 - High : Major with 5 votes
Target Milestone: ---
Assignee: Jammy Zhou
QA Contact: E-mail List
URL:
Whiteboard: Top5 Bug
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-06 18:52 UTC by Jarom Hatch
Modified: 2007-03-31 15:26 UTC
CC List: 11 users

See Also:
Found By: Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
syslog entries related to the software suspend (5.19 KB, text/plain)
2006-06-07 23:27 UTC, Matthew Hatch
xorg log files (16.60 KB, application/x-bzip-compressed-tar)
2006-06-07 23:27 UTC, Matthew Hatch

Description Jarom Hatch 2006-06-06 18:52:26 UTC
When my laptop suspends to RAM with the ATI fglrx driver loaded, it now locks up upon resume with a garbled display.  It had not done that since ATI fixed the lockup problem a few versions back.

If I unload fglrx suspend to RAM works fine.

SuSE 9.3 and 10.0 both functioned properly when suspended.

Suspend to disk *does* work properly in 10.1.
Comment 1 Matthew Hatch 2006-06-06 20:27:03 UTC
I can confirm that this is an existing problem on a nearly identical thinkpad, so it's not his configuration.

I have tried using a hand-rolled 2.6.16.19 kernel, as well as a new xorg.conf file, neither of which affected the hang on resume.  I haven't tried a custom xorg install yet.

Using the 2.6.13 kernel and xorg 6.8 (SUSE 10.0) worked just fine under default config with fglrx.
Comment 2 Michael Gross 2006-06-07 11:10:13 UTC
Please attach /var/log/Xorg.* here. Just in case also add 300 lines of /var/log/messages.
Comment 3 Matthew Hatch 2006-06-07 23:27:06 UTC
Created attachment 87850
syslog entries related to the software suspend

I basically cut the portion of /var/log/messages from the start of suspend2ram until syslog was restarted when the system rebooted.

I didn't see anything particularly out of the ordinary.
Comment 4 Matthew Hatch 2006-06-07 23:27:51 UTC
Created attachment 87851
xorg log files

Here are the /var/log/Xorg* files as requested.
Comment 5 Michael Gross 2006-06-08 09:37:24 UTC
Stefan, can you help here?
Comment 6 Stefan Dirsch 2006-06-08 09:53:43 UTC
Not really; only ATI can do that.
Comment 7 Jarom Hatch 2006-06-08 14:14:22 UTC
ATI already fixed the problem with fglrx several releases ago.  As I said, suspend to RAM worked until SuSE 10.1.  Therefore, something that changed between 10.0 and 10.1 broke it.
Comment 8 Matthew Tippett 2006-06-08 14:25:03 UTC
Does the same driver work with 10.1?  

ATI is currently investigating an issue with CPU_HOTPLUG.  It doesn't appear as though the driver is causing an issue.  Since you have rolled your own kernel already, please try without CPU_HOTPLUG.  

Reference Novell bug #181886.

I have also heard of T43s having some unique problems.  I will see if I can track down more information.
Comment 9 Jarom Hatch 2006-06-08 14:35:59 UTC
I am running a SuSE rolled kernel.  I can recompile without CPU_HOTPLUG and let you know what happens.

And, just so you know, a co-worker of mine has a different laptop (HP) and he has reported the same behavior to me.
Comment 10 Jarom Hatch 2006-06-08 14:38:59 UTC
By the way, I get this:

You are not authorized to access bug #181886.
Comment 11 Matthew Hatch 2006-06-08 14:49:03 UTC
There are no configurable options in the SuSE kernel source OR the vanilla 2.6.16.20 source related to CPU_HOTPLUG that I can find (doing a 'make menuconfig', 'less /proc/config.gz', or 'vi .config').  Is this a manual entry that needs to be inserted into the .config file (ex: CPU_HOTPLUG=n) or are you referring to a boot-time kernel option?
Comment 12 Matthew Tippett 2006-06-08 15:30:20 UTC
CONFIG_HOTPLUG_CPU is the option; apologies.

To reach it through menuconfig, you must have CONFIG_SMP, CONFIG_EXPERIMENTAL and CONFIG_HOTPLUG enabled.
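
As a minimal sketch, the options named in this comment could be checked against a kernel config like this. It assumes the running kernel exposes /proc/config.gz (the file mentioned in comment 11); the names option_states and read_running_config are illustrative only and do not belong to any tool discussed in this report.

```python
import gzip
import re

# Options named in comment 12; CONFIG_HOTPLUG_CPU is the one under test.
OPTIONS = ("CONFIG_SMP", "CONFIG_EXPERIMENTAL", "CONFIG_HOTPLUG", "CONFIG_HOTPLUG_CPU")


def option_states(config_text, options=OPTIONS):
    """Map each option to its value ('y', 'm', ...) or 'not set'.

    'not set' covers both an explicit '# CONFIG_X is not set' line and an
    option that is absent entirely (hypothetical helper, not a kernel tool).
    """
    states = {}
    for name in options:
        # Anchor on '=' so CONFIG_HOTPLUG does not match CONFIG_HOTPLUG_CPU.
        match = re.search(rf"^{re.escape(name)}=(.*)$", config_text, re.MULTILINE)
        states[name] = match.group(1) if match else "not set"
    return states


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config text; the file may not exist everywhere."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

Calling option_states(read_running_config()) on the affected machine would show whether CONFIG_HOTPLUG_CPU is actually enabled. On a uniprocessor kernel the option is simply missing, which this sketch reports as "not set".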
Comment 13 Matthew Hatch 2006-06-08 15:43:16 UTC
CONFIG_SMP, huh?  So, if I'm running a uniprocessor kernel, CONFIG_HOTPLUG_CPU isn't even an option, which would suggest it is disabled in this configuration -- there's no point in trying to hotplug the only cpu you have.  :)  With the config you've outlined above, really the only change I'm making is enabling SMP and leaving the cpu hotplug disabled.

Anyway, it's compiling now (I'm using the kernel source for the 2.6.16.13-4 kernel).  I'll let you know if it makes a difference.
Comment 14 Matthew Tippett 2006-06-08 15:52:32 UTC
I am unsure if the stock Novell kernels are always built with SMP.  Some distributions are reducing the number of kernels they support by stabilizing and dealing with Uniprocessor issues on the SMP paths.  Perhaps a Novell employee can provide guidance on the enablement of SMP in Novell kernels.
Comment 15 Matthew Hatch 2006-06-08 15:59:01 UTC
For some time now, SuSE/Novell have provided different rpms for different kernel types, depending on how your system is detected during install, ex: My laptop was configured with kernel-default-2.6.16.13-4 whereas my P4 at home was configured with kernel-smp-2.6.16.13-4.
Comment 16 Stefan Dirsch 2006-06-09 10:20:14 UTC
That is correct: SUSE still ships non-SMP kernels as well.
Comment 17 Matthew Hatch 2006-06-13 04:37:58 UTC
I don't know if we're still waiting for me to check an SMP kernel with CONFIG_HOTPLUG_CPU disabled or not, but for what it's worth, it made no difference.
Comment 18 Matthew Hatch 2006-06-19 14:06:41 UTC
Has there been any progress on this bug?
Comment 19 Egbert Eich 2006-06-23 08:23:05 UTC
 (In reply to comment #15)
> ex: My
> laptop was configured with kernel-default-2.6.16.13-4 whereas my P4 at home was
> configured with kernel-smp-2.6.16.13-4.

This is because the P4 supports Hyper-Threading.

It's really difficult to say from here what's going wrong. ATI has told me that they had problems reproducing this bug. What surprises me slightly is that suspend to disk works while suspend to RAM doesn't.
Suspend to disk re-POSTs the entire hardware, but this causes all the 3D engine state to be tossed.

I would like to ask you to do another test:
what happens when you VT switch to the console before you suspend X to ram?
Do you still get a lockup on resume or after you've switched back to X?
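
The test requested above could be scripted roughly as follows. This is only a sketch under assumptions: it writes to the kernel's /sys/power/state interface rather than going through whatever suspend front end the distribution uses, it takes tty1 as the text console and tty7 as the X console by default, and the helper names are hypothetical. It prints the commands and runs them only when dry_run is set to False, which requires root.

```python
import subprocess


def vt_suspend_steps(console_vt=1, x_vt=7):
    """Return the ordered (label, command) steps for the VT-switch test."""
    return [
        ("switch to text console", ["chvt", str(console_vt)]),
        # The write blocks until the machine resumes, so the next step
        # runs only after resume.
        ("suspend to RAM", ["sh", "-c", "echo mem > /sys/power/state"]),
        ("switch back to X", ["chvt", str(x_vt)]),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False (needs root)."""
    for label, command in steps:
        print(f"{label}: {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, check=True)
```

Keeping the last step separate makes it possible to stop after resume and check whether the lockup happens on the console already or only once X is brought back.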
Comment 20 Jarom Hatch 2006-06-24 18:09:30 UTC
Both Matthew and I tested this with close to the same results.  Both systems came back while switched to tty1 (upon resume mine had some garbled stuff at the top of the screen where the SuSE logo is, but switching to tty2 and back corrected this).  However, when I switched to tty7 X was locked up.  The only difference between mine and Matthew's was that mine still had an active mouse pointer, but the keyboard was dead.  His was completely dead.  I was able to SSH into the laptop and try killing X, which caused a quick death of my PC (I could ping, no ssh, no local terminal).  Matthew could not SSH once he switched back to X.
Comment 21 Jarom Hatch 2006-06-28 03:33:53 UTC
I ran across this site regarding the Thinkpad and I am unsure what it means:

http://thinkwiki.org/wiki/Problems_with_fglrx#Troubles_using_software_suspend

The lines of most interest are the last two:  T43 and SuSE 10.1, one using swsusp and one using Suspend to RAM

Both say this:

without vbetool or UseDummyXServer, with DRI enabled
Comment 22 Matthias Hopf 2006-07-04 13:49:49 UTC
This could be an int10 issue we had with our X.org packages (bug #180535, bug #170991, bug #158806).

Could you please retest as soon as updated X.org packages are out for 10.1 (presumably shortly after the SLES10 release)?
Comment 23 Stefan Dirsch 2006-07-12 02:51:45 UTC
You can find RPMs for testing in 

  ftp://ftp.suse.com/pub/people/sndirsch/RPMS/bug182151
Comment 24 Jarom Hatch 2006-07-12 03:38:28 UTC
I get this:

Could not chdir to bug182151: server said: CWD command failed.

Permissions?
Comment 25 Stefan Dirsch 2006-07-12 06:13:05 UTC
Could you try again? It probably hadn't been synced yet when you tried. Thanks.
Comment 26 Jarom Hatch 2006-07-12 15:48:45 UTC
Both Matt and I tried with the same result -- no difference in behavior.  The system still freezes after resume.
Comment 27 Matthew Hatch 2006-07-24 14:13:15 UTC
Just an update:

Today I found a bunch of updates available from YaST Online Update which I installed, specifically:

kernel-default-2.6.16.21-0.13
xorg-x11-server-6.9.0-50.17

There were other packages related to xorg, but they were all based on this same version (I believe).  Anyway, with every update installed and configured (also re-installed fglrx for the new kernel), the system still hangs after suspend-to-ram in exactly the same fashion as before.
Comment 28 Stefan Dirsch 2006-07-24 14:16:43 UTC
Sure, it's the same update you tried before.
Comment 29 Matthew Hatch 2006-07-24 14:48:39 UTC
Almost -- the update I tried before was xorg-x11-server-6.9.0-50.14.  Maybe there are no major changes, but the version number /is/ different.  :)

Anyway...
Comment 30 Matthew Hatch 2006-08-02 14:54:27 UTC
A new version of fglrx (8.27.10) was released recently, but it did not address this issue -- I'm still locking up after the upgrade.
Comment 31 Jarom Hatch 2006-09-29 17:54:50 UTC
I just installed FGLRX 8.29.06 today and re-tested -- Same behavior as before.

I wanted to note something though.  If I switch to tty1 before suspending, it will resume.  Switching back to tty7 immediately kills it.  If, after I resume in tty1 I attempt to go to init 3 the screen blanks out but the system is still running.  At this point I can SSH in.  If I then go to runlevel 5 again, X starts up normally without locking and the display returns to normal.

Is there an easy way to perhaps install x.org 6.8 and see if it works there or would that break too many things?
Comment 32 Matthew Hatch 2006-10-05 01:52:28 UTC
I just updated my kernel to 2.6.16.21-0.25, my fglrx driver to 8.29.06, and my xorg-x11-server to 6.9.0-50.24.

Nothing has changed -- still crashes as previously described.

I am going to attempt an install of openSUSE 10.2 Alpha to see if this issue has been resolved thus far in the new development tree.  If it is, we may want to see what happens by downgrading to xorg 6.8 or upgrading to 7.0 or 7.1 within SUSE 10.1.

Just a thought...
Comment 33 Matthew Hatch 2006-10-19 20:40:45 UTC
No dice on 10.2 Alpha 5...

Using kernel 2.6.18-9, fglrx 8.29.06, and xorg 7.1-27, the system locks up exactly as in previous versions.

I should note that this problem again only presented itself after fglrx was installed, as I successfully suspended under the default configuration without ATI's driver.

...thoughts?
Comment 34 Matthew Hatch 2006-10-23 04:20:39 UTC
I just recently tested using a fresh install of SUSE Linux Enterprise Desktop 10.  It works fine until I install fglrx, just as on the other tests.  I figured it would fail here since it's basically the same as openSUSE 10.1, but still...  This is your enterprise product, and a LOT of enterprises use Thinkpads.

We haven't heard anything from Novell on this matter since July 24, so I'm assuming that this isn't a priority case.  Can we escalate this a bit, as the bug still exists in your current development base?  Has anyone besides Jarom and me been able to duplicate this problem?

My final tests on this machine will be with Fedora Core 5 and Kubuntu.  If either of them function properly, I may (against all better judgement) be looking at my new distro of choice.
Comment 35 Matthew Hatch 2006-10-25 18:09:56 UTC
My test with Fedora Core 5 was a success.  The kernel was upgraded to 2.6.18, xorg was version 7.0, and I used fglrx 8.29.06.

I noticed that Fedora seems to use a different method to suspend the system.  My next test (when I get time) will be to try this different method with openSUSE.  I'll report my findings when I can.  Jarom will be working on it as well.

I'm still a little perturbed by the lack of activity on this bug.  At the very least, let us know that this is still on the radar.
Comment 36 Christian Ober-Bloebaum 2006-11-13 15:23:36 UTC
I can confirm all of the reported behaviour here:

Thinkpad R52, ATI Radeon X300
SuSE 10.1
Xorg 6.9

 - s2ram with fglrx worked for me before, now only suspend 2 disk works
 - machine crashes if one tries to use an X running with fglrx that was suspended before
 - it works fine without X running or with X with radeon-driver
 - also killing the old X from the terminal (when suspended from there) and starting a new one works fine

I tried some combinations of vbetool commands after waking up from suspend but nothing prevented the system from hanging when switching back to vt7 to an X that has been suspended...
Comment 37 Jarom Hatch 2006-11-13 18:45:12 UTC
Good to finally see someone else with this problem who has had it work in prior versions of SuSE.  Though, it seems that Novell has dropped this bug...

I verified that Fedora Core 6 works fine with s2ram.

One other thing of note:  I decided to try running without the fglrx drivers.  s2ram works with the opensource Radeon driver but as soon as you enable DRI (so I can get acceleration) s2ram no longer works.  So, it seems to not be a problem with fglrx at all, since I was able to reproduce the problem with the opensource driver.

Can we get s2ram to work again with an accelerated driver??????
Comment 38 Stefan Dirsch 2006-11-13 21:20:25 UTC
> I decided to try running without the fglrx drivers.
> s2ram works with the opensource Radeon driver but as soon as you enable DRI
> (so I can get acceleration) s2ram no longer works.
We don't support DRI on X300 cards.
Comment 39 Jarom Hatch 2006-11-13 22:40:21 UTC
That's not the point -- it used to work.  I want it to work again.
Comment 40 Stefan Dirsch 2006-11-14 02:35:08 UTC
> That's not the point -- it used to work.
Only by accident.
> I want it to work again.
Good luck!
Comment 41 Jarom Hatch 2006-11-14 05:22:38 UTC
Wow...  nice cop-out!  It almost seems that you don't care...

Look, whether it was intended for it to work or not, it did at one point and the fact that it no longer does indicates that some unintended regression occurred.  Therefore, the code is now broken.  Why is it so hard to figure out what happened?  Other distros are able to perform this task just fine.  What is it with SuSE 10.1 and above that is different?  Was an enhancement made that is causing this behavior?  If so, what was it, and do we really need it?  Does it have something to do with Xgl?  Can it be undone?
Comment 42 Christian Ober-Bloebaum 2006-11-14 08:00:32 UTC
Jarom,

you said it works with Fedora and you also mentioned that Fedora has a different suspend mechanism - have you figured out the difference? Maybe one could have a look at the code and try to do something similar in SUSE - especially since it cannot have anything to do with the kernel or the graphics driver package provided by SUSE (I use a vanilla kernel and the current driver from the ATI website).
Comment 43 Stefan Dirsch 2006-11-14 09:05:14 UTC
Jarom, it's just a matter of priorities. Features we do not support (reverse-engineered and experimental drivers, due to lack of documentation), and which we never supported or tested in the past, do not have the highest priority, as you might imagine.
Comment 44 Stefan Dirsch 2006-12-05 16:19:07 UTC
We need to check first if this is a driver regression (of a new driver) or a specific problem of 10.1. For this we'll try to reproduce this problem with a driver >= 8.29 on SUSE 9.3/10.0.
Comment 45 Stefan Dirsch 2006-12-05 16:20:43 UTC
We might run into some troubles with compiling newer fglrx drivers for older kernels, which are used by SUSE 9.3/10.0.
Comment 46 Stefan Dirsch 2007-01-15 10:45:18 UTC
First I need hardware for testing. Stefan B., do we have a T43 (not T43P!) for testing available?
Comment 47 Stefan Behlert 2007-01-15 15:58:19 UTC
I don't think we have it available at the moment. I'll take a look.
Comment 48 Stefan Dirsch 2007-01-15 22:52:07 UTC
*** Bug 229317 has been marked as a duplicate of this bug. ***
Comment 51 Stefan Dirsch 2007-02-13 15:47:19 UTC
Finally I found hardware for testing. TODO (for me): see comments #44/45
Comment 52 Stefan Dirsch 2007-02-15 10:49:59 UTC
Update: Neither Suspend-to-Ram nor Suspend-To-Disk work properly with openSUSE 10.2 using the 8.33.6 fglrx driver.
Comment 53 Stefan Dirsch 2007-02-15 12:06:30 UTC
Update: Neither Suspend-to-Ram nor Suspend-To-Disk work properly with SUSE Linux 9.3 using the 8.33.6 fglrx driver (resume does not work).
Comment 54 Stefan Dirsch 2007-02-16 17:45:57 UTC
Just tried the 8.19.10 fglrx driver version, which we shipped together with SUSE 9.3. When trying to start the Xserver I get a black screen and that's all. No need to test STD/STR.

I don't think we have a driver regression here - it's the opposite. It is not a kernel bug (same problems on 9.3 and 10.2). Anyway, I think this is something which should be looked at - by ATI/AMD. I think it makes sense for ATI/AMD to support STD/STR on a T43. It's quite common AFAIK.
Comment 55 Matthew Hatch 2007-02-22 19:33:02 UTC
Interesting news:

Jarom called me yesterday afternoon reporting that after upgrading to fglrx 8.34.8, suspend is working.  He is still using 10.1.  When I got home last night, I updated to the new driver on my T43 running 10.2 and behold -- it worked for both suspend-to-ram and suspend-to-disk.  There was a leap in the air, a shout for joy, and a few tears.

I don't have immediate access to any older versions (or my SLED 10 disk) so I can't test it anywhere else at the moment.

Are we looking at a fix for this bug?  What was done?
Comment 56 Tanel Kokk 2007-02-23 08:39:23 UTC
Tried the new fglrx (8.34.8) driver on SuSE-10.2, too. Doesn't work! To be more specific, it might work (for once), but the next time it may, for example, not suspend(-to-ram) at all, or it may suspend and then not resume (so a power-reset is required).

pity!

my system:
IBM/Lenovo T43 (Radeon Mobility M300)
SuSE-10.2 (kernel-2.6.18.2-34-default)
fglrx_7_1_0_SUSE102-8.34.8-1

Comment 57 Jarom Hatch 2007-02-23 15:19:41 UTC
I am currently using the new driver and I have suspended and resumed successfully multiple times now without a reboot.  The only drawback for me is that the display may become a little garbled until I switch to another tty and back to X.  I can deal with that though.  I'm still using SuSE 10.1 so I personally haven't tried 10.2.

It looks like progress, however there is still a little work to do.  
Comment 58 Matthias Hopf 2007-02-26 17:27:55 UTC
This is a well known bug of the fglrx driver. I sometimes could reestablish the correct mode setting by switching back and forth to X multiple times. YMMV
Comment 59 Stefan Dirsch 2007-03-31 15:26:57 UTC
Since the initial problem has been fixed for the initial reporter (comment #57), let's finally close this bug report as fixed.