Bug 1011155

Summary: Nvidia graphics card fan not running or too slow, danger of overheating
Product: [openSUSE] openSUSE Distribution
Reporter: Egon Niessner <susebugzilla>
Component: X.Org
Assignee: E-mail List <xorg-maintainer-bugs>
Status: RESOLVED UPSTREAM
QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Major
Priority: P3 - Medium
CC: sndirsch
Version: Leap 42.2
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Egon Niessner 2016-11-20 10:53:52 UTC
I upgraded from openSUSE Leap 42.1 to 42.2.

I have an NVIDIA GeForce GT240 graphics card in my system.
When I start openSUSE Leap 42.2, the fan of the graphics card
stops. It only starts running, and then very slowly, after the cooler
of the card has become very hot.

The system uses the nouveau driver.
A test with the nvidia gfxG04 driver was not possible
because it produced a black screen.

(When Windows 10 is booted on this dual-boot system,
or the system is rebooted and enters the BIOS,
the fan spins up immediately.)
Comment 1 Egon Niessner 2016-11-20 16:52:40 UTC
I made a second attempt with a new installation on another hard disk.
I observed that the fan on the NVIDIA card was not running during the
whole installation phase of openSUSE 42.2 with YaST.
When rebooting after the installation, the fan only ran while the computer
was in the BIOS.
Once the newly installed openSUSE had started, the fan stopped.
While working with the newly installed openSUSE 42.2, it never started running.

I then installed the nvidia-gfxG04 367.57-18 driver with YaST.

After rebooting, the fan started running.

So I think that the nouveau driver, which is installed by default,
can damage the graphics card or chip through overheating, both during
installation and in normal use.
Comment 2 Andreas Stieger 2016-11-21 08:17:23 UTC
Stefan?
Comment 3 Stefan Dirsch 2016-11-21 15:00:12 UTC
Honestly, Tesla is getting old. Nouveau engineers are no longer actively working on it. So if the proprietary driver is working for you, I suggest living with it.
Of course you can try kernel-of-the-day and see if that one improves the situation for nouveau.
Comment 4 Stefan Dirsch 2016-11-21 15:01:07 UTC
(In reply to Egon Niessner from comment #1)
> I then installed the nvidia-gfxG04 367.57-18 driver with YaST.

G04 does not support the GT 240, so I guess you've installed the G03 340.xx driver instead.
Comment 5 Egon Niessner 2016-11-21 15:32:51 UTC
I have checked my system with the GT240 card.
With rpm -qa | grep -i nvidia
I got this output:

rpm -qa | grep -i nvidia
nvidia-computeG04-367.57-18.2.x86_64
nvidia-glG04-367.57-18.2.x86_64
nvidia-gfxG04-kmp-default-367.57_k4.4.27_2-18.3.x86_64
x11-video-nvidiaG04-367.57-18.2.x86_64
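
As an aside, the driver generation and version can be pulled out of a package list like the one above with a small filter. This is only a sketch: the function name nvidia_driver_info is made up for illustration, and the pattern assumes kmp package names shaped like the nvidia-gfxG04-kmp-default line shown here.

```shell
# Hypothetical helper: read package names on stdin and print
# "<generation> <driver version>" for each nvidia-gfx*-kmp package.
# Assumes names of the form
#   nvidia-gfx<Gnn>-kmp-<flavor>-<version>_k<kernel>-<release>.<arch>
nvidia_driver_info() {
    sed -n 's/^nvidia-gfx\(G[0-9][0-9]\)-kmp-[^-]*-\([0-9][0-9.]*\)_k.*$/\1 \2/p'
}

# Usage on a live system (requires rpm):
#   rpm -qa | nvidia_driver_info
```

Lines that are not kmp packages are ignored, so the full rpm -qa output can be piped in unfiltered.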


The page at
http://www.nvidia.de/download/driverResults.aspx/107956/de
lists the following under "Supported cards":
GeForce 200 Series:
GeForce GTX 295, GeForce GTX 285, GeForce GTX 280, GeForce GTX 275, GeForce GTX 260, GeForce GTS 250, GeForce GTS 240, GeForce GT 230, GeForce GT 240, GeForce GT 220, GeForce G210, GeForce 210, GeForce 205

So the GT 240 is supported by the G04 nvidia driver.
Comment 6 Stefan Dirsch 2016-11-21 15:42:46 UTC
(In reply to Egon Niessner from comment #5)
> I have looked into my system with the GT240 card.
> With   rpm -qa | grep -i nvidia
> i got this output:
> 
> rpm -qa | grep -i nvidia
> nvidia-computeG04-367.57-18.2.x86_64
> nvidia-glG04-367.57-18.2.x86_64
> nvidia-gfxG04-kmp-default-367.57_k4.4.27_2-18.3.x86_64
> x11-video-nvidiaG04-367.57-18.2.x86_64
> 
> 
> Under the link:
> http://www.nvidia.de/download/driverResults.aspx/107956/de

Linux x64 (AMD64/EM64T) Display Driver
Version: 	340.98 
                ^^^^^^^

> So the GT 240 is supported by the G04 nvidia driver.

It's not.

http://www.nvidia.com/Download/driverResults.aspx/111596/en-us

No GT 240 mentioned.
Comment 7 Egon Niessner 2016-11-22 08:33:07 UTC
The GT240 is listed on the page I linked in comment 5.

But the question here is not whether the G04 driver or some other
nvidia driver is usable for the GT240.
The problem is that the nouveau driver does not run the fan
of the graphics card at all, or only very late and very slowly.

If the manufacturer's nvidia driver does not let the temperature rise as
high as the nouveau driver does, then I think the manufacturer of the
graphics card knows better which temperature is suitable for its card.

Does the nouveau driver have a start parameter that controls how the
fan speed is handled?
Comment 8 Stefan Dirsch 2016-11-22 11:44:31 UTC
(In reply to Egon Niessner from comment #7)
> Under the web-link I posted in comment 5, the GT240 is listed.

Again. This is 340.xx, which is G03.
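
To make the series distinction explicit, here is a minimal sketch of the mapping. It covers only the two pairings stated in this thread (340.xx is G03, 367.xx is G04); the function name nvidia_generation is hypothetical, and every other version is deliberately reported as unknown rather than guessed.

```shell
# Hypothetical helper: map an NVIDIA driver version to the openSUSE
# package generation. Only the two mappings mentioned in this bug
# report are included; anything else is reported as "unknown".
nvidia_generation() {
    case "$1" in
        340.*) echo G03 ;;
        367.*) echo G04 ;;
        *)     echo unknown ;;
    esac
}

# Usage:
#   nvidia_generation 340.98
#   nvidia_generation 367.57
```

The NVIDIA page linked in comment 5 is for version 340.98, so it describes the G03 series, not the installed G04 packages.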

> But the problem here is not, if driver G04 is usable for the GT240
> or another nvidia driver.
> The problem is, that the nouveau driver lets run the fans
> of the graphics card not or very late/slow.

I know. ;-)

> When the nvidia driver of the manufacturer lets rise the temperature not
> so high than the nouveau driver, I think, the manufacturer of the graphic
> card knows better, what temperature is suitable for his card.

I agree.

> Has the nouveau driver a start parameter, where the handling of the fan speed
> can be controlled ?

Check this. ;-)

https://wiki.archlinux.org/index.php/nouveau#Fan_Control
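
For reference, the fan control described on that wiki page goes through hwmon files in sysfs. Below is a read-only sketch for checking what nouveau currently reports. It assumes the hwmon device identifies itself as "nouveau" in its name file and uses the standard hwmon attribute names; nouveau_fan_status is a hypothetical helper name, and attributes that are missing on a given card are simply skipped.

```shell
# Hypothetical helper: print the thermal/fan attributes of every hwmon
# device whose name file reads "nouveau". Read-only, changes nothing.
# $1 optionally overrides the hwmon root (default: /sys/class/hwmon).
nouveau_fan_status() {
    root="${1:-/sys/class/hwmon}"
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "nouveau" ] || continue
        # temp1_input is in millidegrees Celsius by hwmon convention;
        # pwm1_enable is the fan management mode.
        for attr in temp1_input fan1_input pwm1 pwm1_enable; do
            if [ -r "$dir/$attr" ]; then
                echo "$attr=$(cat "$dir/$attr")"
            fi
        done
    done
    return 0
}

# Usage:
#   nouveau_fan_status
```

Running it a few times while the card warms up shows whether the reported temperature rises while the fan values stay unchanged.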
Comment 9 Egon Niessner 2016-11-23 11:41:19 UTC
I inserted the openSUSE 42.2 installation DVD and loaded the rescue system.
After loading, the fan on the NVIDIA GT240 graphics card stopped.

In the rescue system, the nouveau driver is used.

I played around with the pwm1_enable register mentioned in the wiki linked in comment 8,
but there was no change in the fan speed; it remained stopped.
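
For anyone repeating this experiment, it amounts to roughly the following. This is a sketch under assumptions, not a recommended procedure: it assumes that writing 1 to pwm1_enable selects manual mode and that nouveau's pwm1 takes a percentage from 0 to 100. nouveau_set_fan is a hypothetical helper name, and the hwmon directory has to be supplied by the caller. Use at your own risk and keep an eye on the temperature.

```shell
# Hypothetical helper: switch a hwmon device to manual fan mode and
# set a duty cycle. Needs root on a real system.
# $1: hwmon directory (one of /sys/class/hwmon/hwmon*), $2: percent.
nouveau_set_fan() {
    dir="$1"
    percent="$2"
    # Accept only plain non-negative integers.
    case "$percent" in
        ''|*[!0-9]*) echo "percent must be an integer" >&2; return 2 ;;
    esac
    if [ "$percent" -gt 100 ]; then
        echo "percent must be between 0 and 100" >&2
        return 2
    fi
    if [ ! -w "$dir/pwm1_enable" ] || [ ! -w "$dir/pwm1" ]; then
        echo "no writable pwm1 interface in $dir" >&2
        return 1
    fi
    echo 1 > "$dir/pwm1_enable"    # assumed: 1 = manual mode
    echo "$percent" > "$dir/pwm1"  # assumed: percentage, 0 to 100
}

# Usage (as root, with the directory of the nouveau hwmon device):
#   nouveau_set_fan /sys/class/hwmon/hwmon1 80
```

The range check is there so that a typo cannot write an out-of-range value; whether the card actually reacts to the setting is exactly what did not work here.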

Over time, the card got hotter and hotter.

To avoid damage, I stopped testing the driver.

You should add a warning to the release notes of openSUSE 42.2 for users
of NVIDIA cards who use the nouveau driver, with a hint that they should
install the manufacturer's driver as soon as possible to avoid damage
by overheating.
The wiki carries this warning:
"Warning: Use at your own risk! Don't overheat your card!"

The nouveau driver team should also be informed that the driver causes problems.
Comment 10 Stefan Dirsch 2016-11-23 13:25:54 UTC
(In reply to Egon Niessner from comment #9)
> You should insert a warning into the release notes of opensuse 42.2 for users
> of nvidia cards, which use the nouveau driver and give a hint, that they
> should
> install the driver of the manufacturer as soon as possible to avoid damage
> by overheating.

A high percentage of our users have NVIDIA GPUs, and how many of them are affected by this in the end? What kind of message would this be? "Don't use this product. It's unsafe. Thank you."

> Also the nouveau driver team should be informed, that the driver makes
> problems.

I invite you to report this to the nouveau developers as a contribution to the openSUSE community you're a member of. You're the person having this issue. And it may only occur on your gfx card with this one firmware.

Seriously, I don't think it's a good idea to disable support in the nouveau DRM/KMS driver for this GPU's device ID. Others who may not have seen this problem at all would then suffer from it.

The first thing to do would be to try our kernel-of-the-day.

https://en.opensuse.org/openSUSE:Kernel_of_the_day

The issue may have been fixed in the meantime.
Comment 11 Egon Niessner 2016-11-25 12:05:38 UTC
With the kernel-of-the-day there was no change in the behavior of the graphics card's fan; it did not start.

I created bug report 98852 at
https://bugs.freedesktop.org/show_bug.cgi?id=98852
Comment 12 Stefan Dirsch 2016-11-25 13:52:54 UTC
(In reply to Egon Niessner from comment #11)
> With the kernel-of-the-day there was no change with the fan of the graphic
> card, it did not start.

Ok. Thanks for giving it a try!

> I created the bug report 98852 on 
> https://bugs.freedesktop.org/show_bug.cgi?id=98852

Thanks a bunch!
Comment 13 Max Staudt 2017-01-25 15:03:42 UTC
Thanks for your bug report, both in the openSUSE bugzilla as well as upstream!

Unfortunately, this bug has gone stale, even upstream.

Once you are aware of an upstream fix for your issue, we would appreciate it if you reopened the bug and let us know.
We will be happy to include it in your openSUSE distribution if it's technically feasible.

In any other case, a fix should trickle down into one of the following releases automatically.