Bug 1032832

Summary: intel: gnuplot's qt driver needs an excessive amount of memory for Intel HD graphics; system locks up
Product: [openSUSE] openSUSE Distribution
Reporter: Ulrich Windl <Ulrich.Windl>
Component: Kernel
Assignee: E-mail List <xorg-maintainer-bugs>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P3 - Medium
CC: forgotten_DV81ZEWZkN, mstaudt, sor.alexei, Ulrich.Windl, werner
Version: Leap 42.2
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 42.2
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: Sample data input file (compressed)
blocks.png -- just a test of blocks.sed plotted
Screen shot when system had locked up

Description Ulrich Windl 2017-04-07 08:00:38 UTC
Created attachment 720197 [details]
Sample data input file (compressed)

I was trying to plot a datafile consisting of not quite four million numbers (command: plot "datafile").
The X server seemed to freeze immediately, even though my system has 16GB RAM (plus about 2GB swap). After a long time the system was usable again.
It seems the "qt" driver that outputs the plot needed at least 18GB of RAM.
When using the "x11" driver (set terminal x11), the plot was displayed.

If the memory consumption of the qt driver cannot be reduced, the driver should print a warning before eating up all the system memory.

Messages started with this (using Intel HDA graphics):
Apr 05 21:27:29 i7g4x4a kernel: Unable to purge GPU memory due lock contention.
Apr 05 21:27:31 i7g4x4a kernel: gnuplot_qt invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
Apr 05 21:27:31 i7g4x4a kernel: gnuplot_qt cpuset=/ mems_allowed=0
Apr 05 21:27:31 i7g4x4a kernel: CPU: 1 PID: 4318 Comm: gnuplot_qt Not tainted 4.4.49-16-default #1

Ending with:
Apr 05 21:27:33 i7g4x4a kernel: Out of memory: Kill process 4318 (gnuplot_qt) score 909 or sacrifice child
Apr 05 21:27:33 i7g4x4a kernel: Killed process 4318 (gnuplot_qt) total-vm:18199436kB, anon-rss:14379968kB, file-rss:2444kB, shmem-rss:1628kB
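
To put the figures in the final oom-killer line into perspective, here is a minimal Python sketch that extracts the `name:NNNkB` fields and converts them to GiB. The helper names are made up for illustration, and the conversion assumes that the kernel's "kB" means units of 1024 bytes.

```python
import re

# Last line of the log excerpt above, copied verbatim.
OOM_LINE = (
    "Apr 05 21:27:33 i7g4x4a kernel: Killed process 4318 (gnuplot_qt) "
    "total-vm:18199436kB, anon-rss:14379968kB, file-rss:2444kB, shmem-rss:1628kB"
)

KIB_PER_GIB = 1024 * 1024  # assumption: the kernel's "kB" is 1024 bytes


def parse_oom_kill_line(line):
    """Return a dict mapping each 'name:NNNkB' field to its integer value."""
    return {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}


def kib_to_gib(kib):
    """Convert a size given in KiB to GiB."""
    return kib / KIB_PER_GIB


fields = parse_oom_kill_line(OOM_LINE)
for name, kib in fields.items():
    print(f"{name}: {kib_to_gib(kib):.2f} GiB")
```

Under that assumption the killed gnuplot_qt process had about 17.4 GiB of virtual memory and about 13.7 GiB of anonymous resident memory, on a machine with 16GB RAM.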
Comment 1 Dr. Werner Fink 2017-04-07 08:33:11 UTC
I've no idea what qt does, sorry.  I'm using wxt and not qt ... interesting that qt is now the default. Besides this, there seems to be, as shown by

 Apr 05 21:27:29 i7g4x4a kernel: Unable to purge GPU memory due lock contention.

a permission problem here ... maybe /usr/lib/gnuplot/5.0/gnuplot_qt needs some extra permissions
Comment 2 Dr. Werner Fink 2017-04-07 09:02:39 UTC
You might give gnuplot 5.0.5 from Publishing/gnuplot a try, see

  http://download.opensuse.org/repositories/Publishing/openSUSE_Leap_42.2/x86_64/gnuplot-5.0.5-90.1.x86_64.rpm

there have been some changes for Qt as well:

  Changes in 5.0.5
  * CHANGE qt terminal force selection of outline font rather than bitmap font
  * CHANGE qt terminal sets TERM_POLYGON_PIXELS to avoid aliasing artifacts
  * FIX qt - leading or trailing whitespace in enhanced text was being ignored
  Changes in 5.0.4
  * FIX 'set term qt <N> close' acts immediately rather than after next mouse event
  Changes in 5.0.3
  * CHANGE qt terminal: toggle plots on/off only on left-click
  Changes in 5.0.2
  * FIX qt terminals dots were invisible
  * FIX regression in 5.0.1 that left extraneous '@' in title columnhead(N)
  * FIX qt terminal could drop chars from stdin depending on external event timing
  Changes in 5.0.1
  * CHANGE autoconfigure of Qt5 support now looks for --variable=host_bins
  * FIX qt terminal 3D rotation mode tendency to get stuck "on"
  Changes in final release of 5.0
  * FIX lt 0 rendering by libgd and qt terminals
Comment 3 Dr. Werner Fink 2017-04-07 09:29:29 UTC
Created attachment 720217 [details]
blocks.png -- just a test of blocks.sed plotted

Just tested the command

  plot "blocks.sed"

as described in the initial report.  The GPU here is

  RADEON(0): Chipset: "ATI FireGL V3400" (ChipID = 0x71d2)

and I only have 6 GB of RAM, so the reported problem could also be a problem with the X11 driver for the Intel HDA GPU.
Comment 4 Ulrich Windl 2017-04-07 09:56:33 UTC
(In reply to Dr. Werner Fink from comment #3)
(...)
> and I only have 6 Gig of RAM the reported problem could also be a problem
> with the X11 driver for Intel HDA GPU

You seem to be on the right track: The system here has 4GB of RAM and an old
NVIDIA GPU GeForce 9500 GT (G96) at PCI:1:0:0 (GPU-0)

I was also able to get the plot. It took some time until the contents appeared, but it worked.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user 28760  4.4  7.2 536800 293524 pts/1   T    11:51   0:02 gnuplot
user 28762 43.1  7.0 849744 287016 ?       Sl   11:51   0:18 /usr/lib/gnuplot/5.0/gnuplot_qt
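
For readers who want the ps figures above in more familiar units, here is a small Python sketch that parses the listing and prints RSS and VSZ in MiB. It assumes that ps reports both columns in units of 1024 bytes; the function name is made up for illustration.

```python
# The ps listing from this comment, copied verbatim.
PS_OUTPUT = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user 28760  4.4  7.2 536800 293524 pts/1   T    11:51   0:02 gnuplot
user 28762 43.1  7.0 849744 287016 ?       Sl   11:51   0:18 /usr/lib/gnuplot/5.0/gnuplot_qt
"""


def parse_ps(text):
    """Turn 'ps' output with a header row into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # COMMAND is the last column and may contain spaces, so limit the split.
        values = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    return rows


rows = parse_ps(PS_OUTPUT)
for row in rows:
    rss_mib = int(row["RSS"]) / 1024  # assumption: RSS is reported in KiB
    vsz_mib = int(row["VSZ"]) / 1024  # assumption: VSZ is reported in KiB
    print(f"{row['COMMAND']}: RSS {rss_mib:.0f} MiB, VSZ {vsz_mib:.0f} MiB")
```

On this machine both gnuplot and gnuplot_qt stay at roughly 280 to 290 MiB resident while plotting.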
Comment 5 Dr. Werner Fink 2017-04-07 10:25:27 UTC
(In reply to Ulrich Windl from comment #4)
> (In reply to Dr. Werner Fink from comment #3)
> (...)
> > and I only have 6 Gig of RAM the reported problem could also be a problem
> > with the X11 driver for Intel HDA GPU
> 
> You seem to be on the right track: The system here has 4GB of RAM and an old
> NVIDIA GPU GeForce 9500 GT (G96) at PCI:1:0:0 (GPU-0)

This is a different system from this one:

(In reply to Ulrich Windl from comment #0)
> Messages started with this (using Intel HDA graphics):
> pr 05 21:27:29 i7g4x4a kernel: Unable to purge GPU memory due lock
> contention.
> Apr 05 21:27:31 i7g4x4a kernel: gnuplot_qt invoked oom-killer:
> gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0,
> oom_score_adj=0
> Apr 05 21:27:31 i7g4x4a kernel: gnuplot_qt cpuset=/ mems_allowed=0
> Apr 05 21:27:31 i7g4x4a kernel: CPU: 1 PID: 4318 Comm: gnuplot_qt Not
> tainted 4.4.49-16-default #1

In other words, 4 GB + NVIDIA GPU works whereas 16 GB + Intel HDA GPU does not.
Now the question arises whether Qt has special enhancements for the different GPUs out there and/or whether the X11:Xorg driver for Intel HDA GPUs has a problem.
Comment 6 Ulrich Windl 2017-04-07 18:19:32 UTC
Created attachment 720315 [details]
Screen shot when system had locked up

(In reply to Ulrich Windl from comment #4)
> (...)
> > and I only have 6 Gig of RAM the reported problem could also be a problem
> > with the X11 driver for Intel HDA GPU

"HDA GPU" is wrong; it's:
[    59.955] (--) intel(0): Integrated Graphics Chipset: Intel(R) HD Graphics 4600
[    59.955] (--) intel(0): CPU: x86-64, sse2, sse3, ssse3, sse4.1, sse4.2, avx, avx2; using a maximum of 4 threads


To make things worse: The kernel in comment #0 was Linux 4.4.49-16-default on x86_64. After updating to 4.4.57-18.3-default, I don't see any message any more, and the system locks up hard. Reproduced twice.
Before the second attempt I also ran top in a separate window, as the syslog did not produce any message before the lockup. A symptom of the lockup was that not even the Num-Lock key toggled the PS/2 keyboard's LED.

(Changing severity and summary accordingly)
Comment 7 Dr. Werner Fink 2017-04-07 20:40:34 UTC
If the kernel locks up then this is a kernel bug within the code for the Intel GPU
Comment 8 Stefan Dirsch 2017-05-11 12:52:04 UTC
No good idea either. Reassigning.
Comment 9 Ulrich Windl 2017-05-11 13:19:17 UTC
For completeness I should add that the system where the problem occurs features an encrypted volume group on top of an Intel matrix RAID1, meaning that the swap is also encrypted. That fact might be related to the observed "kernel freeze" (unsure whether it really froze, but it did not react to anything for minutes, including the ACPI power switch).
Comment 10 Ulrich Windl 2017-06-07 14:33:43 UTC
There might be three cases to check (from the online documentation "help qt"):
In Qt version 4.7 or newer this can be controlled by the environmental variable QT_GRAPHICSSYSTEM. The options are "native", "raster", or "opengl" in order of increasing rendering speed.  For earlier versions of Qt the terminal defaults to "raster".
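
The three cases could be tried one at a time with something like the following Python sketch. It is only an illustration: the function names are made up, it assumes that `gnuplot` is on the PATH and accepts `-p` (persist) and `-e` (run commands), and whether QT_GRAPHICSSYSTEM has any effect depends on the Qt version gnuplot_qt was built against.

```python
import os
import subprocess

# The three values named in "help qt".
GRAPHICS_SYSTEMS = ("native", "raster", "opengl")


def build_env(graphics_system, base_env=None):
    """Return a copy of the environment with QT_GRAPHICSSYSTEM set."""
    if graphics_system not in GRAPHICS_SYSTEMS:
        raise ValueError(f"unknown graphics system: {graphics_system!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["QT_GRAPHICSSYSTEM"] = graphics_system
    return env


def build_command(datafile):
    """Return the gnuplot command line that plots datafile with the qt terminal."""
    return ["gnuplot", "-p", "-e", f'set terminal qt; plot "{datafile}"']


def run_case(graphics_system, datafile, timeout=120):
    """Run one case and return gnuplot's exit status.

    The timeout only covers the gnuplot front end; the separate
    gnuplot_qt process keeps running with the plot window.
    """
    result = subprocess.run(
        build_command(datafile),
        env=build_env(graphics_system),
        timeout=timeout,
    )
    return result.returncode
```

A call such as `run_case("raster", "blocks.sed")` would then test a single setting. Since the problem reported here can lock up the whole machine, it seems safer to try the cases one by one, with all other work saved, and to watch memory usage in a second terminal.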
Comment 11 Max Staudt 2018-02-07 18:27:05 UTC
Does this still happen on Leap 42.3?
What hardware were you using?

I just tried this on an Intel Broadwell machine, and while plotting this data set is slow as molasses, it did appear after about 15 seconds. I haven't noticed any crazy RAM usage either. Well, gnuplot does seem to peak at 700 MB or more of RAM, but in any case, my system has only 12 GB and it still works. The kernel doesn't spew any warnings at all.

Thanks!
Comment 12 Ulrich Windl 2018-02-10 19:45:21 UTC
I can confirm that the issue is gone in 42.3 with current updates; some of those were:
gnuplot-5.0.0-12.24.x86_64
kernel-default-4.4.114-42.1.x86_64
Comment 13 Stefan Dirsch 2018-02-11 06:35:58 UTC
OK. Then let's close this as fixed, with Leap 42.2 no longer being supported.