Bug 1074869

Summary: Linux-4.14.11 crashes on Wine (double fault)
Product: [openSUSE] openSUSE Distribution
Reporter: kolA flash <kolAflash>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: akontsevich, AxelKoellhofer, dbischof, dimstar, jslaby, kolAflash, luismiguel427, meissner, purevw, schwab, tiwai, tneo, vbabka
Version: Leap 42.3
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Bug Depends on:
Bug Blocks: 1075183
Attachments: error messages on CTRL-ALT-F10
             running StarCraft-2 in Wine on 4.14.12 with nopti

Description kolA flash 2018-01-05 21:19:50 UTC
Created attachment 755014 [details]
error messages on CTRL-ALT-F10

I'm using a fairly recent Radeon RX 550, which is why I upgraded my openSUSE 42.3 to Linux-4.14.x from https://download.opensuse.org/repositories/Kernel:/stable/standard/.

Until 4.14.7 everything was fine.

After upgrading to 4.14.11, Wine doesn't work anymore. Even "wine --version" (using wine-3.0-rc4 and wine-staging-2.20) makes my PC hang completely (the picture on the screen freezes, no SSH connection is possible).
My guess is that this has something to do with the recent KPTI (Meltdown and Spectre) fixes, and that it may be an upstream Linux or Wine bug.

The only way to get any error information was to run "sleep 5; wine --version" and to switch to CTRL-ALT-F10. There I got the attached error log. (see attached photo)
After 90 seconds the PC reboots, as announced in the error message. CTRL-ALT-DEL and ALT-SYSRQ-(REISUB) don't work.

All other programs seem to work well so far, including DRI 3D rendering for my KDE-5 desktop.

My hardware:
CPU: AMD Phenom(tm) II X4 955 Processor
RAM: 12 GB
GPU: Radeon RX-550
Mainboard: ASUS M4A785TD-V EVO

On my ThinkPad X220 (Intel Core i7 2xxx CPU without dedicated GPU) and a pretty similar openSUSE-42.3 system the bug doesn't appear when running Linux-4.14.11 and Wine.

Linux RPMs used:
https://download.opensuse.org/repositories/Kernel:/stable/standard/x86_64/kernel-default-4.14.11-2.1.gc36893f.x86_64.rpm
  https://web.archive.org/web/20180105210615/http://anorien.csc.warwick.ac.uk/mirrors/download.opensuse.org/repositories/Kernel:/stable/standard/x86_64/kernel-default-4.14.11-2.1.gc36893f.x86_64.rpm
https://download.opensuse.org/repositories/Kernel:/stable/standard/x86_64/kernel-syms-4.14.11-2.1.gc36893f.x86_64.rpm
  https://web.archive.org/web/20180105210621/http://anorien.csc.warwick.ac.uk/mirrors/download.opensuse.org/repositories/Kernel:/stable/standard/x86_64/kernel-syms-4.14.11-2.1.gc36893f.x86_64.rpm

Wine RPMs used:
https://download.opensuse.org/repositories/Emulators/openSUSE_Leap_42.3/x86_64/wine-3.0~rc4-468.1.x86_64.rpm
  https://web.archive.org/web/20180105205857/http://anorien.csc.warwick.ac.uk/mirrors/download.opensuse.org/repositories/Emulators/openSUSE_Leap_42.3/x86_64/wine-3.0~rc4-468.1.x86_64.rpm
https://download.opensuse.org/repositories/Emulators/openSUSE_Leap_42.3/x86_64/wine-32bit-3.0~rc4-468.1.x86_64.rpm
  https://web.archive.org/web/20180105205900/http://anorien.csc.warwick.ac.uk/mirrors/download.opensuse.org/repositories/Emulators/openSUSE_Leap_42.3/x86_64/wine-32bit-3.0~rc4-468.1.x86_64.rpm
Comment 1 Vlastimil Babka 2018-01-06 12:15:59 UTC
(In reply to kolA flash from comment #0)
> CPU: AMD Phenom(tm) II X4 955 Processor

This CPU should have page table isolation disabled as AMD should be safe, if that patch went into stable 4.14 already. Please grep for "pti" or "kaiser" in /proc/cpuinfo to verify neither is there.

Hm actually it seems from looking at source that pti will be there. Try booting with "nopti" kernel parameter, verify that pti is gone, and try wine again.

I recall from LWN comments wine was doing something with ldt that required some handling wrt PTI, maybe a patch is missing from stable...
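
The check described above could be scripted roughly as follows. This is only a sketch: `has_pti_flag` is a made-up helper name, and it assumes the flag shows up on the `flags` line of /proc/cpuinfo, as it does in the lscpu output later in this thread.

```shell
#!/bin/sh
# has_pti_flag is a hypothetical helper name, not an existing tool.
# It succeeds if the flags line of a cpuinfo-style file lists "pti" or "kaiser".
has_pti_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches whole words only, so "pti" cannot match inside another flag name.
    grep -E '^flags' "$cpuinfo" 2>/dev/null | grep -qwE 'pti|kaiser'
}

if has_pti_flag; then
    echo "PTI flag present"
else
    echo "PTI flag absent"
fi

# Show the boot parameters to confirm whether "nopti" was actually passed.
if [ -r /proc/cmdline ]; then
    cat /proc/cmdline
fi
```

Run it once after a normal boot and once after booting with "nopti"; the second run should report the flag as absent.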
Comment 2 Axel Köllhofer 2018-01-06 12:53:17 UTC
I can reproduce the reported problem on a rather old AMD machine of mine (an 8-year-old Toshiba netbook) with:

lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            20
Model:                 1
Model name:            AMD C-50 Processor
Stepping:              0
CPU MHz:               800.000
CPU max MHz:           1000.0000
CPU min MHz:           800.0000
BogoMIPS:              1994.96
Virtualization:        AMD-V
L1d cache:             32K
L1i cache:             32K
L2 cache:              512K
NUMA node0 CPU(s):     0,1
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni monitor ssse3 cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs skinit wdt hw_pstate pti vmmcall arat npt lbrv svm_lock nrip_save pausefilter

While my two other (Intel) machines don't show this behaviour, on the AMD machine the kernel crashes when executing _any_ 32-Bit application. I tried different types of 32-bit applications, statically linked against glibc, musl-libc or dietlibc and also dynamically linked against glibc and all of them crash the kernel immediately.
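
For anyone who wants to confirm that a test binary really is 32-bit before trying it, a minimal sketch follows. `elf_class` is a hypothetical helper name; it relies only on the ELF header layout, where byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```shell
#!/bin/sh
# elf_class is a hypothetical helper name, not an existing tool.
# It prints "32-bit" or "64-bit" based on the EI_CLASS byte of an ELF file.
elf_class() {
    # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not ELF"; return 1; }
    # Byte at offset 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    cls=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$cls" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Example: elf_class /bin/sh
```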

Booting with "nopti" successfully works around this issue.

AK
Comment 3 Vlastimil Babka 2018-01-06 13:35:35 UTC
So apparently this is a known problem that will be fixed in 4.14.12, which will also have the patch to automatically disable PTI on AMD processors. For now, please use the "nopti" parameter.
Comment 4 Axel Köllhofer 2018-01-06 13:51:05 UTC
(In reply to Vlastimil Babka from comment #3)
> So apparently this is known problem that will be fixed in 4.14.12. 

For the record (and other people reading this thread), this seems to be the commit you are referring to:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.14.y&id=151d7039757b71ebd9d170af0944562f51149372

AK
Comment 5 Jiri Slaby 2018-01-07 08:44:30 UTC
So does 4.14.12 work? Kernel stable has it already:
https://build.opensuse.org/project/monitor?project=Kernel%3Astable
Comment 6 Axel Köllhofer 2018-01-07 14:37:21 UTC
(In reply to Jiri Slaby from comment #5)
> So does 4.14.12 work? Kernel stable has it already:
> https://build.opensuse.org/project/monitor?project=Kernel%3Astable

Unfortunately, no.

I installed 4.14.12-2.1 from Kernel:stable

Linux 4.14.12-2.g7637ae2-default #1 SMP PREEMPT Sat Jan 6 09:10:30 UTC 2018 (7637ae2) x86_64 x86_64 x86_64 GNU/Linux

I tried executing some 32-Bit applications one after the other (Disclaimer: I have a nice little collection of small 32-Bit applications, most of them statically linked on my custom "Rescue-USB stick even useful for old systems", so I have some choice to test) and at first, the problem seemed to be fixed.

But after a while, the system crashed and in order to reproduce the problem, I rebooted the system and tried the same executable which crashed the system before. To my surprise, this time the system kept running.

Now I tried calling that application in a loop (while true; do ; done) and the system crashed after a few repetitions.

The same goes for the other 32-bit applications I tried: after a few repetitions in a loop, the system crashes reproducibly. (It also seems that the "bigger" the application is and the more output it produces, the fewer iterations are needed to crash the system, though not in a really predictable manner.)
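
A loop test of this kind could be written roughly as below. This is a sketch under assumptions: `run_repeatedly` is a made-up helper and `./test32` is a placeholder for whichever 32-bit binary is being tested. The iteration number goes to stderr before each run, so the last number left on the console shows how far the loop got if the machine hangs.

```shell
#!/bin/sh
# run_repeatedly is a hypothetical helper name, not an existing tool.
# Usage: run_repeatedly MAX COMMAND [ARGS...]
run_repeatedly() {
    max="$1"; shift
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Print progress first: after a kernel crash nothing else gets printed.
        echo "iteration $i" >&2
        "$@" >/dev/null 2>&1 || { echo "failed at iteration $i"; return 1; }
    done
    echo "completed $i iterations"
}

# Example with a placeholder binary: run_repeatedly 1000 ./test32
```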

In addition, booting the kernel with the "nopti" parameter no longer has any effect, unlike in 4.14.11.

AK
Comment 7 Dominique Leuenberger 2018-01-07 15:03:24 UTC
I see something very similar when building virtualbox in OBS with kernel 4.14.11 running as the base:

https://build.opensuse.org/package/live_build_log/openSUSE:Factory:Staging:D:DVD/virtualbox/standard/x86_64

In openQA, about 1% of the runs had a failure in glibc-sanity, which does nothing but run /lib/libc.so.6 (this special library can be executed directly).
Comment 8 kolA flash 2018-01-07 23:35:01 UTC
Created attachment 755068 [details]
running StarCraft-2 in Wine on 4.14.12 with nopti

Test results:

4.14.7 (no errors with this)
- /proc/cpuinfo doesn't have "pti"

4.14.11
- /proc/cpuinfo has "pti"
- When booted with the "nopti" option, "wine --version" and even more complex Windows applications like the "Blizzard App" or "StarCraft 2" run fine in Wine.

4.14.12
- /proc/cpuinfo doesn't have "pti"
- Boot with or without "nopti" makes no difference
- "wine --version" runs fine
- More complex Windows applications like the "Blizzard App" or "StarCraft 2" crash Wine. (see attached screenshot)
Comment 9 Takashi Iwai 2018-01-08 14:52:46 UTC
The double fault on 4.14.12 might not be specific to Wine. We now see a lot of 32-bit package build breakage (bsc#1075018).
Comment 10 Jiri Slaby 2018-01-09 13:42:54 UTC
*** Bug 1074918 has been marked as a duplicate of this bug. ***
Comment 11 Jiri Slaby 2018-01-09 13:43:02 UTC
*** Bug 1074920 has been marked as a duplicate of this bug. ***
Comment 12 Jiri Slaby 2018-01-09 13:43:11 UTC
*** Bug 1074921 has been marked as a duplicate of this bug. ***
Comment 13 Jiri Slaby 2018-01-09 13:43:21 UTC
*** Bug 1075018 has been marked as a duplicate of this bug. ***
Comment 14 Jiri Slaby 2018-01-09 13:43:35 UTC
*** Bug 1075034 has been marked as a duplicate of this bug. ***
Comment 16 Daniel Bischof 2018-01-09 18:06:11 UTC
---
Linux version 4.14.12-2.gf4b3cf0-default (geeko@buildhost) (gcc version 7.2.1 20171020 [gcc-7-branch revision 253932] (SUSE Linux)) #1 SMP PREEMPT Tue Jan 9 13:48:04 UTC 2018 (f4b3cf0)
---

works for me, fixes bug 1074921
Comment 17 Axel Köllhofer 2018-01-09 18:27:37 UTC
(In reply to Jiri Slaby from comment #15)
> Could anyone test this?
>  
> https://build.opensuse.org/project/monitor?project=home%3Ajirislaby%3Astable-
> bnc1075183

Works for me, tested with several 32-bit applications running in a loop for several minutes, no more crashes. However, some additional testing with a "big" application like wine might be a good idea.

Linux 4.14.12-2.gf4b3cf0-default #1 SMP PREEMPT Tue Jan 9 13:48:04 UTC 2018 (f4b3cf0) x86_64 x86_64 x86_64 GNU/Linux

AK
Comment 18 Jiri Slaby 2018-01-09 18:59:37 UTC
(In reply to Jiri Slaby from comment #15)
> Could anyone test this?
>  
> https://build.opensuse.org/project/monitor?project=home%3Ajirislaby%3Astable-
> bnc1075183

I submitted the fix to Kernel:stable and removed this project. Use the former (after it builds) if you still want to test.

Also I submitted it to factory.

If someone still sees some issues with the fixed kernel, please reopen your respective bug.
Comment 19 Aleksey Kontsevich 2018-01-10 00:41:20 UTC
(In reply to Jiri Slaby from comment #18)
> Also I submitted it to factory.
KDE still crashes with latest version from factory: 4.14.12-1.4 - Bug 1075034.
Comment 20 Jiri Slaby 2018-01-10 08:22:55 UTC
Factory has not received the kernel yet -- see:
https://build.opensuse.org/request/show/563126

Or monitor changelog and check whether it mentions these bugs.
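
One way to script that changelog check is sketched below. `mentions_bug` is a hypothetical helper, and the assumption that bug references appear as bsc#, bnc# or boo# followed by the number is mine, not something this thread confirms for every package.

```shell
#!/bin/sh
# mentions_bug is a hypothetical helper name, not an existing tool.
# It reads changelog text on stdin and succeeds if it references the given
# bug number in bsc#/bnc#/boo# notation (the notation is an assumption).
mentions_bug() {
    # The trailing group stops 1074869 from matching inside 10748690.
    grep -qE "(bsc|bnc|boo)#$1([^0-9]|\$)"
}

# Example on an RPM-based system:
# rpm -q --changelog kernel-default | mentions_bug 1074869 && echo "fix listed"
```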
Comment 21 Jiri Slaby 2018-01-10 08:23:40 UTC
*** Bug 1075034 has been marked as a duplicate of this bug. ***
Comment 22 kolA flash 2018-01-11 04:30:31 UTC
(In reply to Jiri Slaby from comment #18)
> (In reply to Jiri Slaby from comment #15)
> > Could anyone test this?
> >  
> > https://build.opensuse.org/project/monitor?project=home%3Ajirislaby%3Astable-
> > bnc1075183
> 
> I submitted the fix to Kernel:stable and removed this project. Use the
> former (after it builds) if you still want to test.
> 
> Also I submitted it to factory.
> 
> If someone still sees some issues with the fixed kernel, please reopen your
> respective bug.

Just installed 4.14.13-1.1.gbd444a0 from
  https://download.opensuse.org/repositories/Kernel:/stable/standard/
and it looks like everything is back to normal. StarCraft 2 is running fine again on Wine!
Comment 23 Aleksey Kontsevich 2018-01-11 05:35:27 UTC
(In reply to kolA flash from comment #22)
> Just installed 4.14.13-1.1.gbd444a0 from and it looks like everything is back to normal.
Good, waiting for 4.14.13 in factory.
Comment 24 Aleksey Kontsevich 2018-01-17 12:10:54 UTC
KDE still crashes with the latest kernel 4.14.12-1.8 from factory. When will a good kernel be available?
Comment 25 Marcus Meissner 2018-01-17 12:52:34 UTC
The 4.14.13 kernel will be in the next Tumbleweed snapshot, and I also submitted it for :Update.
Comment 26 Jiri Slaby 2018-01-18 09:25:59 UTC
.
Comment 27 t neo 2018-01-18 20:17:59 UTC
Will this fix also work for GOG games that are installed locally?