Bug 1106635

Summary: DRM radeon GPU fault detected (gem object lookup failed)
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Andrey Karepin <egdfree>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED DUPLICATE
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: egdfree, sndirsch, tiwai
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
URL: https://bugs.freedesktop.org/show_bug.cgi?id=105381
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Andrey Karepin 2018-08-30 18:26:02 UTC
After plugging a second monitor into DisplayPort, I see these errors in dmesg (the other monitor is connected via HDMI):

kernel: [drm:radeon_cs_parser_relocs [radeon]] *ERROR* gem object lookup failed 0xa
kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to parse relocation -2!

kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0ec35014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105B76
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03050014
kernel: VM fault (0x04, vmid 1) at page 1071990, write from CB (80)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0e835014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105B57
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03010014
kernel: VM fault (0x04, vmid 1) at page 1071959, write from CB (16)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0ee36014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105C3E
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03020014
kernel: VM fault (0x04, vmid 1) at page 1072190, write from CB (32)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0ec36014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105CC0
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03010014
kernel: VM fault (0x04, vmid 1) at page 1072320, write from CB (16)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0ee35014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105D36
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03060014
kernel: VM fault (0x04, vmid 1) at page 1072438, write from CB (96)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0e836014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105D9C
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03050014
kernel: VM fault (0x04, vmid 1) at page 1072540, write from CB (80)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0ea35014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105E12
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03010014
kernel: VM fault (0x04, vmid 1) at page 1072658, write from CB (16)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x07435014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105B38
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03010014
kernel: VM fault (0x04, vmid 1) at page 1071928, write from CB (16)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x07035014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105BA3
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03010014
kernel: VM fault (0x04, vmid 1) at page 1072035, write from CB (16)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x07035014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105C26
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03060014
kernel: VM fault (0x04, vmid 1) at page 1072166, write from CB (96)
kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x07236014
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105C23
kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03020014
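
For anyone reading the log above: the page number in each "VM fault" line is the VM_CONTEXT1_PROTECTION_FAULT_ADDR value in decimal, and the other fields appear to be bit fields of VM_CONTEXT1_PROTECTION_FAULT_STATUS. Below is a minimal Python sketch of that decoding. The bit positions are an assumption, chosen because they reproduce the values printed in this log; they are not taken from the report or checked against the driver source. The function names are hypothetical.

```python
import re

# ASSUMED bit layout of VM_CONTEXT1_PROTECTION_FAULT_STATUS.
# It reproduces the decoded values in the log above; treat it as a guess.
PROTECTIONS_MASK = 0xF        # bits 0-3  -> "0x04" in the log
CLIENT_ID_SHIFT = 12          # bits 12-19 -> number in "CB (80)"
CLIENT_RW_SHIFT = 24          # bit 24: 1 = write, 0 = read
VMID_SHIFT = 25               # bits 25-28 -> "vmid 1"

REG_RE = re.compile(
    r"VM_CONTEXT1_PROTECTION_FAULT_(ADDR|STATUS)\s+0x([0-9A-Fa-f]+)"
)


def decode_fault(addr: int, status: int) -> dict:
    """Split one ADDR/STATUS register pair into the fields the log prints."""
    return {
        "page": addr,
        "protections": status & PROTECTIONS_MASK,
        "client_id": (status >> CLIENT_ID_SHIFT) & 0xFF,
        "access": "write" if (status >> CLIENT_RW_SHIFT) & 1 else "read",
        "vmid": (status >> VMID_SHIFT) & 0xF,
    }


def decode_dmesg(lines) -> list:
    """Pair each ADDR line with the STATUS line that follows it."""
    faults, addr = [], None
    for line in lines:
        match = REG_RE.search(line)
        if match is None:
            continue
        value = int(match.group(2), 16)
        if match.group(1) == "ADDR":
            addr = value
        elif addr is not None:
            faults.append(decode_fault(addr, value))
            addr = None
    return faults


sample = [
    "kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00105B76",
    "kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03050014",
]
for fault in decode_dmesg(sample):
    print(fault)
```

For the first fault in the log this gives page 1071990, protections 0x04, vmid 1, a write access, and client id 80, which matches the "VM fault" line the kernel printed.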

installed:
kernel-default-4.18.5-1.6.x86_64
libdrm_radeon1-32bit-2.4.93-1.1.x86_64
libdrm_amdgpu1-2.4.93-1.1.x86_64
libdrm_nouveau2-2.4.93-1.1.x86_64
libva-drm2-2.2.0-1.1.x86_64
libdrm_nouveau2-32bit-2.4.93-1.1.x86_64
libdrm2-2.4.93-1.1.x86_64
libdrm_intel1-32bit-2.4.93-1.1.x86_64
libdrm_intel1-2.4.93-1.1.x86_64
libdrm_radeon1-2.4.93-1.1.x86_64
libdrm2-32bit-2.4.93-1.1.x86_64
libdrm_amdgpu1-32bit-2.4.93-1.1.x86_64
libdrm-devel-2.4.93-1.1.x86_64
Mesa-libGLESv2-2-18.1.6-207.1.x86_64
Mesa-dri-32bit-18.1.6-205.1.x86_64
libOSMesa8-18.1.6-207.1.x86_64
Mesa-libva-18.1.6-207.1.x86_64
Mesa-libEGL1-18.1.6-207.1.x86_64
Mesa-gallium-18.1.6-207.1.x86_64
Mesa-libEGL-devel-18.1.6-207.1.x86_64
Mesa-libEGL1-32bit-18.1.6-207.1.x86_64
Mesa-gallium-32bit-18.1.6-205.1.x86_64
Mesa-libGL1-18.1.6-207.1.x86_64
Mesa-libglapi0-18.1.6-207.1.x86_64
Mesa-libd3d-32bit-18.1.6-205.1.x86_64
Mesa-dri-18.1.6-207.1.x86_64
Mesa-libglapi0-32bit-18.1.6-207.1.x86_64
Mesa-libGL-devel-18.1.6-207.1.x86_64
Mesa-demo-x-8.4.0-1.3.x86_64
Mesa-libd3d-18.1.6-207.1.x86_64
Mesa-32bit-18.1.6-207.1.x86_64
libOSMesa8-32bit-18.1.6-207.1.x86_64
Mesa-libGL1-32bit-18.1.6-207.1.x86_64
Mesa-18.1.6-207.1.x86_64

hardware:
Radeon HD7770
Comment 1 Andrey Karepin 2018-08-30 18:27:14 UTC
The first monitor is connected via HDMI, the second via DP.
Comment 2 Takashi Iwai 2018-09-03 13:50:58 UTC
Is this a regression from older kernels?

Since there haven't been many changes in the radeon driver code between 4.17 and 4.18, I suppose the bug must be somewhere else.

In any case, we need to report this upstream. Care to do it? E.g. at bugzilla.freedesktop.org, category DRI, DRM/Radeon.
Comment 3 Andrey Karepin 2018-09-03 17:44:08 UTC
> Is this a regression from older kernels?
The second monitor worked fine until I upgraded to Tumbleweed snapshot 20180827.

> bugzilla.freedesktop.org, category DRI, DRM/Radeon
Report submitted as 107819.
Comment 4 Andrey Karepin 2018-09-04 10:59:46 UTC
patch
https://bugs.freedesktop.org/show_bug.cgi?id=105381#c20
Comment 5 Takashi Iwai 2018-09-04 12:08:53 UTC
Stefan, could you build a test package with these fix patches?
Comment 6 Stefan Dirsch 2018-09-04 19:26:29 UTC
Oh. Actually the patches are already in the package we currently have in our buildservice project.

https://build.opensuse.org/package/show/X11:XOrg/xf86-video-ati
Comment 7 Andrey Karepin 2018-09-05 11:10:22 UTC
I switched the package vendor to X11:XOrg and it works perfectly, thanks!

Stefan, could you please submit this version to Tumbleweed?
Comment 8 Stefan Dirsch 2018-09-05 12:33:40 UTC
(In reply to Andrey Karepin from comment #7)
> I switched the package vendor to X11:XOrg and it works perfectly, thanks!
> 
> Stefan, could you please submit this version to Tumbleweed?

It was just accepted today. Check the xf86-video-ati RPM changelog:

Thu Aug 16 14:19:06 UTC 2018 - sndirsch@suse.com

- Update to release 18.0.99 (git describe: 18.0.1-44-g740f0850)
  * supposed to provide a fix for boo#1100759, fdo#107528, fdo#105381

Closing as fixed.
Comment 9 Andrey Karepin 2018-09-05 13:05:41 UTC
> It was just accepted today.
Thanks!
Comment 10 Andrey Karepin 2018-09-05 13:15:00 UTC

*** This bug has been marked as a duplicate of bug 1100759 ***