|
Bugzilla – Full Text Bug Listing |
| Summary: | X11 not starting - Tumbleweed - After Kernel update (5.0.9 +) - radeon | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | James Roulston <james.roulston.1> |
| Component: | Kernel | Assignee: | E-mail List <kernel-maintainers> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P2 - High | CC: | anatoli.antonovitch, fabian.baumanis, james.roulston.1, msvec, tiwai, tzimmermann |
| Version: | Current | ||
| Target Milestone: | Current | ||
| Hardware: | x86-64 | ||
| OS: | SUSE Other | ||
| Whiteboard: | |||
| Found By: | Community User | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
mkinitrd output
Patch for the amdgpu-dkms specfile. |
||
The description is somewhat confusing. Not sure which kernel module is supposed to be the culprit here. radeon or radeonfb? AFAIK radeonfb build is completely disabled ... Could you explain in more detail, please?

When booting with 'Splash=0' I get the terminal message 'fb0: switching to radeondrmfb to VESA VGA.' I saw in the blacklist file '50-blacklist.conf' that radeonfb is indeed disabled. I think it's the radeon module not loading by default when it should. When booting with nomodeset enabled I do get to a login terminal, and running 'modprobe -v radeon modeset=1' gets things up and going again. radeondrmfb could be loading by default instead of radeon.

radeondrmfb is the fb (=framebuffer) *implementation* of radeon's DRM (=direct rendering manager/kernel mode setting) driver/kernel module 'radeon'. 'radeonfb' is an ancient separate framebuffer driver/kernel module for radeon cards. We never built or shipped it, let alone enabled it by default. This should explain things a bit. So my conclusion would be that perhaps the radeon driver is not yet completely initialized when the desktop is starting.

Created attachment 805046 [details]
mkinitrd output
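
The blacklist check described above can be scripted. This is only a minimal sketch, assuming blacklist entries use the usual `blacklist <module>` line format in `*.conf` files; the function name `is_blacklisted` is hypothetical, and the directory is passed as an argument so the check can also be run against a copy of the configuration.

```shell
#!/bin/sh
# is_blacklisted MODULE DIR
# Exit status 0 if any *.conf file in DIR contains a line "blacklist MODULE".
# The module name must match exactly, so "radeon" does not match "radeonfb".
is_blacklisted() {
    module=$1
    dir=$2
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"/*.conf
}
```

For example, `is_blacklisted radeonfb /etc/modprobe.d && echo "radeonfb is blacklisted"` (the directory here is an assumption; use whichever one holds 50-blacklist.conf on the system).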
Did you install the kernel-firmware package? The error message from dracut indicates missing firmware files, and that can be the cause of the graphics problem at boot.

Kernel Firmware is installed, version 20190502-1.1 according to YaST. Below is a snippet of some of the files under the Dependencies tab in YaST Software Management.

firmware(r8a779x_usb3_v1.dlmem) firmware(r8a779x_usb3_v2.dlmem) firmware(r8a779x_usb3_v3.dlmem) firmware(radeon/ARUBA_me.bin) firmware(radeon/ARUBA_pfp.bin) firmware(radeon/ARUBA_rlc.bin) firmware(radeon/BARTS_mc.bin) firmware(radeon/BARTS_me.bin) firmware(radeon/BARTS_pfp.bin) firmware(radeon/BARTS_smc.bin) firmware(radeon/BONAIRE_ce.bin)

My card is Radeon HD 8470D ARUBA

And do you see the files /lib/firmware/xxx on your system for the files that are reported by dracut? For example, is /lib/firmware/radeon/R520_cp.bin present on your system? If not, check whether you really have the kernel-firmware package installed.

This is the listing of everything in /lib/firmware/radeon/

ARUBA_me.bin ARUBA_pfp.bin ARUBA_rlc.bin banks_k_2_smc.bin BARTS_mc.bin BARTS_me.bin BARTS_pfp.bin BARTS_smc.bin bonaire_ce.bin BONAIRE_ce.bin bonaire_k_smc.bin BONAIRE_mc2.bin bonaire_mc.bin BONAIRE_mc.bin bonaire_me.bin BONAIRE_me.bin bonaire_mec.bin BONAIRE_mec.bin bonaire_pfp.bin BONAIRE_pfp.bin bonaire_rlc.bin BONAIRE_rlc.bin bonaire_sdma1.bin bonaire_sdma.bin BONAIRE_sdma.bin bonaire_smc.bin BONAIRE_smc.bin bonaire_uvd.bin BONAIRE_uvd.bin bonaire_vce.bin BONAIRE_vce.bin BTC_rlc.bin CAICOS_mc.bin CAICOS_me.bin CAICOS_pfp.bin CAICOS_smc.bin CAYMAN_mc.bin CAYMAN_me.bin CAYMAN_pfp.bin CAYMAN_rlc.bin CAYMAN_smc.bin CEDAR_me.bin CEDAR_pfp.bin CEDAR_rlc.bin CEDAR_smc.bin CYPRESS_me.bin CYPRESS_pfp.bin CYPRESS_rlc.bin CYPRESS_smc.bin CYPRESS_uvd.bin hainan_ce.bin HAINAN_ce.bin hainan_k_smc.bin HAINAN_mc2.bin hainan_mc.bin HAINAN_mc.bin hainan_me.bin HAINAN_me.bin hainan_pfp.bin HAINAN_pfp.bin hainan_rlc.bin HAINAN_rlc.bin hainan_smc.bin HAINAN_smc.bin hawaii_ce.bin
HAWAII_ce.bin hawaii_k_smc.bin HAWAII_mc2.bin hawaii_mc.bin HAWAII_mc.bin hawaii_me.bin HAWAII_me.bin hawaii_mec.bin HAWAII_mec.bin hawaii_pfp.bin HAWAII_pfp.bin hawaii_rlc.bin HAWAII_rlc.bin hawaii_sdma1.bin hawaii_sdma.bin HAWAII_sdma.bin hawaii_smc.bin HAWAII_smc.bin hawaii_uvd.bin hawaii_vce.bin JUNIPER_me.bin JUNIPER_pfp.bin JUNIPER_rlc.bin JUNIPER_smc.bin kabini_ce.bin KABINI_ce.bin kabini_me.bin KABINI_me.bin kabini_mec.bin KABINI_mec.bin kabini_pfp.bin KABINI_pfp.bin kabini_rlc.bin KABINI_rlc.bin kabini_sdma1.bin kabini_sdma.bin KABINI_sdma.bin kabini_uvd.bin kabini_vce.bin kaveri_ce.bin KAVERI_ce.bin kaveri_me.bin KAVERI_me.bin kaveri_mec2.bin kaveri_mec.bin KAVERI_mec.bin kaveri_pfp.bin KAVERI_pfp.bin kaveri_rlc.bin KAVERI_rlc.bin kaveri_sdma1.bin kaveri_sdma.bin KAVERI_sdma.bin kaveri_uvd.bin kaveri_vce.bin mullins_ce.bin MULLINS_ce.bin mullins_me.bin MULLINS_me.bin mullins_mec.bin MULLINS_mec.bin mullins_pfp.bin MULLINS_pfp.bin mullins_rlc.bin MULLINS_rlc.bin mullins_sdma1.bin mullins_sdma.bin MULLINS_sdma.bin mullins_uvd.bin mullins_vce.bin oland_ce.bin OLAND_ce.bin oland_k_smc.bin OLAND_mc2.bin oland_mc.bin OLAND_mc.bin oland_me.bin OLAND_me.bin oland_pfp.bin OLAND_pfp.bin oland_rlc.bin OLAND_rlc.bin oland_smc.bin OLAND_smc.bin PALM_me.bin PALM_pfp.bin pitcairn_ce.bin PITCAIRN_ce.bin pitcairn_k_smc.bin PITCAIRN_mc2.bin pitcairn_mc.bin PITCAIRN_mc.bin pitcairn_me.bin PITCAIRN_me.bin pitcairn_pfp.bin PITCAIRN_pfp.bin pitcairn_rlc.bin PITCAIRN_rlc.bin pitcairn_smc.bin PITCAIRN_smc.bin R100_cp.bin R200_cp.bin R300_cp.bin R420_cp.bin R520_cp.bin R600_me.bin R600_pfp.bin R600_rlc.bin R600_uvd.bin R700_rlc.bin REDWOOD_me.bin REDWOOD_pfp.bin REDWOOD_rlc.bin REDWOOD_smc.bin RS600_cp.bin RS690_cp.bin RS780_me.bin RS780_pfp.bin RS780_uvd.bin RV610_me.bin RV610_pfp.bin RV620_me.bin RV620_pfp.bin RV630_me.bin RV630_pfp.bin RV635_me.bin RV635_pfp.bin RV670_me.bin RV670_pfp.bin RV710_me.bin RV710_pfp.bin RV710_smc.bin RV710_uvd.bin RV730_me.bin RV730_pfp.bin 
RV730_smc.bin RV740_smc.bin RV770_me.bin RV770_pfp.bin RV770_smc.bin RV770_uvd.bin si58_mc.bin SUMO2_me.bin SUMO2_pfp.bin SUMO_me.bin SUMO_pfp.bin SUMO_rlc.bin SUMO_uvd.bin tahiti_ce.bin TAHITI_ce.bin tahiti_k_smc.bin TAHITI_mc2.bin tahiti_mc.bin TAHITI_mc.bin tahiti_me.bin TAHITI_me.bin tahiti_pfp.bin TAHITI_pfp.bin tahiti_rlc.bin TAHITI_rlc.bin tahiti_smc.bin TAHITI_smc.bin TAHITI_uvd.bin TAHITI_vce.bin TURKS_mc.bin TURKS_me.bin TURKS_pfp.bin TURKS_smc.bin verde_ce.bin VERDE_ce.bin verde_k_smc.bin VERDE_mc2.bin verde_mc.bin VERDE_mc.bin verde_me.bin VERDE_me.bin verde_pfp.bin VERDE_pfp.bin verde_rlc.bin VERDE_rlc.bin verde_smc.bin VERDE_smc.bin

So, the file is present but still dracut fails to install it? Wait... Have you ever installed the amdgpu package? It's known to break dracut because of its buggy firmware path setup. If you have it, uninstall it, make sure that everything got cleaned up without stale files left from the package, and try to recreate the initrd.

I tried to install AMDGPU-Pro a while back but it didn't work, so I got rid of it and its repos, so maybe it broke something. According to YaST amdgpu is installed, so I'll delete those packages and see what happens. If it's still broken I can do a fresh reinstall to see if things work.

IIRC the amdgpu proprietary driver installed a file below /etc/dracut.conf.d in order to change the firmware file path (the driver comes with its own firmware files). The big issue was that it didn't remove this config file during uninstall. So please make sure there is no such file left afterwards. Otherwise there will be no firmware files in the initrd after running mkinitrd. BTW, there is a command to list initrd content: sudo lsinitrd /boot/initrd.. so you can check whether the firmware files are included in the initrd.

I just reinstalled and everything seems to be working now. The amdgpu driver probably broke things.
The result of /etc/dracut.conf.d is:

-rw-r--r-- 1 root root 22 May 3 21:30 02-early-microcode.conf
-rw-r--r-- 1 root root 487 May 3 21:30 99-debug.conf
-rw-r--r-- 1 root root 821 May 4 02:27 ostree.conf

Thanks for your help and time, and if it happens again I know what to look out for.

Ok. Let's assume the culprit was AMD's amdgpu proprietary driver packages. According to AMD the issue has been fixed in a later release of the amdgpu driver packages though.

I reinstalled the AMDGPU-Pro driver again to see if I could recreate the issue, and I could. I got a blank screen on boot, so I booted with nomodeset, ran yast via terminal, manually removed the AMDGPU-Pro packages and ran mkinitrd. I got the same firmware missing messages and the system wouldn't boot into X afterwards, like before. So I checked /etc/dracut.conf.d and there was a file called amdgpu-5.0.13-1-default.conf. I removed it and ran mkinitrd, and received no missing firmware messages. I then rebooted normally and everything is running fine again. So the AMD driver was the culprit here. Thank you all for your time and help. Regards, James

Hell, AMD claimed they would have fixed the issue with the latest available drivers and this is months ago ... Which driver version are you using? I believe the latest version is 18.50 https://www.amd.com/en/support/kb/release-notes/rn-rad-lin-18-50-unified

The amdgpu-pro version is 19.10, which I downloaded from the same link you provided. It's not compatible with my card so I'm back using radeon, which works well.

(In reply to James Roulston from comment #17)
> The amdgpu-pro version is 19.10. which I downloaded from the same link you
> provided.
> It's not compatible with my card so I'm back using radeon which works well.

I have downloaded the SLED/SLES 15 RPM packages from https://www.amd.com/en/support/kb/release-notes/rn-rad-lin-19-10-unified In none of the packages could I find a dracut file. It's also not created by a %pre/%post RPM script.
I have no idea how this file /etc/dracut.conf.d/amdgpu-5.0.13-1-default.conf is created.

I reinstalled amdgpu-pro again to check the YaST software manager for files that are provided by the driver, but I did not see anything. This also meant it broke my machine again. After removing the amdgpu file from dracut.conf.d and restarting, it still wouldn't start X, and when I checked the dracut.conf.d directory again I got the following output:

total 16
-rw-r--r-- 1 root root 22 May 3 21:30 02-early-microcode.conf
-rw-r--r-- 1 root root 487 May 3 21:30 99-debug.conf
-rw-r--r-- 1 root root 87 May 15 18:20 amdgpu-5.0.13-1-default.conf
-rw-r--r-- 1 root root 821 May 4 02:27 ostree.conf

I deleted it and tried loading the amdgpu driver via modprobe. It didn't start, but the amdgpu-5.0.13-1-default.conf file was back again. I deleted the file again and rebooted, and it appeared again. Maybe the driver is creating the file itself when it tries to load. I have deleted all amdgpu-pro related drivers and files from the AMD repo again. I checked the list of compatible cards for the amdgpu driver and mine is not supported, which is why it never worked. It was the amdgpu-5.0.13-1-default.conf that was causing the issue.

/usr/src/amdgpu-19.10-785424/pre-build.sh
[...]
FW_DIR="/lib/firmware/$KERNELVER"
mkdir -p $FW_DIR
cp -ar /usr/src/amdgpu-19.10-785424/firmware/amdgpu $FW_DIR
echo "add_drivers+=\" amdgpu\"" >/etc/dracut.conf.d/amdgpu-$KERNELVER.conf
echo "add_drivers+=\" amdkfd\"" >>/etc/dracut.conf.d/amdgpu-$KERNELVER.conf
echo "fw_dir+=\"$FW_DIR\"" >>/etc/dracut.conf.d/amdgpu-$KERNELVER.conf

/usr/src/amdgpu-19.10-785424/post-remove.sh
#!/bin/bash
FW_DIR="/lib/firmware"
rm -rf $FW_DIR/*/amdgpu
[[ ! $(ls -A $FW_DIR) ]] && rm -rf $FW_DIR
rm -f /etc/dracut.conf.d/amdgpu-*.conf

These files belong to package amdgpu-dkms.

rpm --scripts -qp amdgpu-dkms-19.10-785424.noarch.rpm
[...]
preuninstall scriptlet (using /bin/sh):
dkms remove -m amdgpu -v 19.10-785424 --all --rpm_safe_upgrade
exit $?

I guess that this dkms should have removed the dracut file, which for some reason failed.

s/dkms/dkms call/

Fabian, could you try to reproduce the issue by installing the SLE15 packages from https://www.amd.com/en/support/kb/release-notes/rn-rad-lin-19-10-unified Then, if reproducible, try to figure out why this happens. One explanation would be that dkms is no longer available when amdgpu-dkms is being uninstalled.

I adjusted the pre-uninstall script from the amdgpu-dkms package. Now, after the 'dkms remove' call, the amdgpu-*.conf file is removed independently of the dkms call. See the attached patch.

Created attachment 805594 [details]
Patch for the amdgpu-dkms specfile.
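
The idea described for the patch, removing the dracut snippet whether or not the `dkms remove` call succeeds, could look roughly like the following. This is not the attached patch itself, only a sketch of the same approach; the helper name `cleanup_amdgpu_dracut` is hypothetical, and the configuration directory is passed as an argument (it would be /etc/dracut.conf.d on the affected system).

```shell
#!/bin/sh
# cleanup_amdgpu_dracut CONF_DIR
# Removes every amdgpu-*.conf snippet from CONF_DIR and prints one
# "removed <file>" line per deleted file. Other snippets are left alone.
cleanup_amdgpu_dracut() {
    conf_dir=$1
    for f in "$conf_dir"/amdgpu-*.conf; do
        [ -e "$f" ] || continue    # the glob matched nothing
        rm -f -- "$f" && printf 'removed %s\n' "$f"
    done
    return 0
}
```

As noted earlier in the thread, mkinitrd has to be run again after the file is gone so that the initrd is regenerated with the regular firmware path.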
Strictly speaking, the postun scriptlet should check the argument $1 to see if it's an update or the actual uninstall.

(In reply to Takashi Iwai from comment #25)
> Strictly speaking, the postun scriptlet should check the argument $1 to see
> if it's an update or the actual uninstall.

Right, we would need something like this:

%preun -p /bin/sh
# not on update!
if [ "$1" -eq 0 ]; then
    # dkms call may fail, so the script, which removes the dracut file, will not be executed
    # so make sure that it gets removed in any case
    rm -f /etc/dracut.conf.d/amdgpu-*.conf
    dkms remove -m amdgpu -v 19.10-785424 --all --rpm_safe_upgrade
fi

*** Bug 1147646 has been marked as a duplicate of this bug. ***

The postun scriptlet has been updated in the spec. It should be available in the release 19.40. Thanks.

Seems 19.40 is not available yet. At least I couldn't find it ... |
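
Until a fixed package is available, a leftover snippet of the kind written by the pre-build.sh quoted earlier can be spotted by looking for files that assign to `fw_dir`. A small sketch under that assumption; `list_fw_dir_overrides` is a hypothetical helper name, and the directory is an argument so the function can be tried on a copy first.

```shell
#!/bin/sh
# list_fw_dir_overrides CONF_DIR
# Prints the name of every *.conf file in CONF_DIR that contains a line
# assigning to fw_dir (either "fw_dir=" or "fw_dir+=").
list_fw_dir_overrides() {
    conf_dir=$1
    grep -lEs '^[[:space:]]*fw_dir(\+)?=' "$conf_dir"/*.conf || true
}
```

Running `list_fw_dir_overrides /etc/dracut.conf.d` on an affected system would be expected to name the amdgpu-*.conf file. An empty result only means that no snippet in that directory sets `fw_dir`.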
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Build Identifier:

After updating Tumbleweed to the latest kernel 5.0.9 (5.0.11 current) the system would freeze during boot, although there was still hard drive activity. I could successfully boot using 'nomodeset' in the kernel parameters. I could run startx, but in some fallback mode, so it was quite slow. I could load the correct driver by doing 'modprobe -v radeon modeset=1'. This started X correctly. It seems to be loading 'radeonfb' instead of 'radeon'. A fix for this is in the additional info area. Video details below.

VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Richland [Radeon HD 8470D] [1002:9996]
Subsystem: Hewlett-Packard Company Device [103c:2b60]
Kernel driver in use: radeon
Kernel modules: radeon

Reproducible: Always

Steps to Reproduce:
1. Update system using zypper dup with new kernel
2. reboot
3.

Actual Results: The screen hangs with no X display. The keyboard shortcuts ctrl+alt+Fn keys don't work, but ctrl+alt+del reboots.

Expected Results: X window manager should start.

I fixed the issue by doing the following. On the terminal I went into /etc/modprobe.d and created a file called 'blacklist.radeon.conf'. In that file I added the line 'blacklist radeon' and then ran 'mkinitrd'. When I rebooted, the issue was still the same, so I changed the line to 'blacklist radeonfb' but did not run mkinitrd. I rebooted and everything works again. I have to re-edit the 'blacklist.radeon.conf' file with 'blacklist radeon', run mkinitrd, and then re-edit that file to 'blacklist radeonfb'. I have to repeat these steps on every kernel update since 5.0.9.
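
Since the comments trace this hang to radeon firmware missing from the initrd, one way to confirm it on an affected system is to count the firmware files for the card family in the lsinitrd listing. A rough sketch, assuming the listing contains one path per line with `firmware/radeon/` in it; `count_radeon_firmware` is a hypothetical helper that reads the listing from standard input.

```shell
#!/bin/sh
# count_radeon_firmware CHIP
# Reads an initrd file listing on stdin and prints how many lines name a
# radeon firmware file for CHIP (for example ARUBA_me.bin, ARUBA_pfp.bin).
count_radeon_firmware() {
    chip=$1
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status at 0 in that case.
    grep -c "firmware/radeon/${chip}_[A-Za-z0-9]*\.bin" || true
}
```

Usage would be along the lines of `sudo lsinitrd "$INITRD" | count_radeon_firmware ARUBA`, where `INITRD` is set by the reader to the initrd image in use (the thread does not give the full path). A count of 0 would match the missing-firmware messages from mkinitrd.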