Bug 769209

Summary: plymouthd: intel_bufmgr_gem.c:2783: drm_intel_bufmgr_gem_init: Assertion `0' failed.
Product: [openSUSE] openSUSE 12.2 Reporter: Thomas Renninger <trenn>
Component: X.Org Assignee: Forgotten User sM9JzehKpy <forgotten_sM9JzehKpy>
Status: RESOLVED FIXED QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Critical    
Priority: P5 - None CC: coolo, fcrozat, forgotten_4P83LPe9jj, mmarek, msrb, sbrabec, sndirsch, tiwai
Version: Beta 2   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: /var/log/Xorg.0.log output
plymouth:debug log in /var/log/plymouth-debug.log
Xorg.0.log with IVY-M GT2
Patch to quit plymouth properly before starting DM

Description Thomas Renninger 2012-06-28 10:59:56 UTC
After installation (2nd installation stage) the system hangs hard while configuring HW.
All I see is a black screen with 2 lines (2 times the message stated in the title of this bug) in the top area.

NUM/CAPS lock of the keyboard do not respond anymore.

I found a possibly related bug here:
http://code.google.com/p/chromium-os/issues/detail?id=29499
Comment 1 Egbert Eich 2012-06-28 12:07:11 UTC
I have nothing to do with plymouthd. Please find the maintainer and complain loudly to him.
Comment 2 Stefan Dirsch 2012-06-28 12:32:14 UTC
Looking at the library deps of plymouth it seems to talk to libdrm directly. Reassigning to maintainer.
Comment 3 Stefan Dirsch 2012-06-28 12:43:16 UTC
I guess this needs to be debugged in libdrm ..

intel/intel_bufmgr_gem.c:

drm_intel_bufmgr_gem_init(...)
{
        [...]
        bufmgr_gem->pci_device = get_pci_device_id(bufmgr_gem);

        if (IS_GEN2(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 2;
        else if (IS_GEN3(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 3;
        else if (IS_GEN4(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 4;
        else if (IS_GEN5(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 5;
        else if (IS_GEN6(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 6;
        else if (IS_GEN7(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 7;
        else
                assert(0);
        [...]

Unfortunately installing a beta2 is currently already something between a nightmare and not possible at all. :-(
Comment 4 Thomas Renninger 2012-06-28 13:09:10 UTC
I can still log into the machine via ssh.
I tried i915_modeset=0, removed splash=silent boot options, without luck.
As soon as graphics initialization is attempted, the screen stays black.

Please ping me and I can provide ssh login for someone to debug this (company internal only of course).
Comment 5 Thomas Renninger 2012-06-28 13:09:53 UTC
Created attachment 496758 [details]
/var/log/Xorg.0.log output
Comment 6 Stefan Dirsch 2012-06-28 13:11:53 UTC
Yes, please give me ssh access. Thanks.
Comment 7 Thomas Renninger 2012-06-28 13:15:03 UTC
Created attachment 496759 [details]
plymouth:debug log in /var/log/plymouth-debug.log
Comment 9 Stefan Dirsch 2012-06-28 15:53:36 UTC
I don't see such an assertion when running plymouthd on this machine with this kernel running. I didn't try to boot the original 12.2 kernel, since I'm not familiar with configuring grub2. :-(

There seem to be different renderers available for plymouth: drm, fb and text. Maybe we should use fb also for drm drivers since they provide also a generic fb interface. Just an idea, if we can't really debug that issue and it's easily and reliably fixed that way.
Comment 10 Michal Srb 2012-06-28 16:30:52 UTC
According to the Xorg.0.log, it is "Ivybridge Server (GT2)".

If I am looking correctly, this one is missing in IS_GEN* definitions in intel_chipset.h in our libdrm.

It was added recently in upstream, commit:
http://cgit.freedesktop.org/mesa/drm/commit/?id=e057a56448e2e785f74bc13dbd6ead8572ebed91

Could this be the cause?
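
To illustrate the suspected failure mode, here is a minimal sketch, not the actual libdrm source. The macro names only imitate the style of intel_chipset.h, the ID list is abbreviated, and the numeric PCI IDs are assumptions that should be checked against the upstream header. The helpers gen_old() and gen_new() are hypothetical and return 0 where the real code would reach assert(0).

```c
/* Illustrative PCI device IDs. The values are assumptions; verify them
 * against intel_chipset.h before relying on them. */
#define PCI_CHIP_IVYBRIDGE_GT2    0x0162
#define PCI_CHIP_IVYBRIDGE_M_GT2  0x0166
#define PCI_CHIP_IVYBRIDGE_S_GT2  0x016a

/* Unpatched check: the server GT2 ID is not listed. */
#define IS_GEN7_OLD(devid) ((devid) == PCI_CHIP_IVYBRIDGE_GT2 || \
                            (devid) == PCI_CHIP_IVYBRIDGE_M_GT2)

/* Patched check: the missing ID is added to the list. */
#define IS_GEN7_NEW(devid) (IS_GEN7_OLD(devid) || \
                            (devid) == PCI_CHIP_IVYBRIDGE_S_GT2)

/* Hypothetical helpers: return the generation, or 0 in the case where
 * the if/else chain from comment 3 would fall through to assert(0). */
int gen_old(int devid) { return IS_GEN7_OLD(devid) ? 7 : 0; }
int gen_new(int devid) { return IS_GEN7_NEW(devid) ? 7 : 0; }
```

With the unpatched list, the server GT2 ID matches none of the generation checks, which is consistent with the assertion seen in the title.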
Comment 11 Stefan Dirsch 2012-06-28 16:39:31 UTC
Indeed. That would perfectly match!
Comment 12 Takashi Iwai 2012-06-28 18:03:46 UTC
This reminds me of SP2.  We need to patch libdrm there, too... (bnc#759971)

BTW, plymouth is seriously broken with Intel hardware, not only with IVY-S GT2, as of beta2.
I installed on an IronLake laptop, a SandyBridge laptop, a 965GM desktop and IvyBridge laptops.  All of them hung.  After uninstalling plymouth they start working better.
Comment 13 Stefan Dirsch 2012-06-28 19:06:34 UTC
I can confirm what Takashi has written on every Ivybridge laptop I've tried so far. I just hadn't found out that Plymouth is the culprit here. So, besides adding the libdrm patch, back to my proposal in comment #9?
Comment 14 Forgotten User sM9JzehKpy 2012-06-29 05:52:10 UTC
@Takashi:  Did all of those laptops end up with this assertion error? Plymouth itself is just using libdrm and the available KMS. Of course, if something is wrong in libdrm, then you have the situation where this comes up early in the boot process.

I am running a laptop with an i915 chipset, and there things work as they should.

In the meantime we have version 0.8.5.1 available, but I am not sure if Coolo would allow any version update for 12.2. This new version should support all types of new cards through a generic renderer. I will package this version and submit it to Factory with a reference to this bug, but as said, I had no problems whatsoever installing Beta 2 on my laptop (Lenovo T410) with an Intel chipset.
Comment 15 Takashi Iwai 2012-06-29 10:04:13 UTC
OK, below is xorg.log.  The highlight is:

[   313.536] drmOpenDevice: node name is /dev/dri/card0
[   313.536] drmOpenDevice: open result is 9, (OK)
[   313.536] drmOpenByBusid: Searching for BusID pci:0000:00:02.0
[   313.536] drmOpenDevice: node name is /dev/dri/card0
[   313.536] drmOpenDevice: open result is 9, (OK)
[   313.536] drmOpenByBusid: drmOpenMinor returns 9
[   313.536] drmOpenByBusid: Interface 1.4 failed, trying 1.1
[   313.536] drmOpenByBusid: drmGetBusid reports 
[   313.536] drmOpenDevice: node name is /dev/dri/card1
[   313.540] drmOpenByBusid: drmOpenMinor returns -1
[   313.540] drmOpenDevice: node name is /dev/dri/card2
[   313.544] drmOpenByBusid: drmOpenMinor returns -1
[   313.544] drmOpenDevice: node name is /dev/dri/card3
[   313.548] drmOpenByBusid: drmOpenMinor returns -1
[   313.548] drmOpenDevice: node name is /dev/dri/card4
[   313.552] drmOpenByBusid: drmOpenMinor returns -1
[   313.552] drmOpenDevice: node name is /dev/dri/card5
[   313.556] drmOpenByBusid: drmOpenMinor returns -1
[   313.556] drmOpenDevice: node name is /dev/dri/card6
[   313.560] drmOpenByBusid: drmOpenMinor returns -1
[   313.560] drmOpenDevice: node name is /dev/dri/card7
[   313.564] drmOpenByBusid: drmOpenMinor returns -1
[   313.564] drmOpenDevice: node name is /dev/dri/card8
[   313.568] drmOpenByBusid: drmOpenMinor returns -1
[   313.568] drmOpenDevice: node name is /dev/dri/card9
[   313.572] drmOpenByBusid: drmOpenMinor returns -1
[   313.572] drmOpenDevice: node name is /dev/dri/card10
[   313.576] drmOpenByBusid: drmOpenMinor returns -1
[   313.576] drmOpenDevice: node name is /dev/dri/card11
[   313.580] drmOpenByBusid: drmOpenMinor returns -1
[   313.580] drmOpenDevice: node name is /dev/dri/card12
[   313.584] drmOpenByBusid: drmOpenMinor returns -1
[   313.584] drmOpenDevice: node name is /dev/dri/card13
[   313.587] drmOpenByBusid: drmOpenMinor returns -1
[   313.587] drmOpenDevice: node name is /dev/dri/card14
[   313.591] drmOpenByBusid: drmOpenMinor returns -1
[   313.591] drmOpenDevice: node name is /dev/dri/card15
[   313.594] drmOpenByBusid: drmOpenMinor returns -1
[   313.594] drmOpenDevice: node name is /dev/dri/card0
[   313.594] drmOpenDevice: open result is 9, (OK)
[   313.594] drmOpenDevice: node name is /dev/dri/card0
[   313.594] drmOpenDevice: open result is 9, (OK)
[   313.594] drmGetBusid returned ''
[   313.594] (EE) intel(0): [drm] failed to set drm interface version.
[   313.594] (EE) intel(0): Failed to become DRM master.
[   313.594] (II) intel(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
[   313.594] (==) intel(0): Depth 24, (--) framebuffer bpp 32
[   313.594] (==) intel(0): RGB weight 888
[   313.594] (==) intel(0): Default visual is TrueColor
[   313.594] (II) intel(0): Integrated Graphics Chipset: Intel(R) Ivybridge Mobile (GT2)
[   313.594] (--) intel(0): Chipset: "Ivybridge Mobile (GT2)"
Comment 16 Takashi Iwai 2012-06-29 10:05:50 UTC
Created attachment 496902 [details]
Xorg.0.log with IVY-M GT2

At the point of 313.594, X hangs up.
Comment 17 Takashi Iwai 2012-06-29 10:54:26 UTC
It seems that the hang on my test machine isn't directly related to drm (although the error messages above look scary).

When you start X manually while plymouth is showing a splash, it could be started well.  However, if you do "chvt 1; chvt 7", then it returns back to plymouth splash screen instead of X.
Then, even after you hide the plymouth screen or kill plymouthd, X never gets back to the normal state.  (Following "kill -HUP X" restores the visual but the inputs are gone.)

So, there could be an issue with VT handling in addition.

I guess we should test the newer version of plymouth at this moment instead of wasting too much effort debugging the old version.
Comment 18 Forgotten User sM9JzehKpy 2012-06-29 10:58:52 UTC
The latest version of Plymouth (0.8.5.1) is available in Base:Systems for openSUSE:Factory.
Comment 19 Thomas Renninger 2012-06-29 11:07:34 UTC
> In the meantime we have version 0.8.5.1 available, but I am not sure if Coolo
> would allow any version update for 12.2.
Adding him.
Most interesting comments: #12, #13.
Eh, this is the plymouth version, but libdrm is broken as well?

The fact that HW has to be whitelisted or otherwise the installation will hang sounds like a serious design issue.
Is there any workaround to get around the HW dependencies?
Can I get things working with some kind of userspace program or kernel parameter or whatsoever quick fix (without recompiling/packaging)?
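
As an illustration of the design concern above, here is a minimal sketch of detection that reports unknown hardware instead of aborting, so that a caller can fall back to a generic renderer. This is not how libdrm or plymouth is implemented. The names detect_gen() and choose_renderer() are hypothetical, and the renderer names are only borrowed from the drm/fb/text list mentioned in comment 9.

```c
#include <stddef.h>

/* Hypothetical renderer choices, named after the list in comment 9. */
enum renderer { RENDERER_DRM, RENDERER_FB, RENDERER_TEXT };

/* Hypothetical stand-in for generation detection: returns `gen` when
 * the device ID is in the known list and 0 otherwise, instead of
 * calling assert(0). */
int detect_gen(int devid, const int *known_ids, size_t n_known, int gen)
{
    size_t i;

    for (i = 0; i < n_known; i++)
        if (known_ids[i] == devid)
            return gen;
    return 0; /* unknown hardware: report failure, do not abort */
}

/* Use the drm renderer only for recognized devices; otherwise fall
 * back to the generic framebuffer renderer. */
enum renderer choose_renderer(int devid, const int *known_ids, size_t n_known)
{
    return detect_gen(devid, known_ids, n_known, 7) ? RENDERER_DRM
                                                    : RENDERER_FB;
}
```

The point of the sketch is only that an unrecognized ID becomes a recoverable condition for the caller rather than a crash of the whole process.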

I tried plymouth-0.8.5.1, and thanks to Stefan I could give a patched libdrm version a try:

plymouth 0.8.5.1 fixes: the splash screen now works early, with and without
         i915_modeset=0 (it didn't work at all before).

fixed libdrm version from Stefan: X comes up. It does not with either the old
         or the new plymouth version alone; X also throws the assert failure
         message shown in the title:
X: intel_bufmgr_gem.c:2783: drm_intel_bufmgr_gem_init: Assertion `0' failed.


Summary: At least two bugs in different packages.
Comment 20 Thomas Renninger 2012-06-29 11:15:52 UTC
Raymond:
Have the plymouth systemd files been added to plymouth?
Adding a Requires: systemd-plymouth to plymouth and Provides: systemd-plymouth to systemd (and plymouth systemd files in systemd package) looks wrong?

If this is still needed, why?
Fixing this provide/require mess would be nice before the release is out and it has to be checked for versions forever...
Comment 21 Forgotten User sM9JzehKpy 2012-06-29 11:41:16 UTC
Thomas, 

In Base:System we have now two packages that provide the same files. Initially the plymouth integration came from systemd and was provided with systemd-plymouth. With the latest version of plymouth, the integration with systemd has been moved to the plymouth package itself. Therefore I packaged this temporarily in plymouth-systemd. This package has some provides/obsoletes on the systemd-plymouth package. 

I guess for the moment, the best way would be to delete the systemd integration inside plymouth and to keep it where it is now. Unfortunately fcrozat is not online in IRC, but this way we would prevent changes to systemd as well.
Comment 22 Bernhard Wiedemann 2012-06-29 12:00:07 UTC
This is an autogenerated message for OBS integration:
This bug (769209) was mentioned in
https://build.opensuse.org/request/show/126601 Factory / libdrm
https://build.opensuse.org/request/show/126602 Factory / libdrm
Comment 23 Thomas Renninger 2012-06-29 12:16:37 UTC
> I guess for the moment, the best way would be to delete the systemd integration
> inside plymouth and to keep it where it is now.
If it is still at all possible, I would add plymouth's systemd files to plymouth and remove them from systemd. This is how things should look later, right?

Now you can still do that. Once 12.2 is released, one has to take a lot of extra care to check the update case 12.2 -> 12.X if the systemd plymouth files get moved to where (I expect) they belong (the plymouth package).
Comment 24 Thomas Renninger 2012-06-29 12:24:51 UTC
I split out the "splash not working" issue: bug #769397
This bug is not about plymouth, but about a libdrm issue.
Stefan has at least a workaround.
Can the patch be posted, and will it get submitted? Or is there a newer upstream version that already has this fixed?
Tell me if I shall test something else or if someone still wants to access the machine.
Comment 25 Takashi Iwai 2012-06-29 12:28:17 UTC
I tried plymouth-0.8.5 from Base:System, but the hang still occurs at the same place.  systemd-plymouth isn't updated.

Before update:
# rpm -qa | grep plymouth
plymouth-plugin-label-0.8.4-11.1.x86_64
plymouth-scripts-0.8.4-11.1.noarch
plymouth-plugin-script-0.8.4-11.1.x86_64
plymouth-0.8.4-11.1.x86_64
systemd-plymouth-44-8.1.x86_64
plymouth-branding-openSUSE-0.8.4-11.1.noarch

After update:
# rpm -qa | grep plymouth
plymouth-0.8.5.1-75.1.x86_64
plymouth-plugin-label-0.8.5.1-75.1.x86_64
plymouth-scripts-0.8.5.1-75.1.noarch
systemd-plymouth-44-8.1.x86_64
plymouth-branding-openSUSE-0.8.5.1-75.1.noarch
plymouth-plugin-script-0.8.5.1-75.1.x86_64
Comment 26 Forgotten User sM9JzehKpy 2012-06-29 12:31:15 UTC
Takashi,

Updating plymouth alone does not resolve the issue described here. The systemd-plymouth package doesn't need to be updated, as it has nothing to do with starting plymouth (this is done within the initrd).

I assume that you are missing the patches for libdrm.
Comment 27 Takashi Iwai 2012-06-29 12:39:36 UTC
(In reply to comment #26)
> Takashi,
> 
> Updating plymouth alone does not resolve the issue described here. The
> systemd-plymouth package doesn't require to be updated as that this has nothing
> to do with starting plymouth (as this is done within the initrd). 
> 
> I assume that you are missing the patches for libdrm.

No, you miss the point that plymouth itself is always working (showing the splash) on my machines.  They are IvyBridge too, but a different chip variant.
The problem on all of my test installations is that X hangs up when plymouth is running.  It's a different problem from libdrm.  Possibly some VT-handling issue.

The problem happens even if you remove drm rendering plugin from plymouth.
Actually I also updated libdrm now, but the problem still remains.

So, there are basically three things we are facing:
1. IVY-S GT2 is missing in libdrm
   Already fixed in OBS X11:XOrg, and submitted via SRID 126602. 
   (This is the answer to comment 24)

2. plymouth splash doesn't appear on IVY (bnc#769397)
   Updating plymouth to 0.8.5 fixed on Thomas' machine

3. X hangs up due to plymouth
   Updating plymouth to 0.8.5 doesn't help.

Maybe it's better to open another bug to track issue 3?
Comment 28 Bernhard Wiedemann 2012-06-29 13:00:21 UTC
This is an autogenerated message for OBS integration:
This bug (769209) was mentioned in
https://build.opensuse.org/request/show/126609 Factory / plymouth
https://build.opensuse.org/request/show/126610 Factory / plymouth
https://build.opensuse.org/request/show/126617 Factory / libdrm
Comment 29 Takashi Iwai 2012-06-29 13:31:53 UTC
Looks like Ubuntu has the same issue about X and plymouth race (issue 3 in comment 27):
  https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/982889
Comment 30 Takashi Iwai 2012-06-29 13:46:57 UTC
I correct my previous comment.  It has something to do with drm.

I removed /usr/lib64/plymouth/render/drm.so, ran mkinitrd, and rebooted.
Now X starts correctly.  Maybe last time I forgot mkinitrd or re-installed plymouth again.

However, there is still something odd.  plymouthd keeps running and hogs 7% CPU to draw something.
And after chvt 1, some garbage is shown in the middle of the screen, likely the splash animation effect.
Comment 31 Bernhard Wiedemann 2012-06-29 14:00:15 UTC
This is an autogenerated message for OBS integration:
This bug (769209) was mentioned in
https://build.opensuse.org/request/show/126628 Factory / libdrm
Comment 32 Takashi Iwai 2012-06-29 14:06:54 UTC
OK, we found a workaround.  The whole problem is that plymouth is running while X starts.  It should have been terminated beforehand.

I guess other distros with systemd use a service or such for starting X, but in our case, it's still an init script.  Thus there is no dependency on plymouth.

The patch below to /etc/init.d/xdm makes X start correctly.
Comment 33 Takashi Iwai 2012-06-29 14:07:51 UTC
Created attachment 496958 [details]
Patch to quit plymouth properly before starting DM
Comment 34 Stefan Dirsch 2012-06-29 14:15:49 UTC
(In reply to comment #33)
> Created an attachment (id=496958) [details]
> Patch to quit plymouth properly before starting DM

Fixed in obs://X11:XOrg and pushed to factory (SR#126630).
Comment 35 Takashi Iwai 2012-06-29 14:18:30 UTC
It seems that --retain-splash option doesn't give any better result, so better to omit the option.

I got some weird VT behavior when I tested this option a few times.  Not sure whether it's related to it, though.
Comment 36 Bernhard Wiedemann 2012-06-29 15:00:44 UTC
This is an autogenerated message for OBS integration:
This bug (769209) was mentioned in
https://build.opensuse.org/request/show/126630 Factory / xdm
Comment 37 Bernhard Wiedemann 2012-06-29 19:02:21 UTC
This is an autogenerated message for OBS integration:
This bug (769209) was mentioned in
https://build.opensuse.org/request/show/126666 Factory / xdm
Comment 38 Forgotten User sM9JzehKpy 2012-06-29 19:08:35 UTC
Be aware that the standard GNOME (GDM) and KDE (KDM) display managers nicely work together with plymouth to ensure that it has stopped before Xorg starts. 

The retain-splash option is required in order to get a seamless switch from the plymouth splash to the display manager. In this case plymouth keeps its splash on the screen until Xorg takes over the screen. This is standard functionality of Xorg! (with the option --no-background).

Which display manager is used in this case? I assume that you are not using KDE or GNOME as a desktop?
Comment 39 Takashi Iwai 2012-06-29 19:28:45 UTC
OK, that explains a lot.  FWIW, I've tested xdm and XFCE.

It's really a design flaw in plymouth.  It should have been integrated better in the session management.  Otherwise it breaks X so badly.
Comment 40 Michal Marek 2012-07-02 12:45:15 UTC
*** Bug 769037 has been marked as a duplicate of this bug. ***
Comment 41 Stefan Dirsch 2012-07-02 13:49:53 UTC
*** Bug 769416 has been marked as a duplicate of this bug. ***
Comment 42 Frederic Crozat 2012-07-02 14:02:46 UTC
I think the plymouth quit code should be moved to /usr/lib/X11/displaymanager/<display_manager_code> to make sure it doesn't interfere with the KDM / GDM interaction with plymouth.
Comment 43 Takashi Iwai 2012-07-02 14:10:20 UTC
The KDM/GDM check was already added in FACTORY xdm.

Yes, it's possible to fix in /usr/lib/X11/displaymanager/*, but in that case, you must add "plymouth quit" to each entry.  If you forget it, you'll lose the game -- X hangs up, and you cannot see even why, if you have no remote access.

Thus I think it's safer to take a white-list approach.
Comment 44 Michal Marek 2012-07-02 14:11:22 UTC
That would require each (new) displaymanager script to have its own plymouth handling. How about skipping the 'plymouth quit' call in /etc/init.d/xdm if the configured display manager is either "kdm" or "gdm"? That way, the default is to quit plymouth and not break X startup.
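
The decision proposed here can be modelled as follows. This is only a sketch of the logic: the real change lives in the shell init script /etc/init.d/xdm, and the function name should_quit_plymouth() is hypothetical. Only the "kdm" and "gdm" names come from this discussion.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the init script should run "plymouth quit" itself, or 0
 * if the configured display manager is on the whitelist of those that
 * coordinate with plymouth on their own. */
int should_quit_plymouth(const char *display_manager)
{
    /* Whitelist taken from the proposal: only kdm and gdm are skipped. */
    static const char *const handles_plymouth[] = { "kdm", "gdm" };
    size_t i;

    for (i = 0; i < sizeof handles_plymouth / sizeof handles_plymouth[0]; i++)
        if (strcmp(display_manager, handles_plymouth[i]) == 0)
            return 0;
    return 1; /* default: quit plymouth so X startup is not blocked */
}
```

Because the default branch quits plymouth, a display manager that nobody remembered to list still gets a working X startup.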
Comment 45 Michal Marek 2012-07-02 14:12:01 UTC
(In reply to comment #44)
> That would require each (new) displaymanager script to have it's own plymouth
> handling. [...]

This was an answer to Frederic's comment 42.
Comment 46 Frederic Crozat 2012-07-02 14:17:58 UTC
(In reply to comment #43)
> The KDM/GDM check was already added in FACTORY xdm.
> 
> Yes, it's possible to fix in /usr/lib/X11/displaymanager/*, but in that case,
> you must add "plymouth quit" to each entry.  If you forget it, you'll lose the
> game -- X hangs up, and you cannot see even why, if you have no remote access.
> 
> Thus I think it's safer to take a white-list approach.

Sorry, I missed the check. I agree with you for the whitelist approach.
Comment 47 Takashi Iwai 2012-07-17 13:36:54 UTC
The xdm package already contains the fix and the hang is no longer seen.
Let's close the bug now.