Bug 1102563

Summary: GNOME autologin broken with Wayland on aarch64
Product: [openSUSE] openSUSE Distribution
Reporter: Guillaume GARDET <guillaume.gardet>
Component: GNOME
Assignee: E-mail List <gnome-bugs>
Status: RESOLVED WONTFIX
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: guillaume.gardet, mlin, mloviska, okurz, qkzhu, xiaoguang.wang, yfjiang
Version: Leap 15.3   
Target Milestone: Leap 15.3   
Hardware: aarch64   
OS: Other   
URL: https://openqa.opensuse.org/tests/711960/modules/first_boot/steps/21
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---
Bug Depends on:    
Bug Blocks: 1175470    
Attachments: journalctl log
journalctl with virtio_gpu_dri.so installed

Description Guillaume GARDET 2018-07-25 13:20:32 UTC
## Observation

openQA test in scenario opensuse-Tumbleweed-DVD-aarch64-create_hdd_gnome@aarch64 fails in
[first_boot](https://openqa.opensuse.org/tests/711960/modules/first_boot/steps/21)

The autologin for GNOME is broken on AArch64, whereas it is fine for x86_64.


## Reproducible

Fails since (at least) Build [20180603](https://openqa.opensuse.org/tests/686547)


## Expected result

Last good: [20180530](https://openqa.opensuse.org/tests/685290) (or more recent)

Autologin should work also for AArch64 as it did previously.

## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?test=create_hdd_gnome&flavor=DVD&distri=opensuse&machine=aarch64&version=Tumbleweed&arch=aarch64)
Comment 1 Guillaume GARDET 2018-07-25 15:31:26 UTC
/etc/sysconfig/displaymanager and /etc/gdm/custom.conf files are identical on x86_64 and aarch64.

To make autologin work on aarch64, we must edit /etc/gdm/custom.conf and uncomment the following line, which forces the login screen to use Xorg:
  WaylandEnable=false
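
The workaround above could be scripted roughly as follows. This is only a sketch: it assumes the shipped file contains the line commented out as `#WaylandEnable=false`, and the helper name `force_xorg_login` is hypothetical. It works on the file contents as a string, so reading and writing /etc/gdm/custom.conf (as root) is left to the caller.

```python
import re

# Matches a commented-out "WaylandEnable=false" line (assumed form: "#WaylandEnable=false").
_COMMENTED = re.compile(
    r"^[ \t]*#[ \t]*(WaylandEnable[ \t]*=[ \t]*false)[ \t]*$",
    re.MULTILINE,
)


def force_xorg_login(conf_text):
    """Return conf_text with a commented-out WaylandEnable=false line uncommented.

    All other lines are left untouched; text without such a line is returned unchanged.
    """
    return _COMMENTED.sub(r"\1", conf_text)


if __name__ == "__main__":
    sample = "[daemon]\n#WaylandEnable=false\n"
    print(force_xorg_login(sample))
```

GDM has to be restarted (or the machine rebooted) before the changed setting takes effect.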
Comment 2 Guillaume GARDET 2018-07-25 16:13:29 UTC
Created attachment 777999 [details]
journalctl log
Comment 3 Oliver Kurz 2018-07-25 17:04:58 UTC
(In reply to Guillaume GARDET from comment #1)
> /etc/sysconfig/displaymanager and /etc/gdm/custom.conf files are identical
> on x86_64 and aarch64.
> 
> To make autologin work on aarch64, we must edit /etc/gdm/custom.conf and
> uncomment the following line which force the login screen to use Xorg:
>   WaylandEnable=false

I guess you could do that in the spec file of the package that owns this file, e.g. gdm, based on an arch-specific branch. Or require an additional fix-package which does this change? Have you checked for any upstream bugs describing this problem?

Regarding the openQA test: since we have support for logging in users (e.g. see the use of the variable NOAUTOLOGIN), you might be able to detect the problem and work around it in first_boot (or the helper function wait_boot).
Comment 4 Guillaume GARDET 2018-07-25 20:42:51 UTC
(In reply to Oliver Kurz from comment #3)
> (In reply to Guillaume GARDET from comment #1)
> > /etc/sysconfig/displaymanager and /etc/gdm/custom.conf files are identical
> > on x86_64 and aarch64.
> > 
> > To make autologin work on aarch64, we must edit /etc/gdm/custom.conf and
> > uncomment the following line which force the login screen to use Xorg:
> >   WaylandEnable=false
> 
> I guess you could do that in the spec file of the package that owns this
> file, e.g. gdm, based on an arch-specific branch. Or require an additional
> fix-package which does this change? Have you checked for any upstream bugs
> describing this problem?

It is more a workaround than a fix. 
I did not find any upstream bug for this. Just an old openSUSE bug: https://bugzilla.suse.com/show_bug.cgi?id=1011356

> 
> Regarding the openQA test, as we have support for logging in users, e.g. see
> the use of the variable NOAUTOLOGIN, you might be able to detect the problem
> and workaround it in first_boot (or the helper function wait_boot)

Yes, I am testing an update of the first_boot test to record a soft failure for this bug, to be able to continue the tests.
Comment 5 Michal Srb 2018-07-26 08:43:53 UTC
A sanity check: Does manual login work with WaylandEnable=true on AArch64?

It sounds strange that Wayland + AArch64 would somehow affect autologin. But it is fairly possible that it tries to do autologin, the session crashes and then it goes back to the login screen.
Comment 6 Guillaume GARDET 2018-07-26 12:11:00 UTC
(In reply to Michal Srb from comment #5)
> A sanity check: Does manual login work with WaylandEnable=true on AArch64?

With WaylandEnable=true on AArch64, I am able to log in, but loginctl still tells me that it is an X11 session.

> It sounds strange that Wayland + AArch64 would somehow affect autologin. But
> it is fairly possible that it tries to do autologin, the session crashes and
> then it goes back to the login screen.

I noticed that the login screen appears for 1 second, then a black screen for 1 second, and then the login screen appears again.
It seems that the login screen using Wayland is shown and crashes, and then the login screen using X11 is finally loaded. Is that possible?
Comment 7 Michal Srb 2018-07-26 12:54:13 UTC
(In reply to Guillaume GARDET from comment #6)
> With WaylandEnable=true on AArch64 I am able to login but loginctl still
> tell me that it is an X11 session.

That is OK. You can log into both X11 and Wayland sessions from a GDM that is itself running in Wayland mode. It seems that the default session type is an X11 session.

> I noticed that login screen appears for 1 second, then a black screen for 1
> second and the login screen appears again.
> It seems that login screen using Wayland is shown, crashes and the login
> screen using X11 is finally loaded. Is it possible?

Maybe yes. The black screen could also be the failed autologin attempt.

Please delete any ~/.xsession-errors* files in the home directory of the user that was supposed to be autologged-in, then reboot. When you get the login screen instead of the expected autologin, check if any such files were recreated. If yes, please attach them.
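
The check requested above can be sketched in two steps: clear the files before rebooting, then look for recreated ones afterwards. The function names are hypothetical, and `home` is whatever home directory belongs to the autologin user.

```python
from pathlib import Path

PATTERN = ".xsession-errors*"


def clear_xsession_errors(home):
    """Delete every ~/.xsession-errors* file under home and return the removed names."""
    removed = []
    for path in sorted(Path(home).glob(PATTERN)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed


def find_xsession_errors(home):
    """Return the names of any ~/.xsession-errors* files currently present under home."""
    return sorted(p.name for p in Path(home).glob(PATTERN) if p.is_file())
```

Run `clear_xsession_errors` before the reboot and `find_xsession_errors` once the login screen shows up; a non-empty result lists the files worth attaching.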
Comment 8 Guillaume GARDET 2018-07-26 15:19:03 UTC
(In reply to Michal Srb from comment #7)
> Please delete any ~/.xsession-errors* files in the home directory of the
> user that was supposed to be autologged-in, then reboot. When you get the
> login screen instead of the expected autologin, check if any such files were
> recreated. If yes, please attach them.

There is no such file.
Comment 9 Guillaume GARDET 2018-07-27 12:12:27 UTC
virgl has been enabled on ARM, so that the /usr/lib64/dri/virtio_gpu_dri.so file is now built there:
https://build.opensuse.org/package/rdiff/X11:XOrg/Mesa?linkrev=base&rev=761

We must wait until it reaches Factory:ARM to check whether it fixes this bug.
Comment 10 Guillaume GARDET 2018-07-27 12:16:12 UTC
*** Bug 1102735 has been marked as a duplicate of this bug. ***
Comment 11 Stefan Dirsch 2018-08-14 11:55:45 UTC
It came into Tumbleweed with snapshot 20180807, so you can give it a try.

New Tumbleweed snapshot 20180807 released!
[...]
Packages changed:
  Mesa (18.1.4 -> 18.1.5)
  Mesa-drivers (18.1.4 -> 18.1.5)

Mesa.changes
[...]
-------------------------------------------------------------------
Thu Aug  2 20:13:36 UTC 2018 - mimi.vx@gmail.com

- update to 18.1.5
  * several fixes for radv
  * A few fixes for virgil, spirv, radeonsi, nir, disk cache and build
     systems

-------------------------------------------------------------------
Thu Jul 26 10:30:26 UTC 2018 - guillaume.gardet@opensuse.org

- Enable virgl on ARM
Comment 12 Guillaume GARDET 2018-08-14 12:38:36 UTC
Created attachment 779693 [details]
journalctl with virtio_gpu_dri.so installed

The interesting part seems to be:

  gdm[1695]: gdm_session_handle_secret_info_query: assertion 'self->priv->user_verifier_interface != NULL' failed
  gdm-autologin][1723]: gkr-pam: couldn't get the password from user: Conversation error
Comment 13 Guillaume GARDET 2018-08-14 12:41:07 UTC
(In reply to Stefan Dirsch from comment #11)
> I came into Tumbleweed with snapshot 20180807, so you can give it a try.

I uploaded the new log. The message complaining about the missing virtio_gpu_dri.so file is gone, but auto-login is still broken.

This part of the log seems to be relevant:
  gdm[1695]: gdm_session_handle_secret_info_query: assertion 'self->priv->user_verifier_interface != NULL' failed
  gdm-autologin][1723]: gkr-pam: couldn't get the password from user: Conversation error
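
To pull just these two messages out of a long journalctl dump, a small filter like the following might help. It is a sketch: the marker strings are taken from the log excerpt above, while the function name is hypothetical.

```python
# Substrings of the two messages quoted above.
MARKERS = (
    "gdm_session_handle_secret_info_query",
    "gkr-pam: couldn't get the password from user",
)


def autologin_failure_lines(journal_text):
    """Return the stripped journal lines that contain one of the known failure markers."""
    return [
        line.strip()
        for line in journal_text.splitlines()
        if any(marker in line for marker in MARKERS)
    ]
```

Feeding it the text of the attached journalctl log returns only the matching lines, in their original order.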
Comment 14 Guillaume GARDET 2018-08-21 12:47:17 UTC
I just tested the ready-to-boot openSUSE-Tumbleweed-ARM-GNOME-raspberrypi3.aarch64-2018.08.13-Build.raw image on my Raspberry Pi 3, and when I enable autologin, it works fine.

On RPi3, loginctl states it uses X11.
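
For reference, the session type can be read with `loginctl show-session <id> -p Type`, which prints a line such as `Type=x11` or `Type=wayland`. A minimal sketch of automating that check follows; it assumes `XDG_SESSION_ID` is set in the environment of the logged-in session, and the function names are hypothetical.

```python
import os
import subprocess


def parse_session_type(show_session_output):
    """Extract the value of the Type= property from `loginctl show-session` output."""
    for line in show_session_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Type":
            return value.strip()
    return None


def current_session_type():
    """Ask loginctl for the type of the current session, or return None if unknown."""
    session_id = os.environ.get("XDG_SESSION_ID")
    if not session_id:
        return None
    result = subprocess.run(
        ["loginctl", "show-session", session_id, "-p", "Type"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_session_type(result.stdout)
```

Calling `current_session_type()` from a terminal inside the desktop session shows whether that session runs on X11 or Wayland.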
Comment 27 Guillaume GARDET 2019-02-14 11:14:31 UTC
According to openQA, with an upgrade from Leap 15.0 to Tumbleweed, auto-login works: https://openqa.opensuse.org/tests/853602#step/first_boot/2
Comment 28 Guillaume GARDET 2019-02-19 20:06:45 UTC
(In reply to Guillaume GARDET from comment #27)
> According to openQA, with an upgrade from Leap 15.0 to Tumbleweed,
> auto-login works: https://openqa.opensuse.org/tests/853602#step/first_boot/2

This comment is probably wrong, as the Leap 15.0 image has Wayland disabled to make auto-login work.

In an openQA test with only GNOME X11 (no GNOME Wayland), autologin passes:
https://openqa.opensuse.org/tests/858692

So, it is really Wayland specific.
Comment 31 Guillaume GARDET 2019-03-22 08:45:13 UTC
Is there anything we can do to move this forward?
Comment 32 Guillaume GARDET 2019-03-22 14:35:32 UTC
In the log there is:
  Failed to create backend: Could not find a primary drm kms device


X, on the other hand, does use a DRM/KMS driver after Wayland has failed.

Are there any known incompatibilities with virtio-gpu-pci?
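
One basic thing to rule out with this error is that no DRM card node exists at all when the compositor starts. The sketch below only lists `cardN` nodes in a directory (by default /dev/dri, the usual Linux location); it does not reproduce how the compositor actually picks its primary device, and the function name is hypothetical.

```python
import re
from pathlib import Path

_CARD = re.compile(r"card\d+")


def list_drm_cards(dri_dir="/dev/dri"):
    """Return the sorted names of cardN nodes found in dri_dir (empty if none or missing)."""
    directory = Path(dri_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if _CARD.fullmatch(p.name))
```

An empty list at the time GDM starts would point at the kernel driver side; a non-empty list means the device node exists and the problem lies elsewhere.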
Comment 33 Guillaume GARDET 2019-03-22 14:39:08 UTC
Maybe DRM/KMS just loads too late, as suggested here: https://nrhodes91.github.io/2018/11/25/archlinux-gnome-wayland-session/
Comment 34 Guillaume GARDET 2019-03-25 14:35:09 UTC
(In reply to Guillaume GARDET from comment #33)
> Maybe drm kms just load too late, as suggested here:
> https://nrhodes91.github.io/2018/11/25/archlinux-gnome-wayland-session/

I tried forcing the drm/kms modules to load earlier, and drm/kms is now up after 3 seconds (vs. 7 seconds previously), so well before Wayland tries to start. But it did not change anything.
Comment 39 Guillaume GARDET 2019-05-16 08:39:53 UTC
It is fixed in Tumbleweed 20190508.
Snapshot 20190428 was still broken.

I think there was an update regarding tty numbering in GNOME Wayland, but I cannot find it again. Maybe this is the update that fixed it?
Comment 40 Guillaume GARDET 2019-05-23 09:10:14 UTC
Probably fixed by http://bugzilla.opensuse.org/show_bug.cgi?id=1116011
Comment 65 Oliver Kurz 2021-04-01 05:03:08 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: gnome
https://openqa.opensuse.org/tests/1686787

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed
Comment 66 openQA Review 2021-05-26 05:17:53 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed
Comment 67 Oliver Kurz 2021-05-26 06:12:00 UTC
Judging from the openQA reminder comments, this was never fixed, just *assumed* to be fine when apparently it's not.
Comment 68 Guillaume GARDET 2021-05-26 06:17:31 UTC
(In reply to Oliver Kurz from comment #67)
> judging from the openQA reminder comments this was never fixed just
> *assumed* to be fine when apparently it's not

According to the test history https://openqa.opensuse.org/tests/1755934#next_previous it was fine until snapshot 152.1, but it started to fail with the latest snapshot, 160.3.
Comment 69 Guillaume GARDET 2021-05-26 06:18:49 UTC
(In reply to Guillaume GARDET from comment #68)
> (In reply to Oliver Kurz from comment #67)
> > judging from the openQA reminder comments this was never fixed just
> > *assumed* to be fine when apparently it's not
> 
> According to the test history
> https://openqa.opensuse.org/tests/1755934#next_previous until snapshot 152.1
> it was fine but it started to fail with latest snapshot, 160.3

At the same time create_hdd-gnome-wayland is fine: https://openqa.opensuse.org/tests/1755936
Comment 70 Oliver Kurz 2021-05-26 06:18:58 UTC
oh, ok. Interesting. Do you have an idea of the change in the latest build that could explain such regression?
Comment 71 Guillaume GARDET 2021-05-26 14:39:08 UTC
(In reply to Oliver Kurz from comment #70)
> oh, ok. Interesting. Do you have an idea of the change in the latest build
> that could explain such regression?

No idea. Max may know.
Comment 72 openQA Review 2021-06-10 05:18:11 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed
Comment 73 openQA Review 2021-06-24 05:18:35 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed
Comment 74 openQA Review 2021-07-09 00:46:02 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
3. The label in the openQA scenario is removed
Comment 75 openQA Review 2021-07-24 00:01:02 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
3. The label in the openQA scenario is removed
Comment 76 openQA Review 2021-08-08 00:40:16 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: create_hdd_cryptlvm
https://openqa.opensuse.org/tests/1755934

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
3. The label in the openQA scenario is removed
Comment 77 Guillaume GARDET 2021-10-19 09:28:00 UTC
Tumbleweed has been fixed for a while, but Leap 15.3 is still affected.
Comment 79 Guillaume GARDET 2023-01-25 14:17:30 UTC
Leap 15.3 EOL.