Bug 420057

Summary: gdm fails to start properly if hostname changes after X starts but before gdmslave connects to display
Product: [openSUSE] openSUSE 11.1 Reporter: JP Rosevear <jpr>
Component: GNOMEAssignee: Hans Petter Jansson <hpj>
Status: RESOLVED FIXED QA Contact: E-mail List <nld10-bugs-qa>
Severity: Major    
Priority: P1 - Urgent CC: ast, eich, fmfischer, michael.monreal
Version: Factory   
Target Milestone: Beta 4   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: Failing log snippet
Successful gdm start message.
gdm-xauthlocalhostname.patch

Description JP Rosevear 2008-08-25 14:53:05 UTC
gdm fails to start properly if hostname changes after X starts but before gdmslave connects to display, then there are "Unable to connect to display :0" errors.  Using NM and yast was set to change hostname upon DHCP.  Turning the hostname change off solved the problem.  Not sure if we are using X code to connect or what.


ug 25 00:13:28 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Creating proxy for /org/gnome/DisplayManager/Display1
Aug 25 00:13:28 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Got display id: /org/gnome/DisplayManager/Display1
Aug 25 00:13:28 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: Adding handler 11: signum=10 0x40cbd0
Aug 25 00:13:28 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmServer: Starting X server process: /usr/bin/X :0 -br -verbose -auth /var/run/gdm/auth-for-gdm-FxaQmM/database \
-nolisten tcp vt7
Aug 25 00:13:29 linux-qno7 gdm-simple-slave[2713]: DEBUG(+): GdmServer: Opening logfile for server /var/log/gdm/:0.log
Aug 25 00:13:29 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmServer: Started X server process 2713 - waiting for READY
Aug 25 00:13:29 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSimpleSlave: Started X server
Aug 25 00:13:31 linux-qno7 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Aug 25 00:13:31 linux-qno7 dhclient: DHCPOFFER from 192.168.1.1
Aug 25 00:13:31 linux-qno7 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Aug 25 00:13:31 linux-qno7 dhclient: DHCPACK from 192.168.1.1
Aug 25 00:13:31 linux-qno7 dhclient: bound to 192.168.1.20 -- renewal in 114275 seconds.
Aug 25 00:13:31 linux-qno7 kernel: vendor=1022 device=7455
Aug 25 00:13:31 linux-qno7 kernel: ACPI: PCI Interrupt 0000:05:00.0[A] -> GSI 16 (level, low) -> IRQ 16
Aug 25 00:13:31 linux-qno7 kernel: [drm] Initialized drm 1.1.0 20060810
Aug 25 00:13:31 linux-qno7 kernel: [drm] Initialized radeon 1.29.0 20080528 on minor 0
Aug 25 00:13:32 linux-qno7 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/autofs' exited with error status 1.
Aug 25 00:13:33 linux-qno7 kernel: ip6_tables: (C) 2000-2006 Netfilter Core Team
Aug 25 00:13:33 linux-qno7 kernel: agpgart: Found an AGP 3.0 compliant device at 0000:04:00.0.
Aug 25 00:13:33 linux-qno7 kernel: agpgart: Device is in legacy mode, falling back to 2.x
Aug 25 00:13:33 linux-qno7 kernel: agpgart: Putting AGP V2 device at 0000:04:00.0 into 4x mode
Aug 25 00:13:33 linux-qno7 kernel: agpgart: Putting AGP V2 device at 0000:05:00.0 into 4x mode
Aug 25 00:13:33 linux-qno7 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Aug 25 00:13:33 linux-qno7 SuSEfirewall2: SuSEfirewall2 not active
Aug 25 00:13:33 linux-qno7 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/netcontrol_global_hooks' exited with error status 1.
Aug 25 00:13:33 linux-qno7 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/nfs' exited with error status 2.
Aug 25 00:13:33 linux-qno7 sshd[2718]: Accepted keyboard-interactive/pam for root from 192.168.1.30 port 10665 ssh2
Aug 25 00:13:33 linux-qno7 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/smbfs' exited with error status 6.
Aug 25 00:13:33 linux-qno7 kernel: [drm] Setting GART location based on new memory map
Aug 25 00:13:33 linux-qno7 kernel: [drm] Loading R100 Microcode
Aug 25 00:13:33 linux-qno7 kernel: [drm] writeback test succeeded in 1 usecs
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: handling signal 10
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: Found 2 callbacks
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x40cbd0
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x406c50
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): Got callback for signal 10
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): Got USR1 signal
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSignalHandler: Done handling signals
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmServer: Got USR1 from X server - emitting READY
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Server is ready - opening display :0
Aug 25 00:13:36 linux-qno7 gdm-simple-slave[2709]: WARNING: Unable to connect to display :0
Aug 25 00:13:37 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Server is ready - opening display :0
Aug 25 00:13:37 linux-qno7 gdm-simple-slave[2709]: WARNING: Unable to connect to display :0
Aug 25 00:13:37 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Server is ready - opening display :0
Aug 25 00:13:37 linux-qno7 gdm-simple-slave[2709]: WARNING: Unable to connect to display :0
Aug 25 00:13:38 linux-qno7 gdm-simple-slave[2709]: DEBUG(+): GdmSlave: Server is ready - opening display :0
Aug 25 00:13:38 linux-qno7 gdm-simple-slave[2709]: WARNING: Unable to connect to display :0


and when hostname is off:
Aug 25 08:55:07 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: Adding handler 11: signum=10 0x40cbd0
Aug 25 08:55:07 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmServer: Starting X server process: /usr/bin/X :0 -br -verbose -auth /var/run/gdm/auth-for-gdm-Mrf7ym/database \
-nolisten tcp vt7
Aug 25 08:55:07 linux-qno7 gdm-simple-slave[2539]: DEBUG(+): GdmServer: Opening logfile for server /var/log/gdm/:0.log
Aug 25 08:55:07 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmServer: Started X server process 2539 - waiting for READY
Aug 25 08:55:08 linux-qno7 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
Aug 25 08:55:08 linux-qno7 dhclient: DHCPOFFER from 192.168.1.1
Aug 25 08:55:08 linux-qno7 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Aug 25 08:55:08 linux-qno7 dhclient: DHCPACK from 192.168.1.1
Aug 25 08:55:07 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSimpleSlave: Started X server
Aug 25 08:55:08 linux-qno7 dhclient: bound to 192.168.1.20 -- renewal in 112173 seconds.
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: handling signal 10
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: Found 2 callbacks
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x40cbd0
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x406c50
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): Got callback for signal 10
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): Got USR1 signal
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSignalHandler: Done handling signals
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmServer: Got USR1 from X server - emitting READY
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSlave: Server is ready - opening display :0
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSlave: Connected to display :0
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSimpleSlave: Running greeter
Aug 25 08:55:10 linux-qno7 gdm-simple-slave[2482]: DEBUG(+): GdmSimpleSlave: Creating greeter for :0
Comment 1 Stefan Dirsch 2008-08-25 15:12:44 UTC
I thought we had fixed such network issues (chaging IP adress resulted in no longer accessible DISPLAY) a long time ago. Could you please explain what gdmslave does exactly? BTW, pasting outputs with long lines results in unreadable comments. Better attach these. Thanks.
Comment 2 JP Rosevear 2008-08-25 19:09:15 UTC
HPJ can better explain the gdm stuff.
Comment 3 JP Rosevear 2008-08-25 19:11:15 UTC
Created attachment 235352 [details]
Failing log snippet
Comment 4 JP Rosevear 2008-08-25 19:14:52 UTC
Created attachment 235353 [details]
Successful gdm start message.
Comment 5 Stefan Dirsch 2008-08-26 06:32:35 UTC
Oops. The bug has been classified as GNOME bug. Re-add me to Cc later if you like.
Comment 6 JP Rosevear 2008-09-17 12:56:39 UTC
Submitted a fix for this - couldn't get XAUTHLOCALHOSTNAME stuff to work for me, so just made the cookie FamilyWild in the local case.  With XAUTHLOCALHOSTNAME it appears like there is a race between cookie creation and exporting the environment variable. Egbert, sound right both for the fix and the race?
Comment 7 JP Rosevear 2008-09-17 18:10:06 UTC
For #6
Comment 8 Hans Petter Jansson 2008-10-07 00:03:45 UTC
Egbert: Ping.
Comment 9 Egbert Eich 2008-10-07 05:30:03 UTC
(In reply to comment #6 from JP Rosevear)
> Submitted a fix for this - couldn't get XAUTHLOCALHOSTNAME stuff to work for
> me, so just made the cookie FamilyWild in the local case.  With
> XAUTHLOCALHOSTNAME it appears like there is a race between cookie creation and
> exporting the environment variable. Egbert, sound right both for the fix and
> the race?
> 
The host name in XAUTHLOCALHOSTNAME needs to be the same as the one written to the $XAUTHORITY file or $HOME/.Xauthority (if the environment variable isn't set). Whoever modifies the authority file should remember the hostname used there. Of course one needs to make sure that this variable is set in the environment the client is started in.
Comment 10 Egbert Eich 2008-10-07 15:47:16 UTC
The environment variable only needs to be set before the hostname changes. If for whatever reasons a client tries to connect after the cookie has been set and the hostname has change but before the environment variable is set, the client will fail to connect.
I think here is what happens: gdm-slave is started immediately when the server is started as it is waiting for a SIGUSR1 which indicates that it can connect.
While the Xserver is still starting up (and before it is able to send the SIGUSR1) the IP has changed. If gdm-slave had been started with XAUTHLOCALHOST set to exaclty the same value that had written to the authority file everything should have been fine.
Can you a. log the value of XAUTLOCALHOST in gdm-slave and do a xauth list?
Comment 11 Hans Petter Jansson 2008-10-08 01:45:02 UTC
Egbert, is it acceptable to just set "localhost" in both the cookie and the XAUTHLOCALHOSTNAME var?
Comment 12 Hans Petter Jansson 2008-10-08 04:27:14 UTC
Created attachment 244153 [details]
gdm-xauthlocalhostname.patch

Here's a working patch that trades the old hack for just saving "localhost" into the xauth db and setting XAUTHLOCALHOSTNAME to "localhost" everywhere. The new gdm appears to be setting up environments in a number of places :/
Comment 13 Hans Petter Jansson 2008-10-08 04:29:00 UTC
If this looks good security-wise, I think it should go in. Since the problem is timing-related I'm not 100% positive that this fixes it, but it does look like it should.
Comment 14 Egbert Eich 2008-10-08 09:23:56 UTC
Gdm isn't saving the authority file in the user's home directory any more, right?
Otherwise 'localhost' would be a bad idea as home directories could be shared over nfs, thus a second login would overwrite the credentials of the first.
Comment 15 Egbert Eich 2008-10-08 09:24:26 UTC
In the future we may want to switch to 'server interpreted' access control  
by default. It works like host based access control but is able to
distinguish users.
No more cookies are required. This has downsides of course and the security 
implications still need to be investigated.
Comment 16 Hans Petter Jansson 2008-10-09 05:55:45 UTC
It uses /var/run/gdm/auth-for-$USER-$RANDOMSTRING/database, so I guess we're safe in that respect.

Thanks a lot, Egbert!
Comment 17 Hans Petter Jansson 2008-10-10 01:57:22 UTC
Submitted to GNOME:Factory.
Comment 20 Stefan Behlert 2008-10-22 11:25:54 UTC
I just build an mbuild of the package submitted to stable and updated the gdm-files in an x86_64 installation. Afterwards opening yast and other applications failed after entering the root-password in the gnomesu-pop-up.
Reverting to the old package fixes the problem.
Looks like the check-in has a flaw?
Comment 21 JP Rosevear 2008-10-23 16:59:39 UTC
Tested this, and yes it fails, but remove the stale /root/.Xauthority manually and all should start working again.
Comment 22 JP Rosevear 2008-10-23 20:23:41 UTC
*** Bug 438161 has been marked as a duplicate of this bug. ***
Comment 23 Hans Petter Jansson 2008-10-25 01:21:46 UTC
Also, make sure you have the updated libgnomesu package I submitted some time ago.
Comment 24 JP Rosevear 2008-10-27 20:19:13 UTC
Stephan, please re-open if you see this in b4.