Bug 219946

Summary: First Xserver (fbdev) crashes (in mouse driver) when second Xserver (native driver) is terminated
Product: [openSUSE] openSUSE 10.2 Reporter: Lukas Ocilka <locilka>
Component: X.Org Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact: Stefan Dirsch <sndirsch>
Severity: Critical    
Priority: P2 - High CC: chuller, eich, forgotten_CxVz4LpaB5, gholmer, hjh, hugo.costelha, mrmazda, ms, rf, sndirsch
Version: Beta 2   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: Other Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: The entire /var/log directory
bug219946.jpg, camera screenshot
Workaround for this issue

Description Lukas Ocilka 2006-11-10 16:50:54 UTC
I was running the X configuration test.
After a successful test, SaX offered to save the configuration, so I accepted.
Then it killed YaST.

I don't remember the exit message. Maybe SaX just killed all running X servers?

2006-11-10 17:36:48 <1> gizo(4270) [YCP] clients/x11_proposal.ycp:60 required X11 packages: ["xorg-x11", "xorg-x11-server", "libusb",
 "sax2", "sax2-gui", "sax2-ident", "sax2-tools", "sax2-libsax", "sax2-libsax-perl"]
2006-11-10 17:36:49 <1> gizo(4270) [YCP] clients/x11_proposal.ycp:65 x11 package status is: <true>

--- here ---
2006-11-10 17:36:49 <1> gizo(4270) [YCP] clients/x11_proposal.ycp:314 X11: running a test server now...
2006-11-10 17:37:04 <2> gizo(4270) [qt-ui] YQUI_core.cc(qMessageHandler):661 qt-warning: qt: Fatal IO error: client killed
--- here ---

2006-11-10 17:37:04 <1> gizo(4270) [zypp::SourceManager] SourceManager.cc(~SourceManager):140 Deleted SourceManager Singleton.
2006-11-10 17:37:04 <0> gizo(4270) [zypp] PathInfo.cc(_Log_Result):295 recursive_rmdir /var/tmp/zypp.SYgS9j
2006-11-10 17:37:04 <0> gizo(4270) [zypp] TmpPath.cc(~Impl):78 TmpPath cleaned up /var/tmp/zypp.SYgS9j{d 0700 0/0}
2006-11-10 17:37:04 <1> gizo(4270) [Y2Perl] YPerl.cc(destroy):163 Shutting down embedded Perl interpreter.
Comment 1 Lukas Ocilka 2006-11-10 16:51:46 UTC
Created attachment 104711
The entire /var/log directory
Comment 3 Marcus Schaefer 2006-11-11 17:26:35 UTC
Of course SaX is not killing all servers. Calling the test is always a risk
because multiple instances of X servers could crash the system
(bad X implementation?).

Does saving without testing work?
Comment 4 Stefan Dirsch 2006-11-11 18:38:32 UTC
Could you please describe how to reproduce this problem exactly? During installation? Afterwards? sax2 started on top of Xserver or from console? etc. Thanks.
Comment 5 Lukas Ocilka 2006-11-11 19:21:49 UTC
Yes, according to the bug subject: during installation, the X configuration is detected and proposed. Running the "test installation" offers to "Save the configuration" after a successful test (as running SaX on a running system does). So I did.

I guess the X configuration should not be offered for saving, because it is running in the installation workflow. The installation workflow itself should save the configuration after clicking [Next] in the hardware proposal dialog.

After testing and clicking [Save] (or whatever), the X server (running the Qt installation) was killed.

BTW: if running the "test configuration" is a risk, the user should be informed before running the test.
Comment 6 Stefan Dirsch 2006-11-11 21:29:27 UTC
Still not sure who killed whom. This needs to be investigated (and hopefully reproduced) by a YaST developer first.
Comment 7 Lukas Ocilka 2006-11-13 07:33:52 UTC
This has been _reported_ by a YaST developer :)
What exactly do you need?
Comment 8 Martin Vidner 2006-11-13 07:58:45 UTC
Created attachment 104887
bug219946.jpg, camera screenshot
Comment 9 Stefan Dirsch 2006-11-13 08:06:18 UTC
Looks like the Save button is still displayed on the test Xserver. I don't think it matters whether you choose Save or Cancel there. It seems there is no way back from the second Xserver to the first one after terminating the second. Instead the first Xserver crashes. Maybe Xorg.0.log.old is the correct logfile:

Backtrace:
0: Xorg(xf86SigHandler+0x81) [0x80c00b1]
1: [0xb7fdc420]
2: /usr/lib/xorg/modules/input//mouse_drv.so [0xb7c83072]
3: Xorg [0x80c022a]
4: Xorg [0x80aa441]
5: [0xb7fdc420]
6: Xorg(xf86UnblockSIGIO+0x57) [0x80a9e57]
7: Xorg [0x80b49dd]
8: Xorg(xf86EnterServerState+0x1f) [0x80b4a0f]
9: Xorg(xf86Wakeup+0x3de) [0x80c17ee]
10: Xorg(WakeupHandler+0x59) [0x808a279]
11: Xorg(WaitForSomething+0x1b9) [0x819aa29]
12: Xorg(Dispatch+0x82) [0x80861c2]
13: Xorg(main+0x495) [0x806e705]
14: /lib/libc.so.6(__libc_start_main+0xdc) [0xb7d6ef9c]
15: Xorg(FontFileCompleteXLFD+0x1e1) [0x806da21]
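Frames 2 to 6 of this backtrace appear to show the input signal handler running from inside xf86UnblockSIGIO(), which would match ordinary POSIX behaviour: a signal raised while it is blocked stays pending and is delivered as soon as it is unblocked. The following is a minimal sketch of that behaviour only, not X server code. SIGUSR1 stands in for SIGIO, and device_open, reads_while_closed, on_input_signal() and deliveries_on_unblock() are hypothetical names made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical state: is the input device currently usable? */
static volatile sig_atomic_t device_open = 0;
/* Counts handler invocations that arrived while the device was closed. */
static volatile sig_atomic_t reads_while_closed = 0;

/* Stand-in for an input signal handler (SIGUSR1 replaces SIGIO here). */
static void on_input_signal(int sig)
{
    (void)sig;
    if (!device_open)
        reads_while_closed++;
}

/*
 * Blocks the signal, raises it while the "device" is closed, then
 * unblocks it. Returns how many times the handler ran as a result of
 * the unblock, or -1 if the handler could not be installed.
 */
static int deliveries_on_unblock(void)
{
    struct sigaction sa;
    sigset_t input_sig, pending;
    int before, delivered;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_input_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&input_sig);
    sigaddset(&input_sig, SIGUSR1);

    sigprocmask(SIG_BLOCK, &input_sig, NULL);
    device_open = 0;
    raise(SIGUSR1);                       /* stays pending while blocked */

    sigpending(&pending);
    assert(sigismember(&pending, SIGUSR1) == 1);
    before = reads_while_closed;

    /* The pending signal is delivered before this call returns. */
    sigprocmask(SIG_UNBLOCK, &input_sig, NULL);
    delivered = reads_while_closed - before;

    device_open = 1;                      /* too late for that delivery */
    return delivered;
}
```

Calling deliveries_on_unblock() once is expected to return 1 on a POSIX system: the handler runs at the unblock, before the device has been marked open again. A handler in that position has to tolerate a device that is not ready.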
Comment 10 Lukas Ocilka 2006-11-13 09:06:16 UTC
Marcus, please, reevaluate this issue.
Comment 11 Stefan Dirsch 2006-11-13 09:11:28 UTC
It's an Xserver issue.
Comment 12 Stefan Dirsch 2006-11-13 11:42:09 UTC
Can you reproduce this problem on an installed system? 

First configure a fbdev based Xserver with "sax2 -a -r -m 0=fbdev". On top of this Xserver start "sax2" *with* DISPLAY set. Then press the Test button to try to reproduce this problem. If it can be reproduced we could try to debug the Xserver on an installed system, which would help a lot. Debugging the Xserver during installation is way more complicated.
Comment 13 Lukas Ocilka 2006-11-13 17:06:48 UTC
Hmm, now it says:
Your modifications will take effect the
next time the graphics system is
restarted. Exit SaX now?
[ Yes ] [ No ]

It doesn't offer to save the configuration :(

If I run X from the first console with `startx` and `sax2` from another console, it still opens SaX UI in the running X. It doesn't run any separate X.

Another case:
`startx` as nobody on console1
`sax2` as root on console2

Sax doesn't offer to save the configuration but operates as in the previous case "will take effect ... next time".

So the only way to reproduce it is to run the second-stage installation again:
just `touch /var/lib/YaST2/runme_at_boot` and `restart`. I guess this could be somehow connected with one of Marcus' "recently-WORKSFORME-closed" bugs, which talks about a blackout when the X servers are detected, see bug #219708.
Comment 14 Stefan Dirsch 2006-11-13 17:16:16 UTC
I don't want to run SaX2 on a second Xserver! Is it really so difficult to start "sax2" inside of an xterm?
Comment 15 Lukas Ocilka 2006-11-13 17:19:01 UTC
No, it is not so difficult, this was the first case I've tried.
Sorry, that it wasn't obvious.

--- cut ---
Hmm, now it says:
Your modifications will take effect the
next time the graphics system is
restarted. Exit SaX now?
[ Yes ] [ No ]
--- cut ---
Comment 16 Stefan Dirsch 2006-11-13 17:24:57 UTC
Ok. So it will be very difficult/nearly impossible to debug. This is bad.
Comment 17 Stefan Dirsch 2006-11-14 09:22:45 UTC
*** Bug 216451 has been marked as a duplicate of this bug. ***
Comment 18 Stefan Dirsch 2006-11-15 09:35:09 UTC
*** Bug 220199 has been marked as a duplicate of this bug. ***
Comment 19 Stefan Dirsch 2006-11-15 15:25:46 UTC
Wow. I can easily reproduce this issue during installation and even afterwards when running SaX2 on top of a fbdev based Xserver.
Comment 21 Stefan Dirsch 2006-11-15 16:00:26 UTC
Just started sax2 on a naked Xserver running in gdb. Result when pressing Save button on test Xserver is a crashed fbdev Xserver.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1211565504 (LWP 8155)]
0x080d2109 in XisbBlockDuration (b=0x0, block_duration=-1) at xisb.c:178
178             b->block_duration = block_duration;
(gdb) #0  0x080d2109 in XisbBlockDuration (b=0x0, block_duration=-1) at xisb.c:178
#1  0xb7b7b072 in xf86MouseProtocolNameToID ()
   from /usr/lib/xorg/modules/input//mouse_drv.so
#2  0x080cc0aa in xf86SigioReadInput (fd=12, closure=0x8200de8)
    at xf86Events.c:1212
#3  0x080ad001 in xf86SIGIO (sig=29) at ./../shared/sigio.c:113
#4  <signal handler called>
#5  0xb7f30410 in ?? ()
#6  0xbff3e8f8 in ?? ()
#7  0x00000000 in ?? ()
(gdb) 

But this looks like a different problem to me !?!
Comment 22 Henrik Juul Hansen 2006-11-15 16:25:19 UTC
I have had the same error during installation. Is there any
information/logfiles that I can send to help?
BTW The "test" of monitor/graphic card is very good to have during installation
so please do not remove it. At least I have had many problems of unknown
monitors or wrong v./h. sync freq. settings.

Best regards,
HJH
Comment 23 Stefan Dirsch 2006-11-15 16:40:53 UTC
Null pointer dereference. Where is pMse->buffer set? Not at all?

_X_EXPORT void
XisbBlockDuration (XISBuffer *b, int block_duration)
{
        b->block_duration = block_duration;
}

static void
MouseReadInput(InputInfoPtr pInfo)
{
    MouseDevPtr pMse;
    int j, buttons, dx, dy, dz, dw, baddata;
    int pBufP;
    int c;
    unsigned char *pBuf, u;


    pMse = pInfo->private;
    pBufP = pMse->protoBufTail;
    pBuf = pMse->protoBuf;

    /*
     * Set blocking to -1 on the first call because we know there is data to
     * read. Xisb automatically clears it after one successful read so that
     * succeeding reads are preceeded by a select with a 0 timeout to prevent
     * read from blocking indefinitely.
     */
    XisbBlockDuration(pMse->buffer, -1);
Comment 24 Stefan Dirsch 2006-11-15 20:29:33 UTC
Egbert, Matthias. Could you have a look at this one, please? Looks like pMse/pInfo->private is not initialized correctly.
Comment 25 Stefan Dirsch 2006-11-16 10:04:20 UTC
*** Bug 221592 has been marked as a duplicate of this bug. ***
Comment 26 Stefan Dirsch 2006-11-16 12:00:37 UTC
Matthias will investigate.
Comment 27 Stefan Dirsch 2006-11-16 12:06:23 UTC
The easiest way to reproduce is to start first Xserver with option
"-xf86config /etc/X11/xorg.conf.install" (fbdev based xorg.conf from installation), start "sax2 -r" on this Xserver, then click on 

  Ok --> Test --> Save

==> first fbdev based Xserver is crashing
Comment 28 Marcus Schaefer 2006-11-16 14:07:03 UTC
*** Bug 221301 has been marked as a duplicate of this bug. ***
Comment 29 Stefan Dirsch 2006-11-16 14:45:54 UTC
This bug is now mentioned as one of the most annoying bugs of Beta2:
--> http://en.opensuse.org/Bugs:Most_Annoying_Bugs

BTW, it would help us to know whether this bug also occurs in releases before Beta2. So if someone wants to test, please let us know your results. Thanks.
Comment 30 Hugo Costelha 2006-11-16 15:03:06 UTC
As you can see, I opened bug 216451 on the 30th of October, so long before beta2.
Comment 31 Stefan Dirsch 2006-11-16 15:10:20 UTC
Thanks! Good to know.
Comment 32 Matthias Hopf 2006-11-17 17:05:18 UTC
Created attachment 106067
Workaround for this issue

+Patch6:         mouse-readinput.diff

 pushd xf86-input-mouse-*/src
 %patch3 -p6
+%patch6 -p2
 popd


This seems to be a rare race condition. It apparently only happens with PS/2 mice, and if you select the Save button on the test screen with the mouse (and not the keyboard).

It seems that if a mouse event is detected during a VT switch, MouseReadInput() is called at a point where it must not be. As a VT switch is typically induced by keyboard and not by mouse, this doesn't happen during regular work.

This patch does *not* fix the underlying bug, namely that MouseReadInput() is called when it shouldn't be. It only makes the function return gracefully if it *is* called while the mouse device isn't open.
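
The idea of returning early when the device isn't open can be sketched in a few lines of C. This is not the content of mouse-readinput.diff, which is not reproduced in this report. The struct definitions are simplified, hypothetical stand-ins for the X server types quoted in comment 23, MouseReadInputGuarded() is a made-up name, and the int return value exists only so the behaviour can be checked (the real MouseReadInput() returns void).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the X server types. */
typedef struct {
    int block_duration;
} XISBuffer;

typedef struct {
    XISBuffer *buffer;   /* assumed to be NULL while the device is closed */
} MouseDevRec;

typedef struct {
    void *private;       /* points at the driver's MouseDevRec */
} InputInfoRec;

/* Same body as the function quoted in comment 23: no NULL check. */
static void XisbBlockDuration(XISBuffer *b, int block_duration)
{
    b->block_duration = block_duration;
}

/*
 * Guarded variant of the start of MouseReadInput().
 * Returns 1 if input handling proceeded, 0 if the call was ignored
 * because the device is not open.
 */
static int MouseReadInputGuarded(InputInfoRec *pInfo)
{
    MouseDevRec *pMse;

    assert(pInfo != NULL);
    pMse = pInfo->private;

    /* Return instead of dereferencing a NULL buffer. */
    if (pMse == NULL || pMse->buffer == NULL)
        return 0;

    XisbBlockDuration(pMse->buffer, -1);
    /* ... protocol parsing would follow here ... */
    return 1;
}
```

With a NULL buffer the call is ignored. Once a buffer is attached, the function proceeds and sets block_duration to -1, as in the original code path.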
Comment 33 Hugo Costelha 2006-11-17 17:16:54 UTC
Just want to say I am using a USB mouse, and it happens to me.
Comment 34 Matthias Hopf 2006-11-17 17:25:46 UTC
Essentially the same code. Maybe the difference that let me reproduce it was the resolution :-]

This is a rare race condition and it's difficult to trigger. But it seems to crash reliably once you finally hit it ;)
Comment 35 Stefan Dirsch 2006-11-17 17:46:47 UTC
Thanks. Fixed for RC1.
Comment 36 Hugo Costelha 2006-11-17 18:16:50 UTC
Guys, can you give the version of the fixed package, so that one can test when it becomes available?

Thanks.
Comment 37 Stefan Dirsch 2006-11-17 18:20:03 UTC
Check for RPM changelog entry ("rpm --changelog -q xorg-x11-driver-input")

-------------------------------------------------------------------
Fri Nov 17 18:11:52 CET 2006 - sndirsch@suse.de

- mouse-readinput.diff:
  * fixed mouse driver crash (Bug #219946)
Comment 38 Stefan Dirsch 2006-11-17 18:25:00 UTC
*** Bug 221649 has been marked as a duplicate of this bug. ***