Bug 385172

Summary: Installer crashes if Korean or Japanese is selected from the installer menu.
Product: [openSUSE] openSUSE 11.0 Reporter: Tejun Heo <teheo>
Component: InstallationAssignee: Martin Vidner <mvidner>
Status: RESOLVED FIXED QA Contact: Jiri Srain <jsrain>
Severity: Major    
Priority: P2 - High CC: coolo, helios_reds, jsuchome, lslezak, matz, shinkichi.yamazaki, snwint, tiwai
Version: Beta 1Flags: coolo: SHIP_STOPPER-
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: Development Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: y2logs
better log file
korean-crash.ycp
backtrace
backtrace with debug/vector
y2log snippet, try sort -k3 to see the sloppy comparison
/usr/share/YaST2/data/timezone_raw.ycp
korean-crash.cc
korean-crash2.ycp

Description Tejun Heo 2008-04-30 10:39:50 UTC
If Korean is selected from isolinux, it works fine but selecting Korean from the installer makes it crash every time and I'm dropped to linuxrc.  If I choose to re-start installation from there, linuxrc crashes.
Comment 1 Andreas Jaeger 2008-05-05 10:41:24 UTC
Please provide the YaST log files from the crash.
Comment 2 Tejun Heo 2008-05-06 06:55:07 UTC
Created attachment 212536 [details]
y2logs

Here you go.  This is from beta2 which shows some improvement over b1 in that it can retry w/o linuxrc crashing.  This is from the second try which crashed the same way.  If you want log from the first try, please lemme know.
Comment 3 Tejun Heo 2008-05-06 06:55:29 UTC
Clearing NEEDINFO.
Comment 4 Stephan Kulow 2008-05-07 12:44:37 UTC
the yast logs do not help ;(
Please boot with startshell=1 and then call yast - you should get better logs/console output

Comment 5 Tejun Heo 2008-05-14 06:24:45 UTC
Coolo, this one is very deterministic and doing the following with x86-64 b2 dvd will always give you the bug.

1. from isolinux menu, don't select anything.  just initiate installation.
2. from installer language selection menu, select korean but don't change keyboard layout and press next.
3. click through a warning message and select new installation.  yast will crash.

It's easily reproducible in vmware or any real machine.  If you need more info, I can provide it but having the problem reproduced locally beats any bug reporter, right?
Comment 6 Stephan Kulow 2008-05-14 07:21:27 UTC
ah, "press next" - because nothing crashed here when selecting korean. the fonts only got very weird.
Comment 7 Tejun Heo 2008-05-14 07:40:52 UTC
The Korean installer font isn't too bad.  It isn't too pretty but is the best free font we've got at this point.  Heh... A complete Korean fonts should contain 11172 glyphs, so creating one is a very difficult task.

Anyways, it crashes after new installation is initiated, so you need to clock next twice to get the crash.  I'm testing b2 now and it seems that selecting Korean from isolinux now gives the same crash.

Thanks.
Comment 8 Stephan Kulow 2008-05-14 09:02:17 UTC
Created attachment 215042 [details]
better log file
Comment 9 Stephan Kulow 2008-05-14 09:02:46 UTC
the interpreter seems to crash on the korean strings
Comment 10 Martin Vidner 2008-05-14 15:53:23 UTC
I tried to reproduce it outside the installation, but it did not crash:
  zypper in yast2-trans-ko
  LANG=ko_KR.UTF-8 yast2 timezone
  (click Seoul, Accept)
I will retry with the installation.
Comment 11 Martin Vidner 2008-05-20 17:48:11 UTC
Created attachment 216991 [details]
korean-crash.ycp

After desperate attempts to debug inside the inst-sys, I finally managed to make a small test case to reproduce it in an installed B3. Attached.

$ /usr/lib/YaST2/bin/y2base ./korean-crash.ycp  UI
YaST got signal 11 at YCP file /home/mvidner/snippets/korean-crash.ycp:47
Comment 12 Martin Vidner 2008-05-21 07:22:59 UTC
Created attachment 217112 [details]
backtrace

the code is at http://svn.opensuse.org/svn/yast/trunk/core/libycp/src
Comment 13 Martin Vidner 2008-05-21 07:55:01 UTC
Created attachment 217122 [details]
backtrace with debug/vector

I recompiled yast2-core with YCPList using __gnu_debug::vector and this is the result: /usr/include/c++/4.3/debug/safe_iterator.h:239:error: attempt to decrement a dereferenceable (start-of-sequence) iterator.
Comment 14 Martin Vidner 2008-05-21 08:30:42 UTC
To reproduce the bug, you need the test case from comment 11 and yast2-trans-ko.rpm.

It turns out that sorting is seriously confused if the comparison function behaves weirdly, like claiming that unequal elements are equal.

It looks like it compares Korean strings based on the LENGTH of space-separated words.
Comment 15 Martin Vidner 2008-05-21 08:31:56 UTC
Created attachment 217136 [details]
y2log snippet, try sort -k3 to see the sloppy comparison
Comment 16 Martin Vidner 2008-05-21 12:50:40 UTC
Created attachment 217244 [details]
/usr/share/YaST2/data/timezone_raw.ycp

This is as minimal data set as I could make. If I try to remove some items, it does not crash anymore. (Some automated pruning could help to make it really minimal)
Comment 17 Martin Vidner 2008-05-21 12:54:59 UTC
I think that the core of the bug is that wcscoll for Korean is broken.
I will try to make a simple test case of the above components and pass it to Pasky.

As a workaround, Jirka, could you disable the timezone sorting for Korean?
Comment 18 Tejun Heo 2008-05-22 00:59:18 UTC
If wcscoll doesn't work for Korean for whatever reason, you can simply convert the string to UTF16 or UTF32 and compare (q)word by (q)word.  That should give good enough result in most cases.
Comment 19 Jiří Suchomel 2008-05-22 06:30:15 UTC
(In reply to comment #17 from Martin Vidner)

> As a workaround, Jirka, could you disable the timezone sorting for Korean?

Sure: tested, and it helps.
From the comments above, I do not understand why doesn't it crash
a) when Korean is selected in linuxrc
b) when yast2 timezone is called on the installed system...
Comment 20 Martin Vidner 2008-05-22 12:07:04 UTC
Created attachment 217514 [details]
korean-crash.cc

This is a self-contained C++ test case that does not crash, but nevertheless illustrates the probable cause: When run with LC_ALL=ko_KR.UTF-8, it sorts correctly (AFAICS: the initial characters group together). When run with en_US.UTF-8, it falls back to sorting according to word length.
Comment 21 Jiří Suchomel 2008-05-22 12:37:36 UTC
(In reply to comment #17 from Martin Vidner)

> As a workaround, Jirka, could you disable the timezone sorting for Korean?

yast2-country-2.16.26
Comment 22 Mike Fabian 2008-06-11 12:16:30 UTC
Has anything been done here?

Could this be related to bug #399241 ?
Comment 23 Martin Vidner 2008-06-11 13:02:16 UTC
*** Bug 399241 has been marked as a duplicate of this bug. ***
Comment 24 Martin Vidner 2008-06-11 13:11:36 UTC
A quick and blunt workaround would be to disable the sorting unconditionally.
But I need to look at the core problem again.
Comment 25 Mike Fabian 2008-06-11 13:18:49 UTC
I think we need something now, even if it is a blunt workaround.
Comment 26 Martin Vidner 2008-06-11 13:28:04 UTC
Installation in Japanese (and possibly Chinese) will always crash. The workaround would be to install in another language and switch it after the installation has finished.
Comment 27 Jiří Suchomel 2008-06-11 13:31:30 UTC
I think I could disable timezone sorting for languages that I have marked as CJK in Language.ycp (although not only CJK in fact):

list cjk_languages      = [ "ja", "ko", "zh", "hi", "km", "pa", "bn" ];

Right now I tested with Japanese and it works.
Comment 28 Mike Fabian 2008-06-11 13:33:09 UTC
I just finished an RC4 installation in simplified Chinese and it did
*not* crash. Surprising, but it really works.
Comment 29 Jiří Suchomel 2008-06-11 13:43:38 UTC
(In reply to comment #28 from Mike Fabian)
> I just finished an RC4 installation in simplified Chinese and it did
> *not* crash. Surprising, but it really works.

Traditional Chinese is also fine.

To test it, you only need to do few installation steps: when you see the timezone dialog, you're (most probably) on the safe side.

Comment 30 Stephan Kulow 2008-06-11 13:46:22 UTC
even if it would always crash, we just have too few cjk users to warrent a delay. And the installation crashes very early, so there is no fear of data loss.
Comment 31 Stephan Kulow 2008-06-11 13:47:37 UTC
So keep working on the core problem and we'll do driver update
Comment 32 Martin Vidner 2008-06-11 15:04:54 UTC
So I thought that perhaps YaST was not setting LC_COLLATE properly. But it is:
I clicked to the dialog before the crash, got a shell, gdb'd the yast process, said "p (char*)setlocale(6,0)" (that's LC_ALL, NULL) and got "LC_CTYPE=ja_JP.UTF-8;LC_NUMERIC=C;LC_TIME=ja_JP.UTF-8;LC_COLLATE=ja_JP.UTF-8;LC_MONETARY=ja_JP.UTF-8;LC_MESSAGES=ja_JP.UTF-8;LC_PAPER=ja_JP.UTF-8;LC_NAME=ja_JP.UTF-8;LC_ADDRESS=ja_JP.UTF-8;LC_TELEPHON"...

I wanted to check whether the files /usr/lib/locale/*/LC_COLLATE are present, but they are not and in the inst-sys there is only a single file in /usr/lib/locale : locale-archive.

Pasky, is that enough for correct collation?
Comment 33 Mike Fabian 2008-06-11 15:15:58 UTC
locale-archive is a database to hold all the files usually below
/usr/lib/locale in a single file.

One can use such a database but one does not have to.

In the installed system we don’t use it (We tried a few years ago but
decided against using it).

I don’t know why we use it in the instsys, maybe because the
savings in efficiency and space are more important for the instsys.

The locale-archive file in the instsys has only 2.0 MB though.

That is certainly not enough to contain all the locales listed
by “locale -a” in the instsys, i.e. most of these locales are
probably faked and just copies of en_US.UTF-8.
Comment 34 Mike Fabian 2008-06-11 15:26:33 UTC
By the way, RC4 does *not* crash for Korean!

The only language I could find so far where it crashes before the
timezone dialog is Japanese.

Comment 35 Martin Vidner 2008-06-11 15:33:41 UTC
(In reply to comment #34 from Mike Fabian)
> By the way, RC4 does *not* crash for Korean!
That is because of the workaround in comment 21.
Comment 36 Stephan Kulow 2008-06-15 18:59:49 UTC
Jiri, I suggest you do #27 for 11.0 tree anyway, so we can release it as driverupdate part. And please tell me which file the fix is in.
Comment 44 Stephan Kulow 2008-06-17 10:58:18 UTC
the driver update is now online, you can try.
Comment 46 Mike Fabian 2008-06-17 12:43:33 UTC
Works for me as well. Thank you!
Comment 47 Petr Baudis 2008-06-18 11:36:20 UTC
As for comment #32, I think it should be enough, provided that all the required locales are in the locale-archive. I don't know what builds the locale-archive for the CD root and what gets put inside there.

Ad comment #20, I'm sorry but I don't see what does that testcase show; sorting by length in en_US locale makes sense if all the korean characters fall in the same collation class in the en_US locale, which I assume is the case. What exactly is the trouble with wcscoll()?

Maybe I'm a bit confused - does yast crash because of unexpected sort result from glibc, or is the crash inside glibc?
Comment 48 Martin Vidner 2008-06-18 11:48:52 UTC
The crash happens inside std::sort. I take it that std::sort can crash if the comparison function behaves badly, like saying a<b, b<c, c<a. I am just not sure whether wcscoll can do that, even if it is missing the proper collation  data.

Steffen, I assume that the Asian collation data is omitted in the inst-sys. Would it be too large to include it?
Comment 49 Petr Baudis 2008-06-18 12:05:50 UTC
wcscoll() should never behave like that, I believe. It does simply report plenty of strings to be collate-equal to each other in the en_US locale, but should produce valid ordering.

Are you sure there is no memory corruption going on? Are you feeding only valid pointers to std::sort?
Comment 50 Steffen Winterfeldt 2008-06-18 12:22:09 UTC
We could in principle load them along with the other language dependend data
in 11.1. But I don't think this solves this problem. IMHO sorting Korean
is simply broken (see, e.g. bug 325992).
Comment 51 Martin Vidner 2008-06-18 12:26:18 UTC
Comment 49: I am pretty sure. Naturally I used valgrind, but it did not report problems until the moment of the crash (where it accessed array[-1] I think).

Comment 50: no, you misunderstand. According to US sorting rules, all Korean characters are equal. It works as designed and that bug is rightfully invalid.
Comment 52 Michael Matz 2008-06-23 15:27:01 UTC
It is true that std::sort will overrun an array if the sort function is not a
strict weak ordering.

Martin, were you meanwhile able to create some C++ testcase that shows wcscoll
really misbehaving as in being inconsistent?  Because AFAICS the testcase sort
of is consistent.  It sorts according to word-length, which of course is a 
useless sort, but nevertheless a sort.  In which of the SVN files is the
implementation of the sort function?  I was lost in the levels of indirections
starting somewhere at include/ycp/YCPCodeCompare.h .
Comment 53 Martin Vidner 2008-06-23 17:07:32 UTC
Created attachment 223821 [details]
korean-crash2.ycp

The attached single YCP file is enough to crash 11.0 yast. The key observation is that the same list of strings, when used in the previous c++ program, does not cause a crash. So that got me thinking about the only difference, the way we simulate strcoll in YCP: lsort (locale sort) of a pair.
I checked that when one sorts the Korean pair Manila-Macao both ways, the order comes out unchanged. That effectively means that we use "<=" instead of "<" and thus std::sort can get a<b && b<a from us.

The original ycp code tries to convert "<=" to "<" by using
	    return lsorted[0]:"" == a && a != b;
instead of
	    return lsorted[0]:"" == a;
but it fails because it needs to apply the collation operator before "!=".
This seems to do the job:
(*)	    return lsorted[0]:"" == a && lsorted != lsort([b, a]);

So I will apply (*) as the fix.
Now, from the viewpoint of the YCP interpreter, it would be really handy if there was a safe sorting function that would not crash on any weird comparison operator. Oh well.
Comment 54 Martin Vidner 2008-07-08 13:21:27 UTC
*** Bug 407133 has been marked as a duplicate of this bug. ***
Comment 55 Stephan Binner 2008-07-28 08:58:09 UTC
No Blocker according to http://en.opensuse.org/Bugs/Definitions
Comment 56 Martin Vidner 2008-10-16 18:06:04 UTC
Actually the correct way is:
(*)         return lsorted[0]:"" == a && lsorted == lsort([b, a]);
Yay to unit tests!

Fixed in SVN in yast2-country-2.17.22 and yast2-packager-2.17.30.
Jirka, Lada, please submit the packages.
Comment 57 Jiří Suchomel 2008-10-21 07:01:29 UTC
yast2-country-2.17.22 submitted.
Comment 58 Ladislav Slezák 2008-10-21 07:48:05 UTC
yast2-packager-2.17.30 has been submitted

Can we close the bug now?
Comment 59 Martin Vidner 2008-10-21 08:01:58 UTC
Thank you!