Bug 1100080

Summary: KDE Bug 389848 baloo crash on login causes havoc
Product: [openSUSE] openSUSE Distribution    Reporter: Stakanov Schufter <stakanov>
Component: Other    Assignee: Peter Varkoly <varkoly>
Status: RESOLVED DUPLICATE QA Contact: E-mail List <qa-bugs>
Severity: Enhancement    
Priority: P5 - None CC: anicka, boris, bwhiteley, chris, dmueller, duwe, fvogt, jfehlig, kkaempf, noga.dany, rsalevsky, thomasbechtold, trenn
Version: Leap 15.0   
Target Milestone: ---   
Hardware: x86-64   
OS: SUSE Other   
See Also: http://bugzilla.opensuse.org/show_bug.cgi?id=1136132
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: Supplemental output of konqi.
This is from the same system but another user.

Description Stakanov Schufter 2018-07-04 08:21:14 UTC
I am experiencing the following:
System with 8 GB RAM and 8 GB swap, Samsung SSD.
Opening a user session with kmail and POP accounts open, filters active.
Opening a second user session with kmail and IMAP active.
Leaving the machine idle (only automatic mail checks, with or without a browser open).
Then, as mentioned in bug 1097605, the machine shows:
with the 4.4 kernel and this configuration: a memory consumption of 2.5 GB;
with the original Leap 15 kernel: a memory consumption of 6-7 GB of RAM and up to 3.7 GB of swap.
All memory seems to be used by the two akonadi processes, where the akonadi process of the POP accounts takes the most.
Closing kmail frees about 1-1.5 GB of RAM. Stopping akonadi slowly frees about an additional 4 GB, but swap stays at 2 GB once it has been used.
As said, this is linked to how kernel 4.12 interacts with the system when idle.
This happens in 100% of sessions; it is sufficient to leave the user session open and let the system idle.
If the system is being actively used, memory does not grow in that way.

When waking the system after it has been idle, it works heavily on the SSD (apparently reading the needed data back from swap).
The system is sluggish and takes very long to respond; at times the mouse cursor freezes (and becomes free again after about a minute).
Closing the user session with kmail open eventually frees up to 90% of the claimed memory.
When both user sessions are open, the instance with IMAP accounts may also bloat, but only up to 2-3 GB for akonadi, without claiming the full 7 GB of RAM and 3 GB of swap.
If left idle for an estimated 3 hours or more, the system runs out of memory, the temperature goes critical, and the system shuts down.

All this, I have to point out, happens only with the combination of the standard Leap 15 kernel and akonadi/kmail. It does not happen if one uses Leap 15 with the 4.4 kernel from 42.3.
Comment 1 Stakanov Schufter 2018-10-26 07:17:52 UTC
This is actually:
https://bugs.kde.org/show_bug.cgi?id=389848

After this occurs you get on the machine:

a double instance of kmail
a memory leak filling up all memory and swap
several http.so instances and qt-webkit instances
a hung / dysfunctional akonadi, whether with postgres96 or with mariadb.

I do not know whether it would be possible to backport this to Leap, but it would be good, since the crash happens on a daily basis. Recovery requires a manual kill of all processes involved, an akonadictl fsck, then a restart of Kontact plus sudo swapoff -a and sudo swapon -a, and of course a system restart if you notice it too late, because the whole system eventually becomes unresponsive.
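For reference, the manual recovery steps above can be sketched as a small script. This is a hypothetical helper, not part of the report: akonadictl, swapoff and swapon are the real tools the reporter names, but the dry-run wrapper and the exact ordering are my assumptions. It only echoes the commands unless invoked with --run.

```shell
#!/bin/sh
# Sketch of the manual recovery described in the report (hypothetical helper).
# Dry-run by default: it only prints the commands. Pass --run to execute them.
DRY=echo
if [ "$1" = "--run" ]; then
    DRY=
fi

$DRY akonadictl stop     # stop the hung Akonadi server
$DRY akonadictl fsck     # consistency check of the Akonadi store
$DRY sudo swapoff -a     # force swapped pages back into RAM...
$DRY sudo swapon -a      # ...then re-enable swap
$DRY akonadictl start    # restart Akonadi; Kontact can then be reopened
```

In dry-run mode the script is safe to inspect before deciding to execute the steps for real.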

Output on my system is the following:

Application: Baloo File Indexing Daemon (baloo_file), signal: Aborted
Using host libthread_db library "/lib64/libthread_db.so.1".
[Current thread is 1 (Thread 0x7f7510af4100 (LWP 3014))]

Thread 3 (Thread 0x7f75056d0700 (LWP 5101)):
[KCrash Handler]
#6  0x00007f750e44f0e0 in raise () from /lib64/libc.so.6
#7  0x00007f750e4506c1 in abort () from /lib64/libc.so.6
#8  0x00007f750c213922 in mdb_assert_fail (env=0x55ced90d7160, expr_txt=expr_txt@entry=0x7f750c2153af "rc == 0", func=func@entry=0x7f750c215ce8 <__func__.6935> "mdb_page_dirty", line=line@entry=2071, file=0x7f750c215390 "mdb.c") at mdb.c:1487
#9  0x00007f750c208e05 in mdb_page_dirty (txn=0x55ced90d8520, mp=<optimized out>) at mdb.c:2071
#10 0x00007f750c209fea in mdb_page_alloc (num=num@entry=1, mp=mp@entry=0x7f75056cf088, mc=<optimized out>) at mdb.c:2252
#11 0x00007f750c20a259 in mdb_page_touch (mc=mc@entry=0x7f75056cf5c0) at mdb.c:2370
#12 0x00007f750c20bd2f in mdb_cursor_touch (mc=mc@entry=0x7f75056cf5c0) at mdb.c:6273
#13 0x00007f750c20eeee in mdb_cursor_put (mc=0x7f75056cf5c0, key=0x7f75056cf9a0, data=0x7f75056cf9b0, flags=<optimized out>) at mdb.c:6407
#14 0x00007f750c2119ab in mdb_put (txn=0x55ced90d8520, dbi=4, key=key@entry=0x7f75056cf9a0, data=data@entry=0x7f75056cf9b0, flags=flags@entry=0) at mdb.c:8765
#15 0x00007f750fad8faf in Baloo::DocumentDB::put (this=this@entry=0x7f75056cfa60, docId=<optimized out>, docId@entry=35466169188154883, list=...) at /usr/src/debug/baloo5-5.45.0-lp150.2.1.x86_64/src/engine/documentdb.cpp:77
#16 0x00007f750faf1d85 in Baloo::WriteTransaction::addDocument (this=0x7f34f80055e0, doc=...) at /usr/src/debug/baloo5-5.45.0-lp150.2.1.x86_64/src/engine/writetransaction.cpp:62
#17 0x00007f750faed479 in Baloo::Transaction::addDocument (this=this@entry=0x7f75056cfb90, doc=...) at /usr/src/debug/baloo5-5.45.0-lp150.2.1.x86_64/src/engine/transaction.cpp:226
#18 0x000055ced7d42596 in Baloo::NewFileIndexer::run (this=0x55ced92af730) at /usr/src/debug/baloo5-5.45.0-lp150.2.1.x86_64/src/file/newfileindexer.cpp:72
#19 0x00007f750ef5a372 in QThreadPoolThread::run (this=0x55ced927a550) at thread/qthreadpool.cpp:99
#20 0x00007f750ef5d0ce in QThreadPrivate::start (arg=0x55ced927a550) at thread/qthread_unix.cpp:368
#21 0x00007f750d568559 in start_thread () from /lib64/libpthread.so.0
#22 0x00007f750e51182f in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f7506213700 (LWP 3018)):
#0  0x00007f750e50708b in poll () from /lib64/libc.so.6
#1  0x00007f750abf0109 in ?? () from /usr/lib64/libglib-2.0.so.0
#2  0x00007f750abf021c in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#3  0x00007f750f180c0b in QEventDispatcherGlib::processEvents (this=0x7f7500000b10, flags=...) at kernel/qeventdispatcher_glib.cpp:425
#4  0x00007f750f12909a in QEventLoop::exec (this=this@entry=0x7f7506212ca0, flags=..., flags@entry=...) at kernel/qeventloop.cpp:212
#5  0x00007f750ef584da in QThread::exec (this=this@entry=0x7f751060dd60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread.cpp:515
#6  0x00007f751039d985 in QDBusConnectionManager::run (this=0x7f751060dd60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at qdbusconnection.cpp:178
#7  0x00007f750ef5d0ce in QThreadPrivate::start (arg=0x7f751060dd60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread_unix.cpp:368
#8  0x00007f750d568559 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f750e51182f in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f7510af4100 (LWP 3014)):
#0  0x00007f750e50708b in poll () from /lib64/libc.so.6
#1  0x00007f750abf0109 in ?? () from /usr/lib64/libglib-2.0.so.0
#2  0x00007f750abf021c in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#3  0x00007f750f180bef in QEventDispatcherGlib::processEvents (this=0x55ced90cb8d0, flags=...) at kernel/qeventdispatcher_glib.cpp:423
#4  0x00007f750f12909a in QEventLoop::exec (this=this@entry=0x7ffea34c89e0, flags=..., flags@entry=...) at kernel/qeventloop.cpp:212
#5  0x00007f750f1319e4 in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1289
#6  0x000055ced7d3a21c in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/baloo5-5.45.0-lp150.2.1.x86_64/src/file/main.cpp:104
[Inferior 1 (process 3014) detached]


I am filing this as an enhancement, since I do not know whether it is possible to backport this at all.
Comment 2 Fabian Vogt 2018-10-26 10:51:26 UTC
lmdb bug, reassigning.
Comment 3 Stakanov Schufter 2018-10-31 18:47:57 UTC
Created attachment 788083 [details]
Supplemental output of konqi.

This normally does not add a lot of information, but for the sake of completeness, and since this crash occurred not at startup but during a session, I attach the specific output of konqi here.
Happy to provide any further information if needed.
Comment 4 Stakanov Schufter 2018-11-14 14:13:19 UTC
Created attachment 789687 [details]
This is from the same system but another user.

The attachment contains a baloo crash from another user of the same system. I include it because it could be interesting, given that the crash frequency between the two users is very different: mercurio (the user of this report) crashes regularly every day, within minutes of every login, and has POP3 accounts; entropia (the user of this attachment) almost never crashes, maybe once in 50 logins or fewer, does not crash in the same way, crashes only after considerable uptime, and has only IMAP accounts.
So for the sake of completeness I include it.
Comment 5 Stakanov Schufter 2019-07-06 09:39:11 UTC
FYI:
this is fixed in 15.1, provided you erase all old search data and reindex everything.
In 15.0, however, this procedure does not solve it.
So: fixed in 15.1, but the problem continues to exist in 15.0.
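The "erase all old search data and reindex" procedure mentioned above could look roughly like the following. This is a sketch under assumptions: balooctl is Baloo's real control tool, but the exact subcommands can vary between Frameworks versions, and ~/.local/share/baloo is the usual index location rather than one confirmed by the report.

```shell
#!/bin/sh
# Sketch of the reindex workaround (hypothetical; verify paths and
# balooctl subcommands for your Baloo version before running).
# Dry-run by default: it only prints the commands. Pass --run to execute.
DRY=echo
if [ "$1" = "--run" ]; then
    DRY=
fi

$DRY balooctl disable                 # stop baloo_file and disable indexing
$DRY rm -rf "$HOME/.local/share/baloo"  # remove the (possibly corrupt) lmdb index
$DRY balooctl enable                  # re-enable indexing; a full reindex starts
```

Removing the index discards the lmdb database that the backtrace shows asserting in mdb_page_dirty, so a fresh reindex starts from a clean store.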
Comment 6 Daniel Noga 2019-07-06 11:07:57 UTC
(In reply to Stakanov Schufter from comment #5)
> FYI:
> this is fixed in 15.1 provided you erase all old search data and reindex
> all. 
> In 15.0 this is not solved by this procedure instead. 
> So fixed in 15.1 but continues to exist in 15.0.

Are you sure? 15.1 uses the same lmdb as 15.0. I think it was only bad luck that the workaround did not work in 15.0 for you. I had no problem in Leap 15.0, and it started only with 15.1 for me (the delete-database workaround resolved it).
Comment 7 Stakanov Schufter 2019-07-07 20:55:06 UTC
(In reply to Daniel Noga from comment #6)
> (In reply to Stakanov Schufter from comment #5)
> > FYI:
> > this is fixed in 15.1 provided you erase all old search data and reindex
> > all. 
> > In 15.0 this is not solved by this procedure instead. 
> > So fixed in 15.1 but continues to exist in 15.0.
> 
> Are you sure? 15.1 uses the same lmdb as 15.0. I think it is only bad luck
> that workaround did not work in 15.0 for you . I had no problem in Leap 15.0
> and it started only with 15.1 for me (delete database workaround resolved
> it).

Unfortunately, yes. As soon as I installed 15.1 I had even worse crashes with the indexer, but after going through the procedure once again, it finally worked.
So for me it did make a difference. Why it did not work before, I cannot tell you, because with 15.0 I performed exactly the same procedure.
Anyway, I confirm that it works in 15.1 with the procedure. If nobody else experiences the problem in 15.0, then we may close this. Seeing the number of duplicates of this bug in the KDE bugzilla, however, I have some doubts.
Normally, shortly after this crash I experienced a memory leak in kmail. This too is gone in 15.1.
Comment 8 Peter Varkoly 2019-07-10 13:43:29 UTC
Dupl

*** This bug has been marked as a duplicate of bug 1136132 ***
Comment 9 Stakanov Schufter 2019-07-13 06:35:09 UTC
(In reply to Peter Varkoly from comment #8)
> Dupl
> 
> *** This bug has been marked as a duplicate of bug 1136132 ***

So a bug from 2018 becomes a duplicate of a bug from 2019, with no answer since the original report?
I am confused. But O.K., as long as you take care of it.