Bug 557609

Summary: preload gets stuck in read loop on boot has to be killed
Product: [openSUSE] openSUSE 11.2 Reporter: Alex Tsariounov <alext>
Component: BasesystemAssignee: Stephan Kulow <coolo>
Status: RESOLVED WONTFIX QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P2 - High CC: dutchkind, matz, vladimir.psenicka
Version: Final   
Target Milestone: ---   
Hardware: x86-64   
OS: openSUSE 11.2   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: My current prepared file
preload strace log

Description Alex Tsariounov 2009-11-22 17:48:36 UTC
On and after boot, the preload gets stuck in a read loop that consumes all IO for the disks and interferes with normal operation.  

Iotop output as follows:

  PID USER      DISK READ  DISK WRITE   SWAPIN    IO>    COMMAND                                                      
  461 root       57.08 M/s       0 B/s  0.00 % 99.99 % preload /var/cache/preload/prepared

The preload process reads at 34 to 57 MB/s reading the prepared file of size 4.5 MB seemingly forever.  After 8 minutes, I kill the preload process and the machine IO goes back to zero.  Contents of /var/cache/preload are:

[zuul:Desktop]$ ll -h /var/cache/preload/
total 7.1M
-rw-r--r-- 1 root root  30K 2009-11-15 13:15 Firefox.preload
-rw-r--r-- 1 root root  58K 2009-11-15 13:15 gdm.preload
-rw-r--r-- 1 root root 284K 2009-11-15 13:15 kde.preload
-rw-r--r-- 1 root root  21K 2009-11-15 13:15 kdm.auto.preload
-rw-r--r-- 1 root root  38K 2009-11-15 13:15 kdm.preload
-rw-r--r-- 1 root root 230K 2009-11-15 13:15 Khelpcenter.preload
-rw-r--r-- 1 root root  81K 2009-11-15 13:15 OpenOffice.preload
-rw-r--r-- 1 root root 4.5M 2009-11-21 11:45 prepared
-rw-r--r-- 1 root root 1.9M 2009-11-21 11:45 trace.gz

This happens almost on every boot.
Comment 1 Stephan Kulow 2009-12-08 13:29:13 UTC
what is in your prepared file?
Comment 2 Alex Tsariounov 2009-12-08 16:32:39 UTC
Created attachment 331590 [details]
My current prepared file

Attached is my prepared file.  Currently my workaround is to add "nopreload" to the kernel cmdline.
Comment 3 Stephan Kulow 2010-01-21 16:06:51 UTC
you have a lot of files referenced in there and it tries to read them all - so it's not suprising it takes a while. 8 minutes is a bit much though.

Can you strace preload to see what it works on?
Comment 4 David Kerkhof 2010-02-27 09:15:39 UTC
I have the same. I think that this came after the last update I installed 2 days ago which included a lot of KDE 4.3 files. Since then I notice a high cpu usage and a sluggish system. I killed the preload process and now the system is quiet again.
Comment 5 Alex Tsariounov 2010-02-27 18:15:41 UTC
Sorry, but I've not had time to debug the issue yet.  I get around it by putting "nopreload" into /etc/sysconfig/bootloader for the kernel boot options.  I don't notice any degradation from not preloading, but did notice degradation when preloading was active.  I'll do an strace soon. One thing to note is that I never intentionally added files to the prepared file (don't even know how to do that).  The preload file was generated for me by install and day to day operation. Thanks.
Comment 6 Vladimir Psenicka 2010-03-27 09:48:32 UTC
Hi. 
I have same issues with preload. After boot my system is unresponsive for about 10 minutes and process "preload /var/cache/preload/prepared" consumes 40-80MB/s in iotop disk read. I think this slowness came after update to KDE 4.3.5.
Comment 7 Vladimir Psenicka 2010-03-27 10:12:12 UTC
Created attachment 350987 [details]
preload strace log

After slow preload (about 10 minutes) I restarted PC and straced preload process, but this time preload ended quick (about 2 minutes)
Comment 8 Stephan Kulow 2010-06-25 11:13:32 UTC
I can't make any sense out of the given data ;(

I always hoped someone would come with something more that would add the missing puzzle piece, but from what I have the only theory I have is that the filesystem is _very_ defragmented causing too many blocks to be read. And I'm not sure what I can do here. 

The only thing I can think of is adding a limit on how long it should run, so it stops e.g. after 2 minutes. Michael, what do you think? Any idea for a better limit of sanity?
Comment 9 Michael Matz 2010-06-25 12:05:33 UTC
The values in comment #7 strace log are completely bollocks.  Repeated fadvices
on files 10,11,12,13,17 with sizes and offsets being nonsensical:

E.g.:

370   1269684166.071359 fadvise64(13, 3422535680, 8388608, POSIX_FADV_WILLNEED)
370   1269684166.290542 fadvise64(12, 1653669888, 7274496, POSIX_FADV_WILLNEED)

So, file 13 at offset 3.4 GB, reading 8MB in size.  And so on ...

Unfortunately the strace doesn't show us which files are connected with those
file descriptors.  But files of such size can't possibly be interesting
files to preload.  Such files normally shouldn't be involved in the startup
process at all.  One guess would be for instance some log file gone wild,
that the preload tracer sees as being opened and hence like to read in
completely.

Instead of introducing an artificial limit on filesize (size > N --> don't
preload at all), or on read data size (ala always read in N MB max, per file),
I'd first like to know which files are the ones being so large.

But as stop-gap measure, I do think it makes sense to not preload, say,
8 MB per file (or to ignore all files > say 16 MB, this needs to be a bit
large so as to not ignore some of the very large shared libraries we have).

One other idea: if the above files are indeed logfiles or the like, then
they are probably opened O_WRONLY.  We can map such open calls to just
being stat()s, instead of causing a full read-in.
Comment 10 Stephan Kulow 2010-09-23 11:04:16 UTC
O_WRONLY is already ignored - it won't even be in the log file. So a size limit of what's read from a file has to work
Comment 11 Stephan Kulow 2011-09-01 11:42:37 UTC
I dropped preload for 12.1