Bug 590242

Summary: MALLOC_CHECKING reminder (turn off) hwinfo --framebuffer suddenly takes very long
Product: [openSUSE] openSUSE 11.3 Reporter: Stefan Seyfried <seife>
Component: BasesystemAssignee: Marcus Meissner <meissner>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P2 - High CC: snwint
Version: Factory   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: Community User Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: output of "mdt --modes --show dump"
malloc test program
test run logs
updated test program

Description Stefan Seyfried 2010-03-22 16:26:42 UTC
Always running current FACTORY, this bug appeared ~4 weeks ago.

strolchi:~ # time hwinfo --framebuffer
02: None 00.0: 11001 VESA Framebuffer                           
  [Created at bios.459]
  Unique ID: rdCR.bQVdqIybBKC
  Hardware Class: framebuffer
  Model: "ATI P10"
  Vendor: "ATI Technologies Inc."
  Device: "P10"
  SubVendor: "ATI MOBILITY RADEON 9600"
  SubDevice: 
  Revision: "01.00"
  Memory Size: 31 MB + 960 kB
  Memory Range: 0x98000000-0x99feffff (rw)
  Mode 0x0382: 320x200 (+320), 8 bits

[... lots of modes... ]

  Mode 0x0344: 1400x1050 (+5600), 24 bits
  Config Status: cfg=no, avail=yes, need=no, active=unknown

real    0m42.932s
user    0m3.566s
sys     0m16.508s


It is running with 100% CPU usage, in stage
> bios.4.2: mode info                                           
most of the time.

The run time varies between ~25 and ~50 seconds, depending on idleness of the machine.

Hardware is a Compaq nc6000 with Pentium M 1.7GHz and 1GB of RAM.
Comment 1 Stefan Seyfried 2010-03-22 16:28:51 UTC
Did you eventually switch to libx86 or similar CPU emulation?

It's especially annoing since hwinfo --framebuffer is called during mkinitrd, apparently for every kernel image, which can lead to quite some waiting ;)
Comment 2 Steffen Winterfeldt 2010-03-22 16:59:30 UTC
Your cpu is probably in some deep sleep mode or so. :-)

Pick up 'mdt' from BS and try 'mdt --modes'. Does it also take that long?
If so, please attach 'mdt --modes --show dump'.
Comment 3 Stefan Seyfried 2010-03-23 08:11:14 UTC
Created attachment 349949 [details]
output of "mdt --modes --show dump"

Yes. Each mode seems to take ~500 ms to get.
Additionally, both hwinfo and mdt seem to use obscene amounts of memory:

strolchi:~ # /usr/bin/time mdt --modes > /dev/null 
2.69user 17.20system 0:25.80elapsed 77%CPU (0avgtext+0avgdata 2355200maxresident)k
16inputs+0outputs (1major+3039779minor)pagefaults 0swaps
Comment 4 Steffen Winterfeldt 2010-03-23 10:07:45 UTC
The log shows about 2ms per mode. So that is not the problem. Is
'mdt --mode 0x0101' also that slow? Is it also using 2GB of memory?

Is that a 32 bit machine?
Comment 5 Stefan Seyfried 2010-03-23 11:40:24 UTC
Yes, 32bits, non-PAE, kernel-desktop

strolchi:~ # /usr/bin/time mdt --mode 0x0101
  0x0101[bb]: 640x480+640, 8 bpp, fb: 0x98000000, a0000.7/a0000.0: 64k
0.35user 0.83system 0:01.86elapsed 63%CPU (0avgtext+0avgdata 2355200maxresident)k
0inputs+0outputs (0major+147363minor)pagefaults 0swaps

strolchi:~ # /usr/bin/time hwinfo --framebuffer > /dev/null 
3.63user 16.65system 0:32.91elapsed 61%CPU (0avgtext+0avgdata 1585856maxresident)k
7600inputs+0outputs (5major+3039276minor)pagefaults 0swaps

So mdt --mode 0x0101 also takes long (almost two seconds) and uses lots of ram. hwinfo --framebuffer in fact does not need *that much* of it (still enough ;)
Comment 6 Steffen Winterfeldt 2010-03-23 12:44:50 UTC
I somehow suspect it's a gcc45 thing.
Comment 7 Steffen Winterfeldt 2010-03-23 12:57:03 UTC
Ok, I can reproduce it. I'll check.
Comment 8 Steffen Winterfeldt 2010-03-23 14:34:58 UTC
Created attachment 350048 [details]
malloc test program

The kernel makes the difference. The attached test program gets oom-killed
in the second mmap() if I call it with e.g.'t3 2 200000000' on a 512MB machine.

For you it will probably first fill up swap, then get killed.

I don't know if it is intentional that you can't use more virtual memory
than physical memory. Looks like a bug to me.
Comment 9 Steffen Winterfeldt 2010-03-23 14:55:50 UTC
After playing a bit with /proc/sys/vm/overcommit_memory it seems
the problem is that now the kernel seems to explicitly initialize (!?)
the memory with 0.
Comment 10 Jeff Mahoney 2010-03-23 14:59:07 UTC
Wait what?
Comment 11 Stefan Seyfried 2010-03-23 15:27:45 UTC
seife@strolchi:/tmp> ./t3 16 200000000
16 x 200000000

works for me :-) Swaps like hell, but does not kill anything.
I have 1.5G of swap because of hibernation:

seife@strolchi:/tmp> free
             total       used       free     shared    buffers     cached
Mem:       1030116     375620     654496          0       1672     149744
-/+ buffers/cache:     224204     805912
Swap:      1574360     327188    1247172
Comment 12 Steffen Winterfeldt 2010-03-23 15:48:22 UTC
Yes, but it shouldn't swap; that's the point. It didn't do it before.
Comment 13 Jeff Mahoney 2010-03-23 17:42:12 UTC
I don't think this is a kernel issue. I think it's a glibc issue.

When I boot the openSUSE 11.3 M3 kernel on my SLES11 SP1 install, I can't reproduce the issue.

When I boot the openSUSE 11.2 kernel on my Factory install, I can reproduce the issue.

Upping the priority since this will affect performance all over the place.
Comment 14 Steffen Winterfeldt 2010-03-24 10:46:29 UTC
You are right; when I replace malloc() with mmap() the problem is gone.
Comment 15 Petr Baudis 2010-03-25 16:19:37 UTC
Huh, I'm confused... SLE11SP1 and Factory use virtually the same glibc.

I cannot reproduce the problem with SLE11 kernel and HEAD userspace; the testcase returns immediately even with 16 200000000.

Can you post a strace of the problematic testcase run? Do we have any SUSE-internal machine exhibiting the issue?
Comment 16 Steffen Winterfeldt 2010-03-25 16:52:47 UTC
Created attachment 350606 [details]
test run logs

I also can't reproduce this in a factory chroot with 11.2 kernel. That's
why I assigned it to kernel in the first place.

Fact is, though, that running factory on a factory kernel shows a difference
between using malloc() and mmap(). The attached log (taken in a vmware vm)
shows the time between mmap() calls increasing because it starts swapping
if the memory is used up. If I take the swap away, the process gets oom-killed.
Comment 17 Petr Baudis 2010-03-25 17:02:46 UTC
Hm, but what is happenning between the calls is then a question. Does unfiltered strace show anything extra? Doesn't the small extra mmap() size make a difference?
Comment 18 Steffen Winterfeldt 2010-03-25 17:03:26 UTC
Created attachment 350615 [details]
updated test program

In case I'm missing something obvious, here is the test program I used.
Comment 19 Steffen Winterfeldt 2010-03-25 17:06:03 UTC
No there are no other syscalls between the large mmaps().
Comment 20 Steffen Winterfeldt 2010-03-25 17:13:52 UTC
> Doesn't the small extra mmap() size make a difference?

No. Using the values from the malloc() log for mmap() shows no difference.
Comment 21 Petr Baudis 2010-03-25 17:19:46 UTC
Ok, can you please also check your environment for funky stuff? export | grep MALLOC_.

I still can't see where could it spend the extra time otherwise. :( Would it be possible to oprofile it?
Comment 22 Steffen Winterfeldt 2010-03-26 10:02:13 UTC
I had it running in the rescue system from our install dvd. I don't think
there's anything unusual in the environment there.
Comment 23 Stefan Seyfried 2010-03-26 12:53:41 UTC
Petr, you're dead on:
strolchi:~ # /usr/bin/time hwinfo --framebuffer >/dev/null
3.34user 16.89system 0:29.99elapsed 67%CPU (0avgtext+0avgdata 1586256maxresident)k
5480inputs+0outputs (10major+3039317minor)pagefaults 0swaps
strolchi:~ # MALLOC_CHECK_= MALLOC_PERTURB_= /usr/bin/time hwinfo --framebuffer > /dev/null 
1.10user 0.07system 0:01.67elapsed 70%CPU (0avgtext+0avgdata 20688maxresident)k
1160inputs+0outputs (0major+9544minor)pagefaults 0swaps
strolchi:~ # env|grep MALLOC
MALLOC_CHECK_=3
MALLOC_PERTURB_=69
strolchi:~ #
Comment 24 Steffen Winterfeldt 2010-03-26 13:04:33 UTC
The difference is MALLOC_PERTURB_ = E vs. 69.
Comment 25 Steffen Winterfeldt 2010-03-26 13:09:35 UTC
What is 'E' (on my 11.2 machine) supposed to mean? An alias for 0?
Comment 26 Petr Baudis 2010-05-10 21:42:21 UTC
Yes, anything non-numeric is alias for 0 aka "don't perturb" (perturb will fill allocated/freed space with non-zero bytes).

What on your system is setting these environment variables? My system has nothing of this. We should get rid of them if this is some OpenSUSE default now.
Comment 27 Stefan Seyfried 2010-05-11 06:42:59 UTC
it is default enabled on FACTORY, which is not too bad, given that FACTORY is for adventurous people who like to find bugs ;)

seife@susi:~> env|grep MALLOC
MALLOC_CHECK_=3
MALLOC_PERTURB_=69

I did fix it for me by not doing it for root like that (added the "if"):

susi:~ # cat /etc/profile.d/malloc-debug.sh 
# remove this file before shipment
if test x"$UID" != x0; then
export MALLOC_CHECK_=3
export MALLOC_PERTURB_=69
fi
Comment 28 Petr Baudis 2010-05-11 08:44:51 UTC
Okay, then I'm reassigning this to Coolo to decide whether we want to change or document the defaults in any way. Thank you all.
Comment 29 Stephan Kulow 2010-05-28 14:36:56 UTC
no, we will get rid of this file before the final. But they are there (also for root) for a reason - to find bugs
Comment 30 Marcus Meissner 2010-05-28 14:44:43 UTC
it also shows that hwinfo --framebuffer does way too much allocations ;)
Comment 31 Marcus Meissner 2010-06-23 11:44:45 UTC
malloc checking is now off in rc1.