Bug 679092

Summary: kernel-desktop debuginfo missing - unable to debug kernel crash
Product: [openSUSE] openSUSE 11.4 Reporter: Jon Nelson <jnelson-suse>
Component: Basesystem Assignee: Michal Marek <mmarek>
Status: RESOLVED WONTFIX QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: antoine.mechelynck, forgotten_sxozS5NPY1, ptesarik
Version: Final   
Target Milestone: ---   
Hardware: x86-64   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Jon Nelson 2011-03-12 02:58:13 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b12) Gecko/20110222 Firefox/4.0b12

My openSUSE 11.4 install (kernel-desktop) oopses (almost) every time I unplug my USB audio headset.

I captured a kernel crash dump, but 'crash' claims that the debuginfo can't be found:

worklaptop:/var/crash # crash 2011-03-11-19\:29/System.map-2.6.37.1-1.2-desktop 2011-03-11-19\:29/vmlinux-2.6.37.1-1.2-desktop 2011-03-11-19\:29/vmcore 
...
crash: 2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop: no debugging data available


worklaptop:/var/crash # rpm -qa | grep kernel-desktop
kernel-desktop-2.6.37.1-1.2.2.x86_64
kernel-desktop-devel-2.6.37.1-1.2.2.x86_64
kernel-desktop-devel-debuginfo-2.6.37.1-1.2.2.x86_64
kernel-desktop-debuginfo-2.6.37.1-1.2.2.x86_64
kernel-desktop-base-2.6.37.1-1.2.2.x86_64
kernel-desktop-base-debuginfo-2.6.37.1-1.2.2.x86_64
worklaptop:/var/crash # 


What's more, boot.kdump is *not* in /etc/init.d/boot.d

and trying to enable it results in this:

worklaptop:/var/crash # chkconfig --add boot.kdump 
insserv: FATAL: service network is missed in the runlevels 4 to use service SuSEfirewall2_setup
insserv: exiting now!
/sbin/insserv failed, exit code 1
boot.kdump                0:off  1:off  2:off  3:off  4:off  5:off  6:off
worklaptop:/var/crash # 

Finally, 'openSUSE 11.4' is missing from the Operating System dropdown in Bugzilla.





Reproducible: Always

Comment 1 Jiri Slaby 2011-03-14 11:14:07 UTC
(In reply to comment #0)
> User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b12) Gecko/20110222
> Firefox/4.0b12
> 
> My openSUSE 11.4 install (kernel-desktop) oopses (almost) every time I unplug
> my USB audio headset.
> 
> I made a kernel crash, but 'crash' claims that the debuginfo can't be found:
> 
> worklaptop:/var/crash # crash 2011-03-11-19\:29/System.map-2.6.37.1-1.2-desktop
> 2011-03-11-19\:29/vmlinux-2.6.37.1-1.2-desktop 2011-03-11-19\:29/vmcore 
> ...
> crash: 2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop: no debugging data
> available

Where does this path come from? Does
gunzip -k /boot/vmlinux-2.6.37.1-1.2-desktop.gz
help?

> What's more, boot.kdump is *not* in /etc/init.d/boot.d
> 
> and trying to enable it results in this:
> 
> worklaptop:/var/crash # chkconfig --add boot.kdump 
> insserv: FATAL: service network is missed in the runlevels 4 to use service
> SuSEfirewall2_setup
> insserv: exiting now!

Please report that as a separate bug.
Comment 2 Jon Nelson 2011-03-14 15:16:20 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b12) Gecko/20110222
> > Firefox/4.0b12
> > 
> > My openSUSE 11.4 install (kernel-desktop) oopses (almost) every time I unplug
> > my USB audio headset.
> > 
> > I made a kernel crash, but 'crash' claims that the debuginfo can't be found:
> > 
> > worklaptop:/var/crash # crash 2011-03-11-19\:29/System.map-2.6.37.1-1.2-desktop
> > 2011-03-11-19\:29/vmlinux-2.6.37.1-1.2-desktop 2011-03-11-19\:29/vmcore 
> > ...
> > crash: 2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop: no debugging data
> > available
> 
> Where does this path come from? Does
> gunzip -k /boot/vmlinux-2.6.37.1-1.2-desktop.gz
> help?

The path? /var/crash/2011-03-11-19:29/ is a directory. It was created by the kdump package after the kernel crashed.

Within that directory are the following files (all placed there by kdump):

README.txt
System.map-2.6.37.1-1.2-desktop
vmcore
vmlinux-2.6.37.1-1.2-desktop.gz

vmlinux-... is an exact copy of the file from /boot.
I tried 'crash' on it both as provided (gzipped) and after gunzipping it.

An strace of 'crash' shows that it fails to find the right debug symbols:

...
access("/usr/lib/debug/.build-id/75/6e58827a1e0c4f90c96272f71ccd2e64161fe1.debug", F_OK) = -1 ENOENT (No such file or directory)
lseek(5, 18980864, SEEK_SET)            = 18980864
read(5, "OspWG\0OspWg\0OSpWG\0GCC: (SUSE Lin"..., 4096) = 4096
lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/crash", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat("/var/crash/2011-03-11-19:29", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/var/crash/2011-03-11-19:29/.debug/vmlinux-2.6.37.1-1.2-desktop.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/debug//var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/debug/var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
close(5)                                = 0
munmap(0x7f0b87a56000, 4096)            = 0
open("/var/crash/2011-03-11-19:29/.gdb_history", O_RDONLY) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
write(1, "\n", 1
)                       = 1
write(1, "crash: vmlinux-2.6.37.1-1.2-desk"..., 65crash: vmlinux-2.6.37.1-1.2-desktop: no debugging data available
) = 65
write(1, "\n", 1
)                       = 1
exit_group(1)                           = ?



As noted in the initial report, I do have the debuginfo packages installed.
Comment 3 Jiri Slaby 2011-03-14 15:38:46 UTC
(In reply to comment #2)
> access("/usr/lib/debug/.build-id/75/6e58827a1e0c4f90c96272f71ccd2e64161fe1.debug",
> F_OK) = -1 ENOENT (No such file or directory)
> lseek(5, 18980864, SEEK_SET)            = 18980864
> read(5, "OspWG\0OspWg\0OSpWG\0GCC: (SUSE Lin"..., 4096) = 4096
> lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> lstat("/var/crash", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
> lstat("/var/crash/2011-03-11-19:29", {st_mode=S_IFDIR|0755, st_size=4096, ...})
> = 0
> open("/var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug",
> O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/var/crash/2011-03-11-19:29/.debug/vmlinux-2.6.37.1-1.2-desktop.debug",
> O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/usr/lib/debug//var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug",
> O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/usr/lib/debug/var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop.debug",
> O_RDONLY) = -1 ENOENT (No such file or directory)

Aha, I see. So either kdump doesn't put the debuginfo there, or crash looks in the wrong places. Petr/Bernard, what's the correct way to fix this (in crash or in kdump)?
Comment 4 Forgotten User sxozS5NPY1 2011-03-14 17:47:20 UTC
The debug info file should be

/usr/lib/debug/boot/vmlinux-2.6.37.1-1.2-desktop.debug

For crash to find it there, you have to invoke it like this:

  % crash /boot/vmlinux-2.6.37.1-1.2-desktop  path/to/vmcore

Or just symlink it 

  % cd /var/crash/2011-03-11-19:29 
  % ln -s /usr/lib/debug/boot/vmlinux-2.6.37.1-1.2-desktop.debug .
  % crash vmlinux-2.6.37.1-1.2-desktop vmcore
Comment 5 Jon Nelson 2011-03-14 17:54:14 UTC
Symlinking it worked!

Does this imply a bug in the 'kdump' or 'crash' packages, or in the -debuginfo package?


Now that I can look at the crash, should I just file a new bug for the crasher?
Comment 6 Jiri Slaby 2011-03-14 17:57:56 UTC
(In reply to comment #5)
> Now that I can look at the crash, should I just file a new bug for the crasher?

Yes, please. And make a note here.
Comment 7 Jon Nelson 2011-03-14 18:00:50 UTC
Bug 679484 just filed regarding the actual crash.
Comment 8 Forgotten User sxozS5NPY1 2011-03-14 18:06:33 UTC
(In reply to comment #5)
> symlinking it worked!
> 
> Does this imply a bug in the 'kdump' or 'crash' packages, or in the -debuginfo
> package?

It's not a bug in crash; it works as designed[tm].

But I wonder why kdump didn't copy the debuginfo, too. Can you please attach /etc/sysconfig/kdump (you can remove the comments and empty lines if you like)? Did you install the debuginfo package *before* or *after* the crash dump was created?
Comment 9 Jon Nelson 2011-03-14 18:30:47 UTC
The only change I made is to change the KDUMP_SAVEDIR (I have a separate /var, so I just added /crashes to the root FS).


KDUMP_KERNELVER=""
KDUMP_COMMANDLINE=""
KDUMP_COMMANDLINE_APPEND=""
KEXEC_OPTIONS=""
KDUMP_IMMEDIATE_REBOOT="yes"
KDUMP_TRANSFER=""
KDUMP_SAVEDIR="file:///crashes"
KDUMP_KEEP_OLD_DUMPS="5"
KDUMP_FREE_DISK_SIZE="64"
KDUMP_VERBOSE="3"
KDUMP_VERBOSE="31"
KDUMP_DUMPLEVEL="0"
KDUMP_DUMPFORMAT="compressed"
KDUMP_CONTINUE_ON_ERROR="false"
KDUMP_REQUIRED_PROGRAMS=""
KDUMP_PRESCRIPT=""
KDUMP_POSTSCRIPT=""
KDUMP_COPY_KERNEL="yes"
KDUMPTOOL_FLAGS=""
KDUMP_NETCONFIG="auto"
KDUMP_SMTP_SERVER=""
KDUMP_SMTP_USER=""
KDUMP_SMTP_PASSWORD=""
KDUMP_NOTIFICATION_TO=""
KDUMP_NOTIFICATION_CC=""




I believe the debuginfo was installed after the crash.
However, I feel that doing so should not be necessary.  'crash' should be able to find the debuginfo files (once they are installed), IMO.  I humbly request you consider this use-case. It's hard enough to get kdump to work as it is for most users.
Comment 10 Petr Tesařík 2011-03-31 08:25:33 UTC
It's actually a consequence of gdb's default search path. The trouble here is that the debuginfo file is stored somewhere under /usr/lib/debug, but the vmlinux file doesn't really say where.
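
To illustrate the search path issue, here is a minimal sketch that prints the candidate locations for a separate debuginfo file, in the order they appear in the strace output from comment 2. It only mirrors that trace and is not the actual gdb or crash logic; the function name debug_candidates is a hypothetical helper introduced for this example.

```shell
#!/bin/sh
# debug_candidates is a hypothetical helper (not part of crash or gdb).
# It prints the debuginfo locations seen in the strace from comment 2.
# The trace also shows a "/usr/lib/debug//..." variant with a doubled
# slash; it names the same file, so it is printed only once here.
debug_candidates() {
    vmlinux=$1      # absolute path to the uncompressed vmlinux
    build_id=$2     # hex build-id from the NT_GNU_BUILD_ID note
    dir=$(dirname "$vmlinux")
    name=$(basename "$vmlinux")
    rest=${build_id#??}          # build-id without its first two hex digits
    prefix=${build_id%"$rest"}   # the first two hex digits
    printf '%s\n' \
        "/usr/lib/debug/.build-id/$prefix/$rest.debug" \
        "$dir/$name.debug" \
        "$dir/.debug/$name.debug" \
        "/usr/lib/debug$dir/$name.debug"
}

# Example:
#   debug_candidates /var/crash/2011-03-11-19:29/vmlinux-2.6.37.1-1.2-desktop \
#       756e58827a1e0c4f90c96272f71ccd2e64161fe1
```

None of the printed paths is /usr/lib/debug/boot/vmlinux-2.6.37.1-1.2-desktop.debug, which is consistent with the symlink workaround from comment 4 being needed.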

OTOH I believe crash could add "/usr/lib/debug/boot" to the embedded gdb's search path, because that's where all kernel debuginfo files are installed by default. Let me see if that's easily possible.
Comment 11 Petr Tesařík 2011-03-31 09:52:58 UTC
It turns out that crash already has "/usr/lib/debug/boot" in the search path. But there is a problem with the kernel debuginfo packaging.

Crash does search for the debuginfo file, but ONLY if the vmlinux file contains no symbols. However, this is not the case with openSUSE vmlinux:

$ file /boot/vmlinux-2.6.37.1-1.2-desktop 
/boot/vmlinux-2.6.37.1-1.2-desktop: ELF 64-bit LSB executable, x86-64, version 1
(SYSV), statically linked, not stripped
                           ^^^^^^^^^^^^
And, indeed, there is a symbol table in that file. So crash does nothing, assuming that the embedded gdb can handle the vmlinux file. But gdb has a different search logic...

Anyway, the gdb logic would easily find the debuginfo file by its build-id. The vmlinux file contains a valid NT_GNU_BUILD_ID note:

Contents of section .notes:
 ffffffff8152d288 04000000 14000000 03000000 474e5500  ............GNU.
 ffffffff8152d298 756e5882 7a1e0c4f 90c96272 f71ccd2e  unX.z..O..br....
 ffffffff8152d2a8 64161fe1                             d...            

The strace output actually shows that crash tries to open the debuginfo for build-id 756e58827a1e0c4f90c96272f71ccd2e64161fe1. Sadly, the corresponding file is missing under /usr/lib/debug/.build-id/.

There are two ways of fixing it:

1. Strip /boot/vmlinux-$ver (quick, but a bit hackish)
2. Provide the missing symlink to /usr/lib/debug/boot/vmlinux-$ver.debug
   under /usr/lib/debug/.build-id/.
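
As a local workaround until the package is fixed, option 2 could be approximated by hand. The sketch below assumes the usual layout in which the first two hex digits of the build-id form the directory name and the remainder forms the file name, as seen in the strace output. The helper names build_id_link and link_debuginfo are hypothetical, and the optional debug root argument exists only so the helper can be tried outside /usr/lib/debug.

```shell
#!/bin/sh
# build_id_link (hypothetical helper) prints the .build-id path for a
# given hex build-id, optionally under a debug root other than the default.
build_id_link() {
    build_id=$1
    debug_root=${2:-/usr/lib/debug}
    rest=${build_id#??}   # everything after the first two hex digits
    printf '%s/.build-id/%s/%s.debug\n' \
        "$debug_root" "${build_id%"$rest"}" "$rest"
}

# link_debuginfo (hypothetical helper) creates the missing symlink from
# the .build-id path to an installed debuginfo file.
link_debuginfo() {
    build_id=$1
    debug_file=$2
    debug_root=${3:-/usr/lib/debug}
    link=$(build_id_link "$build_id" "$debug_root")
    mkdir -p "$(dirname "$link")" && ln -s "$debug_file" "$link"
}

# Example (as root). The build-id can be read from the note section with
#   readelf -n /boot/vmlinux-2.6.37.1-1.2-desktop
# link_debuginfo 756e58827a1e0c4f90c96272f71ccd2e64161fe1 \
#     /usr/lib/debug/boot/vmlinux-2.6.37.1-1.2-desktop.debug
```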

Michal, I find the packaging quite confusing:

1. /boot/vmlinux-2.6.37.1-1.2-desktop.gz is contained in all of these
   packages:
   - kernel-desktop-2.6.37.1-1.2.2.x86_64.rpm
   - kernel-desktop-base-2.6.37.1-1.2.2.x86_64.rpm
   - kernel-desktop-devel-2.6.37.1-1.2.2.x86_64.rpm

2. /usr/lib/debug/boot/vmlinux-2.6.37.1-1.2-desktop.debug is only in
   kernel-desktop-devel-debuginfo-2.6.37.1-1.2.2.x86_64.rpm

3. kernel-desktop-devel-debuginfo does not contain any other files;
   most notably, the link under /usr/lib/debug/.build-id is missing

4. kernel-desktop-devel-debuginfo does not provide
     debuginfo(build-id) = 756e58827a1e0c4f90c96272f71ccd2e64161fe1

   this looks like an omission to me, because all other debuginfo packages
   have the correct provides (e.g. for kernel modules).

Can you please fix this?
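
For anyone who wants to verify point 4 on their own system, a small sketch follows. It filters the output of "rpm -q --provides" for the expected debuginfo(build-id) capability; the helper name provides_build_id is hypothetical, and the exact formatting of the provides line is assumed to match the form quoted above.

```shell
#!/bin/sh
# provides_build_id (hypothetical helper) reads "rpm -q --provides" output
# on stdin and succeeds if it lists the debuginfo(build-id) capability
# for the given build-id.
provides_build_id() {
    grep -Fq "debuginfo(build-id) = $1"
}

# Example:
#   rpm -q --provides kernel-desktop-devel-debuginfo |
#       provides_build_id 756e58827a1e0c4f90c96272f71ccd2e64161fe1 ||
#       echo "build-id provide is missing"
```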
Comment 12 Jon Nelson 2013-01-24 02:10:36 UTC
Is this going to get fixed (or has it been fixed)?  I'm on openSUSE 12.2 now.
Comment 13 Michal Marek 2013-09-30 08:09:44 UTC
openSUSE <= 12.1 is no longer active. If you still can reproduce the problem with openSUSE 12.3 or Factory, please reopen the bug and change the product field accordingly. Sorry that I did not have time to address this bug.