Bug 1191718

Summary: gdb crashes (SIGABRT) debugging firefox when issuing 'info threads' command
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Michael Pujos <pujos.michael>
Component: Development
Assignee: Tom de Vries <tdevries>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: martin.liska, matz, pujos.michael
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Michael Pujos 2021-10-17 10:32:43 UTC
While attempting to run firefox under gdb to check whether it could give a hint about issue #1191659, I found that gdb itself crashes. To reproduce:

Start firefox under gdb (you can also attach to a running firefox pid):

firefox -d gdb

It will take a while initially, as all debug info will be downloaded.
Once firefox is started and you have control in gdb, issue this command:

info threads

gdb starts printing some thread info, then dumps core on SIGABRT.
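
The attach variant mentioned above can also be run non-interactively. This is only a minimal sketch, assuming the standard gdb options -q, -batch, -p and -ex; the function name info_threads_batch and the GDB override variable are made up for illustration and are not part of the report:

```shell
#!/bin/sh
# Hypothetical helper: attach gdb to a running process and print its threads.
# GDB can be overridden to try a different build (for example one built from git).
GDB=${GDB:-gdb}

info_threads_batch() {
    pid=$1
    if [ -z "$pid" ]; then
        echo "usage: info_threads_batch PID" >&2
        return 2
    fi
    # -batch makes gdb exit after running the -ex commands instead of prompting.
    "$GDB" -q -batch -p "$pid" -ex "info threads"
}
```

It could be called as info_threads_batch "$(pgrep -o firefox)". With the affected package, gdb would be expected to abort partway through the thread list, as described above.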

Examining the core dump with 'coredumpctl gdb' and then running the 'bt' command gives the stack trace below. The crash happens at dwarf2/read.c:23098, in something that seems to be related to loading Rust binaries.
This crash does not happen with a fresh gdb cloned from git, but that is a much newer version compiled with default settings and without debuginfod support. The openSUSE package is the older v10.1 with zillions of patches; judging by the very complex spec file, it almost looks like a fork.

////////

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0x7f60d7e63500 (LWP 26804))]
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f60d8bfc8e3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007f60d8baf6f6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f60d8b997b3 in __GI_abort () at abort.c:79
#4  0x000056168a8fc57b in handle_sigsegv (sig=<optimized out>) at ../../gdb/event-top.c:891
#5  <signal handler called>
#6  follow_die_offset (sect_off=sect_off@entry=(unknown: 0x53ad3194), offset_in_dwz=<optimized out>, ref_cu=0x7ffc28541a00) at ../../gdb/dwarf2/read.c:23098
#7  0x000056168aa64032 in follow_die_ref (src_die=0x5616e0127390, attr=0x5616e01273c8, ref_cu=<optimized out>) at ../../gdb/dwarf2/read.c:23117
#8  0x000056168aa4b20b in rust_containing_type (cu=0x5616c7f62b50, die=0x5616e0127390) at ../../gdb/dwarf2/read.c:14174
#9  read_variable (cu=0x5616c7f62b50, die=0x5616e0127390) at ../../gdb/dwarf2/read.c:14192
#10 process_die (die=0x5616e0127390, cu=0x5616c7f62b50) at ../../gdb/dwarf2/read.c:10278
#11 0x000056168aa4a04a in read_file_scope (cu=0x5616c7f62b50, die=0x5616b8353ec0) at ../../gdb/dwarf2/read.c:11184
#12 process_die (die=0x5616b8353ec0, cu=0x5616c7f62b50) at ../../gdb/dwarf2/read.c:10191
#13 0x000056168aa773d7 in process_full_comp_unit (pretend_language=<optimized out>, cu=0x5616c7f62b50) at ../../gdb/dwarf2/read.c:9953
#14 process_queue (per_objfile=0x5616924c1ec0) at ../../gdb/dwarf2/read.c:9173
#15 dw2_do_instantiate_symtab(dwarf2_per_cu_data*, dwarf2_per_objfile*, bool) [clone .part.0] [clone .lto_priv.0] (per_cu=<optimized out>, per_objfile=0x5616924c1ec0, skip_partial=<optimized out>) at ../../gdb/dwarf2/read.c:2435
#16 0x000056168aa4088a in dw2_do_instantiate_symtab (skip_partial=false, per_objfile=0x5616924c1ec0, per_cu=<optimized out>) at ../../gdb/dwarf2/read.h:601
#17 dwarf2_psymtab::expand_psymtab (this=0x5616a53901e0, objfile=0x5616926477e0) at ../../gdb/dwarf2/read.c:9202
#18 0x000056168aa48d22 in dwarf2_psymtab::read_symtab (this=0x5616a53901e0, objfile=0x5616926477e0) at ../../gdb/dwarf2/read.c:9050
#19 0x000056168aba5911 in psymtab_to_symtab (objfile=0x5616926477e0, pst=0x5616a53901e0) at ../../gdb/psymtab.c:766
#20 0x000056168ab9d8ee in psym_find_pc_sect_compunit_symtab (objfile=0x5616926477e0, msymbol=..., pc=<optimized out>, section=<optimized out>, warn_if_readin=1) at ../../gdb/psymtab.c:394
#21 0x000056168ac8bfa9 in find_pc_sect_compunit_symtab (pc=140737253796073, section=0x0) at ../../gdb/symtab.c:3001
#22 0x000056168a981efb in find_pc_compunit_symtab (pc=140737253796073) at ../../gdb/symtab.c:3019
#23 call_site_for_pc (gdbarch=0x56168c969950, pc=<optimized out>) at ../../gdb/block.c:229
#24 0x000056168aa24c18 in call_site_find_chain_1 (callee_pc=140737350397632, caller_pc=140737253796074, gdbarch=0x56168c969950) at ../../gdb/dwarf2/loc.c:1193
#25 call_site_find_chain (callee_pc=140737350397661, caller_pc=140737253796074, gdbarch=0x56168c969950) at ../../gdb/dwarf2/loc.c:1296
#26 dwarf2_tailcall_sniffer_first (entry_cfa_sp_offsetp=0x0, tailcall_cachep=0x5616e9f77690, this_frame=0x5616e9f77590) at ../../gdb/dwarf2/frame-tailcall.c:390
#27 dwarf2_frame_cache (this_frame=0x5616e9f77590, this_cache=<optimized out>) at ../../gdb/dwarf2/frame.c:1198
#28 0x000056168aa25eb3 in dwarf2_frame_this_id (this_frame=0x5616e9f77590, this_cache=<optimized out>, this_id=0x5616e9f775f0) at ../../gdb/dwarf2/frame.c:1225
#29 0x000056168aa8d264 in compute_frame_id (fi=0x5616e9f77590) at ../../gdb/frame.c:590
#30 0x000056168aa8d405 in get_frame_id (fi=0x5616e9f77590) at ../../gdb/frame.c:638
#31 0x000056168ac6c853 in scoped_restore_selected_frame::scoped_restore_selected_frame (this=0x7ffc285422c0) at ../../gdb/frame.c:320
#32 print_frame_args (fp_opts=..., func=0x0, frame=0x5616e9f77590, num=-1, stream=0x56168c2f1ea0) at ../../gdb/stack.c:750
#33 0x000056168ac726b4 in print_frame (sal=..., sal=..., print_args=<optimized out>, print_what=LOCATION, print_level=<optimized out>, frame=0x5616e9f77590, fp_opts=...) at ../../gdb/top.c:102
#34 print_frame_info (fp_opts=..., frame=0x5616e9f77590, print_level=<optimized out>, print_what=LOCATION, print_args=<optimized out>, set_current_sal=0) at ../../gdb/stack.c:1119
#35 0x000056168ac64db5 in print_stack_frame (frame=0x5616e9f77590, print_level=0, print_what=LOCATION, set_current_sal=0) at ../../gdb/stack.c:366
#36 0x000056168acc8f4b in print_thread_info_1 (uiout=0x56168c2be950, requested_threads=0x0, global_ids=0, pid=-1, show_global_ids=0) at ../../gdb/thread.c:1160
#37 0x000056168acc994b in info_threads_command (arg=<optimized out>, from_tty=<optimized out>) at ../../gdb/top.c:124
#38 0x000056168a9d08a2 in cmd_func (cmd=<optimized out>, args=<optimized out>, from_tty=<optimized out>) at ../../gdb/cli/cli-decode.c:2181
#39 0x000056168acc4afe in execute_command (p=<optimized out>, from_tty=1) at ../../gdb/top.c:668
#40 0x000056168aa810ed in command_handler (command=0x56168bfb0f20 "info threads") at ../../gdb/event-top.c:591
#41 0x000056168aa8118d in command_line_handler (rl=...) at ../../gdb/event-top.c:776
#42 0x000056168aa78069 in gdb_rl_callback_handler (rl=0x5616a52fcc30 "info threads") at ../../gdb/event-top.c:220
#43 0x00007f60d96150be in rl_callback_read_char () at ../callback.c:281
#44 0x000056168aa80d13 in gdb_rl_callback_read_char_wrapper_noexcept () at ../../gdb/event-top.c:178
#45 0x000056168aa80ef0 in gdb_rl_callback_read_char_wrapper (client_data=<optimized out>) at ../../gdb/event-top.c:194
#46 0x000056168aa81010 in stdin_event_handler (error=<optimized out>, client_data=0x56168bfb0d40) at ../../gdb/event-top.c:519
#47 0x000056168ae69c16 in gdb_wait_for_event (block=1) at ../gdbsupport/../../gdbsupport/event-loop.cc:673
#48 gdb_wait_for_event (block=block@entry=1) at ../gdbsupport/../../gdbsupport/event-loop.cc:561
#49 0x000056168ae6a054 in gdb_do_one_event () at ../gdbsupport/../../gdbsupport/event-loop.cc:215
#50 gdb_do_one_event () at ../gdbsupport/../../gdbsupport/event-loop.cc:163
#51 0x000056168ab2fd65 in start_event_loop () at ../../gdb/main.c:356
#52 captured_command_loop () at ../../gdb/main.c:416
#53 0x000056168aec4185 in captured_main(void*) [clone .constprop.0] (data=data@entry=0x7ffc28542af0) at ../../gdb/main.c:1304
#54 0x000056168a91256b in gdb_main (args=0x7ffc28542af0) at ../../gdb/main.c:1315
Comment 1 Michael Pujos 2021-10-17 10:44:54 UTC
Note that this crash happens only when debuginfod is enabled (DEBUGINFOD_URLS environment variable set), which is the default.
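
To compare the two cases, the variable can be checked and dropped for a single command. A minimal sketch, assuming a POSIX shell and an env that supports -u (GNU coreutils does); the function names debuginfod_state and without_debuginfod are hypothetical:

```shell
#!/bin/sh
# Report whether debuginfod lookups are configured in this environment.
debuginfod_state() {
    if [ -n "${DEBUGINFOD_URLS:-}" ]; then
        echo "debuginfod enabled: $DEBUGINFOD_URLS"
    else
        echo "debuginfod disabled"
    fi
}

# Run a command with DEBUGINFOD_URLS removed from its environment only.
without_debuginfod() {
    env -u DEBUGINFOD_URLS "$@"
}
```

For example, without_debuginfod firefox -d gdb starts the same session with debuginfod off. Without the downloaded debug info, the crashing code path may simply not be reached, which would match the observation above.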
Comment 2 Michael Pujos 2021-10-17 11:36:32 UTC
Latest gdb from git compiled with --with-debuginfod does not crash.
Comment 3 Michael Matz 2021-10-18 13:17:12 UTC
gdb internally aborting: something for Tom :-)
Comment 4 Tom de Vries 2021-10-18 20:44:36 UTC
Thanks for the report.

Sorry, I can't reproduce this.

Please provide one of the following:
- more complete instructions on how to reproduce
- a script that reproduces the problem
- a log of a debugging session that demonstrates the problem
Comment 5 Tom de Vries 2021-10-19 20:01:18 UTC
OK, I got a reproducer:
...
$ gdb -q -batch \
    /usr/lib/debug/usr/lib64/firefox/libxul.so-93.0-1.1.x86_64.debug \
    -ex "maint expand-symtabs gfx/wr/webrender/src/lib.rs" \
    -ex "maint expand-symtabs gfx/wr/webrender/src/lib.rs"
Dwarf Error: Cannot find DIE at 0x563dca71 referenced from DIE at 0x56a674c0 [in module /usr/lib/debug/usr/lib64/firefox/libxul.so-93.0-1.1.x86_64.debug]
Aborted (core dumped)
...
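
The reproducer above can be wrapped in a small script that also reports how gdb exited. This is a sketch under assumptions: the debug file path and source file name are copied from the command above, while describe_status, run_reproducer and the GDB/DEBUGFILE override variables are hypothetical names. It relies on the shell convention that a status above 128 means "terminated by signal (status - 128)".

```shell
#!/bin/sh
# GDB and DEBUGFILE can be overridden; the defaults come from the reproducer above.
GDB=${GDB:-gdb}
DEBUGFILE=${DEBUGFILE:-/usr/lib/debug/usr/lib64/firefox/libxul.so-93.0-1.1.x86_64.debug}
SRC=gfx/wr/webrender/src/lib.rs

# Turn a shell exit status into a short description.
# Statuses above 128 are treated as "killed by signal (status - 128)".
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by SIG$(kill -l $(( $1 - 128 )))"
    else
        echo "exited with status $1"
    fi
}

# Expand the same symtab twice, as in the reproducer, then report the result.
run_reproducer() {
    "$GDB" -q -batch "$DEBUGFILE" \
        -ex "maint expand-symtabs $SRC" \
        -ex "maint expand-symtabs $SRC"
    describe_status $?
}
```

Calling run_reproducer with the affected package should end with a "killed by SIGABRT" line, since SIGABRT is signal 6 on Linux. Setting GDB to a build from git allows a side-by-side comparison.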
Comment 6 Martin Liška 2021-10-20 13:08:51 UTC
I noticed:

$ readelf -w /home/marxin/.cache/debuginfod_client/61dc5d47ba1789f494278644181a1b6099fd140b/debuginfo
...

readelf: Warning: Location lists in .debug_loc section start at 0x10
Contents of the .debug_loc section:

    Offset   Begin            End              Expression

    0000000c v000000000000000 v000000000000000 location view pair
    0000000e v000000000000000 v000000000000000 location view pair

    00000010 v000000000000000 v000000000000000 views at 0000000c for:
             0000000000000040 00000000000000ea (DW_OP_reg4 (rsi); DW_OP_piece: 8)
    00000025 <End of list>
readelf: Warning: Hole and overlap detection requires adjacent view lists and loclists.

    00000027 v000000000000000 v000000000000000 location view pair
    00000029 v000000000000000 v000000000000000 location view pair

    0000002b v000000000000000 v000000000000000 views at 00000027 for:
             0000000000000000 ffffffffffff0000 ((Unknown location op 0xd0))
    0001003c v000000000000000 v000000000000000 views at 00000029 for:
             0000000004edb4a0 00000000000022a8 ((Unknown location op 0x0)) (start > end)
    000122fe v000000000000000 v000000000000000 views at 0000002b for:
             c700000000000028 0a00000000000028 ((Unknown location op 0xc0)) (start > end)
    00019a10 v000000000000000 v000000000000000 views at 0000002d for:
             000001b300000000 9353000600000000 (DW_OP_piece: 8; (Unknown location op 0x0))
    0001f82a v000000000000000 v000000000000000 views at 0000002f for:
             000000040b08935c 000000044e000000 ()
    0001f83c v000000000000000 v000000000000000 views at 00000031 for:
             935208935c000600 000000000004da08 ((Unknown location op 0x4)) (start > end)
    0002df4e v000000000000000 v000000000000000 views at 00000033 for:
             0100000000000002 0000000008445600 () (start > end)
    0002df60 readelf: Error: ../../binutils/dwarf.c:6405: read LEB value is too large to store in destination variable
vd0ffffffffffffff v000000000000000 views at 00000035 for:
             0000000000000855 0000000000500001 ()
    0002df72 v000000000000000 v000000000000000 views at 00000042 for:
Comment 7 Tom de Vries 2021-10-20 14:17:40 UTC
On trunk, this bisects to:
...
commit 8457e5ecc45295bc9550c4f705a276d5ca90d908
Author: Tom de Vries <tdevries@suse.de>
Date:   Wed Jun 16 12:44:30 2021 +0200

    [gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again
...

I did another bisect carrying that patch, and got to:
...
commit bf6e5d01d7b149e116a008bd4348983c6f56e9ba
Author: Simon Marchi <simon.marchi@polymtl.ca>
Date:   Thu Nov 12 17:42:55 2020 -0500

    gdb/dwarf: fix call to dwarf2_queue_guard in dw2_do_instantiate_symtab
...

So, the first patch was backported to the package but the second one was not, and that caused the failure in the package.

Should be fixed by the upcoming update to 11.1, which contains both patches.
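
Whether a given release or branch contains both commits can be checked in a binutils-gdb clone with git merge-base --is-ancestor. A minimal sketch; the function name check_ref is hypothetical, and the tag name used in the usage note is an assumption about the upstream tag naming:

```shell
#!/bin/sh
# The two commits named in the bisect results above.
COMMITS="8457e5ecc45295bc9550c4f705a276d5ca90d908 bf6e5d01d7b149e116a008bd4348983c6f56e9ba"

# For each commit, print whether it is an ancestor of (contained in) the given ref.
# Must be run inside a git checkout. A commit that git cannot resolve is also
# reported as NOT contained, with git's own error message on stderr.
check_ref() {
    ref=$1
    shift
    for c in "$@"; do
        if git merge-base --is-ancestor "$c" "$ref"; then
            echo "$c: contained in $ref"
        else
            echo "$c: NOT contained in $ref"
        fi
    done
}
```

Usage would be something like check_ref gdb-11.1-release $COMMITS, assuming that is the tag for the 11.1 release. This only inspects upstream history; patches backported in the package spec file have to be checked separately.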
Comment 8 Tom de Vries 2021-10-20 14:47:26 UTC
*** Bug 1191864 has been marked as a duplicate of this bug. ***
Comment 9 Martin Liška 2021-11-04 09:59:50 UTC
It's fixed now after the gdb update to 11.1.