From: "Jan Vraný" <Jan.Vrany@labware.com>
To: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: [PING] Re: [PATCH] Revert "gdb: change blockvector::contains() to handle blockvectors with "holes""
Date: Mon, 15 Dec 2025 12:08:06 +0000
Message-ID: <6cadb0cf3862706f4d70bb07a5002d8c99ea6906.camel@labware.com>
In-Reply-To: <20251208180717.176567-1-jan.vrany@labware.com>
Polite ping.
Thanks,
Jan
On Mon, 2025-12-08 at 18:07 +0000, Jan Vrany wrote:
> This reverts commit cc1fc6af4150b19f9c4c70d0463ff498703fb637, since it
> causes a number of regressions that seem not to be easily fixable.
>
> The problem lies in the existence of "freestanding" code, that is, code
> that is part of a CU but has no block associated with it. Consider the
> following program:
>
> __asm__(
> ".type foo,@function \n"
> "foo: \n"
> " mov %rdi, %rax \n"
> " ret \n"
> );
>
> static int foo(int i);
>
> int main(int argc, char **argv) {
>   return foo(argc);
> }
>
> When compiled, the foo function has no block of its own:
>
> Blockvector:
>
> no map
>
> block #000, object at 0x55978957b510, 1 symbols in 0x1129..0x1148
> int main(int, char **); block object 0x55978957b380, 0x112d..0x1148 section .text
> block #001, object at 0x55978957b470 under 0x55978957b510, 2 symbols in 0x1129..0x1148
> typedef int int;
> typedef char char;
> block #002, object at 0x55978957b380 under 0x55978957b470, 2 symbols in 0x112d..0x1148, function main
> int argc; computed at runtime
> char **argv; computed at runtime
>
> In this case lookup(0x1129) returns the static block and, because of the
> change in cc1fc6af4, contains(0x1129) returns false, which is wrong.
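
To make the failure concrete, here is a minimal sketch in C++ of the two
variants of contains() applied to the blockvector dumped above. It is a toy
model under stated assumptions: toy_block, toy_blockvector and every function
name are made up for illustration and are not GDB's real classes, and the
lookup is a plain linear scan rather than GDB's actual search. Only the
address ranges are taken from the dump.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only, not GDB's real types.
using addr_t = std::uint64_t;

struct toy_block
{
  addr_t start;   // inclusive
  addr_t end;     // exclusive
};

struct toy_blockvector
{
  // Index 0 is the global block, index 1 the static block, and the
  // remaining entries are function blocks, as in the dump.
  std::vector<toy_block> blocks;

  // Return the index of the innermost block covering ADDR, or -1.
  int lookup (addr_t addr) const
  {
    int best = -1;
    for (int i = 0; i < (int) blocks.size (); ++i)
      if (blocks[i].start <= addr && addr < blocks[i].end)
        best = i;   // later entries are nested deeper in this toy model
    return best;
  }

  // Logic introduced by cc1fc6af4 (the change being reverted).
  bool contains_with_hole_check (addr_t addr) const
  {
    int b = lookup (addr);
    if (b < 0)
      return false;
    return b != 1 || blocks.size () == 2;
  }

  // Logic restored by the revert.
  bool contains_after_revert (addr_t addr) const
  {
    return lookup (addr) >= 0;
  }
};

// Blockvector of the first example: global, static and main.
inline toy_blockvector first_example ()
{
  return { { { 0x1129, 0x1148 }, { 0x1129, 0x1148 }, { 0x112d, 0x1148 } } };
}

// Small self-check of the behaviour described in the text.
inline void demo_first_example ()
{
  toy_blockvector bv = first_example ();
  assert (bv.lookup (0x1129) == 1);                // static block
  assert (!bv.contains_with_hole_check (0x1129));  // foo is rejected
  assert (bv.contains_after_revert (0x1129));      // foo is accepted
}
```

Calling demo_first_example () exercises the address of foo, 0x1129. In this
model the hole check treats it as lying outside the CU, while the reverted
logic accepts it.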
>
> Such "freestanding" code is perhaps not common, but it does exist,
> especially in system code. In fact, the regressions were at least in part
> caused by such "freestanding" code in glibc (libc_sigaction.c).
>
> The whole idea of commit cc1fc6af4 was to handle "holes" in CUs, a case
> where one CU spans multiple disjoint regions, possibly interleaved
> with other CUs. Consider a somewhat extreme case with two CUs:
>
> /* hole-1.c */
> int give_me_zero ();
>
> int
> main ()
> {
>   return give_me_zero ();
> }
>
> /* hole-2.c */
> int __attribute__ ((section (".text_give_me_one"))) __attribute__((noinline))
> baz () { return 42; }
>
> __asm__(
> ".section .text_give_me_one,\"ax\",@progbits\n"
> ".type foo,@function \n"
> "foo: \n"
> " mov %rdi, %rax \n"
> " ret \n"
> " nop \n"
> " nop \n"
> " nop \n"
> );
> int __attribute__ ((section (".text_give_me_one"))) __attribute__((noinline))
> give_me_one ()
> {
>   return 1;
> }
>
> __asm__(
> ".section .text_give_me_zero,\"ax\",@progbits\n"
> "bar: \n"
> " jmp give_me_one \n"
> " nop \n"
> " nop \n"
> " nop \n"
> );
> int __attribute__ ((section (".text_give_me_zero")))
> give_me_zero ()
> {
>   extern int bar();
>   return give_me_one() - 1;
> }
>
> This, when compiled with a carefully crafted linker script to force code
> to certain positions, creates the following layout:
>
> 0x080000..0x080007 # "freestanding" bar from hole-2.c
> 0x080008..0x080016 # give_me_zero() from hole-2.c
> 0x080109..0x080114 # main from hole-1.c
> 0xf00000..0xf0000b # baz() from hole-2.c
> 0xf0000b..0xf00011 # "freestanding" foo from hole-2.c
> 0xf00012..0xf0001c # give_me_one() from hole-2.c
>
> The block vector for hole-1.c looks like this:
>
> Blockvector:
>
> no map
>
> block #000, object at 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
> int main(void); block object 0x555a5d85faa0, 0x80109..0x80114 section .text
> block #001, object at 0x555a5d85faf0 under 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
> typedef int int;
> block #002, object at 0x555a5d85faa0 under 0x555a5d85faf0, 0 symbols in 0x80109..0x80114, function main
>
> And for hole-2.c:
>
> Blockvector:
>
> map
> 0x0 -> 0x0
> 0x80008 -> 0x555a5d85ff50
> 0x80016 -> 0x0
> 0xf00000 -> 0x555a5d860280
> 0xf0000b -> 0x0
> 0xf00012 -> 0x555a5d860110
> 0xf0001d -> 0x0
>
> block #000, object at 0x555a5d8603b0, 3 symbols in 0x80008..0xf0001d
> int give_me_zero(void); block object 0x555a5d85ff50, 0x80008..0x80016 section .text
> int give_me_one(void); block object 0x555a5d860110, 0xf00012..0xf0001d section .text
> int baz(void); block object 0x555a5d860280, 0xf00000..0xf0000b section .text
> block #001, object at 0x555a5d8602d0 under 0x555a5d8603b0, 1 symbols in 0x80008..0xf0001d
> typedef int int;
> block #002, object at 0x555a5d85ff50 under 0x555a5d8602d0, 0 symbols in 0x80008..0x80016, function give_me_zero
> block #003, object at 0x555a5d860280 under 0x555a5d8602d0, 0 symbols in 0xf00000..0xf0000b, function baz
> block #004, object at 0x555a5d860110 under 0x555a5d8602d0, 0 symbols in 0xf00012..0xf0001d, function give_me_one
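
For comparison, when a blockvector does carry an address map, a lookup inside
a hole resolves to no block at all. Below is a rough sketch in C++, assuming
the map can be read as a sorted table of transition points in which each entry
applies from its address up to the next entry. hole2_map and map_lookup are
hypothetical names, and GDB's real addrmap is a different data structure. The
entries are copied from the dump above, with block objects represented by
their printed addresses and 0 meaning "no block".

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

using addr_t = std::uint64_t;
using block_id = std::uint64_t;   // printed object address, 0 = no block

// Transition table for hole-2.c, as printed in the dump.
inline std::map<addr_t, block_id> hole2_map ()
{
  return {
    { 0x0,      0x0 },
    { 0x80008,  0x555a5d85ff50 },   // give_me_zero
    { 0x80016,  0x0 },
    { 0xf00000, 0x555a5d860280 },   // baz
    { 0xf0000b, 0x0 },
    { 0xf00012, 0x555a5d860110 },   // give_me_one
    { 0xf0001d, 0x0 },
  };
}

// Return the block recorded for ADDR: the value of the last transition
// at or below ADDR, or 0 if there is none.
inline block_id map_lookup (const std::map<addr_t, block_id> &m, addr_t addr)
{
  auto it = m.upper_bound (addr);
  if (it == m.begin ())
    return 0;
  return std::prev (it)->second;
}

// Small self-check: the "freestanding" bar and foo map to no block.
inline void demo_hole2_map ()
{
  auto m = hole2_map ();
  assert (map_lookup (m, 0x80000) == 0);    // bar
  assert (map_lookup (m, 0xf0000b) == 0);   // foo
  assert (map_lookup (m, 0x80008) == 0x555a5d85ff50);
}
```

In this model the addresses of bar (0x80000) and foo (0xf0000b) both resolve
to no block, which matches the observation that the map has no entry for the
"freestanding" code.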
>
> Note that although "freestanding" bar belongs to hole-2.c, the
> corresponding CU's global and static blocks start at 0x80008! Looking
> at the DWARF for the second program, it appears that the compiler (GCC 15)
> did not record the presence of "freestanding" code:
>
> <0><71>: Abbrev Number: 1 (DW_TAG_compile_unit)
> <72> DW_AT_producer : (indirect string, offset: 0): GNU C23 15.2.0 -mtune=generic -march=x86-64 -g -fasynchronous-unwind-tables
> <76> DW_AT_language : 29 (C11)
> <77> Unknown AT value: 90: 3
> <78> Unknown AT value: 91: 0x31647
> <7c> DW_AT_name : (indirect line string, offset: 0x2d): hole-2.c
> <80> DW_AT_comp_dir : (indirect line string, offset: 0): test_programs
> <84> DW_AT_ranges : 0xc
> <88> DW_AT_low_pc : 0
> <90> DW_AT_stmt_list : 0x51
>
> and the corresponding part of .debug_aranges:
>
> Length: 76
> Version: 2
> Offset into .debug_info: 0x65
> Pointer Size: 8
> Segment Size: 0
>
> Address Length
> 0000000000f00000 000000000000000b
> 0000000000f00012 000000000000000b
> 0000000000080008 000000000000000e
> 0000000000000000 0000000000000000
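
The gap can be checked mechanically against the table above. A minimal sketch
in C++, assuming each row is a half-open range [Address, Address + Length);
arange, hole2_aranges and covered are hypothetical names used only for this
illustration and do not correspond to any GDB or binutils API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One row of the .debug_aranges table (names are illustrative only).
struct arange
{
  std::uint64_t address;
  std::uint64_t length;
};

// The three non-terminator rows printed for hole-2.c.
inline std::vector<arange> hole2_aranges ()
{
  return {
    { 0xf00000, 0xb },   // baz
    { 0xf00012, 0xb },   // give_me_one
    { 0x80008,  0xe },   // give_me_zero
  };
}

// True if PC falls into any [address, address + length) range.
inline bool covered (const std::vector<arange> &ranges, std::uint64_t pc)
{
  for (const arange &r : ranges)
    if (pc >= r.address && pc - r.address < r.length)
      return true;
  return false;
}

// Small self-check: neither piece of "freestanding" code is covered.
inline void demo_hole2_aranges ()
{
  auto ranges = hole2_aranges ();
  assert (!covered (ranges, 0x80000));    // bar
  assert (!covered (ranges, 0xf0000b));   // foo
  assert (covered (ranges, 0x80008));     // give_me_zero
}
```

Under that assumption neither 0x80000 (bar) nor 0xf0000b (foo) is covered by
any row, so the aranges data alone cannot attribute those addresses to
hole-2.c.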
>
> Thiago suggested using minsymbols to tell whether a CU contains a
> given address. I do not think this would work reliably, as minsymbols do
> not know which CU they belong to. In the slightly more complicated case of
> interleaved CUs, it does not seem possible to tell for sure which CU
> a given minsymbol belongs to.
>
> Moreover, Tom suggested that the comment in find_compunit_symtab_for_pc_sect
> (which led to cc1fc6af4) may be outdated [2].
>
> Given all that, I'm just reverting the change.
>
> [1]: https://sourceware.org/bugzilla/show_bug.cgi?id=33679#c13
> [2]: https://inbox.sourceware.org/gdb-patches/87cy6xzd3j.fsf@tromey.com/
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33679
> ---
> gdb/block.c | 29 +----------------------------
> 1 file changed, 1 insertion(+), 28 deletions(-)
>
> diff --git a/gdb/block.c b/gdb/block.c
> index 3d2c51cc554..e21580bcf63 100644
> --- a/gdb/block.c
> +++ b/gdb/block.c
> @@ -864,34 +864,7 @@ blockvector::lookup (CORE_ADDR addr) const
> bool
> blockvector::contains (CORE_ADDR addr) const
> {
> - auto b = lookup (addr);
> - if (b == nullptr)
> - return false;
> -
> - /* Handle the case that the blockvector has no address map but still has
> - "holes". For example, consider the following blockvector:
> -
> - B0 0x1000 - 0x4000 (global block)
> - B1 0x1000 - 0x4000 (static block)
> - B3 0x1000 - 0x2000
> - (hole)
> - B4 0x3000 - 0x4000
> -
> - In this case, the above blockvector does not contain address 0x2500 but
> - lookup (0x2500) would return the blockvector's static block.
> -
> - So here we check if the returned block is a static block and if yes, still
> - return false. However, if the blockvector contains no blocks other than
> - the global and static blocks and ADDR falls into the static block,
> - conservatively return true.
> -
> - See comment in find_compunit_symtab_for_pc_sect, symtab.c.
> -
> - Also, note that if the blockvector in the above example would contain
> - an address map, then lookup (0x2500) would return NULL instead of
> - the static block.
> - */
> - return b != static_block () || num_blocks () == 2;
> + return lookup (addr) != nullptr;
> }
>
> /* See block.h. */