Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Simon Marchi <simark@simark.ca>
To: Jan Vrany <jan.vrany@labware.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 2/2] gdb: use std::vector<> to hold on blocks in struct blockvector
Date: Tue, 14 Oct 2025 15:29:42 -0400	[thread overview]
Message-ID: <706dd0b5-a1f0-4cdb-bd41-b3b1a2bbac3a@simark.ca> (raw)
In-Reply-To: <20251013182318.1045138-3-jan.vrany@labware.com>

On 10/13/25 2:23 PM, Jan Vrany wrote:
> This commit changes internal implementation of struct blockvector to use
> std::vector<> rather than flexible array, allowing one to easily add
> blocks to existing blockvector.
> 
> This is needed to for lazy CU expansion and for Python API to build
> objfiles, compunits and symtabs dynamically (similarly to JIT reader
> API).
> 
> To avoid extensive reallocation when adding a lot of blocks, ctor takes
> initial number of blocks to be stored in blockvector.
> 
> While at is, it also adds blockvector::add_block() to add new blocks
> and changes mdebugread.c to use it.
> 
> The downside is higher memory consumption. The size of std::vector with
> obstack allocator is 32 bytes (GCC 14) compared to 8 bytes used
> currently to store the number of blocks (m_num_blocks). Stopping gdb at
> its main(), followed by "maint expand-symtabs" results in 4593
> compunit symtabs so in this case the overhead is 24*4593 = 110232 bytes
> which I hope is acceptable.
> ---
>  gdb/block.c      | 36 +++++++++++++++++++++++++++++
>  gdb/block.h      | 59 +++++++++++++++++++++++-------------------------
>  gdb/buildsym.c   |  2 +-
>  gdb/jit.c        |  2 +-
>  gdb/mdebugread.c | 30 +++---------------------
>  5 files changed, 69 insertions(+), 60 deletions(-)
> 
> diff --git a/gdb/block.c b/gdb/block.c
> index e09bccdcf86..a9191c3fdff 100644
> --- a/gdb/block.c
> +++ b/gdb/block.c
> @@ -17,6 +17,7 @@
>     You should have received a copy of the GNU General Public License
>     along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>  
> +#include <cstring>

I'm not sure which change of yours requires cstring (perhaps this include has been
auto-inserted, but if it's needed, I'd use <string.h>, just for consistency with the
rest of the codebase.

>  #include "block.h"
>  #include "symtab.h"
>  #include "symfile.h"
> @@ -809,6 +810,41 @@ make_blockranges (struct objfile *objfile,
>    return blr;
>  }
>  
> +static bool
> +block_ordering_predicate (struct block *b1, struct block *b2)
> +{
> +  CORE_ADDR start1 = b1->start ();
> +  CORE_ADDR start2 = b2->start ();
> +
> +  if (start1 != start2)
> +    return start1 < start2;
> +  return (b2->end () < b1->end ());
> +}
> +
> +/* See block.h.  */
> +
> +void
> +blockvector::add_block (struct block *block)
> +{
> +  if (num_blocks () <= FIRST_LOCAL_BLOCK)
> +    {
> +      /* No blocks (except global and static block).  */
> +      m_blocks.push_back (block);
> +    }
> +  else
> +    {
> +      /* blockvector already contains some blocks.  Insert new block
> +   to a correct place.  */

Indentation is off.

> +      auto first = m_blocks.begin () + FIRST_LOCAL_BLOCK;
> +      auto last = m_blocks.end ();
> +
> +      auto insert_before = std::upper_bound (first, last, block,
> +					     block_ordering_predicate);
> +
> +      m_blocks.insert (insert_before, block);
> +    }
> +}

This function makes me a bit worried, because I feel like it could (even
though unlikely) end up mis-used and end up with a situation like this:

https://inbox.sourceware.org/gdb-patches/b5853237-a041-4a74-aceb-0406eeb926ec@polymtl.ca/T/#m95998810ae7943611bec2a499070b735f6ec4928

That is, if you add some blocks not in the sorted order already and
insert blocks in the middle of the vector, you end up shifting a lot of
elements, and it can start to get noticeable.  The only user of this
function right now is mdebugread, and it didn't seem to need this
"insert" in the middle approach.  For that case, we could perhaps have
an append_block method, which would have the precondition (could be
quickly checked with a gdb_assert) that it must sort after the last
element of the vector?

However, I guess that for lazy CU expansion, we'll need the ability to
insert blocks in the middle, as we expand more of a CU?  In this case,
we would perhaps want a way to add all the new blocks, and then do one
sort at the end, rather than sorted insertions.

I also wish we could avoid having a "set_block" method, because it
requires having the blockvector in an invalid state (some blocks slots
set to nullptr) while we build it, which could be error prone.  The
users of the set_block method are buildsym and jit.  From what i saw,
they could both very well prepare an std::vector with their blocks and
std::move that vector into a blockvector constructor.  That constructor
could assert that the blocks are correctly ordered, or to the sort
itself (something that both buildsym and jit do already).  buildsym
starts with a singly-linked list (m_pending_blocks), moves the blocks to a
temporary vector (in end_compunit_symtab_get_static_block), sorts that
vector, builds a singly-linked list again, and then traverses that list
to insert the blocks into the blockvector.  I think that could all be
simplified by making m_pending_blocks an std::vector<block *> from the
start.

Hopefully, something like this would make it harder to build a
blockvector in an invalid state.

Simon

  parent reply	other threads:[~2025-10-14 19:30 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 18:23 [PATCH 0/2] " Jan Vrany
2025-10-13 18:23 ` [PATCH 1/2] gdb: allocate blockvector on heap Jan Vrany
2025-10-14 18:41   ` Simon Marchi
2025-10-13 18:23 ` [PATCH 2/2] gdb: use std::vector<> to hold on blocks in struct blockvector Jan Vrany
2025-10-14 15:00   ` Andrew Burgess
2025-10-14 19:41     ` Simon Marchi
2025-10-15 21:34       ` Jan Vraný
2025-10-14 19:29   ` Simon Marchi [this message]
2025-10-14 20:05     ` Tom Tromey
2025-10-14 20:19       ` Simon Marchi
2025-10-14 20:40         ` Jan Vraný
2025-10-15  2:00           ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=706dd0b5-a1f0-4cdb-bd41-b3b1a2bbac3a@simark.ca \
    --to=simark@simark.ca \
    --cc=gdb-patches@sourceware.org \
    --cc=jan.vrany@labware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox