Re: [PATCH 2/2] gdb: use std::vector<> to hold on blocks in struct blockvector

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Andrew Burgess <aburgess@redhat.com>
To: Jan Vrany <jan.vrany@labware.com>, gdb-patches@sourceware.org
Cc: Jan Vrany <jan.vrany@labware.com>
Subject: Re: [PATCH 2/2] gdb: use std::vector<> to hold on blocks in struct blockvector
Date: Tue, 14 Oct 2025 16:00:43 +0100	[thread overview]
Message-ID: <87v7kha5pg.fsf@redhat.com> (raw)
In-Reply-To: <20251013182318.1045138-3-jan.vrany@labware.com>


Hi Jan,

Thanks for working on this.

I think that this commit should be merged with the previous one.  For me
at least, I'm unable to build the previous commit as I get compiler
errors about malloc/free being poisoned.  Not sure why this didn't
impact you, but I didn't look too closely once I realised patch #2
removes those calls.  Given the overlap of the patches, I think the
easiest solution would be to just merge the two.

Jan Vrany <jan.vrany@labware.com> writes:

> This commit changes internal implementation of struct blockvector to use
> std::vector<> rather than flexible array, allowing one to easily add
> blocks to existing blockvector.
>
> This is needed to for lazy CU expansion and for Python API to build
> objfiles, compunits and symtabs dynamically (similarly to JIT reader
> API).
>
> To avoid extensive reallocation when adding a lot of blocks, ctor takes
> initial number of blocks to be stored in blockvector.
>
> While at is, it also adds blockvector::add_block() to add new blocks
> and changes mdebugread.c to use it.
>
> The downside is higher memory consumption. The size of std::vector with
> obstack allocator is 32 bytes (GCC 14) compared to 8 bytes used
> currently to store the number of blocks (m_num_blocks). Stopping gdb at
> its main(), followed by "maint expand-symtabs" results in 4593
> compunit symtabs so in this case the overhead is 24*4593 = 110232 bytes
> which I hope is acceptable.
> ---
>  gdb/block.c      | 36 +++++++++++++++++++++++++++++
>  gdb/block.h      | 59 +++++++++++++++++++++++-------------------------
>  gdb/buildsym.c   |  2 +-
>  gdb/jit.c        |  2 +-
>  gdb/mdebugread.c | 30 +++---------------------
>  5 files changed, 69 insertions(+), 60 deletions(-)
>
> diff --git a/gdb/block.c b/gdb/block.c
> index e09bccdcf86..a9191c3fdff 100644
> --- a/gdb/block.c
> +++ b/gdb/block.c
> @@ -17,6 +17,7 @@
>     You should have received a copy of the GNU General Public License
>     along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>  
> +#include <cstring>

I couldn't see why this should be needed.

>  #include "block.h"
>  #include "symtab.h"
>  #include "symfile.h"
> @@ -809,6 +810,41 @@ make_blockranges (struct objfile *objfile,
>    return blr;
>  }
>  
> +static bool
> +block_ordering_predicate (struct block *b1, struct block *b2)

Functions should have a comment above them.

> +{
> +  CORE_ADDR start1 = b1->start ();
> +  CORE_ADDR start2 = b2->start ();
> +
> +  if (start1 != start2)
> +    return start1 < start2;
> +  return (b2->end () < b1->end ());
> +}

I notice that we use a slightly different ordering function in
buildsym.c, one that doesn't consider the end address.  Probably doesn't
matter, but maybe it's at least worth mentioning in a comment or the
commit message, that way, if someone is looking at the code in the
future they can understand why these things are different.  Even if the
reason is just "we have no motivating example for why these should be
the same right now".

> +
> +/* See block.h.  */
> +
> +void
> +blockvector::add_block (struct block *block)
> +{
> +  if (num_blocks () <= FIRST_LOCAL_BLOCK)
> +    {
> +      /* No blocks (except global and static block).  */
> +      m_blocks.push_back (block);
> +    }
> +  else
> +    {
> +      /* blockvector already contains some blocks.  Insert new block
> +   to a correct place.  */

The indentation of the second line is wrong here.

> +      auto first = m_blocks.begin () + FIRST_LOCAL_BLOCK;
> +      auto last = m_blocks.end ();
> +
> +      auto insert_before = std::upper_bound (first, last, block,
> +					     block_ordering_predicate);
> +
> +      m_blocks.insert (insert_before, block);
> +    }
> +}
> +
>  /* Implement 'maint info blocks' command.  If passed an argument then
>     print a list of all blocks at the given address.  With no arguments
>     then list all blocks at the current address of the current inferior.  */
> diff --git a/gdb/block.h b/gdb/block.h
> index d1ca41d7482..c86650753a4 100644
> --- a/gdb/block.h
> +++ b/gdb/block.h
> @@ -20,6 +20,7 @@
>  #ifndef GDB_BLOCK_H
>  #define GDB_BLOCK_H
>  
> +#include <algorithm>
>  #include "dictionary.h"
>  #include "gdbsupport/array-view.h"
>  #include "gdbsupport/next-iterator.h"
> @@ -420,59 +421,58 @@ struct global_block : public block
>  
>  struct blockvector
>  {
> -  void *operator new (size_t size, size_t num_blocks)
> -  {
> -    size_t size_in_bytes = size + (num_blocks - 1) * (sizeof (struct block *));
> -    return std::malloc (size_in_bytes);
> -  }
> -
> -  void operator delete (void *memory)
> -  {
> -    std::free (memory);
> -  }
> -
>    blockvector (int nblocks)
>      : m_map (nullptr),
> -      m_num_blocks (nblocks)
> -  {
> -    std::memset (m_blocks, 0, nblocks * sizeof (struct block *));
> -  }
> +      m_blocks (nblocks, nullptr)
> +  {}
>  
>    /* Return a view on the blocks of this blockvector.  */
>    gdb::array_view<struct block *> blocks ()
>    {
> -    return gdb::array_view<struct block *> (m_blocks, m_num_blocks);
> +    return gdb::array_view<struct block *> (m_blocks.data (),
> +					    m_blocks.size ());
>    }
>  
>    /* Const version of the above.  */
>    gdb::array_view<const struct block *const> blocks () const
>    {
> -    const struct block **blocks = (const struct block **) m_blocks;
> -    return gdb::array_view<const struct block *const> (blocks, m_num_blocks);
> +    const struct block **blocks = (const struct block **) m_blocks.data ();
> +    return gdb::array_view<const struct block *const> (blocks,
> +						       m_blocks.size ());
>    }
>  
>    /* Return the block at index I.  */
>    struct block *block (size_t i)
> -  { return this->blocks ()[i]; }
> +  {
> +    return m_blocks[i];
> +  }
>  
>    /* Const version of the above.  */
>    const struct block *block (size_t i) const
> -  { return this->blocks ()[i]; }
> +  {
> +    return m_blocks[i];
> +  }
>  
>    /* Set the block at index I.  */
>    void set_block (int i, struct block *block)
>    { m_blocks[i] = block; }
>  
> -  /* Set the number of blocks of this blockvector.
> -
> -     The storage of blocks is done using a flexible array member, so the number
> -     of blocks set here must agree with what was effectively allocated.  */
> -  void set_num_blocks (int num_blocks)
> -  { m_num_blocks = num_blocks; }
> +  /* Add BLOCK, making sure blocks are ordered by code-addresses
> +     as required.  Update global and static block start and end
> +     adresses accordingly.  */
> +  void add_block (struct block *block);

Maybe this comment needs tweaking.  In the general case the global
and/or static blocks are not updated, but I can see why you would
consider them as updated if we started from an empty block list.  But I
think expanding this comment would make things clearer.

>  
>    /* Return the number of blocks in this blockvector.  */
>    int num_blocks () const
> -  { return m_num_blocks; }
> +  {
> +    return m_blocks.size ();
> +  }
> +
> +  /* Set the number of blocks of this blockvector.  */
> +  void set_num_blocks (int num_blocks)
> +  {
> +    m_blocks.resize (num_blocks, nullptr);
> +  }
>  
>    /* Return the global block of this blockvector.  */
>    struct global_block *global_block ()
> @@ -511,11 +511,8 @@ struct blockvector
>       enough.  */
>    addrmap_fixed *m_map;
>  
> -  /* Number of blocks in the list.  */
> -  int m_num_blocks;
> -
>    /* The blocks themselves.  */
> -  struct block *m_blocks[1];
> +  std::vector<struct block *> m_blocks;
>  };
>  
>  extern const struct blockvector *blockvector_for_pc (CORE_ADDR,
> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> index a61ef68ada1..8daa68d296b 100644
> --- a/gdb/buildsym.c
> +++ b/gdb/buildsym.c
> @@ -427,7 +427,7 @@ buildsym_compunit::make_blockvector ()
>      {
>      }
>  
> -  blockvector = new (i) struct blockvector (i);
> +  blockvector = new struct blockvector (i);
>  
>    /* Copy the blocks into the blockvector.  This is done in reverse
>       order, which happens to put the blocks into the proper order
> diff --git a/gdb/jit.c b/gdb/jit.c
> index d95fe542a29..4e3abd1802c 100644
> --- a/gdb/jit.c
> +++ b/gdb/jit.c
> @@ -553,7 +553,7 @@ finalize_symtab (struct gdb_symtab *stab, struct objfile *objfile)
>        filetab->set_linetable (new_table);
>      }
>  
> -  bv = new (actual_nblocks) blockvector (actual_nblocks);
> +  bv = new blockvector (actual_nblocks);
>    cust->set_blockvector (bv);
>  
>    /* At the end of this function, (begin, end) will contain the PC range this
> diff --git a/gdb/mdebugread.c b/gdb/mdebugread.c
> index 3cedf0e4fc2..90bbcd81ed2 100644
> --- a/gdb/mdebugread.c
> +++ b/gdb/mdebugread.c
> @@ -256,8 +256,6 @@ static legacy_psymtab *new_psymtab (const char *, psymtab_storage *,
>  static void mdebug_expand_psymtab (legacy_psymtab *pst,
>  				  struct objfile *objfile);
>  
> -static void add_block (struct block *, struct symtab *);
> -
>  static void add_symbol (struct symbol *, struct symtab *, struct block *);
>  
>  static int add_line (struct linetable *, int, CORE_ADDR, int);
> @@ -808,7 +806,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>        b->set_start (sh->value);
>        b->set_end (sh->value);
>        b->set_superblock (top_stack->cur_block);
> -      add_block (b, top_stack->cur_st);
> +      top_stack->cur_st->compunit ()->blockvector ()->add_block (b);

I don't see it being called out anywhere that this change is now going
to reorder the blocks.  I have no idea if this change is important or
not, but I think it is usually a good idea to talk about this sort thing
in the commit message -- imagine in a few months someone hits a bug in
ECOFF support related to block ordering, they look at this commit, and
they have to figure out why the block ordering changed....

Thanks,
Andrew

>  
>        /* Not if we only have partial info.  */
>        if (SC_IS_UNDEF (sh->sc) || sh->sc == scNil)
> @@ -1136,7 +1134,7 @@ parse_symbol (SYMR *sh, union aux_ext *ax, char *ext_sh, int bigend,
>        b->set_start (sh->value + top_stack->procadr);
>        b->set_superblock (top_stack->cur_block);
>        top_stack->cur_block = b;
> -      add_block (b, top_stack->cur_st);
> +      top_stack->cur_st->compunit ()->blockvector ()->add_block (b);
>        break;
>  
>      case stEnd:		/* end (of anything) */
> @@ -4492,28 +4490,6 @@ add_symbol (struct symbol *s, struct symtab *symtab, struct block *b)
>    mdict_add_symbol (b->multidict (), s);
>  }
>  
> -/* Add a new block B to a symtab S.  */
> -
> -static void
> -add_block (struct block *b, struct symtab *s)
> -{
> -  /* Cast away "const", but that's ok because we're building the
> -     symtab and blockvector here.  */
> -  struct blockvector *old_bv
> -    = (struct blockvector *) s->compunit ()->blockvector ();
> -
> -  struct blockvector *new_bv = new (old_bv->num_blocks () + 1)
> -    blockvector (old_bv->num_blocks () + 1);
> -
> -  s->compunit ()->set_blockvector (old_bv);
> -
> -  for (int i = 0; i < old_bv->num_blocks (); i++)
> -    new_bv->set_block (i, old_bv->block (i));
> -  new_bv->set_block (new_bv->num_blocks () - 1, b);
> -
> -  delete old_bv;
> -}
> -
>  /* Add a new linenumber entry (LINENO,ADR) to a linevector LT.
>     MIPS' linenumber encoding might need more than one byte
>     to describe it, LAST is used to detect these continuation lines.
> @@ -4635,7 +4611,7 @@ new_symtab (const char *name, int maxlines, struct objfile *objfile)
>    lang = cust->language ();
>  
>    /* All symtabs must have at least two blocks.  */
> -  bv = new (2) blockvector (2);
> +  bv = new blockvector (2);
>    bv->set_block (GLOBAL_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang));
>    bv->set_block (STATIC_BLOCK, new_block (objfile, NON_FUNCTION_BLOCK, lang));
>    bv->static_block ()->set_superblock (bv->global_block ());
> -- 
> 2.51.0

next prev parent reply	other threads:[~2025-10-14 15:01 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-13 18:23 [PATCH 0/2] " Jan Vrany
2025-10-13 18:23 ` [PATCH 1/2] gdb: allocate blockvector on heap Jan Vrany
2025-10-14 18:41   ` Simon Marchi
2025-10-13 18:23 ` [PATCH 2/2] gdb: use std::vector<> to hold on blocks in struct blockvector Jan Vrany
2025-10-14 15:00   ` Andrew Burgess [this message]
2025-10-14 19:41     ` Simon Marchi
2025-10-15 21:34       ` Jan Vraný
2025-10-14 19:29   ` Simon Marchi
2025-10-14 20:05     ` Tom Tromey
2025-10-14 20:19       ` Simon Marchi
2025-10-14 20:40         ` Jan Vraný
2025-10-15  2:00           ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v7kha5pg.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=jan.vrany@labware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox