Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix memory leak with .debug_types
@ 2025-10-23 17:17 Tom Tromey
  2025-10-23 17:17 ` [PATCH 1/2] Two bug fixes in mdict_free Tom Tromey
  2025-10-23 17:17 ` [PATCH 2/2] Free multidicts from blockvector Tom Tromey
  0 siblings, 2 replies; 11+ messages in thread
From: Tom Tromey @ 2025-10-23 17:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

I noticed a while ago that nothing in the tree ever calls mdict_free,
and looking deeper I found that .debug_types has a longstanding memory
leak.

Now that blockvectors are heap-allocated, this is relatively easy to
fix, which is what this series does.

Regression tested on x86-64 Fedora 40.

Signed-off-by: Tom Tromey <tom@tromey.com>
---
Tom Tromey (2):
      Two bug fixes in mdict_free
      Free multidicts from blockvector

 gdb/block.c      |  6 ++++++
 gdb/block.h      |  2 ++
 gdb/dictionary.c | 13 +++++++++----
 3 files changed, 17 insertions(+), 4 deletions(-)
---
base-commit: 8eeb52e3c7701f03711d51a2209c39d85d109500
change-id: 20251023-mdict-free-47eec1d7635d

Best regards,
-- 
Tom Tromey <tom@tromey.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] Two bug fixes in mdict_free
  2025-10-23 17:17 [PATCH 0/2] Fix memory leak with .debug_types Tom Tromey
@ 2025-10-23 17:17 ` Tom Tromey
  2025-10-23 18:26   ` Simon Marchi
  2025-10-23 17:17 ` [PATCH 2/2] Free multidicts from blockvector Tom Tromey
  1 sibling, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2025-10-23 17:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

A heap-allocated multidictionary should be freed by calling
mdict_free.  However, while this function does free the contents of
the dictionary, it neglects to free the dictionary itself.

There's also a second bug, which is that if a multidictionary is
created with no dictionaries, gdb will crash on the first line of
mdict_free:

  enum dict_type type = mdict->dictionaries[0]->vector->type;

So, this patch also adds the type to struct multidictionary, avoiding
this problem.  Note that this does not increase the structure size on
x86-64, because the new member fits into the padding.
---
 gdb/dictionary.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gdb/dictionary.c b/gdb/dictionary.c
index f435ad5a47e817dc59087ac2ff5637415a9256d6..9dec7ccb2af5fad1e0cb62b92dc79e41e2fbbc15 100644
--- a/gdb/dictionary.c
+++ b/gdb/dictionary.c
@@ -911,6 +911,9 @@ struct multidictionary
   /* The number of language dictionaries currently allocated.
      Only used for expandable dictionaries.  */
   unsigned short n_allocated_dictionaries;
+
+  /* The type of dictionary.  */
+  enum dict_type type;
 };
 
 /* A helper function to collate symbols on the pending list by language.  */
@@ -948,6 +951,7 @@ mdict_create_hashed (struct obstack *obstack,
   retval->dictionaries
     = XOBNEWVEC (obstack, struct dictionary *, nsyms.size ());
   retval->n_allocated_dictionaries = nsyms.size ();
+  retval->type = DICT_HASHED;
 
   int idx = 0;
   for (const auto &[language, symlist] : nsyms)
@@ -969,6 +973,7 @@ mdict_create_hashed_expandable (enum language language)
   retval->n_allocated_dictionaries = 1;
   retval->dictionaries = XNEW (struct dictionary *);
   retval->dictionaries[0] = dict_create_hashed_expandable (language);
+  retval->type = DICT_HASHED_EXPANDABLE;
 
   return retval;
 }
@@ -988,6 +993,7 @@ mdict_create_linear (struct obstack *obstack,
   retval->dictionaries
     = XOBNEWVEC (obstack, struct dictionary *, nsyms.size ());
   retval->n_allocated_dictionaries = nsyms.size ();
+  retval->type = DICT_LINEAR;
 
   int idx = 0;
   for (const auto &[language, symlist] : nsyms)
@@ -1009,6 +1015,7 @@ mdict_create_linear_expandable (enum language language)
   retval->n_allocated_dictionaries = 1;
   retval->dictionaries = XNEW (struct dictionary *);
   retval->dictionaries[0] = dict_create_linear_expandable (language);
+  retval->type = DICT_LINEAR_EXPANDABLE;
 
   return retval;
 }
@@ -1018,15 +1025,12 @@ mdict_create_linear_expandable (enum language language)
 void
 mdict_free (struct multidictionary *mdict)
 {
-  /* Grab the type of dictionary being used.  */
-  enum dict_type type = mdict->dictionaries[0]->vector->type;
-
   /* Loop over all dictionaries and free them.  */
   for (unsigned short idx = 0; idx < mdict->n_allocated_dictionaries; ++idx)
     dict_free (mdict->dictionaries[idx]);
 
   /* Free the dictionary list, if needed.  */
-  switch (type)
+  switch (mdict->type)
     {
     case DICT_HASHED:
     case DICT_LINEAR:
@@ -1036,6 +1040,7 @@ mdict_free (struct multidictionary *mdict)
     case DICT_HASHED_EXPANDABLE:
     case DICT_LINEAR_EXPANDABLE:
       xfree (mdict->dictionaries);
+      xfree (mdict);
       break;
     }
 }

-- 
2.49.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/2] Free multidicts from blockvector
  2025-10-23 17:17 [PATCH 0/2] Fix memory leak with .debug_types Tom Tromey
  2025-10-23 17:17 ` [PATCH 1/2] Two bug fixes in mdict_free Tom Tromey
@ 2025-10-23 17:17 ` Tom Tromey
  2025-10-23 18:31   ` Simon Marchi
  1 sibling, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2025-10-23 17:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Currently, nothing in the tree ever calls mdict_free.  However, code
does heap-allocate some multidicts.  A simple way to see this is to
use valgrind, run "gdb -readnow" on the executable created by
gdb.dwarf2/struct-with-sig.exp, and then use "file" to clear the
objfile list.  This yields:

==1522843== 144 (16 direct, 128 indirect) bytes in 1 blocks are definitely lost in loss record 905 of 3,005
==1522843==    at 0x4843866: malloc (vg_replace_malloc.c:446)
==1522843==    by 0x48E397: xmalloc (alloc.c:52)
==1522843==    by 0x59DE66: multidictionary* xnew<multidictionary>() (poison.h:102)
==1522843==    by 0x59CFF4: mdict_create_hashed_expandable(language) (dictionary.c:965)
==1522843==    by 0x50A269: buildsym_compunit::finish_block_internal(symbol*, pending**, pending_block*, dynamic_prop const*, unsigned long, unsigned long, int, int) (buildsym.c:221)
==1522843==    by 0x50AE04: buildsym_compunit::end_compunit_symtab_get_static_block(unsigned long, int, int) (buildsym.c:818)
==1522843==    by 0x50C4CF: buildsym_compunit::end_expandable_symtab(unsigned long) (buildsym.c:1037)
==1522843==    by 0x61DBC6: process_full_type_unit (read.c:4970)

This patch fixes the leaks by calling mdict_free when a blockvector is
destroyed.
---
 gdb/block.c | 6 ++++++
 gdb/block.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/gdb/block.c b/gdb/block.c
index 02e6d0e4be2bab2c1b3750a2f396e4bc97b7bc5d..2d8d40ec4eb94a1c891a3f8b48972eff3050e66d 100644
--- a/gdb/block.c
+++ b/gdb/block.c
@@ -842,6 +842,12 @@ blockvector::append_block (struct block *block)
   m_blocks.push_back (block);
 }
 
+blockvector::~blockvector ()
+{
+  for (struct block *bl : m_blocks)
+    mdict_free (bl->multidict ());
+}
+
 /* Implement 'maint info blocks' command.  If passed an argument then
    print a list of all blocks at the given address.  With no arguments
    then list all blocks at the current address of the current inferior.  */
diff --git a/gdb/block.h b/gdb/block.h
index 61070619ca7daa7b3310a9c5cd8d6ca86c044ada..98ce5a3839dbb8419c9e3a918d27b3b58485fea4 100644
--- a/gdb/block.h
+++ b/gdb/block.h
@@ -424,6 +424,8 @@ struct blockvector
     : m_blocks (nblocks, nullptr)
   {}
 
+  ~blockvector ();
+
   /* Return a view on the blocks of this blockvector.  */
   gdb::array_view<struct block *> blocks ()
   {

-- 
2.49.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Two bug fixes in mdict_free
  2025-10-23 17:17 ` [PATCH 1/2] Two bug fixes in mdict_free Tom Tromey
@ 2025-10-23 18:26   ` Simon Marchi
  2025-10-23 18:32     ` Simon Marchi
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Marchi @ 2025-10-23 18:26 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 10/23/25 1:17 PM, Tom Tromey wrote:
> A heap-allocated multidictionary should be freed by calling
> mdict_free.  However, while this function does free the contents of
> the dictionary, it neglects to free the dictionary itself.
> 
> There's also a second bug, which is that if a multidictionary is
> created with no dictionaries, gdb will crash on the first line of
> mdict_free:
> 
>   enum dict_type type = mdict->dictionaries[0]->vector->type;
> 
> So, this patch also adds the type to struct multidictionary, avoiding
> this problem.  Note that this does not increase the structure size on
> x86-64, because the new member fits into the padding.

I think you could just remove dict_vector::type and use
multidictionary::type at the single spot where dict_vector::type is
still used.

Simon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Free multidicts from blockvector
  2025-10-23 17:17 ` [PATCH 2/2] Free multidicts from blockvector Tom Tromey
@ 2025-10-23 18:31   ` Simon Marchi
  2025-10-23 18:52     ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Marchi @ 2025-10-23 18:31 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 10/23/25 1:17 PM, Tom Tromey wrote:
> Currently, nothing in the tree ever calls mdict_free.  However, code
> does heap-allocate some multidicts.  A simple way to see this is to
> use valgrind, run "gdb -readnow" on the executable created by
> gdb.dwarf2/struct-with-sig.exp, and then use "file" to clear the
> objfile list.  This yields:
> 
> ==1522843== 144 (16 direct, 128 indirect) bytes in 1 blocks are definitely lost in loss record 905 of 3,005
> ==1522843==    at 0x4843866: malloc (vg_replace_malloc.c:446)
> ==1522843==    by 0x48E397: xmalloc (alloc.c:52)
> ==1522843==    by 0x59DE66: multidictionary* xnew<multidictionary>() (poison.h:102)
> ==1522843==    by 0x59CFF4: mdict_create_hashed_expandable(language) (dictionary.c:965)
> ==1522843==    by 0x50A269: buildsym_compunit::finish_block_internal(symbol*, pending**, pending_block*, dynamic_prop const*, unsigned long, unsigned long, int, int) (buildsym.c:221)
> ==1522843==    by 0x50AE04: buildsym_compunit::end_compunit_symtab_get_static_block(unsigned long, int, int) (buildsym.c:818)
> ==1522843==    by 0x50C4CF: buildsym_compunit::end_expandable_symtab(unsigned long) (buildsym.c:1037)
> ==1522843==    by 0x61DBC6: process_full_type_unit (read.c:4970)
> 
> This patch fixes the leaks by calling mdict_free when a blockvector is
> destroyed.
> ---
>  gdb/block.c | 6 ++++++
>  gdb/block.h | 2 ++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/gdb/block.c b/gdb/block.c
> index 02e6d0e4be2bab2c1b3750a2f396e4bc97b7bc5d..2d8d40ec4eb94a1c891a3f8b48972eff3050e66d 100644
> --- a/gdb/block.c
> +++ b/gdb/block.c
> @@ -842,6 +842,12 @@ blockvector::append_block (struct block *block)
>    m_blocks.push_back (block);
>  }
>  
> +blockvector::~blockvector ()
> +{
> +  for (struct block *bl : m_blocks)
> +    mdict_free (bl->multidict ());
> +}

Ideally it would be struct block's just to call that.  You could add a
destructor to struct block and call it explicitly here.  It would be the
same result, but would be more correct encapsulation-wise.  But I guess
this is fine too for the time being.

> +
>  /* Implement 'maint info blocks' command.  If passed an argument then
>     print a list of all blocks at the given address.  With no arguments
>     then list all blocks at the current address of the current inferior.  */
> diff --git a/gdb/block.h b/gdb/block.h
> index 61070619ca7daa7b3310a9c5cd8d6ca86c044ada..98ce5a3839dbb8419c9e3a918d27b3b58485fea4 100644
> --- a/gdb/block.h
> +++ b/gdb/block.h
> @@ -424,6 +424,8 @@ struct blockvector
>      : m_blocks (nblocks, nullptr)
>    {}
>  
> +  ~blockvector ();

Since there is a user-defined destructor, I would add a
DISABLE_COPY_AND_ASSIGN.  Otherwise, LGTM.

Approved-By: Simon Marchi <simon.marchi@efficios.com>

Simon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Two bug fixes in mdict_free
  2025-10-23 18:26   ` Simon Marchi
@ 2025-10-23 18:32     ` Simon Marchi
  2025-10-23 18:49       ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Marchi @ 2025-10-23 18:32 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 10/23/25 2:26 PM, Simon Marchi wrote:
> On 10/23/25 1:17 PM, Tom Tromey wrote:
>> A heap-allocated multidictionary should be freed by calling
>> mdict_free.  However, while this function does free the contents of
>> the dictionary, it neglects to free the dictionary itself.
>>
>> There's also a second bug, which is that if a multidictionary is
>> created with no dictionaries, gdb will crash on the first line of
>> mdict_free:
>>
>>   enum dict_type type = mdict->dictionaries[0]->vector->type;
>>
>> So, this patch also adds the type to struct multidictionary, avoiding
>> this problem.  Note that this does not increase the structure size on
>> x86-64, because the new member fits into the padding.
> 
> I think you could just remove dict_vector::type and use
> multidictionary::type at the single spot where dict_vector::type is
> still used.
> 
> Simon

Btw, your change LGTM with or without my suggestion.

Approved-By: Simon Marchi <simon.marchi@efficios.com>

Simon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] Two bug fixes in mdict_free
  2025-10-23 18:32     ` Simon Marchi
@ 2025-10-23 18:49       ` Tom Tromey
  0 siblings, 0 replies; 11+ messages in thread
From: Tom Tromey @ 2025-10-23 18:49 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

>> I think you could just remove dict_vector::type and use
>> multidictionary::type at the single spot where dict_vector::type is
>> still used.

Simon> Btw, your change LGTM with or without my suggestion.

I made the change.

Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Free multidicts from blockvector
  2025-10-23 18:31   ` Simon Marchi
@ 2025-10-23 18:52     ` Tom Tromey
  2025-10-24 14:08       ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2025-10-23 18:52 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

>> +  for (struct block *bl : m_blocks)
>> +    mdict_free (bl->multidict ());
>> +}

Simon> Ideally it would be struct block's just to call that.  You could add a
Simon> destructor to struct block and call it explicitly here.  It would be the
Simon> same result, but would be more correct encapsulation-wise.  But I guess
Simon> this is fine too for the time being.

Yeah.  I have kind of an aversion to explicit destructor calls.
Though maybe not for any good reason.

Simon> Since there is a user-defined destructor, I would add a
Simon> DISABLE_COPY_AND_ASSIGN.  Otherwise, LGTM.

I did this.

Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Free multidicts from blockvector
  2025-10-23 18:52     ` Tom Tromey
@ 2025-10-24 14:08       ` Tom Tromey
  2025-10-24 16:25         ` Simon Marchi
  0 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2025-10-24 14:08 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi, gdb-patches

Simon> Ideally it would be struct block's just to call that.  You could add a
Simon> destructor to struct block and call it explicitly here.  It would be the
Simon> same result, but would be more correct encapsulation-wise.  But I guess
Simon> this is fine too for the time being.

Tom> Yeah.  I have kind of an aversion to explicit destructor calls.
Tom> Though maybe not for any good reason.

I was thinking about this last night and realized there's a bigger
issue, which is that blocks are allocated on an obstack:

    struct block : public allocate_on_obstack<block>

and

    block = new (&m_objfile->objfile_obstack) struct block;

but allocate_on_obstack requires that the object be trivially
destructible.

Maybe this could be remedied by resurrecting Jan's obstack allocator;
making an "obstack_ptr" class that explicitly calls the destructor but
not 'delete'; and finally changing blockvector to hold these.

Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Free multidicts from blockvector
  2025-10-24 14:08       ` Tom Tromey
@ 2025-10-24 16:25         ` Simon Marchi
  2025-10-24 18:04           ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Marchi @ 2025-10-24 16:25 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 10/24/25 10:08 AM, Tom Tromey wrote:
> Simon> Ideally it would be struct block's just to call that.  You could add a
> Simon> destructor to struct block and call it explicitly here.  It would be the
> Simon> same result, but would be more correct encapsulation-wise.  But I guess
> Simon> this is fine too for the time being.
> 
> Tom> Yeah.  I have kind of an aversion to explicit destructor calls.
> Tom> Though maybe not for any good reason.
> 
> I was thinking about this last night and realized there's a bigger
> issue, which is that blocks are allocated on an obstack:
> 
>     struct block : public allocate_on_obstack<block>
> 
> and
> 
>     block = new (&m_objfile->objfile_obstack) struct block;
> 
> but allocate_on_obstack requires that the object be trivially
> destructible.
> 
> Maybe this could be remedied by resurrecting Jan's obstack allocator;
> making an "obstack_ptr" class that explicitly calls the destructor but
> not 'delete'; and finally changing blockvector to hold these.

I guess that this restriction of "objects allocated on obstack must be
trivially destructible" is a bit artificial and arbitrary?  It is
technically fine to allocate on obstack an object having a destructor,
as long as you ensure the destructor is called at some point.  I guess
the restriction was put because it's easy to forget and error prone.
This obstack_ptr you talk about (obstack_unique_ptr?) sounds like a good
way to automate it.

I was wondering if we needed to keep allocating these on an obstack, but
my understanding is that obstack is a good choice here because:

 - We can't use a vector<block> to store them, since the blocks
   themselves can't move, as that would break the parent (superblock)
   relationship

 - There is a ton of blocks created, so it's nice to be able to free
   their memory in O(1).  Unfortunately, having to free
   block::m_multidict means that part is O(n), but the less we have to
   do in O(n) the better I guess?

Simon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] Free multidicts from blockvector
  2025-10-24 16:25         ` Simon Marchi
@ 2025-10-24 18:04           ` Tom Tromey
  0 siblings, 0 replies; 11+ messages in thread
From: Tom Tromey @ 2025-10-24 18:04 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

Simon> I guess that this restriction of "objects allocated on obstack must be
Simon> trivially destructible" is a bit artificial and arbitrary?  It is
Simon> technically fine to allocate on obstack an object having a destructor,
Simon> as long as you ensure the destructor is called at some point.  I guess
Simon> the restriction was put because it's easy to forget and error prone.

I think the specific difference from ordinary 'new' is that valgrind
will detect a new-based memory leak, but not an obstack-based leak.  So,
in the heap case, the leak will also point out that the destructor isn't
being run; whereas in the obstack case it could go completely unnoticed.

Simon>  - There is a ton of blocks created, so it's nice to be able to free
Simon>    their memory in O(1).  Unfortunately, having to free
Simon>    block::m_multidict means that part is O(n), but the less we have to
Simon>    do in O(n) the better I guess?

We could also track the ownership of the dictionaries separately,
e.g. with unordered_set<mdict_unique_ptr> or similar.  This would reduce
the cost if it matters.

Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-10-24 18:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-10-23 17:17 [PATCH 0/2] Fix memory leak with .debug_types Tom Tromey
2025-10-23 17:17 ` [PATCH 1/2] Two bug fixes in mdict_free Tom Tromey
2025-10-23 18:26   ` Simon Marchi
2025-10-23 18:32     ` Simon Marchi
2025-10-23 18:49       ` Tom Tromey
2025-10-23 17:17 ` [PATCH 2/2] Free multidicts from blockvector Tom Tromey
2025-10-23 18:31   ` Simon Marchi
2025-10-23 18:52     ` Tom Tromey
2025-10-24 14:08       ` Tom Tromey
2025-10-24 16:25         ` Simon Marchi
2025-10-24 18:04           ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox