[PATCH] Shrink size of dwarf2_per_cu

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [PATCH] Shrink size of dwarf2_per_cu_data
@ 2020-11-03 22:13 Tom Tromey
  2020-11-04 10:59 ` Gary Benson via Gdb-patches
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Tromey @ 2020-11-03 22:13 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

I noticed some holes in struct dwarf2_per_cu_data.  This patch
rearranges the type slightly, and shrinks the size of some fields.
This reduces it from 136 bytes to 112 bytes (on x86-64).

I also reduced the size of the DWARF "version" fields in a couple of
spots.  It seemed needless to use a short to hold a value that ranges
from 2 to 5, and this also helped the goal of shrinking
dwarf2_per_cu_data.

gdb/ChangeLog
2020-11-03  Tom Tromey  <tom@tromey.com>

	* dwarf2/read.h (struct dwarf2_per_cu_data) <dwarf_version>: Now
	unsigned char.
	(struct dwarf2_per_cu_data): Rearrange.
	* dwarf2/comp-unit.h (struct comp_unit_head) <version>: Now
	unsigned char.
	(struct comp_unit_head): Rearrange.
	* dwarf2/comp-unit.c (read_comp_unit_head): Update.
---
 gdb/ChangeLog          | 10 ++++++++++
 gdb/dwarf2/comp-unit.c |  7 ++++---
 gdb/dwarf2/comp-unit.h | 15 +++++++--------
 gdb/dwarf2/read.h      | 26 +++++++++++++-------------
 4 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/gdb/dwarf2/comp-unit.c b/gdb/dwarf2/comp-unit.c
index 7ddc893aac6..05cf3f1c6a8 100644
--- a/gdb/dwarf2/comp-unit.c
+++ b/gdb/dwarf2/comp-unit.c
@@ -75,11 +75,12 @@ read_comp_unit_head (struct comp_unit_head *cu_header,
   cu_header->initial_length_size = bytes_read;
   cu_header->offset_size = (bytes_read == 4) ? 4 : 8;
   info_ptr += bytes_read;
-  cu_header->version = read_2_bytes (abfd, info_ptr);
-  if (cu_header->version < 2 || cu_header->version > 5)
+  unsigned version = read_2_bytes (abfd, info_ptr);
+  if (version < 2 || version > 5)
     error (_("Dwarf Error: wrong version in compilation unit header "
 	   "(is %d, should be 2, 3, 4 or 5) [in module %s]"),
-	   cu_header->version, filename);
+	   version, filename);
+  cu_header->version = version;
   info_ptr += 2;
   if (cu_header->version < 5)
     switch (section_kind)
diff --git a/gdb/dwarf2/comp-unit.h b/gdb/dwarf2/comp-unit.h
index 35bca8f8e9d..09ffac8d2da 100644
--- a/gdb/dwarf2/comp-unit.h
+++ b/gdb/dwarf2/comp-unit.h
@@ -35,7 +35,7 @@
 struct comp_unit_head
 {
   unsigned int length;
-  short version;
+  unsigned char version;
   unsigned char addr_size;
   unsigned char signed_addr_p;
   sect_offset abbrev_sect_off;
@@ -48,14 +48,16 @@ struct comp_unit_head
 
   enum dwarf_unit_type unit_type;
 
-  /* Offset to the first byte of this compilation unit header in the
-     .debug_info section, for resolving relative reference dies.  */
-  sect_offset sect_off;
-
   /* Offset to first die in this cu from the start of the cu.
      This will be the first byte following the compilation unit header.  */
   cu_offset first_die_cu_offset;
 
+  /* Offset to the first byte of this compilation unit header in the
+     .debug_info section, for resolving relative reference dies.  */
+  sect_offset sect_off;
+
+  /* For types, offset in the type's DIE of the type defined by this TU.  */
+  cu_offset type_cu_offset_in_tu;
 
   /* 64-bit signature of this unit. For type units, it denotes the signature of
      the type (DW_UT_type in DWARF 4, additionally DW_UT_split_type in DWARF 5).
@@ -63,9 +65,6 @@ struct comp_unit_head
      DW_UT_skeleton or DW_UT_split_compile.  */
   ULONGEST signature;
 
-  /* For types, offset in the type's DIE of the type defined by this TU.  */
-  cu_offset type_cu_offset_in_tu;
-
   /* Return the total length of the CU described by this header.  */
   unsigned int get_length () const
   {
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index a0d76f349e8..f358c562fdf 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -433,7 +433,7 @@ struct dwarf2_per_cu_data
   unsigned int length;
 
   /* DWARF standard version this data has been read from (such as 4 or 5).  */
-  short dwarf_version;
+  unsigned char dwarf_version;
 
   /* Flag indicating this compilation unit will be read in before
      any of the current compilation units are processed.  */
@@ -469,6 +469,18 @@ struct dwarf2_per_cu_data
      This flag is only valid if is_debug_types is true.  */
   unsigned int tu_read : 1;
 
+  /* True if HEADER has been read in.
+
+     Don't access this field directly.  It should be private, but we can't make
+     it private at the moment.  */
+  mutable bool m_header_read_in : 1;
+
+  /* The unit type of this CU.  */
+  ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
+
+  /* The language of this CU.  */
+  ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
+
   /* Our index in the unshared "symtabs" vector.  */
   unsigned index;
 
@@ -477,12 +489,6 @@ struct dwarf2_per_cu_data
      not the DWO file.  */
   struct dwarf2_section_info *section;
 
-  /* The unit type of this CU.  */
-  enum dwarf_unit_type unit_type;
-
-  /* The language of this CU.  */
-  enum language lang;
-
   /* Backlink to the owner of this.  */
   dwarf2_per_bfd *per_bfd;
 
@@ -495,12 +501,6 @@ struct dwarf2_per_cu_data
      should be private, but we can't make it private at the moment.  */
   mutable comp_unit_head m_header;
 
-  /* True if HEADER has been read in.
-
-     Don't access this field directly.  It should be private, but we can't make
-     it private at the moment.  */
-  mutable bool m_header_read_in;
-
   /* When dwarf2_per_bfd::using_index is true, the 'quick' field
      is active.  Otherwise, the 'psymtab' field is active.  */
   union
-- 
2.17.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Shrink size of dwarf2_per_cu_data
  2020-11-03 22:13 [PATCH] Shrink size of dwarf2_per_cu_data Tom Tromey
@ 2020-11-04 10:59 ` Gary Benson via Gdb-patches
  2021-04-21 23:04   ` Tom Tromey
  0 siblings, 1 reply; 5+ messages in thread
From: Gary Benson via Gdb-patches @ 2020-11-04 10:59 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

Tom Tromey wrote:
> I noticed some holes in struct dwarf2_per_cu_data.  This patch
> rearranges the type slightly, and shrinks the size of some fields.
> This reduces it from 136 bytes to 112 bytes (on x86-64).

Nice.  Though I'd assumed the compiler would do this.

> I also reduced the size of the DWARF "version" fields in a couple of
> spots.  It seemed needless to use a short to hold a value that
> ranges from 2 to 5, and this also helped the goal of shrinking
> dwarf2_per_cu_data.

There's other things in there if you were going that way, e.g.:

  unsigned char addr_size;
  unsigned char signed_addr_p;
  sect_offset abbrev_sect_off;

  /* Size of file offsets; either 4 or 8.  */
  unsigned int offset_size;

  /* Size of the length field; either 4 or 12.  */
  unsigned int initial_length_size;

I presume you'd need to lose 8 whole bytes to shrink the struct
further, so marking these as <8bit values probably wouldn't help
any, but maybe it's worth it?  How many CUs does something big
have?

Anyway, LGTM, I don't require any changes, the above is just
commentary.

Thanks,
Gary

> gdb/ChangeLog
> 2020-11-03  Tom Tromey  <tom@tromey.com>
> 
> 	* dwarf2/read.h (struct dwarf2_per_cu_data) <dwarf_version>: Now
> 	unsigned char.
> 	(struct dwarf2_per_cu_data): Rearrange.
> 	* dwarf2/comp-unit.h (struct comp_unit_head) <version>: Now
> 	unsigned char.
> 	(struct comp_unit_head): Rearrange.
> 	* dwarf2/comp-unit.c (read_comp_unit_head): Update.
> ---
>  gdb/ChangeLog          | 10 ++++++++++
>  gdb/dwarf2/comp-unit.c |  7 ++++---
>  gdb/dwarf2/comp-unit.h | 15 +++++++--------
>  gdb/dwarf2/read.h      | 26 +++++++++++++-------------
>  4 files changed, 34 insertions(+), 24 deletions(-)
> 
> diff --git a/gdb/dwarf2/comp-unit.c b/gdb/dwarf2/comp-unit.c
> index 7ddc893aac6..05cf3f1c6a8 100644
> --- a/gdb/dwarf2/comp-unit.c
> +++ b/gdb/dwarf2/comp-unit.c
> @@ -75,11 +75,12 @@ read_comp_unit_head (struct comp_unit_head *cu_header,
>    cu_header->initial_length_size = bytes_read;
>    cu_header->offset_size = (bytes_read == 4) ? 4 : 8;
>    info_ptr += bytes_read;
> -  cu_header->version = read_2_bytes (abfd, info_ptr);
> -  if (cu_header->version < 2 || cu_header->version > 5)
> +  unsigned version = read_2_bytes (abfd, info_ptr);
> +  if (version < 2 || version > 5)
>      error (_("Dwarf Error: wrong version in compilation unit header "
>  	   "(is %d, should be 2, 3, 4 or 5) [in module %s]"),
> -	   cu_header->version, filename);
> +	   version, filename);
> +  cu_header->version = version;
>    info_ptr += 2;
>    if (cu_header->version < 5)
>      switch (section_kind)
> diff --git a/gdb/dwarf2/comp-unit.h b/gdb/dwarf2/comp-unit.h
> index 35bca8f8e9d..09ffac8d2da 100644
> --- a/gdb/dwarf2/comp-unit.h
> +++ b/gdb/dwarf2/comp-unit.h
> @@ -35,7 +35,7 @@
>  struct comp_unit_head
>  {
>    unsigned int length;
> -  short version;
> +  unsigned char version;
>    unsigned char addr_size;
>    unsigned char signed_addr_p;
>    sect_offset abbrev_sect_off;
> @@ -48,14 +48,16 @@ struct comp_unit_head
>  
>    enum dwarf_unit_type unit_type;
>  
> -  /* Offset to the first byte of this compilation unit header in the
> -     .debug_info section, for resolving relative reference dies.  */
> -  sect_offset sect_off;
> -
>    /* Offset to first die in this cu from the start of the cu.
>       This will be the first byte following the compilation unit header.  */
>    cu_offset first_die_cu_offset;
>  
> +  /* Offset to the first byte of this compilation unit header in the
> +     .debug_info section, for resolving relative reference dies.  */
> +  sect_offset sect_off;
> +
> +  /* For types, offset in the type's DIE of the type defined by this TU.  */
> +  cu_offset type_cu_offset_in_tu;
>  
>    /* 64-bit signature of this unit. For type units, it denotes the signature of
>       the type (DW_UT_type in DWARF 4, additionally DW_UT_split_type in DWARF 5).
> @@ -63,9 +65,6 @@ struct comp_unit_head
>       DW_UT_skeleton or DW_UT_split_compile.  */
>    ULONGEST signature;
>  
> -  /* For types, offset in the type's DIE of the type defined by this TU.  */
> -  cu_offset type_cu_offset_in_tu;
> -
>    /* Return the total length of the CU described by this header.  */
>    unsigned int get_length () const
>    {
> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
> index a0d76f349e8..f358c562fdf 100644
> --- a/gdb/dwarf2/read.h
> +++ b/gdb/dwarf2/read.h
> @@ -433,7 +433,7 @@ struct dwarf2_per_cu_data
>    unsigned int length;
>  
>    /* DWARF standard version this data has been read from (such as 4 or 5).  */
> -  short dwarf_version;
> +  unsigned char dwarf_version;
>  
>    /* Flag indicating this compilation unit will be read in before
>       any of the current compilation units are processed.  */
> @@ -469,6 +469,18 @@ struct dwarf2_per_cu_data
>       This flag is only valid if is_debug_types is true.  */
>    unsigned int tu_read : 1;
>  
> +  /* True if HEADER has been read in.
> +
> +     Don't access this field directly.  It should be private, but we can't make
> +     it private at the moment.  */
> +  mutable bool m_header_read_in : 1;
> +
> +  /* The unit type of this CU.  */
> +  ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
> +
> +  /* The language of this CU.  */
> +  ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
> +
>    /* Our index in the unshared "symtabs" vector.  */
>    unsigned index;
>  
> @@ -477,12 +489,6 @@ struct dwarf2_per_cu_data
>       not the DWO file.  */
>    struct dwarf2_section_info *section;
>  
> -  /* The unit type of this CU.  */
> -  enum dwarf_unit_type unit_type;
> -
> -  /* The language of this CU.  */
> -  enum language lang;
> -
>    /* Backlink to the owner of this.  */
>    dwarf2_per_bfd *per_bfd;
>  
> @@ -495,12 +501,6 @@ struct dwarf2_per_cu_data
>       should be private, but we can't make it private at the moment.  */
>    mutable comp_unit_head m_header;
>  
> -  /* True if HEADER has been read in.
> -
> -     Don't access this field directly.  It should be private, but we can't make
> -     it private at the moment.  */
> -  mutable bool m_header_read_in;
> -
>    /* When dwarf2_per_bfd::using_index is true, the 'quick' field
>       is active.  Otherwise, the 'psymtab' field is active.  */
>    union
> -- 
> 2.17.2
> 

-- 
Gary Benson - he / him / his
Principal Software Engineer, Red Hat


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Shrink size of dwarf2_per_cu_data
  2020-11-04 10:59 ` Gary Benson via Gdb-patches
@ 2021-04-21 23:04   ` Tom Tromey
  2021-04-22 21:39     ` Christian Biesinger via Gdb-patches
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Tromey @ 2021-04-21 23:04 UTC (permalink / raw)
  To: Gary Benson; +Cc: Tom Tromey, gdb-patches

>>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:

Gary> Tom Tromey wrote:
>> I noticed some holes in struct dwarf2_per_cu_data.  This patch
>> rearranges the type slightly, and shrinks the size of some fields.
>> This reduces it from 136 bytes to 112 bytes (on x86-64).

Gary> Nice.  Though I'd assumed the compiler would do this.

C and C++ compilers can't generally do this, though other languages like
Rust and Ada do allow it.

>> I also reduced the size of the DWARF "version" fields in a couple of
>> spots.  It seemed needless to use a short to hold a value that
>> ranges from 2 to 5, and this also helped the goal of shrinking
>> dwarf2_per_cu_data.

Gary> There's other things in there if you were going that way, e.g.:

Gary>   unsigned char addr_size;
Gary>   unsigned char signed_addr_p;
Gary>   sect_offset abbrev_sect_off;

Gary>   /* Size of file offsets; either 4 or 8.  */
Gary>   unsigned int offset_size;

Gary>   /* Size of the length field; either 4 or 12.  */
Gary>   unsigned int initial_length_size;

Gary> I presume you'd need to lose 8 whole bytes to shrink the struct
Gary> further, so marking these as <8bit values probably wouldn't help
Gary> any, but maybe it's worth it?

I looked at this, and it doesn't remove enough space to help.
Might still be worth doing for clarity, I dunno.

Gary> How many CUs does something big
Gary> have?

gdb has 1197 as of today, but IMO it is moderately-sized at best.
I don't have anything really big that's convenient to check.

Tom

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Shrink size of dwarf2_per_cu_data
  2021-04-21 23:04   ` Tom Tromey
@ 2021-04-22 21:39     ` Christian Biesinger via Gdb-patches
  2021-04-23 12:55       ` Tom Tromey
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Biesinger via Gdb-patches @ 2021-04-22 21:39 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On Wed, Apr 21, 2021 at 6:04 PM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:
>
> Gary> Tom Tromey wrote:
> >> I noticed some holes in struct dwarf2_per_cu_data.  This patch
> >> rearranges the type slightly, and shrinks the size of some fields.
> >> This reduces it from 136 bytes to 112 bytes (on x86-64).
>
> Gary> Nice.  Though I'd assumed the compiler would do this.
>
> C and C++ compilers can't generally do this, though other languages like
> Rust and Ada do allow it.
>
> >> I also reduced the size of the DWARF "version" fields in a couple of
> >> spots.  It seemed needless to use a short to hold a value that
> >> ranges from 2 to 5, and this also helped the goal of shrinking
> >> dwarf2_per_cu_data.
>
> Gary> There's other things in there if you were going that way, e.g.:
>
> Gary>   unsigned char addr_size;
> Gary>   unsigned char signed_addr_p;
> Gary>   sect_offset abbrev_sect_off;
>
> Gary>   /* Size of file offsets; either 4 or 8.  */
> Gary>   unsigned int offset_size;
>
> Gary>   /* Size of the length field; either 4 or 12.  */
> Gary>   unsigned int initial_length_size;
>
> Gary> I presume you'd need to lose 8 whole bytes to shrink the struct
> Gary> further, so marking these as <8bit values probably wouldn't help
> Gary> any, but maybe it's worth it?
>
> I looked at this, and it doesn't remove enough space to help.
> Might still be worth doing for clarity, I dunno.
>
> Gary> How many CUs does something big
> Gary> have?
>
> gdb has 1197 as of today, but IMO it is moderately-sized at best.
> I don't have anything really big that's convenient to check.

I tried to approximate the number of CUs that Chrome has (Linux debug build):

$ objdump -g chrome | grep 'Compilation Unit' | wc -l
9233

Plus the parts that are in shared libraries:

$ find -name \*.so | xargs -n 1 objdump -g | grep 'Compilation Unit' | wc -l
57942

I'm not sure if all the .so files get loaded at the same time, so it
may be a bit fewer in practice. Of course this does not include any
CUs that may come from debug symbols for system libraries. Is there a
gdb command that could tell me the exact number?

Christian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Shrink size of dwarf2_per_cu_data
  2021-04-22 21:39     ` Christian Biesinger via Gdb-patches
@ 2021-04-23 12:55       ` Tom Tromey
  0 siblings, 0 replies; 5+ messages in thread
From: Tom Tromey @ 2021-04-23 12:55 UTC (permalink / raw)
  To: Christian Biesinger; +Cc: Tom Tromey, gdb-patches

Christian> I'm not sure if all the .so files get loaded at the same time, so it
Christian> may be a bit fewer in practice. Of course this does not include any
Christian> CUs that may come from debug symbols for system libraries. Is there a
Christian> gdb command that could tell me the exact number?

I don't think so.

(gdb) pipe maint print stat | grep "psym tables"

is pretty close, but claims to omit expanded CUs.
Also if you're using an index you need something more like

(gdb) pipe maint print stat | egrep "(psym tables|read CUs)"

... then add up both the read and unread CUs.

Tom

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-04-23 12:56 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 22:13 [PATCH] Shrink size of dwarf2_per_cu_data Tom Tromey
2020-11-04 10:59 ` Gary Benson via Gdb-patches
2021-04-21 23:04   ` Tom Tromey
2021-04-22 21:39     ` Christian Biesinger via Gdb-patches
2021-04-23 12:55       ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox