Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: Simon Marchi <simark@simark.ca>, gdb-patches@sourceware.org
Cc: Tom Tromey <tom@tromey.com>
Subject: Re: [PATCH][gdb] Fix missing symtab includes
Date: Sun, 29 Mar 2020 17:56:34 +0200	[thread overview]
Message-ID: <e6bab374-b216-9ac5-6de5-159e12635b29@suse.de> (raw)
In-Reply-To: <caa7a104-350c-04c3-0c21-3322d4aa8161@simark.ca>

[-- Attachment #1: Type: text/plain, Size: 1944 bytes --]

On 29-03-2020 17:39, Simon Marchi wrote:
> On 2020-03-28 1:24 p.m., Tom de Vries wrote:
>>> I am not able to reproduce this.  In both cases, I don't get the `includes`.
>>>
>>> What transformation is dwz expected to do on the binary?  Here, it looks like
>>> it just compressed the debug info a little bit, changing the addresses, but the
>>> general structure was untouched.
>>>
>>> I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).
>>
>> Hi Simon,
>>
>> thanks for trying this out.
>>
>> I've attached the original a.out here (
>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c3 ) and the
>> dwz-ed a.out here (
>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c4  ).
>>
>> I'm hoping you might be able to reproduce using the latter file (and
>> FWIW, I'm using the same dwz version).
>>
>> I think the reason for the difference in what we are seeing is due to me
>> using openSUSE, which has debug info on various linked in objects like
>> glibc's init.c and elf-init.c. Looking at the readelf -wi output of the
>> dwz-ed executable, dwz exploits commonality between those objects and
>> hello.c, so it's not surprising dwz does not create partial units for
>> platforms that do not have debug info for those objects.
>>
>> Anyway, I'll need to construct a better test-case that reproduces the
>> problem on other platforms.
>>
>> Thanks,
>> - Tom
>>
> 
> Thanks, your explanations make more sense with the the DWARF in those files,
> I can reproduce the problem now.
> 
> I have a bit of trouble understanding the current code.  In particular, I am
> trying to put in words what's the difference between the `read_symtab` and
> `expand_symtab` methods of `partial_symtab`, but I can't quite do it.  I think
> this would help to determine what's the right solution.

Hi,

I gave that a try, but got stalled working on the test-case.

Anyway here's the current state.

Thanks,
- Tom


[-- Attachment #2: 0001-gdb-Fix-missing-symtab-includes.patch --]
[-- Type: text/x-patch, Size: 6844 bytes --]

[gdb] Fix missing symtab includes

Consider hello.h:
...
 inline static const char*
 foo (void)
 {
   return "foo";
 }
...
and hello.c:
...
 #include <stdio.h>
 #include "hello.h"

 int
 main (void)
 {
   printf ("hello: %s\n", foo ());
   return 0;
 }
...
compiled with -g, and dwz-compressed:
...
$ gcc -g hello.c
$ dwz a.out
...

When breaking on foo and printing the symbol tables, we have two includes for
the hello.c compunit_symtab, representing two imported partial units:
...
$ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x38fa890)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x39af9f0)
    user ((struct compunit_symtab *) (null))
    ( includes
      ((struct compunit_symtab *) 0x39afb10)
      ((struct compunit_symtab *) 0x39b00c0)
    )
        { symtab hello.c ((struct symtab *) 0x38fa940)
          fullname (null)
          linetable ((struct linetable *) 0x39afa80)
        }
...

But if we instead break on the same location using a linespec, the includes
are gone:
...
$ gdb -iex "set language c" -batch a.out -ex "b hello.h:4" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x2283500)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x23086c0)
    user ((struct compunit_symtab *) (null))
        { symtab hello.c ((struct symtab *) 0x22835b0)
          fullname (null)
          linetable ((struct linetable *) 0x2308750)
        }
...

The includes are calculated by process_cu_includes in gdb/dwarf2/read.c.

In the case of "b foo":
- the hello.c partial symtab is found to contain foo
- the hello.c partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_psymtab::read_symtab
- dwarf2_psymtab::read_symtab calls dwarf2_psymtab::expand_psymtab
- dwarf2_psymtab::read_symtab calls process_cu_includes, and we have the
  includes

In the case of "b hello.h:4":
- the hello.h partial symtab is found to represent hello.h
- the hello.h partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_include_psymtab::read_symtab
- dwarf2_include_psymtab::read_symtab calls
  dwarf2_include_psymtab::expand_psymtab
- dwarf2_include_psymtab::expand_psymtab calls
  partial_symtab::read_dependencies
- partial_symtab::read_dependencies calls dwarf2_psymtab::expand_psymtab
  for partial symtab hello.c
- the hello.c partial symtab is expanded using dwarf2_psymtab::expand_psymtab
- process_cu_includes is never called

Fix this by making sure in dwarf2_include_psymtab::read_symtab that
read_symtab is called for the hello.c partial symtab.

Tested on x86_64-linux, with native, and target board cc-with-dwz and
cc-with-dwz-m.

gdb/ChangeLog:

2020-03-27  Tom de Vries  <tdevries@suse.de>

	PR symtab/25718
	* psympriv.h (struct partial_symtab::read_symtab)
	(struct partial_symtab::expand_psymtab)
	(struct partial_symtab::read_dependencies): Update comments.
	* dwarf2/read.c (dwarf2_include_psymtab::read_symtab): Directly call
	read_symtab for includer.
	(dwarf2_include_psymtab::expand_psymtab): Assert false.

---
 gdb/dwarf2/read.c | 28 +++++++++++++++++++++-------
 gdb/psympriv.h    | 22 ++++++++++++++++------
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 8c5046ef41..9d366449f7 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5830,20 +5830,34 @@ struct dwarf2_include_psymtab : public partial_symtab
   }
 
   void read_symtab (struct objfile *objfile) override
-  {
-    expand_psymtab (objfile);
-  }
-
-  void expand_psymtab (struct objfile *objfile) override
   {
     if (m_readin)
       return;
+
     /* It's an include file, no symbols to read for it.
-       Everything is in the parent symtab.  */
-    read_dependencies (objfile);
+       Everything is in the includer symtab.  */
+
+    /* The expansion of a dwarf2_include_psymtab is just a trigger for
+       expansion of the includer psymtab.  We use the dependencies[0] field to
+       model the includer.  But if we go the regular route of calling
+       expand_psymtab here, and having expand_psymtab call read_dependencies to
+       expand the includer, we'll only use expand_psymtab on the includer
+       (making it a non-toplevel psymtab), while if we expand the includer via
+       another path, we'll use read_symtab (making it a toplevel psymtab).
+       So, don't pretend a dwarf2_include_psymtab is an actual toplevel
+       psymtab, and trigger read_symtab on the includer here directly.  */
+    if (!dependencies[0]->readin_p ())
+      dependencies[0]->read_symtab (objfile);
     m_readin = true;
   }
 
+  void expand_psymtab (struct objfile *objfile) override
+  {
+    /* This is not called by read_symtab, and should not be called by any
+       read_dependencies.  */
+    gdb_assert (false);
+  }
+
   bool readin_p () const override
   {
     return m_readin;
diff --git a/gdb/psympriv.h b/gdb/psympriv.h
index 0effedc4ec..4317392149 100644
--- a/gdb/psympriv.h
+++ b/gdb/psympriv.h
@@ -124,16 +124,26 @@ struct partial_symtab
   {
   }
 
-  /* Read the full symbol table corresponding to this partial symbol
-     table.  */
+  /* Psymtab expansion is done in two steps:
+     - a call to read_symtab
+     - while that call is in progress, calls to expand_psymtab can be made,
+       both for this psymtab, and its dependencies.
+     This makes a distinction between a toplevel psymtab (for which both
+     read_symtab and expand_psymtab will be called) and a non-toplevel
+     psymtab (for which only expand_psymtab will be called). The
+     distinction can be used f.i. to do things before and after all
+     dependencies of a top-level psymtab have been expanded.
+
+     Read the full symbol table corresponding to this partial symbol
+     table.  Typically calls expand_psymtab.  */
   virtual void read_symtab (struct objfile *) = 0;
 
-  /* Psymtab expansion is done in two steps.  The first step is a call
-     to read_symtab; but while that is in progress, calls to
-     expand_psymtab can be made.  */
+  /* Expand the full symbol table for this partial symbol table.  Typically
+     calls read_dependencies.  */
   virtual void expand_psymtab (struct objfile *) = 0;
 
-  /* Ensure that all the dependencies are read in.  */
+  /* Ensure that all the dependencies are read in.  Typically calls
+     expand_psymtab for each dependency.  */
   void read_dependencies (struct objfile *);
 
   /* Return true if the symtab corresponding to this psymtab has been

  reply	other threads:[~2020-03-29 15:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-27 20:49 Tom de Vries
2020-03-28 16:32 ` Simon Marchi
2020-03-28 17:24   ` Tom de Vries
2020-03-29 15:39     ` Simon Marchi
2020-03-29 15:56       ` Tom de Vries [this message]
2020-03-29 21:44         ` Simon Marchi
2020-03-30 17:42           ` Tom de Vries
2020-04-14 13:31             ` Tom de Vries
2020-03-28 19:08 ` Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e6bab374-b216-9ac5-6de5-159e12635b29@suse.de \
    --to=tdevries@suse.de \
    --cc=gdb-patches@sourceware.org \
    --cc=simark@simark.ca \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox