Re: Regression handling linkage vs source name

Mirror of the gdb mailing list
 help / color / mirror / Atom feed

From: Michael Eager <eager@eagercon.com>
To: Doug Evans <dje@google.com>
Cc: gdb@sourceware.org, Jan Kratochvil <jan.kratochvil@redhat.com>
Subject: Re: Regression handling linkage vs source name
Date: Wed, 22 Feb 2012 19:27:00 -0000	[thread overview]
Message-ID: <4F454188.7000904@eagercon.com> (raw)
In-Reply-To: <CADPb22QP16WgMffDAwdCNBjjZ7+OUmfd5M4qDZeRAaE=YZo6-Q@mail.gmail.com>

On 02/22/2012 08:40 AM, Doug Evans wrote:
> On Tue, Feb 21, 2012 at 4:41 PM, Michael Eager<eager@eagercon.com>  wrote:
>> There were changes to dwarf2_physname() around July 1, 2011, which
>> cause it to return a different value.  Arguably, the changes make
>> the function work better, returning the linkage name for a symbol where
>> it previously returned the source name.
>
> Tangential data point: Outside of dwarf2read.c, gdb generally uses
> "physname" to mean linkage name.

It doesn't appear to be the case for the value returned by
dwarf2_compute_name().  I haven't looked at other places.

> I'm not sure dwarf2_physname ever returned the linkage name (for c/c++).
> I know there's been some changes in how it computes the source name.

Before the patch, it didn't for C.  I didn't look at C++.

>> But since the source name
>> is overwritten by the linkage name in the symbol entry, gdb's
>> behavior is different, and, IMO, incorrect.
>
> Can you elaborate on what you mean by overwritten?

In new_symbol_full():
       /* Cache this symbol's name and the name's demangled form (if any).  */
       SYMBOL_SET_LANGUAGE (sym, cu->language);
       linkagename = dwarf2_physname (name, die, cu);
       SYMBOL_SET_NAMES (sym, linkagename, strlen (linkagename), 0, objfile);

Clearly the comment is incorrect or misleading:  only one name is stored in
the symbol.

Before the patch, dwarf2_physname() would call dwarf2_compute_name() which
would return 'name' unmodified.

After the patch, dwarf2_physname() returns the linkage name or a mangled
name.

(I initially misread the code to save the source name in the symbol,
then overwrite it with the linkage name.)

> [It's not what I see.]
>
>> Here is the test case:
>>
>> $ cat t.c
>> extern int foo(void) asm ("xfoo");
>>
>> int foo(void) {}
>>
>> int main (void)
>> {
>>   foo ();
>> }
>>
>> $ gdb-before a.out
>> ...
>> (gdb) b foo
>> Breakpoint 1 at ...
>>
>> $ gdb-after a.out
>> ...
>> (gdb) b foo
>> Make breakpoint pending on future shared library load? (y or [n])? n
>> (gdb) b xfoo
>> Breakpoint 1 at ...
>>
>>
>> The symbol "foo" is no longer present in gdb symbol table
>> so trying to use the name specified in the source fails.
>> Before the change, breakpoint, backtrace, and other commands
>> display the symbol name as in the source (and in the DWARF DIE).
>> After the change, the linkage name is displayed instead of the
>> source name.
>
> Even pre-dwarf2_physname gdb's have this problem.
> Maybe you tried a 7.x variant that I didn't (I think I tried 6.8, 7.0,
> 7.1, and 7.4 - at the time I didn't have easy access to 7.2,7.3 and
> didn't want to go to trouble of building them).

This works correctly in 7.2.  Maybe this was an "accident".  :-)

> The problem (or at least one of the problems), as I see it, is that
> the API for creating symbols is broken.

I agree.

> gdb does (or at least can) record both the linkage and source names
> (it does this for "minsyms", e.g. ELF symbols).  But the way to set a
> symbol's name is via symbol_set_names and it only takes a linkage name
> (though the dwarf2read.c further compounds the problem by passing it
> the source name - or more accurately the "search" name).
> symbol_set_names then tries to demangle the name and will record that
> name as well if the demangling succeeds.

Well, the minisym only stores a single name, not both linkage and source
names.  Despite what the comments preceding symbol_set_names() says,
there's only one name saved.

The code in symbol_set_names() seems to be doing way more than it needs
to.  If both the source and linkage names are available, then there
is no need to mangle, demangle, add a prefix or do anything but save
the name.

(symbol_set_demangled_name() does save a demangled name for C++.)

>> Before the change, dwarf2_physname() calls dwarf2_compute_name()
>> which returns the symbol name if the language is not C++, Java,
>> or Fortran, even if the DIE has a DW_AT_linkage_name attribute.
>> After the change, dwarf2_physname() returns the DW_AT_linkage_name.
>>
>> Since gdb doesn't keep both the source name and the linkage
>> name in the symbol table (a design flaw, IMO) what is the
>> right way to get the previous behavior, where gdb recognizes
>> the symbol names from the source file?
>
> We need to have dwarf2read.c store both names (linkage name and dwarf
> name). [More changes may be required beyond that, but I think that's a
> start.]  Your test program shows that we can't assume we can simply
> demangle the linkage name to get the source name.

Yes, this was my conclusion as well.

When debug info is not available, mangling a source name or demangling
a linkage name may be necessary.  I don't know that this should be done
when reading the DWARF info.  When the debug info contains both names,
or the linkage name can be inferred from the DIE, doing anything
other than saving their values should not be necessary.

Understanding what the symbol routines are trying to do with all of
the mangling and demangling is a challenge.   One thing which is not
clear is whether there is agreement whether the name stored for a symbol
is supposed to be the source name or the linkage name.  symbol_set_names()
and symbol_set_demangled_name() suggest that symtab.c thinks it is the
linkage name.  Commands which look up a symbol assume that it is the
source name.

Should there be two symbol entries, one for "foo" and one for "xfoo"?
A single entry which is searched for a match on either name or linkage
name?  Something else?

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

next prev parent reply	other threads:[~2012-02-22 19:27 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-22  0:41 Michael Eager
2012-02-22 16:40 ` Doug Evans
2012-02-22 19:27   ` Michael Eager [this message]
2012-02-23  3:08     ` Doug Evans

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F454188.7000904@eagercon.com \
    --to=eager@eagercon.com \
    --cc=dje@google.com \
    --cc=gdb@sourceware.org \
    --cc=jan.kratochvil@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox