* Re: RFA: Search for symbol names the same way they're hashed.
From: Jim Ingham @ 2002-10-02 17:49 UTC (permalink / raw)
To: gdb-patches
On Wednesday, October 2, 2002, at 02:50 PM,
gdb-patches-digest-help@sources.redhat.com wrote:
>> We need to make demangling-style only affect *printout* and *user
>> entered strings*, and during symbol reading, force it to auto, so it
>> always gets the right names in the symbol table in the first place.
>
> Doesn't that sort of defeat the point of letting the user set
> demangling style? It's in case something goes wrong with
> autodetection....
>
>>> The source code name of a symbol does not depend on the
>>> current
>>> demangling setting;
>>
>> And to enforce this, you have to make the readers *not* honor the
>> demangling style. If you just fix SYMBOL_SOURCE_NAME,
>> SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and it'll
>> have the wrong demangling style when it calls cplus_demangle,
>> resulting
>> in the symbol having the wrong demangled name forevermore.
>
> Perhaps we need to decide what the point of letting users force the
> demangle style is, first.
The case where we have had to use this arose because we had private C++
APIs in some of the Mac OS X frameworks for 10.2 (which was compiled
with gcc 3.1), but also users who didn't want to move their C++ code to
3.1 yet. When you hit the frameworks, gdb would see _Z, and assume the
mangling style was the 3.1 style. Of course, all their code was 2.95,
and they didn't in general care about the C++ stuff in frameworks
(Apple tries not to export C++ APIs if it can help it). So forcing
demangling to 2.95 was useful in this case.
This should, hopefully, be just a short-term problem. Very few of our
customers that we know about are still using 2.95. But it is still
worth keeping in mind for the next year or so...
Jim
--
Jim Ingham jingham@apple.com
Developer Tools
Apple Computer
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Jacobowitz @ 2002-10-02 18:16 UTC (permalink / raw)
To: Jim Ingham; +Cc: gdb-patches
On Wed, Oct 02, 2002 at 05:48:01PM -0700, Jim Ingham wrote:
>
> On Wednesday, October 2, 2002, at 02:50 PM,
> gdb-patches-digest-help@sources.redhat.com wrote:
>
> >>We need to make demangling-style only affect *printout* and *user
> >>entered strings*, and during symbol reading, force it to auto, so it
> >>always gets the right names in the symbol table in the first place.
> >
> >Doesn't that sort of defeat the point of letting the user set
> >demangling style? It's in case something goes wrong with
> >autodetection....
> >
> >>>The source code name of a symbol does not depend on the
> >>>current
> >>>demangling setting;
> >>
> >>And to enforce this, you have to make the readers *not* honor the
> >>demangling style. If you just fix SYMBOL_SOURCE_NAME,
> >>SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and it'll
> >>have the wrong demangling style when it calls cplus_demangle,
> >>resulting
> >>in the symbol having the wrong demangled name forevermore.
> >
> >Perhaps we need to decide what the point of letting users force the
> >demangle style is, first.
>
> The case where we have had to use this was because we had private C++
> API's in some of the Mac OS X frameworks for 10.2 (which was compiled
> with gcc 3.1) but users who didn't want to move their C++ code to 3.1
> yet. When you hit the frameworks, gdb would see _Z, and assume the
> mangling style was the 3.1 style. Of course, all their code was 2.95,
> and they didn't in general care about the C++ stuff in frameworks
> (Apple tries not to export C++ API's if it can help it). So forcing
> demangling to 2.95 was useful in this case.
>
> This should, hopefully, be just a short term problem. Very few of our
> customers are still using 2.95 that we know about. But it is still
> worth keeping in mind for the next year or so...
Time to unconfuse one issue:
Jim, are you sure that you are talking about _demangling_ style? It is
orthogonal to 'set cp-abi'. I'm talking about 'set demangle-style'
here. I assume in a mixed v2/v3 environment you'd want 'auto' anyway.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
From: Jim Ingham @ 2002-10-02 18:25 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: gdb-patches
Daniel,
On Wednesday, October 2, 2002, at 06:16 PM, Daniel Jacobowitz wrote:
> On Wed, Oct 02, 2002 at 05:48:01PM -0700, Jim Ingham wrote:
>>
>> On Wednesday, October 2, 2002, at 02:50 PM,
>> gdb-patches-digest-help@sources.redhat.com wrote:
>>
>>>> We need to make demangling-style only affect *printout* and *user
>>>> entered strings*, and during symbol reading, force it to auto, so it
>>>> always gets the right names in the symbol table in the first place.
>>>
>>> Doesn't that sort of defeat the point of letting the user set
>>> demangling style? It's in case something goes wrong with
>>> autodetection....
>>>
>>>>> The source code name of a symbol does not depend on the
>>>>> current
>>>>> demangling setting;
>>>>
>>>> And to enforce this, you have to make the readers *not* honor the
>>>> demangling style. If you just fix SYMBOL_SOURCE_NAME,
>>>> SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and it'll
>>>> have the wrong demangling style when it calls cplus_demangle,
>>>> resulting
>>>> in the symbol having the wrong demangled name forevermore.
>>>
>>> Perhaps we need to decide what the point of letting users force the
>>> demangle style is, first.
>>
>> The case where we have had to use this was because we had private C++
>> API's in some of the Mac OS X frameworks for 10.2 (which was compiled
>> with gcc 3.1) but users who didn't want to move their C++ code to 3.1
>> yet. When you hit the frameworks, gdb would see _Z, and assume the
>> mangling style was the 3.1 style. Of course, all their code was 2.95,
>> and they didn't in general care about the C++ stuff in frameworks
>> (Apple tries not to export C++ API's if it can help it). So forcing
>> demangling to 2.95 was useful in this case.
>>
>> This should, hopefully, be just a short term problem. Very few of our
>> customers are still using 2.95 that we know about. But it is still
>> worth keeping in mind for the next year or so...
>
> Time to unconfuse one issue:
>
> Jim, are you sure that you are talking about _demangling_ style? It is
> orthogonal to 'set cp-abi'. I'm talking about 'set demangle-style'
> here. I assume in a mixed v2/v3 environment you'd want 'auto' anyway.
>
Oh, sorry, my mistake... But you don't want "auto" in a mixed
environment, because auto chooses v3 when it sees it. For the folks
using v2, the v3 bits are all stuff they don't care at all about. So
you have to force it to v2. We did all this already, however, IIRC...
Jim
--
Jim Ingham jingham@apple.com
Developer Tools
Apple Computer
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Berlin @ 2002-10-03 8:15 UTC (permalink / raw)
To: Jim Ingham; +Cc: Daniel Jacobowitz, gdb-patches
On Wednesday, October 2, 2002, at 09:24 PM, Jim Ingham wrote:
> Daniel,
>
> On Wednesday, October 2, 2002, at 06:16 PM, Daniel Jacobowitz wrote:
>
>> On Wed, Oct 02, 2002 at 05:48:01PM -0700, Jim Ingham wrote:
>>>
>>> On Wednesday, October 2, 2002, at 02:50 PM,
>>> gdb-patches-digest-help@sources.redhat.com wrote:
>>>
>>>>> We need to make demangling-style only affect *printout* and *user
>>>>> entered strings*, and during symbol reading, force it to auto, so
>>>>> it
>>>>> always gets the right names in the symbol table in the first place.
>>>>
>>>> Doesn't that sort of defeat the point of letting the user set
>>>> demangling style? It's in case something goes wrong with
>>>> autodetection....
>>>>
>>>>>> The source code name of a symbol does not depend on the
>>>>>> current
>>>>>> demangling setting;
>>>>>
>>>>> And to enforce this, you have to make the readers *not* honor the
>>>>> demangling style. If you just fix SYMBOL_SOURCE_NAME,
>>>>> SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and
>>>>> it'll
>>>>> have the wrong demangling style when it calls cplus_demangle,
>>>>> resulting
>>>>> in the symbol having the wrong demangled name forevermore.
>>>>
>>>> Perhaps we need to decide what the point of letting users force the
>>>> demangle style is, first.
>>>
>>> The case where we have had to use this was because we had private C++
>>> API's in some of the Mac OS X frameworks for 10.2 (which was compiled
>>> with gcc 3.1) but users who didn't want to move their C++ code to 3.1
>>> yet. When you hit the frameworks, gdb would see _Z, and assume the
>>> mangling style was the 3.1 style. Of course, all their code was
>>> 2.95,
>>> and they didn't in general care about the C++ stuff in frameworks
>>> (Apple tries not to export C++ API's if it can help it). So forcing
>>> demangling to 2.95 was useful in this case.
>>>
>>> This should, hopefully, be just a short term problem. Very few of
>>> our
>>> customers are still using 2.95 that we know about. But it is still
>>> worth keeping in mind for the next year or so...
>>
>> Time to unconfuse one issue:
>>
>> Jim, are you sure that you are talking about _demangling_ style? It
>> is
>> orthogonal to 'set cp-abi'. I'm talking about 'set demangle-style'
>> here. I assume in a mixed v2/v3 environment you'd want 'auto' anyway.
>>
>
> Oh, sorry, my mistake... But you don't want "auto" in a mixed
> environment, because auto chooses v3 when it sees it. For the folks
> using v2, the v3 bits are all stuff they don't care at all about. So
> you have to force it to v2. We did all this already, however, IIRC...
>
No, auto doesn't *switch* to v3 mode when it sees a v3 name; it just
uses the v3 demangler for that string.
If you next hand it a v2 string, it should demangle it properly as well.
Otherwise, something is broken.
I just tried it with c++filt, to make sure, and it does work:
bash-2.05a$ c++filt
_a__3bob [This is a v2 mangled name]
bob::a(void)
__ZN3bob1aEv [This is a v3 mangled name]
bob::a()
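The per-string behaviour described above can be modelled roughly as follows. This is only a sketch in Python, not the libiberty code: guess_style is a made-up helper, and its two checks are a deliberately simplified heuristic (the real demangler does far more than look at a prefix).

```python
import re


def guess_style(mangled):
    """Guess a mangling style from one name alone (simplified heuristic)."""
    # Darwin prepends an extra underscore to every symbol, so accept both forms.
    if mangled.startswith(("_Z", "__Z")):
        return "gnu-v3"
    # g++ 2.x names look like method__<len>Class or func__F<args>.
    if re.search(r"[A-Za-z0-9]__[0-9FQ]", mangled):
        return "gnu"
    # Not recognisably mangled (for example a plain C function).
    return None


# Each name is classified independently; seeing a v3 name first does not
# change how the next name is treated.
for name in ("__ZN3bob1aEv", "_a__3bob", "main"):
    print(name, "->", guess_style(name))
```

The point is only that the decision carries no state from one string to the next, which is why a v2 name handed over after a v3 name should still demangle.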
--Dan
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Jacobowitz @ 2002-10-07 16:38 UTC (permalink / raw)
To: David Carlton; +Cc: Jim Blandy, gdb-patches
On Mon, Oct 07, 2002 at 04:19:10PM -0700, David Carlton wrote:
> On Wed, 2 Oct 2002 14:05:15 -0400, Daniel Jacobowitz <drow@mvista.com> said:
>
> > Right now, there are multiple functions with the same symbol
> > demangled name, but different mangled names.
>
> Just out of curiosity, what are the current situations that you know
> of where that can happen? I just noticed today that unnamed
Well, that comment was about constructors:
_ZN1AC1Eb
A::A[in-charge](bool)
_ZN1AC2Eb
A::A[not-in-charge](bool)
We strip off the [in-charge] bit because we don't have any use for it.
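To make the collision concrete, here is a small Python sketch using the two names from the example above. In the v3 ABI, C1 is the complete-object constructor and C2 the base-object constructor. strip_charge and the by_name table are hypothetical illustrations, not GDB code.

```python
import re
from collections import defaultdict

# Pairs copied from the example above: mangled name, full demangled form.
CTORS = [
    ("_ZN1AC1Eb", "A::A[in-charge](bool)"),
    ("_ZN1AC2Eb", "A::A[not-in-charge](bool)"),
]


def strip_charge(demangled):
    """Drop the [in-charge] / [not-in-charge] annotation."""
    return re.sub(r"\[(?:not-)?in-charge\]", "", demangled)


# Index the mangled names by the stripped demangled name.
by_name = defaultdict(list)
for mangled, demangled in CTORS:
    by_name[strip_charge(demangled)].append(mangled)

for name, mangled_names in by_name.items():
    print(name, "<-", mangled_names)
```

Both constructors end up under the single key A::A(bool), so a table keyed on the demangled name has to allow more than one symbol per name.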
> namespaces in different files can demangle to the same name, which
> sure doesn't thrill me; I'm curious about where else demangling is
> losing important info.
We can't even use the demangled names of such things, anyway. We'll
need to figure out what to do with them eventually...
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
From: David Carlton @ 2002-10-07 16:19 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Jim Blandy, gdb-patches
On Wed, 2 Oct 2002 14:05:15 -0400, Daniel Jacobowitz <drow@mvista.com> said:
> Right now, there are multiple functions with the same symbol
> demangled name, but different mangled names.
Just out of curiosity, what are the current situations that you know
of where that can happen? I just noticed today that unnamed
namespaces in different files can demangle to the same name, which
sure doesn't thrill me; I'm curious about where else demangling is
losing important info.
Though I don't _really_ mind it all that much in the namespace
situation: as far as I'm concerned, it's just further evidence that
namespaces should be treated as first-class objects instead of being
hacked at via names of symbols. Still, I wouldn't mind if the
approach via names of symbols worked a _bit_ better than it actually
will...
David Carlton
carlton@math.stanford.edu
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Berlin @ 2002-10-02 16:10 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Jim Blandy, David Carlton, gdb-patches
On Wed, 2 Oct 2002, Daniel Jacobowitz wrote:
> On Wed, Oct 02, 2002 at 04:43:34PM -0400, Daniel Berlin wrote:
> >
> > On Wednesday, October 2, 2002, at 03:18 PM, Jim Blandy wrote:
> >
> > >
> > >I just did a quick survey of the uses of SYMBOL_SOURCE_NAME. They all
> > >fall into two categories:
> > >- printing symbol names, and
> > >- sort comparison functions.
> > >
> > >The first usage is exactly correct: the way a symbol prints should
> > >respect the current demangling setting.
> > >
> > >The second usage seems wrong to me: if you sort under one demangling
> > >setting, but then search under a different one, well, ... duh.
> >
> > SYMBOL_SOURCE_NAME changes won't help here, it'll only fix half the
> > problem.
> >
> > Try setting a demangling style that is wrong (say EDG), then loading a
> > C++ file (i.e. reading its symbols), then setting the right demangling
> > style.
> >
> > Observe:
> > [dberlin@dberlin gdb]$ ./gdb
> > GNU gdb 2002-08-15-cvs
> > Copyright 2002 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and
> > you are
> > welcome to change it and/or distribute copies of it under certain
> > conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for
> > details.
> > This GDB was configured as "i686-pc-linux-gnu".
> > Setting up the environment for debugging gdb.
> > .gdbinit:5: Error in sourced command file:
> > No symbol table is loaded. Use the "file" command.
> > (gdb) set demangle-style edg
> > (gdb) file a.out
> > Reading symbols from a.out...done.
> > (gdb) info func bob
> > All functions matching regular expression "bob":
> >
> > File testcpp.cpp:
> > int _Z3bobv(void);
> > (gdb) set demangle-style gnu
> > gnu gnu-v3
> > (gdb) set demangle-style gnu-v3
> > (gdb) info func bob
> > All functions matching regular expression "bob":
> >
> > File testcpp.cpp:
> > int _Z3bobv(void);
> > (gdb) b bob
> > Function "bob" not defined.
> > (gdb)
> >
> > Even after you set the right demangling style, it's too late.
> > The symbols in the symbol table will have the wrong demangled names, so
> > even if you made SYMBOL_DEMANGLEST or whatever, it will still be wrong.
> >
> > We need to make demangling-style only affect *printout* and *user
> > entered strings*, and during symbol reading, force it to auto, so it
> > always gets the right names in the symbol table in the first place.
>
> Doesn't that sort of defeat the point of letting the user set
> demangling style?
What is your definition of what the demangling style should apply to?
> It's in case something goes wrong with
> autodetection....
Then fix the auto-detection.
Is there an ambiguous case you know of (or anyone, for that matter), or
is this theoretical?
The point of set demangle-style, AFAIK, is for assembler printouts.
*Not* for normal source code.
A user who sets the demangling style would probably not expect it to
apply to *all symbol files* loaded after they set it, but not to those
loaded before.
You get different behavior if you do "gdb ./testobj" and then set the
demangling style than if you do "gdb", set the demangling style, and then
load the file.
Watch:
[dberlin@dberlin gdb]$ gdb ./a.out
GNU gdb 5.2.1-2mdk (Mandrake Linux)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "i586-mandrake-linux-gnu"...
Setting up the environment for debugging gdb.
.gdbinit:5: Error in sourced command file:
Function "internal_error" not defined.
(gdb) set demangle-style edg
(gdb) b bob
Breakpoint 1 at 0x804835c: file testcpp.cpp, line 2.
(gdb) info func bob
All functions matching regular expression "bob":
File testcpp.cpp:
int _Z3bobv(void);
(gdb) q
[dberlin@dberlin gdb]$ gdb
GNU gdb 5.2.1-2mdk (Mandrake Linux)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "i586-mandrake-linux-gnu".
Setting up the environment for debugging gdb.
.gdbinit:5: Error in sourced command file:
No symbol table is loaded. Use the "file" command.
(gdb) set de
debug debugvarobj demangle-style
(gdb) set demangle-style edg
(gdb) file a.out
Reading symbols from a.out...done.
(gdb) b bob
Function "bob" not defined.
(gdb) info func bob
All functions matching regular expression "bob":
File testcpp.cpp:
int _Z3bobv(void);
(gdb)
This is minimal symbol fun in exactly the way I described.
In the first case, it has already loaded the msyms in "auto" mode, even
though the psyms and normal syms have the *wrong* name.
As a user, the fact that "b bob" works in the first case, but not in the
second, confuses the hell out of me.
I would venture nobody noticed because nobody changed the demangling
style, not because it was coded this way.
>
> Perhaps we need to decide what the point of letting users force the
> demangle style is, first.
Assembler printouts.
>
>
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Jacobowitz @ 2002-10-02 14:03 UTC (permalink / raw)
To: Daniel Berlin; +Cc: Jim Blandy, David Carlton, gdb-patches
On Wed, Oct 02, 2002 at 04:43:34PM -0400, Daniel Berlin wrote:
>
> On Wednesday, October 2, 2002, at 03:18 PM, Jim Blandy wrote:
>
> >
> >I just did a quick survey of the uses of SYMBOL_SOURCE_NAME. They all
> >fall into two categories:
> >- printing symbol names, and
> >- sort comparison functions.
> >
> >The first usage is exactly correct: the way a symbol prints should
> >respect the current demangling setting.
> >
> >The second usage seems wrong to me: if you sort under one demangling
> >setting, but then search under a different one, well, ... duh.
>
> SYMBOL_SOURCE_NAME changes won't help here, it'll only fix half the
> problem.
>
> Try setting a demangling style that is wrong (say EDG), then loading a
> C++ file (i.e. reading its symbols), then setting the right demangling
> style.
>
> Observe:
> [dberlin@dberlin gdb]$ ./gdb
> GNU gdb 2002-08-15-cvs
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "i686-pc-linux-gnu".
> Setting up the environment for debugging gdb.
> .gdbinit:5: Error in sourced command file:
> No symbol table is loaded. Use the "file" command.
> (gdb) set demangle-style edg
> (gdb) file a.out
> Reading symbols from a.out...done.
> (gdb) info func bob
> All functions matching regular expression "bob":
>
> File testcpp.cpp:
> int _Z3bobv(void);
> (gdb) set demangle-style gnu
> gnu gnu-v3
> (gdb) set demangle-style gnu-v3
> (gdb) info func bob
> All functions matching regular expression "bob":
>
> File testcpp.cpp:
> int _Z3bobv(void);
> (gdb) b bob
> Function "bob" not defined.
> (gdb)
>
> Even after you set the right demangling style, it's too late.
> The symbols in the symbol table will have the wrong demangled names, so
> even if you made SYMBOL_DEMANGLEST or whatever, it will still be wrong.
>
> We need to make demangling-style only affect *printout* and *user
> entered strings*, and during symbol reading, force it to auto, so it
> always gets the right names in the symbol table in the first place.
Doesn't that sort of defeat the point of letting the user set
demangling style? It's in case something goes wrong with
autodetection....
> >The source code name of a symbol does not depend on the current
> >demangling setting;
>
> And to enforce this, you have to make the readers *not* honor the
> demangling style. If you just fix SYMBOL_SOURCE_NAME,
> SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and it'll
> have the wrong demangling style when it calls cplus_demangle, resulting
> in the symbol having the wrong demangled name forevermore.
Perhaps we need to decide what the point of letting users force the
demangle style is, first.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Berlin @ 2002-10-02 13:43 UTC (permalink / raw)
To: Jim Blandy; +Cc: David Carlton, gdb-patches
On Wednesday, October 2, 2002, at 03:18 PM, Jim Blandy wrote:
>
> I just did a quick survey of the uses of SYMBOL_SOURCE_NAME. They all
> fall into two categories:
> - printing symbol names, and
> - sort comparison functions.
>
> The first usage is exactly correct: the way a symbol prints should
> respect the current demangling setting.
>
> The second usage seems wrong to me: if you sort under one demangling
> setting, but then search under a different one, well, ... duh.
SYMBOL_SOURCE_NAME changes won't help here, it'll only fix half the
problem.
Try setting a demangling style that is wrong (say EDG), then loading a
C++ file (i.e. reading its symbols), then setting the right demangling
style.
Observe:
[dberlin@dberlin gdb]$ ./gdb
GNU gdb 2002-08-15-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "i686-pc-linux-gnu".
Setting up the environment for debugging gdb.
.gdbinit:5: Error in sourced command file:
No symbol table is loaded. Use the "file" command.
(gdb) set demangle-style edg
(gdb) file a.out
Reading symbols from a.out...done.
(gdb) info func bob
All functions matching regular expression "bob":
File testcpp.cpp:
int _Z3bobv(void);
(gdb) set demangle-style gnu
gnu gnu-v3
(gdb) set demangle-style gnu-v3
(gdb) info func bob
All functions matching regular expression "bob":
File testcpp.cpp:
int _Z3bobv(void);
(gdb) b bob
Function "bob" not defined.
(gdb)
Even after you set the right demangling style, it's too late.
The symbols in the symbol table will have the wrong demangled names, so
even if you made SYMBOL_DEMANGLEST or whatever, it will still be wrong.
We need to make demangling-style only affect *printout* and *user
entered strings*, and during symbol reading, force it to auto, so it
always gets the right names in the symbol table in the first place.
> The source code name of a symbol does not depend on the current
> demangling setting;
And to enforce this, you have to make the readers *not* honor the
demangling style. If you just fix SYMBOL_SOURCE_NAME,
SYMBOL_INIT_DEMANGLED_NAME will still be only called once, and it'll
have the wrong demangling style when it calls cplus_demangle, resulting
in the symbol having the wrong demangled name forevermore.
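A toy model of the "called once" problem, in Python rather than GDB's C. Everything here is invented for illustration: toy_demangle only understands names of the form _Z<len><name>v, and Symbol, read_symbols and lookup are stand-ins, not the real symbol reader or SYMBOL_INIT_DEMANGLED_NAME.

```python
def toy_demangle(mangled, style):
    """Demangle only names of the form _Z<len><name>v; return None otherwise."""
    if style not in ("auto", "gnu-v3") or not mangled.startswith("_Z"):
        return None
    digits, rest = "", mangled[2:]
    while rest and rest[0].isdigit():
        digits, rest = digits + rest[0], rest[1:]
    if not digits:
        return None
    length = int(digits)
    name, tail = rest[:length], rest[length:]
    return name + "()" if len(name) == length and tail == "v" else None


class Symbol:
    def __init__(self, mangled, style):
        self.mangled = mangled
        # Computed once at read time and never refreshed afterwards.
        self.demangled = toy_demangle(mangled, style)

    def search_name(self):
        return self.demangled or self.mangled


def read_symbols(mangled_names, style):
    return [Symbol(m, style) for m in mangled_names]


def lookup(symbols, function):
    return [s for s in symbols if s.search_name() == function + "()"]


# Reader honours a wrong user setting: the cached name stays mangled for good.
stale = read_symbols(["_Z3bobv"], style="edg")
# Reader forced to "auto": the cached name is right whatever the user set.
forced = read_symbols(["_Z3bobv"], style="auto")

print("stale table finds bob:", bool(lookup(stale, "bob")))
print("forced-auto table finds bob:", bool(lookup(forced, "bob")))
```

Changing the style after read_symbols has run cannot repair the stale table, because nothing ever recomputes the cached demangled field. That is the half of the problem that a SYMBOL_SOURCE_NAME change alone leaves untouched.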
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Jacobowitz @ 2002-10-02 13:25 UTC (permalink / raw)
To: gdb-patches
On Wed, Oct 02, 2002 at 01:20:30PM -0700, David Carlton wrote:
> On Wed, 2 Oct 2002 16:02:03 -0400, Daniel Jacobowitz <drow@mvista.com> said:
>
> > Dare I suggest SYMBOL_SEARCH_NAME? That is, the only name we should
> > search for when we're looking up this function in symbol tables.
> > Since that's the other use.
>
> That makes sense; the main thing that worries me is that I'm not sure
> that we might not sometimes search by other names (at least in the
> minimal symbol table?). I think that, if SYMBOL_SOURCE_NAME didn't
> already exist, then I'd prefer SYMBOL_SOURCE_NAME to
> SYMBOL_SEARCH_NAME; but there's probably some advantage to not having
> anything called SYMBOL_SOURCE_NAME. Hmm; I could go either way on
> this one.
I'd rather not _change_ the meaning of SYMBOL_SOURCE_NAME; either leave
it or eliminate it. I don't mind either of those.
I don't see a real benefit to renaming it...
>
> > I agree about SYMBOL_PRINT_NAME... or, how about
> > SYMBOL_PRINTABLE_NAME?
>
> SYMBOL_PRINT_NAME is better, I think: lots of names are printable, but
> we're only going to print one of them. (Though I like the idea of a
> function SYMBOL_PRINTABLE_NAME that takes an unprintable symbol name
> and bowdlerizes it by putting asterisks in place of swear words...)
>
> David Carlton
> carlton@math.stanford.edu
>
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
From: David Carlton @ 2002-10-02 13:20 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Jim Blandy, gdb-patches
On Wed, 2 Oct 2002 16:02:03 -0400, Daniel Jacobowitz <drow@mvista.com> said:
> Dare I suggest SYMBOL_SEARCH_NAME? That is, the only name we should
> search for when we're looking up this function in symbol tables.
> Since that's the other use.
That makes sense; the main thing that worries me is that I'm not sure
that we don't sometimes search by other names (at least in the
minimal symbol table?). I think that, if SYMBOL_SOURCE_NAME didn't
already exist, then I'd prefer SYMBOL_SOURCE_NAME to
SYMBOL_SEARCH_NAME; but there's probably some advantage to not having
anything called SYMBOL_SOURCE_NAME. Hmm; I could go either way on
this one.
> I agree about SYMBOL_PRINT_NAME... or, how about
> SYMBOL_PRINTABLE_NAME?
SYMBOL_PRINT_NAME is better, I think: lots of names are printable, but
we're only going to print one of them. (Though I like the idea of a
function SYMBOL_PRINTABLE_NAME that takes an unprintable symbol name
and bowdlerizes it by putting asterisks in place of swear words...)
David Carlton
carlton@math.stanford.edu
* Re: RFA: Search for symbol names the same way they're hashed.
From: Daniel Jacobowitz @ 2002-10-02 13:02 UTC (permalink / raw)
To: David Carlton; +Cc: Jim Blandy, gdb-patches
On Wed, Oct 02, 2002 at 12:41:46PM -0700, David Carlton wrote:
> On 02 Oct 2002 14:18:51 -0500, Jim Blandy <jimb@redhat.com> said:
> > David Carlton <carlton@math.stanford.edu> writes:
>
> >> 1) This concept of 'a name that is as demangled as possible' is a
> >> pretty important one and occurs in multiple places in GDB's sources,
>
> > Yeah, I think we need something like that, too.
>
> > How about SYMBOL_DEMANGLEDEST_NAME? Um.
>
> Um indeed.
Dare I suggest SYMBOL_SEARCH_NAME? That is, the only name we should
search for when we're looking up this function in symbol tables. Since
that's the other use.
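As a rough illustration of the split being discussed, here is a Python sketch of the two roles. The function names echo the proposed macro names but the bodies are assumptions made for this example; they are not the definitions from any patch in this thread.

```python
class Symbol:
    def __init__(self, mangled, demangled):
        self.mangled = mangled
        self.demangled = demangled  # None when the name could not be demangled


def symbol_print_name(sym, demangle_on):
    """Name shown to the user; follows the current display setting."""
    return sym.demangled if demangle_on and sym.demangled else sym.mangled


def symbol_search_name(sym):
    """Name used as the lookup key; independent of any user setting."""
    return sym.demangled or sym.mangled


bob = Symbol("_Z3bobv", "bob()")
main = Symbol("main", None)

# Printing changes with the setting, the search key does not.
for demangle_on in (False, True):
    print(demangle_on, symbol_print_name(bob, demangle_on), symbol_search_name(bob))
print(symbol_print_name(main, True), symbol_search_name(main))
```

Keeping the search key free of user settings means tables can be sorted or hashed once and searched later without caring what the user has changed in between.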
> > I just did a quick survey of the uses of SYMBOL_SOURCE_NAME.
>
> Thanks! I'd been meaning to do that, but I hadn't gotten around to it
> yet. Also, somebody should look at all uses of SYMBOL_DEMANGLED_NAME
> to see which ones are part of a chunk of code that could be replaced
> by SYMBOL_BEST_NAME.
>
> > They all fall into two categories:
> > - printing symbol names, and
> > - sort comparison functions.
>
> > The first usage is exactly correct: the way a symbol prints should
> > respect the current demangling setting.
>
> > The second usage seems wrong to me: if you sort under one demangling
> > setting, but then search under a different one, well, ... duh.
>
> Right. That's exactly what I was afraid of: we should make it clear
> that SYMBOL_SOURCE_NAME is for external use only. (Both by comments
> and by renaming it, as you suggest below.)
Blech. There's some of these left? I swear I fixed them! I see, in
symtab.c, bogus uses in lookup_partial_symbol and lookup_block_symbol,
but only in the linear search code. So I think I got the rest. The
ones for sort_search_symbols are benign.
> > The source code name of a symbol does not depend on the current
> > demangling setting; the way it should be printed obviously does. So
> > I'd suggest:
> > - changing the first category of uses to use SYMBOL_PRINT_NAME,
> > a new macro which will be defined just like the current
> > SYMBOL_SOURCE_NAME (or David's defn below looks nice, too), and
> > - defining a new SYMBOL_SOURCE_NAME which does what David's
> > SYMBOL_BEST_NAME does above.
>
> I agree.
I agree about SYMBOL_PRINT_NAME... or, how about SYMBOL_PRINTABLE_NAME?
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
From: David Carlton @ 2002-10-02 12:41 UTC (permalink / raw)
To: Jim Blandy; +Cc: gdb-patches
On 02 Oct 2002 14:18:51 -0500, Jim Blandy <jimb@redhat.com> said:
> David Carlton <carlton@math.stanford.edu> writes:
>> 1) This concept of 'a name that is as demangled as possible' is a
>> pretty important one and occurs in multiple places in GDB's sources,
> Yeah, I think we need something like that, too.
> How about SYMBOL_DEMANGLEDEST_NAME? Um.
Um indeed.
> I just did a quick survey of the uses of SYMBOL_SOURCE_NAME.
Thanks! I'd been meaning to do that, but I hadn't gotten around to it
yet. Also, somebody should look at all uses of SYMBOL_DEMANGLED_NAME
to see which ones are part of a chunk of code that could be replaced
by SYMBOL_BEST_NAME.
> They all fall into two categories:
> - printing symbol names, and
> - sort comparison functions.
> The first usage is exactly correct: the way a symbol prints should
> respect the current demangling setting.
> The second usage seems wrong to me: if you sort under one demangling
> setting, but then search under a different one, well, ... duh.
Right. That's exactly what I was afraid of: we should make it clear
that SYMBOL_SOURCE_NAME is for external use only. (Both by comments
and by renaming it, as you suggest below.)
> The source code name of a symbol does not depend on the current
> demangling setting; the way it should be printed obviously does. So
> I'd suggest:
> - changing the first category of uses to use SYMBOL_PRINT_NAME,
> a new macro which will be defined just like the current
> SYMBOL_SOURCE_NAME (or David's defn below looks nice, too), and
> - defining a new SYMBOL_SOURCE_NAME which does what David's
> SYMBOL_BEST_NAME does above.
I agree.
David Carlton
carlton@math.stanford.edu
* Re: RFA: Search for symbol names the same way they're hashed.
2002-10-02 10:45 ` David Carlton
@ 2002-10-02 12:35 ` Jim Blandy
2002-10-02 12:41 ` David Carlton
2002-10-02 13:43 ` Daniel Berlin
0 siblings, 2 replies; 18+ messages in thread
From: Jim Blandy @ 2002-10-02 12:35 UTC (permalink / raw)
To: David Carlton; +Cc: gdb-patches
David Carlton <carlton@math.stanford.edu> writes:
> 1) This concept of 'a name that is as demangled as possible' is a
> pretty important one and occurs in multiple places in GDB's sources,
> so let's formalize it:
>
> /* Macro that returns the demangled name of the symbol if possible
> and the symbol name if not possible. This is like
> SYMBOL_SOURCE_NAME except that it doesn't depend on the value of
> 'demangle' (and is hence more suitable for internal usage). The
> result should never be NULL. */
>
> /* FIXME: carlton/2002-09-26: Probably the situation with this and
> SYMBOL_SOURCE_NAME should be rethought. */
>
> #define SYMBOL_BEST_NAME(symbol) \
> (SYMBOL_DEMANGLED_NAME (symbol) != NULL \
> ? SYMBOL_DEMANGLED_NAME (symbol) \
> : SYMBOL_NAME (symbol))
>
> (I'm not exactly wedded to the name 'SYMBOL_BEST_NAME'; it was the
> first acceptable name that I thought of, given that SYMBOL_SOURCE_NAME
> was already taken to mean something subtly different.)
Yeah, I think we need something like that, too.
How about SYMBOL_DEMANGLEDEST_NAME? Um.
I just did a quick survey of the uses of SYMBOL_SOURCE_NAME. They all
fall into two categories:
- printing symbol names, and
- sort comparison functions.
The first usage is exactly correct: the way a symbol prints should
respect the current demangling setting.
The second usage seems wrong to me: if you sort under one demangling
setting, but then search under a different one, well, ... duh.
The source code name of a symbol does not depend on the current
demangling setting; the way it should be printed obviously does. So
I'd suggest:
- changing the first category of uses to use SYMBOL_PRINT_NAME,
a new macro which will be defined just like the current
SYMBOL_SOURCE_NAME (or David's defn below looks nice, too), and
- defining a new SYMBOL_SOURCE_NAME which does what David's
SYMBOL_BEST_NAME does above.
This could be two separate patches --- one for each of the steps above
--- but I think it should actually be one patch, since you either want
to accept or revert the two as a unit; having either one without the
other is just a mess.
* Re: RFA: Search for symbol names the same way they're hashed.
2002-10-02 11:04 ` Daniel Jacobowitz
@ 2002-10-02 12:10 ` David Carlton
2002-10-07 16:19 ` David Carlton
1 sibling, 0 replies; 18+ messages in thread
From: David Carlton @ 2002-10-02 12:10 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Jim Blandy, gdb-patches
On Wed, 2 Oct 2002 14:05:15 -0400, Daniel Jacobowitz <drow@mvista.com> said:
> (What is the right solution? Simple. While lookup_block_symbol
> presumably only returns one symbol, all of GDB needs to be aware
> that lookup_symbol can find more than one. We need a way to
> distinguish them (for static functions in multiple files - often all
> with the same filename! This happens in BFD) and a way to collapse
> them (for cloned constructors, breaking on one should probably break
> on all of them).
Yup. For what it's worth, in my current version of the dictionary
stuff for blocks, I've basically given up on having a dict_lookup
function that returns "the" match: instead, I have
dict_iter_name_{first,next} that allow you to iterate through all
entries in a dictionary whose SYMBOL_BEST_NAME strcmp_iw's to what
you're looking for.
Of course, I still have lookup_block_symbol return a single value, so,
for example, the relevant code in the non-function case for my version
of lookup_block_symbol looks like this:
  if (!BLOCK_FUNCTION (block))
    {
      for (sym = dict_iter_name_first (BLOCK_DICT (block), name, &iter);
           sym; sym = dict_iter_name_next (name, &iter))
        {
          if (SYMBOL_NAMESPACE (sym) == namespace
              && (mangled_name
                  ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0 : 1))
            return sym;
        }
      return NULL;
    }
(the BLOCK_FUNCTION (block) case is a little more complicated), but it
should be friendly to adding variants that might find multiple
symbols.
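A minimal sketch of what such a "find all matches" variant could look like, assuming a toy symbol type and a single linked chain. The names toy_symbol, toy_iter, toy_iter_name_first, toy_iter_name_next and toy_count_matches are hypothetical; they are only modeled on the dict_iter_name_{first,next} interface described above and are not the code on the branch. Plain strcmp stands in for strcmp_iw.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a symbol: a searchable name, a mangled
   name, and a link to the next symbol in the same chain.  */
struct toy_symbol
{
  const char *name;
  const char *mangled;
  struct toy_symbol *next;
};

/* Iterator state: the next chain entry that has not been examined.  */
struct toy_iter
{
  struct toy_symbol *cur;
};

/* Return the next symbol whose name equals NAME, or NULL when the
   chain is exhausted.  */
static struct toy_symbol *
toy_iter_name_next (const char *name, struct toy_iter *iter)
{
  assert (name != NULL && iter != NULL);
  while (iter->cur != NULL)
    {
      struct toy_symbol *sym = iter->cur;
      iter->cur = sym->next;
      if (strcmp (sym->name, name) == 0)
        return sym;
    }
  return NULL;
}

/* Start iterating over the chain at HEAD and return the first match.  */
static struct toy_symbol *
toy_iter_name_first (struct toy_symbol *head, const char *name,
                     struct toy_iter *iter)
{
  iter->cur = head;
  return toy_iter_name_next (name, iter);
}

/* Example caller: count every symbol that matches NAME instead of
   stopping at the first one.  */
static int
toy_count_matches (struct toy_symbol *head, const char *name)
{
  struct toy_iter iter;
  struct toy_symbol *sym;
  int count = 0;

  for (sym = toy_iter_name_first (head, name, &iter);
       sym != NULL; sym = toy_iter_name_next (name, &iter))
    count++;
  return count;
}
```

A caller that wants a single result can still return the first hit from the loop, while one that needs every cloned constructor can keep iterating.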
David Carlton
carlton@math.stanford.edu
* Re: RFA: Search for symbol names the same way they're hashed.
2002-10-01 20:45 Jim Blandy
2002-10-02 10:45 ` David Carlton
@ 2002-10-02 11:04 ` Daniel Jacobowitz
2002-10-02 12:10 ` David Carlton
2002-10-07 16:19 ` David Carlton
1 sibling, 2 replies; 18+ messages in thread
From: Daniel Jacobowitz @ 2002-10-02 11:04 UTC (permalink / raw)
To: Jim Blandy; +Cc: gdb-patches
On Tue, Oct 01, 2002 at 10:29:14PM -0500, Jim Blandy wrote:
>
> This one is odd. If the C++ folks (and Elena, if she has time) could
> review this carefully, that would be great.
>
> We build hashed blocks in buildsym.c:finish_block as follows:
>
> for (next = *listhead; next; next = next->next)
> {
> for (j = next->nsyms - 1; j >= 0; j--)
> {
> struct symbol *sym;
> unsigned int hash_index;
> const char *name = SYMBOL_DEMANGLED_NAME (next->symbol[j]);
> if (name == NULL)
> name = SYMBOL_NAME (next->symbol[j]);
> hash_index = msymbol_hash_iw (name);
> hash_index = hash_index % BLOCK_BUCKETS (block);
> sym = BLOCK_BUCKET (block, hash_index);
> BLOCK_BUCKET (block, hash_index) = next->symbol[j];
> next->symbol[j]->hash_next = sym;
> }
> }
>
> But then, we search for them in the hashed block in
> symtab.c:lookup_block_symbol like this:
>
> if (BLOCK_HASHTABLE (block))
> {
> unsigned int hash_index;
> hash_index = msymbol_hash_iw (name);
> hash_index = hash_index % BLOCK_BUCKETS (block);
> for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> {
> if (SYMBOL_NAMESPACE (sym) == namespace
> && (mangled_name
> ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
> : SYMBOL_MATCHES_NAME (sym, name)))
> return sym;
> }
> return NULL;
> }
>
> Now, it's a basic fact of hash tables that you can only search using a
> matching criterion strictly more conservative than equality under your
> hash function. If you hash one way, but then search another way, some
> of your matches might have been filed in other hash buckets.
>
> So the above code probably isn't quite right.
OK. So the fact is that if name was not the name which was hashed, we
can't assume we'll find it here. So either we should be hashing both
names we want to check, or we should only check one. That's the basic
principle.
> Let's assume that, if the caller has supplied a non-zero MANGLED_NAME,
> it is indeed the mangled form of NAME. Otherwise, the caller is
> broken and deserves whatever it gets. Mangled name matching is a more
> conservative matching criterion than demangled name matching (right?),
> so the `mangled_name ? strcmp ...' part is okay.
I like your assumption. That's what ought to happen. The whole
mangled_name thing is for handling multiple variants of constructors,
etc; I don't think it's the long-term right solution but it works for
now.
(What is the right solution? Simple. While lookup_block_symbol
presumably only returns one symbol, all of GDB needs to be aware that
lookup_symbol can find more than one. We need a way to distinguish
them (for static functions in multiple files - often all with the same
filename! This happens in BFD) and a way to collapse them (for cloned
constructors, breaking on one should probably break on all of them).
The one-to-one mapping of symbols to addresses doesn't hold up under
pressure.)
But I digress.
Right now, there are multiple functions with the same demangled symbol
name, but different mangled names. If we start using DMGL_VERBOSE
(spelling?) then that won't happen but we need to be ready for the
extra information it provides; things like "X::X[in-charge](const X&)".
So we just take "X::X(const X&)" for now, and all the X::X's are hashed
to the same bucket, which we could take advantage of in overload
resolution once all symbol readers do that. And the mangled name tells
us which one we want back.
> But when MANGLED_NAME is zero, then the SYMBOL_MATCHES_NAME part is
> questionable. The definition of SYMBOL_MATCHES_NAME from symtab.h is
> as follows:
>
> #define SYMBOL_MATCHES_NAME(symbol, name) \
> (STREQ (SYMBOL_NAME (symbol), (name)) \
> || (SYMBOL_DEMANGLED_NAME (symbol) != NULL \
> && strcmp_iw (SYMBOL_DEMANGLED_NAME (symbol), (name)) == 0))
>
> It returns true if NAME matches either SYMBOL_NAME or
> SYMBOL_DEMANGLED_NAME.
>
> If the intention really was to use SYMBOL_MATCHES_NAME to recognize
> matches, then the code is broken. If SYMBOL_NAME (sym) matches NAME,
> and sym has a demangled name which is different from NAME (which will
> usually be the case), SYMBOL_MATCHES_NAME (sym, name) will be true,
> but finish_block will have hashed on the demangled name, and probably
> will have filed sym in a different hash bucket than the one we'll
> search.
>
> If finish_block's criteria are the correct ones, these liberal
> search criteria can't cause us any problems. Where finish_block
> would consider a name to match a symbol, SYMBOL_MATCHES_NAME would
> too: the latter checks both names. And where finish_block would
> consider two names to differ, SYMBOL_MATCHES_NAME behaves the same way
> where there is no demangled name, and since mangled matching is more
> conservative than demangled matching, it will also behave the same way
> where there is a demangled name.
>
> Does that sound right? :)
Yep!
> My first reaction was to say that finish_block is broken --- we want
> to recognize matches on either mangled or unmangled names, but
> finish_block's behavior means that only matches on unmangled names, or
> mangled names where there are no unmangled names, work reliably.
>
> But the block hashing code has been in since July, and we haven't had
> any problems due to this behavior. (The bug I fixed in
> lookup_symbol_aux had to do with bad arguments being passed to
> lookup_block_symbol, so that doesn't count.) The analysis above means
> that we've been living with finish_block's criteria since then.
>
> Changing lookup_block_symbol to use the same criteria as finish_block
> should have no effect, and will make all the subtlety above go away.
I agree. Your patch is functionally the same, given our invariants;
it's a little different (two strcmps for the mangled_name case instead
of one) but much clearer. It's fine with me, although if you and David
can agree on (or find a better name than) SYMBOL_BEST_NAME you might
want to use a macro or function to centralize this matching, and move
the comment there. I can't think of anywhere else we'd want to match
in the hashtable, so a private function in symtab.c might be best.
> So here's the patch I'm offering. The point is that it makes the
> match criteria clearly stricter than the hashing criteria, avoiding
> the confusion that I've had to wade through.
>
> 2002-10-01 Jim Blandy <jimb@redhat.com>
>
> * symtab.c (lookup_block_symbol): Use the same matching criteria
> that buildsym.c:finish_block used when constructing the hash
> table.
>
> Index: gdb/symtab.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/symtab.c,v
> retrieving revision 1.70
> diff -c -r1.70 symtab.c
> *** gdb/symtab.c 20 Sep 2002 14:58:58 -0000 1.70
> --- gdb/symtab.c 2 Oct 2002 03:40:28 -0000
> ***************
> *** 1347,1357 ****
> hash_index = hash_index % BLOCK_BUCKETS (block);
> for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> {
> ! if (SYMBOL_NAMESPACE (sym) == namespace
> ! && (mangled_name
> ! ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
> ! : SYMBOL_MATCHES_NAME (sym, name)))
> ! return sym;
> }
> return NULL;
> }
> --- 1347,1372 ----
> hash_index = hash_index % BLOCK_BUCKETS (block);
> for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
> {
> ! /* Note that you can't just tweak these matching criteria
> ! arbitrarily. They must be stricter than those assumed
> ! when we build the hash table in finish_block; otherwise,
> ! the code there will put symbols you'd like to match in
> ! different hash buckets. */
> ! if (SYMBOL_NAMESPACE (sym) == namespace)
> ! {
> ! /* We hash using a hash function that respects
> ! strcmp_iw; strcmp is more conservative than
> ! strcmp_iw, so it's fine to use that instead here if
> ! we like. */
> ! if (SYMBOL_DEMANGLED_NAME (sym)
> ! ? strcmp_iw (SYMBOL_DEMANGLED_NAME (sym), name) == 0
> ! : strcmp (SYMBOL_NAME (sym), name) == 0)
> ! {
> ! if (! mangled_name
> ! || strcmp (SYMBOL_NAME (sym), mangled_name) == 0)
> ! return sym;
> ! }
> ! }
> }
> return NULL;
> }
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
* Re: RFA: Search for symbol names the same way they're hashed.
2002-10-01 20:45 Jim Blandy
@ 2002-10-02 10:45 ` David Carlton
2002-10-02 12:35 ` Jim Blandy
2002-10-02 11:04 ` Daniel Jacobowitz
1 sibling, 1 reply; 18+ messages in thread
From: David Carlton @ 2002-10-02 10:45 UTC (permalink / raw)
To: Jim Blandy; +Cc: gdb-patches
On Tue, 1 Oct 2002 22:29:14 -0500, Jim Blandy <jimb@redhat.com> said:
> But when MANGLED_NAME is zero, then the SYMBOL_MATCHES_NAME part is
> questionable. The definition of SYMBOL_MATCHES_NAME from symtab.h is
> as follows:
> #define SYMBOL_MATCHES_NAME(symbol, name) \
> (STREQ (SYMBOL_NAME (symbol), (name)) \
> || (SYMBOL_DEMANGLED_NAME (symbol) != NULL \
> && strcmp_iw (SYMBOL_DEMANGLED_NAME (symbol), (name)) == 0))
> It returns true if NAME matches either SYMBOL_NAME or
> SYMBOL_DEMANGLED_NAME.
> If the intention really was to use SYMBOL_MATCHES_NAME to recognize
> matches, then the code is broken. If SYMBOL_NAME (sym) matches
> NAME, and sym has a demangled name which is different from NAME
> (which will usually be the case), SYMBOL_MATCHES_NAME (sym, name)
> will be true, but finish_block will have hashed on the demangled
> name, and probably will have filed sym in a different hash bucket
> than the one we'll search.
I was thinking about this some last week, albeit from a slightly
different angle, and I agree with you that this doesn't work well. My
attitude is that, in lookup_block_symbol, 'name' should be the
demangled form of the name. (After all, there's a separate
'mangled_name' argument, not a separate 'demangled_name' argument.)
Assuming of course that we're in the C++ case where there's a
difference between mangled and demangled names, then using the current
definition of SYMBOL_MATCHES_NAME is wrong: it first does
  STREQ (SYMBOL_NAME (symbol), (name))
which tests 'name' against the _mangled_ form of the name, which we
never want to do, so this could lead to false positives. It's not
very likely to run into false positives, of course, because a
demangled name should never look like a mangled name, but why run the
risk? (Though I hadn't thought of the interaction with hashing that
you brought up: that certainly decreases the likelihood of false
positives.)
My current solution is perhaps a bit hard to read from the sources on
my branch, but basically it boils down to this:
1) This concept of 'a name that is as demangled as possible' is a
pretty important one and occurs in multiple places in GDB's sources,
so let's formalize it:
/* Macro that returns the demangled name of the symbol if possible
and the symbol name if not possible. This is like
SYMBOL_SOURCE_NAME except that it doesn't depend on the value of
'demangle' (and is hence more suitable for internal usage). The
result should never be NULL. */
/* FIXME: carlton/2002-09-26: Probably the situation with this and
SYMBOL_SOURCE_NAME should be rethought. */
#define SYMBOL_BEST_NAME(symbol) \
(SYMBOL_DEMANGLED_NAME (symbol) != NULL \
? SYMBOL_DEMANGLED_NAME (symbol) \
: SYMBOL_NAME (symbol))
(I'm not exactly wedded to the name 'SYMBOL_BEST_NAME'; it was the
first acceptable name that I thought of, given that SYMBOL_SOURCE_NAME
was already taken to mean something subtly different.)
I then went and inserted this macro in some of the places in GDB where
it would fit (but by no means all: I haven't yet had time to do an
exhaustive search). E.g. the definition for SYMBOL_SOURCE_NAME now
becomes
/* Macro that returns the "natural source name" of a symbol. In C++ this is
the "demangled" form of the name if demangle is on and the "mangled" form
of the name if demangle is off. In other languages this is just the
symbol name. The result should never be NULL. */
/* NOTE: carlton/2002-09-26: For external use only; in many
situations, SYMBOL_BEST_NAME is more appropriate. */
#define SYMBOL_SOURCE_NAME(symbol) \
(demangle ? SYMBOL_BEST_NAME (symbol) : SYMBOL_NAME (symbol))
or, when I'm hashing symbol names to build the hash table, I do
hash_index = msymbol_hash_iw (SYMBOL_BEST_NAME (sym)) % nbuckets;
2) So then what happens to lookup_block_symbol? When looking up names
in a hash table, I find the index just like I do when building it.
Then, to test whether or not I've found the right symbol, if
'mangled_name' is nonzero, I do
strcmp (SYMBOL_NAME (sym), mangled_name)
and otherwise, I do
strcmp_iw (SYMBOL_BEST_NAME (sym), name)
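The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: toy_symbol, TOY_BEST_NAME, toy_hash, toy_insert and toy_lookup are hypothetical names, the hash is a generic string hash rather than msymbol_hash_iw, and plain strcmp stands in for strcmp_iw.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8

/* Hypothetical stand-in for GDB's struct symbol.  */
struct toy_symbol
{
  const char *mangled;          /* always set */
  const char *demangled;        /* NULL if there is no demangled form */
  struct toy_symbol *hash_next;
};

/* Same idea as SYMBOL_BEST_NAME: as demangled as possible.  */
#define TOY_BEST_NAME(sym) \
  ((sym)->demangled != NULL ? (sym)->demangled : (sym)->mangled)

/* Generic string hash, reduced to a bucket index.  */
static unsigned int
toy_hash (const char *s)
{
  unsigned int h = 0;

  assert (s != NULL);
  while (*s != '\0')
    h = h * 31 + (unsigned char) *s++;
  return h % NBUCKETS;
}

/* Building the table: file SYM under its best name.  */
static void
toy_insert (struct toy_symbol **table, struct toy_symbol *sym)
{
  unsigned int i = toy_hash (TOY_BEST_NAME (sym));

  sym->hash_next = table[i];
  table[i] = sym;
}

/* Looking up: compute the index from NAME exactly as toy_insert
   does, then test the mangled name if the caller supplied one and
   the best name otherwise.  */
static struct toy_symbol *
toy_lookup (struct toy_symbol **table, const char *name,
            const char *mangled_name)
{
  struct toy_symbol *sym;

  for (sym = table[toy_hash (name)]; sym != NULL; sym = sym->hash_next)
    if (mangled_name != NULL
        ? strcmp (sym->mangled, mangled_name) == 0
        : strcmp (TOY_BEST_NAME (sym), name) == 0)
      return sym;
  return NULL;
}
```

In this sketch a lookup that passes a mangled string as 'name' with no 'mangled_name' can never match a symbol that has a demangled form, which is the false positive the current SYMBOL_MATCHES_NAME allows.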
Some further thoughts:
It seems entirely plausible to me that SYMBOL_MATCHES_NAME should
really be defined as follows:
#define SYMBOL_MATCHES_NAME(symbol, name) \
  (strcmp_iw (SYMBOL_BEST_NAME (symbol), (name)) == 0)
But I haven't seriously looked at the other places where
SYMBOL_MATCHES_NAME is being used. I suppose another plausible option
would be
#define SYMBOL_MATCHES_NAME(symbol, name) \
  (SYMBOL_DEMANGLED_NAME (symbol) != NULL \
   ? strcmp_iw (SYMBOL_DEMANGLED_NAME (symbol), (name)) == 0 \
   : strcmp (SYMBOL_NAME (symbol), (name)) == 0)
(This is effectively what you're doing in your patch to
lookup_block_symbol.) The difference here would be that we use strcmp
in the non-C++ case instead of strcmp_iw. My guess is that, in
practice, it wouldn't make a difference in the non-C++ case (since
symbol names shouldn't contain spaces or parentheses) and, if it
somehow did make a difference, then one could make a good case for
strcmp_iw being the right thing. Given that I want to encourage usage
of SYMBOL_BEST_NAME wherever possible, I prefer the first definition
of SYMBOL_MATCHES_NAME.
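To make the strcmp versus strcmp_iw distinction concrete, here is a simplified sketch of a whitespace-insensitive comparison and a hash that respects it. Both toy_strcmp_iw and toy_hash_iw are hypothetical illustrations; GDB's real strcmp_iw and msymbol_hash_iw have additional behavior that is not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 0 if A and B are equal once all whitespace is ignored,
   nonzero otherwise.  Simplified: no ordering, equality only.  */
static int
toy_strcmp_iw (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  for (;;)
    {
      while (isspace ((unsigned char) *a))
        a++;
      while (isspace ((unsigned char) *b))
        b++;
      if (*a != *b)
        return 1;
      if (*a == '\0')
        return 0;
      a++;
      b++;
    }
}

/* Hash that skips whitespace, so that any two strings equal under
   toy_strcmp_iw also receive the same hash value.  */
static unsigned int
toy_hash_iw (const char *s)
{
  unsigned int h = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (!isspace ((unsigned char) *s))
      h = h * 31 + (unsigned char) *s;
  return h;
}
```

With these definitions "foo (int)" and "foo(int)" compare equal and hash identically, whereas a name without spaces, such as a typical C symbol, behaves the same under either comparison.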
If people agree with me that introducing a macro like SYMBOL_BEST_NAME
would clarify matters here and elsewhere, I could spend some more time
looking at where it could be used and submit a patch that introduces
it.
David Carlton
carlton@math.stanford.edu
* RFA: Search for symbol names the same way they're hashed.
@ 2002-10-01 20:45 Jim Blandy
2002-10-02 10:45 ` David Carlton
2002-10-02 11:04 ` Daniel Jacobowitz
0 siblings, 2 replies; 18+ messages in thread
From: Jim Blandy @ 2002-10-01 20:45 UTC (permalink / raw)
To: gdb-patches
This one is odd. If the C++ folks (and Elena, if she has time) could
review this carefully, that would be great.
We build hashed blocks in buildsym.c:finish_block as follows:
  for (next = *listhead; next; next = next->next)
    {
      for (j = next->nsyms - 1; j >= 0; j--)
        {
          struct symbol *sym;
          unsigned int hash_index;
          const char *name = SYMBOL_DEMANGLED_NAME (next->symbol[j]);
          if (name == NULL)
            name = SYMBOL_NAME (next->symbol[j]);
          hash_index = msymbol_hash_iw (name);
          hash_index = hash_index % BLOCK_BUCKETS (block);
          sym = BLOCK_BUCKET (block, hash_index);
          BLOCK_BUCKET (block, hash_index) = next->symbol[j];
          next->symbol[j]->hash_next = sym;
        }
    }
But then, we search for them in the hashed block in
symtab.c:lookup_block_symbol like this:
  if (BLOCK_HASHTABLE (block))
    {
      unsigned int hash_index;
      hash_index = msymbol_hash_iw (name);
      hash_index = hash_index % BLOCK_BUCKETS (block);
      for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
        {
          if (SYMBOL_NAMESPACE (sym) == namespace
              && (mangled_name
                  ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
                  : SYMBOL_MATCHES_NAME (sym, name)))
            return sym;
        }
      return NULL;
    }
Now, it's a basic fact of hash tables that you can only search using a
matching criterion strictly more conservative than equality under your
hash function. If you hash one way, but then search another way, some
of your matches might have been filed in other hash buckets.
So the above code probably isn't quite right.
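As a small illustration of that mismatch, here is a sketch with a toy table that files symbols under their demangled name but accepts a match on either name when searching. Everything here (toy_symbol, toy_hash, toy_insert, toy_lookup_liberal, the global buckets array) is hypothetical and much simpler than the real block code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8

/* Hypothetical stand-in for GDB's struct symbol.  */
struct toy_symbol
{
  const char *mangled;          /* always set */
  const char *demangled;        /* NULL if there is no demangled form */
  struct toy_symbol *hash_next;
};

static struct toy_symbol *buckets[NBUCKETS];

/* Generic string hash, reduced to a bucket index.  */
static unsigned int
toy_hash (const char *s)
{
  unsigned int h = 0;

  assert (s != NULL);
  while (*s != '\0')
    h = h * 31 + (unsigned char) *s++;
  return h % NBUCKETS;
}

/* Like finish_block: file SYM under its demangled name if it has
   one, and under its mangled name otherwise.  */
static void
toy_insert (struct toy_symbol *sym)
{
  unsigned int i = toy_hash (sym->demangled ? sym->demangled : sym->mangled);

  sym->hash_next = buckets[i];
  buckets[i] = sym;
}

/* Liberal search: accept a match on either name, but look only in
   the bucket that NAME itself hashes to.  */
static struct toy_symbol *
toy_lookup_liberal (const char *name)
{
  struct toy_symbol *sym;

  for (sym = buckets[toy_hash (name)]; sym != NULL; sym = sym->hash_next)
    if (strcmp (sym->mangled, name) == 0
        || (sym->demangled != NULL && strcmp (sym->demangled, name) == 0))
      return sym;
  return NULL;
}
```

Searching for a symbol by its demangled name always finds it here. Searching by its mangled name succeeds only when the two names happen to land in the same bucket, even though the match test itself would accept the symbol.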
Let's assume that, if the caller has supplied a non-zero MANGLED_NAME,
it is indeed the mangled form of NAME. Otherwise, the caller is
broken and deserves whatever it gets. Mangled name matching is a more
conservative matching criterion than demangled name matching (right?),
so the `mangled_name ? strcmp ...' part is okay.
But when MANGLED_NAME is zero, then the SYMBOL_MATCHES_NAME part is
questionable. The definition of SYMBOL_MATCHES_NAME from symtab.h is
as follows:
#define SYMBOL_MATCHES_NAME(symbol, name) \
(STREQ (SYMBOL_NAME (symbol), (name)) \
|| (SYMBOL_DEMANGLED_NAME (symbol) != NULL \
&& strcmp_iw (SYMBOL_DEMANGLED_NAME (symbol), (name)) == 0))
It returns true if NAME matches either SYMBOL_NAME or
SYMBOL_DEMANGLED_NAME.
If the intention really was to use SYMBOL_MATCHES_NAME to recognize
matches, then the code is broken. If SYMBOL_NAME (sym) matches NAME,
and sym has a demangled name which is different from NAME (which will
usually be the case), SYMBOL_MATCHES_NAME (sym, name) will be true,
but finish_block will have hashed on the demangled name, and probably
will have filed sym in a different hash bucket than the one we'll
search.
If finish_block's criteria are the correct ones, these liberal
search criteria can't cause us any problems. Where finish_block
would consider a name to match a symbol, SYMBOL_MATCHES_NAME would
too: the latter checks both names. And where finish_block would
consider two names to differ, SYMBOL_MATCHES_NAME behaves the same way
where there is no demangled name, and since mangled matching is more
conservative than demangled matching, it will also behave the same way
where there is a demangled name.
Does that sound right? :)
My first reaction was to say that finish_block is broken --- we want
to recognize matches on either mangled or unmangled names, but
finish_block's behavior means that only matches on unmangled names, or
mangled names where there are no unmangled names, work reliably.
But the block hashing code has been in since July, and we haven't had
any problems due to this behavior. (The bug I fixed in
lookup_symbol_aux had to do with bad arguments being passed to
lookup_block_symbol, so that doesn't count.) The analysis above means
that we've been living with finish_block's criteria since then.
Changing lookup_block_symbol to use the same criteria as finish_block
should have no effect, and will make all the subtlety above go away.
So here's the patch I'm offering. The point is that it makes the
match criteria clearly stricter than the hashing criteria, avoiding
the confusion that I've had to wade through.
2002-10-01 Jim Blandy <jimb@redhat.com>
* symtab.c (lookup_block_symbol): Use the same matching criteria
that buildsym.c:finish_block used when constructing the hash
table.
Index: gdb/symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.70
diff -c -r1.70 symtab.c
*** gdb/symtab.c 20 Sep 2002 14:58:58 -0000 1.70
--- gdb/symtab.c 2 Oct 2002 03:40:28 -0000
***************
*** 1347,1357 ****
hash_index = hash_index % BLOCK_BUCKETS (block);
for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
{
! if (SYMBOL_NAMESPACE (sym) == namespace
! && (mangled_name
! ? strcmp (SYMBOL_NAME (sym), mangled_name) == 0
! : SYMBOL_MATCHES_NAME (sym, name)))
! return sym;
}
return NULL;
}
--- 1347,1372 ----
hash_index = hash_index % BLOCK_BUCKETS (block);
for (sym = BLOCK_BUCKET (block, hash_index); sym; sym = sym->hash_next)
{
! /* Note that you can't just tweak these matching criteria
! arbitrarily. They must be stricter than those assumed
! when we build the hash table in finish_block; otherwise,
! the code there will put symbols you'd like to match in
! different hash buckets. */
! if (SYMBOL_NAMESPACE (sym) == namespace)
! {
! /* We hash using a hash function that respects
! strcmp_iw; strcmp is more conservative than
! strcmp_iw, so it's fine to use that instead here if
! we like. */
! if (SYMBOL_DEMANGLED_NAME (sym)
! ? strcmp_iw (SYMBOL_DEMANGLED_NAME (sym), name) == 0
! : strcmp (SYMBOL_NAME (sym), name) == 0)
! {
! if (! mangled_name
! || strcmp (SYMBOL_NAME (sym), mangled_name) == 0)
! return sym;
! }
! }
}
return NULL;
}
end of thread, other threads:[~2002-10-07 23:38 UTC | newest]
Thread overview: 18+ messages
[not found] <1033595444.9324.ezmlm@sources.redhat.com>
2002-10-02 17:49 ` RFA: Search for symbol names the same way they're hashed Jim Ingham
2002-10-02 18:16 ` Daniel Jacobowitz
2002-10-02 18:25 ` Jim Ingham
2002-10-03 8:15 ` Daniel Berlin
2002-10-01 20:45 Jim Blandy
2002-10-02 10:45 ` David Carlton
2002-10-02 12:35 ` Jim Blandy
2002-10-02 12:41 ` David Carlton
2002-10-02 13:02 ` Daniel Jacobowitz
2002-10-02 13:20 ` David Carlton
2002-10-02 13:25 ` Daniel Jacobowitz
2002-10-02 13:43 ` Daniel Berlin
2002-10-02 14:03 ` Daniel Jacobowitz
2002-10-02 16:10 ` Daniel Berlin
2002-10-02 11:04 ` Daniel Jacobowitz
2002-10-02 12:10 ` David Carlton
2002-10-07 16:19 ` David Carlton
2002-10-07 16:38 ` Daniel Jacobowitz