* [PATCH] Fix coff symbol table reading problem for C code compiled by g++
@ 2004-08-13 0:06 Ton van Overbeek
2004-08-14 11:36 ` Eli Zaretskii
2004-09-16 0:23 ` Jim Blandy
0 siblings, 2 replies; 5+ messages in thread
From: Ton van Overbeek @ 2004-08-13 0:06 UTC
To: gdb-patches
I found a bug/problem in the symbol reading code in symtab.c in gdb-6.2.
The problem occurs when reading symbols from a coff object file produced
from a C source file compiled by g++ (not by gcc).
The particular compiler is m68k-palmos-gcc, which is still based on gcc-2.95.3.
See http://prc-tools.sourceforge.net.
I know this gcc version is very old, but I believe the problem may also exist
for other compilers.
When normal C code is compiled by g++, the function names are mangled.
The coff symbol reader first reads the function name and inserts it in
the minimal symbol table and in the demangled name hash table in
symbol_set_names(). When the '.bf' symbol is read, the function name is
inserted in the full symbol table. This time the name is already in the hash
table, so symbol_set_names() does not call symbol_find_demangled_name(). A side
effect of symbol_find_demangled_name() is that it corrects the
gsymbol->language field. In the case of 'C compiled by g++' it changes
it from language_auto to language_cplus.
Because symbol_find_demangled_name() is not called, the gsymbol->language field
in the full symbol table stays set to language_auto.
This causes all kinds of problems when looking up symbols later, since the
stored name is the mangled name and the demangled name is empty: the symbol is
not found in the full symbol table and the code falls back on the minimal
symbol table for, e.g., function names.
When trying to set a breakpoint on a function, the breakpoint is then set
on the last line of the preceding function.
I have applied the following fix to ensure that symbol_find_demangled_name()
is also called in this case. It is working for me. I do not know
if something else is needed for other languages/compilers/compiler
versions.
The patch below is against the 6.2 code.
===============================================================================
--- symtab.c.orig 2004-07-08 07:18:27.000000000 -0400
+++ symtab.c 2004-08-12 19:13:12.920308800 -0400
@@ -524,6 +524,7 @@ symbol_set_names (struct general_symbol_
const char *lookup_name;
/* The length of lookup_name. */
int lookup_len;
+ char *demangled_name = NULL;
if (objfile->demangled_names_hash == NULL)
create_demangled_names_hash (objfile);
@@ -569,9 +570,10 @@ symbol_set_names (struct general_symbol_
/* If this name is not in the hash table, add it. */
if (*slot == NULL)
{
- char *demangled_name = symbol_find_demangled_name (gsymbol,
- linkage_name_copy);
- int demangled_len = demangled_name ? strlen (demangled_name) : 0;
+ int demangled_len;
+
+ demangled_name = symbol_find_demangled_name (gsymbol, linkage_name_copy);
+ demangled_len = demangled_name ? strlen (demangled_name) : 0;
/* If there is a demangled name, place it right after the mangled name.
Otherwise, just place a second zero byte after the end of the mangled
@@ -582,7 +584,6 @@ symbol_set_names (struct general_symbol_
if (demangled_name != NULL)
{
memcpy (*slot + lookup_len + 1, demangled_name, demangled_len + 1);
- xfree (demangled_name);
}
else
(*slot)[lookup_len + 1] = '\0';
@@ -590,10 +591,20 @@ symbol_set_names (struct general_symbol_
gsymbol->name = *slot + lookup_len - len;
if ((*slot)[lookup_len + 1] != '\0')
- gsymbol->language_specific.cplus_specific.demangled_name
- = &(*slot)[lookup_len + 1];
+ {
+ gsymbol->language_specific.cplus_specific.demangled_name
+ = &(*slot)[lookup_len + 1];
+ /* In case we found the demangled name in the hash table we still have
+ to call SYMBOL_FIND_DEMANGLED_NAME in order to set the
+ gsymbol->language field correctly. */
+ if (demangled_name == NULL)
+ demangled_name = symbol_find_demangled_name (gsymbol,
+ linkage_name_copy);
+ }
else
gsymbol->language_specific.cplus_specific.demangled_name = NULL;
+
+ if (demangled_name != NULL) xfree (demangled_name);
}
/* Initialize the demangled name of GSYMBOL if possible. Any required space
===============================================================================
Please let me know if I should submit this as a PR and/or if there is a
better fix for this problem.
Ton van Overbeek
* Re: [PATCH] Fix coff symbol table reading problem for C code compiled by g++
From: Eli Zaretskii @ 2004-08-14 11:36 UTC
To: Ton van Overbeek; +Cc: gdb-patches
> Date: Fri, 13 Aug 2004 02:06:00 +0200 (CEST)
> From: Ton van Overbeek <v-overbeek@cistron.nl>
>
> I found a bug/problem in the symbol reading code in symtab.c in gdb-6.2.
Thanks for the report and the patch.
Could you please also provide a short test case and an example
compile-link-GDB session script which shows both how to reproduce the
problem and the exact output from GDB?
TIA
* Re: [PATCH] Fix coff symbol table reading problem for C code compiled by g++
From: Ton van Overbeek @ 2004-08-15 0:29 UTC
To: Eli Zaretskii; +Cc: gdb-patches
On Sat, 14 Aug 2004, Eli Zaretskii wrote:
> > Date: Fri, 13 Aug 2004 02:06:00 +0200 (CEST)
> > From: Ton van Overbeek <v-overbeek@cistron.nl>
> >
> > I found a bug/problem in the symbol reading code in symtab.c in gdb-6.2.
>
> Thanks for the report and the patch.
>
> Could you please also provide a short test case and an example
> compile-link-GDB session script which shows both how to reproduce the
> problem and the exact output from GDB?
>
Here is a script illustrating the problem. Since I am working on a gdb-6.x
port for m68k-palmos the script is m68k-palmos specific (main has to be
called PilotMain), but it should be easy to adapt to the general case.
-------------------------------------------------------------------------------
#!/bin/sh
set -x
# Set to gdb used
GDB=./gdb
cat << EOF > test.c
/* Following #ifdef m68k-palmos specific */
#ifdef __cplusplus
extern "C" {
int PilotMain(void);
}
#endif
void g (void)
{
}
void f (void)
{
}
/* In m68k-palmos no main(), but PilotMain() */
int PilotMain (void)
{
f();
g();
return 0;
}
EOF
cat <<EOF > test.cmd
file test
break f
info break
quit
EOF
m68k-palmos-g++ --version
echo "Compiling & Linking test program"
m68k-palmos-g++ -g -o test test.c
echo "Running gdb"
$GDB --command test.cmd
rm -f test.c test.cmd test
-------------------------------------------------------------------------------
Here is the output from the script when run with a gdb without the
patch:
-------------------------------------------------------------------------------
+ GDB=./gdb
+ cat
+ cat
+ m68k-palmos-g++ --version
2.95.3-kgpd
+ echo Compiling & Linking test program
Compiling & Linking test program
+ m68k-palmos-g++ -g -o test test.c
+ echo Running gdb
Running gdb
+ ./gdb --command test.cmd
GNU gdb 6.2
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i686-pc-cygwin --target=m68k-palmos".
Breakpoint 1 at 0x138: file test.c, line 9.
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000138 in g__Fv at test.c:9
+ rm -f test.c test.cmd test
------------------------------------------------------------------------------
Note the breakpoint is set at the end of the preceding function and the
preceding function name g__Fv is the mangled name.
Now the output of the patched gdb:
------------------------------------------------------------------------------
+ GDB=./gdb-new
+ cat
+ cat
+ m68k-palmos-g++ --version
2.95.3-kgpd
+ echo Compiling & Linking test program
Compiling & Linking test program
+ m68k-palmos-g++ -g -o test test.c
+ echo Running gdb
Running gdb
+ ./gdb-new --command test.cmd
GNU gdb 6.2
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i686-pc-cygwin --target=m68k-palmos".
Breakpoint 1 at 0x148: file test.c, line 13.
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000148 in f(void) at test.c:13
+ rm -f test.c test.cmd test
------------------------------------------------------------------------------
Now the breakpoint is reported correctly.
I hope you can convert this to a test program + expect script for the
testsuite.
Let me know if this is sufficient.
Ton van Overbeek
* Re: [PATCH] Fix coff symbol table reading problem for C code compiled by g++
From: Eli Zaretskii @ 2004-08-21 12:19 UTC
To: Ton van Overbeek; +Cc: gdb-patches
> Date: Sun, 15 Aug 2004 02:29:40 +0200 (CEST)
> From: Ton van Overbeek <v-overbeek@cistron.nl>
>
> Here is a script illustrating the problem. Since I am working on a gdb-6.x
> port for m68k-palmos the script is m68k-palmos specific (main has to be
> called PilotMain), but it should be easy to adapt to the general case.
I tried to reproduce this problem with GCC 2.95.2, but couldn't see
the problem: an unpatched GDB 6.1 reported "f(void)" instead of
"g__Fv". Perhaps the problem was solved in GCC 2.95.2, or it is
specific to the m68k-palmos port.
Nevertheless, I cannot see any problems with the proposed patch, so as
a maintainer of one of the few GDB ports that use COFF debugging info,
I recommend that this patch be accepted.
Jim and Elena, could you please review this and approve it if you
don't have any objections?
* Re: [PATCH] Fix coff symbol table reading problem for C code compiled by g++
From: Jim Blandy @ 2004-09-16 0:23 UTC
To: Ton van Overbeek, Daniel Jacobowitz; +Cc: gdb-patches
Ton van Overbeek <v-overbeek@cistron.nl> writes:
> I found a bug/problem in the symbol reading code in symtab.c in gdb-6.2.
>
> The problem occurs when reading symbols from a coff object file produced
> from a C source file compiled by g++ (not by gcc).
> The particular compiler is m68k-palmos-gcc, which is still based on gcc-2.95.3.
> See http://prc-tools.sourceforge.net.
> I know this gcc version is very old, but I believe the problem may also exist
> for other compilers.
>
> When compiling normal C code by g++ it produces mangled function names.
> The coff symbol reader first reads the function name and inserts this in
> the minimal symbol table and in a demangled name hash table in
> symbol_set_names(). When reading the '.bf' symbol the function name is inserted
> in the real symbol table. This time the symbol is already in the hash table
> in symbol_set_names() and symbol_find_demangled_name() is not called. A side
> effect of symbol_find_demangled_name() is that it changes/corrects the
> gsymbol->language field. In the case of 'C compiled by g++' it changes
> it from language_auto to language_cplus.
> Because symbol_find_demangled_name() is not called, the gsymbol->language field
> in the full symbol table stays set to language_auto.
> This causes all kinds of problems when looking up symbols later, since the
> stored name is the mangled name and the demangled name is empty: the symbol is
> not found in the full symbol table and the code falls back on the minimal
> symbol table or e.g. function names.
> When trying to set a breakpoint on a function, the breakpoint is then set
> on the last line of the preceding function.
>
> I have applied the following fix to ensure that symbol_find_demangled_name()
> is also called in this case. It is working for me. I do not know
> if something else is needed for other languages/compilers/compiler
> versions.
So, let me make sure I understand this correctly:
The essential problem is that symbol_set_names sometimes has the side
effect of setting GSYMBOL->language, and sometimes it doesn't: whether
it does depends on whether that particular mangled name has been seen
before in this objfile, which shouldn't matter.
Here's the thread about introducing demangled_names_hash:
http://sources.redhat.com/ml/gdb-patches/2003-01/msg00726.html
The main motivation for introducing it was to be able to include
mangled names in the partial symbol tables; we were also hoping to
save time by avoiding calling the demangler. As it turns out, the
time saved by not calling the demangler was used up (to within 1%) by
the overhead of the patch, so there was no net performance win.
(Assuming you weren't paging...)
The problem with your patch is that it brings back all the calls to
the demangler that the hash table allowed us to avoid: the demangler
gets called every time, whether we've already demangled the symbol
before or not.
I think the fundamental problem is that the hash table only retains
partial information about the results from symbol_find_demangled_name:
it retains the demangled name, but not the language whose demangler we
used. If we could retain that information, then symbol_set_names
could consistently provide the language.
I see two approaches. Based on the discussion in the thread, space is
at a premium, so we're only considering things which won't
significantly increase the memory usage. Specific numbers are from
the test case discussed in the thread.
- Store the language in another byte beyond the demangled name. This
makes the form of the hash table entries even less obvious. It
would also add 200k of memory consumption. On the other hand,
depending on the granularity of obstack_alloc, perhaps many of those
would fall into the padding at the end of the value.
The hair could be localized to symbol_set_names, though.
- Have a separate hash table for each language. In 'struct objfile',
we'd have:
struct htab *demangled_names_hashes[nr_languages];
They'd be allocated lazily. One would need to probe all hash tables
before deciding that a symbol hadn't been seen yet (or, only the
hash tables that'd actually been allocated, typically only one
unless you're mixing languages). Then, the index of the hash table
you'd found your name in would tell you the language.
This would entail a lot of changes elsewhere to properly initialize
and free demangled_names_hashes.
Daniel, what do you think? Have I at least got the problem right?