* Question about blockframe.c:inside_main_func()
@ 2004-04-29 0:17 Jason Molenda
2004-04-29 1:02 ` Joel Brobecker
2004-04-29 15:09 ` Andrew Cagney
0 siblings, 2 replies; 6+ messages in thread
From: Jason Molenda @ 2004-04-29 0:17 UTC (permalink / raw)
To: Andrew Cagney, Joel Brobecker; +Cc: gdb-patches
Hi all,
We're bringing up the currentish gdb sources here at Apple and I was
debugging a problem with inside_main_func () [*] when I noticed that
there seems to be a bit of extra computation that has snuck into the
function during the changes since July.
Previously, inside_main_func() would find the "main" function in the
"symfile_objfile", find its start and end addresses (if debug symbols
were present I guess) and on subsequent invocations, use those cached
addresses to determine if the addr in question is contained within the
"main" function.
The current inside_main_func() will do
  msymbol = lookup_minimal_symbol (main_name (), NULL,
                                   symfile_objfile);   // every time

  if (msymbol != NULL   // once
      && symfile_objfile->ei.main_func_lowpc == INVALID_ENTRY_LOWPC
      && symfile_objfile->ei.main_func_highpc == INVALID_ENTRY_HIGHPC)

  if (msymbol != NULL && MSYMBOL_TYPE (msymbol) == mst_text)   // every time
    {
      [... lots of stuff ...]
    }
I realize this is hardly a performance critical function, but it's
still a far cry from the version that existed before July which would
find the start/end addresses and then do
  if (symfile_objfile->ei.main_func_lowpc == INVALID_ENTRY_LOWPC   // once
      && symfile_objfile->ei.main_func_highpc == INVALID_ENTRY_HIGHPC)
    [... lookup symbol ... ]
  return (symfile_objfile->ei.main_func_lowpc <= pc
          && symfile_objfile->ei.main_func_highpc > pc);
Is there some reason why this shortcut has been dropped? Is there a
reason not to add a conditional at the top that detects when "main"'s
bounds have already been found and short-circuits the searching we're
doing every time?
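To make the short-circuit concrete, here is a minimal, self-contained
sketch of the caching pattern described above. It is not the gdb source:
core_addr, INVALID_PC, slow_lookup_main, lookup_count and the addresses
0x1000/0x1040 are all stand-ins invented for illustration; only the
shape of the check follows the pre-July code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t core_addr;   /* stand-in for gdb's CORE_ADDR */

/* Hypothetical sentinel meaning "bounds not computed yet".  */
#define INVALID_PC ((core_addr) ~(core_addr) 0)

/* Hypothetical cache; gdb keeps the real values in symfile_objfile->ei.  */
static core_addr main_lowpc = INVALID_PC;
static core_addr main_highpc = INVALID_PC;

static int lookup_count;      /* how many times the slow path ran */

/* Stand-in for the expensive symbol lookup; the addresses are made up.  */
static int
slow_lookup_main (core_addr *low, core_addr *high)
{
  lookup_count++;
  *low = 0x1000;
  *high = 0x1040;
  return 1;
}

/* Return nonzero if PC lies inside main, searching only while the
   cached bounds are still invalid.  */
static int
inside_main_func_cached (core_addr pc)
{
  if (main_lowpc == INVALID_PC && main_highpc == INVALID_PC)
    {
      core_addr low, high;

      if (slow_lookup_main (&low, &high))
        {
          main_lowpc = low;
          main_highpc = high;
        }
    }
  return main_lowpc <= pc && main_highpc > pc;
}
```

Calling inside_main_func_cached () for several pcs runs
slow_lookup_main () only once. Note that if the lookup fails, the bounds
stay invalid and the slow path is retried on every call, so a real fix
would probably also want a separate "already tried" flag.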
Jason
[*] We have something called "ZeroLink" where the main executable --
the symfile_objfile -- is a tiny stub that demand-loads each object
file (formatted like a shared library) as functions/global variables in
those .o's are referenced. So in our case, the symfile_objfile doesn't
contain main at all; hence me looking into this function and scratching
my head about why it's re-searching for this function every time...
* Re: Question about blockframe.c:inside_main_func()
2004-04-29 0:17 Question about blockframe.c:inside_main_func() Jason Molenda
@ 2004-04-29 1:02 ` Joel Brobecker
2004-04-29 1:50 ` Jason Molenda
2004-04-29 15:09 ` Andrew Cagney
1 sibling, 1 reply; 6+ messages in thread
From: Joel Brobecker @ 2004-04-29 1:02 UTC (permalink / raw)
To: Jason Molenda; +Cc: Andrew Cagney, gdb-patches
> We're bringing up the currentish gdb sources here at Apple and I was
> debugging a problem with inside_main_func () [*] when I noticed that
> there seems to be a bit of extra computation that has snuck into the
> function during the changes since July.
Indeed, it seems that we could improve this code a bit: it really looks
like we are trying to compute the low/high addresses for function main
twice! Even when main is found within the debug info, we go through the
trouble of finding the bounds of function "main" from the min syms
anyway. A missing condition somewhere?
At first glance, I would think something like this ought to work:
  if (lowpc == INVALID && highpc == INVALID)
    {
      msymbol = lookup_minimal_symbol (main_name (), NULL, symfile_objfile);
      if (msymbol != NULL)
        {
          /* Try finding the bounds from the debug info. */
          struct symbol *mainsym = find_pc_function (...);
          if (mainsym != NULL && ...)
            {
              lowpc = BLOCK_START (...);
              highpc = BLOCK_END (...);
            }
          else
            {
              /* Not found in the debug info, use the minsyms... */
              CORE_ADDR maddr = SYMBOL_VALUE_ADDRESS (msymbol);
              [...]
            }
        }
    }
  return (lowpc <= pc && highpc > pc);
Unless Andrew had a reason to believe that recomputing the low/high
PCs was necessary?
See http://sources.redhat.com/ml/gdb-patches/2003-07/msg00144.html,
where the block computing the function bounds from minimal symbols
was introduced.
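
For readers who want something that compiles, the "debug info first,
minsyms as a fallback" ordering in the outline above can be sketched as
below. This is a simplification under stated assumptions: struct
func_range, struct symbol_sources and find_main_bounds are hypothetical
names, and the two sources are modelled as optional pre-computed ranges
rather than real gdb symbol lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t core_addr;   /* stand-in for gdb's CORE_ADDR */

struct func_range
{
  core_addr low;    /* first address of the function */
  core_addr high;   /* one past the last address */
};

/* Hypothetical inputs: either source may be missing (NULL).  */
struct symbol_sources
{
  const struct func_range *debug_info;   /* block bounds from debug info */
  const struct func_range *minsym;       /* bounds estimated from minsyms */
};

/* Fill *OUT from the best available source and return 1, or return 0
   if neither source knows about the function.  The minsym data is
   consulted only when the debug info has nothing.  */
static int
find_main_bounds (const struct symbol_sources *src, struct func_range *out)
{
  if (src->debug_info != NULL)
    {
      *out = *src->debug_info;
      return 1;
    }
  if (src->minsym != NULL)
    {
      *out = *src->minsym;
      return 1;
    }
  return 0;
}
```

The point of the ordering is that the second computation happens only
when the first one produced nothing, instead of unconditionally.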
--
Joel
* Re: Question about blockframe.c:inside_main_func()
2004-04-29 1:02 ` Joel Brobecker
@ 2004-04-29 1:50 ` Jason Molenda
0 siblings, 0 replies; 6+ messages in thread
From: Jason Molenda @ 2004-04-29 1:50 UTC (permalink / raw)
To: Joel Brobecker; +Cc: Andrew Cagney, gdb-patches
On Apr 28, 2004, at 6:02 PM, Joel Brobecker wrote:
> Indeed, it seems that we could improve this code a bit: it really looks
> like we are trying to compute the low/high addresses for function main
> twice! Even when main is found within the debug info, we go through the
> trouble of finding the bounds of function "main" from the min syms
> anyway. A missing condition somewhere?
Yeah, if I'm not mistaken we'll look up main() twice for each frame in
a backtrace every time. With a GUI front end keeping a stack view
up-to-date on each step, and apps with deep stacks, that will add up
pretty fast. Here at Apple we found some real performance problems
with this type of code path - things that slowed down with deep stacks
and frequent 'backtraces' from the UI - so it's worth fixing.
Looking at the individual changes to inside_main_func, I'm pretty well
convinced this is just a combination of edits that didn't look at the
function's behavior overall. But it's worth checking with Andrew to
see if there is some edge-case that isn't immediately obvious which
would cause us to do this recomputation all the time.
J
* Re: Question about blockframe.c:inside_main_func()
2004-04-29 0:17 Question about blockframe.c:inside_main_func() Jason Molenda
2004-04-29 1:02 ` Joel Brobecker
@ 2004-04-29 15:09 ` Andrew Cagney
2004-04-30 0:27 ` Jason Molenda
1 sibling, 1 reply; 6+ messages in thread
From: Andrew Cagney @ 2004-04-29 15:09 UTC (permalink / raw)
To: Jason Molenda, Joel Brobecker; +Cc: gdb-patches
> Hi all,
>
> We're bringing up the currentish gdb sources here at Apple and I was debugging a problem with inside_main_func () [*] when I noticed that there seems to be a bit of extra computation that has snuck into the function during the changes since July.
>
> Previously, inside_main_func() would find the "main" function in the "symfile_objfile", find its start and end addresses (if debug symbols were present I guess) and on subsequent invocations, use those cached addresses to determine if the addr in question is contained within the "main" function.
>
> The current inside_main_func() will do
>
> msymbol = lookup_minimal_symbol (main_name (), NULL, symfile_objfile);// every time
>
> if (msymbol != NULL // once
> && symfile_objfile->ei.main_func_lowpc == INVALID_ENTRY_LOWPC
> && symfile_objfile->ei.main_func_highpc == INVALID_ENTRY_HIGHPC)
>
> if (msymbol != NULL && MSYMBOL_TYPE (msymbol) == mst_text) // every time
> {
> [... lots of stuff ...]
> }
>
> I realize this is hardly a performance critical function, but it's still a far cry from the version that existed before July which would find the start/end addresses and then do
>
> if (symfile_objfile->ei.main_func_lowpc == INVALID_ENTRY_LOWPC && // once
> symfile_objfile->ei.main_func_highpc == INVALID_ENTRY_HIGHPC)
> [... lookup symbol ... ]
>
> return (symfile_objfile->ei.main_func_lowpc <= pc
> && symfile_objfile->ei.main_func_highpc > pc);
> Is there some reason why this shortcut has been dropped? Is there a reason not to add a conditional at the top that detects when "main"'s bounds have already been found and short-circuits the searching we're doing every time?
Per Joel's comments, I'd guess accident.
However, I think the entire function's contents are bogus. It should
look like:
  if (symtab_find_function_range_by_name (main_name (), &low_pc, &high_pc))
    return low_pc <= pc && pc < high_pc;   /* pc in [low_pc, high_pc) */
  else
    return 0;
so that the logic is pushed back into the symbol table (an obvious thing
for symtab_find_function_range_by_name to do is implement a look-aside
cache).
This also lets us kill off main_func_lowpc and main_func_highpc (they
need to be killed off anyway as PIE breaks the assumption that the
values are constant across function invocations).
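
As a rough illustration of the shape this could take, here is a
self-contained sketch of a by-name range lookup with a one-entry
look-aside cache. Nothing here is existing gdb API: toy_symtab,
range_cache, find_function_range_by_name, inside_main_func_sketch and
the addresses are all hypothetical, and a real implementation would sit
on top of the actual symbol tables.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t core_addr;   /* stand-in for gdb's CORE_ADDR */

/* Toy symbol table; names and addresses are made up.  */
struct sym_entry { const char *name; core_addr low, high; };
static const struct sym_entry toy_symtab[] = {
  { "helper", 0x0f00, 0x0f80 },
  { "main",   0x1000, 0x1040 },
};

static int full_searches;     /* how many times the table was scanned */

/* One-entry look-aside cache keyed by function name.  */
static struct
{
  char name[64];
  core_addr low, high;
  int valid;
} range_cache;

static int
find_function_range_by_name (const char *name, core_addr *low,
                             core_addr *high)
{
  size_t i;

  /* Fast path: the last successful lookup was for the same name.  */
  if (range_cache.valid && strcmp (range_cache.name, name) == 0)
    {
      *low = range_cache.low;
      *high = range_cache.high;
      return 1;
    }

  full_searches++;
  for (i = 0; i < sizeof toy_symtab / sizeof toy_symtab[0]; i++)
    if (strcmp (toy_symtab[i].name, name) == 0)
      {
        /* Remember the hit, provided the name fits the fixed buffer.  */
        if (strlen (name) < sizeof range_cache.name)
          {
            strcpy (range_cache.name, name);
            range_cache.low = toy_symtab[i].low;
            range_cache.high = toy_symtab[i].high;
            range_cache.valid = 1;
          }
        *low = toy_symtab[i].low;
        *high = toy_symtab[i].high;
        return 1;
      }
  return 0;
}

/* The caller reduces to a half-open range test.  */
static int
inside_main_func_sketch (core_addr pc)
{
  core_addr low, high;

  if (find_function_range_by_name ("main", &low, &high))
    return low <= pc && pc < high;   /* pc in [low, high) */
  return 0;
}
```

With this split the caller holds no state of its own; whether and how to
cache becomes a decision local to the lookup routine.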
> Jason
>
> [*] We have something called "ZeroLink" where the main executable -- the symfile_objfile -- is a tiny stub that demand-loads each object file (formatted like a shared library) as functions/global variables in those .o's are referenced. So in our case, the symfile_objfile doesn't contain main at all; hence me looking into this function and scratching my head about why it's re-searching for this function every time...
You might want to look at PIE.
Andrew
* Re: Question about blockframe.c:inside_main_func()
2004-04-29 15:09 ` Andrew Cagney
@ 2004-04-30 0:27 ` Jason Molenda
2004-04-30 0:49 ` Andrew Cagney
0 siblings, 1 reply; 6+ messages in thread
From: Jason Molenda @ 2004-04-30 0:27 UTC (permalink / raw)
To: Andrew Cagney; +Cc: Joel Brobecker, gdb-patches
On Apr 29, 2004, at 8:09 AM, Andrew Cagney wrote:
> However, I think the entire function's contents are bogus. It should
> look like:
>
> if (symtab_find_function_range_by_name (main_name (), &low_pc,
> &high_pc))
> return pc in [low_pc, high_pc);
> else
> return 0;
>
> so that the logic is pushed back into the symbol table (an obvious
> thing for symtab_find_function_range_by_name to do is implement a
> look-aside cache).
Just so I'm clear -- this is a function that doesn't exist right now,
right?
We have at least one similar address range cache in the Apple gdb to
keep track of some oft-referenced ObjC dispatch functions (which could
be subsumed by a symtab_find_function_range_by_name() type function).
> (they need to be killed off anyway as PIE breaks the assumption that
> the values are constant across function invocations).
I don't really know what PIE means - I thought it meant that the
executable was built PIC and would be loaded at an arbitrary address on
each run. How could a function shift locations while the inferior is
executing?
>> [*] We have something called "ZeroLink" where the main executable --
>> the symfile_objfile -- is a tiny stub that demand-loads each object
>> file (formatted like a shared library) as functions/global variables
>> in those .o's are referenced. So in our case, the symfile_objfile
>> doesn't contain main at all; hence me looking into this function and
>> scratching my head about why it's re-searching for this function
>> every time...
>
> you might want to look at PIE.
It's a pretty different thing, if I'm not mistaken. PIE is about
loading your executable at an arbitrary address, isn't it? ZeroLink is
about avoiding the static link editor stage in development. You build
your .o's (and they're built as little shared libraries), and you run
the ZL stub program in place of your main application. The ZL stub
program loads at the usual 0x0 address, like a normal program. It
builds up a list of available functions in all the .o's and pulls them
in on-demand. It's entirely a development-time speed deal. I thought
PIE was more about security, putting the executable in different places
so hax0rs can't hardcode where interesting functions et al are located.
Maybe I misunderstood what PIE encompasses?
Thanks,
Jason
* Re: Question about blockframe.c:inside_main_func()
2004-04-30 0:27 ` Jason Molenda
@ 2004-04-30 0:49 ` Andrew Cagney
0 siblings, 0 replies; 6+ messages in thread
From: Andrew Cagney @ 2004-04-30 0:49 UTC (permalink / raw)
To: Jason Molenda; +Cc: Joel Brobecker, gdb-patches
>
> On Apr 29, 2004, at 8:09 AM, Andrew Cagney wrote:
>
>> However, I think the entire function's contents are bogus. It should look like:
>>
>> if (symtab_find_function_range_by_name (main_name (), &low_pc, &high_pc))
>> return pc in [low_pc, high_pc);
>> else
>> return 0;
>>
>> so that the logic is pushed back into the symbol table (an obvious thing for symtab_find_function_range_by_name to do is implement a look-aside cache).
>
>
> Just so I'm clear -- this is a function that doesn't exist right now, right?
Yes.
> We have at least one similar address range cache in the Apple gdb to keep track of some oft-referenced ObjC dispatch functions (which could be subsumed by a symtab_find_function_range_by_name() type function).
I think there are also signal trampoline lookups that do something similar.
>> (they need to be killed off anyway as PIE breaks the assumption that the values are constant across function invocations).
>
>
> I don't really know what PIE means - I thought it meant that the executable was built PIC and would be loaded at an arbitrary address on each run. How could a function shift locations while the inferior is executing?
Typo; it should say ``program invocations'', as you describe.
>>> [*] We have something called "ZeroLink" where the main executable -- the symfile_objfile -- is a tiny stub that demand-loads each object file (formatted like a shared library) as functions/global variables in those .o's are referenced. So in our case, the symfile_objfile doesn't contain main at all; hence me looking into this function and scratching my head about why it's re-searching for this function every time...
>>
>>
>> you might want to look at PIE.
>
>
> It's a pretty different thing, if I'm not mistaken. PIE is about loading your executable at an arbitrary address, isn't it? ZeroLink is about avoiding the static link editor stage in development. You build your .o's (and they're built as little shared libraries), and you run the ZL stub program in place of your main application. The ZL stub program loads at the usual 0x0 address, like a normal program. It builds up a list of available functions in all the .o's and pulls them in on-demand. It's entirely a development-time speed deal. I thought PIE was more about security, putting the executable in different places so hax0rs can't hardcode where interesting functions et al are located.
>
> Maybe I misunderstood what PIE encompasses?
In both cases:
- the relevant symbol cache has to be flushed each time the program is
re-started
- the lookup is based on a name
so while some aspects are different, others are not.
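
A small sketch of the first point, assuming a cache that is explicitly
flushed whenever the program is (re)started. All names here (load_bias,
MAIN_OFFSET, MAIN_SIZE, main_cache, start_program, pc_in_main) and the
numbers are hypothetical; the load bias simply models an executable
that lands at a different address on each invocation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t core_addr;   /* stand-in for gdb's CORE_ADDR */

#define MAIN_OFFSET 0x400     /* made-up offset of main in the file */
#define MAIN_SIZE   0x40      /* made-up size of main */

/* Where the current run placed the executable (hypothetical).  */
static core_addr load_bias;

/* Cached runtime bounds of main; only meaningful for one run.  */
static struct
{
  int valid;
  core_addr low, high;
} main_cache;

static void
flush_main_cache (void)
{
  main_cache.valid = 0;
}

/* Called on every (re)start: record the new load address and drop
   whatever was cached for the previous invocation.  */
static void
start_program (core_addr bias)
{
  load_bias = bias;
  flush_main_cache ();
}

static int
pc_in_main (core_addr pc)
{
  if (!main_cache.valid)
    {
      /* Recompute the bounds for this invocation only.  */
      main_cache.low = load_bias + MAIN_OFFSET;
      main_cache.high = main_cache.low + MAIN_SIZE;
      main_cache.valid = 1;
    }
  return main_cache.low <= pc && pc < main_cache.high;
}
```

Without the flush in start_program (), the second run would keep
answering with the first run's addresses.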
Andrew