Mirror of the gdb mailing list
* Unambiguously specifying source locations
@ 2003-10-10 15:30 ` Daniel Jacobowitz
  2003-10-10 15:44   ` David Ayers
  2003-10-11  2:21   ` Felix Lee
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2003-10-10 15:30 UTC (permalink / raw)
  To: gdb

This isn't a proposal - I haven't clearly thought out the details - just
some ramblings on a problem.

I would like to have a way to clearly identify a location, not by address. 
Primarily this would be for breakpoints; when we re-read an objfile's
symbols, we need to re-place breakpoints somehow, and the more of the user's
intent we can preserve the better.  Obviously this is not a perfectly
solvable problem, but we can do pretty well.

It's an important problem, too.  Not only is recompile/reload a pretty
common thing to do, but IDEs which save breakpoints across sessions would
use this also.  Et cetera.

I think it's only source locations that I need to identify in this way, in
the short term, but we should use an extensible syntax.  For a
way-beyond-current-state example, consider an inlined function with multiple
inlined instances.  One of the simpler things we could do is:

  record the number of instances in this objfile before reloading
  check the number of instances in this objfile after reloading
  if same, then we can preserve things like the breakpoint enabled state
    to the new set
  if not, punt - enable all?  warn?

The goal there being to preserve the status of particular breakpoints across
reload, as best as we can, when unrelated changes are made to the source.
Someone fixes emit_foo () and recompiles and we try not to disturb their
eight disabled and one enabled breakpoint on inlined copies of obstack_alloc
where they were tracking the allocation of the bad object.
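A minimal sketch of the counting heuristic above, written in Python for
brevity (GDB itself is C).  The function name and the list-of-flags
representation are hypothetical, and the one-to-one, in-order pairing of
old and new instances is an assumption the mail does not spell out:

```python
from typing import List, Tuple


def carry_over_enabled(old_enabled: List[bool],
                       new_count: int) -> Tuple[List[bool], bool]:
    """Return (enabled flags for the new instances, whether to warn the user)."""
    if len(old_enabled) == new_count:
        # Same number of inlined instances before and after the reload:
        # assume they correspond one-to-one and keep each enabled state.
        return list(old_enabled), False
    # The count changed, so the old states cannot be mapped reliably.
    # Punt: enable everything and ask the caller to warn.
    return [True] * new_count, True
```

With eight disabled and one enabled instance, an unrelated recompile that
still yields nine instances keeps all nine states; any other count falls
back to "enable all, warn".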

We could be much more thorough.  Try this on for size:
  Record the source location of the emitted function which contains each
    inlined instance, as unambiguously as we can.
  Record the inlining path.

The result would look something like (I don't like this syntax, just
illustrating):
  [libfoo.so.2][foo-1.cc:foo_func:75][bar-1.h:inline_one:33]\
    [bar-1.h:inline_two:36]
Then after reload if the path still leads to an inlined copy of inline_two
we can re-establish the breakpoint.  Or there are various fuzzy matching
things we could do.  Et cetera.
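To make the illustrative syntax concrete, here is a rough Python sketch of
a parser for it.  Everything here is an assumption: the type and function
names are made up, names are assumed to contain no ':' or brackets (the
quoting rules are still unspecified), and comparing chains while ignoring
line numbers is just one possible fuzzy-matching rule:

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    filename: str
    function: str
    line: int


class LocationPath(NamedTuple):
    objfile: str
    frames: List[Frame]


_ELEMENT = re.compile(r"\[([^\[\]]*)\]")


def parse_location_path(text: str) -> LocationPath:
    # Join a backslash-continued line, as in the wrapped example above.
    text = re.sub(r"\\\s*\n\s*", "", text.strip())
    elements = _ELEMENT.findall(text)
    # Reject input that is not purely a sequence of [..] elements.
    if not elements or "".join(f"[{e}]" for e in elements) != text:
        raise ValueError(f"not a bracketed location path: {text!r}")
    frames = []
    for element in elements[1:]:
        # Each element after the objfile is file:function:line.
        filename, function, line = element.split(":")
        frames.append(Frame(filename, function, int(line)))
    return LocationPath(elements[0], frames)


def same_inlining_chain(a: LocationPath, b: LocationPath) -> bool:
    """Compare two paths ignoring line numbers, which shift when source is edited."""
    def chain(path: LocationPath):
        return [(f.filename, f.function) for f in path.frames]
    return a.objfile == b.objfile and chain(a) == chain(b)
```

After a reload, a breakpoint could be re-established on any new instance
whose parsed path satisfies same_inlining_chain against the saved one.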

Generating such paths would be useful for output anyway; and if we can do
them in such a way as to accept them for input too, that would be useful to
our users.  We could also accept as input ambiguous paths and do basically
wildcarding:
  break [libfoo.so.2][bar-1.h]inline_two

Wouldn't that be nice?
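One way the wildcarding could work, sketched in Python under loud
assumptions: each bracketed qualifier must equal either the candidate's
object file or its source file, and the trailing word is the function
name.  The Candidate type and both function names are hypothetical:

```python
import re
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    objfile: str
    filename: str
    function: str


def parse_ambiguous(spec: str) -> Tuple[List[str], str]:
    # Zero or more [qualifier] elements, then a bare function name.
    m = re.fullmatch(r"((?:\[[^\[\]]*\])*)\s*([^\[\]\s]+)", spec.strip())
    if m is None:
        raise ValueError(f"cannot parse location: {spec!r}")
    qualifiers = re.findall(r"\[([^\[\]]*)\]", m.group(1))
    return qualifiers, m.group(2)


def matches(spec: str, cand: Candidate) -> bool:
    qualifiers, function = parse_ambiguous(spec)
    # A missing qualifier acts as a wildcard; a present one must match.
    return function == cand.function and all(
        q in (cand.objfile, cand.filename) for q in qualifiers)
```

Under this rule a bare "inline_two" matches every copy, while adding
[libfoo.so.2] narrows the set to copies in that library.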

The use of brackets is not entirely coincidental.  decode_line_1 currently
does not accept anything that starts with a '[' as far as I can see; ObjC
selectors always have +[ or -[.  Using brackets simplifies quoting and parsing
quite a bit.  And it could be extended as necessary without too much
trouble.

There are other problems: for instance, we might want to use linker names
for non-inlined functions where possible, for GDB-generated location
descriptions; that would handle keeping track of which constructor we were
stopped on.  Otherwise we'd need some other method for that.

Another nice thing to have might be an element in that list describing
C++ template instantiation.  Not sure if that's necessary or should just be
added to the function name as needed.

Include paths may also be needed.  This happens in C, with static functions
in headers.  Here's an example that bites me all the time:
  [libbfd.so][elf32-i386.c][elflink.h:2300]
  [libbfd.so][elf32-sparc.c][elflink.h:2300]
or maybe it should be:
  [libbfd.so][elf32-i386.c:elflink.h:2300]
  [libbfd.so][elf32-sparc.c:elflink.h:2300]
The latter, I think.

Right now the equivalent to this is handled in build_canonical_line_spec.
It's a little simple-minded; there are plenty of cases it doesn't handle.
It would be nice to do better.

Any specification of this should be explicit about quoting rules, comma,
dammit.

What other thoughts do you all have?  Am I on the right track?  Should we
draw up a formal specification for these?

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer



* Re: Unambiguously specifying source locations
  2003-10-10 15:30 ` Unambiguously specifying source locations Daniel Jacobowitz
@ 2003-10-10 15:44   ` David Ayers
  2003-10-10 15:46     ` Daniel Jacobowitz
  2003-10-11  2:21   ` Felix Lee
  1 sibling, 1 reply; 7+ messages in thread
From: David Ayers @ 2003-10-10 15:44 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb

Daniel Jacobowitz wrote:

>The use of braces is not entirely coincidental.  decode_line_1 currently
>does not accept anything that starts with a '[' as far as I can see; ObjC
>selectors always have +[ or -[.  
>
Actually this is not the case.  If the +/- is omitted both are offered 
as breakpoints if they exist:

(gdb) b [NSObject autorelease]
[0] cancel
[1] all
[2] -[NSObject autorelease] at NSObject.m:1553
[3] +[NSObject autorelease] at NSObject.m:1575
 >

But maybe you could rely on the fact that there must be a white space 
between the class and the message.

Cheers,
David Ayers





* Re: Unambiguously specifying source locations
  2003-10-10 15:44   ` David Ayers
@ 2003-10-10 15:46     ` Daniel Jacobowitz
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2003-10-10 15:46 UTC (permalink / raw)
  To: gdb

On Fri, Oct 10, 2003 at 05:39:34PM +0200, David Ayers wrote:
> Daniel Jacobowitz wrote:
> 
> >The use of braces is not entirely coincidental.  decode_line_1 currently
> >does not accept anything that starts with a '[' as far as I can see; ObjC
> >selectors always have +[ or -[.  
> >
> Actually this is not the case.  If the +/- is omitted both are offered 
> as breakpoints if they exist:
> 
> (gdb) b [NSObject autorelease]
> [0] cancel
> [1] all
> [2] -[NSObject autorelease] at NSObject.m:1553
> [3] +[NSObject autorelease] at NSObject.m:1575
> >

Hmm, thank you.  I don't quite see how that happens so I'll have to go
peer at linespec.c again.

> But maybe you could rely on the fact that there must be a white space 
> between the class and the message.

I'd rather not - there can be whitespace in file names, shared object
names, et cetera.  Perhaps something different is in order.
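The objection can be shown with a small Python sketch.  The function name
and the regular expression are hypothetical; this is not how linespec.c
decides anything, only a model of the "whitespace inside the brackets"
heuristic suggested above:

```python
import re


def looks_like_objc_selector(spec: str) -> bool:
    """Heuristic: '[Class message]' has whitespace between class and message."""
    return re.fullmatch(r"[+-]?\[\S+\s+\S.*\]", spec) is not None


# The cases from this thread are classified as hoped...
print(looks_like_objc_selector("[NSObject autorelease]"))   # True
print(looks_like_objc_selector("[libfoo.so.2]"))            # False
# ...but a shared object name containing a space is misread as a selector.
print(looks_like_objc_selector("[My Library.dylib]"))       # True
```

The last case is the false positive that makes whitespace alone an
unreliable discriminator.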

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer



* Re: Unambiguously specifying source locations
  2003-10-10 15:30 ` Unambiguously specifying source locations Daniel Jacobowitz
  2003-10-10 15:44   ` David Ayers
@ 2003-10-11  2:21   ` Felix Lee
  1 sibling, 0 replies; 7+ messages in thread
From: Felix Lee @ 2003-10-11  2:21 UTC (permalink / raw)
  To: gdb

>  [libfoo.so.2][foo-1.cc:foo_func:75][bar-1.h:inline_one:33]\
>    [bar-1.h:inline_two:36]

One thing I'm not clear about, you're mixing source locators with
object locators.  Is it really a path?  Maybe it should be
an arbitrary qualifier list instead?  Like
    foofunc[obj libfoo.so]
    foofunc[src foo.h in foo1.cc]
The first would be all foofunc in libfoo.so.  The second would be
all foofunc the compiler generates from foo1.cc/foo.h.  And
perhaps let the user combine qualifiers with boolean operators.
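A rough Python sketch of parsing such a qualifier list.  The function name
is made up, the first word inside each bracket is assumed to be the
qualifier kind, and the rest is kept as an uninterpreted argument string
(boolean operators are not attempted):

```python
import re
from typing import List, Tuple


def parse_qualified(spec: str) -> Tuple[str, List[Tuple[str, str]]]:
    # A bare name followed by zero or more [kind argument...] qualifiers.
    m = re.fullmatch(r"\s*([^\[\]\s]+)((?:\s*\[[^\[\]]*\])*)\s*", spec)
    if m is None:
        raise ValueError(f"cannot parse location: {spec!r}")
    qualifiers = []
    for body in re.findall(r"\[([^\[\]]*)\]", m.group(2)):
        # Split "obj libfoo.so" into kind "obj" and argument "libfoo.so".
        kind, _, argument = body.strip().partition(" ")
        qualifiers.append((kind, argument.strip()))
    return m.group(1), qualifiers
```

Because the qualifiers form a list rather than a path, their order carries
no meaning and new kinds can be added without changing the grammar.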
--



* Re: Unambiguously specifying source locations
       [not found] <1066172792.26963.ezmlm@sources.redhat.com>
@ 2003-10-14 23:47 ` Jim Ingham
  0 siblings, 0 replies; 7+ messages in thread
From: Jim Ingham @ 2003-10-14 23:47 UTC (permalink / raw)
  To: gdb

So one corollary that makes this easier is that instead of holding onto 
the requested shared library (and requested file for static functions) 
in the address string, we should hold them in the breakpoint structure 
itself.  So then we don't have to worry about what the canonical form 
is.  That too becomes simple, it is:

b->requested_filename = "myFile.c"
b->requested_shlib = "myShlib.dylib"

And if there are any others, we can add them.  So we only have to break 
out the specifications when the user instructs us to set the 
breakpoint.  After that all the bits are clear.

I started playing around with this for dylibs for purposes of my own...
I called them requested_* because I wanted to make it clear that these
were not where we found the thing, but where the user asked for the
thing.  It is also interesting where the breakpoint eventually landed,
but that bit should be in the "implementation" side of the breakpoint.
This clearly belongs in the user side...
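The split between the user side and the implementation side might look
like this, sketched in Python rather than GDB's C.  Only
requested_filename and requested_shlib come from the text above; the class
layout, the function and resolved_address fields, and
forget_resolution_on_reload are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Breakpoint:
    # User side: what was asked for.  Never rewritten by GDB.
    function: str
    requested_filename: Optional[str] = None
    requested_shlib: Optional[str] = None
    # Implementation side: where the breakpoint actually landed.
    resolved_address: Optional[int] = None


def forget_resolution_on_reload(bp: Breakpoint) -> None:
    """Drop where the breakpoint landed; the request stays untouched."""
    bp.resolved_address = None
```

After a reload the breakpoint is resolved again from the requested_*
fields, so no canonical string has to be decoded a second time.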

Jim


On Oct 14, 2003, at 4:06 PM, gdb-digest-help@sources.redhat.com wrote:

>
>
> On Mon, Oct 13, 2003 at 10:25:20AM -0700, Jim Ingham wrote:
>> I think the intent here is great!
>>
>> I have a heartfelt plea, however, from one who while not as
>> battle-scarred as some others, have waded my way through plenty of
>> decode_line_1 bugs...
>>
>> Is there any way we can not have to keep overloading the expression
>> parser with more specifications?  It seems to me this is just a way to
>> obfuscate the user's intent so that we can get it wrong trying to
>> decode it later.  Daniel's proposed syntax - no offense intended - is
>> sufficiently awful that it should give us pause.  Would:
>>
>> break -shlib foo.dylib -file foo.c MyStaticFunction
>>
>> be all that awful?  This is unambiguous, represents the user's intent
>> exactly, is not too hard to type, and trivial to parse.  Then
>> internally, the breakpoint could actually hold all these separate bits
>> separately, rather than munging them into a canonical form which we can
>> trip over later on...
>>
>> We will probably have to support the specifications that we do now for
>> ever - though adding switches for them would allow unambiguous entry
>> and would probably be taken up by a good number of users cause it is
>> almost impossible to get wrong...
>
> Feel free to propose a better canonical form :)  You basically just
> did, above.  We need a canonical representation, for instance:
>   - We use it internally to re-place breakpoints after rereading an
>     objfile.
>   - We would like to be able to display it so that breakpoints can be
>     saved and reloaded.
>
> I guess the question is whether these are useful for anything other
> than specifying breakpoint (or tracepoint) locations.  If not then
> flags to break might be canonical enough.
>
> -- 
> Daniel Jacobowitz
> MontaVista Software                         Debian GNU/Linux Developer
>
>
--
Jim Ingham                                   jingham@apple.com
Developer Tools
Apple Computer



* Re: Unambiguously specifying source locations
  2003-10-13 17:24 ` Jim Ingham
@ 2003-10-13 17:32   ` Daniel Jacobowitz
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2003-10-13 17:32 UTC (permalink / raw)
  To: gdb

On Mon, Oct 13, 2003 at 10:25:20AM -0700, Jim Ingham wrote:
> I think the intent here is great!
> 
> I have a heartfelt plea, however, from one who while not as 
> battle-scarred as some others, have waded my way through plenty of 
> decode_line_1 bugs...
> 
> Is there any way we can not have to keep overloading the expression 
> parser with more specifications?  It seems to me this is just a way to 
> obfuscate the user's intent so that we can get it wrong trying to 
> decode it later.  Daniel's proposed syntax - no offense intended - is 
> sufficiently awful that it should give us pause.  Would:
> 
> break -shlib foo.dylib -file foo.c MyStaticFunction
> 
> be all that awful?  This is unambiguous, represents the user's intent 
> exactly, is not too hard to type, and trivial to parse.  Then 
> internally, the breakpoint could actually hold all these separate bits 
> separately, rather than munging them into a canonical form which we can 
> trip over later on...
> 
> We will probably have to support the specifications that we do now for 
> ever - though adding switches for them would allow unambiguous entry 
> and would probably be taken up by a good number of users cause it is 
> almost impossible to get wrong...

Feel free to propose a better canonical form :)  You basically just
did, above.  We need a canonical representation, for instance:
  - We use it internally to re-place breakpoints after rereading an
    objfile.
  - We would like to be able to display it so that breakpoints can be
    saved and reloaded.

I guess the question is whether these are useful for anything other
than specifying breakpoint (or tracepoint) locations.  If not then
flags to break might be canonical enough.
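If flags to break were the canonical form, displaying a breakpoint for
save and reload could be as simple as the Python sketch below.  The -shlib
and -file flags are the ones proposed in this thread, not options GDB is
known to accept; the function name is made up, and POSIX shell quoting via
shlex stands in for quoting rules nobody has specified yet:

```python
import shlex
from typing import Optional


def canonical_break_command(function: str,
                            shlib: Optional[str] = None,
                            filename: Optional[str] = None) -> str:
    """Render the separately stored fields as one re-readable command line."""
    words = ["break"]
    if shlib is not None:
        words += ["-shlib", shlib]
    if filename is not None:
        words += ["-file", filename]
    words.append(function)
    # Quote each word so that names containing spaces survive a round trip.
    return " ".join(shlex.quote(word) for word in words)


print(canonical_break_command("MyStaticFunction", "foo.dylib", "foo.c"))
# prints: break -shlib foo.dylib -file foo.c MyStaticFunction
```

A name such as "my lib.dylib" comes out quoted, which is exactly the kind
of rule a real specification would have to pin down.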

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer



* Re: Unambiguously specifying source locations
       [not found] <1065875539.13549.ezmlm@sources.redhat.com>
@ 2003-10-13 17:24 ` Jim Ingham
  2003-10-13 17:32   ` Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Jim Ingham @ 2003-10-13 17:24 UTC (permalink / raw)
  To: gdb

I think the intent here is great!

I have a heartfelt plea, however, from one who, while not as 
battle-scarred as some others, has waded his way through plenty of 
decode_line_1 bugs...

Is there any way we can not have to keep overloading the expression 
parser with more specifications?  It seems to me this is just a way to 
obfuscate the user's intent so that we can get it wrong trying to 
decode it later.  Daniel's proposed syntax - no offense intended - is 
sufficiently awful that it should give us pause.  Would:

break -shlib foo.dylib -file foo.c MyStaticFunction

be all that awful?  This is unambiguous, represents the user's intent 
exactly, is not too hard to type, and trivial to parse.  Then 
internally, the breakpoint could actually hold all these separate bits 
separately, rather than munging them into a canonical form which we can 
trip over later on...
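To illustrate the "trivial to parse" claim, here is a Python sketch that
splits the proposed command into separate fields.  The -shlib and -file
flags are only the proposal above; the function name, the returned
dictionary, and the use of shell-style word splitting are assumptions:

```python
import shlex
from typing import Dict, Optional


def parse_break_args(command: str) -> Dict[str, Optional[str]]:
    words = shlex.split(command)
    if not words or words[0] != "break":
        raise ValueError(f"not a break command: {command!r}")
    fields: Dict[str, Optional[str]] = {
        "shlib": None, "file": None, "function": None}
    i = 1
    while i < len(words):
        word = words[i]
        if word in ("-shlib", "-file"):
            # Each switch consumes exactly one following word as its value.
            if i + 1 >= len(words):
                raise ValueError(f"{word} needs a value")
            fields[word[1:]] = words[i + 1]
            i += 2
        elif fields["function"] is None:
            fields["function"] = word
            i += 1
        else:
            raise ValueError(f"unexpected extra word: {word!r}")
    if fields["function"] is None:
        raise ValueError("no function given")
    return fields
```

Each field ends up stored on its own, so nothing has to be re-decoded from
a combined string when the breakpoint is re-set.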

We will probably have to support the specifications that we do now for 
ever - though adding switches for them would allow unambiguous entry 
and would probably be taken up by a good number of users cause it is 
almost impossible to get wrong...

Jim

--
Jim Ingham                                   jingham@apple.com
Developer Tools
Apple Computer



Thread overview: 7+ messages
     [not found] <drow@mvista.com>
2003-10-10 15:30 ` Unambiguously specifying source locations Daniel Jacobowitz
2003-10-10 15:44   ` David Ayers
2003-10-10 15:46     ` Daniel Jacobowitz
2003-10-11  2:21   ` Felix Lee
     [not found] <1065875539.13549.ezmlm@sources.redhat.com>
2003-10-13 17:24 ` Jim Ingham
2003-10-13 17:32   ` Daniel Jacobowitz
     [not found] <1066172792.26963.ezmlm@sources.redhat.com>
2003-10-14 23:47 ` Jim Ingham
