Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* Frame sniffers in Python/Guile/*
@ 2015-03-02 13:28 Andy Wingo
  2015-03-08 20:04 ` Doug Evans
  2015-03-09 15:39 ` Pedro Alves
  0 siblings, 2 replies; 7+ messages in thread
From: Andy Wingo @ 2015-03-02 13:28 UTC (permalink / raw)
  To: gdb-patches, Alexander Smundak; +Cc: Pedro Alves, Doug Evans

Hi,

I have a local patch that allows me to unwind frames in Guile, loosely
modelled off of Alexander's patch that does the same in Python.  I've
been working on getting an unwinder for V8 that uses this interface, but
it is challenging for a number of reasons that seem more to do with GDB
than with the extension interface.

First I have some comments about the Python patch; apologies for not
threading, as I wasn't on the list before.  I think the interface is a
bit strange.  To sum up, the interface looks like this:

    # private helper
    def execute_sniffers(sniffer_info):
        for sniffer in enabled_sniffers:
            unwind_info = sniffer(sniffer_info)
            if unwind_info:
                return unwind_info

"sniffer_info" is a new data type, a sniffer info object, which exposes
only a read_register() API.  The return is a (REGISTERS, FRAME_ID)
tuple, or None if the sniffer didn't handle the frame.  REGISTERS and
FRAME_ID are both built out of tuples and lists.

To me it seems that this interface does not lend itself to good error
reporting.  An API that instead built a specialized unwind_info object
and called set_frame_id(), set_register() etc on it would simplify many
things.

The return value type is also not very extensible.  If it were an
object, it would be more future-proof.  For example, what about the
stop_reason ?

Exposing register numbers in the API does not makes sense to me, as most
users won't know those register numbers.  It shouldn't be necessary to
open up GDB's C source to implement an unwinder.  Instead why not
specify registers as strings, as elsewhere
(e.g. gdb.Frame.read_register)?

The sniffer_info object is unfortunate -- it's a frame, but without
frame methods.  You can't get its architecture from python, for
example, or get the next frame.  More about that later.

In your patch, the sniffer_info object could outlive the call to the
sniffer, and so AFAICT it needs to be nulled out afterwards, as its
internal frame_info pointer will become invalid.

In the read_register() function, I believe you can use
get_frame_register_value instead of deprecated_frame_register_read.

                          *   *   *

As a general comment, separation of frame decorators from unwinders
causes duplicate work.  For example in V8 I will need to find the
v8::internal::Code* object that corresponds to the frame in order to
unwind the frame, but I will also need it for getting the line number.
I can't cache it in a sensible way because GC could free or move the
code.  (I suppose I could attach to GC-related breakpoints and
invalidate a GDB-side cache, but that makes the GDB extension more
brittle.)

I don't know if there is a good solution here, though.

                          *   *   *

I have a similar interface in Scheme.  The Scheme unwinder constructs an
"ephemeral frame object", which wraps this data:

    /* Data associated with a ephemeral frame.  */
    struct uwscm_ephemeral_frame
    {
      /* The frame being unwound, used for the read-register interface.  */
      struct frame_info *this_frame;

      /* The architecture of the frame, here for convenience.  */
      struct gdbarch *gdbarch;

      /* The frame_id for the ephemeral frame; initially unset.  */
      struct frame_id frame_id;

      /* Nonzero if the frame_id has been set.  */
      int has_frame_id;

      /* A list of (REGNUM . VALUE) pairs, indicating register values for the
         ephemeral frame.  */
      SCM registers;
    };

(Why a new object type?  Because all <gdb:frame> objects have a
frame_id, and this one does not, and it turns out it's a deep invariant
that frames have identifiers.)

The list of unwinders is run in order over the ephemeral frame, and the
first one that calls (set-ephemeral-frame-id! frame sp [ip [special]])
on the frame takes responsibility of unwinding the frame.  You can read
registers with ephemeral-frame-read-register, and set their unwound
values with ephemeral-frame-write-register!.  Simple stuff, and I'll
post later once I get the V8 unwinder working.

Unfortunately, the unwind callback is really squirrely -- you can't do
much there.

  * You can't call the Guile lookup-symbol function within an unwind
    handler, because the Guile wrapper wants to default the "block"
    argument from the selected frame, and there is no selected frame.

  * You can't value-call, which is not unexpected in general, but the
    reason is unexpected: because call_function_by_hand calls
    get_current_arch and that doesn't work

  * You can't call get_current_arch, directly or indirectly, because it
    causes unbounded recursion:

      #3  0x00000000006a65f2 in frame_unwind_try_unwinder (this_frame=this_frame@entry=0x5f6af10, this_cache=this_cache@entry=0x5f6af28, unwinder=0x409fe30) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame-unwind.c:126
      #4  0x00000000006a696f in frame_unwind_find_by_frame (this_frame=this_frame@entry=0x5f6af10, this_cache=this_cache@entry=0x5f6af28) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame-unwind.c:157
      #5  0x00000000006a32cb in get_prev_frame_if_no_cycle (fi=0x5f6af10) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:454
      #6  0x00000000006a32cb in get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1780
      #7  0x00000000006a53a9 in get_prev_frame_always (this_frame=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1954
      #8  0x00000000006a53a9 in get_prev_frame_always (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1971
      #9  0x00000000006a5ab1 in get_prev_frame (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:2212
      #10 0x00000000006a5d8c in unwind_to_current_frame (ui_out=<optimized out>, args=args@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1447
      #11 0x00000000005cf63c in catch_exceptions_with_msg (func_uiout=<optimized out>, func=func@entry=0x6a5d80 <unwind_to_current_frame>, func_args=func_args@entry=0x5f6ae40, gdberrmsg=gdberrmsg@entry=0x0, mask=mask@entry=RETURN_MASK_ERROR)
          at /home/wingo/src/binutils-gdb/+2.2/../gdb/exceptions.c:189
      #12 0x00000000005cf75a in catch_exceptions (uiout=<optimized out>, func=func@entry=0x6a5d80 <unwind_to_current_frame>, func_args=func_args@entry=0x5f6ae40, mask=mask@entry=RETURN_MASK_ERROR) at /home/wingo/src/binutils-gdb/+2.2/../gdb/exceptions.c:169
      #13 0x00000000006a33e0 in get_current_frame () at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1486
      #14 0x00000000006a3fe7 in get_selected_frame (message=message@entry=0x0) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1541
      #15 0x00000000005e7a27 in get_current_arch () at /home/wingo/src/binutils-gdb/+2.2/../gdb/arch-utils.c:784

    Perhaps this is only the case for the most inner frame?  Anyway this
    is the reason that many other things fail.

  * You can't read user regs from an ephemeral frame for some reason:

      /home/wingo/src/binutils-gdb/+2.2/../gdb/regcache.c:779: internal-error: regcache_cooked_read_value: Assertion `regnum < regcache->descr->nr_cooked_registers' failed.
      #9  0x00000000005732b5 in regcache_cooked_read_value (regcache=0xd25cc0, regnum=221) at /home/wingo/src/binutils-gdb/+2.2/../gdb/regcache.c:779
      #10 0x0000000000684c28 in sentinel_frame_prev_register (this_frame=0x6c7c350, this_prologue_cache=<optimized out>, regnum=<optimized out>) at /home/wingo/src/binutils-gdb/+2.2/../gdb/sentinel-frame.c:52
      #11 0x00000000006a4408 in frame_unwind_register_value (frame=0x6c7c350, regnum=221) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1111
      #12 0x00000000006a468f in frame_register_unwind (frame=<optimized out>, regnum=<optimized out>, optimizedp=0x7fffffffcde8, unavailablep=0x7fffffffcdec, lvalp=0x7fffffffcdf0, addrp=0x7fffffffcdf8, realnump=0x7fffffffcdf4, bufferp=0x7fffffffce40 "l\305\001\367\377\177") at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1016
      #13 0x00000000006a4892 in frame_register (frame=<optimized out>, regnum=<optimized out>, optimizedp=<optimized out>, unavailablep=<optimized out>, lvalp=<optimized out>, addrp=<optimized out>, realnump=<optimized out>, bufferp=<optimized out>)
          at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1057

And so on.  From what I can tell, all of this is because there is no
selected frame.  I recognize that this situation reflects reality in
some way -- we're still building the selected frame -- but is there any
way that we could have GDB be in a more "normal" state while the unwind
callback is running?

Andy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-02 13:28 Frame sniffers in Python/Guile/* Andy Wingo
@ 2015-03-08 20:04 ` Doug Evans
  2015-03-09  8:27   ` Andy Wingo
  2015-03-09 15:39 ` Pedro Alves
  1 sibling, 1 reply; 7+ messages in thread
From: Doug Evans @ 2015-03-08 20:04 UTC (permalink / raw)
  To: Andy Wingo; +Cc: gdb-patches, Alexander Smundak, Pedro Alves

Andy Wingo <wingo@igalia.com> writes:
> [...]
>
> And so on.  From what I can tell, all of this is because there is no
> selected frame.  I recognize that this situation reflects reality in
> some way -- we're still building the selected frame -- but is there any
> way that we could have GDB be in a more "normal" state while the unwind
> callback is running?

I think one gets into trouble if one tries to
apply the word "normal" to gdb. :-)

Things like selected_frame and get_current_arch
are annoying, global state generally is.
Anything you can do untangle the need for global
state will be welcome.

But as you've observed, this is a bit of a special case.
Sometimes the architecture depends on the frame.
And until we know the architecture we don't know
what the registers are. And so on.

I don't have any simple suggestions. Maybe others will.
A lot of times improving/fixing gdb requires first
improving/fixing several other parts first.
Welcome to gdb. :-)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-08 20:04 ` Doug Evans
@ 2015-03-09  8:27   ` Andy Wingo
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Wingo @ 2015-03-09  8:27 UTC (permalink / raw)
  To: Doug Evans; +Cc: gdb-patches, Alexander Smundak, Pedro Alves

On Sun 08 Mar 2015 21:03, Doug Evans <xdje42@gmail.com> writes:

> Andy Wingo <wingo@igalia.com> writes:
>> [...]
>>
>> And so on.  From what I can tell, all of this is because there is no
>> selected frame.  I recognize that this situation reflects reality in
>> some way -- we're still building the selected frame -- but is there any
>> way that we could have GDB be in a more "normal" state while the unwind
>> callback is running?
>
> I think one gets into trouble if one tries to
> apply the word "normal" to gdb. :-)

:-)

> I don't have any simple suggestions. Maybe others will.
> A lot of times improving/fixing gdb requires first
> improving/fixing several other parts first.
> Welcome to gdb. :-)

Hehe OK :)

In this case I came up with what can only be considered a hack, but one
which does solve the issue for me.  The root of the issue is described
in another mail, which I quote:

  1. The Guile unwinder's sniffer is called on the innermost frame.
     That sniffer calls out to Guile.

  2. Many actions, for example looking up a symbol without specifying a
     block. will request the selected frame.

  3. get_selected_frame() sees there is no selected frame, and goes to
     get_current_frame() and will select that.

  4. get_current_frame creates a sentinel frame and unwinds that to
     produce the innermost frame.

  5. After unwinding saved registers from the sentinel, frame.c finishes
     constructing the innermost frame by doing a compute_frame_id() on
     the frame.

  6. compute_frame_id() goes to compute the unwinder for the innermost
     frame, in order to call the this_id() method, which leads us back to
     the beginning.

  http://article.gmane.org/gmane.comp.gdb.patches/105431

The same thing could happen in python on the innermost frame.  My
terrible solution is somewhat hidden in this patch:

  http://article.gmane.org/gmane.comp.gdb.patches/105473

The idea is that frame_unwind_find_by_frame detects recursion and
signals an error, at least for the innermost frame.  (Actually, it
should probably error on recursion for any frame; unwinding is
iterative, not recursive.)

Then get_prev_frame uses the frame_unwind_is_unwinding_innermost_frame()
interface to avoid re-unwinding the sentinel frame, and instead returns
NULL.  Terrible, right?  But it does cause get_current_frame to the
sentinel frame, allowing get_selected_frame() to succeed, which is at
least useful to get the current architecture.  Yuck!

What does the nice solution look like here?  The situation is, no
selected frame, we're unwinding the innermost frame, then via
Guile/Python/etc something indirectly calls get_selected_frame() in
order to get some data from the frame, like the architecture.  Do we
change all of these to use get_selected_frame_if_set, then fall back to
get_current_frame_or_sentinel() ?

Andy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-02 13:28 Frame sniffers in Python/Guile/* Andy Wingo
  2015-03-08 20:04 ` Doug Evans
@ 2015-03-09 15:39 ` Pedro Alves
  2015-03-09 19:19   ` Andy Wingo
  1 sibling, 1 reply; 7+ messages in thread
From: Pedro Alves @ 2015-03-09 15:39 UTC (permalink / raw)
  To: Andy Wingo, gdb-patches, Alexander Smundak; +Cc: Doug Evans

Hi Andy,

On 03/02/2015 01:28 PM, Andy Wingo wrote:

> (Why a new object type?  Because all <gdb:frame> objects have a
> frame_id, and this one does not, and it turns out it's a deep invariant
> that frames have identifiers.)

Yeah.

> The list of unwinders is run in order over the ephemeral frame, and the
> first one that calls (set-ephemeral-frame-id! frame sp [ip [special]])
> on the frame takes responsibility of unwinding the frame.  You can read
> registers with ephemeral-frame-read-register, and set their unwound
> values with ephemeral-frame-write-register!.  Simple stuff, and I'll
> post later once I get the V8 unwinder working.
> 
> Unfortunately, the unwind callback is really squirrely -- you can't do
> much there.

You mean the "unwind->sniffer" callback, right?  If so, that's (so far) by
design.  The sniffer callback is only supposed to identify whether
the unwinder can unwind, not do any unwinding at all.  :-/

See here for example:

  https://www.sourceware.org/ml/gdb-patches/2013-11/msg00602.html

>   * You can't call the Guile lookup-symbol function within an unwind
>     handler, because the Guile wrapper wants to default the "block"
>     argument from the selected frame, and there is no selected frame.

There can certainly be a selected frame already when something tries
to unwind past the most-unwound-already frame.  That will
be the normal case of "up" past the innermost frame, even.  But
imagine the case of:

  #0 frame0  <<<< current
  #1 frame1  <<<< selected
  #2 frame2

And the user has frame 1 selected.  And now the user runs some guile or
python code that does frame-older a couple times, which results in GDB
trying to unwind past frame #2.

So what context does make sense for symbol look ups at that point?
With your current hack, because there's a selected frame (frame #1),
symbols will be looked up starting from there.  But it seems to me
that whatever was the selected frame shouldn't influence the sniffer's
decisions.  Isn't the correct answer that the sniffer should
pass in an explicit block to lookup-symbol?

> 
>   * You can't value-call, which is not unexpected in general, but the
>     reason is unexpected: because call_function_by_hand calls
>     get_current_arch and that doesn't work

Where is that?  I think you meant that the guile wrapper calls
get_current_arch, not call_function_by_hand.

>   * You can't call get_current_arch, directly or indirectly, because it
>     causes unbounded recursion:
> 
>       #3  0x00000000006a65f2 in frame_unwind_try_unwinder (this_frame=this_frame@entry=0x5f6af10, this_cache=this_cache@entry=0x5f6af28, unwinder=0x409fe30) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame-unwind.c:126
>       #4  0x00000000006a696f in frame_unwind_find_by_frame (this_frame=this_frame@entry=0x5f6af10, this_cache=this_cache@entry=0x5f6af28) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame-unwind.c:157
>       #5  0x00000000006a32cb in get_prev_frame_if_no_cycle (fi=0x5f6af10) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:454
>       #6  0x00000000006a32cb in get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1780
>       #7  0x00000000006a53a9 in get_prev_frame_always (this_frame=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1954
>       #8  0x00000000006a53a9 in get_prev_frame_always (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1971
>       #9  0x00000000006a5ab1 in get_prev_frame (this_frame=this_frame@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:2212
>       #10 0x00000000006a5d8c in unwind_to_current_frame (ui_out=<optimized out>, args=args@entry=0x5f6ae40) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1447
>       #11 0x00000000005cf63c in catch_exceptions_with_msg (func_uiout=<optimized out>, func=func@entry=0x6a5d80 <unwind_to_current_frame>, func_args=func_args@entry=0x5f6ae40, gdberrmsg=gdberrmsg@entry=0x0, mask=mask@entry=RETURN_MASK_ERROR)
>           at /home/wingo/src/binutils-gdb/+2.2/../gdb/exceptions.c:189
>       #12 0x00000000005cf75a in catch_exceptions (uiout=<optimized out>, func=func@entry=0x6a5d80 <unwind_to_current_frame>, func_args=func_args@entry=0x5f6ae40, mask=mask@entry=RETURN_MASK_ERROR) at /home/wingo/src/binutils-gdb/+2.2/../gdb/exceptions.c:169
>       #13 0x00000000006a33e0 in get_current_frame () at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1486
>       #14 0x00000000006a3fe7 in get_selected_frame (message=message@entry=0x0) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1541
>       #15 0x00000000005e7a27 in get_current_arch () at /home/wingo/src/binutils-gdb/+2.2/../gdb/arch-utils.c:784
> 
>     Perhaps this is only the case for the most inner frame?  Anyway this
>     is the reason that many other things fail.
> 
>   * You can't read user regs from an ephemeral frame for some reason:
> 
>       /home/wingo/src/binutils-gdb/+2.2/../gdb/regcache.c:779: internal-error: regcache_cooked_read_value: Assertion `regnum < regcache->descr->nr_cooked_registers' failed.
>       #9  0x00000000005732b5 in regcache_cooked_read_value (regcache=0xd25cc0, regnum=221) at /home/wingo/src/binutils-gdb/+2.2/../gdb/regcache.c:779
>       #10 0x0000000000684c28 in sentinel_frame_prev_register (this_frame=0x6c7c350, this_prologue_cache=<optimized out>, regnum=<optimized out>) at /home/wingo/src/binutils-gdb/+2.2/../gdb/sentinel-frame.c:52
>       #11 0x00000000006a4408 in frame_unwind_register_value (frame=0x6c7c350, regnum=221) at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1111
>       #12 0x00000000006a468f in frame_register_unwind (frame=<optimized out>, regnum=<optimized out>, optimizedp=0x7fffffffcde8, unavailablep=0x7fffffffcdec, lvalp=0x7fffffffcdf0, addrp=0x7fffffffcdf8, realnump=0x7fffffffcdf4, bufferp=0x7fffffffce40 "l\305\001\367\377\177") at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1016
>       #13 0x00000000006a4892 in frame_register (frame=<optimized out>, regnum=<optimized out>, optimizedp=<optimized out>, unavailablep=<optimized out>, lvalp=<optimized out>, addrp=<optimized out>, realnump=<optimized out>, bufferp=<optimized out>)
>           at /home/wingo/src/binutils-gdb/+2.2/../gdb/frame.c:1057
> 

Why would the unwinder want to access user / pseudo registers?
Shouldn't it be working only with raw registers?

> And so on.  From what I can tell, all of this is because there is no
> selected frame.  I recognize that this situation reflects reality in
> some way -- we're still building the selected frame -- but is there any
> way that we could have GDB be in a more "normal" state while the unwind
> callback is running?

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-09 15:39 ` Pedro Alves
@ 2015-03-09 19:19   ` Andy Wingo
  2015-03-10 16:36     ` Pedro Alves
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Wingo @ 2015-03-09 19:19 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Alexander Smundak, Doug Evans

Hi Pedro,

Thanks for the detailed comments!  Comments inline and a brief summary
at the end.

On Mon 09 Mar 2015 16:38, Pedro Alves <palves@redhat.com> writes:

> On 03/02/2015 01:28 PM, Andy Wingo wrote:
>
>> Unfortunately, the unwind callback is really squirrely -- you can't do
>> much there.
>
> You mean the "unwind->sniffer" callback, right?  If so, that's (so far) by
> design.  The sniffer callback is only supposed to identify whether
> the unwinder can unwind, not do any unwinding at all.  :-/

Unfortunately you have to do some unwinding in order to compute a frame
ID, and the "unwind->this_id" callback runs in approximately the same
environment as the sniffer.

> See here for example:
>
>   https://www.sourceware.org/ml/gdb-patches/2013-11/msg00602.html

Is it important to use frame_unwind_got_memory et al for results?  (IIUC
links between values and frames was part of the issue in that mail.)  In
my unwinders I am just returning values using e.g.

  (value-dereference
   (value-cast (ephemeral-frame-read-register ef "rbp") void**-type))

or something similar.  If this is important, we will need to provide an
interface that looks much more like the unwind interface instead of a
simple Maybe<UnwindInfo> :(

> But imagine the case of:
>
>   #0 frame0  <<<< current
>   #1 frame1  <<<< selected
>   #2 frame2
>
> So what context does make sense for symbol look ups at that point?

Oh, agreed very much.  I was looking up global symbols, like global
constants, and in that case the block is NULL.  But, I was having issues
that I didn't understand using lookup_global_symbol, and lookup_symbol
worked otherwise outside of the sniffer...

Is the issue in the Guile bindings, that they should assume that it's
sensible to get the current architecture, or the selected or current
frame?

> Isn't the correct answer that the sniffer should pass in an explicit
> block to lookup-symbol?

Yes, or in this case lookup-global-symbol.

>>   * You can't value-call, which is not unexpected in general, but the
>>     reason is unexpected: because call_function_by_hand calls
>>     get_current_arch and that doesn't work
>
> Where is that?  I think you meant that the guile wrapper calls
> get_current_arch, not call_function_by_hand.

Yes, I see that now.  I think I was confused.

> Why would the unwinder want to access user / pseudo registers?
> Shouldn't it be working only with raw registers?

Because the user is ignorant? :)  I was pleased that $pc, $sp, and $fp
"worked" otherwise, so I hoped to avoid encoding architecture-specific
things.  Alas.  (Incidentally, my patch still has the problem that
passing in a pseudo register will barf.)

In summary:

  * Is is possible to make a Maybe<UnwindInfo>-style interface to
    sniffers, or should extension languages expose the callbacks
    instead because they really need to be called within their specific
    dynamic environments?

  * Regarding recursion and selected frames -- what do you think about
    the hack in v2 of this patch, which adds
    frame_unwinder_is_unwinding() ?  It is a hack but it has somewhat
    reasonable semantics.

Andy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-09 19:19   ` Andy Wingo
@ 2015-03-10 16:36     ` Pedro Alves
  2015-03-10 17:36       ` Andy Wingo
  0 siblings, 1 reply; 7+ messages in thread
From: Pedro Alves @ 2015-03-10 16:36 UTC (permalink / raw)
  To: Andy Wingo; +Cc: gdb-patches, Alexander Smundak, Doug Evans

On 03/09/2015 07:18 PM, Andy Wingo wrote:
> Hi Pedro,
> 
> Thanks for the detailed comments!  Comments inline and a brief summary
> at the end.
> 
> On Mon 09 Mar 2015 16:38, Pedro Alves <palves@redhat.com> writes:
> 
>> On 03/02/2015 01:28 PM, Andy Wingo wrote:
>>
>>> Unfortunately, the unwind callback is really squirrely -- you can't do
>>> much there.
>>
>> You mean the "unwind->sniffer" callback, right?  If so, that's (so far) by
>> design.  The sniffer callback is only supposed to identify whether
>> the unwinder can unwind, not do any unwinding at all.  :-/
> 
> Unfortunately you have to do some unwinding in order to compute a frame
> ID, and the "unwind->this_id" callback runs in approximately the same
> environment as the sniffer.

Can you give an example, so we can understand the requirements better?

The main problem would be unwind-past-this_frame: that is, something doing
get_prev_frame(this_frame), get_frame_id (this_frame) or something like that
while inside the sniffer.  Unwinding registers from the next frame
to build a cache to compute the frame id or figure out where registers
were saved is ok.

> 
>> See here for example:
>>
>>   https://www.sourceware.org/ml/gdb-patches/2013-11/msg00602.html
> 
> Is it important to use frame_unwind_got_memory et al for results?

I don't think I understand this question.

> (IIUC links between values and frames was part of the issue in that mail.) 

The follow up mail makes it clearer, perhaps:

  https://www.sourceware.org/ml/gdb-patches/2013-11/msg00681.html

If you look at the backtrace there, you'll notice that
while inside dwarf2_frame_sniffer(this_frame) (frame #8), we ended
up in frame_register_unwind(this_frame), and that ends up with
bogus results.

> In my unwinders I am just returning values using e.g.

> 
>   (value-dereference
>    (value-cast (ephemeral-frame-read-register ef "rbp") void**-type))
> 
> or something similar.  If this is important, we will need to provide an
> interface that looks much more like the unwind interface instead of a
> simple Maybe<UnwindInfo> :(

I'm afraid I don't know what "Maybe<UnwindInfo>" is.

> 
>> But imagine the case of:
>>
>>   #0 frame0  <<<< current
>>   #1 frame1  <<<< selected
>>   #2 frame2
>>
>> So what context does make sense for symbol look ups at that point?
> 
> Oh, agreed very much.  I was looking up global symbols, like global
> constants, and in that case the block is NULL.  But, I was having issues
> that I didn't understand using lookup_global_symbol, and lookup_symbol
> worked otherwise outside of the sniffer...
> 
> Is the issue in the Guile bindings, that they should assume that it's
> sensible to get the current architecture, or the selected or current
> frame?

It probably makes sense to refer to those.  As long as e.g.,
can e.g., make get_current_frame() really return the current frame.
But from what I understood, we end up instead falling back to NULL?
If we're going to give the wrong answer, it makes me nervous to have
to support doing it going forward.  :-/

> 
>> Isn't the correct answer that the sniffer should pass in an explicit
>> block to lookup-symbol?
> 
> Yes, or in this case lookup-global-symbol.
> 
>>>   * You can't value-call, which is not unexpected in general, but the
>>>     reason is unexpected: because call_function_by_hand calls
>>>     get_current_arch and that doesn't work
>>
>> Where is that?  I think you meant that the guile wrapper calls
>> get_current_arch, not call_function_by_hand.
> 
> Yes, I see that now.  I think I was confused.
> 
>> Why would the unwinder want to access user / pseudo registers?
>> Shouldn't it be working only with raw registers?
> 
> Because the user is ignorant? :)  

:-)

> I was pleased that $pc, $sp, and $fp
> "worked" otherwise, so I hoped to avoid encoding architecture-specific
> things.  Alas.  (Incidentally, my patch still has the problem that
> passing in a pseudo register will barf.)

Speaking of which, it occurred to me that we'll likely have problems
on architectures that hide all raw registers (by making them unnamed)
behind pseudo registers, like MIPS.

> In summary:
> 
>   * Is is possible to make a Maybe<UnwindInfo>-style interface to
>     sniffers, or should extension languages expose the callbacks
>     instead because they really need to be called within their specific
>     dynamic environments?
> 
>   * Regarding recursion and selected frames -- what do you think about
>     the hack in v2 of this patch, which adds
>     frame_unwinder_is_unwinding() ?  It is a hack but it has somewhat
>     reasonable semantics.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Frame sniffers in Python/Guile/*
  2015-03-10 16:36     ` Pedro Alves
@ 2015-03-10 17:36       ` Andy Wingo
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Wingo @ 2015-03-10 17:36 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Alexander Smundak, Doug Evans

Greets :)

On Tue 10 Mar 2015 17:35, Pedro Alves <palves@redhat.com> writes:

> On 03/09/2015 07:18 PM, Andy Wingo wrote:
>
>> On Mon 09 Mar 2015 16:38, Pedro Alves <palves@redhat.com> writes:
>> 
>>> The sniffer callback is only supposed to identify whether the
>>> unwinder can unwind, not do any unwinding at all.  :-/
>> 
>> Unfortunately you have to do some unwinding in order to compute a frame
>> ID, and the "unwind->this_id" callback runs in approximately the same
>> environment as the sniffer.
>
> Can you give an example, so we can understand the requirements better?

Yes.  For example we are unwinding the innermost frame, frame 0, from
the sentinel.  The Guile sniffer is called to see if any Guile unwinder
can handle the frame.  I have a Guile unwinder that checks if the PC is
within any JIT-compiled page in V8.  It has to traverse the V8 heap,
basically, to see if it can handle the page.

Once the Guile V8 sniffer has enough information to know whether it can
unwind the frame or not, it _also_ has enough information to compute the
frame ID for frame 0 -- namely, since it knows what V8 object contains
the code, it knows how to interpret the registers to find the frame
pointer and it knows about the stack layout, so it can compute the start
PC and the start SP easily, which is the frame ID.

For the same reason it also knows how to unwind the PC, SP, and FP.

I guess by "you have to do some unwinding in the sniffer" I just meant
that users (ignorant ones like me :) of the Guile/Python interface to
unwinders will have to do some computation over the state of the
inferior in the sniffer callback to know whether to handle the frame or
not -- and that relatively speaking the other parts of the unwinder
interface are trivial.

> The main problem would be unwind-past-this_frame: that is, something doing
> get_prev_frame(this_frame), get_frame_id (this_frame) or something like that
> while inside the sniffer.

This is impossible in the proposed Guile or Python patches because the
wrappers for this_frame are not frames.  You can't actually even get to
any linked frame from the "ephemeral frame object" (in Guile) or the
"sniffer info object" (in Python).

What can happen is that someone could inadvertently cause recursion to
this_frame's unwinder -- for example via get_prev_frame on
get_selected_frame(), or even just by calling get_selected_frame() while
unwinding frame 0.

In this case -- somehow recursing to unwind this_frame -- my patch makes
get_prev_frame(next_frame) return NULL.  But you'd have to have a
next_frame to begin with, so this is OK.  It doesn't add strangeness.
True, this can make get_selected_frame() return the sentinel frame if
recursion happens while unwinding frame 0, but that case appears to be
already reachable otherwise -- see unwind_to_current_frame.

> Unwinding registers from the next frame to build a cache to compute
> the frame id or figure out where registers were saved is ok.

Yep, no problem here.

>>> See here for example:
>>>
>>>   https://www.sourceware.org/ml/gdb-patches/2013-11/msg00602.html
>> 
>> Is it important to use frame_unwind_got_memory et al for results?
>
> I don't think I understand this question.

I think I misunderstood the issue, please ignore the question.

(I thought it was that because you used frame_unwind_got_memory to
produce GDB values for saved registers, and those values linked to a
frame by ID, that somehow caused recursion when computing the frame_id
for this_frame.  I was wondering if it were somehow necessary to build
register values by frame_unwind_got_memory and not the value-building
API in Python or Guile.  It seems that is not the case.)

>> In my unwinders I am just returning values using e.g.
>
>> 
>>   (value-dereference
>>    (value-cast (ephemeral-frame-read-register ef "rbp") void**-type))
>> 
>> or something similar.  If this is important, we will need to provide an
>> interface that looks much more like the unwind interface instead of a
>> simple Maybe<UnwindInfo> :(
>
> I'm afraid I don't know what "Maybe<UnwindInfo>" is.

My apologies!

The Python patch calls the return value of the sniffer the "unwind
info".  If the sniffer returns None, then the sniffer is considered to
fail, and otherwise the return value is the "unwind info".  A Maybe<T>
is just an idiom for a data type that is either an instance of T or
None.  Although I'm a schemer at heart I have enjoyed thinking about
program design in types recently, but I see I have added to the
confusion.

What I meant was: in both the Python and the Guile patches, the unwinder
interface is essentially a function that either punts, or it does all of
the unwinding at once.  It doesn't reflect the nuances that the sniffer,
the this_id callback, and the prev_register callbacks are all called at
different times.  I like the simplification but I do see why the
low-level GDB interface is different (avoiding unnecessary prev_register
computation, chiefly).

The question was, is there a problem with the all-at-a-time approach
that the Guile and Python patches use?  I am thinking that no, that it's
OK to do it all at once.

>> [Should the Guile bindings avoid assuming that] it's sensible to get
>> the current architecture, or the selected or current frame?
>
> It probably makes sense to refer to those.  As long as e.g.,
> can e.g., make get_current_frame() really return the current frame.
> But from what I understood, we end up instead falling back to NULL?
> If we're going to give the wrong answer, it makes me nervous to have
> to support doing it going forward.  :-/

If I understand the concern, I think I answered it above when referring
to the new way that get_prev_frame could return NULL.

>> I was pleased that $pc, $sp, and $fp
>> "worked" otherwise, so I hoped to avoid encoding architecture-specific
>> things.  Alas.  (Incidentally, my patch still has the problem that
>> passing in a pseudo register will barf.)
>
> Speaking of which, it occurred to me that we'll likely have problems
> on architectures that hide all raw registers (by making them unnamed)
> behind pseudo registers, like MIPS.

What's the solution here then?  Perhaps Alexander was right that
register numbers have a place :)

Thank you again for the review and for thinking this through, Pedro.  I
know that it takes time and I really do appreciate it.

Cheers,

Andy


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-03-10 17:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-02 13:28 Frame sniffers in Python/Guile/* Andy Wingo
2015-03-08 20:04 ` Doug Evans
2015-03-09  8:27   ` Andy Wingo
2015-03-09 15:39 ` Pedro Alves
2015-03-09 19:19   ` Andy Wingo
2015-03-10 16:36     ` Pedro Alves
2015-03-10 17:36       ` Andy Wingo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox