Re: [RFC 2/3] Record function descriptor address instead of function address in value

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

From: Yao Qi <qiyaoltc@gmail.com>
To: Ulrich Weigand <uweigand@de.ibm.com>
Cc: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [RFC 2/3] Record function descriptor address instead of function address in value
Date: Fri, 28 Oct 2016 16:10:00 -0000	[thread overview]
Message-ID: <CAH=s-PN7hn9+mTmDmjJC3UpkQuocebbBCaDSgCMwZKH2Sz6cpQ@mail.gmail.com> (raw)
In-Reply-To: <20161017155133.A9B8711C257@oc8523832656.ibm.com>

Hi Ulrich,
I did some experiments these days, for all your suggestions, and
comments.  I share the results now...

On Mon, Oct 17, 2016 at 4:51 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Yao Qi wrote:
>
>> GCC vs GDB divergence on the meaning of "incr" brings the confusion.
>> We should reduce such divergence as much as we can.  However, this
>> divergence was added in https://sourceware.org/ml/gdb-patches/2001-11/msg00001.html
>> I agree with Jim, but I'd like use function descriptor address in value,
>> which makes the whole expression evaluation look more reasonable and
>> more close to compiler's behavior.
>
> I agree that this is a problem.  However, the reasons Jim mentions in
> that long comment are still valid: it is in general *not possible* to
> convert an arbitrary function address back to a function descriptor ...
>
>> In this patch, I add a new gdbarch method convert_from_func_addr, which
>> converts function address back to function descriptor address or
>> function pointer address.  It is the reversed operation of
>> convert_from_func_ptr_addr.
>
> ... and therefore this function cannot really be implemented on ppc64
> in a fully-general way.  In particular, when running stripped binaries
> or code that is otherwise without symbols (e.g. JITted code etc.), this
> conversion may not be possible.

I don't expect convert_from_func_addr working in general, due to the
reasons you mentioned.  convert_from_func_addr is only used when
symbol (from debug information) is available.  If we want to GDB
handle such complex expression evaluation correctly, debug
information is required.

>
> B.t.w. note that there is already a similar function that attempts this
> conversion (ppc-sysv-tdep.c:convert_code_addr_to_desc_addr), but it is
> currently only used during inferior function calls, so it is not so
> critical if it fails.  (With your change, that function may actually
> no longer be necessary at that place since values should already point
> to function descriptors ...)
>

Yes, convert_from_func_addr is similar to convert_code_addr_to_desc_addr.
I removed convert_code_addr_to_desc_addr, but it breaks ifunc
inferior call in my experiment, because ifunc resolver gets the function
address rather than function descriptor address, we still need
convert_code_addr_to_desc_addr to get function descriptor address.

> Note that this checks for presence of an msymbol at the code address,
> while your code checks for presence of a symbol.  Maybe the best way
> would be to check for either.
>

Now, convert_from_func_addr does whatever
convert_code_addr_to_desc_addr does.

>> We convert function address to function
>> descriptor address when,
>>
>>  - we create value for a function,
>>  - we generate ax code for a function,
>
> Since the conversion is problematic in general, I think the best way
> would be to keep descriptor addresses wherever possible, and only
> convert to code addresses at the last minute when necessary.
>

Yes, I agree.

> Your code already *mostly* does that, with the exception of symbol
> addresses.  I'm wondering whether we shouldn't push the change one
> step further back and already store descriptor addresses in
> SYMBOL_VALUE_ADDRESS.  Note that this is currently already handled
> somewhat inconsistenly on ppc64: msymbols already point to the
> descriptor, and so do symbols read from stabs debug info; only
> symbols read from DWARF debug info actually point to the code
> address.

I tried what you suggested, but failed.  SYMBOL_VALUE_ADDRESS
is used to get variable address, rather than function's address.  We
get function address from block (BLOCK_START), and get block
from symbol (SYMBOL_BLOCK_VALUE).  There is no good way to
store function descriptor address in symbol.

>
>> I don't change the meaning of return value of value_as_address, because
>> it is widely used, so value_as_address still return the function
>> address if type is TYPE_CODE_FUNC or TYPE_CODE_METHOD.
>
> I'm wondering whether this is a good idea; maybe it would be better to
> return descriptor addresses and update those callers that actually need
> code addresses.  This would keep the principle that we should keep
> descriptor addresses as long as possible.  Also, this might actually
> fix more bugs; if you look at e.g. this code in value_equal:
>
>   else if (code1 == TYPE_CODE_PTR && is_int2)
>     return value_as_address (arg1) == (CORE_ADDR) value_as_long (arg2);
>   else if (code2 == TYPE_CODE_PTR && is_int1)
>     return (CORE_ADDR) value_as_long (arg1) == value_as_address (arg2);
>
> this might actually cause invalid evaluation of expressions like
>
>   incr == 0x12345678
>
> (as compared to the native C evaluation).
>

How about doing this in a followup patch as an enhancement?  My
priority is to get this RFC acceptable, and make PPC64/ARM/MIPS
works well.  Propagate descriptor address and bug fixes can be done
later.

>
>> This patch brings several user visible changes, which look more
>> accurate, shown by this table below,
>>
>>  COMMAND           BEFORE                       AFTER
>>  p main            main function address        main function descriptor
>>                                                 address
>>  disass main       disassembly function main    not changed
>>  disass main+4,+4  disassembly 4 bytes from     disassembly 4 bytes from
>>                    function main address + 4    main's function descriptor + 4
>>  x/i main          show one instruction on      show one instruction on main's
>>                    function main address        function descriptor
>>
>> Although the latter looks inconvenient, that is consistent to the
>> meaning on C language level.  Due to these changes, test cases are
>> adjusted accordingly.
>
> Those are a bit annoying.  I think the underlying problem is that operations
> like "disass" or "x/i" really don't work on "values" in the original sense
> but rather on "PC addresses".  Maybe it would make sense to have those
> function enable a special "mode" in the expression evaluator that would
> change the conversion of functions to function pointers to use the code
> address instead of the descriptor address?
>

I think about the special "mode" in the expression evaluation, but I
suspect that there is a case in a single expression, function address
is expected in some symbol, and descriptor address is expected in
some other symbol, like,

int func (int i) { return i;}

(gdb) x/i func + func (1)

the "func" is expected to be function address and the second "func"
is expected to be function descriptor address.  I had a heuristics that
VALUE means function address if we do a pointer add with long type
(valarith.c:value_ptradd).  It works fine.

I've already managed to get all my patches work.  If you agree with
my thoughts above, I'll post my patches next week.

-- 
Yao (齐尧)

next prev parent reply	other threads:[~2016-10-28 16:10 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-14 10:53 [PATCH 0/3] Fix gdb.base/func-ptrs.exp fails in ppc64 and arm thumb mode Yao Qi
2016-10-14 10:53 ` [RFC 3/3] Update test cases Yao Qi
2016-10-14 10:53 ` [RFC 2/3] Record function descriptor address instead of function address in value Yao Qi
2016-10-14 17:35   ` Simon Marchi
2016-10-17 11:40     ` Yao Qi
2016-10-17 15:40       ` Simon Marchi
2016-10-17 15:51   ` Ulrich Weigand
2016-10-18  2:27     ` Maciej W. Rozycki
2016-10-18 13:03       ` Yao Qi
2016-10-28 16:20       ` Yao Qi
2017-10-03 18:12         ` Maciej W. Rozycki
2017-10-04 21:25           ` Yao Qi
2016-10-28 16:10     ` Yao Qi [this message]
2016-10-28 18:41       ` Ulrich Weigand
2016-10-14 10:53 ` [PATCH 1/3] Use get_var_address in test cases Yao Qi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAH=s-PN7hn9+mTmDmjJC3UpkQuocebbBCaDSgCMwZKH2Sz6cpQ@mail.gmail.com' \
    --to=qiyaoltc@gmail.com \
    --cc=gdb-patches@sourceware.org \
    --cc=uweigand@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox