From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 71009 invoked by alias); 28 Oct 2016 16:10:11 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 70989 invoked by uid 89); 28 Oct 2016 16:10:10 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,KAM_STOCKGEN,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=no version=3.3.2 spammy=0x12345678, reversed, mentions X-HELO: mail-qk0-f193.google.com Received: from mail-qk0-f193.google.com (HELO mail-qk0-f193.google.com) (209.85.220.193) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 28 Oct 2016 16:09:59 +0000 Received: by mail-qk0-f193.google.com with SMTP id v138so2433160qka.2 for ; Fri, 28 Oct 2016 09:09:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=b0sETXWGiGMIRUCgPugkY1ifQpwC+QF9H1UBEJY+Lhs=; b=TMzDJKsBoYgEem1VSfuCnkP19AfQUZc5NtusuLoc9lkU4XSVMdwI4HMWxsIQrRHek3 TpC2vvTLWKc+jpDvCNBqbLPWOQq4GMOwytxZwvXjHNrGTjH1tmU7MGUY56K1moNQPsmH Ufw7TRVBXMxHMVqQelMUqKWFFycSnVvClz76ZC6XulE8vBxqhAjg+BVOj3TjKkIg5r+2 YMpvfPZCjQOrOmjUUBLql0Efs+zWIE1+1oRBFqk7dGZMltOJB8L5U9/Pj7W6r1/vij6P 2c3EL5nfOolxoHOrIH6Tf9sppsQ/RSh+Am3YiQYSPOBU0yi4/eJ0nEr/vqcgy36fQkMH g9IQ== X-Gm-Message-State: ABUngvcuJoPPzxUXLWWfLbquVMOUstWKp1YyDKVRtC4VU3fTnt4shBMI4qltIbHx9feWF8vTIKhAR44IkKr0Tg== X-Received: by 10.55.43.90 with SMTP id r87mr12993239qkh.234.1477670998046; Fri, 28 Oct 2016 09:09:58 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.149.168 with HTTP; Fri, 28 Oct 2016 09:09:57 -0700 (PDT) In-Reply-To: <20161017155133.A9B8711C257@oc8523832656.ibm.com> References: <1476442387-17291-3-git-send-email-yao.qi@linaro.org> <20161017155133.A9B8711C257@oc8523832656.ibm.com> From: Yao Qi Date: Fri, 28 Oct 2016 16:10:00 -0000 Message-ID: Subject: Re: [RFC 2/3] Record function descriptor address instead of function address in value To: Ulrich Weigand Cc: "gdb-patches@sourceware.org" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2016-10/txt/msg00815.txt.bz2 Hi Ulrich, I did some experiments these days, for all your suggestions, and comments. I share the results now... On Mon, Oct 17, 2016 at 4:51 PM, Ulrich Weigand wrote: > Yao Qi wrote: > >> GCC vs GDB divergence on the meaning of "incr" brings the confusion. >> We should reduce such divergence as much as we can. However, this >> divergence was added in https://sourceware.org/ml/gdb-patches/2001-11/ms= g00001.html >> I agree with Jim, but I'd like use function descriptor address in value, >> which makes the whole expression evaluation look more reasonable and >> more close to compiler's behavior. > > I agree that this is a problem. However, the reasons Jim mentions in > that long comment are still valid: it is in general *not possible* to > convert an arbitrary function address back to a function descriptor ... > >> In this patch, I add a new gdbarch method convert_from_func_addr, which >> converts function address back to function descriptor address or >> function pointer address. It is the reversed operation of >> convert_from_func_ptr_addr. > > ... and therefore this function cannot really be implemented on ppc64 > in a fully-general way. In particular, when running stripped binaries > or code that is otherwise without symbols (e.g. JITted code etc.), this > conversion may not be possible. I don't expect convert_from_func_addr working in general, due to the reasons you mentioned. convert_from_func_addr is only used when symbol (from debug information) is available. If we want to GDB handle such complex expression evaluation correctly, debug information is required. > > B.t.w. note that there is already a similar function that attempts this > conversion (ppc-sysv-tdep.c:convert_code_addr_to_desc_addr), but it is > currently only used during inferior function calls, so it is not so > critical if it fails. (With your change, that function may actually > no longer be necessary at that place since values should already point > to function descriptors ...) > Yes, convert_from_func_addr is similar to convert_code_addr_to_desc_addr. I removed convert_code_addr_to_desc_addr, but it breaks ifunc inferior call in my experiment, because ifunc resolver gets the function address rather than function descriptor address, we still need convert_code_addr_to_desc_addr to get function descriptor address. > Note that this checks for presence of an msymbol at the code address, > while your code checks for presence of a symbol. Maybe the best way > would be to check for either. > Now, convert_from_func_addr does whatever convert_code_addr_to_desc_addr does. >> We convert function address to function >> descriptor address when, >> >> - we create value for a function, >> - we generate ax code for a function, > > Since the conversion is problematic in general, I think the best way > would be to keep descriptor addresses wherever possible, and only > convert to code addresses at the last minute when necessary. > Yes, I agree. > Your code already *mostly* does that, with the exception of symbol > addresses. I'm wondering whether we shouldn't push the change one > step further back and already store descriptor addresses in > SYMBOL_VALUE_ADDRESS. Note that this is currently already handled > somewhat inconsistenly on ppc64: msymbols already point to the > descriptor, and so do symbols read from stabs debug info; only > symbols read from DWARF debug info actually point to the code > address. I tried what you suggested, but failed. SYMBOL_VALUE_ADDRESS is used to get variable address, rather than function's address. We get function address from block (BLOCK_START), and get block from symbol (SYMBOL_BLOCK_VALUE). There is no good way to store function descriptor address in symbol. > >> I don't change the meaning of return value of value_as_address, because >> it is widely used, so value_as_address still return the function >> address if type is TYPE_CODE_FUNC or TYPE_CODE_METHOD. > > I'm wondering whether this is a good idea; maybe it would be better to > return descriptor addresses and update those callers that actually need > code addresses. This would keep the principle that we should keep > descriptor addresses as long as possible. Also, this might actually > fix more bugs; if you look at e.g. this code in value_equal: > > else if (code1 =3D=3D TYPE_CODE_PTR && is_int2) > return value_as_address (arg1) =3D=3D (CORE_ADDR) value_as_long (arg2= ); > else if (code2 =3D=3D TYPE_CODE_PTR && is_int1) > return (CORE_ADDR) value_as_long (arg1) =3D=3D value_as_address (arg2= ); > > this might actually cause invalid evaluation of expressions like > > incr =3D=3D 0x12345678 > > (as compared to the native C evaluation). > How about doing this in a followup patch as an enhancement? My priority is to get this RFC acceptable, and make PPC64/ARM/MIPS works well. Propagate descriptor address and bug fixes can be done later. > >> This patch brings several user visible changes, which look more >> accurate, shown by this table below, >> >> COMMAND BEFORE AFTER >> p main main function address main function descriptor >> address >> disass main disassembly function main not changed >> disass main+4,+4 disassembly 4 bytes from disassembly 4 bytes from >> function main address + 4 main's function descript= or + 4 >> x/i main show one instruction on show one instruction on = main's >> function main address function descriptor >> >> Although the latter looks inconvenient, that is consistent to the >> meaning on C language level. Due to these changes, test cases are >> adjusted accordingly. > > Those are a bit annoying. I think the underlying problem is that operati= ons > like "disass" or "x/i" really don't work on "values" in the original sense > but rather on "PC addresses". Maybe it would make sense to have those > function enable a special "mode" in the expression evaluator that would > change the conversion of functions to function pointers to use the code > address instead of the descriptor address? > I think about the special "mode" in the expression evaluation, but I suspect that there is a case in a single expression, function address is expected in some symbol, and descriptor address is expected in some other symbol, like, int func (int i) { return i;} (gdb) x/i func + func (1) the "func" is expected to be function address and the second "func" is expected to be function descriptor address. I had a heuristics that VALUE means function address if we do a pointer add with long type (valarith.c:value_ptradd). It works fine. I've already managed to get all my patches work. If you agree with my thoughts above, I'll post my patches next week. --=20 Yao (=E9=BD=90=E5=B0=A7)