From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 50848 invoked by alias); 8 Apr 2015 16:49:17 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 50695 invoked by uid 89); 8 Apr 2015 16:49:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: e06smtp15.uk.ibm.com Received: from e06smtp15.uk.ibm.com (HELO e06smtp15.uk.ibm.com) (195.75.94.111) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Wed, 08 Apr 2015 16:49:13 +0000 Received: from /spool/local by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 8 Apr 2015 17:49:07 +0100 Received: from d06dlp01.portsmouth.uk.ibm.com (9.149.20.13) by e06smtp15.uk.ibm.com (192.168.101.145) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 8 Apr 2015 17:49:05 +0100 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id 422AC17D805D for ; Wed, 8 Apr 2015 17:49:38 +0100 (BST) Received: from d06av07.portsmouth.uk.ibm.com (d06av07.portsmouth.uk.ibm.com [9.149.37.248]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t38Gn4c064749800 for ; Wed, 8 Apr 2015 16:49:04 GMT Received: from d06av07.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t38Gn4LB021183 for ; Wed, 8 Apr 2015 12:49:04 -0400 Received: from oc7340732750.ibm.com (dyn-9-152-213-199.boeblingen.de.ibm.com [9.152.213.199]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id t38Gn4OB021180; Wed, 8 Apr 2015 12:49:04 -0400 Received: by oc7340732750.ibm.com (Postfix, from userid 500) id 31EECCF20; Wed, 8 Apr 2015 18:49:04 +0200 (CEST) Subject: Re: [PATCH 1/3 v2] Fast tracepoint for powerpc64le To: cole945@gmail.com (Wei-cheng Wang) Date: Wed, 08 Apr 2015 16:49:00 -0000 From: "Ulrich Weigand" Cc: palves@redhat.com, gdb-patches@sourceware.org In-Reply-To: <55185216.4080306@gmail.com> from "Wei-cheng Wang" at Mar 30, 2015 03:27:18 AM MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20150408164904.31EECCF20@oc7340732750.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15040816-0021-0000-0000-000003813D9F X-SW-Source: 2015-04/txt/msg00278.txt.bz2 Wei-cheng Wang wrote: > >> +static void > >> +ppc64_emit_reg (int reg) > >> +{ > >> + unsigned char buf[10 * 4]; > >> + unsigned char *p = buf; > >> + > >> + p += GEN_LD (p, 3, 31, bc_framesz - 32); > >> + p += GEN_LD (p, 3, 3, 48); /* offsetof (fast_tracepoint_ctx, regs) */ > > > > This seems a bit fragile, it would be better to determine the offset > > automatically ... (I don't quite understand why the x86 code works > > either, as it is right now ...) > >Hi Predro, > >I checked the implementation of x86 emit_reg and it seems the implementation >is wrong. It assumes the first argument is ctx->regs, but it's actually 'ctx' > >if (tpoint->compiled_cond) > err = ((condfn) (uintptr_t) (tpoint->compiled_cond)) (ctx, &value); > >I think probably either we could pass ctx->regs to compiled_cond instead, >or move the declarations of fast_tracepoint_ctx (and others) to tracepoint.h, >so we can use "offsetof (fast_tracepoint_ctx, regs)" instead. >Any suggestion? FWIW, passing the regs buffer directly to the compiled routine seems more straightforward to me ... >The following are specific to PowerPC. > > >> + if ((imm >> 8) == 0) > >> + { > >> + /* li reg, imm[7:0] */ > >> + p += GEN_LI (p, reg, imm); > > Actually, you can load values up to 32767 with a single LI. > >How about this fix? So we can load a small negative number with LI. > >if ((imm + 32768) < 65536) > { > /* li reg, imm[7:0] */ > p += GEN_LI (p, reg, imm); > } That looks good to me. > >> p += gen_atomic_xchg (p, lockaddr, 0, 1); > >> /* Call to collector. */ > >> p += gen_call (p, collector); > >> p += gen_atomic_xchg (p, lockaddr, 1, 0); > > This seems wrong. Shouldn't *lockaddr be set to the address > > of a collecting_t object, and not just "1"? > >AFAIK, lockaddr only matters to the two lines above, >so simply put '1' for LOCKED should be fine. Or am I missing anything? Yes, the lockaddr is used from the gdbserver side. See the uses of ipa_sym_addrs.addr_collecting in tracepoint.c; it is expected that this value is either NULL or else points to a collecting_t object that can be read/written by gdbserver. > >> + /* Now, insert the original instruction to execute in the jump pad. */ > >> + *adjusted_insn_addr = buildaddr + (p - buf); > >> + *adjusted_insn_addr_end = *adjusted_insn_addr; > >> + relocate_instruction (adjusted_insn_addr_end, tpaddr); > > Hmm. This calls back to GDB to perform the relocation of the > > original instruction. On PowerPC, there are only a handful of > > instructions that need to be relocated; I'm not sure it is really > > necessary to call back to GDB. Can't those just be relocated > > directly here? This might even make the code simpler overall. > >I just follow the design of Predo. >I could move the code to gdbserver side if you suggest. I think if there is no benefit in having this code on the GDB side, it would be better to move it to gdbsever. (Potential benefits could be: we need information that isn't available in gdbserver, like symbol data; or the code can be shared with other users if in GDB. But on PowerPC, I think none of this applies.) > > Note that the stack frame layout as above only applies to ELFv1, but > > you're actually only supporting ELFv2 at the moment. For ELFv2, there > > is no parameter save area (for this specific call), there is no compiler > > or linker doubleword, and the TOC save area is at SP + 24. (So this > > location probably shouldn't be used to save something else ...) > > This is also a bit bigger than required for ELFv2. On the other hand, > > having a larger buffer doesn't hurt. > >Oops, I have to admit I looked into the wrong ABI document. >Hopefully we will support ppc64be soon, so I suggest still use 112-byte for >minimual frame size for simplicity? Yes, as I said, just having a larger frame is OK. However, there are other places where the ABI matters (like not overwriting the TOC save area, or like respecting the (lack of) stack red zone ...). > >> +/* Return true if ADDR is a valid address for tracepoint. Set *ISZIE > >> + to the number of bytes the target should copy elsewhere for the > >> + tracepoint. */ > >> + > >> +static int > >> +ppc_fast_tracepoint_valid_at (struct gdbarch *gdbarch, > >> + CORE_ADDR addr, int *isize, char **msg) > >> +{ > >> + if (isize) > >> + *isize = gdbarch_max_insn_length (gdbarch); > >> + if (msg) > >> + *msg = NULL; > >> + return 1; > >> +} > > > > Should/can we check here where the jump to the jump pad will be in > > range? Might be better to detect this early ... > >Client has no idea about where the jump pad will be installed. >If it's out of range, gdbserver will report it right after user >entered 'tstart' command Well, but we know the logic the stub uses. For example, we know that we certainly cannot install a fast tracepoint in any shared library code, since the jump pad will definitely be too far away. We can check for this condition here. (We could also check for tracepoints in executables that have a text section larger than 32 MB ...) > > Well, we really should handle the other cases too; there's no reason > > to simply fail if this happens to be a branch on count or such ... > >In the new patch, I will handle other cases as such, > > bdnz .Lgoto >1:INSN1 > >is transform to > > bdz > b .Lgoto >1:INSN1 > > and > > bdnzt eq, .Lgoto > >is transfrom to > > bdz 1f (+12) > bf eq, 1f (+8) > b .Lgoto >1:INSN1 > >Is it right? That looks OK, thanks. > >> + frame.frameless = 0; > >> + frame.lr_offset = 16; > > > > This isn't correct for ppc32 ... > >frame.lr_offset = 4; >Right? Correct. > > In any case, this code makes many assumptions that may not always be > > true. But then again, the same is true for the i386 case, so maybe this > > is OK for now ... In general, if we have DWARF CFI for the function, > > it would be much preferable to refer to that in order to determine the > > exact stack layout. > >This function will only be used if user want to collect return address >(using 'collect $_ret' action command) > >I agree that we should use DWARF CFI, but I have no idea how can I >get the information in this function. Any suggestion where I can look into? This is probably going to require major changes, so it looks like something for a separate project. In general, using DWARF CFI to unwind a frame is currently done in dwarf2-frame.c, with the bulk of the work done in dwarf2_frame_cache and dwarf2_frame_prev_register. This conceptually involves two stages: looking up the FDE from the current PC (which can be done using just the executable file) to determine the "recipe" how to unwind at this location, and then implement the recipe, which involves using current register values and memory content. Unfortunately, those two stages are somewhat interwoven and not cleanly separated in the current implemention, which makes it not directly usable for tracepoints. What we should be doing is to actually implement a clean separation, so that the work to determine the unwind "recipe" can be shared between live unwinding and tracepoints, and then implement a tracepoint-specific step that uses the recipe to generate a series of agent commands that will implement "running" the recipe in the context of the agent. This has already been done for the use of DWARF that describes variable locations (see locexpr_tracepoint_var_ref in dwarf2loc.c), but not yet for the use of DWARF to unwind. Bye, Ulrich -- Dr. Ulrich Weigand GNU/Linux compilers and toolchain Ulrich.Weigand@de.ibm.com