From: Simon Marchi
To: John Baldwin
Cc: gdb-patches@sourceware.org
Date: Fri, 08 Mar 2019 02:55:00 -0000
Subject: Re: [PATCH v2 04/11] Add a new gdbarch method to resolve the address of TLS variables.
In-Reply-To: <2c282f52-0269-d6a8-8533-4c00b1a4ee8d@FreeBSD.org>
References: <4db33aead3f31532b7d4e165d9786df792a4d925.1549672588.git.jhb@FreeBSD.org> <02c8a44b-b1d2-0f0f-9b6f-72a0fb673f83@simark.ca> <2c282f52-0269-d6a8-8533-4c00b1a4ee8d@FreeBSD.org>
Message-ID: <4a8bfa95b84386b3a76a37113495b7bc@simark.ca>

On 2019-03-07 18:50, John Baldwin wrote:
> On 3/7/19 8:08 AM, Simon Marchi wrote:
>> On 2019-02-08 7:40 p.m., John Baldwin wrote:
>>> Permit TLS variable addresses to be resolved purely by an ABI rather
>>> than requiring a target method. This doesn't try the target method
>>> if the ABI function is present (even if the ABI function fails) to
>>> simplify error handling.
>>
>> I don't see anything wrong with the patch (and the comment you removed
>> in target_translate_tls_address hints it is right), but again I am not
>> very familiar with how TLS works, so I wouldn't spot if anything was
>> conceptually wrong with this approach. I would appreciate it if
>> another maintainer could take a look and give their opinion.
>
> Ok. FWIW, the reason for target vs gdbarch has to do with the different
> ways one can resolve a TLS variable. Some background:
>
> In ELF relocations, a TLS variable is identified by an offset in a
> special TLS section of an ELF file, similar to global symbols being an
> offset relative to .data or .bss. However, TLS variables are duplicated
> for each thread. To support this, the runtime linker allocates an array
> of pointers for each thread called the DTV array. The runtime linker
> also assigns an array index to each ELF object, so the executable is
> assigned array index 1, and other shared libraries that use TLS are
> assigned indices as they are mapped by the runtime linker. The pointers
> in each thread's array point to the per-thread blocks of TLS variables
> for a given ELF object.
> Thus, if index 1 is for my program and index 2 was assigned to libc,
> then DTV[1] contains a pointer to all of the TLS variables in my main
> program and DTV[2] contains a pointer to all of the TLS variables in
> libc.
>
> Thus, if libc has two TLS integers 'foo' and 'bar', they might be
> assigned offsets of 0 and 4. To read the value of 'foo' one uses the
> expression '*(int *)(DTV[2])'. To read 'bar' you would use
> '*(int *)(DTV[2] + 4)'.
>
> There are some extra optimizations in the compiler-generated code
> (there's something called static TLS that can be at a fixed offset from
> the per-thread TCB pointer IIRC, but there are also valid DTV[] pointers
> that can get to the same variables just via more indirection. Compiled
> code is also allowed to fetch the 'base' of a TLS block for a given
> shared object and then save that 'base' and use offsets from it to
> access different variables. Put another way, the compiler can assume
> that &bar - &foo is always '4' and just add the relative offset to
> '&foo' to compute '&bar' without going through the DTV array every
> time).
>
> In target_translate_tls_address() we are given the 'struct objfile' of
> the ELF object and the offset of the TLS variable we are trying to
> find. The gdbarch_fetch_tls_load_module_address function fetches the
> pointer to the runtime linker's data structure describing that ELF
> object.
>
> The target version (target::get_thread_local_address) expects to use
> some target specific method to turn a (thread, linker_module_addr,
> offset) tuple into the address of a TLS variable. On Linux and other
> systems using libthread_db for this, it calls a libthread_db function.
> Internally that libthread_db function looks at the runtime linker's
> structure to extract the TLS index of the ELF object. It then looks in
> the thread library's per-thread data structure to find a pointer to the
> DTV array.
> It then uses 'DTV[index] + offset' to compute the final address.
> Note that this is all done in libthread_db rather than in gdb itself.
>
> The gdbarch method I'm using for FreeBSD doesn't use libthread_db.
> Instead, it more closely mimics what the compiler-generated code does.
> Many architectures use some sort of register to point to a per-thread
> Thread Control Block (TCB), and they store a pointer to the DTV array
> either in the TCB or at a fixed offset relative to the TCB. For
> example, 64-bit x86 uses the %fs segment prefix to access the TCB, and
> the %fs_base register is thus a pointer to the TCB. 32-bit x86 uses
> %gs and %gs_base instead. RISC-V has a 'tp' register for this purpose,
> etc. The approach I'm using for FreeBSD is to provide an
> architecture-specific function that uses the relevant register to
> locate the pointer to the DTV array. It then calls a shared function
> (patch 7) that extracts the TLS index from the runtime linker's data
> structure and computes the final address via 'DTV[index] + offset'.
>
> Mostly I did this because I don't like libthread_db, but using a
> gdbarch method should also be a bit more cross-debugger friendly (you
> don't have to have a libthread_db on a FreeBSD host that understands
> the Linux runtime linker or thread library or vice versa, and similar
> concerns with 32-bit vs 64-bit and x86 vs ARM, etc.).

OK, thanks to your explanation, I think I better understand the need to
have an arch-specific way of doing it.

>>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>>> index afc4da7cdd..09097bcbaf 100755
>>> --- a/gdb/gdbarch.sh
>>> +++ b/gdb/gdbarch.sh
>>> @@ -602,6 +602,7 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>>
>>>  # Fetch the target specific address used to represent a load module.
>>>  F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>>> +M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
>>
>> Could you document the method, especially the meaning of the
>> parameters?
>
> Sure. I used a variant of the comment from the target method:
>
> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
> index 48fcebd19a..d15b6aa794 100755
> --- a/gdb/gdbarch.sh
> +++ b/gdb/gdbarch.sh
> @@ -602,6 +602,14 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>
>  # Fetch the target specific address used to represent a load module.
>  F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
> +
> +# Return the thread-local address at OFFSET in the thread-local
> +# storage for the thread PTID and the shared library or executable
> +# file given by LM_ADDR. If that block of thread-local storage hasn't
> +# been allocated yet, this function may return an error. LM_ADDR may
> +# be zero for statically linked multithreaded inferiors.

What does "may return an error" mean? Does the method return a special
CORE_ADDR value, or does it throw an error?

Simon
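
P.S. For anyone following the thread, the 'DTV[index] + offset' lookup
from the quoted background can be pictured with a small toy model. This
is only a sketch under stated assumptions: the names (Thread,
tls_address), the addresses, and the choice to raise an exception for a
missing block are all made up for illustration. It is not gdb code, and
it does not say which error behaviour the gdbarch method actually has.

```python
# Toy model of the per-thread DTV lookup described in the quoted text.
# Every name and address below is hypothetical; none of this is gdb or
# runtime-linker code.
from dataclasses import dataclass, field


@dataclass
class Thread:
    # Maps a TLS module index (1 = executable, 2 and up = shared
    # libraries) to the base address of this thread's TLS block for
    # that module.
    dtv: dict = field(default_factory=dict)


def tls_address(thread: Thread, tls_index: int, offset: int) -> int:
    """Compute 'DTV[index] + offset' for one thread."""
    if tls_index not in thread.dtv:
        # Assumption for this sketch only: signal a missing block by
        # raising. The real method's behaviour is the open question.
        raise LookupError(f"no TLS block for module index {tls_index}")
    return thread.dtv[tls_index] + offset


# Two threads, each with its own copy of the executable's (index 1)
# and libc's (index 2) TLS blocks. The base addresses are invented.
t1 = Thread(dtv={1: 0x1000, 2: 0x2000})
t2 = Thread(dtv={1: 0x5000, 2: 0x6000})

FOO_OFFSET, BAR_OFFSET = 0, 4  # offsets of libc's 'foo' and 'bar'

foo_t1 = tls_address(t1, 2, FOO_OFFSET)
bar_t1 = tls_address(t1, 2, BAR_OFFSET)
bar_t2 = tls_address(t2, 2, BAR_OFFSET)
```

The same (index, offset) pair yields a different address in each thread,
while the distance between 'foo' and 'bar' stays the same within a
thread, which is what lets compiled code cache a block's base.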