Subject: Re: [PATCH 3/9] Handle TLS variable lookups when using separate debug object files.
To: Simon Marchi
Cc: gdb-patches@sourceware.org
From: John Baldwin
Date: Tue, 05 Feb 2019 22:21:00 -0000

On 2/5/19 12:06 PM, Simon Marchi wrote:
> On 2019-02-04 15:02, John Baldwin wrote:
>> On 2/2/19 7:52 AM, Simon Marchi wrote:
>>> On 2019-01-22 13:42, John Baldwin wrote:
>>>> The object files stored in the shared object list are the original
>>>> object files, not the separate debug object files. However, when
>>>> resolving a TLS variable, the variable is resolved against a separate
>>>> debug object file if it exists. In this case,
>>>> svr4_fetch_objfile_link_map was failing to find the link map entry
>>>> since the debug object file is not in its internal list, only the
>>>> original object file.
>>>
>>> Does this solve an existing issue, or an issue that would come up with
>>> the following patches?
>>
>> I only noticed while working on these patches, but I believe it is a
>> generic issue. I tried to reproduce on a Linux box by compiling a small
>> library with separate debug symbols and a program that linked against it
>> and running it under gdb, but TLS variables didn't work for me even
>> without separate debug symbols in my testing. :(
>>
>> $ cat foo.c
>> #include "foo.h"
>>
>> static __thread int foo_id;
>>
>> void
>> set_foo_id(int id)
>> {
>>     foo_id = id;
>> }
>>
>> int
>> get_foo_id(void)
>> {
>>     return foo_id;
>> }
>> $ cat foo.h
>> void set_foo_id(int id);
>> int get_foo_id(void);
>> $ cat main.c
>> #include <stdio.h>
>>
>> #include "foo.h"
>>
>> int
>> main(void)
>> {
>>
>>     set_foo_id(47);
>>     printf("foo id = %d\n", get_foo_id());
>>     return (0);
>> }
>> $ cc -g -fPIC -shared foo.c -o libfoo.so.full
>> $ objcopy --only-keep-debug libfoo.so.full libfoo.so.debug
>> $ objcopy --strip-debug --add-gnu-debuglink=libfoo.so.debug libfoo.so.full libfoo.so
>> $ cc -g main.c -o foo -lfoo -L.
>> $ gdb foo
>> GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
>> ...
>> (gdb) start
>> Temporary breakpoint 1 at 0x7ae: file main.c, line 9.
>> Starting program: /home/john/tls_lib/foo
>>
>> Temporary breakpoint 1, main () at main.c:9
>> 9           set_foo_id(47);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3970, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>>
>> Then tried it without separate debug file, but that didn't work either:
>>
>> $ cc -g -fPIC -shared foo.c -o libfoo.so
>> $ gdb foo
>> GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
>> ...
>> (gdb) start
>> Temporary breakpoint 1 at 0x7ae: file main.c, line 9.
>> Starting program: /home/john/tls_lib/foo
>>
>> Temporary breakpoint 1, main () at main.c:9
>> 9           set_foo_id(47);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3982, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>> (gdb) n
>> 10          printf("foo id = %d\n", get_foo_id());
>> (gdb)
>> foo id = 47
>> 11          return (0);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3982, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>>
>> I would have expected the second case to work and the first to fail and
>> for the patch to fix the first case.
>
> I did a similar test and hit the same message. I figured it's because
> you need to build with -pthread. With -pthread, GDB manages to print
> the value in both cases (with and without separate debug info). That's
> why I ended up asking you this question, I thought that maybe I just
> hadn't reproduced the issue correctly.
>
> I have attached my test case (so we don't have to rewrite the same test
> over and over), which you can easily build with and without debug info
> (see the Makefile). Does it trigger the problem on FreeBSD? Does it
> work for you in both cases on Linux, like it does for me?
>
> Now that I take a look, there might be the gdb.threads/tls-shared.exp
> test that does the exact same thing. But there doesn't seem to be a
> board file to test with separate debug files out of the box...

So, it works fine for me without this patch on FreeBSD. However, I dug
around a bit more as I had only noticed this before when trying to debug
a core dump for riscv (as FreeBSD/riscv64 core dumps have the tp register
in them). That is when I get a separate objfile passed to
svr4_fetch_objfile_link_map:

Breakpoint 3, svr4_fetch_objfile_link_map (objfile=0x8038a1600)
    at ../../gdb/solib-svr4.c:1541
1541      struct svr4_info *info = get_svr4_info ();
(top-gdb) p *objfile
$3 = {next = 0x8038a1480,
  original_name = 0x8038df010 "/usr/lib/debug//ufs/riscv64/rootfs/lib/libc.so.7.debug", addr_low = 0x0, flags = {
...
(top-gdb) where
#0  svr4_fetch_objfile_link_map (objfile=0x8038a1600)
    at ../../gdb/solib-svr4.c:1541
#1  0x00000000015085e7 in gdbarch_fetch_tls_load_module_address (
    gdbarch=0x803800010, objfile=0x8038a1600) at ../../gdb/gdbarch.c:3019
#2  0x00000000019ba415 in target_translate_tls_address (objfile=0x8038a1600,
    offset=0x1800) at ../../gdb/target.c:705
#3  0x00000000016ea2a1 in find_minsym_type_and_address (msymbol=0x803cfcb18,
    objfile=0x8038a1600, address_p=0x7fffffffd2e8) at ../../gdb/parse.c:500
#4  0x00000000014afeb7 in evaluate_var_msym_value (noside=EVAL_NORMAL,
    objfile=0x8038a1600, msymbol=0x803cfcb18) at ../../gdb/eval.c:743
#5  0x00000000014b0290 in evaluate_subexp_standard (expect_type=0x0,
    exp=0x8034f0430, pos=0x7fffffffde34, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1326
#6  0x00000000012a8562 in evaluate_subexp_c (expect_type=0x0,
    exp=0x8034f0430, pos=0x7fffffffde34, noside=EVAL_NORMAL)
    at ../../gdb/c-lang.c:704
#7  0x00000000014ae17b in evaluate_subexp (expect_type=0x0, exp=0x8034f0430,
    pos=0x7fffffffde34, noside=EVAL_NORMAL) at ../../gdb/eval.c:80
#8  0x00000000014ae48b in evaluate_expression (exp=0x8034f0430)
    at ../../gdb/eval.c:141
#9  0x000000000173184f in print_command_1 (
    exp=0x7fffffffe101 "__je_tsd_initialized", voidprint=1)
    at ../../gdb/printcmd.c:1187
#10 0x000000000172e5ff in print_command (
    exp=0x7fffffffe101 "__je_tsd_initialized", from_tty=1)
    at ../../gdb/printcmd.c:1200
(top-gdb) frame 5
#5  0x00000000014b0290 in evaluate_subexp_standard (expect_type=0x0,
    exp=0x8034f0430, pos=0x7fffffffde34, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1326
1326            value *val = evaluate_var_msym_value (noside,
(top-gdb) l
1321        case OP_VAR_MSYM_VALUE:
1322          {
1323            (*pos) += 3;
1324
1325            minimal_symbol *msymbol = exp->elts[pc + 2].msymbol;
1326            value *val = evaluate_var_msym_value (noside,
1327                                                  exp->elts[pc + 1].objfile,
1328                                                  msymbol);
1329
1330            type = value_type (val);

whereas on a live amd64 process this was called from a different place:

(top-gdb) where
#0  svr4_fetch_objfile_link_map (objfile=0x803879380)
    at ../../gdb/solib-svr4.c:1541
#1  0x00000000015085e7 in gdbarch_fetch_tls_load_module_address (
    gdbarch=0x8039b7010, objfile=0x803879380) at ../../gdb/gdbarch.c:3019
#2  0x00000000019ba3f5 in target_translate_tls_address (objfile=0x803879380,
    offset=6168) at ../../gdb/target.c:705
#3  0x000000000141f361 in dwarf_evaluate_loc_desc::get_tls_address (
    this=0x7fffffffc578, offset=6168) at ../../gdb/dwarf2loc.c:609
#4  0x00000000014077dd in dwarf_expr_context::execute_stack_op (
    this=0x7fffffffc578, op_ptr=0x803dea6fb "\002S\177\001",
    op_end=0x803dea6fb "\002S\177\001") at ../../gdb/dwarf2expr.c:1175
#5  0x00000000014056b1 in dwarf_expr_context::eval (this=0x7fffffffc578,
    addr=0x803dea6f1 "\016\030\030", len=10) at ../../gdb/dwarf2expr.c:301
#6  0x000000000140e99d in dwarf2_evaluate_loc_desc_full (type=0x8040ad930,
    frame=0x0, data=0x803dea6f1 "\016\030\030", size=10, per_cu=0x8039cb190,
    subobj_type=0x8040ad930, subobj_byte_offset=0)
    at ../../gdb/dwarf2loc.c:2170
#7  0x000000000140e7d2 in dwarf2_evaluate_loc_desc (type=0x8040ad930,
    frame=0x0, data=0x803dea6f1 "\016\030\030", size=10, per_cu=0x8039cb190)
    at ../../gdb/dwarf2loc.c:2352
#8  0x0000000001412794 in locexpr_read_variable (symbol=0x80416c930,
    frame=0x0) at ../../gdb/dwarf2loc.c:3503
#9  0x00000000014e3743 in default_read_var_value (var=0x80416c930,
    var_block=0x8041724e0, frame=0x0) at ../../gdb/findvar.c:610
#10 0x00000000014e4881 in read_var_value (var=0x80416c930,
    var_block=0x8041724e0, frame=0x0) at ../../gdb/findvar.c:815
#11 0x0000000001a99abe in value_of_variable (var=0x80416c930,
    b=0x8041724e0) at ../../gdb/valops.c:1292
#12 0x00000000014afd7b in evaluate_var_value (noside=EVAL_NORMAL,
    blk=0x8041724e0, var=0x80416c930) at ../../gdb/eval.c:721
#13 0x00000000014b0217 in evaluate_subexp_standard (expect_type=0x0,
    exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1312
#14 0x00000000012a8562 in evaluate_subexp_c (expect_type=0x0,
    exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL)
    at ../../gdb/c-lang.c:704
#15 0x00000000014ae17b in evaluate_subexp (expect_type=0x0, exp=0x803507830,
    pos=0x7fffffffd504, noside=EVAL_NORMAL) at ../../gdb/eval.c:80
#16 0x00000000014ae48b in evaluate_expression (exp=0x803507830)
    at ../../gdb/eval.c:141
#17 0x000000000173184f in print_command_1 (
    exp=0x7fffffffd7d1 "__je_tsd_initialized", voidprint=1)
    at ../../gdb/printcmd.c:1187
#18 0x000000000172e5ff in print_command (
    exp=0x7fffffffd7d1 "__je_tsd_initialized", from_tty=1)
    at ../../gdb/printcmd.c:1200
(top-gdb) frame 13
#13 0x00000000014b0217 in evaluate_subexp_standard (expect_type=0x0,
    exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1312
1312            return evaluate_var_value (noside, exp->elts[pc + 1].block, var);
(top-gdb) l
1307          (*pos) += 3;
1308          symbol *var = exp->elts[pc + 2].symbol;
1309          if (TYPE_CODE (SYMBOL_TYPE (var)) == TYPE_CODE_ERROR)
1310            error_unknown_type (SYMBOL_PRINT_NAME (var));
1311          if (noside != EVAL_SKIP)
1312            return evaluate_var_value (noside, exp->elts[pc + 1].block, var);
1313          else
1314            {
1315              /* Return a dummy value of the correct type when skipping, so
1316                 that parent functions know what is to be skipped.  */

So it seems that the OP_VAR_VALUE path calls down into the dwarf bits that
get the "original" objfile to pass to target_translate_tls_address, whereas
the OP_VAR_MSYM_VALUE case ends up using a separate object file. This might
in fact be due to bugs in the RISCV GCC backend as the TLS symbols for RISCV
don't seem quite right. I have to cast TLS variables to their types, for
example:

(gdb) p __je_tsd_initialized
'__je_tsd_initialized' has unknown type; cast it to its declared type
(gdb) p (bool)__je_tsd_initialized
$2 = 1 '\001'

Also, TLS variables in the main executable did not work for me on RISCV,
only TLS variables in shared libraries, unlike on other architectures I
tested where TLS variables in both the executable and shared libraries
worked.

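To make the mismatch above concrete, here is a small standalone C sketch of
the lookup. It is a toy model, not GDB code: struct toy_objfile, struct
toy_so and toy_fetch_link_map are hypothetical names, the link map addresses
are made up, and the separate_debug_objfile_backlink field is only assumed
to mirror the backlink GDB keeps from a separate debug objfile to its
original. The idea is that the shared object list only knows the original
objfile, so a lookup keyed on the .debug objfile has to step back to the
original first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GDB's struct objfile; only the fields needed here.  */
struct toy_objfile
{
  const char *name;
  /* For a separate debug objfile, points back at the original objfile;
     NULL for an ordinary objfile.  (Assumed to mirror GDB's backlink.)  */
  struct toy_objfile *separate_debug_objfile_backlink;
};

/* Toy stand-in for one entry of the shared object list.  */
struct toy_so
{
  struct toy_objfile *objfile;  /* Always the original objfile.  */
  unsigned long lm_addr;        /* Link map address (made-up value).  */
};

/* Return the link map address recorded for OBJFILE, or 0 if OBJFILE is
   not in LIST.  */
unsigned long
toy_fetch_link_map (const struct toy_so *list, size_t count,
                    struct toy_objfile *objfile)
{
  assert (objfile != NULL);

  /* If we were handed a separate debug objfile, search for the original
     objfile instead, since only originals appear in LIST.  */
  if (objfile->separate_debug_objfile_backlink != NULL)
    objfile = objfile->separate_debug_objfile_backlink;

  for (size_t i = 0; i < count; i++)
    if (list[i].objfile == objfile)
      return list[i].lm_addr;

  return 0;
}
```

With a list that holds only libc.so.7, a lookup on either libc.so.7 or its
libc.so.7.debug objfile returns the same link map address. Without the
backlink step, the .debug lookup falls through to 0, which corresponds to
the failure seen in the OP_VAR_MSYM_VALUE path.
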
-- John Baldwin