Subject: Re: [PATCH] Fix re-runs of a second inferior (PR gdb/25410)
To: Pedro Alves, gdb-patches@sourceware.org
References: <20200124030222.13854-1-palves@redhat.com>
From: Simon Marchi
Message-ID: <50431f58-94c4-88ce-f59a-54c16a36475c@simark.ca>
Date: Fri, 24 Jan 2020 07:00:00 -0000
In-Reply-To: <20200124030222.13854-1-palves@redhat.com>

On 2020-01-23 10:02 p.m., Pedro Alves wrote:
> This fixes a latent bug exposed by the multi-target patch (5b6d1e4fa
> "Multi-target support").
>
> The symptom described in the bug is that starting a first inferior,
> then trying to run a second (multi-threaded) inferior twice, causes
> libthread_db to fail to load, along with other erratic behavior:
>
> (gdb) run
> Starting program: /tmp/foo
> warning: td_ta_new failed: generic error
>
> Digging a bit deeper, I found that if the two inferiors have different
> symbols, we can see that just after inferior 2 exits, we are left with
> inferior 2 selected, which is correct, but the symbols in scope belong
> to inferior 1, which is obviously incorrect...
>
> The problem is that there's a path in
> scoped_restore_current_thread::restore() that switches to no thread
> selected, and switches the current inferior, but leaves the current
> program space as is, so the current program space ends up being the
> wrong one (that of the other inferior). This was
> happening after handling TARGET_WAITKIND_NO_RESUMED, which is an event
> that triggers after TARGET_WAITKIND_EXITED for the previous inferior
> exit. Subsequent symbol lookups find the symbols of the wrong
> inferior.
>
> The fix is to use switch_to_inferior_no_thread in that problem spot.
> This function was recently added along with the multi-target work
> exactly for these situations.
>
> As for testing, this patch adds a new testcase that tests symbol
> printing just after inferior exit, which exercises the root cause of
> the problem more directly. And then, to cover the use case described
> in the bug too, it also exercises the libthread_db.so mis-loading, by
> using TLS printing as a proxy for being sure that threaded debugging
> was activated successfully. The testcase fails without the fix like
> this, for the "print symbol just after exit" bits:
>
> ...
> [Inferior 1 (process 8719) exited normally]
> (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: continue until exit
> print re_run_var_1
> No symbol "re_run_var_1" in current context.
> (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=1: print re_run_var_1
> ...
>
> And like this for the "libthread_db.so loading" bits:
>
> (gdb) run
> Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run
> warning: td_ta_new failed: generic error
> [New LWP 27001]
>
> Thread 1.1 "multi-re-run" hit Breakpoint 3, all_started () at /home/pedro/gdb/binutils-gdb/build/../src/gdb/testsuite/gdb.multi/multi-re-run.c:44
> 44 }
> (gdb) PASS: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: running to all_started in runto
> print tls_var
> Cannot find thread-local storage for LWP 27000, executable file /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.multi/multi-re-run/multi-re-run:
> Cannot find thread-local variables on this target
> (gdb) FAIL: gdb.multi/multi-re-run.exp: re_run_inf=1: iter=2: print tls_var

Thanks, that fixes it for me (and LGTM).

Simon
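
The root cause described in the quoted text (the current inferior is
switched, but the current program space is left pointing at the other
inferior's) can be illustrated with a small toy model. This is only a
sketch under stated assumptions: toy_program_space, toy_inferior,
switch_inferior_only and switch_inferior_and_pspace are hypothetical
names invented for illustration, and none of this is GDB's real code or
the real switch_to_inferior_no_thread implementation.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a program space: just a symbol table.
struct toy_program_space {
    std::map<std::string, int> symbols;
};

// Hypothetical stand-in for an inferior: it points at its own program space.
struct toy_inferior {
    int num;
    toy_program_space *pspace;
};

// Toy "current" globals, loosely modeled on the idea of a current
// inferior and a current program space that must stay in sync.
static toy_inferior *cur_inferior = nullptr;
static toy_program_space *cur_pspace = nullptr;

// Buggy path: switches the inferior but leaves the program space as is.
void switch_inferior_only(toy_inferior *inf) {
    cur_inferior = inf;
}

// Fixed path (in the spirit of the fix): switch both together.
void switch_inferior_and_pspace(toy_inferior *inf) {
    cur_inferior = inf;
    cur_pspace = inf->pspace;
}

// Symbol lookup always goes through the current program space.
bool symbol_in_scope(const std::string &name) {
    return cur_pspace->symbols.count(name) != 0;
}

// Start in inferior 1, switch to inferior 2 via the buggy or the fixed
// path, then report whether inferior 2's symbol can be found.
bool symbol_visible_after_switch(bool use_fix) {
    toy_program_space ps1, ps2;
    ps1.symbols["re_run_var_1"] = 1;
    ps2.symbols["re_run_var_2"] = 2;
    toy_inferior inf1{1, &ps1};
    toy_inferior inf2{2, &ps2};

    switch_inferior_and_pspace(&inf1);
    assert(symbol_in_scope("re_run_var_1"));

    if (use_fix)
        switch_inferior_and_pspace(&inf2);
    else
        switch_inferior_only(&inf2);

    bool visible = symbol_in_scope("re_run_var_2");

    // Clear the globals so they do not dangle once the locals go away.
    cur_inferior = nullptr;
    cur_pspace = nullptr;
    return visible;
}
```

With the buggy path, the lookup for inferior 2's symbol goes through
inferior 1's symbol table and fails, which is the toy analogue of the
'No symbol "re_run_var_1" in current context.' failure in the log above.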