Subject: Re: [PATCH] AArch64: Racy: Don't set empty set of hardware BPs/WPs on new thread
To: Alan Hayward, "gdb-patches@sourceware.org"
References: <20181120115602.30606-1-alan.hayward@arm.com>
Cc: nd
From: Pedro Alves
Message-ID: <53201c24-5947-c29e-85cc-eff8e6ece2d4@redhat.com>
Date: Thu, 29 Nov 2018 14:32:00 -0000
In-Reply-To: <20181120115602.30606-1-alan.hayward@arm.com>

On 11/20/2018 11:56 AM, Alan Hayward wrote:
> [Note: I hope to eventually isolate the exact kernel/hardware issue and
> raise a bug elsewhere against that.]

Any luck with that so far?

A fix that makes a race window a bit narrower, while still leaving it open,
risks pushing down / delaying a proper fix, papering over the issue, and
making the bug harder to reproduce. But I won't object, especially since
this seems to reduce the number of syscalls gdb issues, so it could be seen
as an optimization and a good thing on its own.

LGTM, with nits below addressed.

>
> On some heavily loaded AArch64 boxes, GDB will sometimes hang forever when
> the inferior creates a thread. This hang happens inside the kernel during
> the ptrace call to set hardware watchpoints or hardware breakpoints.
> Currently, GDB will always set hw wp/bp at the start of each thread even if
> there are none set in the process.
>
> This patch works around the issue by avoiding setting hw wp/bp if there
> are none set for the process.
>
> On an affected machine, this fix drastically reduces the racy nature of the
> gdb.threads test set. I ran the entire gdb test suite across all processors
> for 100 iterations, then ran the results through the racy tests script.
> Without the patch, 58 .exp files in gdb.threads were marked as racy. After
> the patch this reduced to the same ~14 tests as the non-affected boxes.
>
> Clearly GDB will still be subject to hangs on an affected box if hw wp/bp's are
> used prior to creating inferior threads on a heavily loaded system.
>
> To enable this in gdbserver, the sequence in gdbserver add_lwp() is switched
> to the same order as gdb, to ensure the thread is registered before
> calling new_thread(). This allows aarch64_linux_new_thread() to read the
> ptid.
>
> gdb/ChangeLog:
>
> 2018-11-20 Alan Hayward
>
>  * nat/aarch64-linux-hw-point.c
>  (aarch64_linux_any_set_debug_regs_state): New function.
>  * nat/aarch64-linux-hw-point.h
>  (aarch64_linux_any_set_debug_regs_state): New declaration.
>  * nat/aarch64-linux.c (aarch64_linux_new_thread): Check if any
>  BPs or WPs are set.
>
> gdb/gdbserver/ChangeLog:
>
> 2018-11-20 Alan Hayward
>
>  * linux-low.c (add_lwp): Switch ordering.
> ---
>  gdb/gdbserver/linux-low.c        |  4 ++--
>  gdb/nat/aarch64-linux-hw-point.c | 23 +++++++++++++++++++++++
>  gdb/nat/aarch64-linux-hw-point.h |  6 ++++++
>  gdb/nat/aarch64-linux.c          | 15 ++++++++++-----
>  4 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 701f3e863c..1c336efade 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -951,11 +951,11 @@ add_lwp (ptid_t ptid)
>
>    lwp->waitstatus.kind = TARGET_WAITKIND_IGNORE;
>
> +  lwp->thread = add_thread (ptid, lwp);
> +
>    if (the_low_target.new_thread != NULL)
>      the_low_target.new_thread (lwp);
>
> -  lwp->thread = add_thread (ptid, lwp);
> -
>    return lwp;
>  }
>
> diff --git a/gdb/nat/aarch64-linux-hw-point.c b/gdb/nat/aarch64-linux-hw-point.c
> index 18b5af2d49..7f96fa2b7a 100644
> --- a/gdb/nat/aarch64-linux-hw-point.c
> +++ b/gdb/nat/aarch64-linux-hw-point.c
> @@ -717,6 +717,29 @@ aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
>      }
>  }
>
> +/* See nat/aarch64-linux-hw-point.h. */
> +
> +bool
> +aarch64_linux_any_set_debug_regs_state (struct aarch64_debug_reg_state *state,
> +                                        bool watchpoint)
> +{
> +  int i, count;
> +  const CORE_ADDR *addr;
> +  const unsigned int *ctrl;
> +
> +  count = watchpoint ? aarch64_num_wp_regs : aarch64_num_bp_regs;
> +  addr = watchpoint ? state->dr_addr_wp : state->dr_addr_bp;
> +  ctrl = watchpoint ? state->dr_ctrl_wp : state->dr_ctrl_bp;
> +  if (count == 0)
> +    return false;
> +
> +  for (i = 0; i < count; i++)

You could declare+initialize at the same time. I.e.:

  int count = watchpoint ? aarch64_num_wp_regs : aarch64_num_bp_regs;
  if (count == 0)
    return false;

  const CORE_ADDR *addr = watchpoint ? state->dr_addr_wp : state->dr_addr_bp;
  const unsigned int *ctrl = watchpoint ? state->dr_ctrl_wp : state->dr_ctrl_bp;

  for (int i = 0; i < count; i++)

> --- a/gdb/nat/aarch64-linux-hw-point.h
> +++ b/gdb/nat/aarch64-linux-hw-point.h
> @@ -182,6 +182,12 @@ int aarch64_handle_watchpoint (enum target_hw_bp_type type, CORE_ADDR addr,
>  void aarch64_linux_set_debug_regs (struct aarch64_debug_reg_state *state,
>                                     int tid, int watchpoint);
>
> +/* Return TRUE if there are any hardware breakpoints. If WATCHPOINT is TRUE,
> +   check hardware watchpoints instead. */
> +bool
> +aarch64_linux_any_set_debug_regs_state (struct aarch64_debug_reg_state *state,
> +                                        bool watchpoint);

The function name only goes in the first column in definitions, not in
declarations. See e.g., extension.h:

bool aarch64_linux_any_set_debug_regs_state
  (struct aarch64_debug_reg_state *state, bool watchpoint);

Or you could drop "struct" to fit under 80 cols:

bool aarch64_linux_any_set_debug_regs_state (aarch64_debug_reg_state *state,
                                             bool watchpoint);

> +
>  void aarch64_show_debug_reg_state (struct aarch64_debug_reg_state *state,
>                                     const char *func, CORE_ADDR addr,
>                                     int len, enum target_hw_bp_type type);
> diff --git a/gdb/nat/aarch64-linux.c b/gdb/nat/aarch64-linux.c
> index 0f2c8011ea..48c59f6288 100644
> --- a/gdb/nat/aarch64-linux.c
> +++ b/gdb/nat/aarch64-linux.c
> @@ -73,13 +73,18 @@ aarch64_linux_prepare_to_resume (struct lwp_info *lwp)
>  void
>  aarch64_linux_new_thread (struct lwp_info *lwp)
>  {
> +  ptid_t ptid = ptid_of_lwp (lwp);
> +  struct aarch64_debug_reg_state *state
> +    = aarch64_get_debug_reg_state (ptid.pid ());
>    struct arch_lwp_info *info = XNEW (struct arch_lwp_info);
>
> -  /* Mark that all the hardware breakpoint/watchpoint register pairs
> -     for this thread need to be initialized (with data from
> -     aarch_process_info.debug_reg_state). */
> -  DR_MARK_ALL_CHANGED (info->dr_changed_bp, aarch64_num_bp_regs);
> -  DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);
> +  /* If there are hardware breakpoints/watchpoints in the process then mark that
> +     all the hardware breakpoint/watchpoint register pairs for this thread need
> +     to be initialized (with data from aarch_process_info.debug_reg_state). */
> +  if (aarch64_linux_any_set_debug_regs_state (state, false))
> +    DR_MARK_ALL_CHANGED (info->dr_changed_bp, aarch64_num_bp_regs);
> +  if (aarch64_linux_any_set_debug_regs_state (state, true))
> +    DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);
>
>    lwp_set_arch_private_info (lwp, info);
>  }
>

Thanks,
Pedro Alves
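
P.S. For reference, here is a rough standalone sketch of the helper written in the declare+initialize style. It is only an illustration: the quoted hunk stops at the "for" line, so the loop body (treating a register pair as set when its address or control value is non-zero) is my assumption, and the CORE_ADDR typedef, the MAX_REGS bound, the struct layout and the register counts are hypothetical stand-ins so that it compiles outside of GDB.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical stand-ins for GDB's own definitions; the bound and the
   register counts below are made-up values for illustration only.  */
typedef std::uint64_t CORE_ADDR;
constexpr int MAX_REGS = 16;

struct aarch64_debug_reg_state
{
  CORE_ADDR dr_addr_bp[MAX_REGS];
  unsigned int dr_ctrl_bp[MAX_REGS];
  CORE_ADDR dr_addr_wp[MAX_REGS];
  unsigned int dr_ctrl_wp[MAX_REGS];
};

int aarch64_num_bp_regs = 6;
int aarch64_num_wp_regs = 4;

/* Return true if any hardware breakpoint register pair is in use.  If
   WATCHPOINT is true, check the hardware watchpoint registers instead.  */

bool
aarch64_linux_any_set_debug_regs_state (aarch64_debug_reg_state *state,
                                        bool watchpoint)
{
  int count = watchpoint ? aarch64_num_wp_regs : aarch64_num_bp_regs;
  if (count == 0)
    return false;

  const CORE_ADDR *addr = watchpoint ? state->dr_addr_wp : state->dr_addr_bp;
  const unsigned int *ctrl = watchpoint ? state->dr_ctrl_wp : state->dr_ctrl_bp;

  /* Assumed loop body: a pair counts as set if either value is non-zero.  */
  for (int i = 0; i < count; i++)
    if (addr[i] != 0 || ctrl[i] != 0)
      return true;

  return false;
}

/* Usage: a zeroed state reports nothing set; writing one watchpoint
   address changes only the answer to the watchpoint query.  */

void
example_usage ()
{
  aarch64_debug_reg_state state = {};
  assert (!aarch64_linux_any_set_debug_regs_state (&state, false));

  state.dr_addr_wp[1] = 0x1000;
  assert (aarch64_linux_any_set_debug_regs_state (&state, true));
  assert (!aarch64_linux_any_set_debug_regs_state (&state, false));
}
```

Moving the count == 0 check ahead of the pointer initializations also means the arrays are never touched when the target reports no debug registers of that kind.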