From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-86031-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 10007 invoked by alias); 12 Dec 2011 16:57:06 -0000
Received: (qmail 9993 invoked by uid 22791); 12 Dec 2011 16:57:04 -0000
X-SWARE-Spam-Status: No, hits=-1.8 required=5.0	tests=AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 12 Dec 2011 16:56:51 +0000
Received: from nat-ies.mentorg.com ([192.94.31.2] helo=EU1-MAIL.mgc.mentorg.com)	by relay1.mentorg.com with esmtp 	id 1Ra9BI-0001Ez-QL from pedro_alves@mentor.com ; Mon, 12 Dec 2011 08:56:49 -0800
Received: from scottsdale.localnet ([172.16.63.104]) by EU1-MAIL.mgc.mentorg.com with Microsoft SMTPSVC(6.0.3790.1830);	 Mon, 12 Dec 2011 16:56:46 +0000
From: Pedro Alves <pedro@codesourcery.com>
To: Jan Kratochvil <jan.kratochvil@redhat.com>
Subject: Re: [PATCH] PR threads/10729: x86 hw watchpoints and non-stop mode
Date: Mon, 12 Dec 2011 17:23:00 -0000
User-Agent: KMail/1.13.6 (Linux/2.6.38-13-generic; KDE/4.7.2; x86_64; ; )
Cc: Tristan Gingold <gingold@adacore.com>, gdb-patches@sourceware.org
References: <201112051601.59664.pedro@codesourcery.com> <201112091630.20916.pedro@codesourcery.com> <20111211233811.GA27629@host2.jankratochvil.net>
In-Reply-To: <20111211233811.GA27629@host2.jankratochvil.net>
MIME-Version: 1.0
Content-Type: Text/Plain;  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201112121656.44478.pedro@codesourcery.com>
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2011-12/txt/msg00350.txt.bz2

On Sunday 11 December 2011 23:38:11, Jan Kratochvil wrote:
> Hi Pedro,
> 
> on simple non-hit (inferior does not touch "j" at all) watchpoint case:
> strace -o 2 -q ./gdb -nx ./36 -ex start -ex 'watch j' -ex stepi -ex 'set confirm no' -ex q
> grep 'PTRACE_....USER' 1 >1b; grep 'PTRACE_....USER' 2 >2b
> 
> It has performance regression of 15 ptrace syscalls -> 27 ptrace syscalls.

Hmm.  Before, a single step (si) (instead of all syscalls fro the
begining) does:

$ tail -f 1 | grep PTRACE_P...USER
ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg), 0x601028) = 0
ptrace(PTRACE_PEEKUSER, 15618, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0
ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 48, 0x4000) = 0
ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0
ptrace(PTRACE_PEEKUSER, 15618, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0
ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg), 0) = 0
ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 56, 0xd0100) = 0

So, 7 related ptrace calls.

After the patch, just one single-step does:

$ tail -f 2 | grep PTRACE_P...USER

ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg), 0x601028) = 0
ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 8, 0) = 0
ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 16, 0) = 0
ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 24, 0) = 0
ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0
ptrace(PTRACE_PEEKUSER, 16139, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0
ptrace(PTRACE_PEEKUSER, 16139, offsetof(struct user, u_debugreg) + 56, [0xd0101]) = 0

Still 7 related ptrace syscalls.

The 15 -> 27 jump in all PTRACE_P...USER syscalls is because for all stops,
we're now reading both DR_STATUS and DR_CONTROL, while before
we were only reading DR_STATUS.  And there are stops in starting up an inferior
until it reaches main.  That looks easily fixed.

In addition, before, we'd only poke to DR[N] if the register was used,
now (i386|amd64)_linux_prepare_to_resume always writes all of DR0-3,
even when only one DR is used.  That's looks easily fixed too.

I'm trying out this patch:

 gdb/amd64-linux-nat.c |    3 ++-
 gdb/i386-nat.c        |   26 ++++++++++++++++++--------
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index 9699f84..dc6a735 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -390,7 +390,8 @@ amd64_linux_prepare_to_resume (struct lwp_info *lwp)
       struct i386_debug_reg_state *state = i386_debug_reg_state ();
 
       for (i = DR_FIRSTADDR; i <= DR_LASTADDR; i++)
-	amd64_linux_dr_set (lwp->ptid, i, state->dr_mirror[i]);
+	if (state->dr_ref_count[i] > 0)
+	  amd64_linux_dr_set (lwp->ptid, i, state->dr_mirror[i]);
 
       amd64_linux_dr_set (lwp->ptid, DR_CONTROL, state->dr_control_mirror);
 
diff --git a/gdb/i386-nat.c b/gdb/i386-nat.c
index 94306a1..6d59b14 100644
--- a/gdb/i386-nat.c
+++ b/gdb/i386-nat.c
@@ -607,22 +607,32 @@ i386_stopped_data_address (struct target_ops *ops, CORE_ADDR *addr_p)
   int rc = 0;
   unsigned status;
   unsigned control;
+  unsigned control_p = 0;
 
   /* Get the current values the inferior has.  If the thread was
      running when we last changed watchpoints, the mirror no longer
      represents what was set in this thread's debug registers.  */
   status = i386_dr_low.get_status ();
-  control = i386_dr_low.get_control ();
 
   ALL_DEBUG_REGISTERS(i)
     {
-      if (I386_DR_WATCH_HIT (status, i)
-	  /* This second condition makes sure DRi is set up for a data
-	     watchpoint, not a hardware breakpoint.  The reason is
-	     that GDB doesn't call the target_stopped_data_address
-	     method except for data watchpoints.  In other words, I'm
-	     being paranoiac.  */
-	  && I386_DR_GET_RW_LEN (control, i) != 0)
+      if (!I386_DR_WATCH_HIT (status, i))
+	continue;
+
+      /* Fetching DR_CONTROL may require another syscall.  Avoid when
+	 possible.  */
+      if (!control_p)
+	{
+	  control = i386_dr_low.get_control ();
+	  control_p = 1;
+	}
+
+      /* This second condition makes sure DRi is set up for a data
+	 watchpoint, not a hardware breakpoint.  The reason is
+	 that GDB doesn't call the target_stopped_data_address
+	 method except for data watchpoints.  In other words, I'm
+	 being paranoiac.  */
+      if (I386_DR_GET_RW_LEN (control, i) != 0)
 	{
 	  addr = i386_dr_low.get_addr (i);
 	  rc = 1;


With those changes, we're actually even better than before.  Your
example when from 15 -> 12.  A single stepi that doesn't trigger a
watchpoint only does:

ptrace(PTRACE_POKEUSER, 23332, offsetof(struct user, u_debugreg), 0x601028) = 0
ptrace(PTRACE_POKEUSER, 23332, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0
ptrace(PTRACE_PEEKUSER, 23332, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0

That is, write DR0 and DR_CONTROL on resume, and, read DR_STATUS on stop.

> On Fri, 09 Dec 2011 17:30:20 +0100, Pedro Alves wrote:
> > @@ -513,22 +499,7 @@ i386_update_inferior_debug_regs (struct i386_debug_reg_state *new_state)
> >    ALL_DEBUG_REGISTERS (i)
> >      {
> >        if (I386_DR_VACANT (new_state, i) != I386_DR_VACANT (&dr_mirror, i))
> > -	{
> > -	  if (!I386_DR_VACANT (new_state, i))
> > -	    {
> > -	      i386_dr_low.set_addr (i, new_state->dr_mirror[i]);
> > -
> 
> 
> > -	      /* Only a sanity check for leftover bits (set possibly only
> > -		 by inferior).  */
> > -	      if (i386_dr_low.unset_status)
> > -		i386_dr_low.unset_status (I386_DR_WATCH_MASK (i));
> 
> Deleting this part is a regression.  Testcase for that part is attached.

Thanks a lot, I'll take a look.  I had assumed this bit:

  if (lwp->stopped_by_watchpoint)
    amd64_linux_dr_set (lwp->ptid, DR_STATUS, 0);

in amd64_linux_prepare_to_resume would fix the issue.

> 
> > +	  && I386_DR_GET_RW_LEN (control, i) != 0)
> >  	{
> > -	  addr = state->dr_mirror[i];
> > +	  addr = i386_dr_low.get_addr (i);
> 
> Why to do this change?  Why we can no longer trust DR_MIRROR?  This is
> a performance regression.

This is non-stop, so threads can be running while we change the
global state->dr_mirror (and friends).  Say, we set a watchpoint,
and let the threads rusume.  Now, say you delete the watchpoint, or
add/remove watchpoints such that state->dr_mirror[*] changes.  Inserting/deleting
watchpoines updates state->dr_mirror[*].  Now threads haven't been updated
with the mirror yet, and say a thread has meanwhile hit an old watchpoint,
but we haven't handled the SIGTRAP yet.  If we trusted state->dr_mirror[*],
we'd mistake the real trapped address to whatever was
currently state->dr_mirror[i].  So state->dr_mirror now represents
intention.  To get at the address that trapped, we need to read the
state the thread had when it trapped.  I'll add some comments to the code.

-- 
Pedro Alves