From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-68527-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 23326 invoked by alias); 20 Nov 2009 14:15:36 -0000
Received: (qmail 23308 invoked by uid 22791); 20 Nov 2009 14:15:31 -0000
X-SWARE-Spam-Status: No, hits=-2.5 required=5.0 	tests=AWL,BAYES_00,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 20 Nov 2009 14:14:24 +0000
Received: (qmail 31509 invoked from network); 20 Nov 2009 14:14:22 -0000
Received: from unknown (HELO orlando) (pedro@127.0.0.2)   by mail.codesourcery.com with ESMTPA; 20 Nov 2009 14:14:22 -0000
From: Pedro Alves <pedro@codesourcery.com>
Subject: Re: hardware watchpoints in non-stop - "moribund"/delayed watchpoint traps
Date: Fri, 20 Nov 2009 14:15:00 -0000
User-Agent: KMail/1.9.10
References: <200911181445.38897.pedro@codesourcery.com> <200911181945.50684.pedro@codesourcery.com> <20091118205115.GA2301@host0.dyn.jankratochvil.net>
In-Reply-To: <20091118205115.GA2301@host0.dyn.jankratochvil.net>
MIME-Version: 1.0
Content-Disposition: inline
To: gdb-patches@sourceware.org
Content-Type: text/plain;   charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-Id: <200911201414.23316.pedro@codesourcery.com>
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2009-11/txt/msg00434.txt.bz2

Here's what I just checked in.

I've made two more changes: there's no use keeping moribund
locations around that won't be used.  And, I've also updated
the comment in breakpoint.c.  The "but not a correctness problem"
comment is too optimistic, on targets that need PC
adjustment on breakpoint traps.

-- 
Pedro Alves

2009-11-20  Pedro Alves  <pedro@codesourcery.com>

	* infrun.c (handle_inferior_event): Hardware hatchpoint traps are
	never random signals.
	* breakpoint.c (update_global_location_list): Always delete
	immediately delete hardware watchpoint locations and other
	locations whose target address isn't meaningful.  Update comment
	explaining the hazard of moribund locations.

---
 gdb/breakpoint.c |   61 +++++++++++++++++++++++++++++++++++++++++++------------
 gdb/infrun.c     |   16 ++++++++++++++
 2 files changed, 64 insertions(+), 13 deletions(-)

Index: src/gdb/infrun.c
===================================================================
--- src.orig/gdb/infrun.c	2009-11-20 12:45:54.000000000 +0000
+++ src/gdb/infrun.c	2009-11-20 12:46:18.000000000 +0000
@@ -3593,6 +3593,21 @@ targets should add new threads to the th
 	 function.  */
       stop_print_frame = 1;
 
+      /* This is where we handle "moribund" watchpoints.  Unlike
+	 software breakpoints traps, hardware watchpoint traps are
+	 always distinguishable from random traps.  If no high-level
+	 watchpoint is associated with the reported stop data address
+	 anymore, then the bpstat does not explain the signal ---
+	 simply make sure to ignore it if `stopped_by_watchpoint' is
+	 set.  */
+
+      if (debug_infrun
+	  && ecs->event_thread->stop_signal == TARGET_SIGNAL_TRAP
+	  && !bpstat_explains_signal (ecs->event_thread->stop_bpstat)
+	  && stopped_by_watchpoint)
+	fprintf_unfiltered (gdb_stdlog, "\
+infrun: no user watchpoint explains watchpoint SIGTRAP, ignoring\n");
+
       /* NOTE: cagney/2003-03-29: These two checks for a random signal
          at one stage in the past included checks for an inferior
          function call's call dummy's return breakpoint.  The original
@@ -3616,6 +3631,7 @@ targets should add new threads to the th
       if (ecs->event_thread->stop_signal == TARGET_SIGNAL_TRAP)
 	ecs->random_signal
 	  = !(bpstat_explains_signal (ecs->event_thread->stop_bpstat)
+	      || stopped_by_watchpoint
 	      || ecs->event_thread->trap_expected
 	      || (ecs->event_thread->step_range_end
 		  && ecs->event_thread->step_resume_breakpoint == NULL));
Index: src/gdb/breakpoint.c
===================================================================
--- src.orig/gdb/breakpoint.c	2009-11-20 12:45:54.000000000 +0000
+++ src/gdb/breakpoint.c	2009-11-20 12:46:18.000000000 +0000
@@ -8300,20 +8300,55 @@ update_global_location_list (int should_
 
       if (!found_object)
 	{
-	  if (removed && non_stop)
+	  if (removed && non_stop
+	      && breakpoint_address_is_meaningful (old_loc->owner)
+	      && !is_hardware_watchpoint (old_loc->owner))
 	    {
-	      /* This location was removed from the targets.  In non-stop mode,
-		 a race condition is possible where we've removed a breakpoint,
-		 but stop events for that breakpoint are already queued and will
-		 arrive later.  To suppress spurious SIGTRAPs reported to user,
-		 we keep this breakpoint location for a bit, and will retire it
-		 after we see 3 * thread_count events.
-		 The theory here is that reporting of events should,
-		 "on the average", be fair, so after that many event we'll see
-		 events from all threads that have anything of interest, and no
-		 longer need to keep this breakpoint.  This is just a
-		 heuristic, but if it's wrong, we'll report unexpected SIGTRAP,
-		 which is usability issue, but not a correctness problem.  */
+	      /* This location was removed from the target.  In
+		 non-stop mode, a race condition is possible where
+		 we've removed a breakpoint, but stop events for that
+		 breakpoint are already queued and will arrive later.
+		 We apply an heuristic to be able to distinguish such
+		 SIGTRAPs from other random SIGTRAPs: we keep this
+		 breakpoint location for a bit, and will retire it
+		 after we see some number of events.  The theory here
+		 is that reporting of events should, "on the average",
+		 be fair, so after a while we'll see events from all
+		 threads that have anything of interest, and no longer
+		 need to keep this breakpoint location around.  We
+		 don't hold locations forever so to reduce chances of
+		 mistaking a non-breakpoint SIGTRAP for a breakpoint
+		 SIGTRAP.
+
+		 The heuristic failing can be disastrous on
+		 decr_pc_after_break targets.
+
+		 On decr_pc_after_break targets, like e.g., x86-linux,
+		 if we fail to recognize a late breakpoint SIGTRAP,
+		 because events_till_retirement has reached 0 too
+		 soon, we'll fail to do the PC adjustment, and report
+		 a random SIGTRAP to the user.  When the user resumes
+		 the inferior, it will most likely immediately crash
+		 with SIGILL/SIGBUS/SEGSEGV, or worse, get silently
+		 corrupted, because of being resumed e.g., in the
+		 middle of a multi-byte instruction, or skipped a
+		 one-byte instruction.  This was actually seen happen
+		 on native x86-linux, and should be less rare on
+		 targets that do not support new thread events, like
+		 remote, due to the heuristic depending on
+		 thread_count.
+
+		 Mistaking a random SIGTRAP for a breakpoint trap
+		 causes similar symptoms (PC adjustment applied when
+		 it shouldn't), but then again, playing with SIGTRAPs
+		 behind the debugger's back is asking for trouble.
+
+		 Since hardware watchpoint traps are always
+		 distinguishable from other traps, so we don't need to
+		 apply keep hardware watchpoint moribund locations
+		 around.  We simply always ignore hardware watchpoint
+		 traps we can no longer explain.  */
+
 	      old_loc->events_till_retirement = 3 * (thread_count () + 1);
 	      old_loc->owner = NULL;