From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-81941-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 19510 invoked by alias); 1 Jul 2011 18:06:59 -0000
Received: (qmail 19502 invoked by uid 22791); 1 Jul 2011 18:06:58 -0000
X-SWARE-Spam-Status: No, hits=-6.8 required=5.0	tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,SPF_HELO_PASS,TW_BJ,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 01 Jul 2011 18:06:37 +0000
Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p61I6X2i015293	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)	for <gdb-patches@sourceware.org>; Fri, 1 Jul 2011 14:06:37 -0400
Received: from ns3.rdu.redhat.com (ns3.rdu.redhat.com [10.11.255.199])	by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p61Gd12a008118;	Fri, 1 Jul 2011 12:39:01 -0400
Received: from barimba (ovpn01.gateway.prod.ext.phx2.redhat.com [10.5.9.1])	by ns3.rdu.redhat.com (8.13.8/8.13.8) with ESMTP id p61Gd0G3003151;	Fri, 1 Jul 2011 12:39:00 -0400
From: Tom Tromey <tromey@redhat.com>
To: gdb-patches@sourceware.org
Subject: Re: [RFC] canonical linespec and multiple breakpoints ...
References: <20110505162855.GA2546@adacore.com>	<m3oc3gx48l.fsf@fleche.redhat.com> <83bozgmhil.fsf@gnu.org>	<m3oc2pxjds.fsf@fleche.redhat.com> <83k4dcd1bh.fsf@gnu.org>	<m3r56bdoh9.fsf@fleche.redhat.com>
Date: Fri, 01 Jul 2011 18:06:00 -0000
In-Reply-To: <m3r56bdoh9.fsf@fleche.redhat.com> (Tom Tromey's message of "Thu,	30 Jun 2011 15:00:02 -0600")
Message-ID: <m362nmarbv.fsf@fleche.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2011-07/txt/msg00036.txt.bz2

Here is my proposal for how to handle ambiguous linespecs.  First I'll
describe it in general terms, with a few specifics about the
implementation.  Then I'll describe how it handles each scenario I
posted yesterday.

I would appreciate comments on this.  This is what I intend to work on
starting next Monday, and I would not like to be in the position where I
put in a lot of work and then have to redo it all.

I welcome contrasting proposals, particularly ones that address all
the scenarios identified thus far.  I think that partial solutions are
much less interesting -- it is easy to do well on any particular facet
of the problem, but it is much tougher, I think, to come up with a
complete solution.


I propose a simple rule for the handling of ambiguous linespecs: a
breakpoint whose argument is ambiguous will fire at all matching
locations.  This rule has several properties that I consider desirable:

* It is simple to explain to users
* It is predictable
* It is time-invariant
* It is implementable ;-)
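
To make the rule concrete, here is a minimal sketch in C.  It is not
existing gdb code: the struct and function names are hypothetical, and
a flat array stands in for the real symbol tables.  The point is only
that resolution collects every match and never prefers one over another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one function symbol known to the debugger.  */
struct sketch_symbol
{
  const char *objfile;     /* Object file that defines the symbol.  */
  const char *name;        /* Function name as the user would type it.  */
  unsigned long address;   /* Where a breakpoint would be inserted.  */
};

/* Store in OUT the address of every symbol in TABLE named NAME, up to
   MAX_OUT entries.  Return the total number of matches, which may
   exceed MAX_OUT.  No match is preferred over another.  */
size_t
resolve_all_locations (const struct sketch_symbol *table, size_t n,
                       const char *name,
                       unsigned long *out, size_t max_out)
{
  size_t found = 0;

  assert (table != NULL && name != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].name, name) == 0)
      {
        if (found < max_out)
          out[found] = table[i].address;
        found++;
      }
  return found;
}
```

Because the result depends only on the set of symbols currently known,
re-running the same resolution after an inferior change yields the
locations that should exist at that moment.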

In addition to this rule I think we should introduce some new linespec
syntax, to let users more precisely narrow down breakpoints.  In
particular I would like to add objfile names as a prefix, like:

  (gdb) break libc.so:malloc

I think I/T sets are also a good idea here, for the multi-inferior case.
These have been discussed several times and I think don't need much more
discussion.

In order to properly re-set breakpoints, we need a canonical form of the
linespec.  Currently this is done by constructing a new canonicalized
linespec.  In my proposal we will replace this with a structure, the
better to add more precise behavior without needing to construct
linespec syntax for every possible case.  E.g., we can have a bit
indicating whether this canonical linespec matches symbols without
debuginfo.
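
As a rough illustration of what such a structure could look like, here
is a sketch in C.  Every name in it is an assumption made for the
example (nothing here exists in gdb today), and the exact set of fields
is open for discussion; only the "matches symbols without debuginfo"
bit comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical canonical form of a linespec.  A NULL string or a zero
   line means "not specified", so that field matches anything.  */
struct sketch_canonical_linespec
{
  const char *objfile_name;
  const char *source_file;
  const char *function_name;
  int line;
  unsigned int match_without_debuginfo : 1;  /* The bit described above.  */
};

/* Hypothetical description of one place a breakpoint could go.  */
struct sketch_candidate
{
  const char *objfile_name;
  const char *source_file;   /* NULL when there is no debuginfo.  */
  const char *function_name;
  int line;
  bool has_debuginfo;
};

/* An unspecified (NULL) WANT matches any HAVE.  */
static bool
field_matches (const char *want, const char *have)
{
  return want == NULL || (have != NULL && strcmp (want, have) == 0);
}

/* Return true if candidate C is selected by SPEC.  */
bool
sketch_linespec_matches (const struct sketch_canonical_linespec *spec,
                         const struct sketch_candidate *c)
{
  assert (spec != NULL && c != NULL);
  if (!c->has_debuginfo && !spec->match_without_debuginfo)
    return false;
  if (spec->line != 0 && spec->line != c->line)
    return false;
  return (field_matches (spec->objfile_name, c->objfile_name)
          && field_matches (spec->source_file, c->source_file)
          && field_matches (spec->function_name, c->function_name));
}
```

Re-setting a breakpoint then amounts to running each candidate through
the matcher, with no need to print and re-parse linespec text.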

I am not sure how the breakpoint location information should be arranged
internally.  Perhaps each bp_location will need a canonical linespec
struct.

We'll continue to respect 'set multiple-symbols ask' -- in fact, we will
effectively expand its use by noticing more ambiguity in the first place.
This setting will provide users a way to easily create multiple
breakpoints from a single linespec (the "all" choice).

Linespecs are used in places other than "break", like "advance".  My
full list plus how to handle ambiguity:

* list, edit: give a menu.
* func: rewrite; I'm not sure why this even uses linespec
* until, advance: use the current context to filter the results; error
  if it is still ambiguous
* clear: work like breakpoints
* select_source_symtab: rewrite
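
For the until/advance case, the filtering could be sketched as follows.
This is a sketch under a stated assumption: I take "current context" to
mean the function of the selected frame, which is only one possible
reading.  All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum pick_status { PICK_OK, PICK_NO_MATCH, PICK_AMBIGUOUS };

/* Hypothetical: one location that the linespec resolved to.  */
struct sketch_resolved
{
  const char *function;    /* Function containing this location.  */
  unsigned long address;
};

/* Keep only the candidates inside CURRENT_FUNCTION (assumed here to be
   the function of the selected frame).  On PICK_OK, store the single
   surviving address in *ADDRESS_OUT; otherwise leave it untouched so
   the caller can report an error.  */
enum pick_status
pick_in_current_function (const struct sketch_resolved *cands, size_t n,
                          const char *current_function,
                          unsigned long *address_out)
{
  size_t matches = 0;
  unsigned long first = 0;

  assert (cands != NULL && current_function != NULL && address_out != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (cands[i].function, current_function) == 0)
      {
        if (matches == 0)
          first = cands[i].address;
        matches++;
      }

  if (matches == 0)
    return PICK_NO_MATCH;
  if (matches > 1)
    return PICK_AMBIGUOUS;
  *address_out = first;
  return PICK_OK;
}
```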

One might note that this approach has a built-in inefficiency: nearly
every inferior change will require reading and scanning debuginfo (and
this at a time when we are trying to move toward lazier reading).  I
think this is ok provided that we provide users with better ways to
precisely specify linespecs; and also perhaps by modifying the internal
code to be smarter about computing re-sets depending on different kinds
of events -- like just dropping some locations rather than fully
re-evaluating a breakpoint when an objfile goes away.
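
The cheaper kind of re-set mentioned above might look roughly like
this.  It is a sketch with hypothetical names; a real breakpoint
location carries far more state than the two fields assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down breakpoint location.  */
struct sketch_location
{
  const char *objfile;     /* Object file the location belongs to.  */
  unsigned long address;
};

/* Remove from LOCS every location that belongs to OBJFILE, compacting
   the array in place and preserving the order of the survivors.
   Return the new number of locations.  No debuginfo is read.  */
size_t
drop_locations_for_objfile (struct sketch_location *locs, size_t n,
                            const char *objfile)
{
  size_t kept = 0;

  assert (locs != NULL && objfile != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (locs[i].objfile, objfile) != 0)
      {
        if (kept != i)
          locs[kept] = locs[i];
        kept++;
      }
  return kept;
}
```

An objfile being added would still need a real lookup, but only in the
new objfile rather than in everything already loaded.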

The last part of the proposal is providing a way to modify a
breakpoint's linespec after the breakpoint has been made.  The rationale
for this is that it gives users a way to recover from ambiguity when a
condition is attached.  See below.


Now on to the scenarios:

Tom> 1. Break on an ambiguously-named function.
Tom>     $ gdb gdb
Tom>     (gdb) break parse_number

Make a single breakpoint with 7 locations -- one for each function in
gdb with this name.

Tom> 2. The same, but with a file:line.

Make a single breakpoint with multiple locations, just as today.

Tom> 3. Set a breakpoint at a SystemTap probe, the thing that started all this:
Tom>     (gdb) break probe:longjmp
Tom>    This one (in Fedora 15 glibc) happens to have 2 locations.

Likewise.

Tom> 4. Set a pending breakpoint.
Tom>    (gdb) break lib_function
Tom> 5. The same, but the pending name is ambiguous.

These both make a pending breakpoint; inferior changes may cause
locations to be added or removed.

If the name has no matches when 'break' is invoked, then we would
continue to respect 'set breakpoint pending'.

Tom> 6. Set a breakpoint that has a match in the inferior.  Then the inferior
Tom>    loads a .so to make the linespec ambiguous.

Add new locations to the breakpoint.

Tom> 7. 'break main', then start multiple inferiors

Each new inferior causes new locations to be added; gdb stops at the
'main' of each one.

Tom> 8. Any of the above, but with a breakpoint condition.
Tom>    Of particular importance is the case where the condition cannot be
Tom>    parsed in one location.

This, I think, is the one bad part of this proposal.  When a location is
added to a breakpoint, it may cause the condition to no longer be
parseable.

In this situation I think we should change gdb in 2 ways, to give the
user more control:

1. When a condition is no longer parseable, provide the option to stop
   the inferior so the user can correct it.
2. Also provide a way for the user to respecify the breakpoint location.

For the latter I am thinking of something like:

  modify breakpoint N location LINESPEC

Tom> 9. An ambiguous linespec where one match has debuginfo and another does
Tom>    not.

Handled by the canonical linespec struct.

Tom