From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-37490-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 30153 invoked by alias); 4 Jun 2010 23:00:10 -0000
Received: (qmail 30135 invoked by uid 22791); 4 Jun 2010 23:00:09 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0	tests=AWL,BAYES_00,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from smtp-outbound-2.vmware.com (HELO smtp-outbound-2.vmware.com) (65.115.85.73)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 04 Jun 2010 23:00:03 +0000
Received: from jupiter.vmware.com (mailhost5.vmware.com [10.16.68.131])	by smtp-outbound-2.vmware.com (Postfix) with ESMTP id D399C24087;	Fri,  4 Jun 2010 16:00:00 -0700 (PDT)
Received: from msnyder-server.eng.vmware.com (promd-2s-dhcp138.eng.vmware.com [10.20.124.138])	by jupiter.vmware.com (Postfix) with ESMTP id CA7F5DC0DF;	Fri,  4 Jun 2010 16:00:00 -0700 (PDT)
Message-ID: <4C098570.7040108@vmware.com>
Date: Fri, 04 Jun 2010 23:00:00 -0000
From: Michael Snyder <msnyder@vmware.com>
User-Agent: Thunderbird 2.0.0.22 (X11/20090609)
MIME-Version: 1.0
To: Stan Shebs <stan@codesourcery.com>
CC: "gdb@sourceware.org" <gdb@sourceware.org>
Subject: Re: [RFC] Collecting strings at tracepoints
References: <4C0983C3.6000604@codesourcery.com>
In-Reply-To: <4C0983C3.6000604@codesourcery.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2010-06/txt/msg00019.txt.bz2

Stan Shebs wrote:
> Collection of strings is a problem for tracepoint users, because the 
> literal interpretation of "collect str", where str is a char *, is to 
> collect the address of the string, but not any of its contents.  It is 
> possible to use the '@' syntax to get some contents, for instance 
> "collect str@40" acquires the first 40 characters, but it is a poor 
> approximation; if the string is shorter than that, you collect more than 
> necessary, and possibly run into access trouble if str+40 is outside the 
> program's address space, or else the string is longer, in which case you 
> may miss the part you really wanted.
> 
> For normal printing of strings GDB has a couple tricks it does.  First, 
> it explicitly recognizes types that are pointers to chars, and 
> automatically dereferences and prints the bytes it finds.  Second, the 
> print elements limit prevents excessive output in case the string is long.
> 
> For tracepoint collection, I think the automatic heuristic is probably 
> not a good idea.  In interactive use, if you print too much string, or 
> just wanted to see the address, there's no harm in displaying extra 
> data.  But for tracing, the user needs a little more control, so that 
> the buffer doesn't inadvertently fill up too soon.  So I think that 
> means that we should have the user explicitly request collection of 
> string contents.
> 
> Looking at how '@' syntax works, we can extend it without disrupting 
> expression parsing much.  For instance, "str@@" could mean to dereference 
> str, and collect bytes until a 0 is seen, or the print elements limit is 
> reached (implication is that we would have to tell the target that 
> number).  The user could exercise even finer control by supplying the 
> limit explicitly, for instance "str@/80" to collect at most 80 chars of 
> the string.  ("str@@80" seems like it would cause ambiguity problems vs 
> "str@@").
> 
> This extended syntax could work for the print command too, in lieu of 
> tweaking the print element limit, and for types that GDB does not 
> recognize as a string type.
> 
> Under the hood, it's not yet clear if we will need additional bytecodes, 
> but probably so.

Another possibility might be an option to the "collect" command,
e.g.:

    collect /s foo    (collect foo as a string)

It does seem like an additional bytecode would be needed.
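
To make the difference between the two collection styles concrete, here is a rough sketch in Python. It is only a model of the semantics discussed above, not GDB or target-agent code: target memory is assumed to be a flat bytes object, and the names collect_fixed and collect_string are hypothetical. Whether the terminating 0 byte counts toward the limit is also an assumption made here for illustration.

```python
def collect_fixed(memory: bytes, addr: int, count: int) -> bytes:
    """Model of "collect str@40": always read exactly `count` bytes."""
    if addr + count > len(memory):
        # Corresponds to str+40 falling outside the program's address space.
        raise MemoryError("read past end of address space")
    return memory[addr:addr + count]


def collect_string(memory: bytes, addr: int, limit: int) -> bytes:
    """Model of a string collect: stop at a 0 byte or after `limit` bytes.

    Assumption: the terminating 0 is collected and counts toward `limit`.
    """
    out = bytearray()
    while len(out) < limit and addr + len(out) < len(memory):
        byte = memory[addr + len(out)]
        out.append(byte)
        if byte == 0:
            break
    return bytes(out)


# Hypothetical target memory: a short string near the end of the address space.
memory = b"xx" + b"hello\0" + b"tail"
```

With this model, collect_string(memory, 2, 80) picks up only the six bytes of "hello" plus its terminator, while collect_fixed(memory, 2, 40) fails because the fixed-length read runs off the end of memory. Either the "str@@" syntax or a "collect /s" option could map onto the second function; the sketch does not favor one spelling over the other.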