From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-24111-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 2040 invoked by alias); 27 Jan 2006 18:04:34 -0000
Received: (qmail 2029 invoked by uid 22791); 27 Jan 2006 18:04:33 -0000
X-Spam-Check-By: sourceware.org
Received: from nevyn.them.org (HELO nevyn.them.org) (66.93.172.17)     by sourceware.org (qpsmtpd/0.31.1) with ESMTP; Fri, 27 Jan 2006 18:04:31 +0000
Received: from drow by nevyn.them.org with local (Exim 4.54) 	id 1F2XxV-0004CK-OE 	for gdb@sourceware.org; Fri, 27 Jan 2006 13:04:29 -0500
Date: Fri, 27 Jan 2006 18:41:00 -0000
From: Daniel Jacobowitz <drow@false.org>
To: gdb@sourceware.org
Subject: Re: Using XML in GDB?
Message-ID: <20060127180429.GA15726@nevyn.them.org>
Mail-Followup-To: gdb@sourceware.org
References: <20060126055744.GA29647@nevyn.them.org> <u8xt1efg5.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <u8xt1efg5.fsf@gnu.org>
User-Agent: Mutt/1.5.8i
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-01/txt/msg00311.txt.bz2

On Fri, Jan 27, 2006 at 07:41:30PM +0200, Eli Zaretskii wrote:
> > Date: Thu, 26 Jan 2006 00:57:44 -0500
> > From: Daniel Jacobowitz <drow@false.org>
> > 
> > Does anyone have a good reason why GDB should not make use of this
> > well-standardized format instead of inventing additional ad-hoc formats?
> 
> I'd like to see some spec, if you please.  It's hard to make up one's
> mind without at least that, since XML is not the ideal format for all
> and every type of communications.  It has disadvantages, some of them
> were already mentioned in this thread: it's very wordy, and existing
> implementations are not guaranteed to exist on every supported
> platform (telling users to download and install additional packages as
> a prerequisite to building GDB is a mild nuisance).

[As mentioned earlier, I do not intend to add any requirements for
building GDB; I would include it in-tree, unless someone had a good
reason not to do so.  I'm one of the folks who would get annoyed at
additional build requirements :-)]

> So I'd like to weigh these demerits against the advantages, and that
> is only possible if some kind of specification for this specific
> interface which we are discussing is available.
> 
> I've read your description from May (and just re-read it now), and it
> sounds like the number of different entities we need to communicate is
> quite small and their structure is simple.  So why did you come to the
> conclusion that you needed something like XML to express that
> interface?

At the moment I don't have a new specification; it's in too much flux
as the implementation continues to reveal issues I hadn't considered
adequately.  However, I can definitely explain what brought me to this
conclusion!

The proposal I sketched out in May is indeed simple.  The bits with
"architecture-specific data" and "registers tags" are the only hints
that it may need to evolve in the future.  However, there are a lot of
other things which GDB would like to know about targets, in an ideal
world.  A great example that GDB doesn't make use of today, but
could, is Andrew Stubbs' mention of memory maps, paired with the recent
report of backtrace blowing up on uclinux.  If we can receive the
memory map from the remote stub, we can guarantee that we never send
reads outside of RAM when we wanted RAM access.
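
To make the memory map idea concrete, here is a minimal sketch of the
check a debugger could do before issuing a read.  The region list, the
addresses and the function name are all assumptions for illustration;
nothing here reflects an existing GDB interface.

```python
# Hypothetical memory map: (start, length) pairs describing RAM regions,
# as they might be reported by a remote stub.
RAM_REGIONS = [(0x20000000, 0x10000)]


def read_is_within_ram(addr, size, regions=RAM_REGIONS):
    """Return True if [addr, addr + size) lies entirely inside one RAM region."""
    return any(
        start <= addr and addr + size <= start + length
        for start, length in regions
    )


inside = read_is_within_ram(0x20000000, 4)     # fully inside the region
straddle = read_is_within_ram(0x2000FFFE, 4)   # runs past the end of RAM
outside = read_is_within_ram(0x0, 4)           # not in any known region
```

A read that fails this check could be refused locally instead of being
sent to the target.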

My original sketch did allow for this; the feature: and reg: prefixes
existed so that later prefixes like memory: could be added.  But then
for every new data type we don't just have to define the contents but
also the encoding.  With XML, we only have to define the contents.
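
As a rough sketch of what "only define the contents" could look like:
the element and attribute names below are invented for the example (no
schema has been settled), and Python's standard ElementTree stands in
for whatever parser GDB would actually carry in-tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are assumptions.
DOC = """
<target>
  <feature name="gdb.arm.vfpv2_user">
    <reg name="fpscr" bitsize="32"/>
  </feature>
  <memory type="ram" start="0x20000000" length="0x10000"/>
</target>
""".strip()

root = ET.fromstring(DOC)

# Registers and memory regions come out of the same generic parser;
# adding <memory> needed no new quoting or separator rules.
regs = [(r.get("name"), int(r.get("bitsize"))) for r in root.iter("reg")]
regions = [
    (m.get("type"), int(m.get("start"), 16), int(m.get("length"), 16))
    for m in root.iter("memory")
]
```

The point is only that a new kind of data means new element names, not
a new encoding.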

The thing that pushed me over the edge this week was a combination of
two additions: files containing the same data, for GDB to parse in
advance without querying the target, and a "description" field.

The existing format is extremely file-unfriendly because it uses colon
as a field separator and semicolon as a list separator, with no newlines.
So the file becomes a single, awkwardly long line.  Otherwise the
parser has to tolerate newlines, and that complicates the set of
characters which have to be handled specially.
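
A small sketch of the newline problem, using a made-up encoding in the
same spirit (colon between fields, semicolon between entries).  The
field layout and the register names are assumptions; the real format
from the May proposal is not reproduced here.

```python
def parse_description(text):
    """Split a ';'-separated list of ':'-separated fields (no newline handling)."""
    entries = []
    for item in text.split(";"):
        if item:  # skip the empty item left by a trailing ';'
            entries.append(item.split(":"))
    return entries


# Hypothetical description as it would travel over the wire: one long line.
one_line = "feature:gdb.arm.vfpv2_user;reg:fpscr:32;reg:d0:64"
parsed = parse_description(one_line)

# The same data wrapped across lines to make a readable file.
in_file = "feature:gdb.arm.vfpv2_user;\nreg:fpscr:32;\nreg:d0:64\n"
naive = parse_description(in_file)

# The newlines leak into the field values unless the parser strips them.
leaked_tag = naive[1][0]
leaked_value = naive[2][2]
```

Once the parser strips newlines to fix this, a newline can no longer
appear inside a field without some escaping rule.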

And it would be nice if we could offer a user-friendly description of
target-provided features - especially if we can pass it through to an
IDE.  Instead of an opaque "gdb.arm.vfpv2_user", you ought to be able
to view details about "ARM VFP unit (version 2, user mode)" and have
your debugger explain to you that the enabled interrupts are stored in
particular bits of FPSCR.

If we're going to do that, it would be a real shame not to consider
localization; most ARM system programmers can probably manage the
English names of the registers, but if we want to offer help text,
being able to provide it in Japanese is a big win.  So that means
character encodings, and in turn that means we need to be somewhat
careful with the contents of descriptions.

Combining this with files provided the third straw: now a single
register description might contain embedded newlines!  The question
of field separators has gotten even more complicated.
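
For illustration, a sketch of how XML sidesteps this: a description
containing a newline, both old separator characters, markup characters
and non-ASCII text survives a round trip unchanged.  The element names,
the "lang" attribute and the sample text are assumptions for the
example, and ElementTree again stands in for the eventual parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical register entry with a localized, multi-line description.
reg = ET.Element("reg", name="fpscr")
desc = ET.SubElement(reg, "description")
desc.set("lang", "ja")
original = "浮動小数点ステータス/制御レジスタ\nbits: IXE; UFE <enable> & more"
desc.text = original

# Serializing escapes '<', '>' and '&' and records the character encoding.
encoded = ET.tostring(reg, encoding="utf-8")

# Parsing restores the text, including the newline, ':' and ';'.
decoded = ET.fromstring(encoded).find("description").text
round_trip_ok = decoded == original
```

None of the characters needed special handling on our side; the
escaping and the encoding declaration come from the XML layer.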

The biggest win of XML, for me, is that there are standard answers to
all of these problems and standard tools for editing and
checking XML files.  I get to rip out a half-usable parser that I spent
several days writing, and integrate a new XML parser in probably about
the amount of time it would take me to finish the one I've got.  And
the documentation for this feature (which is not yet (re)written, but
which I intend to be fairly substantial) can focus on its semantics,
and just reference the relevant bits of XML.

-- 
Daniel Jacobowitz
CodeSourcery