From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 2040 invoked by alias); 27 Jan 2006 18:04:34 -0000
Received: (qmail 2029 invoked by uid 22791); 27 Jan 2006 18:04:33 -0000
X-Spam-Check-By: sourceware.org
Received: from nevyn.them.org (HELO nevyn.them.org) (66.93.172.17)
    by sourceware.org (qpsmtpd/0.31.1) with ESMTP; Fri, 27 Jan 2006 18:04:31 +0000
Received: from drow by nevyn.them.org with local (Exim 4.54)
    id 1F2XxV-0004CK-OE for gdb@sourceware.org; Fri, 27 Jan 2006 13:04:29 -0500
Date: Fri, 27 Jan 2006 18:41:00 -0000
From: Daniel Jacobowitz
To: gdb@sourceware.org
Subject: Re: Using XML in GDB?
Message-ID: <20060127180429.GA15726@nevyn.them.org>
Mail-Followup-To: gdb@sourceware.org
References: <20060126055744.GA29647@nevyn.them.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.8i
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-01/txt/msg00311.txt.bz2

On Fri, Jan 27, 2006 at 07:41:30PM +0200, Eli Zaretskii wrote:
> > Date: Thu, 26 Jan 2006 00:57:44 -0500
> > From: Daniel Jacobowitz
> >
> > Does anyone have a good reason why GDB should not make use of this
> > well-standardized format instead of inventing additional ad-hoc formats?
>
> I'd like to see some spec, if you please.  It's hard to make up one's
> mind without at least that, since XML is not the ideal format for all
> and every type of communications.  It has disadvantages, some of them
> were already mentioned in this thread: it's very wordy, and existing
> implementations are not guaranteed to exist on every supported
> platform (telling users to download and install additional packages as
> a prerequisite to building GDB is a mild nuisance).

[As mentioned earlier, I do not intend to add any requirements for
building GDB; I would include the XML parser in-tree, unless someone
had a good reason not to do so.  I'm one of the folks who would get
annoyed at additional build requirements :-)]

> So I'd like to weigh these demerits against the advantages, and that
> is only possible if some kind of specification for this specific
> interface which we are discussing is available.
>
> I've read your description from May (and just re-read it now), and it
> sounds like the number of different entities we need to communicate is
> quite small and their structure is simple.  So why did you come to the
> conclusion that you needed something like XML to express that
> interface?

At the moment I don't have a new specification; it's in too much flux
as the implementation continues to reveal issues I hadn't considered
adequately.  However, I can definitely explain what brought me to this
conclusion!

The proposal I sketched out in May is indeed simple.  The bits with
"architecture-specific data" and "registers tags" are the only hints
that it may need to evolve in the future.

However, there are many other things which GDB would like to know
about targets, in an ideal world.  A great example that GDB wouldn't
make use of today, but could, is Andrew Stubbs' mention of memory maps,
paired with the recent report of backtrace blowing up on uclinux.  If
we can receive the memory map from the remote stub, we can absolutely
guarantee that we never send reads outside of RAM when we wanted RAM
access.

My original sketch did allow for this; the feature: and reg: prefixes
existed so that later prefixes like memory: could be added.  But then
for every new data type we have to define not just the contents but
also the encoding.  With XML, we only have to define the contents.
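
To make the contents-versus-encoding point concrete, here is a minimal
sketch in Python.  It is only an illustration: the prefixes, the field
order, the element names (target, reg, memory) and the attribute names
are all invented for this example and are not a proposed or existing
GDB format.  The ad-hoc parser needs a hand-written field layout for
every prefix, while the XML reader stays the same when a new kind of
entry appears.

```python
import xml.etree.ElementTree as ET

# Hypothetical ad-hoc encoding: ':' separates fields, ';' separates entries.
ADHOC = "reg:r0:32;reg:r1:32;memory:ram:0x20000000:0x10000"

# The same hypothetical contents expressed as XML.
XML_DOC = """<target>
  <reg name="r0" bitsize="32"/>
  <reg name="r1" bitsize="32"/>
  <memory type="ram" start="0x20000000" length="0x10000"/>
</target>"""


def parse_adhoc(text):
    """Every new prefix needs its own field layout defined here."""
    items = []
    for entry in text.split(";"):
        prefix, *fields = entry.split(":")
        if prefix == "reg":
            name, bitsize = fields
            items.append(("reg", {"name": name, "bitsize": bitsize}))
        elif prefix == "memory":
            kind, start, length = fields
            items.append(
                ("memory", {"type": kind, "start": start, "length": length})
            )
        else:
            raise ValueError(f"no encoding defined for prefix {prefix!r}")
    return items


def parse_xml(text):
    """One generic reader; new element kinds need no new parsing code."""
    return [(el.tag, dict(el.attrib)) for el in ET.fromstring(text)]


def in_ram(items, address):
    """Return True if address falls inside a region marked as RAM."""
    for tag, attrs in items:
        if tag == "memory" and attrs["type"] == "ram":
            start = int(attrs["start"], 0)
            if start <= address < start + int(attrs["length"], 0):
                return True
    return False
```

With a map like this in hand, a debugger could call something like
in_ram(parse_xml(XML_DOC), address) before issuing a read that is
supposed to hit RAM, and refuse the read when the check fails.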

The thing that pushed me over the edge this week was a combination of
two additions: files containing the same data, for GDB to parse in
advance without querying the target, and a "description" field.  The
existing format is extremely file-unfriendly because it uses colon as a
field separator and semicolon as a list separator, with no newlines.
So the file becomes a single, awkwardly long line.  Otherwise the
parser has to tolerate newlines, and that complicates the set of
characters which have to be handled specially.

And it would be nice if we could offer a user-friendly description of
target-provided features, especially if we can pass it through to an
IDE.  Instead of an opaque "gdb.arm.vfpv2_user", you ought to be able
to view details about "ARM VFP unit (version 2, user mode)" and have
your debugger explain to you that the enabled interrupts are stored in
particular bits of FPSCR.

If we're going to do that, it would be a real shame not to consider
localization; most ARM system programmers can probably manage the
English names of the registers, but if we want to offer help text,
being able to provide it in Japanese is a big win.  So that means
character encodings, and in turn that means we need to be somewhat
careful with the contents of descriptions.  Combining this with files
provided the third straw: now a single register description might
contain embedded newlines!  The question of field separators has gotten
even more complicated.

The biggest win of XML, for me, is that there are standard answers to
all of these problems and standard tools for editing and checking XML
files.  I get to rip out a half-usable parser that I spent several days
writing, and integrate a new XML parser in probably about the amount of
time it would take me to finish the one I've got.  And the
documentation for this feature (which is not yet (re)written, but which
I intend to be fairly substantial) can focus on its semantics, and just
reference the relevant bits of XML.
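
As an illustration of those "standard answers", here is a small Python
sketch, again under stated assumptions: the feature and description
elements and the lang attribute are hypothetical names chosen for this
example, and the description strings (including the Japanese one) are
made up.  It shows a description containing a newline, the old
separator characters, markup characters and non-ASCII text surviving a
round trip through a stock XML library with no custom escaping rules.

```python
import xml.etree.ElementTree as ET

# Illustrative help text containing a newline, ':' and ';' (the separators
# of the old format), and characters that XML itself must escape.
DESCRIPTION_EN = (
    "ARM VFP unit (version 2, user mode)\n"
    "FPSCR: status & control; enabled interrupts live in <particular bits>"
)
# Illustrative Japanese text, not an official translation.
DESCRIPTION_JA = "ARM VFP ユニット (バージョン 2、ユーザーモード)"


def build_feature():
    """Build a hypothetical <feature> element with two localized descriptions."""
    feature = ET.Element("feature", name="gdb.arm.vfpv2_user")
    for lang, text in (("en", DESCRIPTION_EN), ("ja", DESCRIPTION_JA)):
        description = ET.SubElement(feature, "description", lang=lang)
        description.text = text
    return feature


def round_trip(element):
    """Serialize to UTF-8 bytes and parse back, as a file or stub reply would."""
    encoded = ET.tostring(element, encoding="utf-8")
    return encoded, ET.fromstring(encoded)


encoded, decoded = round_trip(build_feature())
english = decoded.find("description[@lang='en']").text
japanese = decoded.find("description[@lang='ja']").text
```

After the round trip, english and japanese compare equal to the
original strings; the library takes care of escaping and of the
character encoding declared in the serialized document.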

-- 
Daniel Jacobowitz
CodeSourcery