From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 21612 invoked by alias); 29 Mar 2006 16:16:30 -0000
Received: (qmail 21602 invoked by uid 22791); 29 Mar 2006 16:16:29 -0000
X-Spam-Check-By: sourceware.org
Received: from nevyn.them.org (HELO nevyn.them.org) (66.93.172.17)
    by sourceware.org (qpsmtpd/0.31.1) with ESMTP;
    Wed, 29 Mar 2006 16:16:27 +0000
Received: from drow by nevyn.them.org with local (Exim 4.54)
    id 1FOdLN-0008Ut-1S for gdb@sourceware.org;
    Wed, 29 Mar 2006 11:16:25 -0500
Date: Wed, 29 Mar 2006 21:26:00 -0000
From: Daniel Jacobowitz
To: gdb@sourceware.org
Subject: Self-describing targets - a more concrete proposal
Message-ID: <20060329161624.GA32241@nevyn.them.org>
Mail-Followup-To: gdb@sourceware.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.8i
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sourceware.org
X-SW-Source: 2006-03/txt/msg00199.txt.bz2

I've posted about this work several times now, including:

  http://sourceware.org/ml/gdb/2005-05/msg00074.html
  http://sourceware.org/ml/gdb/2005-05/msg00171.html
  http://sourceware.org/ml/gdb/2006-01/msg00257.html
  http://sourceware.org/ml/gdb/2006-03/msg00031.html

It is in a much more concrete stage of development than it's ever been
before; almost everything described in the proposed documentation
actually works now, and I am generally happy with the format of the
descriptions.  You can find the code on
gdb-csl-available-20060303-branch in the CVS repository.  It's
currently only wired up for ARM, and there aren't any sample stub
implementations - that'll be along.

I'd particularly like to thank Paul Brook for some valuable
suggestions, and Jim Blandy for both suggestions and turning my messy
notes into the coherent Texinfo below (and for using "Self-Describing",
which I think is the phrase I'd been fumbling around for a while).

I would appreciate comments on the sample and documentation; while
there are a lot of things left on my to-do list for this project, most
of them are nice to have rather than important.  What's in CVS is
enough to be very, very useful.  If other GDB developers like the path
it's taking, I'd prefer to do future development on it in mainline,
instead of on a branch.

Here's a small, but useful, sample description.  Then, below it, Jim's
documentation - which includes details on what the description means.
This description matches the current layout of the ARM register cache.

In target.xml:

In arm-core.xml:

In arm-fpa.xml:

The documentation:

Appendix F Self-Describing Targets
**********************************

One of the challenges of using GDB to debug embedded systems is that
there are so many minor variants of each processor architecture in use.
It is common practice for vendors to start with a standard processor
core -- ARM, PowerPC, or MIPS, for example -- and then make changes to
adapt it to a particular market niche.  Some architectures have
hundreds of variants, available from dozens of vendors.  This leads to
a number of problems:

   * With so many different customized processors, it is difficult for
     the GDB maintainers to keep up with the changes.

   * Since individual variants may have short lifetimes or limited
     audiences, it may not be worthwhile to carry information about
     every variant in the GDB source tree.

   * When GDB does support the architecture of the embedded system at
     hand, the task of finding the correct architecture name to give
     the `set architecture' command can be error-prone.

To address these problems, the GDB remote protocol allows a target
system to not only identify itself to GDB, but to actually describe its
own features.  This lets GDB support processor variants it has never
seen before -- to the extent that the descriptions are accurate, and
that GDB understands them.

F.1 Retrieving Self-Descriptions
================================

GDB retrieves a target's self-description via the remote protocol using
a `qPart' request (*note the `qPart' request: qPart request.) of the
form:

     qPart:features:read:ANNEX:OFFSET,LENGTH

where ANNEX is the string `target.xml'.  The OFFSET and LENGTH
parameters are the offset into the description and the number of bytes
to transfer, as for other `qPart' requests.  The `target.xml' annex
contains an XML document describing the target's features; its form is
described in *Note Self-Description Format::.

Feature descriptions may be split into several annexes, which GDB
retrieves and assembles into a complete description.  An annex may use
XML Inclusions (http://www.w3.org/TR/xinclude/) to incorporate other
annexes, much as a C header file refers to other headers using
`#include'.  GDB first retrieves `target.xml', and then makes further
`qPart' requests as needed to retrieve the annexes referred to by any
`xi:include' elements it finds.  Naturally, annexes brought in by
`xi:include' may use `xi:include' themselves.

To reduce protocol overhead, a target may supply a special annex named
`CHECKSUMS' that provides 160-bit SHA1 checksum values for the annexes
it has available.  The `CHECKSUMS' annex contains a series of
newline-terminated lines, each of which contains a 40-digit hexadecimal
checksum, two spaces, and the name of an annex with the given checksum.
Here is an example `CHECKSUMS' annex:

     68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml
     0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml
     00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml

GDB uses these checksums to avoid retrieving a given annex more than
once.
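
The `CHECKSUMS' line format described above is simple enough to sketch
in a few lines.  What follows is only an illustration in Python, not
code from the branch: the function names `parse_checksums' and
`stale_annexes', and the idea of a cache held as a dictionary of annex
name to bytes, are assumptions made for the example.  It parses the
annex text and reports which listed annexes have a cached copy that is
missing or whose SHA1 no longer matches.

```python
import hashlib
import string

# The example CHECKSUMS annex from the text: digest, two spaces, annex name.
EXAMPLE = (
    "68c94ffc34f8ad2d7bfae3f5a6b996409211c1b1  target.xml\n"
    "0e8e850b0580fbaaa0872326cb1b8ad6adda9b0d  mmu.xml\n"
    "00f22e5f971ccec05c2acce98caf8cff4343c8cf  fpu.xml\n"
)


def parse_checksums(annex_text):
    """Parse CHECKSUMS annex text into {annex name: lowercase SHA-1 hex}."""
    checksums = {}
    for line in annex_text.splitlines():
        if not line:
            continue  # tolerate a stray empty line
        digest, sep, name = line.partition("  ")
        is_hex = all(c in string.hexdigits for c in digest)
        if not sep or len(digest) != 40 or not is_hex or not name:
            raise ValueError("malformed CHECKSUMS line: %r" % line)
        checksums[name] = digest.lower()
    return checksums


def stale_annexes(checksums, cache):
    """Return listed annexes whose cached copy is missing or differs.

    `cache` is a hypothetical local cache mapping annex name -> bytes.
    Annexes not listed in `checksums` are not considered here.
    """
    stale = []
    for name, digest in checksums.items():
        cached = cache.get(name)
        if cached is None or hashlib.sha1(cached).hexdigest() != digest:
            stale.append(name)
    return stale
```

With an empty cache, `stale_annexes(parse_checksums(EXAMPLE), {})'
returns all three annex names, since nothing has been fetched yet.
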
When GDB retrieves an annex, it caches its contents locally.  Then,
each time GDB thinks the target architecture may have changed (say,
after making a new remote protocol connection, or after starting a new
child process using the extended remote protocol), it retrieves the
`CHECKSUMS' annex afresh.  If the checksums show that a particular
annex's contents are the same on the target and in GDB's cache, GDB
avoids fetching it again.  If none of the annexes have changed, GDB
need only retrieve the `CHECKSUMS' annex.

`CHECKSUMS' need not provide a checksum for every annex available; if a
given annex is not mentioned, GDB will try to retrieve it each time it
thinks the target architecture may have changed.  The target need not
provide any `CHECKSUMS' annex at all; this is equivalent to an empty
`CHECKSUMS' annex.

F.2 Self-Description Format
===========================

A target description annex is an XML (http://www.w3.org/XML/) document
which complies with the Document Type Definition provided in the GDB
sources in `gdb/features/gdb-target.dtd'.  This means you can use
generally available tools like `xmllint' to check that your feature
descriptions are well-formed and valid.  However, to help people
unfamiliar with XML write descriptions for their targets, we also
describe the grammar here.

At the moment, target descriptions can only describe register sets, to
be accessed via the remote protocol `g', `G', `p' and `P' requests.  We
hope to extend the format to include other kinds of information, like
memory maps.

Here is a simple sample target description:

This describes a simple target feature set which only contains two
registers, named `s0' (a 32-bit integer register) and `s1' (a 32-bit
floating point register).

A target description has the overall form:

       FEATURE...
       FEATURE-SET

The description is generally insensitive to whitespace and line breaks,
under the usual common-sense rules.  The ellipsis (`...') after FEATURE
indicates that FEATURE may appear zero or more times.

Each FEATURE names and describes a single feature of the target; at the
moment, features can only describe register sets.  The FEATURE-SET
cites particular features by name, pulling together a complete
description of the target.

A FEATURE has the form:

       REG...

This defines a feature named NAME; each feature's name must be unique
across the description.  Each REG has the form:

Items in [brackets] are optional.  The components are as follows:

NAME
     The register's name; it must be unique within the target
     description.

BITSIZE
     The register's size, in bits.

REGNUM
     The register's number.  If omitted, a register's number is one
     greater than that of the previous register; the first register's
     number defaults to zero.  But also see the `feature-ref' element's
     `base-regnum' attribute, below--these register numbers are
     relative to the `base-regnum'.

READONLY
     Whether the register is read-only or not; this must be either
     `yes' or `no'.  The default is `no'.

SAVE-RESTORE
     Whether the register should be preserved across inferior function
     calls; this must be either `yes' or `no'.  The default is `yes'.

TYPE
     The type of the register.  At the moment, TYPE must be either
     `int' or `float'.  The default is `int'.

GROUP
     The register group to which this register belongs.  At the moment,
     GROUP must be either `general', `float', or `vector'.  If no GROUP
     is specified, GDB will select a register group based on the
     register's type.

A FEATURE-SET binds together a set of features to describe a complete
target.  There can be only one FEATURE-SET in a target.  Each
FEATURE-SET has the form:

       FEATURE-REF...

where each FEATURE-REF has the form:

This means that the target includes the feature named NAME.  If the
`base-regnum' is present, that means that registers in the given
feature are numbered starting with N, until overridden by an explicit
register number.

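
The interaction between REGNUM defaults and `base-regnum' is easier to
see worked through.  This Python sketch is one reading of the rules
above, not GDB's implementation: `assign_regnums' and its data layout
are hypothetical, explicit register numbers are taken as relative to
the feature's `base-regnum' as the REGNUM entry says, and a missing
`base-regnum' is treated as zero, which is an assumption the text does
not confirm.

```python
def assign_regnums(features, feature_refs):
    """Compute absolute register numbers for a feature set.

    features:     {feature name: [(register name, regnum or None), ...]}
                  where regnum is the optional explicit REGNUM.
    feature_refs: [(feature name, base_regnum or None), ...] in the
                  order the FEATURE-REFs appear.
    Returns {register name: absolute register number}.
    """
    numbers = {}
    for feature_name, base in feature_refs:
        # Assumption: an omitted base-regnum behaves like base-regnum 0.
        base = 0 if base is None else base
        next_relative = 0  # the first register's number defaults to zero
        for reg_name, regnum in features[feature_name]:
            # An omitted REGNUM is one greater than the previous register's.
            relative = next_relative if regnum is None else regnum
            numbers[reg_name] = base + relative
            next_relative = relative + 1
    return numbers


# Hypothetical features: names and numbers are made up for illustration.
features = {
    "core": [("r0", None), ("r1", None), ("status", 25)],
    "fpu": [("f0", None), ("f1", None)],
}
print(assign_regnums(features, [("core", None), ("fpu", 16)]))
```

Here `r0' and `r1' take the defaults, `status' is placed by its
explicit number, and the `fpu' registers start from the `base-regnum'
given in their FEATURE-REF.
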
It can sometimes be valuable to split a target description up into
several different annexes, either for organizational purposes, or to
allow GDB to cache portions of the description that change rarely.  To
make this possible, you can replace any feature description with an
inclusion directive of the form:

When GDB encounters an element of this form, it will retrieve the annex
named ANNEX (or use its cached copy), and replace the inclusion
directive with the contents of that annex.

-- 
Daniel Jacobowitz
CodeSourcery