From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-15286-listarch-gdb=sources.redhat.com@sources.redhat.com>
Received: (qmail 8878 invoked by alias); 25 Aug 2003 22:35:12 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb/>
List-Post: <mailto:gdb@sources.redhat.com>
List-Help: <mailto:gdb-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-owner@sources.redhat.com
Received: (qmail 8842 invoked from network); 25 Aug 2003 22:35:00 -0000
Received: from unknown (HELO walton.kettenis.dyndns.org) (213.93.115.144)
  by sources.redhat.com with SMTP; 25 Aug 2003 22:35:00 -0000
Received: from elgar.kettenis.dyndns.org (elgar.kettenis.dyndns.org [192.168.0.2])
	by walton.kettenis.dyndns.org (8.12.6p2/8.12.5) with ESMTP id h7PMYrCS000596;
	Tue, 26 Aug 2003 00:34:53 +0200 (CEST)
	(envelope-from kettenis@elgar.kettenis.dyndns.org)
Received: from elgar.kettenis.dyndns.org (localhost [127.0.0.1])
	by elgar.kettenis.dyndns.org (8.12.6p2/8.12.6) with ESMTP id h7PMYrVl001248;
	Tue, 26 Aug 2003 00:34:53 +0200 (CEST)
	(envelope-from kettenis@elgar.kettenis.dyndns.org)
Received: (from kettenis@localhost)
	by elgar.kettenis.dyndns.org (8.12.6p2/8.12.6/Submit) id h7PMYqFu001245;
	Tue, 26 Aug 2003 00:34:52 +0200 (CEST)
Date: Mon, 25 Aug 2003 22:35:00 -0000
Message-Id: <200308252234.h7PMYqFu001245@elgar.kettenis.dyndns.org>
From: Mark Kettenis <kettenis@chello.nl>
To: drow@mvista.com
CC: gdb@sources.redhat.com
In-reply-to: <20030824164347.GA17520@nevyn.them.org> (message from Daniel
	Jacobowitz on Sun, 24 Aug 2003 12:43:47 -0400)
Subject: Re: [RFC] Register sets
References: <200308232249.h7NMnvhh090154@elgar.kettenis.dyndns.org> <20030824164347.GA17520@nevyn.them.org>
X-SW-Source: 2003-08/txt/msg00285.txt.bz2

   Date: Sun, 24 Aug 2003 12:43:47 -0400
   From: Daniel Jacobowitz <drow@mvista.com>

   On Sun, Aug 24, 2003 at 12:49:57AM +0200, Mark Kettenis wrote:
   > Folks (and especially Andrew),
   > 
   > A while back, Andrew submitted a patch that made it possible to read
   > Linux/i386 corefiles on Linux/x86-64.  He tricked me into fixing the
   > "gcore" command for this case, which forced me to think a bit more
   > about how we can solve this cleanly.  Here's what I think is the
   > solution.

   Hi Mark,

   I've been thinking about the same problem for a while now.  I have a
   slightly different mental picture; let's see if we can put them
   together.

   > Register sets
   > -------------
   > 
   > In most hosted environments, the underlying OS divides the ISA's
   > registers into groups.  Typically there is a set of general-purpose
   > registers (which usually includes all integer registers) and a set of
   > floating-point registers.  That's one of the reasons why we have
   > reggroups in GDB.
   > 
   > Most OS'es define data structures for dealing with these groups of
   > registers:
   > 
   >  * Modern BSD's have `struct reg' and `struct fpreg'.
   > 
   >  * System V has `gregset_t' and `fpregset_t'.  Solaris (as an
   >    SVR4-derivative) has `prgregset_t' and `fpregset_t' and the
   >    provision for an `xregset'.
   > 
   >  * QNX has `procfs_greg', `procfs_fpreg' and `procfs_altreg'.
   > 
   >  * Linux is a bit messy, but tries to mimic SVR4/Solaris.
   > 
   > These data structures are used in several interfaces:
   > 
   >  * ptrace(2) (for systems that have PT_GETREGS/PTRACE_GETREGS and
   >    friends, e.g. modern BSD's and most Linuxen).
   > 
   >  * proc(4) filesystem (a.k.a. procfs on e.g. SVR4/Solaris and QNX).
   > 
   >  * libthread_db(3)/proc_service(3) debugging interface (on Solaris and
   >    Linux).
   > 
   >  * core(5) memory dumps (on ELF systems, but also in some cases,
   >    traditional a.out systems).
   > 
   > As a result, we use these data structures all over the place in GDB,
   > usually in ways that make it difficult to re-use code and/or
   > multi-arch things.  I'd like to introduce a more unified approach to
   > handling these data structures, which I propose to call register sets,
   > or regsets for short.  Basically a register set is just a block of
   > memory of a certain size.  We may want to put these register sets (or
   > a single register from them) into GDB's register cache.  We also need
   > to be able to grab these register sets (or a single register in them)
   > from the register cache.
   > 
   > It seems that we need a maximum of three of these sets.  I propose to
   > name these register sets as follows:
   > 
   >  * `gregset' for the general-purpose registers.
   > 
   >  * `fpregset' for the floating-point registers.
   > 
   >  * `xregset' for any "extra" registers.

   I don't think this is a good assumption.  There are two problems with
   it:

     - It assumes that everything relating to a particular target uses the
   same register set format.  In general (there are exceptions where
   libthread_db will zero the regset instead of calling ps_*getregs) we
   can pass regsets through libthread_db as opaque objects; we might wish
   to include registers that are not available in a core dump.  Then we've
   got two different "general" regsets.  There are some other examples.

That's quite true.  It's probably a better idea to have the
possibility of defining register sets for these particular purposes,
falling back on a generic definition if those register sets aren't
defined.  That keeps things simple for "sane" targets.
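
A minimal sketch of that fallback, just to make the idea concrete.
All toy_* names are made up for illustration and none of this is
existing GDB code; the list of purposes is an assumption too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list of the places where a register set shows up.  */
enum toy_regset_purpose
{
  TOY_PURPOSE_CORE,
  TOY_PURPOSE_PTRACE,
  TOY_PURPOSE_THREAD_DB,
  TOY_NUM_PURPOSES
};

/* Stand-in for a register set description.  */
struct toy_regset
{
  const char *name;
  size_t size;
};

/* Per-architecture table: one generic definition plus optional
   overrides for particular purposes (NULL means "not defined").  */
struct toy_arch_regsets
{
  const struct toy_regset *generic;
  const struct toy_regset *by_purpose[TOY_NUM_PURPOSES];
};

/* Return the register set to use for PURPOSE, falling back on the
   generic definition if no specific one has been defined.  */
static const struct toy_regset *
toy_regset_for (const struct toy_arch_regsets *arch,
                enum toy_regset_purpose purpose)
{
  assert ((int) purpose >= 0 && purpose < TOY_NUM_PURPOSES);
  if (arch->by_purpose[purpose] != NULL)
    return arch->by_purpose[purpose];
  return arch->generic;
}
```

A "sane" target would fill in only the generic entry; a target whose
core dumps differ from what libthread_db hands out would override just
that one slot.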

Besides, it's not a good idea to use the register set definition from
the architecture vector for ptrace(2) anyway.  On FreeBSD I can run
Linux binaries.  But ptrace(2) still returns the registers in the
FreeBSD format for them (unless GDB itself is a Linux binary).  So
using the Linux register set definitions would make things fail,
whereas using FreeBSD's register set format makes it possible to debug
a Linux binary on FreeBSD :-).  Similar things play a role when
debugging 32-bit code on a 64-bit platform such as amd64, and probably
also sparc64.

     - It assumes that there is one possible xregset for a target. 
   Suppose that you have two ARM targets with different vector extensions;
   I'll pick Cirrus/Maverick and iWMMXt, since there happen to be two ARM
   targets with different vector extensions.  We may not know right away
   which regset is available.  I'm not sure this is a problem; we could
   just define the architecture appropriately; but another example is the
   embedded x86 architecture registers, which a remote stub might want to
   transfer.

Hmm.  You certainly should define the architecture appropriately.

   (Choice of iWMMXt isn't just an example.  I need to add gdbserver
   support for iWMMXt, and I want to do it without changing the g/G
   packet.  See below...)

While I have been hacking on gdbserver a bit to add FreeBSD support,
I'm not too familiar with the remote protocol :-(.

   > For each register set we have to provide a function that takes apart
   > the register set and stores it in GDB's register cache, and a
   > function that puts together a register set from GDB's register cache.
   > For these I propose the functions:
   > 
   > void <prefix>_supply_<regset> (struct regcache *regcache,
   >                                const void *regs, int regnum);
   > 
   > for supplying register REGNUM from register set REGS into the cache
   > REGCACHE, and
   > 
   > void <prefix>_collect_<regset> (const struct regcache *regcache,
   >                                 void *regs, int regnum);
   > 
   > for collecting register REGNUM from the cache REGCACHE into register
   > set REGS.
   > 
   > If REGNUM is -1, these functions operate on all registers within the set.

   Can we define the REGNUM != -1 case a little more clearly?  Is the
   regnum a hint, what registers must be valid in the regcache when
   collecting, what registers must be valid in the regset when supplying,
   et cetera.  Right now we're a bit inconsistent between targets.

It's pretty clear.  If REGNUM is not -1, only that particular register
is transferred between the register cache and the buffer.  If REGNUM
is -1, all registers (within the set) are transferred.  When
collecting, this doesn't pay attention to the validity of the data.
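
A minimal sketch of these semantics, assuming a toy four-register
cache and a register set buffer that stores the same registers in
reverse order.  The toy_* names and the layout are made up for
illustration; GDB's real struct regcache is accessed differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, not GDB's real types.  */
enum { TOY_NUM_REGS = 4, TOY_REG_SIZE = 4 };

struct toy_regcache
{
  unsigned char raw[TOY_NUM_REGS][TOY_REG_SIZE];
};

/* Byte offset of register REGNUM within the toy register set.  */
static size_t
toy_gregset_offset (int regnum)
{
  return (size_t) (TOY_NUM_REGS - 1 - regnum) * TOY_REG_SIZE;
}

/* Supply register REGNUM from register set REGS into REGCACHE.
   If REGNUM is -1, supply all registers within the set.  */
static void
toy_supply_gregset (struct toy_regcache *regcache, const void *regs,
                    int regnum)
{
  const unsigned char *buf = regs;
  int i;

  assert (regnum >= -1 && regnum < TOY_NUM_REGS);
  for (i = 0; i < TOY_NUM_REGS; i++)
    if (regnum == -1 || regnum == i)
      memcpy (regcache->raw[i], buf + toy_gregset_offset (i), TOY_REG_SIZE);
}

/* Collect register REGNUM from REGCACHE into register set REGS.
   If REGNUM is -1, collect all registers within the set.  The
   validity of the cached data is not checked.  */
static void
toy_collect_gregset (const struct toy_regcache *regcache, void *regs,
                     int regnum)
{
  unsigned char *buf = regs;
  int i;

  assert (regnum >= -1 && regnum < TOY_NUM_REGS);
  for (i = 0; i < TOY_NUM_REGS; i++)
    if (regnum == -1 || regnum == i)
      memcpy (buf + toy_gregset_offset (i), regcache->raw[i], TOY_REG_SIZE);
}
```

Collecting a single register leaves the rest of the buffer alone, so
a caller that wants to write one register back typically fetches the
whole set first and then collects just that register into it.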

   > For each architecture we will have a structure that contains all
   > information about the register sets:
   > 
   > struct regset_info
   > {
   >   size_t sizeof_gregset;
   >   void (*supply_gregset)(struct regcache *, const void *, int);
   >   void (*collect_gregset)(const struct regcache *, void *, int);
   >   size_t sizeof_fpregset;
   >   void (*supply_fpregset)(struct regcache *, const void *, int);
   >   void (*collect_fpregset)(const struct regcache *, void *, int);
   >   size_t sizeof_xregset;
   >   void (*supply_xregset)(struct regcache *, const void *, int);
   >   void (*collect_xregset)(const struct regcache *, void *, int);
   > };
   > 
   > A pointer to this structure will be stored in the architecture vector,
   > such that we can use this from various places in GDB.
   > 
   > Thoughts?

   I was thinking of something like this, very roughly:

   struct regset
   {
     size_t size;
     mapping_of_registers_and_sizes_and_offsets mapping;
   };

   struct native_regsets
   {
     struct regset *gregset, *fpregset, *xregset;
   };

   struct regset *
   gdbarch_core_section_to_regset (int which, int size);

   This would replace lots of identical copies of fetch_core_registers all
   over GDB.

That's exactly what I'm aiming at :-).  The "mapping" needs to be a
function though, since in some cases it might need to do some
arithmetic on the buffer contents in order to convert them to the
format used by GDB's register cache.
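
A minimal sketch of a regset whose mapping is a function, assuming a
made-up target where the OS buffer holds the program counter as a
word index while the cache wants a byte address.  The toy_* names,
the two-register layout and the conversion are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy register cache, not GDB's real one.  */
struct toy_regcache
{
  uint32_t pc;
  uint32_t sp;
};

enum { TOY_PC_REGNUM = 0, TOY_SP_REGNUM = 1 };

typedef void (toy_supply_ftype) (struct toy_regcache *, const void *, int);

/* A register set: its size plus a function that interprets it.  */
struct toy_regset
{
  size_t size;
  toy_supply_ftype *supply;
};

/* Supply register REGNUM (or all registers if REGNUM is -1) from
   REGS.  A static table of offsets could not express the
   multiplication done for the program counter.  */
static void
toy_supply (struct toy_regcache *regcache, const void *regs, int regnum)
{
  uint32_t words[2];

  assert (regnum >= -1 && regnum <= TOY_SP_REGNUM);
  memcpy (words, regs, sizeof words);
  if (regnum == -1 || regnum == TOY_PC_REGNUM)
    regcache->pc = words[0] * 4;	/* Word index to byte address.  */
  if (regnum == -1 || regnum == TOY_SP_REGNUM)
    regcache->sp = words[1];
}

static const struct toy_regset toy_gregset =
{
  2 * sizeof (uint32_t),
  toy_supply
};
```

Targets that need no arithmetic could still share one generic supply
function driven by an offset table, so nothing is lost compared to a
pure data mapping.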

   And then:

   struct regset *
   gdbarch_remote_name_to_regset (int which, int size);

   The companion to this would be some remote protocol work:
   -> qRegsets::
   <- qRegsets:General,FP,iWMMXt

   [Local GDB calls gdbarch_remote_name_to_regset on each of these]

   -> g
   <- 0000102109210831973987etc

   [This would be the General set.]

   -> qFetchRegset:FP
   <- 000000831092831098201923etc

   -> qSetRegset:FP:100000831092831098201923etc
   <- OK

   [This would be the FP set.]

   Also available might be:

   -> qDescribeRegset:General
   <- a0:8;a1:8;a2:8;a3:8;v0:8;sp:8;pc:8

   which could be used before trying the gdbarch_remote_name_to_regset as
   a fallback.  And this logically extends to a CLI command to specify the
   format of the G packet, which I think there's a PR requesting?

Might be.
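
For what it's worth, a reply in the proposed qDescribeRegset format
could be turned into a layout table roughly like this.  This is only
a sketch of the proposal as quoted above, not of any existing packet:
the toy_* names are made up, and I'm assuming the number after the
colon is a decimal size in bytes and that registers are laid out back
to back, neither of which the proposal spells out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_MAX_REGS = 16, TOY_MAX_NAME = 15 };

/* One register within a described register set.  */
struct toy_reg_desc
{
  char name[TOY_MAX_NAME + 1];
  size_t size;		/* Assumed to be in bytes.  */
  size_t offset;	/* Assumed: registers are packed in order.  */
};

/* Parse "name:size;name:size;..." from REPLY into DESCS.  Return the
   number of entries, or -1 on malformed input or if more than
   MAX_DESCS entries are present.  */
static int
toy_parse_regset_description (const char *reply,
                              struct toy_reg_desc *descs, int max_descs)
{
  size_t offset = 0;
  int n = 0;

  assert (reply != NULL && descs != NULL);
  while (*reply != '\0')
    {
      const char *colon = strchr (reply, ':');
      char *end;
      unsigned long size;
      size_t len;

      if (colon == NULL || n == max_descs)
        return -1;
      len = (size_t) (colon - reply);
      if (len == 0 || len > TOY_MAX_NAME
          || memchr (reply, ';', len) != NULL)
        return -1;
      size = strtoul (colon + 1, &end, 10);
      if (end == colon + 1 || (*end != ';' && *end != '\0'))
        return -1;

      memcpy (descs[n].name, reply, len);
      descs[n].name[len] = '\0';
      descs[n].size = size;
      descs[n].offset = offset;
      offset += size;
      n++;
      reply = (*end == ';') ? end + 1 : end;
    }
  return n;
}
```

The resulting table is essentially the "mapping" from the struct
regset sketch, built at run time instead of compiled in.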

   How does that sound?  I'll implement it if folks like it.  Open to
   any/all suggestions.

I have to think about this a bit more...

Mark