Re: Non-uniform address spaces

From: Jim Blandy <jimb@codesourcery.com>
To: Michael Eager <eager@eagercon.com>
Cc: gdb@sources.redhat.com
Subject: Re: Non-uniform address spaces
Date: Tue, 26 Jun 2007 16:56:00 -0000
Message-ID: <m3y7i6hcxt.fsf@codesourcery.com>
In-Reply-To: <468047F0.7060207@eagercon.com> (Michael Eager's message of "Mon, 25 Jun 2007 15:55:44 -0700")

Michael Eager <eager@eagercon.com> writes:
> Jim Blandy wrote:
>> Michael Eager <eager@eagercon.com> writes:
>>> For example, the SPEs in a Cell processor could be configured
>>> to distribute pieces of an array over different SPEs.
>>>
>>>> How do you declare such an array?  How do you index it?  What code is
>>>> generated for an array access?  How does it relate to C's rules for
>>>> pointer arithmetic?
>>> In UPC (a parallel extension to C) there is a new attribute "shared"
>>> which says that data is (potentially) distributed across multiple processors.
>>>
>>> In UPC, pointer arithmetic works exactly the same as in C: you can
>>> compare pointers, subtract them to get a difference, and add integers.
>>> The compiler generates code which does the correct computation.
>>
>> All right.  Certainly pointer arithmetic and array indexing need to be
>> fixed to handle such arrays.  Support for such a system will entail
>> having the compiler describe to GDB how to index these things, and
>> having GDB understand those descriptions.  
>
> This may be something that is better described in an ABI than in
> DWARF.  The compiler may not know how to translate a pointer into
> a physical address.  UPC, for example, allows you to specify the number
> of threads at runtime.
>
> The compiler certainly can identify that an array or other data
> is shared, to use UPC's terminology.  From there, the target code
> would need to perform some magic to figure out where the address
> actually pointed to.

Certainly, an ABI informs the interpretation of the debugging info.
Do you have specific ideas yet on how to convey this information?

>> If those were fixed, how do the other CORE_ADDR uses look to you?
>> Say, in the frame code?  Or the symtab code?
>
> There are uses of CORE_ADDR values which assume that arithmetic
> operations are valid, such as testing whether a PC address is
> within a stepping range.  These are not likely to cause problems,
> because code space generally does conform to the linear space
> assumptions that GDB makes.

Right --- this is what I was alluding to before: most often the
addresses being compared actually come from something known to be
contiguous, so it'll work out.
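
As a minimal sketch of the kind of comparison in question (Python
rather than GDB's C; the function name is hypothetical and is not a
GDB function), a stepping-range test is just an ordered comparison
of addresses, which is only meaningful when all three values lie in
one contiguous region:

```python
def pc_in_stepping_range(pc, range_start, range_end):
    """Return True if pc lies in the half-open range [range_start, range_end).

    Assumes all three addresses belong to the same contiguous
    (linear) region, as code addresses generally do.
    """
    return range_start <= pc < range_end
```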

> There are other places where an address is incremented, such as
> in displaying memory contents.  I doubt that the code knows
> what it is displaying, only that it should display n words starting
> at address x in format z.  This would probably produce incorrect
> results if the data spanned from one processor/thread to another.
> (At least at a first approximation, this may well be an acceptable
> restriction.)

Certainly code for printing distributed objects will need to
understand how to traverse them properly; I see this as parallel to
the indexing/pointer arithmetic requirements.  Hopefully we can design
one interface that serves both purposes nicely.
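
To illustrate what one such interface might look like, here is a
rough sketch in Python.  It assumes a UPC-style block-cyclic layout
(blocks of block_size elements dealt out to threads in round-robin
order); the real layout would come from the ABI or the debugging
information, and the names Location, locate and traverse are
hypothetical.  The same locate step serves both indexing and
printing:

```python
from typing import Iterator, NamedTuple


class Location(NamedTuple):
    thread: int       # thread/processor that holds the element
    local_index: int  # element index within that thread's portion


def locate(index: int, block_size: int, num_threads: int) -> Location:
    """Map a global element index to <thread, local index>.

    Assumption: block-cyclic distribution, as in UPC shared arrays.
    """
    block = index // block_size        # which block the element is in
    phase = index % block_size         # position inside that block
    thread = block % num_threads       # blocks go round-robin to threads
    local_block = block // num_threads # blocks already placed on that thread
    return Location(thread, local_block * block_size + phase)


def traverse(start: int, count: int, block_size: int,
             num_threads: int) -> Iterator[Location]:
    """Yield the location of each of count elements, starting at start."""
    for index in range(start, start + count):
        yield locate(index, block_size, num_threads)
```

A memory-display command would walk traverse(...) instead of
incrementing a raw address, so it keeps working when the data
crosses from one thread to the next.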

> Symtab code would need a hook which converted the ELF
> <section,offset> into a <processor,thread,offset> for shared
> objects.  Again, that would require target-dependent magic.

Hmm.  GDB's internal representation for debugging information stores
actual addresses, not <section, offset> pairs.  After reading the
information, we call objfile_relocate to turn the values read from the
debugging information into real addresses.  It seems to me that that
code should be doing this job already.

How does code get loaded in your system?  Does a single module get
loaded multiple times?

In GDB, each objfile represents a specific loading of a library or
executable.  The information is riddled with real addresses.  If a
single file is loaded N times, you'll need N objfiles, and the
debugging information will be duplicated.

In the long run, I think GDB should change to represent debugging
information in a loading-independent way, so that multiple instances
of the same library can share the same data.  In a sense, you'd have a
big structure that just holds data parsed from the file, and then a
bunch of little structures saying, "I'm an instance of THAT, loaded at
THIS address."

This would enable multi-process debugging, and might also allow us to
avoid re-reading debugging info for shared libraries every time they
get loaded.
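
A rough sketch of that split, in Python for brevity.  The class and
field names are hypothetical and do not correspond to GDB's actual
structures, and a single load offset stands in for whatever
per-section relocation would really be needed:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ParsedDebugInfo:
    """Data parsed once from the file; contains no load addresses."""
    filename: str
    symbol_offsets: Dict[str, int]  # symbol name -> offset within the file


@dataclass
class LoadedInstance:
    """'I'm an instance of THAT, loaded at THIS address.'"""
    info: ParsedDebugInfo  # shared, not copied
    load_address: int

    def address_of(self, symbol: str) -> int:
        # The real address is computed on demand instead of being
        # stored in the parsed data.
        return self.load_address + self.info.symbol_offsets[symbol]


# Two loadings of the same (made-up) library share one parsed structure.
libfoo = ParsedDebugInfo("libfoo.so", {"foo_init": 0x120})
first = LoadedInstance(libfoo, 0x10000)
second = LoadedInstance(libfoo, 0x7F0000)
```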

> One problem may be that it may not be clear whether one has a
> pointer to a linear code space or to a distributed NUMA data space.
> It might be reasonable to model the linear code space as a 64-bit
> CORE_ADDR, with the top half zero, while a NUMA address has non-zero
> values in the top half.  (I don't know if there might be alias
> problems, where zero might be valid for the top half of a NUMA address.)

I think this isn't going to be a problem, but it's hard to tell.  Can
you think of a specific case where we wouldn't be able to tell which
we have?

> I'd be very happy figuring out where to put a hook which allowed me
> to translate a NUMA CORE_ADDR into a physical address, setting the
> thread appropriately.  A bit of a kludge, but probably workable.

CORE_ADDR should be capable of addressing all memory on the system.  I
think you'll make a lot of trouble for yourself if you don't follow
that rule.
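
As a sketch of one way to follow that rule, under assumptions that
are not settled anywhere in this thread: pack <thread, offset> into
a single 64-bit value with a 32/32 split, and store thread + 1 in
the top half so that a NUMA address never has a zero top half and
cannot alias a linear address.  The function names are hypothetical.

```python
OFFSET_BITS = 32                      # assumed 32/32 split of a 64-bit value
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def encode_numa(thread: int, offset: int) -> int:
    """Pack <thread, offset> into one 64-bit address.

    Assumption: the top half holds thread + 1, so it is never zero.
    """
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in the low half")
    if not 0 <= thread < OFFSET_MASK:
        raise ValueError("thread does not fit in the top half")
    return ((thread + 1) << OFFSET_BITS) | offset


def decode(addr: int):
    """Return (thread, offset); thread is None for a linear address."""
    top = addr >> OFFSET_BITS
    offset = addr & OFFSET_MASK
    return (None if top == 0 else top - 1, offset)
```

With an encoding like this every location on the system has its own
distinct value, and any translation to a physical address and
thread can happen at the point where memory is actually read.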
