Re: Non-uniform address spaces

From: Michael Eager <eager@eagercon.com>
To: Jim Blandy <jimb@codesourcery.com>
Cc: gdb@sources.redhat.com
Subject: Re: Non-uniform address spaces
Date: Tue, 26 Jun 2007 17:22:00 -0000
Message-ID: <46814B4C.7080302@eagercon.com>
In-Reply-To: <m3y7i6hcxt.fsf@codesourcery.com>

Jim Blandy wrote:

>> The compiler certainly can identify that an array or other data
>> is shared, to use UPC's terminology.  From there, the target code
>> would need to perform some magic to figure out where the address
>> actually pointed to.
> 
> Certainly, an ABI informs the interpretation of the debugging info.
> Do you have specific ideas yet on how to convey this information?

A hook (specified in gdbarch) would name a target routine
to do the translation.  When GDB sees a shared pointer, it
will call this target-dependent translation routine.
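
As a rough illustration of the hook idea only: the sketch below is not GDB
code, and every name in it (core_addr, numa_location, arch_hooks,
translate_shared, the bit layout) is a hypothetical stand-in chosen for the
example.  It shows generic code calling a target-supplied routine when one
is installed, and otherwise treating the value as a plain linear address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR; assumed to be 64 bits wide here. */
typedef uint64_t core_addr;

/* Hypothetical decoded form of a shared pointer. */
struct numa_location {
    unsigned processor;
    unsigned thread;
    core_addr offset;          /* address within that thread's memory */
};

/* Hypothetical hook type: a target-supplied translation routine. */
typedef struct numa_location (*translate_shared_ftype)(core_addr shared);

/* Hypothetical per-architecture hook table (not GDB's struct gdbarch). */
struct arch_hooks {
    translate_shared_ftype translate_shared;   /* NULL if the target has none */
};

/* Example target routine.  The bit layout is invented for illustration:
   bits 63..56 processor, bits 55..48 thread, bits 47..0 offset. */
struct numa_location example_translate(core_addr shared)
{
    struct numa_location loc;
    loc.processor = (unsigned)(shared >> 56) & 0xffu;
    loc.thread = (unsigned)(shared >> 48) & 0xffu;
    loc.offset = shared & 0xffffffffffffULL;
    return loc;
}

/* Generic code: use the hook if installed, else assume a linear address. */
struct numa_location resolve_shared_pointer(const struct arch_hooks *arch,
                                            core_addr shared)
{
    assert(arch != NULL);
    if (arch->translate_shared != NULL)
        return arch->translate_shared(shared);
    struct numa_location linear = { 0, 0, shared };
    return linear;
}
```

Where such a call would sit inside GDB is exactly the open question; the
sketch only fixes the shape of the interface.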

What isn't clear to me is where to call the hook.  Suggestions
about where to look would be welcome.

>> There are other places where an address is incremented, such as
>> in displaying memory contents.  I doubt that the code knows
>> what it is displaying, only to display n words starting at
>> x address in z format.  This would probably result in incorrect
>> results if the data spanned from one processor/thread to another.
>> (At least at a first approximation, this may well be an acceptable
>> restriction.)
> 
> Certainly code for printing distributed objects will need to
> understand how to traverse them properly; I see this as parallel to
> the indexing/pointer arithmetic requirements.  Hopefully we can design
> one interface that serves both purposes nicely.

Perhaps.  I haven't looked in this code for a long time, but
my impression is that knowledge about what is being printed
gets discarded pretty early, leaving only a pointer, a count,
and a format.

>> Symtab code would need a hook which converted the ELF
>> <section,offset> into a <processor,thread,offset> for shared
>> objects.  Again, that would require target-dependent magic.
> 
> Hmm.  GDB's internal representation for debugging information stores
> actual addresses, not <section, offset> pairs.  After reading the
> information, we call objfile_relocate to turn the values read from the
> debugging information into real addresses.  It seems to me that that
> code should be doing this job already.

Perhaps.  I'll look at that.  How does this work for TLS now?

> How does code get loaded in your system?  Does a single module get
> loaded multiple times?

On a system which has shared memory (not UPC 'shared' but memory which
is accessed by all processors/threads) the code image is simply loaded.
Data areas are duplicated for thread-specific data, similar to TLS.
On multiprocessor systems which have independent memories, a target
agent loads the processors with the executable.

> In GDB, each objfile represents a specific loading of a library or
> executable.  The information is riddled with real addresses.  If a
> single file is loaded N times, you'll need N objfiles, and the
> debugging information will be duplicated.

Likely not a real problem.  The code image is linear and addresses
don't need to be translated.  Addresses in the code are relative to
either global data or thread-specific data.  They aren't NUMA
addresses.

> In the long run, I think GDB should change to represent debugging
> information in a loading-independent way, so that multiple instances
> of the same library can share the same data.  In a sense, you'd have a
> big structure that just holds data parsed from the file, and then a
> bunch of little structures saying, "I'm an instance of THAT, loaded at
> THIS address."
> 
> This would enable multi-process debugging, and might also allow us to
> avoid re-reading debugging info for shared libraries every time they
> get loaded.

This would address my comment above, that GDB converts from a
symbolic form to an address too early.
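
A minimal sketch of the "parse once, instantiate many times" layout described
in the quoted text, under stated assumptions: the structure and function
names (parsed_file, loaded_instance, instance_lookup) are hypothetical and
do not correspond to GDB's objfile code.  Symbols keep file-relative offsets,
and an address is computed only when a particular loading is consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for GDB's CORE_ADDR; assumed to be 64 bits wide here. */
typedef uint64_t core_addr;

/* Loading-independent data: offsets are relative to the file image. */
struct parsed_symbol {
    const char *name;
    uint64_t offset;
};

struct parsed_file {
    const struct parsed_symbol *symbols;
    size_t count;
};

/* "I'm an instance of THAT, loaded at THIS address." */
struct loaded_instance {
    const struct parsed_file *file;   /* shared, never copied */
    core_addr load_base;
};

/* Resolve a symbol for one instance.  Returns 1 and stores the address
   in *out if found, 0 otherwise. */
int instance_lookup(const struct loaded_instance *inst, const char *name,
                    core_addr *out)
{
    assert(inst != NULL && inst->file != NULL && out != NULL);
    for (size_t i = 0; i < inst->file->count; i++) {
        if (strcmp(inst->file->symbols[i].name, name) == 0) {
            *out = inst->load_base + inst->file->symbols[i].offset;
            return 1;
        }
    }
    return 0;
}
```

Two instances pointing at the same parsed_file then yield different
addresses for the same symbol without duplicating the parsed data.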

>> One problem may be that it may not be clear whether one has a
>> pointer to a linear code space or to a distributed NUMA data space.
>> It might be reasonable to model the linear code space as a 64-bit
>> CORE_ADDR, with the top half zero, while a NUMA address has non-zero
>> values in the top half.  (I don't know if there might be alias
>> problems, where zero might be valid for the top half of a NUMA address.)
> 
> I think this isn't going to be a problem, but it's hard to tell.  Can
> you think of a specific case where we wouldn't be able to tell which
> we have?

Only if the <processor,thread> component of a NUMA address can be
zero, in which case it looks like a linear address.
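
To make the alias concrete, here is a small sketch assuming an invented
layout: a 64-bit address whose top half holds a 16-bit processor and a
16-bit thread, and whose bottom half holds a 32-bit offset.  The names
(numa_pack, looks_linear, numa_pack_biased) are hypothetical.  The biased
variant is one possible workaround, not something proposed in this thread:
it stores processor+1 so that the top half is never zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR; assumed to be 64 bits wide here. */
typedef uint64_t core_addr;

/* Pack <processor,thread,offset> with the tag in the top 32 bits. */
core_addr numa_pack(uint16_t processor, uint16_t thread, uint32_t offset)
{
    return ((core_addr)processor << 48) | ((core_addr)thread << 32) | offset;
}

/* The proposed test: a zero top half means "linear code address". */
bool looks_linear(core_addr addr)
{
    return (addr >> 32) == 0;
}

/* Possible workaround (an assumption of this sketch): bias the processor
   field by one so <0,0> no longer encodes to a zero top half. */
core_addr numa_pack_biased(uint16_t processor, uint16_t thread,
                           uint32_t offset)
{
    assert(processor < UINT16_MAX);   /* leave room for the +1 bias */
    return numa_pack((uint16_t)(processor + 1), thread, offset);
}
```

With the plain packing, processor 0 / thread 0 is indistinguishable from a
linear address; with the bias it is not, at the cost of one processor number.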

>> I'd be very happy figuring out where to put a hook which allowed me
>> to translate a NUMA CORE_ADDR into a physical address, setting the
>> thread appropriately.  A bit of a kludge, but probably workable.
> 
> CORE_ADDR should be capable of addressing all memory on the system.  I
> think you'll make a lot of trouble for yourself if you don't follow
> that rule.

The NUMA address has to be translated into a physical address somewhere.
Perhaps lower in the access routines is better.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
