Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* Re: [PATCH] Use mmap for symbol tables
@ 2006-01-31  4:53 David Anderson
  0 siblings, 0 replies; 31+ messages in thread
From: David Anderson @ 2006-01-31  4:53 UTC (permalink / raw)
  To: eirik, jimb; +Cc: gdb-patches


Eirik Fuller <eirik@hackrat.com>
> I've already mentioned that the wasted address space isn't all that big,
> at least not in the symbol tables I'm accustomed to.  Anyone who is
> crowding the limits of virtual address space will run out soon enough

The size of the text and the data can be very large indeed.
And those, as mmap,  are not going to be used normally.
So are wasted virtual address space in the debugger.

If gdb runs out of space building what it needs that's one thing.
But running out because gdb is wasting space is another thing
entirely -- best not to go there.

Again, this is a 'been there done that' comment and
is, I hope, not a waste of everyone's time (or worse).
David Anderson


^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [PATCH] Use mmap for symbol tables
@ 2006-01-31  4:40 David Anderson
  2006-01-31  5:00 ` Eirik Fuller
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: David Anderson @ 2006-01-31  4:40 UTC (permalink / raw)
  To: gdb-patches; +Cc: jimb, eirik



Jim Blandy
> I understand that it would make your BFD code more complicated, but it
> seems to me you want to map individual sections, not entire files.
> Again, this will still share memory with the block cache, so aside
> from the complexity I don't see the downside.

Eirik Fuller
>I don't see the upside of making the code more complicated.  The
>downside of the extra complication is that it makes the patch less
>likely to the point of never actually existing.  :-)
>
>Could you be more specific about why multiple mmap regions per file are
>preferable?  (It might help to keep in mind that I'm using PROT_READ and
>MAP_SHARED).  The only downside I can see is the (relatively small)
>fraction of each symbol table which is not accessed via mmap, but that
>doesn't use memory, just virtual address space (if it does use memory,
>that contradicts the "not accessed" part).

Hope the following is not out of line.

The phrase 'just virtual address space' should not, IMO, be
uttered so cavalierly.  Mapping in one glob will in some cases
make gdb useless for some apps (fearless prediction).

Been there, done that, with IRIX/dbx and  'just virtual address
space' was a disaster for a few big apps with large value to
the owners/users (when the apps needed debugging).  Meaning we
did exactly what Eirik Fuller proposes and had to drop back to
exactly what Jim Blandy proposes.  [The problem was with 32 bit
address space of the debugger as it existed a few years ago.]

And  don't mmap in a whole core file for the same reason (not
that such is being proposed, this is just another thought).

David Anderson


^ permalink raw reply	[flat|nested] 31+ messages in thread
* [PATCH] Use mmap for symbol tables
@ 2006-01-29 23:36 Eirik Fuller
  2006-01-30  5:04 ` Jim Blandy
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Eirik Fuller @ 2006-01-29 23:36 UTC (permalink / raw)
  To: gdb-patches

I'm enclosing a patch which modifies gdb to use mmap instead of malloc
and seek and read.  It's a rough cut, in a number of respects.  I have
not written a ChangeLog entry or test case or documentation.  I assume
there are other places in gdb that can use this, but I've only made
changes for a few of them (the biggest memory users for the symbol
table formats I use most).  I have not provided a way to select this
at compile time, or test for features it requires at configure time.
The patch doesn't even indent the code properly, to make it a little
bit easier to see what's different.

My immediate goal is to solicit opinions on whether the basic approach
is sound.  Comments in gdb/dwarf2read.c indicate the idea is not new.
I first started using an approach similar to what's in the enclosed
patch long before I noticed those comments; those comments finally
motivated me to dust this off.

This patch takes a very simple approach.  It mmaps the entire symbol
table file the first time any caller tries to use the new function (I
am in no way loyal to the name bfd_fetch, but I think it's preferable
to, say, bfd_mmap in that it conveys the notion that a single mmap
call can lead to multiple accesses).  It munmaps the file contents
only when its BFD is deleted.

If there's any drawback to the simple approach of mmaping the entire
file, it's that it uses more than bare minimum of virtual address
space.  I haven't seen a symbol table big enough for that to matter.
For the symbol tables I use, most of that address space will be used
anyway (the symbol table data occupies far more than half of the file
size in the symbol tables I use), but it fits in the file system cache
rather than in swap space, and concurrent gdb processes on the same
symbol table can save considerable amounts of system memory.

If there's any one drawback to the general idea of using mmap, it's
that in some cases (like symbol tables over NFS) the file data can
become unavailable, if an NFS server goes down or the file disappears
(unlinked open files don't work reliably over NFS).  I suspect those
same drawbacks can kick in even without mmap, if gdb needs to flesh
out partial symbols long after it first reads the symbol table.  It
wouldn't be difficult to add code to make this behavior optional even
when it's compiled in, if that seems useful.

This patch has one interesting benefit; it papers over a memory leak
in elfxx-mips.c; the READ macro in _bfd_mips_elf_read_ecoff_info uses
bfd_malloc instead of, say, bfd_alloc or the obstack machinery.
Choosing another symbol table with the file command, or even just
unloading the current one, doesn't seem to release that memory.  I can
investigate that leak further if that seems useful, but for my own
purposes it's easier to use mmap for the big allocations.  It's
possible there are similar leaks for other symbol table formats, but I
haven't fully explored that question.

I'm open to suggestions about how to improve this patch, or how to
make its adoption more likely.  I don't think I've done any copright
assignment paperwork for gdb.  I discussed it with my manager years
ago, but I don't specifically remember anything coming of that.  I
plan to fix that some day; perhaps this patch will help provide
impetus for that.

Here's the patch:

--- bfd/bfd-in2.h.orig	2006-01-18 13:07:48.000000000 -0800
+++ bfd/bfd-in2.h	2006-01-29 14:30:38.000000000 -0800
@@ -476,6 +476,7 @@
 extern file_ptr bfd_tell (bfd *);
 extern int bfd_flush (bfd *);
 extern int bfd_stat (bfd *, struct stat *);
+extern void *bfd_fetch (file_ptr fp, bfd_size_type size, bfd *abfd);
 
 /* Deprecated old routines.  */
 #if __GNUC__
@@ -4270,6 +4271,9 @@
   BFD_SEND (obfd, _bfd_copy_private_symbol_data, \
             (ibfd, isymbol, obfd, osymbol))
 
+#include <unistd.h>
+#include <sys/mman.h>
+
 /* Extracted from bfd.c.  */
 struct bfd
 {
@@ -4435,6 +4439,10 @@
      struct objalloc *, but we use void * to avoid requiring the inclusion
      of objalloc.h.  */
   void *memory;
+  struct {
+    void *contents;
+    size_t length;
+  } mmap;
 };
 
 typedef enum bfd_error
--- bfd/opncls.c.orig	2005-11-03 08:06:11.000000000 -0800
+++ bfd/opncls.c	2006-01-29 14:39:22.000000000 -0800
@@ -117,6 +117,10 @@
 {
   bfd_hash_table_free (&abfd->section_htab);
   objalloc_free ((struct objalloc *) abfd->memory);
+
+  if (abfd->mmap.contents)
+    munmap(abfd->mmap.contents, abfd->mmap.length);
+
   free (abfd);
 }
 
@@ -301,6 +305,30 @@
   return bfd_fopen (filename, target, mode, fd);
 }
 
+void *
+bfd_fetch (file_ptr fp, bfd_size_type size, bfd* abfd)
+{
+  if (!abfd->mmap.contents)
+    {
+      struct stat buf;
+      int fd = open(abfd->filename, O_RDONLY);
+      void *map;
+      size_t length;
+      fstat(fd, &buf);
+      length = buf.st_size;
+      map = mmap(0, length, PROT_READ, MAP_SHARED, fd, 0);
+      if (map && map != (void *) -1) {
+	abfd->mmap.length = length;
+	abfd->mmap.contents = map;
+      }
+      close(fd);
+    }
+  if (abfd->mmap.contents && abfd->mmap.length >= fp + size)
+    return abfd->mmap.contents + fp;
+  else
+    return NULL;
+}
+
 /*
 FUNCTION
 	bfd_openstreamr
--- bfd/ecoff.c.orig	2005-12-26 19:06:27.000000000 -0800
+++ bfd/ecoff.c	2006-01-29 12:44:16.000000000 -0800
@@ -556,18 +556,23 @@
       ecoff_data (abfd)->sym_filepos = 0;
       return TRUE;
     }
+  pos = ecoff_data (abfd)->sym_filepos;
+  pos += backend->debug_swap.external_hdr_size;
+
+  raw = bfd_fetch(pos, raw_size, abfd);
+
+  if (!raw) {
   raw = bfd_alloc (abfd, raw_size);
   if (raw == NULL)
     return FALSE;
 
-  pos = ecoff_data (abfd)->sym_filepos;
-  pos += backend->debug_swap.external_hdr_size;
   if (bfd_seek (abfd, pos, SEEK_SET) != 0
       || bfd_bread (raw, raw_size, abfd) != raw_size)
     {
       bfd_release (abfd, raw);
       return FALSE;
     }
+  }
 
   ecoff_data (abfd)->raw_syments = raw;
 
--- bfd/elfxx-mips.c.orig	2005-12-30 21:02:22.000000000 -0800
+++ bfd/elfxx-mips.c	2006-01-29 12:46:13.000000000 -0800
@@ -768,12 +768,15 @@
   else									\
     {									\
       bfd_size_type amt = (bfd_size_type) size * symhdr->count;		\
+      debug->ptr = bfd_fetch (symhdr->offset, amt, abfd);		\
+      if (debug->ptr == NULL) {						\
       debug->ptr = bfd_malloc (amt);					\
       if (debug->ptr == NULL)						\
 	goto error_return;						\
       if (bfd_seek (abfd, symhdr->offset, SEEK_SET) != 0		\
 	  || bfd_bread (debug->ptr, amt, abfd) != amt)			\
 	goto error_return;						\
+      }									\
     }
 
   READ (line, cbLineOffset, cbLine, sizeof (unsigned char), unsigned char *);
--- gdb/dbxread.c.orig	2005-12-17 14:33:59.000000000 -0800
+++ gdb/dbxread.c	2006-01-29 12:47:10.000000000 -0800
@@ -708,6 +708,11 @@
 	    error (_("ridiculous string table size (%d bytes)."),
 		   DBX_STRINGTAB_SIZE (objfile));
 
+	  DBX_STRINGTAB (objfile) = bfd_fetch(STRING_TABLE_OFFSET,
+					      DBX_STRINGTAB_SIZE (objfile),
+					      sym_bfd);
+
+	  if (!DBX_STRINGTAB (objfile)) {
 	  DBX_STRINGTAB (objfile) =
 	    (char *) obstack_alloc (&objfile->objfile_obstack,
 				    DBX_STRINGTAB_SIZE (objfile));
@@ -723,6 +728,7 @@
 			   sym_bfd);
 	  if (val != DBX_STRINGTAB_SIZE (objfile))
 	    perror_with_name (name);
+	  }
 	}
     }
 }
--- gdb/dwarf2read.c.orig	2006-01-17 14:30:29.000000000 -0800
+++ gdb/dwarf2read.c	2006-01-29 13:14:11.000000000 -0800
@@ -4953,6 +4953,9 @@
   if (size == 0)
     return NULL;
 
+  buf = bfd_fetch(sectp->filepos, size, abfd);
+
+  if (!buf) {
   buf = obstack_alloc (&objfile->objfile_obstack, size);
   retbuf = symfile_relocate_debug_section (abfd, sectp, buf);
   if (retbuf != NULL)
@@ -4962,6 +4965,7 @@
       || bfd_bread (buf, size, abfd) != size)
     error (_("Dwarf Error: Can't read DWARF data from '%s'"),
 	   bfd_get_filename (abfd));
+  }
 
   return buf;
 }


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2006-02-20 15:52 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-31  4:53 [PATCH] Use mmap for symbol tables David Anderson
  -- strict thread matches above, loose matches on Subject: below --
2006-01-31  4:40 David Anderson
2006-01-31  5:00 ` Eirik Fuller
2006-01-31  5:34   ` Jim Blandy
2006-01-31 14:00     ` Daniel Jacobowitz
2006-01-31 18:39       ` Jim Blandy
2006-02-01 18:11         ` Eirik Fuller
2006-01-31 17:45 ` David Anderson
2006-01-31 18:24 ` Jim Blandy
2006-01-29 23:36 Eirik Fuller
2006-01-30  5:04 ` Jim Blandy
2006-01-30 11:44   ` Eirik Fuller
2006-01-30 18:07     ` Jim Blandy
2006-01-30 18:59       ` Eirik Fuller
2006-01-30 22:11         ` Jim Blandy
2006-01-31  0:38           ` Eirik Fuller
2006-01-31  1:49             ` Jim Blandy
2006-01-31  3:12               ` Eirik Fuller
2006-01-31 21:48             ` Mark Kettenis
2006-02-01 17:52               ` Eirik Fuller
2006-02-01  6:04       ` Michael Snyder
2006-01-30 11:34 ` Andrew STUBBS
2006-01-30 11:42   ` Corinna Vinschen
2006-01-30 11:48     ` Andrew STUBBS
2006-01-31  2:23 ` Daniel Jacobowitz
2006-01-31  3:31   ` Eirik Fuller
2006-01-31  3:38     ` Daniel Jacobowitz
2006-02-07 22:05     ` Eirik Fuller
2006-02-20 15:52       ` Daniel Jacobowitz
2006-01-31  5:28   ` Jim Blandy
2006-01-31 13:59     ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox