From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-42541-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 13726 invoked by alias); 30 Jan 2006 18:59:16 -0000
Received: (qmail 13714 invoked by uid 22791); 30 Jan 2006 18:59:14 -0000
X-Spam-Check-By: sourceware.org
Received: from ns.hackrat.net (HELO ns.hackrat.org) (157.22.130.2)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 30 Jan 2006 18:59:12 +0000
Received: from [192.168.0.241] (kagnu.hackrat.com [192.168.0.241]) 	by ns.hackrat.org (Postfix) with ESMTP id 33BBE690067; 	Mon, 30 Jan 2006 10:59:10 -0800 (PST)
Message-ID: <43DE61FD.2090309@hackrat.com>
Date: Mon, 30 Jan 2006 18:59:00 -0000
From: Eirik Fuller <eirik@hackrat.com>
User-Agent: Debian Thunderbird 1.0.2 (X11/20051002)
MIME-Version: 1.0
To: Jim Blandy <jimb@red-bean.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] Use mmap for symbol tables
References: <20060129233630.3EFA6690067@ns.hackrat.org>	 <8f2776cb0601292104y1c29f4adl6e681b20cd86c177@mail.gmail.com>	 <43DDFC13.90909@hackrat.com> <8f2776cb0601301007k5bccb594he33b08e84096d1a2@mail.gmail.com>
In-Reply-To: <8f2776cb0601301007k5bccb594he33b08e84096d1a2@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-01/txt/msg00474.txt.bz2

> But from what you've said, it sounds like the kernel recognizes
> there's a linear scan going on and starts doing read-ahead, so you can
> do the network I/O in parallel with GDB's processing.  Do I have that
> right?

The read-ahead effect is imperfect, at least on the system I'm using.
If it were better, I'd saturate the network interface, but it's running
at more like half speed.

With a fresh mount (no symbol table caching), the mmap patch provides
interleaving.  Without the patch, first it takes a long time to read the
entire symbol table into memory (which saturates the network interface),
then a second pass saturates the CPU.  I really should make some actual
measurements; at this point I'm not entirely sure that the elapsed time
is less for a fresh mount (but it's certainly less after the symbol
table is in the file system cache ... but I should measure that too).
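
For illustration only (this is not gdb code and not part of the patch),
here is a minimal Python sketch of the two access patterns described
above.  The function names and the newline-counting "processing" step
are made-up stand-ins for gdb's symbol reading.  How much overlap the
mmap version actually achieves depends on the kernel's read-ahead.

```python
import mmap
import os


def count_newlines_read(path):
    """Two-pass pattern: slurp the whole file, then process it."""
    with open(path, "rb") as f:
        data = f.read()          # all I/O happens here, before any processing
    return data.count(b"\n")     # processing starts only after the read finishes


def count_newlines_mmap(path, chunk=1 << 20):
    """Interleaved pattern: map the file and process it chunk by chunk."""
    total = 0
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0             # mmap rejects empty files
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for start in range(0, len(m), chunk):
                # Touching a slice faults its pages in on demand, so I/O for
                # later chunks can overlap with processing of earlier ones
                # (how well depends on the kernel's read-ahead).
                total += m[start:start + chunk].count(b"\n")
    return total
```

Wrapping each call in time.perf_counter() readings, once on a fresh
mount and once with the file already cached, would give the elapsed
time comparison mentioned above.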

> However, doesn't that mean that changes to the data by other processes
> (say) would become visible to GDB?  What happens when you recompile
> the program while GDB's running?

In our environment that's very rare (once a release ships, we really
need the symbol table to match what shipped), enough that I hadn't even
given it much thought, but I have a couple of offhand reactions.

One, it's roughly the same problem as NFS server outages (yes, the
details will vary; in our environment NFS outages are far more common
than symbol table clobberage).

Two, if the build process unlinks the symbol table before commencing the
link, gdb won't see the changes (at least not in mmap'd data).  Renaming
the old symbol table is better than unlinking it, all in all, but on a
local file system unlinking is sufficient: an unlinked file stays intact
for as long as some process still has it open or mapped.  With NFS that
isn't enough; the conventional .nfs turd file hack is useless when a
different client unlinks the file.  I just double checked; our Makefiles
do remove the old symbol table first.
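
The unlink-before-relink behavior can be demonstrated with a small
Python sketch.  It assumes a local POSIX file system (not NFS); the file
name "symtab" and the function name are hypothetical, chosen only for
this example.

```python
import mmap
import os
import tempfile


def mapped_view_after_rebuild(old, new):
    """Map a file holding `old`, replace it by unlink + recreate, and
    return what the original mapping still shows."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "symtab")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(old)

    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    # What the Makefile does: remove the old file, then write a new one
    # under the same name.  The new file gets a fresh inode.
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(new)

    seen = m[:]          # the mapping still refers to the unlinked inode
    m.close()
    os.unlink(path)
    os.rmdir(workdir)
    return seen
```

Calling mapped_view_after_rebuild(b"old symbols", b"new symbols") should
return the old contents.  A build that overwrites the file in place,
without unlinking it first, would not get this protection.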

In an earlier message I commented about problems which might occur while
fleshing out partial symbols, if a symbol table becomes unavailable.  In
that commentary I assumed that gdb reads from the symbol table when it
promotes partial symbols to symbols; a quick glance at the code suggests
I assumed correctly.  Does gdb handle a symbol table which changes out
from under it when it fleshes out partial symbols?  If not, then the
mmap patch doesn't make things fundamentally worse; it's just a matter
of degree.  If gdb does handle changes to a symbol table gracefully,
then I wonder if the way it does that can somehow be extended to mmap.

> I'm just concerned about wasting address space.  People these days do
> have awfully big programs.  There are executables out there in the
> gigabytes (cue lurkers to share their horror stories).

I've already mentioned that the wasted address space isn't all that big,
at least not in the symbol tables I'm accustomed to.  Anyone who is
crowding the limits of virtual address space will run out soon enough
whether they use malloc/seek/read or mmap; the best long term answer,
short of a completely different symbol table format (one which doesn't
require slurping the entire file to build an index that belongs in the
file format to begin with), is to buy amd64 processors.  I've already
bought myself a dual Opteron system for similar reasons, but that has
more to do with multi-gigabyte core files than big symbol tables.

If I had symbol tables big enough to crowd the address space limits on
the processor gdb runs on, I would switch to a new symbol table format.
At the very least I'd break the existing file into two pieces, the piece
gdb needs and the piece it doesn't need (but really it would be better
in the long run, for a number of reasons, to completely overhaul the
symbol table format).

One concern I have about the extra complication of mmapping pieces of
the file is that it could conceivably use more address space rather than
less.  If different parts of gdb use overlapping regions of the symbol
table, the code has to be careful to share an existing mapping (or pay
the penalty of mapping the same region twice).  The whole-file approach
to mmap pays the address space cost up front, but that cost never grows
beyond the size of the file.
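
To make the arithmetic concrete, here is a rough Python sketch that
compares the address space used by separate, possibly overlapping
mappings against a single whole-file mapping.  The region list is
invented for illustration and the helper names are hypothetical; nothing
here reflects how gdb actually divides up a symbol table.

```python
import mmap

PAGE = mmap.PAGESIZE


def mapped_bytes(offset, length, page=PAGE):
    """Address space consumed by mapping [offset, offset+length),
    rounded out to page boundaries as mmap requires."""
    start = (offset // page) * page
    end = -(-(offset + length) // page) * page   # round up to a page
    return end - start


def piecewise_cost(regions, page=PAGE):
    """Total address space if every (offset, length) region gets its
    own mapping, with no sharing between overlapping regions."""
    return sum(mapped_bytes(off, n, page) for off, n in regions)


def whole_file_cost(file_size, page=PAGE):
    """Address space for one mapping that covers the entire file."""
    return mapped_bytes(0, file_size, page)


# Made-up example: a 40960-byte file and three overlapping regions.
regions = [(0, 30000), (20000, 20000), (8192, 30000)]
print(piecewise_cost(regions, page=4096), whole_file_cost(40960, page=4096))
```

With 4096-byte pages the three unshared mappings add up to 90112 bytes
of address space, against 40960 for the single whole-file mapping.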

I should point out that I don't have any evidence to support a concern
about mmapping regions more than once.  I have a vague recollection, from
years ago, of being surprised that there were redundant mmap regions in
a gdb process, but I'm not even positive that it was for a symbol table.
The way I remember it, it might have been related to the use of the same
file for executable and symbol table, but I just don't remember the details.

I really should gather some timing information and pass it along.  I'll
try to do that today.