From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-42543-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 1353 invoked by alias); 31 Jan 2006 00:38:31 -0000
Received: (qmail 1345 invoked by uid 22791); 31 Jan 2006 00:38:31 -0000
X-Spam-Check-By: sourceware.org
Received: from ns.hackrat.net (HELO ns.hackrat.org) (157.22.130.2)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 31 Jan 2006 00:38:29 +0000
Received: from [192.168.0.241] (kagnu.hackrat.com [192.168.0.241]) 	by ns.hackrat.org (Postfix) with ESMTP id 476EE690067; 	Mon, 30 Jan 2006 16:38:27 -0800 (PST)
Message-ID: <43DEB182.4020804@hackrat.com>
Date: Tue, 31 Jan 2006 00:38:00 -0000
From: Eirik Fuller <eirik@hackrat.com>
User-Agent: Debian Thunderbird 1.0.2 (X11/20051002)
MIME-Version: 1.0
To: Jim Blandy <jimb@red-bean.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] Use mmap for symbol tables
References: <20060129233630.3EFA6690067@ns.hackrat.org>	 <8f2776cb0601292104y1c29f4adl6e681b20cd86c177@mail.gmail.com>	 <43DDFC13.90909@hackrat.com>	 <8f2776cb0601301007k5bccb594he33b08e84096d1a2@mail.gmail.com>	 <43DE61FD.2090309@hackrat.com> <8f2776cb0601301411j17219c9en74881de2489ad129@mail.gmail.com>
In-Reply-To: <8f2776cb0601301411j17219c9en74881de2489ad129@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2006-01/txt/msg00476.txt.bz2

> For inclusion in the public GDB sources, I'd want GDB's sensitivity to
> the files being changed out from underneath it to be unaffected,
> regardless of the details of peoples' build processes.

Fair enough.  Making the mmap patch a configure-time option (not just
detecting the presence of mmap, but requiring that the use of mmap be
explicitly enabled) would accomplish that.  Alternatively, it could always
be compiled in when mmap is detected at configure time but left disabled
by default.
If it's available, but is enabled only by request, and if the associated
risks are documented, I don't think GDB's ability to handle changed
files is diminished in any meaningful sense.
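
A minimal sketch of the second variant (compiled in when available, off
unless requested).  HAVE_MMAP is the macro that autoconf's AC_FUNC_MMAP
defines; every other name here is hypothetical and not taken from the
patch.

```c
/* Hypothetical names throughout; this is not code from the patch.  */

enum symfile_load_method
{
  LOAD_WITH_READ,   /* Copy the file contents into malloc'd memory.  */
  LOAD_WITH_MMAP    /* Map the file and let the kernel page it in.  */
};

/* Off by default; a user would have to turn it on explicitly, e.g.
   through a command-line option or a "set" command.  */
static int use_mmap_for_symfiles = 0;

static enum symfile_load_method
choose_symfile_load_method (void)
{
#ifdef HAVE_MMAP
  /* mmap was detected at configure time, so honor the user's request.  */
  if (use_mmap_for_symfiles)
    return LOAD_WITH_MMAP;
#endif
  /* Either mmap is unavailable or nobody asked for it.  */
  return LOAD_WITH_READ;
}
```

With this shape, a build without mmap and a build where nobody sets the
flag behave exactly like the unpatched code path.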

Of course, ignoring mmap altogether would just as easily accomplish the
same goal.  :-)

> Have you explained the disadvantages to simply using MAP_PRIVATE?

No, not yet.

> Does MAP_PRIVATE require the whole file to be read before the mmap
> system call can return, like 'read' does?

I doubt it.  I'm not sure MAP_PRIVATE is any different from MAP_SHARED
if PROT_READ is specified.  It's possible that MAP_PRIVATE preserves the
original file contents (MAP_SHARED certainly doesn't), but my reading of
the mmap man page tells me it makes no such promise.
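
For reference, a read-only private mapping might be set up roughly as
follows.  This is a sketch, not the code from the patch, and the helper
name is made up.  The mmap call only establishes the mapping; pages are
faulted in on first access.  POSIX leaves it unspecified whether later
changes to the file show up through a MAP_PRIVATE mapping, so this does
not by itself protect against the file changing underneath.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the patch: map PATH read-only and
   privately.  Returns the mapping and stores its length in *LEN, or
   returns NULL on failure (including an empty file).  */
static const void *
map_symfile_readonly (const char *path, size_t *len)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    {
      close (fd);
      return NULL;
    }

  /* Sets up the mapping without reading the file contents up front.  */
  void *p = mmap (NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close (fd);  /* The mapping stays valid after the descriptor is closed.  */
  if (p == MAP_FAILED)
    return NULL;

  *len = (size_t) st.st_size;
  return p;
}
```

The caller would release the mapping with munmap when the symbol data is
no longer needed.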

> I'm glad it's not a problem for you, but I'm not sure that's the best
> answer for all of GDB's users.

Fair enough.  I still think that anyone for whom mmap burns too much
address space is a few months of bloat away from losing the rest even
with malloc.  :-)

> I'd love to see some timings.

All these years I'd assumed the mmap patch reduced startup time, but my
recent observations got me wondering (the network monitor bar in xosview
is a recent addition to my configuration).  Now that I've actually done
some simple timings, I realize that the faster network throughput of
read (compared to mmap) reduces the total startup time, even though the
network I/O is amortized in the mmap case (the other way to see it is
that the CPU isn't busy all the time with a fresh mount, with mmap).

With a slightly modified patch (to make mmap optional at run time) I always
get around 42 seconds of user CPU time on the symbol table file I used
for the tests (between 42.35 and 42.71 in a total of four samplings;
that variation looks like noise).  The system time for read is 3.22
seconds with the symbol table in cache and 3.67 seconds with a fresh
mount.  The same numbers for mmap are 0.62 (in cache) and 1.9 seconds
(fresh mount), so the syscall time is less with mmap (no big surprise there).
The elapsed time with the file in cache is 43.3 seconds with mmap and
45.6 seconds with read.

With a fresh mount, read takes 1:14 and mmap takes 1:33; that's
consistent with the lower network throughput I see with mmap, in
my xosview window.  It might be interesting to compare network
traces between read and mmap, but it looks like read just does
better readahead than mmap.
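
To put the elapsed times side by side, here is the arithmetic as a small
sketch.  It assumes 1:14 and 1:33 are minutes:seconds of elapsed time;
the names are just for illustration.

```c
/* Elapsed times in seconds, copied from the figures quoted above.  */
static const double cached_read = 45.6, cached_mmap = 43.3;
static const double fresh_read = 74.0;   /* 1:14 */
static const double fresh_mmap = 93.0;   /* 1:33 */

/* Percentage by which NEW_TIME differs from OLD_TIME (negative means
   faster).  */
static double
percent_change (double old_time, double new_time)
{
  return 100.0 * (new_time - old_time) / old_time;
}
```

percent_change (cached_read, cached_mmap) comes to about -5, and
percent_change (fresh_read, fresh_mmap) to about +26.  The 2.3 second
gain in the cached case is close to the 2.6 second drop in system time
(3.22 - 0.62).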

I can reduce the gdb startup time on a fresh mount if I start dd on the
same file before I start gdb (yes, that's cheating :-)
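
The dd trick amounts to reading the file sequentially and throwing the
data away, so that the pages are already cached when gdb touches them.
A rough C equivalent, with a made-up helper name and an arbitrary buffer
size, would be:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read PATH sequentially and discard the data, which
   is roughly what "dd if=PATH of=/dev/null" does.  Returns the number of
   bytes read, or -1 on error.  */
static long long
prefetch_file (const char *path)
{
  static char buf[1 << 20];   /* 1 MiB per read; the size is arbitrary.  */
  long long total = 0;
  ssize_t n;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return -1;

  while ((n = read (fd, buf, sizeof buf)) > 0)
    total += n;

  close (fd);
  return n < 0 ? -1 : total;
}
```

To mimic what I did, this would have to run in a separate process or
thread that gets a head start on gdb.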

My tests are on a dual 700 MHz PIII (yeah, so I like dinosaurs :-) with
2GB of PC-100 SDRAM and a single e100 network interface, full duplex.
The symbol table is a 336620600 byte DWARF/ELF file with segment sizes
of 22135266/2578672/33359212 (text/data/bss).

So for gdb startup time (which, I think, is the only place mmap makes a
difference in CPU performance), the best case for mmap is only slightly
better than the best case for read, and the worst case for mmap is
noticeably worse than the worst case for read.

In view of all that, I think it's fair to say that the real benefit of
the mmap patch is the memory footprint.  Perhaps I'll find time to run
some simple benchmarks on a system with considerably less memory (when I
used gdb without my mmap patch on a dual 550 MHz Xeon with 768MB, it
seemed painfully slow; perhaps I should try to quantify that).

About the other issue (reacting in a reasonable manner to a symbol table
which disappears or changes out from under gdb), I agree that enabling
mmap by default on systems which support it would be a mistake.
I think it's a performance tradeoff which needs to be consciously
chosen, not inflicted on GDB users without their consent.

Having said that, I believe that enabling mmap for symbol tables makes
more sense in my environment than not enabling it.  Unless that's an
extreme minority opinion, it seems like a vote in favor of adding (a
suitable mutation of) the patch and disabling it by default (but either
way, I'm free to use my own privately maintained GDB :-).

Thanks for your feedback!