From: Eirik Fuller
To: Jim Blandy
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] Use mmap for symbol tables
Date: Tue, 31 Jan 2006 00:38:00 -0000
Message-ID: <43DEB182.4020804@hackrat.com>
In-Reply-To: <8f2776cb0601301411j17219c9en74881de2489ad129@mail.gmail.com>

> For inclusion in the public GDB sources, I'd want GDB's sensitivity to
> the files being changed out from underneath it to be unaffected,
> regardless of the details of people's build processes.

Fair enough. Making the mmap patch a configure-time option (not just
detecting the presence of mmap, but requiring that use of mmap be
explicitly enabled) would accomplish that. Another option is to always
compile it in if mmap is detected at configure time, but leave it
disabled by default.
If it's available, but is enabled only by request, and if the
associated risks are documented, I don't think GDB's ability to handle
changed files is diminished in any meaningful sense. Of course,
ignoring mmap altogether would just as easily accomplish the same
goal. :-)

> Have you explained the disadvantages to simply using MAP_PRIVATE?

No, not yet.

> Does MAP_PRIVATE require the whole file to be read before the mmap
> system call can return, like 'read' does?

I doubt it. I'm not sure MAP_PRIVATE is any different from MAP_SHARED
if PROT_READ is specified. It's possible that MAP_PRIVATE preserves the
original file contents (MAP_SHARED certainly doesn't), but my reading
of the mmap man page tells me it makes no such promise.

> I'm glad it's not a problem for you, but I'm not sure that's the best
> answer for all of GDB's users.

Fair enough. I still think that anyone for whom mmap burns too much
address space is a few months of bloat away from losing the rest even
with malloc. :-)

> I'd love to see some timings.

All these years I'd assumed the mmap patch reduced startup time, but my
recent observations got me wondering (the network monitor bar in
xosview is a recent addition to my configuration). Now that I've
actually done some simple timings, I realize that the faster network
throughput of read (compared to mmap) reduces the total startup time,
even though the network I/O is amortized in the mmap case (the other
way to see it is that, with mmap, the CPU isn't busy all the time on a
fresh mount).

With a slightly modified patch (to make mmap optional at run time) I
always get around 42 seconds of user CPU time on the symbol table file
I used for the tests (between 42.35 and 42.71 in a total of four
samplings; that variation looks like noise). The system time for read
is 3.22 seconds with the symbol table in cache and 3.67 seconds with a
fresh mount.
The same numbers for mmap are 0.62 seconds (in cache) and 1.9 seconds
(fresh mount), so the syscall time is less with mmap (no big surprise
there). The elapsed time with the file in cache is 43.3 seconds with
mmap and 45.6 seconds with read. With a fresh mount, read takes 1:14
and mmap takes 1:33; that's consistent with the lower network
throughput I see with mmap in my xosview window. It might be
interesting to compare network traces between read and mmap, but it
looks like read just does better readahead than mmap. I can reduce the
gdb startup time on a fresh mount if I start dd on the same file before
I start gdb (yes, that's cheating :-)

My tests are on a dual 700 MHz PIII (yeah, so I like dinosaurs :-) with
2GB of PC-100 SDRAM and a single e100 network interface, full duplex.
The symbol table is a 336620600 byte DWARF/ELF file with segment sizes
of 22135266/2578672/33359212 (text/data/bss).

So for gdb startup time (which, I think, is the only place mmap makes a
difference in CPU performance), the best case for mmap is only slightly
better than the best case for read, and the worst case for mmap is
noticeably worse than the worst case for read.

In view of all that, I think it's fair to say that the real benefit of
the mmap patch is the memory footprint. Perhaps I'll find time to run
some simple benchmarks on a system with considerably less memory (when
I used gdb without my mmap patch on a dual 550 MHz Xeon with 768MB, it
seemed painfully slow; perhaps I should try to quantify that).

About the other issue (reacting in a reasonable manner to a symbol
table which disappears or changes out from under gdb), I agree that
enabling mmap by default on systems which support it would be a
mistake. I think it's a performance tradeoff which needs to be
consciously chosen, not inflicted on GDB users without their consent.

Having said that, I believe that enabling mmap for symbol tables makes
more sense in my environment than not enabling it.
Unless that's an extreme minority opinion, it seems like a vote in
favor of adding (a suitable mutation of) the patch and disabling it by
default (but either way, I'm free to use my own privately maintained
GDB :-).

Thanks for your feedback!
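
For anyone following the thread, the "optional at run time" idea could
look roughly like the sketch below. It is not taken from the patch and
is not GDB code: load_image, free_image, struct file_image and the
use_mmap flag are hypothetical names chosen for illustration, and the
error handling is minimal. It only shows the two loading paths being
compared (a malloc'd buffer filled by read, versus a read-only
MAP_PRIVATE mapping) behind a single switch.

```c
/* Sketch only; hypothetical names, not the patch under discussion.  */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct file_image
{
  void *data;    /* Start of the file contents.  */
  size_t size;   /* Length in bytes.  */
  int mapped;    /* Nonzero if DATA came from mmap.  */
};

/* Load PATH into IMG.  If USE_MMAP is nonzero, map the file read-only;
   otherwise malloc a buffer and read the whole file into it.
   Empty files are rejected.  Returns 0 on success, -1 on failure.  */
int
load_image (const char *path, int use_mmap, struct file_image *img)
{
  struct stat st;
  int fd;

  assert (path != NULL && img != NULL);
  memset (img, 0, sizeof *img);

  fd = open (path, O_RDONLY);
  if (fd < 0)
    {
      perror (path);
      return -1;
    }
  if (fstat (fd, &st) < 0 || st.st_size <= 0)
    {
      close (fd);
      return -1;
    }
  img->size = (size_t) st.st_size;

  if (use_mmap)
    {
      /* Pages are faulted in on demand; nothing is copied up front.  */
      void *p = mmap (NULL, img->size, PROT_READ, MAP_PRIVATE, fd, 0);
      if (p == MAP_FAILED)
        {
          close (fd);
          return -1;
        }
      img->data = p;
      img->mapped = 1;
    }
  else
    {
      /* Copy the whole file into anonymous memory before returning.  */
      char *buf = malloc (img->size);
      size_t done = 0;

      if (buf == NULL)
        {
          close (fd);
          return -1;
        }
      while (done < img->size)
        {
          ssize_t n = read (fd, buf + done, img->size - done);
          if (n <= 0)   /* Error, or file shorter than fstat reported.  */
            {
              free (buf);
              close (fd);
              return -1;
            }
          done += (size_t) n;
        }
      img->data = buf;
    }

  close (fd);   /* A mapping stays valid after the descriptor is closed.  */
  return 0;
}

/* Release whatever load_image set up.  */
void
free_image (struct file_image *img)
{
  if (img->data == NULL)
    return;
  if (img->mapped)
    munmap (img->data, img->size);
  else
    free (img->data);
  memset (img, 0, sizeof *img);
}
```

A caller would pass use_mmap from a user setting that defaults to off,
which matches the "compiled in but disabled by default" suggestion.
Note that POSIX leaves it unspecified whether changes made to the file
after the mmap call are visible through a MAP_PRIVATE mapping, so the
mapped path in this sketch does nothing to protect against the file
changing underneath it.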