From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 6807 invoked by alias); 28 Mar 2002 09:53:49 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: gdb-owner@sources.redhat.com
Received: (qmail 6731 invoked from network); 28 Mar 2002 09:53:46 -0000
Received: from unknown (HELO sj1-3-4-16.securesites.net) (192.220.127.209)
  by sources.redhat.com with SMTP; 28 Mar 2002 09:53:46 -0000
Received: (qmail 31225 invoked by uid 19025); 28 Mar 2002 09:53:46 -0000
Date: Thu, 28 Mar 2002 01:53:00 -0000
From: Jason Molenda
To: Gerald Pfeifer
Cc: overseers@gcc.gnu.org, Zack Weinberg , "Kaveh R. Ghazi" ,
  gcc@gcc.gnu.org, gdb@sources.redhat.com, jimb@redhat.com, rth@redhat.com
Subject: Re: gcc development schedule [Re: sharing libcpp between GDB and GCC]
Message-ID: <20020328015346.A27639@molenda.com>
References: <20020328034552.GB23767@codesourcery.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: ; from pfeifer@dbai.tuwien.ac.at on Thu, Mar 28, 2002 at 10:22:49AM +0100
X-SW-Source: 2002-03/txt/msg00293.txt.bz2

On Thu, Mar 28, 2002 at 10:22:49AM +0100, Gerald Pfeifer wrote:
> On Wed, 27 Mar 2002, Zack Weinberg wrote:
> >>> (E.g. it takes about 10x longer to do "cvs update" on the 3.0
> >>> branch than the trunk.)
> >> Yeah, what's up with that?  (I thought it was just me.)

The cvs server on sourceware uses an optimization (a patch written by
Ian Lance Taylor) to cache the information needed for a cvs update in
a single file per directory.  These are the CVS/fileattr files--you
know, the ones that get out of date once every few months and need to
be blown away.
One of them in the gcc dir looks like

F.cvsignore _head=1.9;_expand=
F.gdbinit _head=1.6;_attic=;_expand=
FABOUT-GCC-NLS _head=1.5;_expand=
FABOUT-NLS _head=1.3;_expand=
FCOPYING _head=1.4;_expand=
FCOPYING.LIB _head=1.4;_expand=
FChangeLog _head=1.13542;_expand=
FChangeLog.0 _head=1.12;_expand=
FChangeLog.1 _head=1.6;_expand=

So the cvs server doesn't have to open every RCS file in the directory
to find its head revision.  A do-nothing cvs update goes much, much
faster.

For the cvs server to update on a branch, it has to read each RCS file
to find the latest version on that branch.  Even worse, it looks like
it'll need to read several blocks into each file for frequently
modified files.  CVS shouldn't have to compute any diffs for most
'cvs update' operations - chances are that only a handful of files
were modified on the branch, so only a few of these expensive diffs
are computed.  It's the get-the-head-revision operation that is
killing it.

I did make a neato little mechanism for Apple's benefit a few months
back.  It logs every cvs commit as it happens to a plain text file,
and provides a web interface to get the list of files that were
modified since the last time the client contacted the server.  (The
scripts on users' systems would just use wget with a pre-set URL to
get this list of files to update.)  It was a neato little thing, but
for trunk updates I found that doing a cvs update for the gcc
repository consistently took less than two minutes, so it wasn't
really all that useful.  I could revisit it and finish it up if people
would use it - it would work particularly well for things like a
branch, where a global cvs update becomes more expensive.

> > That's the problem I know about; there may be others.
>
> Overseers, is there anything we can do?

We could always disable the trunk cvs update optimization.  The cvs
updates on branches would still be slower than trunk cvs updates, but
the difference wouldn't be so stark.
OK, kidding aside, Ian's patch could be modified to keep the head
revisions of branches in the fileattr cache file in addition to the
trunk.  But someone would have to do that.  Goodness knows I don't
have the time.

The sourceware system is under a bit of memory pressure these days,
which doesn't help; if these blocks could be cached for a longer
period of time, this RCS file reading/parsing would go faster.

(Incidentally, Ian's patch is one of the reasons we're still running
cvs 1.10.  Stock cvs performance regressed from 1.10 to 1.11, and no
one has gotten Ian's patch to work fully with 1.11, so moving to 1.11
from 1.10+Ian's patch would be a big lose.  To be fair, except for one
weekend that Chris Faylor spent on Ian's patch, no one has really put
any effort into making it work with 1.11 fully.)

> I noticed that we have more anoncvs processes running than authenticated
> users; is there any way we could find out which CVS modules these
> anonymous users currently access?  (If it's gcc, we could see whether
> we can simply disable anoncvs access to the gcc module.)

None that I know of.  cvs logging bites.

We do track the # of bytes being sent/received from different hosts
and the frequency with which hosts connect.  The host that downloaded
the most bytes last week by anoncvs?  A purdue.edu site.  Two
redhat.com sites and one suse.de site are also in the top six; an IP#
and a cable modem (rogers.com) round it out.  Not very useful without
some idea what repository or what sorts of operations we're talking
about.

I haven't followed the discussion that led up to this note, so I
apologize if I'm repeating things already said.

Jason
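
For illustration only, here is a minimal sketch of reading the
per-directory fileattr entries quoted in this message, to show why a
trunk update can find head revisions without opening each RCS file.
It is not Ian's patch or any part of cvs.  The function names
parse_fileattr and head_revisions are hypothetical, and the sketch
assumes each entry is the letter F, the file name, whitespace, and
then semicolon-separated name=value pairs, as the sample suggests.

```python
def parse_fileattr(text):
    """Return {filename: {attr: value}} for the 'F' (file) entries.

    Assumed layout, taken from the sample in the message:
        F<filename><whitespace>name=value;name=value...
    File names containing whitespace are not handled by this sketch.
    """
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("F"):
            continue  # skip blank lines and non-file entries
        parts = line[1:].split(None, 1)
        if not parts:
            continue  # a bare "F" with no file name
        name = parts[0]
        attrs = {}
        if len(parts) == 2:
            for pair in parts[1].split(";"):
                key, _, value = pair.partition("=")
                if key:
                    attrs[key] = value
        entries[name] = attrs
    return entries


def head_revisions(entries):
    """Pick out the cached trunk head revision for each file."""
    return {name: a["_head"] for name, a in entries.items() if "_head" in a}


if __name__ == "__main__":
    sample = (
        "F.cvsignore _head=1.9;_expand=\n"
        "F.gdbinit _head=1.6;_attic=;_expand=\n"
        "FChangeLog _head=1.13542;_expand=\n"
    )
    for name, rev in head_revisions(parse_fileattr(sample)).items():
        print(name, rev)
```

One read of this file yields every trunk head revision in the
directory.  A branch has no such entry in the cache as described
above, which is why the server falls back to reading each RCS file.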