From: Jason Molenda <jason-swarelist@molenda.com>
To: Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at>
Cc: overseers@gcc.gnu.org, Zack Weinberg <zack@codesourcery.com>,
	"Kaveh R. Ghazi" <ghazi@caip.rutgers.edu>,
	gcc@gcc.gnu.org, gdb@sources.redhat.com, jimb@redhat.com,
	rth@redhat.com
Subject: Re: gcc development schedule [Re: sharing libcpp between GDB and GCC]
Date: Thu, 28 Mar 2002 01:53:00 -0000
Message-ID: <20020328015346.A27639@molenda.com>
In-Reply-To: <Pine.BSF.4.44.0203281016560.83412-100000@pulcherrima.dbai.tuwien.ac.at>; from pfeifer@dbai.tuwien.ac.at on Thu, Mar 28, 2002 at 10:22:49AM +0100

On Thu, Mar 28, 2002 at 10:22:49AM +0100, Gerald Pfeifer wrote:

> On Wed, 27 Mar 2002, Zack Weinberg wrote:
> >>> (E.g. it takes about 10x longer to do "cvs update" on the 3.0
> >>> branch than the trunk.)
> >> Yeah, what's up with that?  (I thought it was just me.)

The cvs server on sourceware uses an optimization (a patch written
by Ian Lance Taylor) to cache the information needed for a cvs
update in a single file per directory.  These are the CVS/fileattr
files--you know, the ones that get out of date once every few months
and need to be blown away.  One of them in the gcc dir looks like

F.cvsignore     _head=1.9;_expand=
F.gdbinit       _head=1.6;_attic=;_expand=
FABOUT-GCC-NLS  _head=1.5;_expand=
FABOUT-NLS      _head=1.3;_expand=
FCOPYING        _head=1.4;_expand=
FCOPYING.LIB    _head=1.4;_expand=
FChangeLog      _head=1.13542;_expand=
FChangeLog.0    _head=1.12;_expand=
FChangeLog.1    _head=1.6;_expand=

So the cvs server doesn't have to open every RCS file in the
directory to find its head revision.  A do-nothing cvs update goes
much, much faster.
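
To make the shape of that cache concrete, here is a minimal Python
sketch (not Ian's actual code) that reads entries like the ones above
and pulls out each file's head revision.  It assumes the name and the
attribute list are separated by whitespace and that file names contain
no whitespace; the function names are made up for illustration.

```python
def parse_fileattr_line(line):
    """Split one cache entry into (filename, {attribute: value})."""
    if not line.startswith("F"):
        raise ValueError("expected a file entry starting with 'F'")
    # Assumption: whitespace separates the name from the attribute list.
    name, attr_text = line[1:].split(None, 1)
    attrs = {}
    for field in attr_text.strip().split(";"):
        key, _, value = field.partition("=")
        attrs[key] = value
    return name, attrs


def head_revisions(lines):
    """Map each file name to its cached _head value, one pass, no RCS reads."""
    heads = {}
    for line in lines:
        name, attrs = parse_fileattr_line(line)
        heads[name] = attrs.get("_head")
    return heads


sample = [
    "F.cvsignore     _head=1.9;_expand=",
    "F.gdbinit       _head=1.6;_attic=;_expand=",
    "FChangeLog      _head=1.13542;_expand=",
]
print(head_revisions(sample))
```

The point is that one small file answers the "what is the head
revision?" question for the whole directory.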

For the cvs server to update on a branch, it has to read each RCS
file to find the latest version on that branch.  Even worse, it
looks like it'll need to read several blocks into each file for
frequently modified files.  CVS shouldn't have to compute any diffs
for most 'cvs update' operations - chances are that only a handful
of files were modified on the branch so only a few of these expensive
diffs are computed.  It's the get-the-head-revision operation that
is killing it.
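
As a rough illustration of the extra work, the Python sketch below
picks the newest revision on a branch from a list of revision numbers.
It assumes the revision numbers have already been read out of the RCS
file, which is the expensive part on the real server, and that the
branch is given by its plain number (e.g. 1.2.2) rather than a tag
name.  The helper names are hypothetical.

```python
def parse_rev(rev):
    """Turn '1.2.2.10' into (1, 2, 2, 10) so revisions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


def latest_on_branch(revisions, branch):
    """Return the newest revision on `branch`, or None if it has none."""
    prefix = parse_rev(branch)
    on_branch = [parse_rev(r) for r in revisions
                 if parse_rev(r)[:-1] == prefix]
    if not on_branch:
        # Nothing was committed on the branch for this file.
        return None
    return ".".join(str(n) for n in max(on_branch))


revs = ["1.1", "1.2", "1.2.2.1", "1.2.2.9", "1.2.2.10", "1.3"]
print(latest_on_branch(revs, "1.2.2"))
```

For the trunk the answer comes straight from the cache file; for a
branch something like this has to happen per file, after reading it.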


I did make a neato little mechanism for Apple's benefit a few months
back.  It logs every cvs commit as it happens to a plain text file,
and provides a web interface to get the list of files that were
modified since the last time the client contacted the server.  (The
scripts on users' systems would just use wget with a pre-set URL
to get this list of files to update.)  It was a neato little thing,
but for trunk updates I found that doing a cvs update for the gcc
repository consistently took less than two minutes, so it wasn't
really all that useful.  I could revisit it and finish it up if
people would use it - it would work particularly well for things
like a branch where a global cvs update becomes more expensive.
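
The idea can be sketched in a few lines of Python.  This is not the
real scripts; the log format (a timestamp, a tab, then a path) and the
function names are assumptions made for illustration.

```python
def record_commit(log, timestamp, paths):
    """Append one log line per committed file (hypothetical format)."""
    for path in paths:
        log.append(f"{timestamp}\t{path}")


def files_modified_since(log, since):
    """Return the sorted, de-duplicated paths committed after `since`."""
    changed = set()
    for entry in log:
        stamp, _, path = entry.partition("\t")
        if int(stamp) > since:
            changed.add(path)
    return sorted(changed)


log = []
record_commit(log, 100, ["gcc/ChangeLog", "gcc/cpplib.c"])
record_commit(log, 200, ["gcc/ChangeLog"])
print(files_modified_since(log, 150))
```

A client would then run cvs update on just the returned files instead
of walking the whole tree.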

> > That's the problem I know about; there may be others.
> 
> Overseers, is there anything we can do?

We could always disable the trunk cvs update optimization.  The
cvs updates on branches would still be slower than trunk cvs updates,
but the difference wouldn't be so stark.

OK, kidding aside, Ian's patch could be modified to keep the head
revisions of branches in the fileattr cache file in addition to
the trunk.  But someone would have to do that.  Goodness knows I
don't have the time.

The sourceware system is under a bit of memory pressure these days,
which doesn't help; if these blocks could be cached for a longer
period of time, this RCS file reading/parsing would go faster.

(Incidentally, Ian's patch is one of the reasons we're still running
cvs 1.10.  Stock cvs performance regressed between 1.10 and 1.11,
and no one has gotten Ian's patch to work fully with 1.11, so moving
to 1.11 from 1.10+Ian's patch would be a big lose.  To be fair, except
for one weekend that Chris Faylor spent on Ian's patch, no one has
really put any effort into making it work fully with 1.11.)

> I noticed that we have more anoncvs processes running than authenticated
> users; is there any way we could find out which CVS modules these
> anonymous users currently access? (If it's gcc, we could see whether
> we can simply disable anoncvs access to the gcc module.)

None that I know of.  cvs logging bites.  We do track the # of
bytes being sent/received from different hosts and the frequency
with which hosts connect.  The host that downloaded the most bytes
last week by anoncvs?  A purdue.edu site.  Two redhat.com sites and
one suse.de site are also in the top six; an IP# and a cable modem
(rogers.com) round it out.  Not very useful without some
idea what repository or what sorts of operations we're talking
about.


I haven't followed the discussion that led up to this note, so I
apologize if I'm repeating things already said.

Jason

