From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 20891 invoked by alias); 24 Sep 2002 01:39:40 -0000 Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-owner@sources.redhat.com Received: (qmail 20875 invoked from network); 24 Sep 2002 01:39:39 -0000 Received: from unknown (HELO crack.them.org) (65.125.64.184) by sources.redhat.com with SMTP; 24 Sep 2002 01:39:39 -0000 Received: from nevyn.them.org ([66.93.61.169] ident=mail) by crack.them.org with asmtp (Exim 3.12 #1 (Debian)) id 17tfOo-0001tF-00; Mon, 23 Sep 2002 21:26:06 -0500 Received: from drow by nevyn.them.org with local (Exim 3.35 #1 (Debian)) id 17teSw-0002uo-00; Mon, 23 Sep 2002 21:26:18 -0400 Date: Mon, 23 Sep 2002 18:39:00 -0000 From: Daniel Jacobowitz To: Daniel Berlin Cc: David Carlton , Jim Blandy , gdb@sources.redhat.com Subject: Re: suggestion for dictionary representation Message-ID: <20020924012618.GA11065@nevyn.them.org> Mail-Followup-To: Daniel Berlin , David Carlton , Jim Blandy , gdb@sources.redhat.com References: <6FDBFE18-CF55-11D6-BA45-000393575BCC@dberlin.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6FDBFE18-CF55-11D6-BA45-000393575BCC@dberlin.org> User-Agent: Mutt/1.5.1i X-SW-Source: 2002-09/txt/msg00359.txt.bz2 On Mon, Sep 23, 2002 at 08:34:50PM -0400, Daniel Berlin wrote: > >I'm also curious about how it would affect the speed of reading in > >symbols. Right now, that should be O(n), where n is the number of > >global symbols, right? > > > If we used expandable hash tables, then I > >think it would be amortized O(n) and with the constant factor larger. > > Nope. > Our string hash function is O(N) right now (as are most). Hash tables > are only O(N) when the hash function is O(1). > > Now, if you have it only hash the first x characters, you can make your > hash table O(N) again, with the x as the constant. Of course, if only > hashing the first x characters causes tons of hash conflicts, it's not > going to make your hash table very fast. Wait a second... aren't you switching N's on us? N is the number of global symbols. We're talking about the total time of adding all symbols to the table. Hashing a string is "effectively" constant time, because all string lengths are "small". > >(But, I think, not larger in a way that would make a difference.) I'm > >curious about how often the "amortized" bit would lead to strange > >hiccups, but I don't think that's a big deal. > > > >But for skip lists, wouldn't it be something like O(n log n)? If so, > >that's an issue we have to consider. > > Put it in perspective. > for 1 billion symbols, n is 29.89. > for 1 million symbols, n is 19.93. You mean "log n", of course. -- Daniel Jacobowitz MontaVista Software Debian GNU/Linux Developer