From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14736 invoked by alias); 6 Oct 2010 23:59:55 -0000 Received: (qmail 14727 invoked by uid 22791); 6 Oct 2010 23:59:54 -0000 X-SWARE-Spam-Status: No, hits=-1.9 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,SPF_HELO_PASS,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.44.51) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 06 Oct 2010 23:59:49 +0000 Received: from hpaq12.eem.corp.google.com (hpaq12.eem.corp.google.com [172.25.149.12]) by smtp-out.google.com with ESMTP id o96NxlS8002007 for ; Wed, 6 Oct 2010 16:59:47 -0700 Received: from vws3 (vws3.prod.google.com [10.241.21.131]) by hpaq12.eem.corp.google.com with ESMTP id o96NximW018657 for ; Wed, 6 Oct 2010 16:59:46 -0700 Received: by vws3 with SMTP id 3so102165vws.19 for ; Wed, 06 Oct 2010 16:59:45 -0700 (PDT) MIME-Version: 1.0 Received: by 10.220.188.133 with SMTP id da5mr23505vcb.21.1286409585573; Wed, 06 Oct 2010 16:59:45 -0700 (PDT) Received: by 10.220.83.79 with HTTP; Wed, 6 Oct 2010 16:59:45 -0700 (PDT) In-Reply-To: <201010050820.o958Kf42002588@syracuse.mckusick.com> References: <201010050820.o958Kf42002588@syracuse.mckusick.com> Date: Wed, 06 Oct 2010 23:59:00 -0000 Message-ID: Subject: Re: [RFA] Extend hashed symbol dictionaries to work with Ada From: Doug Evans To: Hilfinger@adacore.com, Tom Tromey , Joel Brobecker Cc: gdb-patches@sourceware.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-System-Of-Record: true X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2010-10/txt/msg00104.txt.bz2 On Tue, Oct 5, 2010 at 1:20 AM, Paul Hilfinger wrote: > > This patch allows Ada to speed up symbol lookup by using the facilities > in dictionary.[ch] for hashed lookups. =A0First, we generalize dictionary > search to allow clients to specify any matching function compatible with > the hashing function. Next, we modify the hashing algorithm so that symbo= ls > that wild-match a name hash to the same value. =A0Finally, we modify Ada > symbol lookup to use these facilities. > > Because this patch touches on a hashing algorithm used by other > languages, I took the precaution of doing a speed test on a list of > about 12000 identifiers (repeatedly inserting all of them into a table > and then doing a lookup on a million names at random, thus testing the > speed of the hashing algorithm and how well it distributed names). > There was actually a slight speedup, probably as a result of open- > coding some of the tests in msymbol_hash_iw. =A0By design, the revised > hashing algorithm produces the same results as the original on most > "normal" C identifiers. > > We considered augmenting the dictionary interface still further by allowi= ng > different hashing algorithms for different dictionaries, based on the > (supposed) language of the symbols in that dictionary. =A0While this prod= uced > better isolation of the changes to Ada programs, the additional flexibili= ty > also complicated the dictionary interface. =A0I'd prefer to keep things > simple for now. > >[...] Hi. I wouldn't mind having a couple of comments added to this function: > > +static unsigned int > +dict_hash (const char *string) > +{ > + =A0unsigned int hash; > + =A0int c; > + > + =A0if (*string =3D=3D '_' && strncmp (string, "_ada_", 5) =3D=3D 0) > + =A0 =A0string +=3D 5; > + > + =A0hash =3D 0; > + =A0while (*string) > + =A0 =A0{ > + =A0 =A0 =A0switch (*string) > + =A0 =A0 =A0 { > + =A0 =A0 =A0 case '$': case '.': case 'X': case '(': Why is 'X' special cased? [Actually, I'd have the comment explain all of these special cases.] > + =A0 =A0 =A0 =A0 return hash; > + =A0 =A0 =A0 case ' ': > + =A0 =A0 =A0 =A0 string +=3D 1; > + =A0 =A0 =A0 =A0 break; > + =A0 =A0 =A0 case '_': > + =A0 =A0 =A0 =A0 if (string[1] =3D=3D '_') > + =A0 =A0 =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 =A0 =A0 if (((c =3D string[2]) < 'a' || c > 'z') && c != =3D 'O') Why does this `if' exist? > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return hash; > + =A0 =A0 =A0 =A0 =A0 =A0 hash =3D 0; Why do we restart calculating the hash here? > + =A0 =A0 =A0 =A0 =A0 =A0 string +=3D 2; > + =A0 =A0 =A0 =A0 =A0 =A0 break; > + =A0 =A0 =A0 =A0 =A0 } > + =A0 =A0 =A0 =A0 /* FALL THROUGH */ > + =A0 =A0 =A0 default: > + =A0 =A0 =A0 =A0 hash =3D hash * 67 + *string - 113; > + =A0 =A0 =A0 =A0 string +=3D 1; > + =A0 =A0 =A0 =A0 break; > + =A0 =A0 =A0 } > + =A0 =A0} > + =A0return hash; > +} > +