From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-75939-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 14736 invoked by alias); 6 Oct 2010 23:59:55 -0000
Received: (qmail 14727 invoked by uid 22791); 6 Oct 2010 23:59:54 -0000
X-SWARE-Spam-Status: No, hits=-1.9 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,SPF_HELO_PASS,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.44.51)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 06 Oct 2010 23:59:49 +0000
Received: from hpaq12.eem.corp.google.com (hpaq12.eem.corp.google.com [172.25.149.12])	by smtp-out.google.com with ESMTP id o96NxlS8002007	for <gdb-patches@sourceware.org>; Wed, 6 Oct 2010 16:59:47 -0700
Received: from vws3 (vws3.prod.google.com [10.241.21.131])	by hpaq12.eem.corp.google.com with ESMTP id o96NximW018657	for <gdb-patches@sourceware.org>; Wed, 6 Oct 2010 16:59:46 -0700
Received: by vws3 with SMTP id 3so102165vws.19        for <gdb-patches@sourceware.org>; Wed, 06 Oct 2010 16:59:45 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.220.188.133 with SMTP id da5mr23505vcb.21.1286409585573; Wed, 06 Oct 2010 16:59:45 -0700 (PDT)
Received: by 10.220.83.79 with HTTP; Wed, 6 Oct 2010 16:59:45 -0700 (PDT)
In-Reply-To: <201010050820.o958Kf42002588@syracuse.mckusick.com>
References: <201010050820.o958Kf42002588@syracuse.mckusick.com>
Date: Wed, 06 Oct 2010 23:59:00 -0000
Message-ID: <AANLkTinak7-mHU=tvqUKnJwubvMHnwQmtp5uU+14QUVc@mail.gmail.com>
Subject: Re: [RFA] Extend hashed symbol dictionaries to work with Ada
From: Doug Evans <dje@google.com>
To: Hilfinger@adacore.com, Tom Tromey <tromey@redhat.com>,        Joel Brobecker <brobecker@adacore.com>
Cc: gdb-patches@sourceware.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-System-Of-Record: true
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2010-10/txt/msg00104.txt.bz2

On Tue, Oct 5, 2010 at 1:20 AM, Paul Hilfinger
<hilfingr@syracuse.mckusick.com> wrote:
>
> This patch allows Ada to speed up symbol lookup by using the facilities
> in dictionary.[ch] for hashed lookups. =A0First, we generalize dictionary
> search to allow clients to specify any matching function compatible with
> the hashing function. Next, we modify the hashing algorithm so that symbo=
ls
> that wild-match a name hash to the same value. =A0Finally, we modify Ada
> symbol lookup to use these facilities.
>
> Because this patch touches on a hashing algorithm used by other
> languages, I took the precaution of doing a speed test on a list of
> about 12000 identifiers (repeatedly inserting all of them into a table
> and then doing a lookup on a million names at random, thus testing the
> speed of the hashing algorithm and how well it distributed names).
> There was actually a slight speedup, probably as a result of open-
> coding some of the tests in msymbol_hash_iw. =A0By design, the revised
> hashing algorithm produces the same results as the original on most
> "normal" C identifiers.
>
> We considered augmenting the dictionary interface still further by allowi=
ng
> different hashing algorithms for different dictionaries, based on the
> (supposed) language of the symbols in that dictionary. =A0While this prod=
uced
> better isolation of the changes to Ada programs, the additional flexibili=
ty
> also complicated the dictionary interface. =A0I'd prefer to keep things
> simple for now.
>
>[...]

Hi.  I wouldn't mind having a couple of comments added to this function:

>
> +static unsigned int
> +dict_hash (const char *string)
> +{
> + =A0unsigned int hash;
> + =A0int c;
> +
> + =A0if (*string =3D=3D '_' && strncmp (string, "_ada_", 5) =3D=3D 0)
> + =A0 =A0string +=3D 5;
> +
> + =A0hash =3D 0;
> + =A0while (*string)
> + =A0 =A0{
> + =A0 =A0 =A0switch (*string)
> + =A0 =A0 =A0 {
> + =A0 =A0 =A0 case '$': case '.': case 'X': case '(':

Why is 'X' special cased?
[Actually, I'd have the comment explain all of these special cases.]

> + =A0 =A0 =A0 =A0 return hash;
> + =A0 =A0 =A0 case ' ':
> + =A0 =A0 =A0 =A0 string +=3D 1;
> + =A0 =A0 =A0 =A0 break;
> + =A0 =A0 =A0 case '_':
> + =A0 =A0 =A0 =A0 if (string[1] =3D=3D '_')
> + =A0 =A0 =A0 =A0 =A0 {
> + =A0 =A0 =A0 =A0 =A0 =A0 if (((c =3D string[2]) < 'a' || c > 'z') && c !=
=3D 'O')

Why does this `if' exist?

> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 return hash;
> + =A0 =A0 =A0 =A0 =A0 =A0 hash =3D 0;

Why do we restart calculating the hash here?

> + =A0 =A0 =A0 =A0 =A0 =A0 string +=3D 2;
> + =A0 =A0 =A0 =A0 =A0 =A0 break;
> + =A0 =A0 =A0 =A0 =A0 }
> + =A0 =A0 =A0 =A0 /* FALL THROUGH */
> + =A0 =A0 =A0 default:
> + =A0 =A0 =A0 =A0 hash =3D hash * 67 + *string - 113;
> + =A0 =A0 =A0 =A0 string +=3D 1;
> + =A0 =A0 =A0 =A0 break;
> + =A0 =A0 =A0 }
> + =A0 =A0}
> + =A0return hash;
> +}
> +