From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-145140-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 79072 invoked by alias); 30 Jan 2018 03:56:21 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 79056 invoked by uid 89); 30 Jan 2018 03:56:21 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=ha, letting, strengths, dangers
X-HELO: rock.gnat.com
Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 30 Jan 2018 03:56:19 +0000
Received: from localhost (localhost.localdomain [127.0.0.1])	by filtered-rock.gnat.com (Postfix) with ESMTP id DB17B5601D;	Mon, 29 Jan 2018 22:56:17 -0500 (EST)
Received: from rock.gnat.com ([127.0.0.1])	by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024)	with LMTP id oNgF2dr7vBLn; Mon, 29 Jan 2018 22:56:17 -0500 (EST)
Received: from joel.gnat.com (localhost.localdomain [127.0.0.1])	by rock.gnat.com (Postfix) with ESMTP id 721A8117619;	Mon, 29 Jan 2018 22:56:17 -0500 (EST)
Received: by joel.gnat.com (Postfix, from userid 1000)	id DD1DB83307; Tue, 30 Jan 2018 07:56:12 +0400 (+04)
Date: Tue, 30 Jan 2018 03:56:00 -0000
From: Joel Brobecker <brobecker@adacore.com>
To: Pedro Alves <palves@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [RFC] regresssion(internal-error) printing subprogram argument
Message-ID: <20180130035612.xghiskhiweftijxi@adacore.com>
References: <00320239-44c8-b9c3-013b-b27c771e3401@redhat.com> <07a154ef-b6f5-ad86-1410-a73620de4b5b@redhat.com> <20180103043345.n6blge377ybsdx6q@adacore.com> <8df5cf8b-6e4e-e310-fcbd-2615334fe5b9@redhat.com> <832dbb30-7c2b-40ed-c03c-654bd1e2ea32@redhat.com> <20180117091332.z7bqu4aljudq33sw@adacore.com> <20180126035055.vbjtowj6q5ftbwiz@adacore.com> <21bfbb6a-bb10-812b-c34a-d367321e8d5e@redhat.com> <20180129103841.kdkomcjbuwiat5b4@adacore.com> <250976c6-6e7a-6a8e-b9f2-a57f5b92b965@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <250976c6-6e7a-6a8e-b9f2-a57f5b92b965@redhat.com>
User-Agent: NeoMutt/20170113 (1.7.2)
X-SW-Source: 2018-01/txt/msg00614.txt.bz2

> The other day I noticed that gdb_demangle -> bfd_demangle ->
> cplus_demangle does have support for demangling other languages.
> For Ada, see the GNAT_DEMANGLING handling in
> libiberty.c:cplus-dem.c:cplus_demangle, which takes you to:
> 
>  /* Demangle ada names.  The encoding is documented in gcc/ada/exp_dbug.ads.  */
> 
>  char *
>  ada_demangle (const char *mangled, int option ATTRIBUTE_UNUSED)
>  {

Ha! Thanks for digging; who would have thought...

> When I saw that, I wondered why is it that GDB has a separate
> implementation of Ada decoding in gdb/ada-lang.c.  The only plausible
> explanation I came up with is that gdb's version decodes into a
> buffer that is shared between invocations while libiberty's version
> always heap-allocates the result.  Maybe it was an efficiency thing?
> Do you know?

I don't, unfortunately. I'm pretty sure GDB's ada_decode was already
present before I joined AdaCore...

It is true that startup performance is critical, particularly in Ada,
because one of the strengths of Ada is that it helps design very large
apps.
I think that, eventually, we're going to have to accept that computing
power is gained by adding parallelism, and try to take advantage of that
to speed things up (with the dangers in terms of stability that it
entails), so that aspect might become less relevant once we make that
transition.

> I ran into that around this discussion
> <https://sourceware.org/ml/gdb/2017-11/msg00029.html> where I
> was wondering whether we could speed up demangling by letting
> bfd_demangle / cplus_demangle try all languages and return back
> which one worked:
> 
>  ~~~
>  An idea I had would be to try to combine most the language_sniff_from_mangled_name
>  implementations in one call, by deferring to cplus_demangle which seems
>  like could take care of gnu v3, rust, java, gnat, dlang and old C++ mangling
>  schemes all at once.  This would avoid the overhead of gdb_demangle and
>  bfd_demangle for each language-specific demangling call, at least.  Not sure.
>  ~~~
> 
> Because ada_decode comes up high in profiles today...

Can we avoid the calls to most demanglers entirely by passing
the language to symbol_set_names? I didn't do an extensive search
of all the callers. We could have a fallback where language_unknown
means try all the demanglers, but for the cases we care about,
which is mostly DWARF debugging info, that could save a lot of
unnecessary attempts, like you say.
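
To make the idea concrete, here is a rough sketch of what I have in
mind. Everything in it is hypothetical: toy_lang, toy_set_names and
the two toy demanglers are stand-ins I made up for illustration, not
GDB's real symbol_set_names or the libiberty demanglers. It only shows
the control flow: use the caller's language when we have it, and fall
back to trying every demangler when we don't.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical stand-in for GDB's language enum (assumption).
enum class toy_lang { unknown, c, cplus, ada };

struct toy_result {
  toy_lang language;   // language we settled on
  std::string name;    // demangled name, or the linkage name as-is
  int attempts;        // number of demanglers we had to call
};

// Toy C++ "demangler": only checks for the "_Z" prefix.
static bool toy_cplus_demangle(const std::string &m, std::string &out) {
  if (m.compare(0, 2, "_Z") != 0)
    return false;
  out = "<c++ " + m + ">";
  return true;
}

// Toy Ada decoder: turns an inner "__" into ".", nothing else.
static bool toy_ada_decode(const std::string &m, std::string &out) {
  bool changed = false;
  out.clear();
  for (std::size_t i = 0; i < m.size(); ++i) {
    if (i > 0 && i + 1 < m.size() && m[i] == '_' && m[i + 1] == '_') {
      out += '.';
      ++i;  // skip the second underscore
      changed = true;
    } else {
      out += m[i];
    }
  }
  return changed;
}

static bool toy_demangle(toy_lang l, const std::string &m, std::string &out) {
  switch (l) {
    case toy_lang::cplus: return toy_cplus_demangle(m, out);
    case toy_lang::ada:   return toy_ada_decode(m, out);
    default:              return false;
  }
}

// Hypothetical analogue of symbol_set_names taking a language hint.
toy_result toy_set_names(const std::string &linkage_name, toy_lang hint) {
  std::string out;
  if (hint == toy_lang::c)
    return {toy_lang::c, linkage_name, 0};  // nothing to demangle
  if (hint != toy_lang::unknown) {
    // The caller (e.g. a DWARF reader) told us the language: one call.
    bool ok = toy_demangle(hint, linkage_name, out);
    return {hint, ok ? out : linkage_name, 1};
  }
  // Fallback: no language information, so try every demangler in turn.
  int attempts = 0;
  for (toy_lang l : {toy_lang::cplus, toy_lang::ada}) {
    ++attempts;
    if (toy_demangle(l, linkage_name, out))
      return {l, out, attempts};
  }
  return {toy_lang::unknown, linkage_name, attempts};
}
```

With this, a C symbol such as "my__helper" is left alone when the
caller says it is C, whereas the unknown-language fallback ends up
calling both toy demanglers and wrongly tags it as Ada.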

My feeling on the global situation with those demanglers is that
we take a risk each time we use the demanglers themselves to
try to determine which language the symbol belongs to. Both
the C++ and the Ada mangling/encoding algorithms are not exclusive
enough, and we're seeing in this thread how easy it is to get
it wrong. For minimal symbols, we don't have the information,
so that's one valid situation where we have no choice but to guess.
But when reading DWARF debugging information, for instance, we do
have the language information, so we should take advantage of it.

Thinking out loud, this leads me to wonder whether it's a good idea
to store the demangled name instead of the linkage name. If we stored
the linkage name, you'd do the search on the mangled name rather than
on the demangled one, so you would not have to guess when loading the
symbols. Maybe storing the demangled name is still a good idea if it
helps performance, and perhaps we can plug that weakness differently...
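
For what it's worth, the alternative of searching on the mangled name
could look roughly like the sketch below. It is purely illustrative:
toy_symtab and toy_ada_encode are names I invented, and the encoding is
a drastic simplification of the real GNAT one (lowercase, "." becomes
"__"). The point is only that the table is keyed by linkage name, so
nothing gets demangled while loading symbols; the user's input is
encoded at lookup time instead.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical symbol record that keeps only the linkage name.
struct toy_symbol {
  std::string linkage_name;
  int address;
};

// Simplified Ada encoding (assumption): lowercase, "." -> "__".
static std::string toy_ada_encode(const std::string &natural) {
  std::string out;
  for (char ch : natural) {
    if (ch == '.')
      out += "__";
    else
      out += static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
  }
  return out;
}

// Hypothetical symbol table keyed by linkage (mangled) name.
class toy_symtab {
public:
  // Loading a symbol does no demangling at all.
  void add(const std::string &linkage_name, int address) {
    by_linkage_name_[linkage_name] = toy_symbol{linkage_name, address};
  }

  // Lookup encodes the user's name and searches on the mangled form.
  const toy_symbol *lookup_ada(const std::string &natural) const {
    auto it = by_linkage_name_.find(toy_ada_encode(natural));
    return it == by_linkage_name_.end() ? nullptr : &it->second;
  }

private:
  std::unordered_map<std::string, toy_symbol> by_linkage_name_;
};
```

The cost moves from symbol loading to lookup, and it assumes we can
always encode what the user typed, which is not obviously true for
every language.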

-- 
Joel