From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH] C++-ify parse_format_string
To: Simon Marchi <simon.marchi@polymtl.ca>, gdb-patches@sourceware.org
References: <87h8tivyo4.fsf@tromey.com> <20171202203133.10770-1-simon.marchi@polymtl.ca>
Cc: Tom Tromey <tom@tromey.com>, Simon Marchi <simon.marchi@ericsson.com>
From: Pedro Alves <palves@redhat.com>
Message-ID: <315d390f-5b5d-f077-e730-ae4535729557@redhat.com>
Date: Sun, 03 Dec 2017 14:12:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <20171202203133.10770-1-simon.marchi@polymtl.ca>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-SW-Source: 2017-12/txt/msg00056.txt.bz2

On 12/02/2017 08:31 PM, Simon Marchi wrote:
> From: Simon Marchi <simon.marchi@ericsson.com>
> 
> I finally got around to polish this patch.  Let me know what you think.
> 
> This patch C++ifies the parse_format_string function and its return
> value, allowing to get rid of free_format_pieces_cleanup.  Currently,
> format_piece::string points into a big buffer shared by all format
> pieces resulting of the format string parse.  To make the code and
> allocation scheme simpler to understand, this patch makes each piece own
> its string with an std::string.
> 
> I've tried to measure the impact of this change on the time taken by
> parse_format_string in an optimized build.  I ran
> 
>   $ perf record -g -- ./gdb -nx -batch -x test.gdb

I like using:

 $ perf record --call-graph dwarf,65528 -- ....

so that perf report has more accurate callchain info.

Otherwise, compile with -fno-omit-frame-pointer (but then you're no
longer testing what you normally build).

> 
> with test.gdb:
> 
>   set $a = 0
>   while $a < 20000
>     printf "%daaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa<16 times>\n", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
>     set $a = $a + 1
>   end
> 
> The purpose of the long string of a's is to have some format pieces for
> which the substring doesn't fit in the std::string local buffer.  The
> command "perf report" shows this for current master:
> 
>   1.81%     1.81%  gdb  gdb  [.] parse_format_string
> 
> and this with the patch:
> 
>   1.69%     1.69%  gdb  gdb  [.] parse_format_string
> 
> So it doesn't seem like it makes much of a difference.  If you have tips
> on how to use perf to get more precise measurements, I would be glad to
> hear them.

Use plain "time" first.  (You may have shifted work
out of parse_format_string.)
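
To illustrate how work can move out of a function's own profile entry,
here is a rough sketch with hypothetical stand-ins (split_shared,
split_owned and shared_pieces are made up for this example; they are
not the gdb code or the patch).  The first variant does one allocation
for all the pieces; the second makes each piece own a std::string, so
longer pieces each cost a heap allocation whose time is attributed to
the allocator's symbols rather than to the caller's self time:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for "all pieces share one big buffer".
struct shared_pieces
{
  std::unique_ptr<char[]> buffer;     // single allocation for all text
  std::vector<const char *> pieces;   // each points into BUFFER
};

// Split TEXT at SEP.  Every substring lives in one shared buffer.
shared_pieces
split_shared (const std::string &text, char sep)
{
  shared_pieces result;
  result.buffer.reset (new char[text.size () + 1]);
  char *out = result.buffer.get ();
  const char *start = out;

  for (char c : text)
    {
      if (c == sep)
        {
          // Terminate the current piece in place and start the next.
          *out++ = '\0';
          result.pieces.push_back (start);
          start = out;
        }
      else
        *out++ = c;
    }

  // Exactly one byte was written per input character.
  assert (out == result.buffer.get () + text.size ());
  *out = '\0';
  result.pieces.push_back (start);
  return result;
}

// Split TEXT at SEP.  Every piece owns its own std::string, so pieces
// too long for the small-string buffer allocate separately.
std::vector<std::string>
split_owned (const std::string &text, char sep)
{
  std::vector<std::string> pieces;
  std::size_t start = 0;

  for (;;)
    {
      std::size_t end = text.find (sep, start);
      if (end == std::string::npos)
        {
          pieces.emplace_back (text, start);
          break;
        }
      pieces.emplace_back (text, start, end - start);
      start = end + 1;
    }
  return pieces;
}
```

Both variants produce the same pieces; only where the allocation cost
lands differs, which is why wall-clock time is the safer first check.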

With your test script as is, I get around:

 real    0m1.053s
 user    0m0.985s
 sys     0m0.068s

I like a testcase that runs a bit longer in order to have a
better signal/noise ratio.  IMO 1 second is too short.  Bumping the
number of iterations 10x (to 200000), I get the following
(representative of a few runs):

before patch:

 real    0m9.432s
 user    0m8.978s
 sys     0m0.444s

after patch:

 real    0m10.322s
 user    0m9.854s
 sys     0m0.454s

There's some variation across runs, but after-patch is consistently
around 7%-11% slower than before-patch, for me.
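
For reference, the arithmetic behind that percentage for the one pair
of runs quoted above, as a small sketch (the helper names are made up
for this example):

```cpp
#include <cassert>
#include <cstdio>

// Percentage by which AFTER exceeds BEFORE (both in seconds).
double
relative_slowdown_percent (double before, double after)
{
  assert (before > 0.0);
  return (after - before) / before * 100.0;
}

// Print the slowdown for the "real" and "user" timings quoted above.
void
print_quoted_slowdowns ()
{
  std::printf ("real: %.1f%%\n", relative_slowdown_percent (9.432, 10.322));
  std::printf ("user: %.1f%%\n", relative_slowdown_percent (8.978, 9.854));
}
```

For these two runs that works out to roughly 9.4% (real) and 9.8%
(user), which sits inside the 7%-11% range seen across runs.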

Looking at "perf report -g", we see we're very inefficient
while handling this specific use case, spending a lot of time
in the C parser and in fprintf_filtered.  If those paths
were tuned a bit, the relative slowdown of parse_format_string
would probably be more pronounced.

Note that parse_format_string is used to implement target-side
dprintf, in gdbserver/ax.c:ax_printf.  This means that a slowdown
here affects inferior execution time too, which may be
a less-contrived scenario.

My point is still the same as ever -- it's not possible
upfront to envision all the use cases users throw at us,
given gdb scripting, etc.  So if easy, I think it's better
to avoid premature pessimization:

 https://sourceware.org/ml/gdb-patches/2017-06/msg00506.html

(Note: I have not really looked at the patch in any detail.)

Thanks,
Pedro Alves