From: Doug Evans
To: Stan Shebs
Cc: gdb-patches
Date: Wed, 15 May 2013 17:00:00 -0000
Subject: Re: Benchmarking (was Re: [patch 2/2] Assert leftover cleanups in TRY_CATCH)
In-Reply-To: <5192D33A.3060702@earthlink.net>
References: <20130507140020.GA10070@host2.jankratochvil.net> <5192D33A.3060702@earthlink.net>
X-SW-Source: 2013-05/txt/msg00548.txt.bz2

On Tue, May 14, 2013 at 5:13 PM, Stan Shebs wrote:
> On 5/7/13 7:00 AM, Jan Kratochvil wrote:
>>
>> target-side condition evaluation is a good idea:
>>
>> time gdb ./loop -ex 'b 4 if i==360000' -ex r -q -ex 'set confirm no' -ex q
>> real 1m11.586s
>>
>> gdbserver :1234 ./loop
>> time gdb ./loop -ex 'target remote localhost:1234' -ex 'b 4 if i==360000' -ex c -q -ex 'set confirm no' -ex q
>> real 0m21.862s
>>
>> "set breakpoint condition-evaluation target" really helps a lot.
>
> This reminds me of something that has been on my mind recently -
> detecting performance regression with the testsuite.

'tis on my todo list.  Got a time machine?  IIRC Red Hat had the seeds
of something, but it needed more work.

> I added a test for fast tracepoints a while back (tspeed.exp) that also
> went to some trouble to get numbers for fast tracepoint performance,
> although it just reports them; they are not used to pass/fail.
>
> However, if target-side conditionals get worse due to some random
> change, or GDB startup time gets excessive, these are things that we
> know real users care about.  On the other hand, this is hard to test
> automatically, and no one wants to hack DejaGnu that much.  Maybe an
> excuse to dabble in a more-modern testing framework?  Are there good
> options?

Re: DejaGnu hacking: it depends on what's needed.

Sometimes regressions aren't really noticed unless one is debugging a
really large app.  A one-second slowdown might be attributable to many
things, but in a bigger app it could be minutes, and now we're talking
real money.  It's trivial enough to write a program to generate apps
(of whatever size, and along whatever axis is useful) for the task at
hand.
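[Editor's sketch: the "generate apps of whatever size" idea above could
look something like the following.  This is purely illustrative and not
from the GDB tree; the file names, loop shape, and breakpoint line are
invented, and actually timing it would require gcc and gdb on PATH.]

```python
#!/usr/bin/env python3
"""Hypothetical generator for GDB benchmark apps, mimicking ./loop."""


def generate_app(n_funcs, n_iters):
    """Return C source whose symbol count scales with n_funcs and whose
    runtime scales with n_iters, so both axes can be varied."""
    lines = ["#include <stdio.h>"]
    for i in range(n_funcs):
        lines.append("int f%d(int x) { return x + %d; }" % (i, i))
    lines += ["int main(void) {",
              "  int i, acc = 0;",
              "  for (i = 0; i < %d; i++)" % n_iters,
              "    acc += i;",  # the conditional breakpoint lands here
              '  printf("%d\\n", acc);',
              "  return 0;",
              "}"]
    return "\n".join(lines) + "\n"


def breakpoint_line(n_funcs):
    """1-based source line of the loop body in generate_app's output."""
    return n_funcs + 5


def gdb_benchmark_cmd(binary, source, n_funcs, stop_iter):
    """Batch-mode gdb invocation mirroring the measurement quoted above."""
    return ["gdb", "-batch", "-q", binary,
            "-ex", "b %s:%d if i==%d"
                   % (source, breakpoint_line(n_funcs), stop_iter),
            "-ex", "run"]
```

A harness would write the generated source out, compile it with -g,
and time the command list under both condition-evaluation settings.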
And it's trivial enough to come up with a set of benchmarks (I have an
incomplete set I use for my standard large-app benchmark), and a
harness to run them.  IWBN if running the tests didn't take a lot of
time, but alas some things only show up at scale.  Plus one needs to
run them a sufficient number of times to make the data usable.
Running a benchmark with different-sized tests and comparing the
relative times can help.

[One thing one could do is, e.g., run gdb under valgrind and use
instruction counts as a proxy for performance.  [One also needs to
measure memory usage.]  It has the property of being deterministic,
and with a set of testcases for each benchmark it could reveal
problems.  One can't do just this, because it doesn't measure, e.g.,
disk/network latency, which can be critical.  I'm sure one could write
a tool to approximate the times.  Going this route is slower, of
course.]

Ultimately I'd expect this to be separate from "make check" (or at
least something one has to ask for explicitly).  But we *do* need
something, and the sooner the better.
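[Editor's sketch: the valgrind idea above, minimally.  This assumes
valgrind/callgrind is available when actually run, and that the output
file ends with a "totals:" (or "summary:") event-count line whose
first number is Ir; the function names and the 5% tolerance are
invented for illustration.]

```python
#!/usr/bin/env python3
"""Instruction counts as a deterministic proxy for benchmark time."""
import re
import subprocess


def run_under_callgrind(cmd, outfile):
    """Run cmd under callgrind, writing counts to outfile."""
    subprocess.run(["valgrind", "--tool=callgrind",
                    "--callgrind-out-file=" + outfile] + cmd, check=True)


def instruction_count(callgrind_out_text):
    """Extract the total Ir count from callgrind output text."""
    m = re.search(r"^(?:totals|summary):\s+(\d+)",
                  callgrind_out_text, re.M)
    if not m:
        raise ValueError("no totals line found in callgrind output")
    return int(m.group(1))


def regressed(baseline_ir, current_ir, tolerance=0.05):
    """Flag a regression when the count grows by more than tolerance."""
    return current_ir > baseline_ir * (1 + tolerance)
```

Comparing counts across two gdb builds sidesteps timer noise, but, as
noted above, it is blind to disk/network latency, so it complements
rather than replaces wall-clock runs.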