From: Yao Qi
To: gdb-patches@sourceware.org
Date: Wed, 14 Aug 2013 13:01:00 -0000
Subject: [RFC] GDB performance testing infrastructure

Hi,
Here is a proposal for a GDB performance testing infrastructure.  We'd
like to know what people think about it, especially:

  1) What performance issues can this infrastructure test or handle?
  2) What does this infrastructure look like?  What can it do, and
     what can't it do?

I've written some micro-benchmarks and run them in a prototype of this
infrastructure.  The results look reasonable and interesting.

Table of Contents
_________________

1 Motivation and Goals
.. 1.1 Goals
2 Related work
3 Design
.. 3.1 Requirements
.. 3.2 Design
4 Example
.. 4.1 single step
.. 4.2 shared library


1 Motivation and Goals
======================

The GDB development process has no standard mechanism to show whether
the performance of a GDB snapshot or release has improved or worsened.
We run regression tests, but they address only questions of
functionality, and performance regressions do show up periodically.
We really need performance testing in GDB development, especially in
the following areas, to make sure no performance regression is
introduced during development.

* Remote debugging.
  Reading from a remote target is slow, and worse, GDB reads the same
  memory regions multiple times, or reads consecutive memory with
  multiple packets.

* Symbols.
  Some of the performance problems in GDB are related to symbols.
  When GDB is used to debug large real-life programs, such as
  LibreOffice, which has a huge number of symbols, it is a challenge
  for GDB to organize them in an efficient way.  Some such bugs are
  reported in bugzilla, for example [PR15412] and [PR14125].  The
  issues are documented on the [wiki].

* Shared libraries.
  When a program uses a large number of shared libraries, GDB is
  slow.  Gary improved the performance in this area, but there is
  still an open bug on scalability ([PR15590]).

* Tracepoints.
  Tracepoints are designed to collect data in the inferior
  efficiently, so we need performance tests to guarantee that
  tracepoints stay efficient enough.  Note that we have a test,
  `gdb.trace/tspeed.exp', but there is still room for improvement.

[PR15412] http://sourceware.org/bugzilla/show_bug.cgi?id=15412
[PR14125] http://sourceware.org/bugzilla/show_bug.cgi?id=14125
[wiki] http://sourceware.org/gdb/wiki/SymbolHandling
[PR15590] http://sourceware.org/bugzilla/show_bug.cgi?id=15590


1.1 Goals
~~~~~~~~~

The goals of this project are:

1. Collect performance data for GDB in various areas under the
   different supported configurations.  These areas include single
   stepping, thread-specific breakpoints, stack backtraces, symbol
   lookup, shared library load/unload, etc.  Configurations include
   native debugging and remote debugging with GDBserver.  The
   framework includes some micro-benchmarks and utilities to record
   their performance data, such as execution time and memory usage.

2. Detect performance regressions.  Once the performance data of each
   micro-benchmark is collected, we need to detect or identify
   performance regressions by comparing against a previous run.  This
   becomes more powerful when combined with continuous testing.


2 Related work
==============

* [LNT]
  LNT was written for LLVM, but is *designed* to be usable for
  performance testing of any software.  It is written in Python, well
  documented, and easy to set up.  LNT spawns the compiler first and
  then the target program, and records the time usage of both in JSON
  format.  No interaction is involved.  The performance data
  collection in LNT is relatively simple, because it targets a
  compiler.  Once the data are collected, the next step is to present
  them and detect performance regressions, and LNT does a lot of work
  here: the JSON data can be imported into a database and shown
  through the [web], with performance regressions highlighted in red.

* [lldb]
  LLDB has a [performance.py] to measure the speed and memory usage
  of LLDB.  It captures internal events, feeds in some events, and
  records the time usage.  It handles interaction by consuming
  debugging events and taking actions accordingly.  It only collects
  performance data; it does not detect performance regressions.

* libstdc++-v3
  There is a `performance' directory in libstdc++-v3/testsuite/ and a
  header `testsuite_performance.h' in testsuite/util/.  Test cases
  are compiled with the header and run with some large data set to
  calculate the time usage.  This approach is suitable for the
  performance testing of a library.

[LNT] http://llvm.org/docs/lnt/index.html
[web] http://llvm.org/perf/db_default/v4/nts/recent_activity
[lldb] http://lldb.llvm.org/
[performance.py] http://llvm.org/viewvc/llvm-project/lldb/trunk/examples/python/performance.py


3 Design
========

3.1 Requirements
~~~~~~~~~~~~~~~~

+ Drive GDB to perform some operations and record the performance
  data, especially for these cases:
  * libraries are loaded and unloaded in a program that has a large
    number of shared libraries (4096 libraries, for example),
  * a symbol is looked up in a program that has a large number of
    symbols (1 million, for example),
  * single stepping, disassembly, and other operations are performed
    in remote debugging.
+ Both native debugging and remote debugging are supported.
+ Display the performance data in some format, plain text or HTML.
+ Detect performance regressions.
  In functional regression testing, we can simply diff two `gdb.sum'
  files to learn about regressions and progressions.  In performance
  testing, we need to analyze the performance data of two runs to
  find regressions, instead of simply comparing them with diff.
+ Highlight regressions.
  It makes sense to show only the regressions and progressions that
  exceed a certain threshold, 5% for example.  (A sketch of such a
  comparison follows section 3.2.)

The first three requirements are the minimum set and can be met in
the short term.  Our ultimate goal is to keep track of GDB's
performance and improve it in some areas, rather than to develop a
full-featured performance testing framework.  In the long term, we
can improve the framework gradually and meet the last two
requirements.


3.2 Design
~~~~~~~~~~

+ Use `dejagnu' to invoke the compiler to compile the test cases and
  to start GDB (and/or GDBserver).  This is the same as the
  functional regression testing we do nowadays.  We choose `dejagnu'
  here because it handles GDB testing, especially when GDBserver is
  used, very well; we don't have to reinvent the wheel in Python.

+ GDB loads a Python script, in which some operations are performed
  and performance data (time and memory usage) is collected into a
  file.  The performance test is driven by Python, because GDB has a
  good Python binding now.  We can also use Python to collect the
  performance data, process it, and draw graphs, which is very
  convenient.

+ Emulate the effects of a large program instead of using a real
  large program.  Performance problems show up when the program is
  *large* enough, in terms of the number of symbols or shared
  libraries.  Using a real large program can trigger the problems,
  but it is then hard for other people to reproduce them.  Tests like
  the following can be run regularly.

  1. When we test the performance of GDB handling shared libraries,
     we can use the .exp script to generate a large number of C
     files, compile them into shared libraries, and let the main
     executable load these libraries, in order to measure the
     performance.  (A sketch of such a generator follows this
     section.)

  2. When we test the performance of GDB reading in and looking up
     symbols, we can either fake a lot of debug information in the
     executable or fake a lot of `objfile', `symtab' and `symbol'
     objects inside GDB.  We may extend `jit.c' to add symbols on the
     fly: `jit.c' is able to add an `objfile' and `symtab' to GDB
     from an external reader, and we could factor this part out to
     add `objfile', `symtab' and `symbol' objects to GDB for
     performance testing purposes.  However, I may be wrong.
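To make design item 1 concrete: the proposal generates the C files
from the .exp script, but the idea can be sketched in a few lines of
Python.  This is only an illustration; the file names, the symbol
names and the use of gcc are assumptions, not part of the proposal's
sources.

,----
| # gen-solibs.py -- hypothetical generator, for illustration only.
| import os
| import subprocess
|
| def generate_solibs (count, outdir):
|     """Write COUNT trivial C files and build each one into a shared
|     library, to emulate a program with many shared libraries."""
|     if not os.path.isdir (outdir):
|         os.makedirs (outdir)
|     for i in range (count):
|         src = os.path.join (outdir, "lib%d.c" % i)
|         with open (src, "w") as f:
|             f.write ("int func_%d (void) { return %d; }\n" % (i, i))
|         # Assumes gcc is available on the host.
|         subprocess.check_call (["gcc", "-shared", "-fPIC", "-o",
|                                 os.path.join (outdir, "lib%d.so" % i),
|                                 src])
|
| generate_solibs (128, "solib-testdir")
`----

Generating the sources at test time keeps the benchmark
self-contained, so other people can reproduce the measurements
without shipping a huge program.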
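Similarly, for the regression-detection and highlighting requirements
in section 3.1, the comparison could start out as simple as reading
two result files and flagging the tests whose time grew beyond the
threshold.  Here is a minimal sketch, assuming the plain
`<name> in <seconds>' lines that `perftest.log' uses below; the file
names and helper names are hypothetical, since the proposal does not
fix a result format.

,----
| # compare-perf.py -- hypothetical comparison helper.
|
| def read_results (path):
|     results = {}
|     with open (path) as f:
|         for line in f:
|             name, sep, value = line.strip ().rpartition (" in ")
|             if sep:
|                 results[name] = float (value)
|     return results
|
| def find_regressions (baseline, current, threshold=0.05):
|     # Report the tests whose time grew by more than THRESHOLD (5%).
|     for name, old in baseline.items ():
|         new = current.get (name)
|         if new is not None and old > 0 and (new - old) / old > threshold:
|             print ("REGRESSION: %s: %ss -> %ss" % (name, old, new))
|
| find_regressions (read_results ("perftest.baseline"),
|                   read_results ("perftest.log"))
`----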
4 Example
=========

4.1 single step
~~~~~~~~~~~~~~~

For the micro-benchmark `single-step', there are three source files:
`single-step.c', `single-step.py' and `single-step.exp'.
`single-step.exp' is similar to our regression tests in the
`gdb.python' directory:

,----
| if ![runto_main] {
|     return -1
| }
|
| set remote_python_file [remote_download host ${srcdir}/${subdir}/${testfile}.py]
|
| gdb_test_no_output "python exec (open ('${remote_python_file}').read ())"
|
| send_gdb "call \$perftest()\n"
| set timeout 300
| gdb_expect {
|     -re "\"Done\".*${gdb_prompt} $" {
|     }
|     timeout {}
| }
|
| remote_file host delete ${remote_python_file}
`----

`single-step.py' drives GDB to run the `stepi' command repeatedly and
records the time usage.  Note that class `SingleStep' could be
abstracted in a better way, for example by moving the common code
into a class `TestCase' and extending that in class `SingleStep'; a
sketch of that refactoring follows the code.

,----
| import gdb
| import time
|
| class SingleStep (gdb.Function):
|     def __init__ (self):
|         # Each test has to register a convenience function 'perftest'.
|         super (SingleStep, self).__init__ ("perftest")
|
|     def execute_test (self):
|         test_log = open ("perftest.log", 'a+')
|
|         # Execute the 'stepi' command a number of times, and record
|         # the time usage.
|         for i in range (1, 5):
|             start_time = time.clock ()
|             for j in range (0, i * 300):
|                 gdb.execute ("stepi")
|             elapsed_time = time.clock () - start_time
|             print >>test_log, 'single step %d in %s' % (i * 300, elapsed_time)
|
|         test_log.close ()
|
|     def invoke (self):
|         self.execute_test ()
|         return "Done"
|
| SingleStep ()
`----
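Here is a minimal sketch of the `TestCase' refactoring suggested
above.  The class and method names are hypothetical, and like the
original script it only runs inside GDB:

,----
| import gdb
| import time
|
| class TestCase (gdb.Function):
|     """Hypothetical base class holding the common machinery."""
|
|     def __init__ (self):
|         # Each test still registers the convenience function 'perftest'.
|         super (TestCase, self).__init__ ("perftest")
|
|     def measure (self, label, func):
|         # Time FUNC and append one result line to the log.
|         start_time = time.clock ()
|         func ()
|         self.log.write ("%s in %s\n" % (label, time.clock () - start_time))
|
|     def invoke (self):
|         self.log = open ("perftest.log", 'a+')
|         self.execute_test ()
|         self.log.close ()
|         return "Done"
|
| class SingleStep (TestCase):
|     def execute_test (self):
|         for i in range (1, 5):
|             count = i * 300
|             def do_steps ():
|                 for j in range (count):
|                     gdb.execute ("stepi")
|             self.measure ("single step %d" % count, do_steps)
|
| SingleStep ()
`----

With this split, a new benchmark only has to subclass `TestCase' and
implement `execute_test'.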
* Run `single-step' with GDBserver:

,----
| $ make check RUNTESTFLAGS='--target_board=native-gdbserver single-step.exp'
`----

The resulting `perftest.log' looks like this; each row gives the time
usage for doing a certain number of `stepi' commands:

,----
| single step 300 in 0.19
| single step 600 in 0.35
| single step 900 in 0.57
| single step 1200 in 0.75
`----

* Run `single-step' without GDBserver:

,----
| $ make check RUNTESTFLAGS='--target_board=unix single-step.exp'
`----

and the resulting `perftest.log' looks like:

,----
| single step 300 in 0.06
| single step 600 in 0.08
| single step 900 in 0.14
| single step 1200 in 0.18
`----


4.2 shared library
~~~~~~~~~~~~~~~~~~

The micro-benchmark `solib' tests the performance of GDB handling
shared library loads and unloads.  There are three source files:
`solib.c', `solib.py' and `solib.exp'.  `solib.exp' generates many C
files and compiles them into shared libraries.  `solib.c' is the main
program, which loads these libraries dynamically.  `solib.py' is a
Python script that calls some inferior functions to load the
libraries and measures the time usage.

Here is the performance data; each row gives the time usage for
loading and unloading a certain number of shared libraries.  We can
use this data to track the performance of GDB handling shared
libraries.

,----
| solib 128 in 0.53
| solib 256 in 1.94
| solib 512 in 8.31
| solib 1024 in 47.34
| solib 2048 in 384.75
`----

-- 
Yao (齐尧)