Message-ID: <4DF1BB97.6030300@gmail.com>
Date: Fri, 10 Jun 2011 06:39:00 -0000
From: pi3orama
To: Yao Qi
CC: gdb@sourceware.org
Subject: Re: ReBranch - a record-replay debugging tool
References: <4DF0F72D.2020001@gmail.com> <4DF1B0CE.7070506@codesourcery.com>
In-Reply-To: <4DF1B0CE.7070506@codesourcery.com>

>> ReBranch demands "control-flow only debugging" because it only records
>> every branch instruction. In current implementation (the modified
>> version of gdbserver), the replayer still need to create a process and
>> use ptrace to control it.
>> When data-flow have error (caused by data-race
>> in multi threading situation), the ptraced process will generate
> IIUC, you have an assumption there that "during record, program runs
> correctly, while during replace, program may run incorrectly",
> otherwise, you can't find the problem by comparing control-flow.
>
> The first half of assumption is OK, but the second half, which is more
> important, might not be. Some research works are focusing on how to
> expose/trigger non-deterministic program's faults. Your current
> approach in replaying looks fine, but if you want to find more bugs in
> replay, that might be the way to go.
>

I don't think I make that assumption. You are right that ReBranch
doesn't trigger non-deterministic bugs. In fact, it doesn't know whether
the program runs correctly or not. ReBranch's job is to faithfully
record and replay the execution. If a problem arises during the recorded
execution, ReBranch lets it arise again in replay and lets developers
look into its control flow. This converts non-deterministic bugs into
deterministic ones: developers can replay the program as many times as
needed to find the bug. Without such a record/replay tool this is hard
to do: when developers notice the bug and start using GDB, the bug
disappears.

>> segfault for every instructions, which slows down the performance.
>>
> I don't understand why "process will generate segfault for every
> instruction"?
>

Nearly every memory-accessing instruction will segfault shortly after a
new thread is created. Currently, although all threads can be recorded,
ReBranch is unable to replay them simultaneously; users can only replay
and watch one thread at a time. Because the communication between
threads through shared memory is lost, ReBranch sometimes has to twist a
branch, since the replayed thread doesn't have the same data flow as the
original one. The effect can propagate.
For example, during the original execution, a thread gets a pointer from
another thread, retrieves some data structure through it, and then does
some computation. During replay, it won't get the correct pointer
because the other thread isn't there, but the corresponding branch
instructions are twisted, so it will retrieve data from an invalid
address. A segfault is generated; ReBranch intercepts it and lets the
replay continue. The following computation is then invalid, which forces
ReBranch to twist more branches. In our experience, the propagation is
very fast: soon after pthread_create returns, the segfaults are already
heavily slowing down the replay.

>> ReBranch have a GUI replayer -- ReBranchK -- which is a simple
>> control-flow-only debugging tool. ReBranchK doesn't really create the
>> process and debug it. It 'executes' the program virtually by reads the
>> log and shows corresponding source code. It implements 's', 'b' and 'c'
>> command. However, when writing ReBranchK, I found that, without stack
>> information, many useful control-flow command such as 'n' and 'bt' are
>> hard to be implemented. Therefore, I hope someone help me to put this
>> "control-flow only debugging" function into gdbserver.
> IIUC, you want to put your replayer function into gdbserver, in which,
> record log file, and checkpoints are input. User can connect to this
> special gdbsever to debug. Is that correct?

Yes. In particular, I want gdbserver to handle stack frame information
so that the 'n', 'bt' and 'finish' commands can be implemented.

> Generally speaking, gdbserver supports multiple archs, so is your record
> log file format and checkpoints portable?

I would like them to be portable. However, currently both the checkpoint
files and the log files are platform specific: checkpoint files contain
some CPU-specific information, and log files contain some
platform-specific system call information.
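
To make the point about stack information more concrete, here is a
minimal sketch of a virtual, control-flow-only replayer. It is not
ReBranch's or ReBranchK's actual code. The Event record and its
"branch"/"call"/"ret" kinds are a hypothetical log format invented for
this illustration, on the assumption that call and return events can be
told apart from ordinary branches. Given that, 's' and 'c' only need to
walk the log, while 'n', 'finish' and 'bt' need the call depth:

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One record of a hypothetical branch log (not ReBranch's real format)."""
    kind: str  # "branch", "call", or "ret"
    line: int  # source line reached after the event
    func: str  # function containing that line (callee for "call", caller for "ret")


@dataclass
class VirtualReplayer:
    """Walks the log without creating a process, tracking a shadow call stack."""
    log: list
    pos: int = 0
    stack: list = field(default_factory=lambda: ["main"])
    breakpoints: set = field(default_factory=set)

    def step(self):
        """'s': consume one record; return it, or None at the end of the log."""
        if self.pos >= len(self.log):
            return None
        ev = self.log[self.pos]
        self.pos += 1
        if ev.kind == "call":
            self.stack.append(ev.func)
        elif ev.kind == "ret" and len(self.stack) > 1:
            self.stack.pop()
        return ev

    def cont(self):
        """'c': step until a record lands on a breakpoint line."""
        while (ev := self.step()) is not None:
            if ev.line in self.breakpoints:
                return ev
        return None

    def next_line(self):
        """'n': step, then keep going until we are back at the starting depth."""
        depth = len(self.stack)
        ev = self.step()
        while ev is not None and len(self.stack) > depth:
            ev = self.step()
        return ev

    def finish(self):
        """'finish': step until the current frame has returned."""
        depth = len(self.stack)
        ev = self.step()
        while ev is not None and len(self.stack) >= depth:
            ev = self.step()
        return ev

    def backtrace(self):
        """'bt': innermost frame first."""
        return list(reversed(self.stack))
```

Without the call/return distinction, next_line(), finish() and
backtrace() have nothing to work from, which is the difficulty described
above; step() and cont() would still work unchanged.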