From: Patrick Palka
Date: Fri, 24 Jul 2015 14:16:00 -0000
Message-ID:
Subject: Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64) (was: Re: [PATCH] Don't truncate the history file when history size is unlimited)
To: Sergio Durigan Junior
Cc: "gdb-patches@sourceware.org"
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2015-07/txt/msg00701.txt.bz2

On Fri, Jul 24, 2015 at 10:03 AM, Patrick Palka wrote:
> On Thu, Jul 23, 2015 at 3:33 PM, Patrick Palka wrote:
>> On Thu, Jul 23, 2015 at 2:42 PM, Sergio Durigan Junior
>> wrote:
>>> On Tuesday, June 16 2015, Patrick Palka wrote:
>>>
>>>> We still do not handle "set history size unlimited" correctly.  In
>>>> particular, after writing to the history file, we truncate the history
>>>> even if it is unlimited.
>>>>
>>>> This patch makes sure that we do not call history_truncate_file() if the
>>>> history is not stifled (i.e. if it's unlimited).  This bug causes the
>>>> history file to be truncated to zero on exit when one has "set history
>>>> size unlimited" in their gdbinit file.  Although this code exists in GDB
>>>> 7.8, the bug is masked by a pre-existing bug that's been only fixed in
>>>> GDB 7.9 (PR gdb/17820).
>>>
>>> Hey Patrick,
>>>
>>> Looking at the BuildBot logs today, I found that this new test is
>>> failing occasionally on native-extended-gdbserver testing.  Take a look
>>> at the following build:
>>>
>>>
>>>
>>> You can see that gdb.base/gdbinit-history.exp failed:
>>>
>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands
>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: creating: server show commands
>>>
>>> The gdb.log is here:
>>>
>>>
>>>
>>> I haven't really investigated to determine what's going on here, but let
>>> me know if you need any help with this.
>>
>> Thanks for the heads up.
>>
>> When doing gdb_exit followed by gdb_start, in the output log sometimes
>> we have (this is printed shortly before the first FAIL)
>>
>>     (gdb) ...
>>     Remote debugging from host 127.0.0.1
>>     monitor exit
>>     spawn ...
>>     GNU gdb (GDB) 7.10.50.20150723-cvs
>>     ...
>>
>> Other times we have (this is printed shortly before the second FAIL)
>>
>>     (gdb) ...
>>     Remote debugging from host 127.0.0.1
>>     monitor exit
>>     (gdb) spawn ...
>>     GNU gdb (GDB) 7.10.50.20150723-cvs
>>     ...
>>
>> The literal difference being the "(gdb) " prompt printed before the
>> "spawn" message.  In the first case (where the "(gdb) " prefix is not
>> there) the history file does not seem to be written/appended to.  In
>> the second case (when the "(gdb) " prefix is there) the history file
>> is properly written/appended to (but it still FAILs because we're
>> missing the command history from before the first case).  So the race,
>> if there is one, may have something to do with whether or not the
>> "(gdb) " prompt gets printed after doing "monitor exit".  Or maybe
>> not.  I'll do more analysis later.
>
> After further analysis I don't think there is any correlation between
> whether "(gdb) spawn ..." or else "spawn ..." gets printed, and if a
> race between "gdb_exit" and "gdb_start" occurs.  In fact, I don't
> think there even is a race between gdb_exit and gdb_start.  If I add
> "after 100" (i.e. TCL's way of sleeping 100ms) in the middle of such
> sequences (i.e. gdb_exit; after 100; gdb_start) in
> gdbinit-history.exp, I can still reproduce the intermittent FAILs.
>
> So it may be the case that sometimes gdb_exit does not kill the GDB
> process properly, and by doing so the process doesn't get a chance to
> save to the history file.  And it only happens with
> extended-gdbserver, not with gdbserver or non-gdbserver.
> A unique code path taken only by the extended-gdbserver target is the
> code in gdbserver-support.exp:gdb_exit guarded by "if {[info exists
> gdb_spawn_id] && [info exists server_spawn_id]}", and then the
> close_gdbserver proc that follows.  However if I just outright delete
> that code (which should make the exit logic nearly identical with the
> gdbserver target) I can still trigger the intermittent FAILs...  Maybe
> it's an issue in dejagnu?  Or could it be an obscure bug in GDB??

One thing I noticed while repeatedly running the test with
--target_board=native-extended-gdbserver --verbose --verbose is that
it sometimes takes a few seconds to kill the GDB process.  The output
log pauses at

    Quitting /scratchpad/binutils-gdb-build/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /scratchpad/binutils-gdb-build/gdb/testsuite/../data-directory -ex "set auto-connect-native-target off"
    Closing the remote shell exp8
    doing kill, pid is 16925
    pid is 16925

for 5 or more seconds.  I only see this happen with
native-extended-gdbserver.  I suppose it's waiting for the command
(which gets spawned by dejagnu in its standard_close proc)

    sh -c exec > /dev/null 2>&1 && (kill -2 -16925 || kill -2 16925) && sleep 5 && (kill -16925 || kill 16925) && sleep 5 && (kill -9 -16925 || kill -9 16925) &

to finish sleeping.
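
To make the timing of that kill chain easier to reason about, here is a
minimal sketch of the same escalation (SIGINT, then SIGTERM, then
SIGKILL) as a standalone shell function.  The name escalate_kill and the
configurable delay are hypothetical and only for illustration; this is
not dejagnu's actual code, just the quoted command rewritten with a
parameter in place of the hard-coded PID and the 5 second sleeps.

```shell
#!/bin/sh
# Hypothetical helper (not part of dejagnu): mirrors the quoted kill chain.
#   $1 = PID to kill, $2 = seconds to sleep between signals (assumed knob).
# Each step tries the process group ("-PID") first and falls back to the
# PID itself.  The steps are joined with &&, so the chain stops at the
# first step whose kill attempts both fail, e.g. because the target is
# already gone.  Note that the sleep runs *before* that check, so at least
# one full delay elapses even if the target dies on the first signal.
escalate_kill() {
    pid=$1
    delay=$2
    (kill -2 "-$pid" || kill -2 "$pid") 2>/dev/null &&
        sleep "$delay" &&
        (kill "-$pid" || kill "$pid") 2>/dev/null &&
        sleep "$delay" &&
        (kill -9 "-$pid" || kill -9 "$pid") 2>/dev/null
}
```

Calling escalate_kill "$pid" 5 would correspond to the delays in the
command above.  Since the first sleep always runs once the SIGINT has
been delivered, a pause of at least one full delay is expected from this
chain regardless of how quickly the target exits, which would be
consistent with the "5 or more seconds" observed here.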