From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 99812 invoked by alias); 17 Aug 2015 13:16:09 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 99790 invoked by uid 89); 17 Aug 2015 13:16:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mail-ob0-f173.google.com Received: from mail-ob0-f173.google.com (HELO mail-ob0-f173.google.com) (209.85.214.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 17 Aug 2015 13:16:06 +0000 Received: by obnw1 with SMTP id w1so111210293obn.3 for ; Mon, 17 Aug 2015 06:16:04 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=o2QHLuc4lmzIfjguTmE8gqFfZRl0U930+Mi1xdjybL8=; b=OcVJF2BZk1x9j6IyAx5zRUKuNM/UJGRRt/UEBdL6VB7+Rw+nPxKK2T0k56eVDEjzo0 H67SdcCsXCK8Bkbi2mTVy4AJVnXYhJlD9uyf6KL+Nc8qs3uEb51d6BKo4ogxN4Puau6f o7Dq3juy/XU4/M0RwCYGWITBsGyFDVPlDg8X+EjKJALrpD7rLgDu8UiL/EM4ZVE6QXdl RziCGljXScTJ2vS36tMkRO4KZLiRF6ho1LkIq021wUudWOug9RcyGgkOsp1WlQMKns4N oYGCVMPTcCWx0qaYq1un32OvAb/eh0WhaV65fHVV0joP+tYJ94p+F1v3IrNQxDmTIHHp AuFQ== X-Gm-Message-State: ALoCoQkrOwY595cXHARkqG/BXlGN5yXc1OYcbGAr07KIHQSA/J61mOoImWugZvJ9hJyqn6OzzKzQ X-Received: by 10.60.52.110 with SMTP id s14mr1142989oeo.14.1439817364189; Mon, 17 Aug 2015 06:16:04 -0700 (PDT) MIME-Version: 1.0 Received: by 10.182.56.202 with HTTP; Mon, 17 Aug 2015 06:15:44 -0700 (PDT) In-Reply-To: <55CD27AE.2090002@redhat.com> References: <1433878062-23560-1-git-send-email-patrick@parcs.ath.cx> <1434466413-28892-1-git-send-email-patrick@parcs.ath.cx> <87mvym67i5.fsf_-_@redhat.com> <87zj1un7ro.fsf@redhat.com> <55CD27AE.2090002@redhat.com> From: Patrick Palka Date: Mon, 17 Aug 2015 13:16:00 -0000 Message-ID: Subject: Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64) To: Pedro Alves Cc: Sergio Durigan Junior , "gdb-patches@sourceware.org" Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2015-08/txt/msg00418.txt.bz2 On Thu, Aug 13, 2015 at 7:26 PM, Pedro Alves wrote: > On 08/13/2015 11:27 PM, Sergio Durigan Junior wrote: >> On Thursday, August 13 2015, Patrick Palka wrote: >> >>> On Thu, Jul 23, 2015 at 2:42 PM, Sergio Durigan Junior >>> wrote: >>>> On Tuesday, June 16 2015, Patrick Palka wrote: >>>> >>>>> We still do not handle "set history size unlimited" correctly. In >>>>> particular, after writing to the history file, we truncate the history >>>>> even if it is unlimited. >>>>> >>>>> This patch makes sure that we do not call history_truncate_file() if the >>>>> history is not stifled (i.e. if it's unlimited). This bug causes the >>>>> history file to be truncated to zero on exit when one has "set history >>>>> size unlimited" in their gdbinit file. Although this code exists in GDB >>>>> 7.8, the bug is masked by a pre-existing bug that's been only fixed in >>>>> GDB 7.9 (PR gdb/17820). >>>> >>>> Hey Patrick, >>>> >>>> Looking at the BuildBot logs today, I found that this new test is >>>> failing occasionally on native-extended-gdbserver testing. Take a look >>>> at the following build: >>>> >>>> >>>> >>>> You can see that gdb.base/gdbinit-history.exp failed: >>>> >>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands >>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: creating: server show commands >>>> >>>> The gdb.log is here: >>>> >>>> >>>> >>>> I haven't really investigated to determine what's going on here, but let >>>> me know if you need any help with this. >>> >>> Hi Sergio, >>> >>> Have you seen these spurious FAILs pop up recently? I wonder if the >>> fixes to SIGTERM handling I made a few weeks ago may have fixed this >>> as well. >> >> Hey Patrick, >> >> Yeah, I still see those FAIL's from time to time: >> >> >> >> >> I may be wrong, but I noticed that the FAIL's happen only on the PPC >> buildslaves. >> > > Here's a theory: > > (gdb) server show commands > 1 set height 0 > 2 set width 0 > 3 target extended-remote localhost:2348 > 4 print 1 > 5 monitor exit > 6 set height 0 > 7 set width 0 > 8 target extended-remote localhost:2349 > (gdb) PASS: gdb.base/gdbinit-history.exp: truncation: creating: server show commands > Remote debugging from host 127.0.0.1 > monitor exit > (gdb) spawn /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/gdb-buildbot/fedora > -ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../data-directory -x /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/bu > ild/gdb/testsuite/outputs/gdb.base/gdbinit-history/gdbinit-history.gdbinit -ex set auto-connect-native-target off > GNU gdb (GDB) 7.10.50.20150812-cvs > ... > (gdb) set height 0 > (gdb) set width 0 > (gdb) spawn ../gdbserver/gdbserver --once --multi :2350 > Listening on port 2350 > target extended-remote localhost:2350 > Remote debugging using localhost:2350 > (gdb) server show commands > 1 set height 0 > 2 set width 0 > 3 target extended-remote localhost:2350 > (gdb) FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands > > Observations: > > #1 - In the FAILing case, the history only shows commands invoked in that > gdb session. That means that either that GDB invocation failed to > load the history file, or the previous invocation failed to write it. > > #2 - dejagnu's standard_close, what is ultimately used to exit gdb, > does this: > > 2.1 - send gdb a SIGINT. This might help by interrupting any command > that gdb might be stuck in, getting it back to > the event loop processing input. > > 2.2 - close gdb's ptty, in effect closing gdb's stdin/stdout. > > 2.3 - if gdb doesn't exit after 5 seconds, kills it > with SIGTERM. If it still doesn't exit after another 5 > seconds, dejagnu kill it with SIGKILL. > > Unless you're running dejagnu git master, 2.1 and 2.2 happen > in parallel. > > I'm assuming that 2.3 is not happening. It's 2.2 that normally > causes gdb to exit. See event-top.c:stdin_event_handler. > > - Because, due to 2.2, stdin/stdout are already closed when gdb is > running the teardown code (quit_force, what saves history), any > warning/output that gdb might output goes nowhere, it won't > appear in gdb.log. > > #3 - If you'll recall, not long ago we made the code that appends > the commands to the history file roughly: > > - mv old-history-file old-history-file.tmp > - append to old-history-file.tmp > - mv old-history-file.tmp old-history-file > > That is top.c:gdb_safe_append_history. > > AFAICS, SIGINT is not set to SA_RESTART. Now imagine what happens > if the SIGINT lands exactly just when gdb is about to call > rename here, in gdb_safe_append_history: > > ret = rename (local_history_filename, history_filename); > saved_errno = errno; > if (ret < 0 && saved_errno != EEXIST) > warning (_("Could not rename %s to %s: %s"), > local_history_filename, history_filename, > safe_strerror (saved_errno)); > } > > and rename fails with EINTR. Then the following gdb invocation > would not find any previous history file to load, which would > perfectly explain that FAIL. > > If we still have the build/test directory of that test run, > if the theory is right, I think we'd find a > leftover gdbinit-history.gdb_history-gdb-$PID~ file. > > What doesn't look right in the theory is that the rename man > does not document rename as failing with EINTR... But > maybe it does? Interesting theory. I had assumed that the problem lies in the testsuite driver, not in the GDB source code, because the problem only manifests itself under native-extended-gdbserver. If I understand correctly, your explanation is also plausible under native or native-gdbserver, yet the problem was not observed to ever manifest itself under these targets. Also, if rename fails with EINTR, wouldn't a "Could not rename %s to %s: %s" warning be emitted in the gdb output log?