From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <55DC46C5.4050808@redhat.com>
Date: Tue, 25 Aug 2015 10:43:00 -0000
From: Pedro Alves
To: Sergio Durigan Junior, GDB Patches
Subject: Re: [PATCH] Increase timeout on gdb.base/exitsignal.exp
References: <1440481342-25971-1-git-send-email-sergiodj@redhat.com>
In-Reply-To: <1440481342-25971-1-git-send-email-sergiodj@redhat.com>

On 08/25/2015 06:42 AM, Sergio Durigan Junior wrote:
> I have noticed that BuildBot is showing random failures of
> gdb.base/exitsignal.exp, specifically when testing on the
> Fedora-ppc64be-native-gdbserver-m64 builder.  Since I wrote this test
> a while ago, I decided to investigate this further.
>
> This is what you see when you examine gdb.log:
>
>   Breakpoint 1, main (argc=1, argv=0x3fffffffe3c8) at ../../../binutils-gdb/gdb/testsuite/gdb.base/segv.c:26
>   26 raise (SIGSEGV);
>   (gdb) print $_exitsignal
>   $1 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitsignal is void before running
>   print $_exitcode
>   $2 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitcode is void before running
>   continue
>   Continuing.
>
>   Program received signal SIGSEGV, Segmentation fault.
>   0x00003fffb7cbf808 in .raise () from target:/lib64/libc.so.6
>   (gdb) PASS: gdb.base/exitsignal.exp: trigger SIGSEGV
>   continue
>   Continuing.
>   FAIL: gdb.base/exitsignal.exp: program terminated with SIGSEGV (timeout)
>   print $_exitsignal
>   FAIL: gdb.base/exitsignal.exp: $_exitsignal is 11 (SIGSEGV) after SIGSEGV. (timeout)
>   print $_exitcode
>   FAIL: gdb.base/exitsignal.exp: $_exitcode is still void after SIGSEGV (timeout)
>   kill
>
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   The program no longer exists.
>   (gdb) print $_exitsignal
>   $3 = 11
>   (gdb) print $_exitcode
>   $4 = void
>
> Clearly a timeout issue: one can see that even though the tests failed
> because the program was still running, both 'print' commands actually
> succeeded later.
>

I recently bumped timeouts for a few reverse/record tests, but in
that case, it's justified because recording requires single-stepping
all instructions, so it naturally takes a while.  In this case, I
don't see what could reasonably be causing the delay.  It shouldn't
really ever take 60 seconds just to deliver a signal and have the
kernel report the process exit back.  What could cause this delay?
I'm not sure whether the process's signalled exit status is reported
to the parent before or after the kernel fully writes the core dump
--- it occurred to me that if after, then writing a big core dump
could explain a delay.  So I would suggest switching to a signal that
does not cause a core dump by default, e.g., SIGKILL/SIGTERM.  Though
in this case, the core dump generated should be small, so I'm
mystified.  This could be papering over some latent problem...

> I could not reproduce this timeout here, but I decided to propose this
> timeout increase anyway.  I have chosen to increase it by a factor of
> 10; that should give GDB/gdbserver plenty of time to reach the SEGV
> point.
>
> For clarity, I am also attaching the output of 'git diff -w' here; it
> makes things much easier to visualize.
>
> OK to apply?

>  gdb_continue_to_end
>
> -# Checking $_exitcode.  It should be 0.
> -gdb_test "print \$_exitcode" " = 0" \
> -    "\$_exitcode is zero after normal inferior is executed"
> +with_timeout_factor 10 {
> +    # Checking $_exitcode.  It should be 0.
> +    gdb_test "print \$_exitcode" " = 0" \
> +        "\$_exitcode is zero after normal inferior is executed"
>
> -# Checking $_exitsignal.  It should still be void, since the inferior
> -# has not received any signal.
> -gdb_test "print \$_exitsignal" " = void" \
> -    "\$_exitsignal is still void after normal inferior is executed"
> +    # Checking $_exitsignal.  It should still be void, since the inferior
> +    # has not received any signal.
> +    gdb_test "print \$_exitsignal" " = void" \
> +        "\$_exitsignal is still void after normal inferior is executed"
> +}
>

This (in its many instances) doesn't make sense to me, and I don't
think it would fix anything.  It seems to me that the bumped timeout,
if any, should be around the continue that caused the first timeout:

    # Continue until the end.
    gdb_test "continue" "Program terminated with signal SIGSEGV.*" \
        "program terminated with SIGSEGV"

Thanks,
Pedro Alves
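
For illustration only, the suggestion above could be sketched roughly
as follows.  This is untested; it assumes the with_timeout_factor
wrapper used in the quoted patch, and the factor of 10 is simply
carried over from that patch rather than a measured value.  Whether
any bump is warranted at all remains an open question.

```tcl
# Sketch only (untested): scale the timeout just for the step that
# actually waits for the inferior to die, instead of wrapping the
# "print" checks that follow it.
# Assumption: the factor 10 is taken from the proposed patch.
with_timeout_factor 10 {
    # Continue until the end.
    gdb_test "continue" "Program terminated with signal SIGSEGV.*" \
        "program terminated with SIGSEGV"
}
```

With the wait confined to the continue, the later "print \$_exitsignal"
and "print \$_exitcode" checks would keep the default timeout, since
they only start once GDB has already reported the termination.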