From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-125601-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 118381 invoked by alias); 25 Aug 2015 10:43:22 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 118316 invoked by uid 89); 25 Aug 2015 10:43:21 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 25 Aug 2015 10:43:20 +0000
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])	by mx1.redhat.com (Postfix) with ESMTPS id 6C2D78C1A5	for <gdb-patches@sourceware.org>; Tue, 25 Aug 2015 10:43:19 +0000 (UTC)
Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11])	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t7PAhH2V017657;	Tue, 25 Aug 2015 06:43:18 -0400
Message-ID: <55DC46C5.4050808@redhat.com>
Date: Tue, 25 Aug 2015 10:43:00 -0000
From: Pedro Alves <palves@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: Sergio Durigan Junior <sergiodj@redhat.com>,        GDB Patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH] Increase timeout on gdb.base/exitsignal.exp
References: <1440481342-25971-1-git-send-email-sergiodj@redhat.com>
In-Reply-To: <1440481342-25971-1-git-send-email-sergiodj@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-SW-Source: 2015-08/txt/msg00716.txt.bz2

On 08/25/2015 06:42 AM, Sergio Durigan Junior wrote:
> I have noticed that BuildBot is showing random failures of
> gdb.base/exitsignal.exp, specifically when testing on the
> Fedora-ppc64be-native-gdbserver-m64 builder.  Since I wrote this test
> a while ago, I decided to investigate this further.
> 
> This is what you see when you examine gdb.log:
> 
>   Breakpoint 1, main (argc=1, argv=0x3fffffffe3c8) at ../../../binutils-gdb/gdb/testsuite/gdb.base/segv.c:26
>   26	     raise (SIGSEGV);
>   (gdb) print $_exitsignal
>   $1 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitsignal is void before running
>   print $_exitcode
>   $2 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitcode is void before running
>   continue
>   Continuing.
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   0x00003fffb7cbf808 in .raise () from target:/lib64/libc.so.6
>   (gdb) PASS: gdb.base/exitsignal.exp: trigger SIGSEGV
>   continue
>   Continuing.
>   FAIL: gdb.base/exitsignal.exp: program terminated with SIGSEGV (timeout)
>   print $_exitsignal
>   FAIL: gdb.base/exitsignal.exp: $_exitsignal is 11 (SIGSEGV) after SIGSEGV. (timeout)
>   print $_exitcode
>   FAIL: gdb.base/exitsignal.exp: $_exitcode is still void after SIGSEGV (timeout)
>   kill
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   The program no longer exists.
>   (gdb) print $_exitsignal
>   $3 = 11
>   (gdb) print $_exitcode
>   $4 = void
> 
> Clearly a timeout issue: one can see that even though the tests failed
> because the program was still running, both 'print' commands actually
> succeeded later.
>

I recently bumped timeouts for a few reverse/record tests, but in that
case, it's justified because recording requires single-stepping all
instructions, so it naturally takes a while.  In this case, I don't see what
could reasonably be causing the delay.  It shouldn't really ever take 60
seconds just to deliver a signal and have the kernel report back
process exit.  What could cause this delay?  I'm not sure whether the
process's signalled exit status is reported to the parent before or after
the kernel fully writes the core dump --- it occurred to me that if after,
then writing a big core dump could explain a delay.  So I would
suggest switching to a signal that doesn't cause a core dump by default,
e.g., SIGKILL/SIGTERM.  Though in this case, the core dump generated
should be small, so I'm mystified.  This could be papering over some
latent problem...

> I could not reproduce this timeout here, but I decided to propose this
> timeout increase anyway.  I have chosen to increase it by a factor of
> 10; that should give GDB/gdbserver plenty of time to reach the SEGV
> point.
> 
> For clarity, I am also attaching the output of 'git diff -w' here; it
> makes things much easier to visualize.
> 
> OK to apply?


>  gdb_continue_to_end
>  
> -# Checking $_exitcode.  It should be 0.
> -gdb_test "print \$_exitcode" " = 0" \
> -    "\$_exitcode is zero after normal inferior is executed"
> +with_timeout_factor 10 {
> +    # Checking $_exitcode.  It should be 0.
> +    gdb_test "print \$_exitcode" " = 0" \
> +	"\$_exitcode is zero after normal inferior is executed"
>  
> -# Checking $_exitsignal.  It should still be void, since the inferior
> -# has not received any signal.
> -gdb_test "print \$_exitsignal" " = void" \
> -    "\$_exitsignal is still void after normal inferior is executed"
> +    # Checking $_exitsignal.  It should still be void, since the inferior
> +    # has not received any signal.
> +    gdb_test "print \$_exitsignal" " = void" \
> +	"\$_exitsignal is still void after normal inferior is executed"
> +}
> 

This (many instances) doesn't make sense to me, and I don't think it
would fix anything.  Seems to me the bumped timeout, if any, should be
around the continue that caused the first timeout:

# Continue until the end.
gdb_test "continue" "Program terminated with signal SIGSEGV.*" \
    "program terminated with SIGSEGV"
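
That is, something along the lines of the untested sketch below.  It
reuses the with_timeout_factor wrapper and the factor of 10 from the
patch as-is; both are assumptions carried over for illustration, not a
claim that a bigger timeout is the right fix:

```tcl
# Continue until the end.  Only this command gets the longer timeout,
# since it is the one that timed out first in the gdb.log above.
# The factor of 10 is carried over from the proposed patch.
with_timeout_factor 10 {
    gdb_test "continue" "Program terminated with signal SIGSEGV.*" \
	"program terminated with SIGSEGV"
}
```

The later 'print' checks would then stay at the default timeout, since
in the log they only failed as a consequence of this continue not
having finished.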

Thanks,
Pedro Alves