Mirror of the gdb mailing list
* GDB hangs when calling inferior functions
@ 2017-10-11 13:02 Stephen Roberts
       [not found] ` <3b857cc7-e722-f10f-7fea-c3928cc0ac36@redhat.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Stephen Roberts @ 2017-10-11 13:02 UTC
  To: gdb; +Cc: nd

Hi folks,

I'm encountering a hang in GDB which I have traced back to the infrun_async(int) function in infrun.c, and wondered if anyone could shed some light on this part of the code before I raise a bug report. This function is a wrapper around the mark_async_event_handler and clear_async_event_handler functions, which set the "ready" member of an async_signal_handler to 1 or 0, respectively. From what I can tell, this wrapper saves its argument in a file-scoped variable called infrun_is_async to ensure that it only calls the mark/clear functions once if it is called repeatedly with the same argument.

The hang occurs when GDB tries to call inferior functions on two different threads with scheduler-locking turned on. The first call works fine, with the call to infrun_async(1) causing the signal_handler to be marked and the event to be handled, but then the event loop resets the "ready" member to zero, while leaving infrun_is_async set to 1. As a result, GDB hangs if the user switches to another thread and calls a second function because calling infrun_async(1) a second time has no effect, meaning the inferior call events are never handled.

I've been able to hack around this hang by resetting the infrun_is_async variable to zero, but I'm hoping to find a more elegant solution. With that in mind, does anyone know what this logic is for? Most of infrun.c just calls the mark/clear functions directly without going through this interface. Which calls need to use this behavior, and why?
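
The sequence described above can be sketched as a small C model. This is only an illustration, assuming the behaviour is exactly as described in this message: handler_ready, infrun_async_model, event_loop_once and second_call_is_lost are made-up names standing in for the "ready" member, infrun_async(int), one pass of the event loop and the two inferior calls. None of it is GDB's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for illustration; these are NOT GDB's real
   data structures or functions. */
static bool handler_ready;    /* models the handler's "ready" member */
static bool infrun_is_async;  /* models the cached argument of the wrapper */

static void mark_handler(void)  { handler_ready = true; }
static void clear_handler(void) { handler_ready = false; }

/* Models the wrapper: it only touches the handler when the argument
   differs from the value cached by the previous call. */
static void infrun_async_model(bool enable)
{
    if (infrun_is_async != enable) {
        infrun_is_async = enable;
        if (enable)
            mark_handler();
        else
            clear_handler();
    }
}

/* Models one pass of the event loop: a ready handler is reset and then
   run.  Returns true if an event was handled. */
static bool event_loop_once(void)
{
    if (!handler_ready)
        return false;
    handler_ready = false;    /* the cached infrun_is_async is left alone */
    return true;
}

/* Replays two inferior calls.  Returns true if the first event was
   handled but the second one was lost.  When apply_workaround is set,
   the cached flag is reset between the calls. */
bool second_call_is_lost(bool apply_workaround)
{
    handler_ready = false;
    infrun_is_async = false;

    infrun_async_model(true);               /* first inferior call */
    bool first_handled = event_loop_once();
    assert(!handler_ready);                 /* the loop has reset "ready" */

    if (apply_workaround)
        infrun_is_async = false;            /* reset the stale cache */

    infrun_async_model(true);               /* second inferior call */
    bool second_handled = event_loop_once();

    return first_handled && !second_handled;
}
```

In this model, second_call_is_lost(false) returns true, which corresponds to the hang, and second_call_is_lost(true) returns false, which corresponds to the workaround of resetting the cached flag.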

This issue affects all versions after 7.12 (including HEAD). For reference, this is my reproducer for this issue, where get_value is a global function:

break after_thread_creation
run
set scheduler-locking on
thread 1
call get_value()
thread 2
call get_value()
# GDB hangs here

Thanks and Regards,
Stephen Roberts.

* Re: GDB hangs when calling inferior functions
@ 2017-10-11 13:24 Pedro Alves
From: Pedro Alves @ 2017-10-11 13:24 UTC
  To: Stephen Roberts, gdb; +Cc: nd
Message-ID: <3b857cc7-e722-f10f-7fea-c3928cc0ac36@redhat.com>

On 10/11/2017 02:01 PM, Stephen Roberts wrote:
> Hi folks,
>
> I'm encountering a hang in GDB which I have traced back to the
> infrun_async(int) function in infrun.c, and wondered if anyone could
> shed some light on this part of the code before I raise a bug report.

This event source exists to wake up the event loop when we have
pending events to handle recorded in the thread data structures.
I.e., events that we've already pulled out of the
backend (e.g., linux-nat.c), and thus wouldn't otherwise wake up
the event loop.
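
As a rough illustration of that idea, here is a minimal C model. It is a sketch under stated assumptions, not GDB's implementation: struct model_thread, record_pending_event, consume_one_pending and pending_loop_once are hypothetical names, and re-marking the source while events remain is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTHREADS 3

/* Hypothetical per-thread record; GDB's real thread data is far richer. */
struct model_thread {
    bool has_pending_event;   /* event already pulled out of the backend */
};

static struct model_thread threads[NTHREADS];
static bool handler_ready;    /* models the event source's "ready" flag */

/* Record an event against a thread and mark the event source.  The
   backend will not report this event again, so without the mark nothing
   would wake the event loop. */
void record_pending_event(size_t tid)
{
    assert(tid < NTHREADS);
    threads[tid].has_pending_event = true;
    handler_ready = true;
}

/* Handler body: consume the first pending event found and re-mark the
   source if other threads still have events waiting.  Returns the
   index of the thread whose event was consumed, or -1 if none. */
static int consume_one_pending(void)
{
    int found = -1;
    for (size_t i = 0; i < NTHREADS; ++i) {
        if (!threads[i].has_pending_event)
            continue;
        if (found < 0) {
            threads[i].has_pending_event = false;
            found = (int) i;
        } else {
            handler_ready = true;   /* more work remains for a later pass */
        }
    }
    return found;
}

/* One pass of the event loop: returns the thread whose event was
   handled, or -1 if the source was not marked. */
int pending_loop_once(void)
{
    if (!handler_ready)
        return -1;
    handler_ready = false;
    return consume_one_pending();
}
```

In this model the loop does nothing until record_pending_event marks the source, and it keeps being woken until every recorded event has been consumed.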

> This function is a wrapper around the mark_async_event_handler and
> clear_async_event_handler functions, which set the "ready" member of
> an async_signal_handler to 1 or 0, respectively. From what I can
> tell, this wrapper saves its argument in a file-scoped variable
> called infrun_is_async to ensure that it only calls the mark/clear
> functions once if it is called repeatedly with the same argument.

That's correct.

>
> The hang occurs when GDB tries to call inferior functions on two
> different threads with scheduler-locking turned on. The first call
> works fine, with the call to infrun_async(1) causing the
> signal_handler to be marked and the event to be handled, but then the
> event loop resets the "ready" member to zero, while leaving
> infrun_is_async set to 1. As a result, GDB hangs if the user switches
> to another thread and calls a second function because calling
> infrun_async(1) a second time has no effect, meaning the inferior
> call events are never handled.
>
> I've been able to hack around this hang by resetting the
> infrun_is_async variable to zero, but I'm hoping to find a more
> elegant solution. With that in mind, does anyone know what this logic
> is for? Most of infrun.c just calls the mark/clear functions
> directly without going through this interface. Which calls need to
> use this behavior, and why?

I don't recall off hand, unfortunately.  Did you look at
git log/blame?

> This issue affects all versions after 7.12 (including HEAD). For
> reference, this is my reproducer for this issue, where get_value is a
> global function:

Do you have this in gdb testsuite testcase form?
I can't promise to look at this in detail right now,
but the easier you make it to try it out, the better.

>
> break after_thread_creation
> run
> set scheduler-locking on
> thread 1
> call get_value()
> thread 2
> call get_value()
> # GDB hangs here

Thanks,
Pedro Alves



* Re: GDB hangs when calling inferior functions
       [not found] ` <3b857cc7-e722-f10f-7fea-c3928cc0ac36@redhat.com>
@ 2017-10-12 13:33   ` Stephen Roberts
  2017-10-18  9:17     ` Yao Qi
  0 siblings, 1 reply; 3+ messages in thread
From: Stephen Roberts @ 2017-10-12 13:33 UTC
  To: Pedro Alves, gdb; +Cc: nd

Hi Pedro,

Thanks for your reply. Any light you could shed on this would be very much appreciated. I did read the logs but didn't get very far as I'm not very familiar with this code.

I've included a gdb testcase below - hopefully this format is acceptable. This testcase reproduces the issue around 99% of the time on my Ubuntu 16.04 machine. This figure drops closer to 50% when under heavy load, which suggested a race condition. I dug deeper into this and found that the threads which hang are always ones which did not hit the breakpoint but were stopped when another thread did hit a breakpoint. Threads which are stopped at breakpoints are immune to this issue. Loading the system allows more threads to reach the breakpoint before they are stopped by gdb.

Thanks again,
Stephen Roberts.

$ cat > gdb-hangs-on-infcall.c << 'SOURCEFILE'
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define THREADCOUNT 4

pthread_barrier_t barrier;
pthread_t threads[THREADCOUNT];
int thread_ids[THREADCOUNT];

// Returns thread_ids[index]; index must be within range [0..THREADCOUNT)
int get_value(int index) {
  return thread_ids[index];
}

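// Iterative Fibonacci. Unsigned arithmetic is used so that overflow for
// large n (such as the 100 used below) wraps around instead of being
// undefined behaviour.
unsigned long fast_fib(unsigned int n) {
  unsigned long a = 0;
  unsigned long b = 1;
  unsigned long t;
  for (unsigned int i = 0; i < n; ++i) {
    t = b;
    b = a + b;
    a = t;
  }
  return a;
}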

void * thread_function(void *args) {
  // args points at this thread's slot in thread_ids
  int tid = get_value(*((int *) args));
  int status;
  status = pthread_barrier_wait(&barrier);
  if (status == PTHREAD_BARRIER_SERIAL_THREAD) {
    printf("All threads entering compute region\n");
  }
  unsigned long result = fast_fib(100); // testmarker01
  status = pthread_barrier_wait(&barrier);
  if (status == PTHREAD_BARRIER_SERIAL_THREAD) {
    printf("All threads outputting results\n");
  }
  pthread_barrier_wait(&barrier);
  printf("Thread %d Result: %lu\n", tid, result);
  pthread_exit(NULL);
}

int main(void) {
  int err = pthread_barrier_init(&barrier, NULL, THREADCOUNT);
  if (err) {
    fprintf(stderr, "Barrier initialisation failed\n");
    return EXIT_FAILURE;
  }
  // Create worker threads (main)
  printf("Spawning worker threads\n");
  for (int tid = 0; tid < THREADCOUNT; ++tid) {
    thread_ids[tid] = tid;
    err = pthread_create(&threads[tid], NULL, thread_function, (void *) &thread_ids[tid]);
    if (err) {
      fprintf(stderr, "Thread creation failed\n");
      return EXIT_FAILURE;
    }
  }
  // Wait for threads to complete then exit
  for (int tid = 0; tid < THREADCOUNT; ++tid) {
    pthread_join(threads[tid], NULL);
  }
  pthread_exit(NULL);
  return EXIT_SUCCESS;
}
SOURCEFILE



$ cat > gdb-hangs-on-infcall.exp << 'EXPECTFILE'
standard_testfile .c

# What compiler are we using?
#
if [get_compiler_info] {
    return -1
}

if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
    return -1
}

clean_restart ${binfile}

if { ![runto main] } then {
   fail "run to main"
   return
}

gdb_breakpoint [gdb_get_line_number "testmarker01"]
gdb_continue_to_breakpoint "testmarker01"
gdb_test_no_output "set scheduler-locking on"
gdb_test "show scheduler-locking" "Mode for locking scheduler during execution is \"on\"."
gdb_test "thread 4" "Switching to .*"
gdb_test "call get_value(0)" ".* = 0"
gdb_test "thread 3" "Switching to .*"
gdb_test "call get_value(0)" ".* = 0"
gdb_test "thread 2" "Switching to .*"
gdb_test "call get_value(0)" ".* = 0"
gdb_test "thread 1" "Switching to .*"
gdb_test "call get_value(0)" ".* = 0"
EXPECTFILE

* [FOSDEM] Call for Participation: Debugging Tools DevRoom at FOSDEM 2018
@ 2017-10-13 10:39 Ivo Raisr
From: Ivo Raisr @ 2017-10-13 10:39 UTC
  To: gdb

Debugging Tools developer room at FOSDEM 2018
(Brussels, Belgium, February 3rd).

FOSDEM is a free software event that offers open source communities a
place to meet, share ideas and collaborate.  It is renowned for being
highly developer-oriented and brings together 5000+ hackers from all
over the world.  It is held in the city of Brussels (Belgium).
http://fosdem.org/

FOSDEM 2018 will take place during the weekend of Saturday, February 3rd
and Sunday, February 4th 2018.  On Saturday we will have a devroom for
Debugging Tools, jointly organized by the Valgrind and GDB projects.
Devrooms are a place for development teams to meet, discuss, hack and publicly
present the project's latest improvements and future directions.

We will have a whole day to hang out together as a community
embracing debugging tools, such as Valgrind, gdb, RR, Infinity, radare...
Please join us, regardless of whether you are a core hacker,
a plugin hacker, a user, a packager or a hacker on a
project that integrates, extends or complements debugging tools.

** Call for Participation

We would like to organize a series of talks/discussions on various
topics relevant to core hackers, new developers, users and packagers,
as well as cross-project functionality.  Please submit a talk proposal by
Friday December 1st 2017, so we can make a list of activities during the
day.

Some possible topics for talks/discussions are:

- The recently added functional changes (for users).
- Get feedback on what kinds of new functionality would be
useful. Which tools and functionality would users like to see?
- Discuss release/bugfixing strategy/policy (core hackers, packagers).
- Connecting debugging tools together.
- Latest DWARF extensions, going from binary back to source.
- Multi, multi, multi... threads, processes and targets.
- Debugging anything, everywhere. Dealing with complex systems.
- Dealing with the dynamic loader and the kernel.
- Intercepting and interposing functions and events.
- Adding GDB features, such as designing GDB python scripts for your
data structures.
- Advances in gdbserver and the GDB remote serial protocol.
- Adding Valgrind features (adding syscalls for a platform or VEX
instructions for an architecture port).
- Infrastructure changes to the Valgrind JIT framework.
- Your interesting use case with a debugging tool.

** How to Submit

There are two ways.
The first one is to use the FOSDEM 'pentabarf' tool to submit your proposal:
https://penta.fosdem.org/submission/FOSDEM18

- If necessary, create a Pentabarf account and activate it.
  Please reuse your account from previous years if you have
  already created it.

- In the "Person" section, provide First name, Last name
  (in the "General" tab), Email (in the "Contact" tab)
  and Bio ("Abstract" field in the "Description" tab).

- Submit a proposal by clicking on "Create event".

- Important! Select the "Debugging Tools devroom" track
  (on the "General" tab).

- Provide the title of your talk ("Event title" in the "General" tab).

- Provide a description of the subject of the talk and the
  intended audience (in the "Abstract" field of the "Description" tab)

- Provide a rough outline of the talk or goals of the session (a short
  list of bullet points covering topics that will be discussed) in the
  "Full description" field in the "Description" tab

The second way is to email your talk proposal to the
valgrind-devroom-manager@fosdem.org alias.

Julian, Philippe, Mark, Ivosh, and Pedro will review the proposals and
organize the schedule for the day.  Please feel free to suggest or
discuss any ideas for the devroom on the
valgrind-devroom-manager@fosdem.org alias.

** Recording of Talks

As usual, the FOSDEM organisers plan to have live streaming and
recording fully working, both for remote/later viewing of talks, and
so that people can watch streams in the hallways when rooms are full.
This obviously requires speakers to consent to being recorded and
streamed.  If you plan to be a speaker, please understand that by
doing so you implicitly give consent for your talk to be recorded and
streamed.  The recordings will be published under the same licence as
all FOSDEM content (CC-BY).

Important dates:

  Talk/Discussion submission deadline:  Friday    1 Dec 2017
  Devroom schedule announcement:        Friday   15 Dec 2017
  Devroom day:                          Saturday  3 Feb 2018

Hope to see you all at FOSDEM 2018 in the joint Debugging Tools devroom.
Brussels (Belgium), Saturday February 3rd 2018.

On behalf of the devroom committee,
Ivo Raisr



* Re: GDB hangs when calling inferior functions
  2017-10-12 13:33   ` Stephen Roberts
@ 2017-10-18  9:17     ` Yao Qi
  0 siblings, 0 replies; 3+ messages in thread
From: Yao Qi @ 2017-10-18  9:17 UTC
  To: Stephen Roberts; +Cc: Pedro Alves, gdb, nd

Stephen Roberts <Stephen.Roberts@arm.com> writes:

> I've included a gdb testcase below - hopefully this format is acceptable. This testcase reproduces the issue around 99% of the time on my ubuntu 16.04 machine. This figure drops closer to 50% when under heavy load, which suggested a race condition. I dug deeper into this and found that the threads which hang are always ones which did not hit the breakpoint but were stopped when another thread did hit a breakpoint. Threads which are stopped at breakpoints are immune to this issue. Loading the system allows more threads to reach the breakpoint before they are stopped by gdb.

I can reproduce the hang with your test case.  It looks like
inferior_event_handler (INF_EXEC_COMPLETE, ) should be called somewhere,
maybe in infrun, or the thread finite-state machine is in the wrong
state for an inferior call.

-- 
Yao (齐尧)



Thread overview: 3+ messages
2017-10-11 13:02 GDB hangs when calling inferior functions Stephen Roberts
     [not found] ` <3b857cc7-e722-f10f-7fea-c3928cc0ac36@redhat.com>
2017-10-12 13:33   ` Stephen Roberts
2017-10-18  9:17     ` Yao Qi
