From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-88802-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 16870 invoked by alias); 13 Mar 2012 01:39:44 -0000
Received: (qmail 16855 invoked by uid 22791); 13 Mar 2012 01:39:42 -0000
X-SWARE-Spam-Status: No, hits=-1.5 required=5.0	tests=AWL,BAYES_00,SARE_TOWRITE,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 13 Mar 2012 01:39:27 +0000
Received: from localhost (localhost.localdomain [127.0.0.1])	by filtered-rock.gnat.com (Postfix) with ESMTP id 806191C6916	for <gdb-patches@sourceware.org>; Mon, 12 Mar 2012 21:39:26 -0400 (EDT)
Received: from rock.gnat.com ([127.0.0.1])	by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024)	with LMTP id ycjT6ndhOl5R for <gdb-patches@sourceware.org>;	Mon, 12 Mar 2012 21:39:26 -0400 (EDT)
Received: from joel.gnat.com (localhost.localdomain [127.0.0.1])	by rock.gnat.com (Postfix) with ESMTP id 3C8E91C6654	for <gdb-patches@sourceware.org>; Mon, 12 Mar 2012 21:39:26 -0400 (EDT)
Received: by joel.gnat.com (Postfix, from userid 1000)	id 4AFAA145615; Mon, 12 Mar 2012 18:39:20 -0700 (PDT)
From: Joel Brobecker <brobecker@adacore.com>
To: gdb-patches@sourceware.org
Subject: Problem after hitting breakpoint on Windows (with GDBserver)
Date: Tue, 13 Mar 2012 01:39:00 -0000
Message-Id: <1331602756-23567-1-git-send-email-brobecker@adacore.com>
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2012-03/txt/msg00424.txt.bz2

Hello,

This is a patch that fixes a problem that started appearing on Feb 24th
after support for target-side condition evaluation got added. The problem
occurs on Windows when using GDB+GDBserver.

I decided to write a long-ish introductory message rather than putting
that information with the patch, because it's a lot of details which
I was afraid might dilute a bit the important piece for the patch email.
You may just look at the patch email itself, where the revision history
should provide the minimum amount of info to understand the problem.

In a nutshell, GDB uses memory read/writes to insert/remove breakpoints,
and breakpoint restoration fails. This eventually leads to this kind of
problem after hitting a breakpoint:

        (gdb) b foo
        Breakpoint 1 at 0x40177e: file foo.adb, line 5.
        (gdb) cont
        Continuing.

        Breakpoint 1, foo () at foo.adb:5
        5          Put_Line ("Hello World.");  -- STOP
        (gdb) n

        Program received signal SIGSEGV, Segmentation fault.
        0x00401782 in foo () at foo.adb:5
        5          Put_Line ("Hello World.");  -- STOP

Here is what's happening, in chonological order:

  . When the user types "cont", our breakpoint gets inserted.
    Contrary to GNU/Linux systems, the z0/Z0 packets are not
    supported, so we use memory read/writes instead.

    So, mem-break.c:default_memory_insert_breakpoint reads the
    (piece of) instruction that will be replaced by our breakpoint,
    saves that in the breakpoint location's shadow_contents buffer,
    and then writes the breakpoint instruction.

    So far, so good.

  . GDB receives a new-shared-library event from the kernel.
    GDB processes it, calling solib_add, which calls breakpoint_re_set.
    Basically, new shared library => we need to re-eval our breakpoints,
    to see if something changed. Unfortunately, part of the process
    involves update_breakpoint_locations replacing the breakpoint's
    bp_loc chain by an entirely new chain, as so:

     b->loc = NULL
     [...]
     for (all sals)
       [...]
       new_loc = add_location_to_breakpoint (b, &(sals.sals[i]));

    A side-effect of this is the fact that the location's condition_changed
    flag got reset to condition_modified (see init_bp_location's call to
    mark_breakpoint_location_modified).

  . As a result of this, update_global_location_list sets the new
    location's needs_update flag. (it also makes sure that the new
    location's "inserted" flag is set again).

  . This eventually leads to insert_bp_location trying to re-insert
    the breakpoint...

  . That's when things get interesting: The first thing that happens
    is that insert_bp_location actually wipes out the location's
    target_info data, which means that the breakpoint's shadow_contents
    info is lost. As a result, we are unable to restore the original
    instruction under our breakpoint after we hit it (before transfering
    control back to the user).  This explains the SIGTRAP, because we have
    modified the code executed by the inferior by restoring a corrupted
    shadow_contents.

    The first part of the fix avoids the re-initialization of the
    whole target_info structure, avoid the wipe.

  . But that's not enough. Still while trying to re-insert the breakpoint,
    we land inside mem-break.c:default_memory_insert_breakpoint.
    This function does not seem to have enough information to determine
    whether the breakpoint is already inserted or not. So it just
    assumes that it hasn't and performs two actions:
       - Save the original code in the shadow_contents buffer;
       - Write the breakpoint instruction in its place.

    That's where we hit an unexpected limitation of target_read_memory.
    This funtion knows about breakpoints and uses the shadow_contents
    buffer if the read touches one of the areas where a breakpoint is
    still inserted. Unfortunately, the way the code is written, here is
    what happens:
       - memory_xfer_partial fetches the raw memory, potentially polluted
         by inserted breakpoints, saves the result in the target buffer;
       - memory_xfer_partial then calls breakpoint_xfer_memory to update
         the target buffer with pieces of shadow_contents as appropriate
         to remove the inserted breakpoints from the result.
    This approach breaks down when the buffer passed to these read
    functions is actually a shadow_contents buffer, because the first
    read ends up overwriting the shadow_contents, thus rendering the
    call to breakpoint_xfer_memory useless.

    One way to fix that problem was to lift that limitation, by using
    a temporary buffer inside memory_xfer_partial, and then copy the
    contents of that buffer into the target buffer after the two steps
    are complete. But Pedro felt it was too heavy a hammer.

    So instead, we fixed the two call sites where a target-read
    operation is performed with a breakpoint shadow_contents buffer
    as the argument.

One side comment:

  * It looks like "b->loc = NULL" in update_breakpoint_locations
    is a memory leak, since the locations' ref counters are not
    decremented?