From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-64507-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 10872 invoked by alias); 15 Jun 2009 17:52:51 -0000
Received: (qmail 10863 invoked by uid 22791); 15 Jun 2009 17:52:50 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0 	tests=AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from smtp-outbound-2.vmware.com (HELO smtp-outbound-2.vmware.com) (65.115.85.73)     by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 15 Jun 2009 17:52:42 +0000
Received: from mailhost2.vmware.com (mailhost2.vmware.com [10.16.67.167]) 	by smtp-outbound-2.vmware.com (Postfix) with ESMTP id 872681E015; 	Mon, 15 Jun 2009 10:52:38 -0700 (PDT)
Received: from [10.20.94.141] (msnyder-server.eng.vmware.com [10.20.94.141]) 	by mailhost2.vmware.com (Postfix) with ESMTP id 6E1878EA20; 	Mon, 15 Jun 2009 10:52:38 -0700 (PDT)
Message-ID: <4A368A89.8020706@vmware.com>
Date: Mon, 15 Jun 2009 17:52:00 -0000
From: Michael Snyder <msnyder@vmware.com>
User-Agent: Thunderbird 1.5.0.12 (X11/20080411)
MIME-Version: 1.0
To: Hui Zhu <teawater@gmail.com>
CC: Eli Zaretskii <eliz@gnu.org>,   "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [Precord RFA/RFC] Check Linux sys_brk release memory in process  	record and replay.
References: <daef60380905050607yc499e39n34ac56108225f6ec@mail.gmail.com>	 <daef60380905052035v2758da1ak27fabfd902dff32d@mail.gmail.com>	 <83tz3ycchv.fsf@gnu.org>	 <daef60380905061921k7ced158eu2a74bab604a700e1@mail.gmail.com>	 <83k54td2el.fsf@gnu.org>	 <daef60380905110006gaa1ac38r870f6be455a402fc@mail.gmail.com>	 <daef60380906081917k3eabe59an2ba2a2983c139d10@mail.gmail.com>	 <4A342EBB.4020601@vmware.com>	 <daef60380906140226x215008b5sdad2051613274469@mail.gmail.com>	 <4A3536B8.1090508@vmware.com> <daef60380906150104x4816b264ub6897b9c0fce3eda@mail.gmail.com>
In-Reply-To: <daef60380906150104x4816b264ub6897b9c0fce3eda@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2009-06/txt/msg00394.txt.bz2

Hui Zhu wrote:
> On Mon, Jun 15, 2009 at 01:43, Michael Snyder<msnyder@vmware.com> wrote:
>> Hui Zhu wrote:
>>> On Sun, Jun 14, 2009 at 06:56, Michael Snyder<msnyder@vmware.com> wrote:
>>>> Hui Zhu wrote:
>>>>> Ping.
>>>> OK, my bad for taking so long to get to this... please allow me
>>>> to summarize the problem, to check my own understanding
>>>> (tell me if I'm wrong).
>>>>
>>> For that "nice people" words.  I just want to make a joke.  :)
>>>
>>>> Currently linux-record.c does not know how to "undo" a sys_brk
>>>> system call.  You (teawater) are concerned because if the child
>>>> process calls sys_brk to free some memory, we cannot un-free it
>>>> and therefore we may get into trouble by writing to the freed
>>>> memory during replay.  Something like this:
>>>>
>>>>  1) child allocates memory X
>>>>  2) child writes to memory X
>>>>  3) child frees memory X
>>>>  4) user asks for reverse-continue
>>>>  5) gdb tries to revert the write that happened in step #2,
>>>>    gets SIGSEGV because location has been freed.
>>>>
>>>> So far so good?
>>>>
>>>> Now, your proposal is that during the record mode, we will
>>>> detect any sys_brk call that frees memory, and query the
>>>> user whether to continue or give up.
>>>>
>>>> I'm not too crazy about that solution.  I think it's
>>>> awkward, and drastic for a situation that may only be
>>>> a problem later on (or not at all).  Let me throw out
>>>> some other ideas:
>>>>
>>>> A) Is it possible to actually "reverse" a sys_brk call?
>>>> Suppose we record the arguments, and when we want to reverse
>>>> it, we just change an increase into a decrease and vice versa?
>>>>
>>>> B) Suppose we wait until an actual memory error occurs
>>>> during replay, and THEN inform the user?  It will avoid
>>>> warning him about something that may never happen.
>>>>
>>>> We could use catch_errors to trap the SIGSEGV, and then
>>>> check to see if the error was caused by a write to memory
>>>> above the BRK boundary.  You will still need to keep track
>>>> of the BRK boundary, but you won't have that awkward early
>>>> query to deal with.
>>> The sys_brk just can increase and decrease data segment size.  The
>>> decrease behavior is very hard to replay.
>> I admit my ignorance in this area, but why is it difficult?
>> In my simple-minded view, if we need to reverse over a sys_brk
>> decrease call, we just make an increase call in the same amount.
>>
>> Please tell me what I am missing.
>>
> 
> I think about it again.  Maybe let it increase and decrease can handle
> this issue.  But call function when infrun is running is a very hard
> job.   Because call_function_by_hand will clear a lot of thing of
> inferior.
> We can do it.  But it must need a long time to check in.
> So maybe we can give user a warning first.

OK, I can see how that is difficult.  Maybe we save it for later.
For now, let's think about which short-term solution is
better:

     1) warn during record when sys_brk releases memory (sys_brk(<0)), or
     2) warn during replay when we cannot undo a memory write.

Further comments below.

>> During record phase is too early for that notification.
>> The actual failure may never be encountered, especially if
>> the user never tries to reverse past this point in the recording.
>>
>> What I think is that we should wait until we are replaying, and
>> we actually experience a failure to modify freed memory.  At that
>> point we tell the user what has happened, and explain that we
>> cannot go back any earlier in the recording.
>>
>> So something like this:
>>
>>   1) Remember the BRK boundary at start (as you do in this patch).
>>   2) Remember the new BRK boundary whenever it changes (as you do).
>>   3) During replay, compare every memory write against the BRK
>> boundary.  If the memory write will fail because it is above the
>> BRK boundary, stop and inform the user that we cannot go back
>> any further.
> 
> It will make user lost the record entry that before this memory operation.
> And when this thing happen (sys_brk), user doesn't get any alert.

I think that warning when sys_brk(<0) is too early.
Nothing is really wrong at that time -- it only means
that something MIGHT be wrong later, during reverse.

I think the right time to warn is when we actually
encounter a memory write that we cannot undo.  At that
point, there is no way to reverse any further, but
there is really nothing we can do about it.

If we warn the user earlier (during record), there is still
nothing that he can do about the problem.
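
To make the second option concrete, here is a rough sketch of the
bookkeeping I have in mind.  All names below are made up for
illustration; this is not the actual linux-record.c code, and it
does not touch the inferior at all.  It also assumes one thing that
was not spelled out above: that we remember the highest break seen
so far, so that a write to the stack or to an mmap'd region (which
is also "above the break") is not mistaken for a write to released
heap.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bookkeeping for the program break during record.  */
struct brk_tracker
{
  uint64_t current_brk;  /* Break as last observed.  */
  uint64_t high_brk;     /* Highest break ever observed (assumption).  */
};

enum undo_status
{
  UNDO_OK,                  /* Safe to try to revert the write.  */
  UNDO_STOP_RELEASED_HEAP   /* Target memory was released by sys_brk.  */
};

/* Remember the break at the start of recording.  */
static void
brk_tracker_init (struct brk_tracker *t, uint64_t start_brk)
{
  assert (t != NULL);
  t->current_brk = start_brk;
  t->high_brk = start_brk;
}

/* Remember the new break whenever a sys_brk call changes it.  */
static void
brk_tracker_update (struct brk_tracker *t, uint64_t new_brk)
{
  assert (t != NULL);
  t->current_brk = new_brk;
  if (new_brk > t->high_brk)
    t->high_brk = new_brk;
}

/* Return true if [addr, addr + len) overlaps [current_brk, high_brk),
   i.e. heap that existed at some point but has since been released.
   Written so that addr + len is never computed and cannot overflow.  */
static bool
write_hits_released_heap (const struct brk_tracker *t,
                          uint64_t addr, uint64_t len)
{
  assert (t != NULL);
  if (len == 0 || addr >= t->high_brk)
    return false;
  return addr >= t->current_brk || len > t->current_brk - addr;
}

/* During replay, check a recorded memory write before reverting it.
   Warn only at the point where the undo would actually fail.  */
static enum undo_status
check_undo_write (const struct brk_tracker *t, uint64_t addr, uint64_t len)
{
  if (write_hits_released_heap (t, addr, len))
    {
      fprintf (stderr,
               "Cannot undo write at 0x%" PRIx64 ": memory was released "
               "by sys_brk; cannot reverse any further.\n", addr);
      return UNDO_STOP_RELEASED_HEAP;
    }
  return UNDO_OK;
}
```

The record side would call brk_tracker_init once and
brk_tracker_update after every sys_brk; the replay side would call
check_undo_write before reverting each recorded memory entry and
stop when it returns UNDO_STOP_RELEASED_HEAP.  Nothing is reported
during record, so the user only hears about the problem if he
really reverses that far.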