From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-107385-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 23493 invoked by alias); 19 Nov 2013 16:22:08 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 23468 invoked by uid 89); 19 Nov 2013 16:22:07 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE,SPF_PASS autolearn=no version=3.3.2
X-HELO: mx1.redhat.com
Received: from Unknown (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 19 Nov 2013 16:21:19 +0000
Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rAJGLCVt018625	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)	for <gdb-patches@sourceware.org>; Tue, 19 Nov 2013 11:21:12 -0500
Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11])	by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id rAJGLAdx004531;	Tue, 19 Nov 2013 11:21:11 -0500
Message-ID: <528B8FF6.7000406@redhat.com>
Date: Tue, 19 Nov 2013 16:33:00 -0000
From: Pedro Alves <palves@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7
MIME-Version: 1.0
To: Tom Tromey <tromey@redhat.com>
CC: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] avoid infinite loop with bad debuginfo
References: <1384375873-32160-1-git-send-email-tromey@redhat.com>	<1384375873-32160-2-git-send-email-tromey@redhat.com>	<52850730.1060109@redhat.com> <87d2lxpo1l.fsf@fleche.redhat.com>	<528B7F15.7040605@redhat.com> <87vbzomm78.fsf@fleche.redhat.com>
In-Reply-To: <87vbzomm78.fsf@fleche.redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-SW-Source: 2013-11/txt/msg00552.txt.bz2

On 11/19/2013 03:43 PM, Tom Tromey wrote:

> That said, even once your change is in, I think both of these patches
> should go in.  Patch #1 still prevents an infinite loop -- 

...

> I can probably find another test case

Yes, probably by manually creating a corrupted stack.

      while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val))
	{
	  frame = frame_find_by_id (VALUE_FRAME_ID (new_val));
...
	  new_val = get_frame_register_value (frame, regnum);
	}

get_frame_register_value can return a lazy register value pointing
to the next frame (in the dwarf unwinder, that's
DWARF2_FRAME_REG_SAME_VALUE).  That's perfectly normal.

But say we have a corrupted stack like this:

 #0 - frame_id_1
 #1 - frame_id_2
 #2 - frame_id_3
 #3 - frame_id_4
 #4 - frame_id_4  <<<< outermost (UNWIND_SAME_ID).

So, get_frame_register_value in frame #4, can return
a lazy value pointing to frame #3.  What's not normal is having two
frames with the same id.  So the next iteration, frame_find_by_id
tries to look for frame #3.  But, since it has the
same id as frame #4, frame #4 is returned, rinse, repeat.

I think this is an old latent problem.  We shouldn't ever have
two frames with the same id in the frame chain, lots of things
break otherwise.  But somehow, we managed to get this far
in this particular case.

If we can indeed trigger this with a real corruption test
case, I believe the reason is that the recent-ish addition
of the frame stash exposes the latent bug:

struct frame_info *
frame_find_by_id (struct frame_id id)
{
  struct frame_info *frame, *prev_frame;

  /* ZERO denotes the null frame, let the caller decide what to do
     about it.  Should it instead return get_current_frame()?  */
  if (!frame_id_p (id))
    return NULL;

  /* Try using the frame stash first.  Finding it there removes the need
     to perform the search by looping over all frames, which can be very
     CPU-intensive if the number of frames is very high (the loop is O(n)
     and get_prev_frame performs a series of checks that are relatively
     expensive).  This optimization is particularly useful when this function
     is called from another function (such as value_fetch_lazy, case
     VALUE_LVAL (val) == lval_register) which already loops over all frames,
     making the overall behavior O(n^2).  */
  frame = frame_stash_find (id);
  if (frame)
    return frame;

  for (frame = get_current_frame (); ; frame = prev_frame)
    {


Before we had a stash, frame_find_by_id(frame_id_4) would
always find #3 first.  But now, it's possible that
the stash returns #4 instead.

I still think that such a loop should be broken by never
having two frames with the same id in the frame chain in the
first place.  This potential infinite loop in value_fetch_lazy
is really an internal error, IMO.

-- 
Pedro Alves