From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 23493 invoked by alias); 19 Nov 2013 16:22:08 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 23468 invoked by uid 89); 19 Nov 2013 16:22:07 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE,SPF_PASS autolearn=no version=3.3.2 X-HELO: mx1.redhat.com Received: from Unknown (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 19 Nov 2013 16:21:19 +0000 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rAJGLCVt018625 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 19 Nov 2013 11:21:12 -0500 Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id rAJGLAdx004531; Tue, 19 Nov 2013 11:21:11 -0500 Message-ID: <528B8FF6.7000406@redhat.com> Date: Tue, 19 Nov 2013 16:33:00 -0000 From: Pedro Alves User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7 MIME-Version: 1.0 To: Tom Tromey CC: gdb-patches@sourceware.org Subject: Re: [PATCH 1/2] avoid infinite loop with bad debuginfo References: <1384375873-32160-1-git-send-email-tromey@redhat.com> <1384375873-32160-2-git-send-email-tromey@redhat.com> <52850730.1060109@redhat.com> <87d2lxpo1l.fsf@fleche.redhat.com> <528B7F15.7040605@redhat.com> <87vbzomm78.fsf@fleche.redhat.com> In-Reply-To: <87vbzomm78.fsf@fleche.redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-SW-Source: 2013-11/txt/msg00552.txt.bz2 On 11/19/2013 03:43 PM, Tom Tromey wrote: > That said, even once your change is in, I think both of these patches > should go in. Patch #1 still prevents an infinite loop -- ... > I can probably find another test case Yes, probably by manually creating a corrupted stack. while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val)) { frame = frame_find_by_id (VALUE_FRAME_ID (new_val)); ... new_val = get_frame_register_value (frame, regnum); } get_frame_register_value can return a lazy register value pointing to the next frame (in the dwarf unwinder, that's DWARF2_FRAME_REG_SAME_VALUE). That's perfectly normal. But say we have a corrupted stack like this: #0 - frame_id_1 #1 - frame_id_2 #2 - frame_id_3 #3 - frame_id_4 #4 - frame_id_4 <<<< outermost (UNWIND_SAME_ID). So, get_frame_register_value in frame #4, can return a lazy value pointing to frame #3. What's not normal is having two frames with the same id. So the next iteration, frame_find_by_id tries to look for frame #3. But, since it has the same id as frame #4, frame #4 is returned, rinse, repeat. I think this is an old latent problem. We shouldn't ever have two frames with the same id in the frame chain, lots of things break otherwise. But somehow, we managed to get this far in this particular case. If we can indeed trigger this with a real corruption test case, I believe the reason is that the recent-ish addition of the frame stash exposes the latent bug: struct frame_info * frame_find_by_id (struct frame_id id) { struct frame_info *frame, *prev_frame; /* ZERO denotes the null frame, let the caller decide what to do about it. Should it instead return get_current_frame()? */ if (!frame_id_p (id)) return NULL; /* Try using the frame stash first. Finding it there removes the need to perform the search by looping over all frames, which can be very CPU-intensive if the number of frames is very high (the loop is O(n) and get_prev_frame performs a series of checks that are relatively expensive). This optimization is particularly useful when this function is called from another function (such as value_fetch_lazy, case VALUE_LVAL (val) == lval_register) which already loops over all frames, making the overall behavior O(n^2). */ frame = frame_stash_find (id); if (frame) return frame; for (frame = get_current_frame (); ; frame = prev_frame) { Before we had a stash, frame_find_by_id(frame_id_4) would always find #3 first. But now, it's possible that the stash returns #4 instead. I still think that such a loop should be broken by never having two frames with the same id in the frame chain in the first place. This potential infinite loop in value_fetch_lazy is really an internal error, IMO. -- Pedro Alves