From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 17528 invoked by alias); 2 Oct 2012 14:14:20 -0000 Received: (qmail 17498 invoked by uid 22791); 2 Oct 2012 14:14:18 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL,BAYES_00,KHOP_SPAMHAUS_DROP,RCVD_IN_HOSTKARMA_NO,TW_BJ X-Spam-Check-By: sourceware.org Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 02 Oct 2012 14:14:13 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id A61C41C7B77; Tue, 2 Oct 2012 10:14:12 -0400 (EDT) Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 2Ql5H3mTJLo7; Tue, 2 Oct 2012 10:14:12 -0400 (EDT) Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id 61C3A1C7B73; Tue, 2 Oct 2012 10:14:12 -0400 (EDT) Received: by joel.gnat.com (Postfix, from userid 1000) id 115B3C562A; Tue, 2 Oct 2012 16:14:06 +0200 (CEST) Date: Tue, 02 Oct 2012 14:14:00 -0000 From: Joel Brobecker To: binutils@sourceware.org Cc: gdb-patches@sourceware.org Subject: [RFC/RFA] dangling bfd pointer in archive cache... Message-ID: <20121002141405.GD3028@adacore.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="gKMricLos+KVdGMg" Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2012-10/txt/msg00020.txt.bz2 --gKMricLos+KVdGMg Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-length: 2580 Hello, I am trying to fix a crash that's occurring when running a program from GDB on ppc-aix. Take any program that uses threading, and: (gdb) run Starting program: /[...]/task_switch zsh: 46458 segmentation fault (core dumped) gdb-head -q task_switch This is related to a recent patch that started counting references to BFDs, and closing them when the reference count reached zero. Here is what is happening in chronological order: During the startup phase, GDB receives notification that libthread.a has been mapped. It creates an archive BFD for it, and starts going through it object files. It looks at the first one by calling: bfd *result = bfd_openr_next_archived_file (archive, previous); (where previous is NULL). Following what archive.c:bfd_openr_next_archived_file does, we find that it calls coff-rs6000.c:_bfd_xcoff_openr_next_archived_file, which eventually calls archive.c:_bfd_get_elt_at_filepos. This routine first checks the archive's cache for our bfd, and creates a new one if not found. At the end of the element's creation, it then adds it to the archive BFD's cache: if (_bfd_add_bfd_to_archive_cache (archive, filepos, n_nfd)) return n_nfd; Back to GDB, GDB looks at our elt bfd, finds that it's not the one it is looking for, gets the next one using the same function, and then unref's it. As the ref count of that first objfile reached zero, it therefore calls bfd_close. This is when things start going wrong, as bfd_close frees the memory allocated to our elt bfd, but does not remove it from the archive's cache. As a result, the next time we query the first elt of our archive, we find the reference in the cache, and return that - a pointer to free'ed memory, which eventually leads to a crash. Looking further into this, I went back and forth between different approaches, until I found that archive.c defines a function that the cleanup: archive.c:_bfd_archive_close_and_cleanup. I don't think it should be hooked up to the target vector to be called automatically, since it's not entirely a target properly, more like a "construct" property. So I've simply added a call to it from include "bfd_close". This fixes the problem on ppc-aix, tested using the gdb-testsuite on ppc-aix and x86_64-linux. I'd be happy to do more testing if someone told me which testsuites to run for this type of change. But before doing so, I thought I'd make sure I am making the correct type of change... Thoughts? bfd/ChangeLog: * opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup. Thanks, -- Joel --gKMricLos+KVdGMg Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="opncls.diff" Content-length: 870 diff --git a/bfd/opncls.c b/bfd/opncls.c index fdccba3..b9b3743 100644 --- a/bfd/opncls.c +++ b/bfd/opncls.c @@ -719,6 +719,17 @@ bfd_close (bfd *abfd) if (! BFD_SEND (abfd, _close_and_cleanup, (abfd))) return FALSE; + /* If ABFD is an archive bfd, or a bfd contained in an archive, + do any associated cleanup. This is necessary to avoid leaving + a dangling pointer somewhere. For instance, if ABFD is an object + contained in an archive, the archive's cache might have a reference + to ABFD that needs to be removed before we free ABFD. + + _bfd_archive_close_and_cleanup does nothing if ABFD is neither + an archive BFD nor a BFD of an object contained in an archive, + so it is safe to call it without checking the bfd kind. */ + _bfd_archive_close_and_cleanup (abfd); + ret = abfd->iovec->bclose (abfd); if (ret) --gKMricLos+KVdGMg--