Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [RFC/RFA] dangling bfd pointer in archive cache...
@ 2012-10-02 14:14 Joel Brobecker
  2012-10-03  5:29 ` Alan Modra
  0 siblings, 1 reply; 7+ messages in thread
From: Joel Brobecker @ 2012-10-02 14:14 UTC (permalink / raw)
  To: binutils; +Cc: gdb-patches

[-- Attachment #1: Type: text/plain, Size: 2580 bytes --]

Hello,

I am trying to fix a crash that's occurring when running a program
from GDB on ppc-aix. Take any program that uses threading, and:

    (gdb) run
    Starting program: /[...]/task_switch
    zsh: 46458 segmentation fault (core dumped)  gdb-head -q task_switch

This is related to a recent patch that started counting references
to BFDs, and closing them when the reference count reached zero.
Here is what is happening in chronological order:

During the startup phase, GDB receives notification that libthread.a
has been mapped. It creates an archive BFD for it, and starts going
through it object files. It looks at the first one by calling:

    bfd *result = bfd_openr_next_archived_file (archive, previous);

(where previous is NULL).

Following what archive.c:bfd_openr_next_archived_file does, we
find that it calls coff-rs6000.c:_bfd_xcoff_openr_next_archived_file,
which eventually calls archive.c:_bfd_get_elt_at_filepos. This
routine first checks the archive's cache for our bfd, and creates
a new one if not found. At the end of the element's creation, it
then adds it to the archive BFD's cache:

  if (_bfd_add_bfd_to_archive_cache (archive, filepos, n_nfd))
    return n_nfd;

Back to GDB, GDB looks at our elt bfd, finds that it's not the one
it is looking for, gets the next one using the same function, and
then unref's it.  As the ref count of that first objfile reached zero,
it therefore calls bfd_close.

This is when things start going wrong, as bfd_close frees the memory
allocated to our elt bfd, but does not remove it from the archive's
cache. As a result, the next time we query the first elt of our archive,
we find the reference in the cache, and return that - a pointer to
free'ed memory, which eventually leads to a crash.

Looking further into this, I went back and forth between different
approaches, until I found that archive.c defines a function that
the cleanup: archive.c:_bfd_archive_close_and_cleanup. I don't
think it should be hooked up to the target vector to be called
automatically, since it's not entirely a target properly, more like
a "construct" property. So I've simply added a call to it from
include "bfd_close".

This fixes the problem on ppc-aix, tested using the gdb-testsuite
on ppc-aix and x86_64-linux.  I'd be happy to do more testing if
someone told me which testsuites to run for this type of change.
But before doing so, I thought I'd make sure I am making the correct
type of change...

Thoughts?

bfd/ChangeLog:

        * opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup.

Thanks,
-- 
Joel

[-- Attachment #2: opncls.diff --]
[-- Type: text/x-diff, Size: 870 bytes --]

diff --git a/bfd/opncls.c b/bfd/opncls.c
index fdccba3..b9b3743 100644
--- a/bfd/opncls.c
+++ b/bfd/opncls.c
@@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
   if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))
     return FALSE;
 
+  /* If ABFD is an archive bfd, or a bfd contained in an archive,
+     do any associated cleanup.  This is necessary to avoid leaving
+     a dangling pointer somewhere.  For instance, if ABFD is an object
+     contained in an archive, the archive's cache might have a reference
+     to ABFD that needs to be removed before we free ABFD.
+
+     _bfd_archive_close_and_cleanup does nothing if ABFD is neither
+     an archive BFD nor a BFD of an object contained in an archive,
+     so it is safe to call it without checking the bfd kind.  */
+  _bfd_archive_close_and_cleanup (abfd);
+
   ret = abfd->iovec->bclose (abfd);
 
   if (ret)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-02 14:14 [RFC/RFA] dangling bfd pointer in archive cache Joel Brobecker
@ 2012-10-03  5:29 ` Alan Modra
  2012-10-03 13:30   ` Joel Brobecker
  2012-10-15 18:01   ` Joel Brobecker
  0 siblings, 2 replies; 7+ messages in thread
From: Alan Modra @ 2012-10-03  5:29 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: binutils, gdb-patches

On Tue, Oct 02, 2012 at 07:14:06AM -0700, Joel Brobecker wrote:
>         * opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup.

No, we should already be calling _bfd_archive_close_and_cleanup via 

> --- a/bfd/opncls.c
> +++ b/bfd/opncls.c
> @@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
>    if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))

this call.  The problem is in coff-rs6000.c (and coff64-rs6000.c)
where the bfd_target vector just uses bfd_true for close_and_cleanup.

-- 
Alan Modra
Australia Development Lab, IBM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-03  5:29 ` Alan Modra
@ 2012-10-03 13:30   ` Joel Brobecker
  2012-10-04  8:01     ` Alan Modra
  2012-10-15 18:01   ` Joel Brobecker
  1 sibling, 1 reply; 7+ messages in thread
From: Joel Brobecker @ 2012-10-03 13:30 UTC (permalink / raw)
  To: binutils, gdb-patches

Hello Alan,

Thanks for the review!

> On Tue, Oct 02, 2012 at 07:14:06AM -0700, Joel Brobecker wrote:
> >         * opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup.
> 
> No, we should already be calling _bfd_archive_close_and_cleanup via 

OK, but before I go ahead with your implementation, I wanted to
make sure about something...

> > --- a/bfd/opncls.c
> > +++ b/bfd/opncls.c
> > @@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
> >    if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))
> 
> this call.  The problem is in coff-rs6000.c (and coff64-rs6000.c)
> where the bfd_target vector just uses bfd_true for close_and_cleanup.

... My reasoning is that this part of the job isn't target-specific.
It's entirely related to how the generic side of archives is handled,
not to the target itself. To me, it needs to be called regardless
of the actual target, and requiring that the target code set it up
is a recipe for having some of them forgetting to do it (which is
what actually happened here).

WDYT? Am I wrong?

-- 
Joel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-03 13:30   ` Joel Brobecker
@ 2012-10-04  8:01     ` Alan Modra
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Modra @ 2012-10-04  8:01 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: binutils, gdb-patches

On Wed, Oct 03, 2012 at 06:30:19AM -0700, Joel Brobecker wrote:
> Hello Alan,
> 
> Thanks for the review!
> 
> > On Tue, Oct 02, 2012 at 07:14:06AM -0700, Joel Brobecker wrote:
> > >         * opncls.c (bfd_close); Add call to _bfd_archive_close_and_cleanup.
> > 
> > No, we should already be calling _bfd_archive_close_and_cleanup via 
> 
> OK, but before I go ahead with your implementation, I wanted to
> make sure about something...
> 
> > > --- a/bfd/opncls.c
> > > +++ b/bfd/opncls.c
> > > @@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
> > >    if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))
> > 
> > this call.  The problem is in coff-rs6000.c (and coff64-rs6000.c)
> > where the bfd_target vector just uses bfd_true for close_and_cleanup.
> 
> ... My reasoning is that this part of the job isn't target-specific.

but it may become target specific at some future date..

> It's entirely related to how the generic side of archives is handled,
> not to the target itself. To me, it needs to be called regardless
> of the actual target, and requiring that the target code set it up
> is a recipe for having some of them forgetting to do it (which is
> what actually happened here).

This is not the first time we've seen this sort of problem, due to
coff-rs6000.c filling in a bfd_target struct by hand.  I don't know
why it wasn't written to use one of the COFF_TARGET_VEC macros, or at
the very least, using all the BFD_JUMP_TABLE macros.

-- 
Alan Modra
Australia Development Lab, IBM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-03  5:29 ` Alan Modra
  2012-10-03 13:30   ` Joel Brobecker
@ 2012-10-15 18:01   ` Joel Brobecker
  2012-10-16  1:48     ` Alan Modra
  1 sibling, 1 reply; 7+ messages in thread
From: Joel Brobecker @ 2012-10-15 18:01 UTC (permalink / raw)
  To: binutils, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 706 bytes --]

> No, we should already be calling _bfd_archive_close_and_cleanup via 
> 
> > --- a/bfd/opncls.c
> > +++ b/bfd/opncls.c
> > @@ -719,6 +719,17 @@ bfd_close (bfd *abfd)
> >    if (! BFD_SEND (abfd, _close_and_cleanup, (abfd)))
> 
> this call.  The problem is in coff-rs6000.c (and coff64-rs6000.c)
> where the bfd_target vector just uses bfd_true for close_and_cleanup.

Attached is a patch that implements your suggestion. Tested on ppc-aix
using AdaCore's GDB testsuite.

While at it, the powermac XCOFF backend also seemed to be using
the same code that eventually leads to the dangling pointer, and so
I fixed it as well. It's untested, but seems straightforward enough.

OK to commit?

Thanks,
-- 
Joel

[-- Attachment #2: 0001-Dangling-bfd-pointer-in-archive-cache.patch --]
[-- Type: text/x-diff, Size: 1157 bytes --]

From 63169be6cc57b68280afa74944abbfc285e9b0ae Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>
Date: Mon, 15 Oct 2012 19:29:45 +0200
Subject: [PATCH] Dangling bfd pointer in archive cache.

This dandling pointer eventually leads to a crash when trying to run
on ppc-aix a program using threading...

bfd/ChangeLog:

        * coff-rs6000.c (rs6000coff_vec): Set _close_and_cleanup
        field to _bfd_archive_close_and_cleanup.
        (pmac_xcoff_vec): Likewise.
---
 bfd/coff-rs6000.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/bfd/coff-rs6000.c b/bfd/coff-rs6000.c
index edbef95..0945aca 100644
--- a/bfd/coff-rs6000.c
+++ b/bfd/coff-rs6000.c
@@ -4076,7 +4076,7 @@ const bfd_target rs6000coff_vec =
     },
 
     /* Generic */
-    bfd_true,
+    _bfd_archive_close_and_cleanup,
     bfd_true,
     coff_new_section_hook,
     _bfd_generic_get_section_contents,
@@ -4332,7 +4332,7 @@ const bfd_target pmac_xcoff_vec =
     },
 
     /* Generic */
-    bfd_true,
+    _bfd_archive_close_and_cleanup,
     bfd_true,
     coff_new_section_hook,
     _bfd_generic_get_section_contents,
-- 
1.6.5.rc2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-15 18:01   ` Joel Brobecker
@ 2012-10-16  1:48     ` Alan Modra
  2012-10-16 22:58       ` Joel Brobecker
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Modra @ 2012-10-16  1:48 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: binutils, gdb-patches

On Mon, Oct 15, 2012 at 11:00:52AM -0700, Joel Brobecker wrote:
>         * coff-rs6000.c (rs6000coff_vec): Set _close_and_cleanup
>         field to _bfd_archive_close_and_cleanup.
>         (pmac_xcoff_vec): Likewise.

OK.  Please apply the same fix to coff64-rs6000.c.

-- 
Alan Modra
Australia Development Lab, IBM


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC/RFA] dangling bfd pointer in archive cache...
  2012-10-16  1:48     ` Alan Modra
@ 2012-10-16 22:58       ` Joel Brobecker
  0 siblings, 0 replies; 7+ messages in thread
From: Joel Brobecker @ 2012-10-16 22:58 UTC (permalink / raw)
  To: binutils, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 366 bytes --]

> On Mon, Oct 15, 2012 at 11:00:52AM -0700, Joel Brobecker wrote:
> >         * coff-rs6000.c (rs6000coff_vec): Set _close_and_cleanup
> >         field to _bfd_archive_close_and_cleanup.
> >         (pmac_xcoff_vec): Likewise.
> 
> OK.  Please apply the same fix to coff64-rs6000.c.

Thanks. Checked in (attached is the patch applied to coff64-rs6000.c).

-- 
Joel

[-- Attachment #2: 0001-dangling-pointer-in-coff64-rs6000-archive-cache.patch --]
[-- Type: text/x-diff, Size: 1070 bytes --]

From ecaaf1440801831e107c39fea3016e2e57cc507c Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>
Date: Tue, 16 Oct 2012 15:51:13 -0700
Subject: [PATCH] dangling pointer in coff64-rs6000 archive cache.

bfd/ChangeLog:

        * coff64-rs6000.c (rs6000coff64_vec): Set _close_and_cleanup
        field to _bfd_archive_close_and_cleanup.
        (aix5coff64_vec): Likewise.
---
 bfd/coff64-rs6000.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bfd/coff64-rs6000.c b/bfd/coff64-rs6000.c
index 0821b6f..5f4a502 100644
--- a/bfd/coff64-rs6000.c
+++ b/bfd/coff64-rs6000.c
@@ -2668,7 +2668,7 @@ const bfd_target rs6000coff64_vec =
     },
 
     /* Generic */
-    bfd_true,
+    _bfd_archive_close_and_cleanup,
     bfd_true,
     coff_new_section_hook,
     _bfd_generic_get_section_contents,
@@ -2926,7 +2926,7 @@ const bfd_target aix5coff64_vec =
     },
 
     /* Generic */
-    bfd_true,
+    _bfd_archive_close_and_cleanup,
     bfd_true,
     coff_new_section_hook,
     _bfd_generic_get_section_contents,
-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-10-16 22:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-02 14:14 [RFC/RFA] dangling bfd pointer in archive cache Joel Brobecker
2012-10-03  5:29 ` Alan Modra
2012-10-03 13:30   ` Joel Brobecker
2012-10-04  8:01     ` Alan Modra
2012-10-15 18:01   ` Joel Brobecker
2012-10-16  1:48     ` Alan Modra
2012-10-16 22:58       ` Joel Brobecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox