Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* RFA: gdb/568, messy thread exits
@ 2002-07-31  9:55 Daniel Jacobowitz
  2002-07-31 13:28 ` Jim Blandy
  2002-08-12  9:14 ` Daniel Jacobowitz
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-07-31  9:55 UTC (permalink / raw)
  To: gdb-patches, jimb

Jim, what do you think about this change?  This fixes a whole class of
problems for me, by not longjmp'ing out of attempts to kill/detach/quit/etc.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

2002-07-31  Daniel Jacobowitz  <drow@mvista.com>

	Fix PR gdb/568
	* thread-db.c (lwp_from_thread): Only warn if unable to find
	the thread.

Index: thread-db.c
===================================================================
RCS file: /cvs/src/src/gdb/thread-db.c,v
retrieving revision 1.22
diff -u -p -r1.22 thread-db.c
--- thread-db.c	23 Mar 2002 17:38:13 -0000	1.22
+++ thread-db.c	31 Jul 2002 16:29:52 -0000
@@ -260,6 +260,12 @@ lwp_from_thread (ptid_t ptid)
     return ptid;
 
   err = td_ta_map_id2thr_p (thread_agent, GET_THREAD (ptid), &th);
+  if (err == TD_ERR)
+    {
+      warning ("Cannot find thread %ld: %s",
+	       (long) GET_THREAD (ptid), thread_db_err_str (err));
+      return ptid;
+    }
   if (err != TD_OK)
     error ("Cannot find thread %ld: %s",
 	   (long) GET_THREAD (ptid), thread_db_err_str (err));


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-07-31  9:55 RFA: gdb/568, messy thread exits Daniel Jacobowitz
@ 2002-07-31 13:28 ` Jim Blandy
  2002-07-31 13:47   ` Daniel Jacobowitz
  2002-08-12  9:14 ` Daniel Jacobowitz
  1 sibling, 1 reply; 11+ messages in thread
From: Jim Blandy @ 2002-07-31 13:28 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches


Daniel Jacobowitz <drow@mvista.com> writes:
> Jim, what do you think about this change?  This fixes a whole class of
> problems for me, by not longjmp'ing out of attempts to
> kill/detach/quit/etc.

I'm not sure I can review this change very helpfully; I'm not very
familiar with the threading code.

Can you go into more detail about why this change is adequate?  I
mean, the ptid argument is generally not going to be an lwp, right?
Shouldn't this function at least return a ptid that's an LWP?

Under what situations does this error occur?


> -- 
> Daniel Jacobowitz                           Carnegie Mellon University
> MontaVista Software                         Debian GNU/Linux Developer
> 
> 2002-07-31  Daniel Jacobowitz  <drow@mvista.com>
> 
> 	Fix PR gdb/568
> 	* thread-db.c (lwp_from_thread): Only warn if unable to find
> 	the thread.
> 
> Index: thread-db.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/thread-db.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 thread-db.c
> --- thread-db.c	23 Mar 2002 17:38:13 -0000	1.22
> +++ thread-db.c	31 Jul 2002 16:29:52 -0000
> @@ -260,6 +260,12 @@ lwp_from_thread (ptid_t ptid)
>      return ptid;
>  
>    err = td_ta_map_id2thr_p (thread_agent, GET_THREAD (ptid), &th);
> +  if (err == TD_ERR)
> +    {
> +      warning ("Cannot find thread %ld: %s",
> +	       (long) GET_THREAD (ptid), thread_db_err_str (err));
> +      return ptid;
> +    }
>    if (err != TD_OK)
>      error ("Cannot find thread %ld: %s",
>  	   (long) GET_THREAD (ptid), thread_db_err_str (err));


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-07-31 13:28 ` Jim Blandy
@ 2002-07-31 13:47   ` Daniel Jacobowitz
  2002-07-31 14:01     ` Jim Blandy
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-07-31 13:47 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

On Wed, Jul 31, 2002 at 03:23:08PM -0500, Jim Blandy wrote:
> 
> Daniel Jacobowitz <drow@mvista.com> writes:
> > Jim, what do you think about this change?  This fixes a whole class of
> > problems for me, by not longjmp'ing out of attempts to
> > kill/detach/quit/etc.
> 
> I'm not sure I can review this change very helpfully; I'm not very
> familiar with the threading code.
> 
> Can you go into more detail about why this change is adequate?  I
> mean, the ptid argument is generally not going to be an lwp, right?
> Shouldn't this function at least return a ptid that's an LWP?

The first bit of the function just returns whatever PTID it was passed
if we weren't given a thread.  As such, I'm not terribly worried about
the type of PTID we return.

> Under what situations does this error occur?

Here's the interesting part.  It occurs whenever thread_db can not look
up the LWP<->Thread mapping in the child.  If, for instance, the child
has disappeared, we can not look up its mapping any more.  Or if some
other circumstance causes the mapping to be unavailable, like an
exec(); there's another PR about that case.

If we allow this to be a fatal error here, we very easily get stuck
somewhere we can not recover from.

Of course, it is also my opinion that we perform the mapping a stupid
number of times.  ``set debug target 1'' and run a threaded program;
you'll see it happen over and over again.

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-07-31 13:47   ` Daniel Jacobowitz
@ 2002-07-31 14:01     ` Jim Blandy
  2002-07-31 14:04       ` Daniel Jacobowitz
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Blandy @ 2002-07-31 14:01 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches


Daniel Jacobowitz <drow@mvista.com> writes:
> Here's the interesting part.  It occurs whenever thread_db can not look
> up the LWP<->Thread mapping in the child.  If, for instance, the child
> has disappeared, we can not look up its mapping any more.  Or if some
> other circumstance causes the mapping to be unavailable, like an
> exec(); there's another PR about that case.

I dunno.  I mean, if someone asks you to look up a TID's LWP, and you
can't, that seems like an error, no?  Isn't it supposed to be better
to flag errors early than let the thing go crashing on through the
forest with bogus information?

I'm glad you found a patch that addresses the PR I filed, but there's
no way I can approve this patch --- not because I'm sure it's wrong,
but because I'm sure I have no idea whether it's right or wrong.  :)

> Of course, it is also my opinion that we perform the mapping a stupid
> number of times.  ``set debug target 1'' and run a threaded program;
> you'll see it happen over and over again.

Yes, I noticed this: amazing numbers of huge memory transfers.  Yuck.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-07-31 14:01     ` Jim Blandy
@ 2002-07-31 14:04       ` Daniel Jacobowitz
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-07-31 14:04 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

On Wed, Jul 31, 2002 at 03:55:17PM -0500, Jim Blandy wrote:
> 
> Daniel Jacobowitz <drow@mvista.com> writes:
> > Here's the interesting part.  It occurs whenever thread_db can not look
> > up the LWP<->Thread mapping in the child.  If, for instance, the child
> > has disappeared, we can not look up its mapping any more.  Or if some
> > other circumstance causes the mapping to be unavailable, like an
> > exec(); there's another PR about that case.
> 
> I dunno.  I mean, if someone asks you to look up a TID's LWP, and you
> can't, that seems like an error, no?  Isn't it supposed to be better
> to flag errors early than let the thing go crashing on through the
> forest with bogus information?

Well, yes; except for the fact that this is on the exit path, so if we
call error(), we tend to get stuck.  This is a fairly frequent GDB
problem.

> I'm glad you found a patch that addresses the PR I filed, but there's
> no way I can approve this patch --- not because I'm sure it's wrong,
> but because I'm sure I have no idea whether it's right or wrong.  :)

Thanks for looking at it.  I'll wait for Kevin or Michael to have time
for it.

> 
> > Of course, it is also my opinion that we perform the mapping a stupid
> > number of times.  ``set debug target 1'' and run a threaded program;
> > you'll see it happen over and over again.
> 
> Yes, I noticed this: amazing numbers of huge memory transfers.  Yuck.
> 

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-07-31  9:55 RFA: gdb/568, messy thread exits Daniel Jacobowitz
  2002-07-31 13:28 ` Jim Blandy
@ 2002-08-12  9:14 ` Daniel Jacobowitz
  2002-08-13 15:00   ` Mark Kettenis
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-08-12  9:14 UTC (permalink / raw)
  To: gdb-patches; +Cc: msnyder, kettenis

I'm just having bad luck reading MAINTAINERS this week.  I'll ask the
actual maintainers this time!

Michael, Mark - what do you think of this patch?  A better explanation
of the patch is at:
  http://sources.redhat.com/ml/gdb-patches/2002-07/msg00630.html

On Wed, Jul 31, 2002 at 12:39:10PM -0400, Daniel Jacobowitz wrote:
> Jim, what do you think about this change?  This fixes a whole class of
> problems for me, by not longjmp'ing out of attempts to kill/detach/quit/etc.
> 
> -- 
> Daniel Jacobowitz                           Carnegie Mellon University
> MontaVista Software                         Debian GNU/Linux Developer
> 
> 2002-07-31  Daniel Jacobowitz  <drow@mvista.com>
> 
> 	Fix PR gdb/568
> 	* thread-db.c (lwp_from_thread): Only warn if unable to find
> 	the thread.
> 
> Index: thread-db.c
> ===================================================================
> RCS file: /cvs/src/src/gdb/thread-db.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 thread-db.c
> --- thread-db.c	23 Mar 2002 17:38:13 -0000	1.22
> +++ thread-db.c	31 Jul 2002 16:29:52 -0000
> @@ -260,6 +260,12 @@ lwp_from_thread (ptid_t ptid)
>      return ptid;
>  
>    err = td_ta_map_id2thr_p (thread_agent, GET_THREAD (ptid), &th);
> +  if (err == TD_ERR)
> +    {
> +      warning ("Cannot find thread %ld: %s",
> +	       (long) GET_THREAD (ptid), thread_db_err_str (err));
> +      return ptid;
> +    }
>    if (err != TD_OK)
>      error ("Cannot find thread %ld: %s",
>  	   (long) GET_THREAD (ptid), thread_db_err_str (err));
> 

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-08-12  9:14 ` Daniel Jacobowitz
@ 2002-08-13 15:00   ` Mark Kettenis
  2002-08-13 15:13     ` Daniel Jacobowitz
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Kettenis @ 2002-08-13 15:00 UTC (permalink / raw)
  To: drow; +Cc: gdb-patches, msnyder

   Date: Mon, 12 Aug 2002 12:14:47 -0400
   From: Daniel Jacobowitz <drow@mvista.com>

   Michael, Mark - what do you think of this patch?  A better explanation
   of the patch is at:
     http://sources.redhat.com/ml/gdb-patches/2002-07/msg00630.html

It's a kludge.  Therefore I'm not inclined to say that this can go in.
We should really fix things such that lwp_from_thread isn't called
under the circumstances where you're having problems, or we should
modify it such that it doesn't have to call td_ta_map_id2thr() under
those troublesome circumstances.

If we can't come up with such a patch in a timely fashion, we could
decide to get this in, but not without a comment saying that it is a
kludge.

Mark


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-08-13 15:00   ` Mark Kettenis
@ 2002-08-13 15:13     ` Daniel Jacobowitz
  2002-08-26 18:33       ` Michael Snyder
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-08-13 15:13 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches, msnyder

On Wed, Aug 14, 2002 at 12:00:03AM +0200, Mark Kettenis wrote:
>    Date: Mon, 12 Aug 2002 12:14:47 -0400
>    From: Daniel Jacobowitz <drow@mvista.com>
> 
>    Michael, Mark - what do you think of this patch?  A better explanation
>    of the patch is at:
>      http://sources.redhat.com/ml/gdb-patches/2002-07/msg00630.html
> 
> It's a kludge.  Therefore I'm not inclined to say that this can go in.
> We should really fix things such that lwp_from_thread isn't called
> under the circumstances where you're having problems, or we should
> modify it such that it doesn't have to call td_ta_map_id2thr() under
> those troublesome circumstances.
> 
> If we can't come up with such a patch in a timely fashion, we could
> decide to get this in, but not without a comment saying that it is a
> kludge.

Well, let me elaborate.

lwp_from_thread consults data structures in the inferior, just like the
rest of thread_db.  As such, I have to consider it... "untrusted" if
you will.  The inferior crashes unexpectedly - we can't call
lwp_from_thread any more.  The inferior gets corrupted - we can't call
lwp_from_thread any more.  As such, I think we need to be at least a
little more graceful with this function everywhere we touch it (which
is far too often, if you look at the target traffic dumps).  The same
goes for any other request we make of thread_db.  So the fixes would
not be in calling it less, but in allowing it to fail (more)
gracefully.

To be honest, I gave up on this entirely.  We don't support any
non-one-to-one threads packages via thread-db.c; we've never tested
with any unless someone's got one up their sleeve that they're not
talking about.  I dispensed with that part of the abstraction layer
completely, and got a threads package several times faster, simpler,
and more reliable (I posted some additional test cases that it passed
and thread-db.c/lin-lwp.c didn't a few months ago; no one ever
commented on them).  It's sitting in gdbserver/thread-db.c and
gdbserver/linux-low.c if you want to look at it.  I believe it'll scale
to non-one-to-one, and I tried to preserve that in the design, but
actually supporting all the mapping calls when we don't have any such
packages just seemed like a bad idea.  It's too complex for me to get
it right blind.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-08-13 15:13     ` Daniel Jacobowitz
@ 2002-08-26 18:33       ` Michael Snyder
  2002-08-26 19:05         ` Daniel Jacobowitz
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Snyder @ 2002-08-26 18:33 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Mark Kettenis, gdb-patches

Daniel Jacobowitz wrote:
> 
> On Wed, Aug 14, 2002 at 12:00:03AM +0200, Mark Kettenis wrote:
> >    Date: Mon, 12 Aug 2002 12:14:47 -0400
> >    From: Daniel Jacobowitz <drow@mvista.com>
> >
> >    Michael, Mark - what do you think of this patch?  A better explanation
> >    of the patch is at:
> >      http://sources.redhat.com/ml/gdb-patches/2002-07/msg00630.html
> >
> > It's a kludge.  Therefore I'm not inclined to say that this can go in.
> > We should really fix things such that lwp_from_thread isn't called
> > under the circumstances where you're having problems, or we should
> > modify it such that it doesn't have to call td_ta_map_id2thr() under
> > those troublesome circumstances.
> >
> > If we can't come up with such a patch in a timely fashion, we could
> > decide to get this in, but not without a comment saying that it is a
> > kludge.
> 
> Well, let me elaborate.
> 
> lwp_from_thread consults data structures in the inferior, just like the
> rest of thread_db.  As such, I have to consider it... "untrusted" if
> you will.  The inferior crashes unexpectedly - we can't call
> lwp_from_thread any more.  The inferior gets corrupted - we can't call
> lwp_from_thread any more.  As such, I think we need to be at least a
> little more graceful with this function everywhere we touch it (which
> is far too often, if you look at the target traffic dumps).  The same
> goes for any other request we make of thread_db.  So the fixes would
> not be in calling it less, but in allowing it to fail (more)
> gracefully.

You're right, there is a problem -- but this isn't the solution.
It's too simple.


> To be honest, I gave up on this entirely.  We don't support any
> non-one-to-one threads packages via thread-db.c; we've never tested
> with any unless someone's got one up their sleeve that they're not
> talking about. 

I believe the next  major revision of glibc is expected
to include one-to-many thread mapping.  At least it used
to be expected to...


> I dispensed with that part of the abstraction layer
> completely, and got a threads package several times faster, simpler,
> and more reliable (I posted some additional test cases that it passed
> and thread-db.c/lin-lwp.c didn't a few months ago; no one ever
> commented on them).  It's sitting in gdbserver/thread-db.c and
> gdbserver/linux-low.c if you want to look at it.  I believe it'll scale
> to non-one-to-one, and I tried to preserve that in the design, but
> actually supporting all the mapping calls when we don't have any such
> packages just seemed like a bad idea.  It's too complex for me to get
> it right blind.
> 
> --
> Daniel Jacobowitz
> MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-08-26 18:33       ` Michael Snyder
@ 2002-08-26 19:05         ` Daniel Jacobowitz
  2002-08-29 15:50           ` Jim Blandy
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Jacobowitz @ 2002-08-26 19:05 UTC (permalink / raw)
  To: Michael Snyder; +Cc: Mark Kettenis, gdb-patches

On Mon, Aug 26, 2002 at 06:31:55PM -0700, Michael Snyder wrote:
> Daniel Jacobowitz wrote:
> > 
> > On Wed, Aug 14, 2002 at 12:00:03AM +0200, Mark Kettenis wrote:
> > >    Date: Mon, 12 Aug 2002 12:14:47 -0400
> > >    From: Daniel Jacobowitz <drow@mvista.com>
> > >
> > >    Michael, Mark - what do you think of this patch?  A better explanation
> > >    of the patch is at:
> > >      http://sources.redhat.com/ml/gdb-patches/2002-07/msg00630.html
> > >
> > > It's a kludge.  Therefore I'm not inclined to say that this can go in.
> > > We should really fix things such that lwp_from_thread isn't called
> > > under the circumstances where you're having problems, or we should
> > > modify it such that it doesn't have to call td_ta_map_id2thr() under
> > > those troublesome circumstances.
> > >
> > > If we can't come up with such a patch in a timely fashion, we could
> > > decide to get this in, but not without a comment saying that it is a
> > > kludge.
> > 
> > Well, let me elaborate.
> > 
> > lwp_from_thread consults data structures in the inferior, just like the
> > rest of thread_db.  As such, I have to consider it... "untrusted" if
> > you will.  The inferior crashes unexpectedly - we can't call
> > lwp_from_thread any more.  The inferior gets corrupted - we can't call
> > lwp_from_thread any more.  As such, I think we need to be at least a
> > little more graceful with this function everywhere we touch it (which
> > is far too often, if you look at the target traffic dumps).  The same
> > goes for any other request we make of thread_db.  So the fixes would
> > not be in calling it less, but in allowing it to fail (more)
> > gracefully.
> 
> You're right, there is a problem -- but this isn't the solution.
> It's too simple.

OK.  I'm using this patch for now in the packages I maintain, because
the warning at least lets the user exit without having to kill GDB. 
Meanwhile, we can try to address the real problems.  If anyone actually
has a suggestion on them, that is!

> > To be honest, I gave up on this entirely.  We don't support any
> > non-one-to-one threads packages via thread-db.c; we've never tested
> > with any unless someone's got one up their sleeve that they're not
> > talking about. 
> 
> I believe the next  major revision of glibc is expected
> to include one-to-many thread mapping.  At least it used
> to be expected to...

I don't think so, but I don't have complete information.  For one
thing, the next major LinuxThreads revision seems to have slipped
beyond the next major glibc release; for another, given all the work
Ingo Molnar's been putting in on minimal-overhead clone() and scalable
scheduling for this new library, I doubt he'd advocate wedging multiple
threads into a process.  But the development is not happening in
public, so I don't really know.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RFA: gdb/568, messy thread exits
  2002-08-26 19:05         ` Daniel Jacobowitz
@ 2002-08-29 15:50           ` Jim Blandy
  0 siblings, 0 replies; 11+ messages in thread
From: Jim Blandy @ 2002-08-29 15:50 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Michael Snyder, Mark Kettenis, gdb-patches


Daniel Jacobowitz <drow@mvista.com> writes:
> I don't think so, but I don't have complete information.  For one
> thing, the next major LinuxThreads revision seems to have slipped
> beyond the next major glibc release; for another, given all the work
> Ingo Molnar's been putting in on minimal-overhead clone() and scalable
> scheduling for this new library, I doubt he'd advocate wedging multiple
> threads into a process.  But the development is not happening in
> public, so I don't really know.

<rumor>

If I followed the thread correctly, I think I've heard Uli say that
he's basically abandoned the M-on-N threads implementation, basically
because he can't get the support he wants from the kernel.  Uli and
the kernel people disagree on how far the kernel should go to support
pthreads.

Everyone agrees that glibc and Linux together ought to correctly
implement the pthreads interface.  Correctness isn't at issue.  But
they disagree on whether Linux should provide special support to make
pthreads more efficient.

Ingo's position is that people who really want speed should use the
non-portable kernel interfaces directly, and that people using
portable interfaces should expect less-than-optimal performance.
Providing optimal support for every random API some standards group
has invented will complicate the kernel far too much.  The groups
really working on performance (databases; web servers) will all use
the non-portable interfaces, and give good benchmarks on real-world
applications.

Uli's position is that most people will use the portable interfaces,
so (I may be filling in here) it benefits more people to improve their
performance.

</rumor>


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2002-08-29 22:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-07-31  9:55 RFA: gdb/568, messy thread exits Daniel Jacobowitz
2002-07-31 13:28 ` Jim Blandy
2002-07-31 13:47   ` Daniel Jacobowitz
2002-07-31 14:01     ` Jim Blandy
2002-07-31 14:04       ` Daniel Jacobowitz
2002-08-12  9:14 ` Daniel Jacobowitz
2002-08-13 15:00   ` Mark Kettenis
2002-08-13 15:13     ` Daniel Jacobowitz
2002-08-26 18:33       ` Michael Snyder
2002-08-26 19:05         ` Daniel Jacobowitz
2002-08-29 15:50           ` Jim Blandy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox