Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [patch] Fix crash on /proc/PID/stat race
@ 2010-05-27 18:00 Jan Kratochvil
  2010-05-27 19:06 ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kratochvil @ 2010-05-27 18:00 UTC (permalink / raw)
  To: gdb-patches

Hi,

got a reported a core file that gdb crashes in linux_nat_core_of_thread_1
called from linux_nat_wait_1 on TARGET_WAITKIND_EXITED.  It crashes because
CONTENT is empty there.

While it is understanable /proc/PID/stat is not available after
TARGET_WAITKIND_EXITED I failed to artificially reproduce it by
	sleep 1&p=$!;(sleep 2;cat) </proc/$p/stat
as it prints
	cat: -: No such process
due to
	read(0, 0x65d000, 32768) = -1 ESRCH (No such process)

No regressions on {x86_64,x86_64-m32}-fedora13-linux-gnu.

Checked the gdb.mi/*.exp gdb.log still contains some "core" values.


Thanks,
Jan


2010-05-27  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* linux-nat.c (linux_nat_core_of_thread_1): Fix crash on invalid
	CONTENT.

--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -5502,16 +5502,23 @@ linux_nat_core_of_thread_1 (ptid_t ptid)
 
   make_cleanup (xfree, content);
 
+  /* Do not assume anything about CONTENT.  In some race fopen can be still
+     successful but CONTENT_READ can be 0 for an exited process.  */
+
   p = strchr (content, '(');
-  p = strchr (p, ')') + 2; /* skip ")" and a whitespace. */
+  if (p != NULL)
+    p = strchr (p, ')');
+  if (p != NULL)
+    p++;
 
   /* If the first field after program name has index 0, then core number is
      the field with index 36.  There's no constant for that anywhere.  */
-  p = strtok_r (p, " ", &ts);
-  for (i = 0; i != 36; ++i)
+  if (p != NULL)
+    p = strtok_r (p, " ", &ts);
+  for (i = 0; p != NULL && i != 36; ++i)
     p = strtok_r (NULL, " ", &ts);
 
-  if (sscanf (p, "%d", &core) == 0)
+  if (p == NULL || sscanf (p, "%d", &core) == 0)
     core = -1;
 
   do_cleanups (back_to);


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Fix crash on /proc/PID/stat race
  2010-05-27 18:00 [patch] Fix crash on /proc/PID/stat race Jan Kratochvil
@ 2010-05-27 19:06 ` Pedro Alves
  2010-05-27 21:24   ` Jan Kratochvil
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2010-05-27 19:06 UTC (permalink / raw)
  To: gdb-patches; +Cc: Jan Kratochvil

On Thursday 27 May 2010 18:54:04, Jan Kratochvil wrote:
> Hi,
> 
> got a reported a core file that gdb crashes in linux_nat_core_of_thread_1
> called from linux_nat_wait_1 on TARGET_WAITKIND_EXITED.  It crashes because
> CONTENT is empty there.
> 
> While it is understanable /proc/PID/stat is not available after
> TARGET_WAITKIND_EXITED I failed to artificially reproduce it by
> 	sleep 1&p=$!;(sleep 2;cat) </proc/$p/stat
> as it prints
> 	cat: -: No such process
> due to
> 	read(0, 0x65d000, 32768) = -1 ESRCH (No such process)

Why are we trying to get at the core if we know the process
is gone?  Since the process is already waited for, I'm surprised
the fopen succeeded in the first place.  On a couple of quick tests,
I always see fopen failing.  It sounds like a kernel bug.  Can't we
just skip the core_of_thread call for
TARGET_WAITKING_EXITED|TARGET_WAITKING_SIGNALLED?

-- 
Pedro Alves


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Fix crash on /proc/PID/stat race
  2010-05-27 19:06 ` Pedro Alves
@ 2010-05-27 21:24   ` Jan Kratochvil
  2010-05-27 22:50     ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kratochvil @ 2010-05-27 21:24 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

On Thu, 27 May 2010 21:00:26 +0200, Pedro Alves wrote:
> Why are we trying to get at the core if we know the process
> is gone?  Since the process is already waited for, I'm surprised
> the fopen succeeded in the first place.  On a couple of quick tests,
> I always see fopen failing.  It sounds like a kernel bug.  Can't we
> just skip the core_of_thread call for
> TARGET_WAITKING_EXITED|TARGET_WAITKING_SIGNALLED?

An additional patch like this one?

It is IMO not correct for GDB to crash on unexpected /proc/** content.


Thanks,
Jan


2010-05-27  Jan Kratochvil  <jan.kratochvil@redhat.com>
(maybe rather Pedro Alves  <pedro@codesourcery.com> as I just "installed" it)

	* linux-nat.c (linux_nat_wait_1): Do not call
	linux_nat_core_of_thread_1 on TARGET_WAITKIND_EXITED or
	TARGET_WAITKIND_SIGNALLED.

--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -3626,7 +3626,13 @@ retry:
     fprintf_unfiltered (gdb_stdlog, "LLW: exit\n");
 
   restore_child_signals_mask (&prev_mask);
-  lp->core = linux_nat_core_of_thread_1 (lp->ptid);
+
+  if (ourstatus->kind == TARGET_WAITKIND_EXITED
+      || ourstatus->kind == TARGET_WAITKIND_SIGNALLED)
+    lp->core = -1;
+  else
+    lp->core = linux_nat_core_of_thread_1 (lp->ptid);
+
   return lp->ptid;
 }
 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Fix crash on /proc/PID/stat race
  2010-05-27 21:24   ` Jan Kratochvil
@ 2010-05-27 22:50     ` Pedro Alves
  2010-05-28 18:28       ` Jan Kratochvil
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2010-05-27 22:50 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: gdb-patches

On Thursday 27 May 2010 22:20:37, Jan Kratochvil wrote:
> On Thu, 27 May 2010 21:00:26 +0200, Pedro Alves wrote:
> > Why are we trying to get at the core if we know the process
> > is gone?  Since the process is already waited for, I'm surprised
> > the fopen succeeded in the first place.  On a couple of quick tests,
> > I always see fopen failing.  It sounds like a kernel bug.  Can't we
> > just skip the core_of_thread call for
> > TARGET_WAITKING_EXITED|TARGET_WAITKING_SIGNALLED?
> 
> An additional patch like this one?

Yes, exactly.  Thanks.

> It is IMO not correct for GDB to crash on unexpected /proc/** content.

Okay, I don't mind your original patch that much.  I think that it
adds a bit of complexity by being paranoid.  As long as a process
hasn't been waited for, the /stat entry should exist, even if the
process is zombie.

Anyway, if you want to put that one in, it's okay, but please don't
lose the comment below:

> -  p = strchr (p, ')') + 2; /* skip ")" and a whitespace. */
> +  if (p != NULL)
> +    p = strchr (p, ')');

and I don't think the fopen race comment makes sense
as is anymore.

Also, IWBN if gdbserver was fixed similarly, but I won't ask
you to do that.  :-)

-- 
Pedro Alves


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] Fix crash on /proc/PID/stat race
  2010-05-27 22:50     ` Pedro Alves
@ 2010-05-28 18:28       ` Jan Kratochvil
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kratochvil @ 2010-05-28 18:28 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

On Fri, 28 May 2010 00:25:40 +0200, Pedro Alves wrote:
> Yes, exactly.  Thanks.

Checked-in:
	http://sourceware.org/ml/gdb-cvs/2010-05/msg00243.html


> As long as a process hasn't been waited for, the /stat entry should exist,
> even if the process is zombie.

I agree but still finding such external dependency needlessly fragile.


> Anyway, if you want to put that one in, it's okay, but please don't
> lose the comment below:
> 
> > -  p = strchr (p, ')') + 2; /* skip ")" and a whitespace. */
> > +  if (p != NULL)
> > +    p = strchr (p, ')');

The part "and a whitespace." was removed as redundant (due to the following
strtok_r) while getting more complicated making it safe.  In the spirit of GNU
Coding Style
	"It is not necessary to duplicate in words the meaning of the
	C argument declarations"
(while used in a different meaning at that point) I removed the remainder of
the comment
	/* skip ")" */
for the block
	  if (p != NULL)
	    p = strchr (p, ')');
	  if (p != NULL)
	    p++;
as obvious; but I put it back now on your request.


> and I don't think the fopen race comment makes sense > as is anymore.

OK, removed that comment.


> Also, IWBN if gdbserver was fixed similarly, but I won't ask
> you to do that.  :-)

Forgot/unaware, fixed the same way.

Checked-in.


Thanks,
Jan


http://sourceware.org/ml/gdb-cvs/2010-05/msg00244.html

--- src/gdb/ChangeLog	2010/05/28 18:00:41	1.11855
+++ src/gdb/ChangeLog	2010/05/28 18:23:13	1.11856
@@ -1,5 +1,10 @@
 2010-05-28  Jan Kratochvil  <jan.kratochvil@redhat.com>
 
+	* linux-nat.c (linux_nat_core_of_thread_1): Fix crash on invalid
+	CONTENT.
+
+2010-05-28  Jan Kratochvil  <jan.kratochvil@redhat.com>
+
 	* linux-nat.c (linux_nat_wait_1): Do not call
 	linux_nat_core_of_thread_1 on TARGET_WAITKIND_EXITED or
 	TARGET_WAITKIND_SIGNALLED.
--- src/gdb/gdbserver/ChangeLog	2010/05/26 22:40:22	1.386
+++ src/gdb/gdbserver/ChangeLog	2010/05/28 18:23:15	1.387
@@ -1,3 +1,8 @@
+2010-05-28  Jan Kratochvil  <jan.kratochvil@redhat.com>
+
+	* linux-low.c (linux_core_of_thread): Fix crash on invalid CONTENT.
+	New comment.
+
 2010-05-26  Ozkan Sezer  <sezeroz@gmail.com>
 
 	* gdbreplay.c (remote_open): Check error return from socket() call by
--- src/gdb/linux-nat.c	2010/05/28 18:00:46	1.169
+++ src/gdb/linux-nat.c	2010/05/28 18:23:15	1.170
@@ -5509,15 +5509,21 @@
   make_cleanup (xfree, content);
 
   p = strchr (content, '(');
-  p = strchr (p, ')') + 2; /* skip ")" and a whitespace. */
+
+  /* Skip ")".  */
+  if (p != NULL)
+    p = strchr (p, ')');
+  if (p != NULL)
+    p++;
 
   /* If the first field after program name has index 0, then core number is
      the field with index 36.  There's no constant for that anywhere.  */
-  p = strtok_r (p, " ", &ts);
-  for (i = 0; i != 36; ++i)
+  if (p != NULL)
+    p = strtok_r (p, " ", &ts);
+  for (i = 0; p != NULL && i != 36; ++i)
     p = strtok_r (NULL, " ", &ts);
 
-  if (sscanf (p, "%d", &core) == 0)
+  if (p == NULL || sscanf (p, "%d", &core) == 0)
     core = -1;
 
   do_cleanups (back_to);
--- src/gdb/gdbserver/linux-low.c	2010/05/03 04:02:20	1.148
+++ src/gdb/gdbserver/linux-low.c	2010/05/28 18:23:15	1.149
@@ -4346,13 +4346,21 @@
     }
 
   p = strchr (content, '(');
-  p = strchr (p, ')') + 2; /* skip ")" and a whitespace. */
 
-  p = strtok_r (p, " ", &ts);
-  for (i = 0; i != 36; ++i)
+  /* Skip ")".  */
+  if (p != NULL)
+    p = strchr (p, ')');
+  if (p != NULL)
+    p++;
+
+  /* If the first field after program name has index 0, then core number is
+     the field with index 36.  There's no constant for that anywhere.  */
+  if (p != NULL)
+    p = strtok_r (p, " ", &ts);
+  for (i = 0; p != NULL && i != 36; ++i)
     p = strtok_r (NULL, " ", &ts);
 
-  if (sscanf (p, "%d", &core) == 0)
+  if (p == NULL || sscanf (p, "%d", &core) == 0)
     core = -1;
 
   free (content);


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-05-28 18:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-27 18:00 [patch] Fix crash on /proc/PID/stat race Jan Kratochvil
2010-05-27 19:06 ` Pedro Alves
2010-05-27 21:24   ` Jan Kratochvil
2010-05-27 22:50     ` Pedro Alves
2010-05-28 18:28       ` Jan Kratochvil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox