* linux-thread-db.c not only caller of add_thread, -> gdb segv
@ 2007-11-09 4:38 Douglas Evans
2007-11-09 14:02 ` Daniel Jacobowitz
0 siblings, 1 reply; 7+ messages in thread
From: Douglas Evans @ 2007-11-09 4:38 UTC (permalink / raw)
To: gdb
Hi. I'm trying to decide what's the best way to fix a bug I've found.
linux-thread-db.c is not the only caller of add_thread, e.g. infrun.c
calls it too.
linux-thread-db.c assumes thread_info.private is non-NULL, e.g. here:
static int
clear_lwpid_callback (struct thread_info *thread, void *dummy)
{
/* If we know that our thread implementation is 1-to-1, we could
save
a certain amount of information; it's not clear how much, so we
are always conservative. */
thread->private->th_valid = 0;
thread->private->ti_valid = 0;
return 0;
}
called from here:
#0 0x00000000004681a4 in clear_lwpid_callback (thread=0xb132d0,
dummy=0x0) at ../../src/gdb/linux-thread-db.c:765
#1 0x00000000004f8480 in iterate_over_threads (callback=0x46818d
<clear_lwpid_callback>, data=0x0) at ../../src/gdb/thread.c:204
#2 0x00000000004682a6 in thread_db_resume (ptid={pid = -1, lwp = 0,
tid = 0}, step=0, signo=TARGET_SIGNAL_0) at
../../src/gdb/linux-thread-db.c:782
#3 0x00000000004eb83d in resume (step=0, sig=TARGET_SIGNAL_0) at
../../src/gdb/infrun.c:614
#4 0x00000000004ebc14 in proceed (addr=18446744073709551615,
siggnal=TARGET_SIGNAL_DEFAULT, step=0) at ../../src/gdb/infrun.c:805
This is iterating over all threads, some of which were created via
add_thread calls outside of linux-thread-db.c, and thus
thread_info.private is NULL and boom, segv.
I can see two solutions
1) ensure all calls to add_thread properly initialize thread_info.private
(via callback or whatever)
2) have linux-thread-db.c always check thread_info.private before
dereferencing it.
Comments?
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-09 4:38 linux-thread-db.c not only caller of add_thread, -> gdb segv Douglas Evans @ 2007-11-09 14:02 ` Daniel Jacobowitz 2007-11-10 1:27 ` Douglas Evans 0 siblings, 1 reply; 7+ messages in thread From: Daniel Jacobowitz @ 2007-11-09 14:02 UTC (permalink / raw) To: Douglas Evans; +Cc: gdb On Thu, Nov 08, 2007 at 08:38:13PM -0800, Douglas Evans wrote: > Hi. I'm trying to decide what's the best way to fix a bug I've found. > linux-thread-db.c is not the only caller of add_thread, e.g. infrun.c > calls it too. This is true. But if you're using the Linux native target, and reach the point where any of the other add_thread calls are made, then there's already a bug. It is supposed to detect all new threads. It sort of has to - that's how infrun gets the new ptid_t. > I can see two solutions > 1) ensure all calls to add_thread properly initialize thread_info.private > (via callback or whatever) > 2) have linux-thread-db.c always check thread_info.private before > dereferencing it. 3) Figure out how you got to any of the other add_thread calls. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-09 14:02 ` Daniel Jacobowitz @ 2007-11-10 1:27 ` Douglas Evans 2007-11-10 3:26 ` Daniel Jacobowitz 0 siblings, 1 reply; 7+ messages in thread From: Douglas Evans @ 2007-11-10 1:27 UTC (permalink / raw) To: gdb On Nov 9, 2007 6:02 AM, Daniel Jacobowitz <drow@false.org> wrote: > This is true. But if you're using the Linux native target, and reach > the point where any of the other add_thread calls are made, then > there's already a bug. It is supposed to detect all new threads. > It sort of has to - that's how infrun gets the new ptid_t. Ah, thanks. I understand the threading support a bit better now. [for completeness' sake, target = x86_64-linux] > 3) Figure out how you got to any of the other add_thread calls. Or that. :-) I think I understand what's going on. Consider a threaded app that first exec's itself (e.g. because it wants to be run with a specific value for LD_LIBRARY_PATH, the actual reason doesn't matter though). GDB doesn't properly handle the fact that the original process is gone. GDB detects the exec and removes thread_db_ops from the target stack. But (a) it doesn't clear out thread_list and (b) still thinks it has control of the running process. By the time GDB gets to handle_inferior_event() case TARGET_WAITKIND_EXECD it has added the new process to thread_list with the call to add_thread in infrun.c, and thus there's an entry in thread_list with "private" == NULL. thread_list also has stale entries at this point. If the user does a ^c after the exec and then "info threads" gdb will detect an internal error due to the stale entries in thread_list and crash. But if instead the user does a ^c and then "sharedlib some_library" GDB will push thread_db_ops back on the target stack. If the user then does "continue" thread_db_resume gets called and boom, segv because thread->private == NULL in clear_lwpid_callback. I'm working on a patch but I don't know this part of GDB very well. [small test case available, I'll include it with the patch] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-10 1:27 ` Douglas Evans @ 2007-11-10 3:26 ` Daniel Jacobowitz 2007-11-10 3:40 ` Douglas Evans 0 siblings, 1 reply; 7+ messages in thread From: Daniel Jacobowitz @ 2007-11-10 3:26 UTC (permalink / raw) To: gdb On Fri, Nov 09, 2007 at 05:26:56PM -0800, Douglas Evans wrote: > GDB detects the exec and removes thread_db_ops from the target stack. > But (a) it doesn't clear out thread_list and (b) still thinks it has control of > the running process. (b) should be true, no? Exec support is somewhat horrible. I posted a patch for it recently which might help (might not, I'm not sure). -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-10 3:26 ` Daniel Jacobowitz @ 2007-11-10 3:40 ` Douglas Evans 2007-11-10 5:20 ` Douglas Evans 0 siblings, 1 reply; 7+ messages in thread From: Douglas Evans @ 2007-11-10 3:40 UTC (permalink / raw) To: gdb On Nov 9, 2007 7:25 PM, Daniel Jacobowitz <drow@false.org> wrote: > On Fri, Nov 09, 2007 at 05:26:56PM -0800, Douglas Evans wrote: > > GDB detects the exec and removes thread_db_ops from the target stack. > > But (a) it doesn't clear out thread_list and (b) still thinks it has control of > > the running process. > > (b) should be true, no? I suspect it's a matter of degrees (so to speak) or word choice (apologies). Until MAY_FOLLOW_EXEC is true for linux I'd expect gdb to return control to the user when an exec() happens. Am I wrong in thinking gdb will lose control across the exec()? > Exec support is somewhat horrible. I posted a patch for it recently > which might help (might not, I'm not sure). I'll try to find it and give it a try. Thanks. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-10 3:40 ` Douglas Evans @ 2007-11-10 5:20 ` Douglas Evans 2007-11-10 16:00 ` Daniel Jacobowitz 0 siblings, 1 reply; 7+ messages in thread From: Douglas Evans @ 2007-11-10 5:20 UTC (permalink / raw) To: gdb [-- Attachment #1: Type: text/plain, Size: 1321 bytes --] On Nov 9, 2007 7:40 PM, Douglas Evans <dje@google.com> wrote: > I suspect it's a matter of degrees (so to speak) or word choice (apologies). > Until MAY_FOLLOW_EXEC is true for linux I'd expect gdb to return control > to the user when an exec() happens. > Am I wrong in thinking gdb will lose control across the exec()? I guess what gdb does for an exec() when MAY_FOLLOW_EXEC==0 is at least partially a matter of taste or definition. Running some experiments I can see that I can control (e.g. single step) the "new" process after the exec but the symbol info is all wrong if I exec() a different program. It didn't take much to get gdb to follow the exec'ing of toy programs. So setting aside a discussion of what MAY_FOLLOW_EXEC==0 really means on linux, how much work remains to support MAY_FOLLOW_EXEC==1? With this patch I can step across the exec from test-exec1.c to test-exec2.c. IWBN if gdb remembered the original program so that if I run it again, gdb reruns test-exec1.x and not test-exec2.x. This is not a real patch, just an experiment. I had to comment out the call to target_mourn_inferior to avoid a hang in gdb (and even a hang in gdb of gdb). Perhaps some aspect of mourning is required here, but just not a full-blown one. Any ideas what it would take to properly support MAY_FOLLOW_EXEC==1? [-- Attachment #2: may-follow-exec-play.txt --] [-- Type: text/plain, Size: 3106 bytes --] Index: infrun.c =================================================================== RCS file: /cvs/src/src/gdb/infrun.c,v retrieving revision 1.250 diff -u -p -r1.250 infrun.c --- infrun.c 2 Nov 2007 14:47:28 -0000 1.250 +++ infrun.c 10 Nov 2007 04:58:10 -0000 @@ -108,6 +108,7 @@ static ptid_t previous_inferior_ptid; /* This is true for configurations that may follow through execl() and similar functions. At present this is only true for HP-UX native. */ +#define MAY_FOLLOW_EXEC 1 #ifndef MAY_FOLLOW_EXEC #define MAY_FOLLOW_EXEC (0) #endif @@ -396,7 +397,9 @@ follow_exec (int pid, char *execd_pathna error (_("Could find run target to save before following exec")); gdb_flush (gdb_stdout); +#if 0 target_mourn_inferior (); +#endif inferior_ptid = pid_to_ptid (saved_pid); /* Because mourn_inferior resets inferior_ptid. */ push_target (tgt); [dje@seba gdb]$ cat test-exec1.c #include <stdio.h> #include <unistd.h> int main (int argc, char *argv[]) { printf ("this is %s\n", argv[0]); if (argc == 1) execl ("./test-exec2.x", "./test-exec2.x", "second", NULL); return 0; } [dje@seba gdb]$ cat test-exec2.c #include <stdio.h> #include <unistd.h> static void second () { printf ("second\n"); } int main (int argc, char *argv[]) { printf ("this is %s\n", argv[0]); if (argc == 1) execl (argv[0], argv[0], "second", NULL); second (); return 0; } [dje@seba gdb]$ gcc test-exec1.c -o test-exec1.x -g [dje@seba gdb]$ gcc test-exec2.c -o test-exec2.x -g [dje@seba gdb]$ ./gdb test-exec1.x GNU gdb 6.7.50.20071110-cvs Copyright (C) 2007 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"... (gdb) b main Breakpoint 1 at 0x80483c8: file test-exec1.c, line 7. (gdb) r Starting program: /home/dje/gnu/sourceware/head/obj/gdb/test-exec1.x Breakpoint 1, main (argc= 1, argv=0xbfff2e64) at test-exec1.c:7 7 printf ("this is %s\n", argv[0]); (gdb) n this is /home/dje/gnu/sourceware/head/obj/gdb/test-exec1.x 8 if (argc == 1) (gdb) 9 execl ("./test-exec2.x", "./test-exec2.x", "second", NULL); (gdb) Executing new program: /home/dje/gnu/sourceware/head/obj/gdb/test-exec2.x warning: Temporarily disabling breakpoints for unloaded shared library "/lib/libc.so.6" Breakpoint 1, main (argc= 2, argv=0xbf848ef4) at test-exec2.c:13 13 printf ("this is %s\n", argv[0]); (gdb) this is ./test-exec2.x 14 if (argc == 1) (gdb) 16 second (); (gdb) second 17 return 0; (gdb) c Continuing. Program exited normally. (gdb) r Starting program: /home/dje/gnu/sourceware/head/obj/gdb/test-exec2.x Breakpoint 1, main (argc= 1, argv=0xbf9da854) at test-exec2.c:13 13 printf ("this is %s\n", argv[0]); (gdb) n this is /home/dje/gnu/sourceware/head/obj/gdb/test-exec2.x 14 if (argc == 1) (gdb) q The program is running. Exit anyway? (y or n) y [dje@seba gdb]$ ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: linux-thread-db.c not only caller of add_thread, -> gdb segv 2007-11-10 5:20 ` Douglas Evans @ 2007-11-10 16:00 ` Daniel Jacobowitz 0 siblings, 0 replies; 7+ messages in thread From: Daniel Jacobowitz @ 2007-11-10 16:00 UTC (permalink / raw) To: gdb On Fri, Nov 09, 2007 at 09:20:25PM -0800, Douglas Evans wrote: > On Nov 9, 2007 7:40 PM, Douglas Evans <dje@google.com> wrote: > > I suspect it's a matter of degrees (so to speak) or word choice (apologies). > > Until MAY_FOLLOW_EXEC is true for linux I'd expect gdb to return control > > to the user when an exec() happens. > > Am I wrong in thinking gdb will lose control across the exec()? > > I guess what gdb does for an exec() when MAY_FOLLOW_EXEC==0 is at > least partially a matter of taste or definition. Running some > experiments I can see that I can control (e.g. single step) the "new" > process after the exec but the symbol info is all wrong if I exec() a > different program. Right. Thus the patch I mentioned (sorry, should have found the link yesterday): http://sourceware.org/ml/gdb-patches/2007-10/msg00455.html > With this patch I can step across the exec from test-exec1.c to test-exec2.c. > IWBN if gdb remembered the original program so that if I run it again, > gdb reruns test-exec1.x and not test-exec2.x. Yeah. I haven't figured out what to do about that, or all the various implications... for instance, if you say "file" should that override the program to run the next time? So I've just fixed it as it was and not messed with the dubious UI. There may be more to come on this topic... -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-11-10 16:00 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-11-09 4:38 linux-thread-db.c not only caller of add_thread, -> gdb segv Douglas Evans 2007-11-09 14:02 ` Daniel Jacobowitz 2007-11-10 1:27 ` Douglas Evans 2007-11-10 3:26 ` Daniel Jacobowitz 2007-11-10 3:40 ` Douglas Evans 2007-11-10 5:20 ` Douglas Evans 2007-11-10 16:00 ` Daniel Jacobowitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox