Re: How to catch GDB crash

Mirror of the gdb mailing list

* Re: How to catch GDB crash
@ 2008-06-24 12:39 Dmitry Smirnov
  2008-06-24 12:58 ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-24 12:39 UTC (permalink / raw)
  To: gdb

Hi!

I have to say that your advice was useful. I was able to catch the crash.
Maybe it will be interesting for you.
First, after I attached to the process, I set 3 breakpoints: exit, _exit, abort (which is in fact cygwin1!abort).
Next, I noticed that the program was stopped in that weird state Brian wrote about:

(gdb) bt
#0  0x7c901231 in ntdll!DbgUiConnectToDbg ()  from /c/WINDOWS/system32/ntdll.dll
#1  0x7c9507a8 in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
#2  0x00000005 in ?? ()

I tried to step through the code: three or four 'ni' commands and, oops, it looks like the program was resumed.
After that, I performed the actions that lead to the crash (just issuing 'ni' for arm-elf-gdb from Eclipse) and ended up in the following situation:

Breakpoint 2, 0x61084819 in cygwin1!abort () from /usr/bin/cygwin1.dll
(gdb)
0x6108481e in cygwin1!abort () from /usr/bin/cygwin1.dll
(gdb) info th
  3 thread 5424.0x114c  0x7c90eb94 in ntdll!LdrAccessResource ()
   from /c/WINDOWS/system32/ntdll.dll
  2 thread 5424.0x1108  0x7c90eb94 in ntdll!LdrAccessResource ()
   from /c/WINDOWS/system32/ntdll.dll
* 1 thread 5424.0x1bc  0x6108481e in cygwin1!abort ()
   from /usr/bin/cygwin1.dll
(gdb) bt
#0  0x6108481e in cygwin1!abort () from /usr/bin/cygwin1.dll
#1  0x61086e60 in sigfillset () from /usr/bin/cygwin1.dll
#2  0x0040b6b7 in internal_verror (file=0x62380b ".././gdb/mi/mi-interp.c",
    line=340, fmt=0x6237ed "%s: Assertion `%s' failed.", ap=0x22e64c "-7b")
    at utils.c:809
#3  0x0040b6f6 in internal_error (file=0x62380b ".././gdb/mi/mi-interp.c",
    line=340, string=0x6237ed "%s: Assertion `%s' failed.") at utils.c:818
#4  0x0048cfec in mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:340
#5  0x0046377f in observer_target_resumed_notification_stub (data=0x48cf40,
    args_data=0x22e6b0) at observer.inc:378
#6  0x00463052 in generic_observer_notify (subject=0x1, args=0x5e74ac)
    at observer.c:166
#7  0x004637fe in observer_notify_target_resumed (ptid=
      {pid = 42000, lwp = 0, tid = 0}) at observer.inc:402
#8  0x0047bd22 in set_running (ptid={pid = 42000, lwp = 0, tid = 0},
    running=1) at thread.c:435
#9  0x004265a0 in resume (step=1, sig=TARGET_SIGNAL_HUP) at infrun.c:1063
#10 0x004294c0 in proceed (addr=4294967295, siggnal=TARGET_SIGNAL_DEFAULT,
    step=1) at infrun.c:1265
#11 0x00412564 in step_1 (skip_subroutines=1, single_inst=1, count_string=0x0)
    at infcmd.c:789
#12 0x00402433 in execute_command (p=0x22e872 "", from_tty=1) at top.c:466
#13 0x004116e2 in catch_exception (uiout=0x100f0c80,
    func=0x48e040 <do_captured_execute_command>, func_args=0x22e898, mask=6)
    at exceptions.c:463
#14 0x0048e0e6 in cli_interpreter_exec (data=0x0, command_str=0x103230b0 "ni")
    at .././gdb/cli/cli-interp.c:130
#15 0x0041acfb in interp_exec (interp=0x100f0ce8, command_str=0x103230b0 "ni")
    at interps.c:325
#16 0x0048cc49 in mi_cmd_interpreter_exec (
    command=0x64d54e "-interpreter-exec", argv=0x22e998, argc=2)
    at .././gdb/mi/mi-interp.c:209
#17 0x00505a45 in captured_mi_execute_command (uiout=0x100f1660,
    data=0x22ea40) at .././gdb/mi/mi-main.c:1104
#18 0x004116e2 in catch_exception (uiout=0x100f1660,
    func=0x5057a0 <captured_mi_execute_command>, func_args=0x22ea40, mask=6)
    at exceptions.c:463
#19 0x0050558a in mi_execute_command (cmd=0x10ef9ea0 "229 ni", from_tty=1)
    at .././gdb/mi/mi-main.c:1159
#20 0x0048cd49 in mi_execute_command_wrapper (cmd=0x10ef9ea0 "229 ni")
    at .././gdb/mi/mi-interp.c:265
#21 0x00436e5a in handle_file_event (event_file_desc=0) at event-loop.c:732
#22 0x004368c2 in process_event () at event-loop.c:341
#23 0x004371a5 in gdb_do_one_event (data=0x0) at event-loop.c:378
#24 0x0041192b in catch_errors (func=0x437020 <gdb_do_one_event>,
    func_args=0x0, errstring=0x607b20 "", mask=6) at exceptions.c:509
#25 0x00436914 in start_event_loop () at event-loop.c:404
#26 0x004010ab in captured_command_loop (data=0x0) at .././gdb/main.c:99
#27 0x0041192b in catch_errors (func=0x4010a0 <captured_command_loop>,
    func_args=0x0, errstring=0x5f7139 "", mask=6) at exceptions.c:509
#28 0x00401914 in captured_main (data=0x22ee10) at .././gdb/main.c:882
#29 0x0041192b in catch_errors (func=0x4010f0 <captured_main>,
    func_args=0x22ee10, errstring=0x5f7139 "", mask=6) at exceptions.c:509
#30 0x00402113 in gdb_main (args=0x22ee10) at .././gdb/main.c:891
#31 0x0040109b in main (argc=8, argv=0x100301a0) at gdb.c:33
(gdb)


Now I can start investigations.

Dmitry



* Re: How to catch GDB crash
  2008-06-24 12:39 How to catch GDB crash Dmitry Smirnov
@ 2008-06-24 12:58 ` Pedro Alves
  0 siblings, 0 replies; 30+ messages in thread
From: Pedro Alves @ 2008-06-24 12:58 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Tuesday 24 June 2008 13:38:48, Dmitry Smirnov wrote:
> #3  0x0040b6f6 in internal_error (file=0x62380b ".././gdb/mi/mi-interp.c",
>     line=340, string=0x6237ed "%s: Assertion `%s' failed.") at utils.c:818
> #4  0x0048cfec in mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
>     at .././gdb/mi/mi-interp.c:340

42000 looks a lot like remote.c:MAGIC_NULL_PID, and it wasn't on the
thread list at this point:

static void
mi_on_resume (ptid_t ptid)
{
  if (PIDGET (ptid) == -1)
    fprintf_unfiltered (raw_stdout, "*running,thread-id=\"all\"\n");
  else
    {
      struct thread_info *ti = find_thread_pid (ptid);
      gdb_assert (ti);  /* <<< assert here */
      fprintf_unfiltered (raw_stdout, "*running,thread-id=\"%d\"\n", ti->num);
    }
}

And, it looks like your stub does not implement any thread support?

It seems we either need to make sure remote.c always registers
a thread, or remove that assert.  I would prefer the former,
as it's a requirement for getting rid of context switching
on the core side.
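
To make the failure mode concrete, here is a minimal, self-contained sketch of the lookup described above. It is not GDB code: `struct ptid`, `struct thread_entry`, `lookup_thread` and `format_running` are hypothetical stand-ins for ptid_t, struct thread_info, find_thread_pid and mi_on_resume. Instead of asserting, the sketch returns -1 when the ptid is not in the table, which corresponds to the case that trips gdb_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for GDB's ptid_t and thread_info (assumptions,
   not the real definitions).  */
struct ptid { int pid; long lwp; long tid; };
struct thread_entry { struct ptid ptid; int num; };

/* Linear search of a thread table; returns NULL when PTID is unknown.  */
static const struct thread_entry *
lookup_thread (const struct thread_entry *table, size_t n, struct ptid ptid)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (table[i].ptid.pid == ptid.pid
        && table[i].ptid.lwp == ptid.lwp
        && table[i].ptid.tid == ptid.tid)
      return &table[i];
  return NULL;
}

/* Format the "*running" notification into BUF.  Returns 0 on success and
   -1 when PTID is not registered (the case the real code asserts on).  */
static int
format_running (char *buf, size_t size, const struct thread_entry *table,
                size_t n, struct ptid ptid)
{
  const struct thread_entry *ti;

  assert (buf != NULL && size > 0);
  if (ptid.pid == -1)
    {
      /* A pid of -1 means "all threads".  */
      snprintf (buf, size, "*running,thread-id=\"all\"\n");
      return 0;
    }
  ti = lookup_thread (table, n, ptid);
  if (ti == NULL)
    return -1;
  snprintf (buf, size, "*running,thread-id=\"%d\"\n", ti->num);
  return 0;
}
```

With an empty table, {42000,0,0} yields -1 while {-1,0,0} yields thread-id="all"; once an entry with that ptid is registered, the same call succeeds.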

-- 
Pedro Alves



* Re: How to catch GDB crash
  2008-07-07 16:01                             ` Pedro Alves
@ 2008-07-08  8:27                               ` Dmitry Smirnov
  0 siblings, 0 replies; 30+ messages in thread
From: Dmitry Smirnov @ 2008-07-08  8:27 UTC (permalink / raw)
  To: gdb

Thanks,

Most likely I've found the root cause: mi_on_resume() does not flush raw_stdout. I've added "gdb_flush (raw_stdout)" at the end of this function and everything seems to work fine.
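
As an analogy in plain C stdio (GDB's raw_stdout is a ui_file, not a FILE, so this is only a sketch under that assumption): when output goes to a pipe, as it does when a front end such as Eclipse drives the debugger, stdio streams are usually fully buffered, so a notification can sit in the buffer until something flushes it. The hypothetical helper `emit_running` below writes the record and flushes immediately.

```c
#include <assert.h>
#include <stdio.h>

/* Write a "*running" notification for THREAD_NUM to OUT and flush it at
   once, so a reader on the other end of a pipe sees it without delay.
   Returns 0 on success, -1 on a write or flush failure.  */
static int
emit_running (FILE *out, int thread_num)
{
  assert (out != NULL);
  if (fprintf (out, "*running,thread-id=\"%d\"\n", thread_num) < 0)
    return -1;
  return fflush (out) == 0 ? 0 : -1;
}
```

Calling `emit_running (stdout, 1)` would correspond to the notification followed by the added flush.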

I've also found a solution for the Eclipse problem where it is unable to work correctly after this "^running". In my opinion it is not related to GDB (or to the delayed "^running" :-) ), so I have no problems with GDB at the moment.

M-m-m, there is just one weird, minor observation: right after "target remote:12345" is executed, Eclipse shows two identical threads, Thread[0] and Thread[1], stopped at the same address. After I issue "-exec-continue" and GDB hits the breakpoint, Eclipse shows just one thread: Thread[1] with a correct call stack. Perhaps I will have some time to debug it, and I'll get back if this turns out to be related to GDB.

Thanks!
Dmitry

-----Original Message-----
From: Pedro Alves <pedro_alves@portugalmail.pt>
To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
Date: Mon, 7 Jul 2008 17:00:37 +0100
Subject: Re: How to catch GDB crash

> 
> A Monday 07 July 2008 16:47:26, Dmitry Smirnov wrote:
> > Perhaps, "info threads" is fixed. But since both "info threads" and
> > "running" problems were [most probably] caused by the "main thread
> > registering" fix, maybe it is better to investigate "^running" problem
> > before submission? What if they are connected? ;-)
> 
> I seriously doubt they are connected.  The code to output "^running"
> has nothing to do with having threads or not.
> 
> > I have to say, that my goal is not just report issues, I would like to help
> > fixing them. 
> 
> Welcome on board!  We need all the help we can get.
> 
> > Unfortunately, I do not have much time to learn GDB, so I'm 
> > just asking for hints: what can I do to discover the root cause. 
> 
> The best way is to do a binary search on the CVS HEAD sources, to
> find the patch that caused your issue.
> 
> > For 
> > example, who is responding "^running"? What functions/files should I debug
> > to figure out the problem?
> 
> Grepping for "^running" should get you there.  
> 
> See here, your issue was most likely introduced by this:
>  http://sourceware.org/ml/gdb-patches/2008-06/msg00247.html
> 
> -- 
> Pedro Alves
> 


* Re: How to catch GDB crash
  2008-07-07 15:47                           ` Dmitry Smirnov
@ 2008-07-07 16:01                             ` Pedro Alves
  2008-07-08  8:27                               ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-07-07 16:01 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Monday 07 July 2008 16:47:26, Dmitry Smirnov wrote:
> Perhaps, "info threads" is fixed. But since both "info threads" and
> "running" problems were [most probably] caused by the "main thread
> registering" fix, maybe it is better to investigate "^running" problem
> before submission? What if they are connected? ;-)

I seriously doubt they are connected.  The code to output "^running"
has nothing to do with having threads or not.

> I have to say, that my goal is not just report issues, I would like to help
> fixing them. 

Welcome on board!  We need all the help we can get.

> Unfortunately, I do not have much time to learn GDB, so I'm 
> just asking for hints: what can I do to discover the root cause. 

The best way is to do a binary search on the CVS HEAD sources, to
find the patch that caused your issue.

> For 
> example, who is responding "^running"? What functions/files should I debug
> to figure out the problem?

Grepping for "^running" should get you there.  

See here, your issue was most likely introduced by this:
 http://sourceware.org/ml/gdb-patches/2008-06/msg00247.html

-- 
Pedro Alves



* Re: How to catch GDB crash
  2008-07-07 14:29                         ` Pedro Alves
@ 2008-07-07 15:47                           ` Dmitry Smirnov
  2008-07-07 16:01                             ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-07-07 15:47 UTC (permalink / raw)
  To: gdb

-----Original Message-----
From: Pedro Alves <pedro@codesourcery.com>
To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
Date: Mon, 7 Jul 2008 15:28:57 +0100
Subject: Re: How to catch GDB crash

> I think you said earlier that when you switched to the other eclipse you
> had around that got over the "^running" problem, you weren't even able
> to inspect the stack or do any disassembling --  eclipse would get
> stuck on the "info threads" command.  I take it that since you
> report you can now retrieve variables etc, that the "info threads"
> issue is fixed?  I'll post this patch for review at gdb-patches@.

Perhaps "info threads" is fixed. But since both the "info threads" and "running" problems were [most probably] caused by the "main thread registering" fix, maybe it is better to investigate the "^running" problem before submission? What if they are connected? ;-)

I have to say that my goal is not just to report issues; I would like to help fix them. Unfortunately, I do not have much time to learn GDB, so I'm just asking for hints: what can I do to discover the root cause? For example, which code outputs "^running"? What functions/files should I debug to figure out the problem?

Dmitry


* Re: How to catch GDB crash
  2008-07-07  8:36                       ` Dmitry Smirnov
@ 2008-07-07 14:29                         ` Pedro Alves
  2008-07-07 15:47                           ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-07-07 14:29 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Monday 07 July 2008 09:35:43, Dmitry Smirnov wrote:
> Didn't have much time to dig deeper.
> It seems that Eclipse CDT debugger is confused with the delayed "running"
> response. 

Oh, sorry if I didn't make it clear.  This patch was to fix the "info threads"
issue you found, not the "^running" problem.

> CDT can connect to the remote debugger, retrieve the stack (it is
> absent at this point, in fact), retrieve variables, and disassemble. But when
> I try to run it ("-exec-continue") there is no response from gdb
> (i.e. "running") until it hits the breakpoint. Since in my case it takes
> significant time to get to the BP, Eclipse decides that the target is not
> responding and either terminates ("gdbServer" CDT debugger) or behaves
> oddly ("Hardware" CDT Debugger): it sends commands as usual but does not
> show the retrieved data (gdb itself responds correctly).
>

I think you said earlier that when you switched to the other eclipse you
had around that got over the "^running" problem, you weren't even able
to inspect the stack or do any disassembling --  eclipse would get
stuck on the "info threads" command.  I take it that since you
report you can now retrieve variables etc, that the "info threads"
issue is fixed?  I'll post this patch for review at gdb-patches@.

Thanks,

-- 
Pedro Alves


* Re: How to catch GDB crash
  2008-07-05  3:15                     ` Pedro Alves
@ 2008-07-07  8:36                       ` Dmitry Smirnov
  2008-07-07 14:29                         ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-07-07  8:36 UTC (permalink / raw)
  To: gdb

Didn't have much time to dig deeper.
It seems that the Eclipse CDT debugger is confused by the delayed "running" response.
CDT can connect to the remote debugger, retrieve the stack (it is absent at this point, in fact), retrieve variables, and disassemble. But when I try to run it ("-exec-continue") there is no response from gdb (i.e. "running") until it hits the breakpoint. Since in my case it takes significant time to get to the BP, Eclipse decides that the target is not responding and either terminates ("gdbServer" CDT debugger) or behaves oddly ("Hardware" CDT Debugger): it sends commands as usual but does not show the retrieved data (gdb itself responds correctly).

> > P.S. I'll test your patch a little bit later and come back with results.
> 
> 


* Re: How to catch GDB crash
  2008-07-02 12:51                   ` Re[2]: " Dmitry Smirnov
@ 2008-07-05  3:15                     ` Pedro Alves
  2008-07-07  8:36                       ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-07-05  3:15 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Wednesday 02 July 2008 13:50:00, Dmitry Smirnov wrote:

> I'm wondering why GDB is trying to get ThreadExtraInfo if stub has
> responded that it does not support threads? BTW, I didn't see this "Reply
> contains invalid hex digit 84" in older GDBs.

That's because versions of GDB until about a week ago didn't register
a main thread/task in the internal thread table if the target didn't
report thread support.  Now we're always registering a main
thread even in that case.  When you do "info threads", GDB queries
the target side for more info on each of the registered
threads.  Since previously there were 0 threads registered, there
were 0 qThreadExtraInfo queries performed.  Now, there's a thread, so
GDB core does 1 query.  But, since this thread was added internally
by GDB behind the remote side's back, we need to bar it just
before letting the query out to the remote side.  That's what
this new patch does.

> Does stub HAVE to support it?

No, it's optional.  But if the stub doesn't support it, it MUST
reply with an empty response (an empty C string, as for all
unsupported packets).  Failing to do that is what looks
like a bug in the simulator.
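
For illustration, here is a minimal sketch of how a stub could frame the empty reply. The "$payload#xx" framing with a two-digit modulo-256 checksum is the remote protocol's packet format; the function names and the tiny dispatcher `stub_reply` (which handles only "?") are hypothetical and not taken from any real stub.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sum of the payload bytes modulo 256, as used in remote protocol frames.  */
static unsigned char
rsp_checksum (const char *payload)
{
  unsigned char sum = 0;

  while (*payload != '\0')
    sum = (unsigned char) (sum + (unsigned char) *payload++);
  return sum;
}

/* Frame PAYLOAD as "$payload#xx" into OUT.  Returns the frame length, or
   -1 if OUT is too small.  */
static int
rsp_frame (char *out, size_t size, const char *payload)
{
  int n;

  assert (out != NULL && payload != NULL);
  n = snprintf (out, size, "$%s#%02x", payload,
                (unsigned int) rsp_checksum (payload));
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Hypothetical dispatcher: only "?" is handled; every packet the stub
   does not understand gets the empty reply.  */
static const char *
stub_reply (const char *request)
{
  if (strcmp (request, "?") == 0)
    return "S05";   /* stopped with signal 5 */
  return "";        /* unsupported packet */
}
```

Under these assumptions, an unsupported "qThreadExtraInfo,1" request is answered with the frame "$#00" rather than with arbitrary text such as "T1".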

> P.S. I'll test your patch a little bit later and come back with results.

Thanks!

-- 
Pedro Alves


* Re: How to catch GDB crash
  2008-07-02 11:05               ` Dmitry Smirnov
@ 2008-07-02 11:52                 ` Pedro Alves
  2008-07-02 12:51                   ` Re[2]: " Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-07-02 11:52 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

[-- Attachment #1: Type: text/plain, Size: 1554 bytes --]

Hi Dmitry, thanks for taking the time to investigate deeper.

A Wednesday 02 July 2008 12:05:28, Dmitry Smirnov wrote:
> Hi,
>
> Regarding the problem of -exec-run, I've suspended its investigation: I've
> found that 6.8.50.20080630 just didn't respond "running" and this seems
> reasonable. So, perhaps, previous version is misbehaving so causing Eclipse
> behave wrong way. Though, it is not clear why gdbServer CDT debugger also
> fails. Just postponed...
>

I don't know what to say about this failure.  The command was already
failing before, but it was outputting a spurious "^running".  Why
Eclipse is even trying to run a process with "target remote", I don't know.

> I've found another Eclipse CDT debugger variant that can run as I wish.
> And here I have a problem that I was reported to Pedro earlier: Eclipse is
> unable to disassemble the code, show variables, etc.
>
> I've noticed that GDB respond to "info threads" contains an error message.
> Below is the stack dump of this situation. I'm suspecting this respond
> prevents Eclipse from doing right.
> What is that "T1"?

I don't know.  It looks like a bug in the stub/simulator.
Do you have a way to activate remote protocol debugging?
"set debug remote 1" is the GDB command to use.
The stub embedded in the simulator you're using doesn't seem to
support any other thread related packets, and I would expect
it to reply "" to this packet too.

In the mean time, I think the attached patch (untested other than
building it) is sensible.  Could you give it a try?

-- 
Pedro Alves

[-- Attachment #2: dont_query_internal_threads.diff --]
[-- Type: text/x-diff, Size: 956 bytes --]

2008-07-02  Pedro Alves  <pedro@codesourcery.com>

	* remote.c (remote_threads_extra_info): Don't query the remote
	server about the <main> thread.

---
 gdb/remote.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: src/gdb/remote.c
===================================================================
--- src.orig/gdb/remote.c	2008-07-02 12:39:07.000000000 +0100
+++ src/gdb/remote.c	2008-07-02 12:50:31.000000000 +0100
@@ -2042,6 +2042,12 @@ remote_threads_extra_info (struct thread
     internal_error (__FILE__, __LINE__,
 		    _("remote_threads_extra_info"));
 
+  if (ptid_equal (tp->ptid, magic_null_ptid)
+      || (ptid_get_pid (tp->ptid) != 0 && ptid_get_tid (tp->ptid) == 0))
+    /* This is the main thread which was added by GDB.  The remote
+       doesn't know about it, so don't query about it.  */
+    return NULL;
+
   if (use_threadextra_query)
     {
       xsnprintf (rs->buf, get_remote_packet_size (), "qThreadExtraInfo,%lx",


* Re: How to catch GDB crash
  2008-06-30 15:58             ` Dmitry Smirnov
@ 2008-07-02 11:05               ` Dmitry Smirnov
  2008-07-02 11:52                 ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-07-02 11:05 UTC (permalink / raw)
  To: gdb

Hi,

Regarding the -exec-run problem, I've suspended its investigation: I found that 6.8.50.20080630 simply doesn't respond "running", and this seems reasonable. So perhaps the previous version was misbehaving and thereby causing Eclipse to behave incorrectly. It is still not clear why the gdbServer CDT debugger also fails, though. Just postponed...

I've found another Eclipse CDT debugger variant that can run as I wish.
And here I have the problem that I reported to Pedro earlier: Eclipse is unable to disassemble the code, show variables, etc.

I've noticed that GDB's response to "info threads" contains an error message. Below is the stack dump of this situation.
I suspect this response prevents Eclipse from working correctly.
What is that "T1"?
JIC, here is the console log of Eclipse:
 
&"info threads\n"
Reply contains invalid hex digit 84
info threads
&"warning: RMT ERROR : failed to get remote thread list.\n"
warning: RMT ERROR : failed to get remote thread list.
Reply contains invalid hex digit 84
~"* 1 Thread <main>"
* 1 Thread <main>&"Reply contains invalid hex digit 84\n"
553^error,msg="Reply contains invalid hex digit 84"


Breakpoint 1, fromhex (a=84) at remote.c:3007
3007        error (_("Reply contains invalid hex digit %d"), a);
(gdb) bt
#0  fromhex (a=84) at remote.c:3007
#1  0x004ea0db in hex2bin (hex=0x1004ace8 "T1", bin=0x697701 "", count=1)
    at remote.c:3023
#2  0x004ebe3a in remote_threads_extra_info (tp=0x1170cc60) at remote.c:2054
#3  0x0047c53f in print_thread_info (uiout=0x1006b278, requested_thread=-1)
    at thread.c:537
#4  0x00402473 in execute_command (p=0x22c6bc "", from_tty=1) at top.c:466
#5  0x00411722 in catch_exception (uiout=0x1006b278,
    func=0x48e3c0 <do_captured_execute_command>, func_args=0x22c6d8, mask=6)
    at exceptions.c:463
#6  0x0048e466 in cli_interpreter_exec (data=0x0,
    command_str=0x10ed21c0 "info threads") at .././gdb/cli/cli-interp.c:130
#7  0x0041ad3b in interp_exec (interp=0x1006b2e0,
    command_str=0x10ed21c0 "info threads") at interps.c:325
#8  0x0048cf39 in mi_cmd_interpreter_exec (
    command=0x64d5fb "-interpreter-exec", argv=0x22c7d8, argc=2)
    at .././gdb/mi/mi-interp.c:209
#9  0x0050626a in captured_mi_execute_command (uiout=0x1006bc58,
    data=0x11572ce0) at .././gdb/mi/mi-main.c:974
#10 0x00411722 in catch_exception (uiout=0x1006bc58,
    func=0x505f30 <captured_mi_execute_command>, func_args=0x11572ce0, mask=6)
    at exceptions.c:463
#11 0x00505d23 in mi_execute_command (cmd=0x1170ccb8 "551 info threads",
    from_tty=1) at .././gdb/mi/mi-main.c:1026
#12 0x0048d019 in mi_execute_command_wrapper (
    cmd=0x1170ccb8 "551 info threads") at .././gdb/mi/mi-interp.c:254
#13 0x0043703a in handle_file_event (event_file_desc=0) at event-loop.c:732
#14 0x00436aa2 in process_event () at event-loop.c:341
#15 0x00437385 in gdb_do_one_event (data=0x0) at event-loop.c:378
#16 0x0041196b in catch_errors (func=0x437200 <gdb_do_one_event>,
    func_args=0x0, errstring=0x607b30 "", mask=6) at exceptions.c:509
#17 0x00436af4 in start_event_loop () at event-loop.c:404
#18 0x004010ab in captured_command_loop (data=0x0) at .././gdb/main.c:104
#19 0x0041196b in catch_errors (func=0x4010a0 <captured_command_loop>,
    func_args=0x0, errstring=0x5f7139 "", mask=6) at exceptions.c:509
#20 0x00401c94 in captured_main (data=0x22cc40) at .././gdb/main.c:891
#21 0x0041196b in catch_errors (func=0x4010f0 <captured_main>,
    func_args=0x22cc40, errstring=0x5f7139 "", mask=6) at exceptions.c:509
#22 0x00402153 in gdb_main (args=0x22cc40) at .././gdb/main.c:900
#23 0x0040109b in main (argc=6, argv=0x1002aee8) at gdb.c:33
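
A note on the number in the message: 84 is the decimal ASCII code of 'T', the first character of the "T1" reply, and 'T' is not a hex digit. The qThreadExtraInfo reply is expected to be hex-encoded text, so decoding fails on the very first character. The sketch below reproduces that behaviour in isolation; `hex_digit_value` and `decode_hex` are hypothetical helpers written for this illustration, not GDB's fromhex/hex2bin.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if C is not a hex digit.  */
static int
hex_digit_value (int c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Decode up to COUNT bytes from the hex pairs in HEX into BIN.  Returns
   the number of bytes decoded, or -1 with *BAD set to the offending
   character code.  */
static int
decode_hex (const char *hex, unsigned char *bin, size_t count, int *bad)
{
  size_t i;

  assert (hex != NULL && bin != NULL && bad != NULL);
  for (i = 0; i < count; i++)
    {
      int hi, lo;

      if (hex[0] == '\0' || hex[1] == '\0')
        break;                  /* ran out of input */
      hi = hex_digit_value ((unsigned char) hex[0]);
      if (hi < 0)
        {
          *bad = (unsigned char) hex[0];
          return -1;
        }
      lo = hex_digit_value ((unsigned char) hex[1]);
      if (lo < 0)
        {
          *bad = (unsigned char) hex[1];
          return -1;
        }
      bin[i] = (unsigned char) (hi * 16 + lo);
      hex += 2;
    }
  return (int) i;
}
```

Decoding "T1" fails with the offending code 84, whereas a well-formed reply such as "6d61696e" decodes to the four bytes of "main".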

-----Original Message-----
From: Dmitry Smirnov <divis1969@mail.ru>
To: gdb@sourceware.org
Date: Mon, 30 Jun 2008 19:56:56 +0400
Subject: Re: How to catch GDB crash

> 
> Hi,
> 
> I've upgraded to gdb-6.8.50.20080630 and tested it.
> Unfortunately, I was totally unable to run my test case from Eclipse CDT.
> I've figured out that the problem is what GDB responds to exec-run command.
> 
> Here is the Eclipse session log (it shows requests to GDB and responses)
> 
> 84-exec-run
> 84^error,msg="Don't know how to run.  Try \"help target\"."
> 
> I've tried older version of arm-elf-gdb (6.5):
> 
> 123-exec-run
> 123^running
> &"Don't know how to run.  Try \"help target\".\n"
> Don't know how to run.  Try "help target".
> 123^error,msg="Don't know how to run.  Try \"help target\"."
> 
> It seems that responses differs in 123^running (or maybe the order too?).
> 
> Is it possible to fix this?
> 
> Dmitry
> 
> 
> -----Original Message-----
> From: Dmitry Smirnov <divis1969@mail.ru>
> To: Pedro Alves <pedro@codesourcery.com>
> Date: Thu, 26 Jun 2008 18:32:46 +0400
> Subject: Re: How to catch GDB crash
> 
> > 
> > Ok, you'd convinced me :-)
> > You are right, the last resume() is executed with stepping_over_breakpoint equal to 1 whereas previously it was 0.
> > 
> > I'll try your patch (I can find it in gdb-patches, right?).
> > 
> > Dmitry
> > 
> > -----Original Message-----
> > From: Pedro Alves <pedro@codesourcery.com>
> > To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
> > Date: Thu, 26 Jun 2008 15:20:54 +0100
> > Subject: Re: How to catch GDB crash
> > 
> > > 
> > > A Thursday 26 June 2008 14:56:26, Dmitry Smirnov wrote:
> > > > I still have some doubts :-)
> > > >
> > > 
> > > Well, the doubts would go away if you tried the patches.  :-)
> > > 
> 



* Re: How to catch GDB crash
  2008-07-01 11:38       ` Vladimir Prus
@ 2008-07-01 11:41         ` Pedro Alves
  0 siblings, 0 replies; 30+ messages in thread
From: Pedro Alves @ 2008-07-01 11:41 UTC (permalink / raw)
  To: Vladimir Prus; +Cc: gdb, Dmitry Smirnov

A Tuesday 01 July 2008 12:37:35, Vladimir Prus wrote:

> I think that for the time being, we can change the assert to check that
> either the program is single-threaded, or the thread is known. If the
> program is single-threaded, and the thread id is not registered, omitting
> thread-id completely seems right.
>
> I can make this change, or you have something already?
>

Please go ahead.  Thanks.

-- 
Pedro Alves



* Re: How to catch GDB crash
  2008-06-25 23:28     ` Pedro Alves
  2008-06-26 13:56       ` Dmitry Smirnov
@ 2008-07-01 11:38       ` Vladimir Prus
  2008-07-01 11:41         ` Pedro Alves
  1 sibling, 1 reply; 30+ messages in thread
From: Vladimir Prus @ 2008-07-01 11:38 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb, Dmitry Smirnov

On Thursday 26 June 2008 03:27:58 Pedro Alves wrote:
> A Wednesday 25 June 2008 09:02:33, Dmitry Smirnov wrote:
> > Hi Pedro,
> >
> > I'll try to figure out, whether skyeye (which is remote target) supports
> > notion of thread ids or pids. Now I just suppose it does not support.
> > Nevertheless, I do not believe this is related to a crash.
> 
> Yes it is.  :-)
> 
> >
> > As I said previously, I was debugging this program (ARM code) for some time
> > previously. 
> 
> But you've certainly upgraded your GDB recently (I can tell by your log
> output on your original post).  As I said, this is a recently introduced
> regression.
> 
> I was able to reproduce the problem, by connecting to a local
> gdbserver with a GDB with all thread support hacked out in the
> remote target.
> 
> > BTW, I've just realized that command-line interface does not use mi_*
> > interface (neither mi_on_resume nor mi_execute_command were hit) and this
> > is most likely the reason why I cannot reproduce this test case with CLI.
> >
> 
> Yes, that's exactly the reason.
> 
> Anyway, I've posted a patch that fixes the issue in your case
> (it was actually a side effect of something else I was doing),
> although we may need to get rid of the assert you weren't tripping
> at for the time being (there are other targets other than
> remote that will also trip on the assert).
> 
> Vladimir, not sure if you noticed the issue, as it's buried in
> this long thread?  We can always leave the crash in place to
> force targets to follow our evil plot of always registering the
> main thread.  :-)
> I'd post a patch for it, but I don't know if we should output
> thread-id=0 in that case, or not output thread-id
> at all ...

I think that for the time being, we can change the assert to check that
either the program is single-threaded, or the thread is known. If the
program is single-threaded, and the thread id is not registered, omitting
thread-id completely seems right.

I can make this change, or do you have something already?

- Volodya



* Re: How to catch GDB crash
  2008-06-26 14:33           ` Dmitry Smirnov
@ 2008-06-30 15:58             ` Dmitry Smirnov
  2008-07-02 11:05               ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-30 15:58 UTC (permalink / raw)
  To: gdb

Hi,

I've upgraded to gdb-6.8.50.20080630 and tested it.
Unfortunately, I was totally unable to run my test case from Eclipse CDT.
I've figured out that the problem is how GDB responds to the exec-run command.

Here is the Eclipse session log (it shows requests to GDB and responses)

84-exec-run
84^error,msg="Don't know how to run.  Try \"help target\"."

I've tried an older version of arm-elf-gdb (6.5):

123-exec-run
123^running
&"Don't know how to run.  Try \"help target\".\n"
Don't know how to run.  Try "help target".
123^error,msg="Don't know how to run.  Try \"help target\"."

It seems that the responses differ in 123^running (or maybe in the order too?).

Is it possible to fix this?
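
To show why the extra record matters to a front end, here is a rough sketch of classifying MI result records of the form "[token]^class[,results]". `struct mi_result` and `parse_mi_result` are hypothetical names for this illustration and have nothing to do with how Eclipse CDT actually parses the output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parsed head of an MI result record.  TOKEN is -1 when absent.  */
struct mi_result { long token; char klass[16]; };

/* Parse "[token]^class[,results]".  Returns 0 on success, -1 if LINE is
   not a result record (for example a "&" log stream line).  */
static int
parse_mi_result (const char *line, struct mi_result *out)
{
  char *end;
  size_t len;

  assert (line != NULL && out != NULL);
  out->token = -1;
  if (*line >= '0' && *line <= '9')
    {
      out->token = strtol (line, &end, 10);
      line = end;
    }
  if (*line != '^')
    return -1;
  line++;
  len = strcspn (line, ",\r\n");   /* class ends at ',' or end of line */
  if (len == 0 || len >= sizeof out->klass)
    return -1;
  memcpy (out->klass, line, len);
  out->klass[len] = '\0';
  return 0;
}
```

Fed the 6.5 log above, such a parser sees two result records for token 123 ("running", then "error"), while the newer GDB produces a single "error" record for token 84.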

Dmitry

-----Original Message-----
From: Dmitry Smirnov <divis1969@mail.ru>
To: Pedro Alves <pedro@codesourcery.com>
Date: Thu, 26 Jun 2008 18:32:46 +0400
Subject: Re: How to catch GDB crash

> 
> Ok, you'd convinced me :-)
> You are right, the last resume() is executed with stepping_over_breakpoint equal to 1 whereas previously it was 0.
> 
> I'll try your patch (I can find it in gdb-patches, right?).
> 
> Dmitry
> 
> -----Original Message-----
> From: Pedro Alves <pedro@codesourcery.com>
> To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
> Date: Thu, 26 Jun 2008 15:20:54 +0100
> Subject: Re: How to catch GDB crash
> 
> > 
> > A Thursday 26 June 2008 14:56:26, Dmitry Smirnov wrote:
> > > I still have some doubts :-)
> > >
> > 
> > Well, the doubts would go away if you tried the patches.  :-)
> > 


* Re: How to catch GDB crash
  2008-06-26 14:21         ` Pedro Alves
@ 2008-06-26 14:33           ` Dmitry Smirnov
  2008-06-30 15:58             ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-26 14:33 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

Ok, you've convinced me :-)
You are right, the last resume() is executed with stepping_over_breakpoint equal to 1 whereas previously it was 0.

I'll try your patch (I can find it in gdb-patches, right?).

Dmitry

-----Original Message-----
From: Pedro Alves <pedro@codesourcery.com>
To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
Date: Thu, 26 Jun 2008 15:20:54 +0100
Subject: Re: How to catch GDB crash

> 
> A Thursday 26 June 2008 14:56:26, Dmitry Smirnov wrote:
> > I still have some doubts :-)
> >
> 
> Well, the doubts would go away if you tried the patches.  :-)
> 
> I'm hoping to get to commit them today, though...
> 
> > Below is a new log of my debug session. I've set the same
> > mi_execute_command and mi_on_resume. Last one prints the value of
> > 'inferior_ptid' when hit. Also, from Eclipse I've issued the command 'ni'
> > before 'c'. As you can see, 'inferior_ptid' it is equal to {pid = 42000,
> > lwp = 0, tid = 0} all the time whereas mi_on_resume is called with {pid =
> > -1, lwp = 0, tid = 0} in all cases except last one.
> >
> > To my mind this indicates that while executing the last 'ni', the function resume()
> > in file infrun.c takes a different path and assigns 'inferior_ptid' to
> > 'resume_ptid' instead of the default RESUME_ALL.
> 
> Did you actually look at the function that is asserting?  Here it is again:
> 
> static void
> mi_on_resume (ptid_t ptid)
> {
>   if (PIDGET (ptid) == -1)
>     fprintf_unfiltered (raw_stdout, "*running,thread-id=\"all\"\n");
>   else
>     {
>       struct thread_info *ti = find_thread_pid (ptid);
>       gdb_assert (ti);
>       fprintf_unfiltered (raw_stdout, "*running,thread-id=\"%d\"\n", ti->num);
>     }
> }
> 
> Calling the resume functions with {-1,0,0} means "let all threads execute",
> while {42000,0,0} means "let only this thread execute".  This last
> case happens normally when GDB is trying to step over a breakpoint:
> 
> - remove breakpoints
> - step only the thread of interest, leaving others stopped, so if they happen 
> to be executing the same code, they don't miss the breakpoint
> - reinsert breakpoints
> - now safe to resume all threads
> 
> It just happens that in your case there's only ever one "thread",
> but the core of inferior control in GDB doesn't care and sends {42000,0,0}
> anyway.  The assert is there because this function assumes threads are
> always registered in the thread table, which is unfortunately still not
> true for all of GDB's supported targets.
> 
> > Breakpoint 1, mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
> >     at .././gdb/mi/mi-interp.c:335
> > 335       if (PIDGET (ptid) == -1)
> > $3 = {pid = 42000, lwp = 0, tid = 0}
> >
> > Program exited with code 037777777777.
> > (gdb)
> 
> -- 
> Pedro Alves
> 



* Re: How to catch GDB crash
  2008-06-26 13:56       ` Dmitry Smirnov
@ 2008-06-26 14:21         ` Pedro Alves
  2008-06-26 14:33           ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-06-26 14:21 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Thursday 26 June 2008 14:56:26, Dmitry Smirnov wrote:
> I still have some doubts :-)
>

Well, the doubts would go away if you tried the patches.  :-)

I'm hoping to get to commit them today, though...

> Below is a new log of my debug session. I've set the same
> mi_execute_command and mi_on_resume. Last one prints the value of
> 'inferior_ptid' when hit. Also, from Eclipse I've issued the command 'ni'
> before 'c'. As you can see, 'inferior_ptid' is equal to {pid = 42000,
> lwp = 0, tid = 0} all the time whereas mi_on_resume is called with {pid =
> -1, lwp = 0, tid = 0} in all cases except last one.
>
> To my mind this indicates that while executing the last 'ni', resume()
> in infrun.c takes a different path and assigns 'inferior_ptid' to
> 'resume_ptid' instead of the default RESUME_ALL.

Did you actually look at the function that is asserting?  Here it is again:

static void
mi_on_resume (ptid_t ptid)
{
  if (PIDGET (ptid) == -1)
    fprintf_unfiltered (raw_stdout, "*running,thread-id=\"all\"\n");
  else
    {
      struct thread_info *ti = find_thread_pid (ptid);
      gdb_assert (ti);
      fprintf_unfiltered (raw_stdout, "*running,thread-id=\"%d\"\n", ti->num);
    }
}

Calling the resume functions with {-1,0,0} means "let all threads execute",
while {42000,0,0} means "let only this thread execute".  This last
case happens normally when GDB is trying to step over a breakpoint:

- remove breakpoints
- step only the thread of interest, leaving others stopped, so if they happen 
to be executing the same code, they don't miss the breakpoint
- reinsert breakpoints
- now safe to resume all threads

It just happens that in your case there's only ever one "thread",
but the core of inferior control in GDB doesn't care and sends {42000,0,0}
anyway.  The assert is there because this function assumes threads are
always registered in the thread table, which is unfortunately still not
true for all of GDB's supported targets.
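
The step-over sequence listed above can be sketched as a small model.
This is only an illustration, not the code in infrun.c: struct ptid,
struct event and plan_resume() are hypothetical names, and the model
merely records which ptid each stage would be applied to.

```c
#include <stddef.h>

/* Simplified stand-in for GDB's ptid_t (hypothetical layout).  */
struct ptid { int pid; long lwp; long tid; };

enum action { REMOVE_BREAKPOINTS, STEP_THREAD, INSERT_BREAKPOINTS, RESUME };

struct event { enum action action; struct ptid ptid; };

/* Record the stages needed to resume THREAD into OUT, which must have
   room for 4 entries.  Returns the number of entries written.  */
size_t
plan_resume (struct ptid thread, int stopped_at_breakpoint,
             struct event *out)
{
  const struct ptid all = { -1, 0, 0 };   /* "let all threads execute" */
  size_t n = 0;

  if (stopped_at_breakpoint)
    {
      /* Lift the breakpoints, step only the thread of interest,
         then put the breakpoints back.  */
      out[n++] = (struct event) { REMOVE_BREAKPOINTS, all };
      out[n++] = (struct event) { STEP_THREAD, thread };
      out[n++] = (struct event) { INSERT_BREAKPOINTS, all };
    }

  /* Now it is safe to resume every thread.  */
  out[n++] = (struct event) { RESUME, all };
  return n;
}
```

With thread = {42000,0,0} and stopped_at_breakpoint set, the second
recorded stage carries {42000,0,0}, which is the value that reaches the
resume observer in the failing case.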

> Breakpoint 1, mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
>     at .././gdb/mi/mi-interp.c:335
> 335       if (PIDGET (ptid) == -1)
> $3 = {pid = 42000, lwp = 0, tid = 0}
>
> Program exited with code 037777777777.
> (gdb)

-- 
Pedro Alves



* Re: How to catch GDB crash
  2008-06-25 23:28     ` Pedro Alves
@ 2008-06-26 13:56       ` Dmitry Smirnov
  2008-06-26 14:21         ` Pedro Alves
  2008-07-01 11:38       ` Vladimir Prus
  1 sibling, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-26 13:56 UTC (permalink / raw)
  To: gdb

I still have some doubts :-)

Below is a new log of my debug session. I've set the same breakpoints at mi_execute_command and mi_on_resume. The last one prints the value of 'inferior_ptid' when hit. Also, from Eclipse I've issued the command 'ni' before 'c'. As you can see, 'inferior_ptid' is equal to {pid = 42000, lwp = 0, tid = 0} all the time, whereas mi_on_resume is called with {pid = -1, lwp = 0, tid = 0} in all cases except the last one.

To my mind this indicates that while executing the last 'ni', resume() in infrun.c takes a different path and assigns 'inferior_ptid' to 'resume_ptid' instead of the default RESUME_ALL.

Digging...

Dmitry

GNU gdb 6.3.50_2004-12-28-cvs (cygwin-special)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-cygwin".
(gdb) attach 2348
Attaching to process 2348
Reading symbols from /cygdrive/d/Install/GDB/gdb-6.8.50.20080620/gdb-6.8.50.2008
0620/gdb/gdb.exe...done.
[Switching to thread 2348.0x15bc]
(gdb) info b
No breakpoints or watchpoints.
(gdb) b mi_on_resume
Breakpoint 1 at 0x48cf46: file .././gdb/mi/mi-interp.c, line 335.
(gdb) b mi_execute_command
Breakpoint 2 at 0x505537: file .././gdb/mi/mi-main.c, line 1135.
(gdb) commands 1
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>print inferior_ptid
>c
>end
(gdb) commands 2
Type commands for when breakpoint 2 is hit, one per line.
End with a line saying just "end".
>c
>end
(gdb) ni
0x7c9507a8 in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507bb in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507bf in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507c1 in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
[Switching to thread 2348.0x740]

Breakpoint 2, mi_execute_command (cmd=0x138f74a0 "303 ni", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 1, mi_on_resume (ptid={pid = -1, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:335
335       if (PIDGET (ptid) == -1)
$1 = {pid = 42000, lwp = 0, tid = 0}

Breakpoint 2, mi_execute_command (cmd=0x1393df78 "304 info threads",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x1393dfd0 "305-stack-info-depth",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x1395a678 "306-stack-list-frames 0 1",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x1395a6d0 "307-data-list-changed-registers", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x13962898 "308 info sharedlibrary",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x139628f0 "309-stack-list-arguments 0 0 0", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x13927738 "310-stack-list-locals 0",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x13927790 "311-interpreter-exec console \"info b\"", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x13977ce8 "312 c", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 1, mi_on_resume (ptid={pid = -1, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:335
335       if (PIDGET (ptid) == -1)
$2 = {pid = 42000, lwp = 0, tid = 0}

Breakpoint 2, mi_execute_command (cmd=0x13974bd8 "313 info threads",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x13980e98 "314-stack-info-depth",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x14770128 "315-stack-list-frames 0 11", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x14770010 "316-data-list-changed-registers", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x10ef8958 "317 info sharedlibrary",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x10ef89b0 "318-data-disassemble -f /c/p4/views/KSW_S4000_WP_DEV_unknown
/qct/drivers/parb/parb.c -l 333 -n 100 -- 1", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x1476fe18 "319-data-disassemble -s 0x8c4a8e -e 0x8c4af2 -- 0",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x1172a8d0 "320-stack-list-arguments 0 0 0", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x11738a00 "321-stack-list-locals 0",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x1174b920 "322 whatis i", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x11758f10 "323 whatis rt_status",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x11761360 "324 whatis __result",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x11761548 "325-var-create - * i",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x117616e8 "326-var-evaluate-expression var1", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x117c3978 "327-var-create - * rt_status", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x11785568 "328-var-evaluate-expression var2", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x1178b4b8 "329-var-create - * __result", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (
    cmd=0x1178b550 "330-var-evaluate-expression var3", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 2, mi_execute_command (cmd=0x11795130 "331 ni", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {

Breakpoint 1, mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:335
335       if (PIDGET (ptid) == -1)
$3 = {pid = 42000, lwp = 0, tid = 0}

Program exited with code 037777777777.
(gdb)
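
A side note on the exit code in the log above: the leading 0 marks it as
octal, and 037777777777 is a 32-bit value with all bits set, i.e. -1 when
read as a signed number. The helper below is a hypothetical illustration
of that arithmetic only; it is not part of GDB.

```c
#include <stdint.h>

/* Reinterpret a 32-bit exit status as a signed value without relying
   on implementation-defined unsigned-to-signed conversion.  */
int32_t
exit_status_as_signed (uint32_t status)
{
  if (status <= (uint32_t) INT32_MAX)
    return (int32_t) status;

  /* Map the upper half of the unsigned range onto the negative
     numbers: UINT32_MAX becomes -1, 0x80000000 becomes INT32_MIN.  */
  return -(int32_t) (UINT32_MAX - status) - 1;
}
```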



* Re: How to catch GDB crash
  2008-06-25  8:03   ` Dmitry Smirnov
@ 2008-06-25 23:28     ` Pedro Alves
  2008-06-26 13:56       ` Dmitry Smirnov
  2008-07-01 11:38       ` Vladimir Prus
  0 siblings, 2 replies; 30+ messages in thread
From: Pedro Alves @ 2008-06-25 23:28 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov, Vladimir Prus

A Wednesday 25 June 2008 09:02:33, Dmitry Smirnov wrote:
> Hi Pedro,
>
> I'll try to figure out whether skyeye (which is the remote target) supports
> the notion of thread ids or pids. For now I assume it does not.
> Nevertheless, I do not believe this is related to the crash.

Yes it is.  :-)

>
> As I said, I have been debugging this program (ARM code) for some
> time.

But you've certainly upgraded your GDB recently (I can tell by your log
output on your original post).  As I said, this is a recently introduced
regression.

I was able to reproduce the problem by connecting to a local
gdbserver with a GDB with all thread support hacked out in the
remote target.

> BTW, I've just realized that command-line interface does not use mi_*
> interface (neither mi_on_resume nor mi_execute_command were hit) and this
> is most likely the reason why I cannot reproduce this test case with CLI.
>

Yes, that's exactly the reason.

Anyway, I've posted a patch that fixes the issue in your case
(it was actually a side effect of something else I was doing),
although for the time being we may still need to get rid of the
assert you were tripping on (targets other than remote will
also trip on it).

Vladimir, not sure if you noticed the issue, as it's buried in
this long thread?  We can always leave the crash in place to
force targets to follow our evil plot of always registering the
main thread.  :-)
I'd post a patch for it, but I don't know if we should output
thread-id=0 in that case, or not output thread-id
at all ...
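
To make the second option concrete, here is a minimal sketch of a
notification that omits thread-id when the thread is not registered,
instead of asserting. It is not the posted patch: struct ptid,
lookup_thread() and format_running() are hypothetical stand-ins, and
whether a bare "*running" record is acceptable to frontends is exactly
the open question.

```c
#include <stdio.h>
#include <stddef.h>

/* Simplified stand-ins for GDB's types (hypothetical layout).  */
struct ptid { int pid; long lwp; long tid; };
struct thread_info { struct ptid ptid; int num; };

/* Look PTID up in TABLE.  Returns NULL when it is not registered.  */
const struct thread_info *
lookup_thread (const struct thread_info *table, size_t count,
               struct ptid ptid)
{
  for (size_t i = 0; i < count; i++)
    if (table[i].ptid.pid == ptid.pid
        && table[i].ptid.lwp == ptid.lwp
        && table[i].ptid.tid == ptid.tid)
      return &table[i];
  return NULL;
}

/* Write the resume notification for PTID into BUF and return the
   value snprintf reports.  */
int
format_running (char *buf, size_t size,
                const struct thread_info *table, size_t count,
                struct ptid ptid)
{
  if (ptid.pid == -1)
    return snprintf (buf, size, "*running,thread-id=\"all\"\n");

  const struct thread_info *ti = lookup_thread (table, count, ptid);
  if (ti == NULL)
    /* Unregistered thread: report without a thread id rather than
       aborting the debugger.  */
    return snprintf (buf, size, "*running\n");

  return snprintf (buf, size, "*running,thread-id=\"%d\"\n", ti->num);
}
```

The other option, printing thread-id="0", would only change the string
in the NULL branch.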

-- 
Pedro Alves


* Re: How to catch GDB crash
  2008-06-24 17:29 ` Pedro Alves
@ 2008-06-25  8:03   ` Dmitry Smirnov
  2008-06-25 23:28     ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-25  8:03 UTC (permalink / raw)
  To: gdb

Hi Pedro,

I'll try to figure out whether skyeye (which is the remote target) supports the notion of thread ids or pids. For now I assume it does not.
Nevertheless, I do not believe this is related to the crash.

As I said, I have been debugging this program (ARM code) for some time. It is a very complex program consisting of several ELF files, and I debugged a couple of others before this one. I've used 'ni' and/or -exec-next-instruction a lot and didn't see many crashes. I remember seeing a couple of them in a similar scenario (when I set a breakpoint and ran until it was hit), but I eliminated them by moving the breakpoint to another location.
It is just a feeling, no more, that the crash is somehow connected with the stack trace: previously I didn't see deep stack traces. In most cases there were 2 or 3 frames (in fact, the call stack should have had more frames, but GDB was not able to detect them correctly; this may be caused by assembler code that does not follow the ABI).

BTW, I've just realized that command-line interface does not use mi_* interface (neither mi_on_resume nor mi_execute_command were hit) and this is most likely the reason why I cannot reproduce this test case with CLI.

Dmitry

-----Original Message-----
From: Pedro Alves <pedro@codesourcery.com>
To: gdb@sourceware.org,  Dmitry Smirnov <divis1969@mail.ru>
Date: Tue, 24 Jun 2008 18:29:28 +0100
Subject: Re: How to catch GDB crash

> 
> A Tuesday 24 June 2008 18:02:49, Dmitry Smirnov wrote:
> > As you can see, first time 
> > mi_on_resume is called with ptid={pid = -1, lwp = 0, tid = 0}. Something
> > happens between this call and second call where ptid={pid = 42000, lwp = 0,
> > tid = 0}.
> >
> > I'm going to figure out which command is followed by wrong pid. I'm
> > suspecting memory corruption.
> 
> This pid is used by GDB internally when you are connected to a target
> that does not support threads at all, like small embedded systems.
> If the remote side doesn't have a notion of thread ids or pids, that
> magic number is what GDB will use internally.  This code path can
> be triggered for example while stepping over a breakpoint (doing
> nexti / -exec-next-instruction while stopped at a breakpoint).
> 
> > BTW, command 'info threads' gives me
> >   (gdb) info threads
> >   warning: RMT ERROR : failed to get remote thread list.
> 
> ... which this seems to corroborate.
> 
> This is a new bug in GDB that was introduced recently.
> 
> I've already said in my last reply what needs to be done
> to fix it.
> 
> If your target *does* support threads and the remote protocol
> thread related packets, then there's yet another bug somewhere
> else.
> 
> -- 
> Pedro Alves
> 


* Re: How to catch GDB crash
  2008-06-24 17:03 Dmitry Smirnov
@ 2008-06-24 17:29 ` Pedro Alves
  2008-06-25  8:03   ` Dmitry Smirnov
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-06-24 17:29 UTC (permalink / raw)
  To: gdb, Dmitry Smirnov

A Tuesday 24 June 2008 18:02:49, Dmitry Smirnov wrote:
> As you can see, first time 
> mi_on_resume is called with ptid={pid = -1, lwp = 0, tid = 0}. Something
> happens between this call and second call where ptid={pid = 42000, lwp = 0,
> tid = 0}.
>
> I'm going to figure out which command is followed by wrong pid. I'm
> suspecting memory corruption.

This pid is used by GDB internally when you are connected to a target
that does not support threads at all, like small embedded systems.
If the remote side doesn't have a notion of thread ids or pids, that
magic number is what GDB will use internally.  This code path can
be triggered for example while stepping over a breakpoint (doing
nexti / -exec-next-instruction while stopped at a breakpoint).
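
As a rough illustration of how the pid field can be read, assuming the
values seen in the logs of this thread (-1 and 42000) and a simplified
ptid layout; the enum and function names are made up for this sketch
and do not exist in GDB:

```c
/* Simplified stand-in for GDB's ptid_t (hypothetical layout).  */
struct ptid { int pid; long lwp; long tid; };

#define RESUME_ALL_PID  (-1)
#define MAGIC_NULL_PID  42000   /* value observed in the logs above */

enum resume_target
{
  RESUME_ALL_THREADS,        /* {-1,0,0}: every thread may run */
  RESUME_PLACEHOLDER_THREAD, /* {42000,0,0}: target has no thread ids */
  RESUME_ONE_THREAD          /* anything else: a real thread */
};

/* Classify PTID by its pid field only.  */
enum resume_target
classify_resume_ptid (struct ptid ptid)
{
  if (ptid.pid == RESUME_ALL_PID)
    return RESUME_ALL_THREADS;
  if (ptid.pid == MAGIC_NULL_PID)
    return RESUME_PLACEHOLDER_THREAD;
  return RESUME_ONE_THREAD;
}
```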

> BTW, command 'info threads' gives me
>   (gdb) info threads
>   warning: RMT ERROR : failed to get remote thread list.

... which this seems to corroborate.

This is a new bug in GDB that was introduced recently.

I've already said in my last reply what needs to be done
to fix it.

If your target *does* support threads and the remote protocol
thread related packets, then there's yet another bug somewhere
else.

-- 
Pedro Alves


* Re: How to catch GDB crash
@ 2008-06-24 17:03 Dmitry Smirnov
  2008-06-24 17:29 ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-24 17:03 UTC (permalink / raw)
  To: gdb

Finally, I was able to gather some logs from Cygwin GDB. For this, I've switched debug configuration in Eclipse from gdbServer to Cygwin GDB. It differs in how it starts the program. gdbServer connects to remote target by itself, whereas Cygwin GDB allows me to do it manually from its console.
Below is the log. I've set an additional breakpoint at mi_execute_command to see what commands are issued by Eclipse.
As you can see, first time mi_on_resume is called with ptid={pid = -1, lwp = 0, tid = 0}. Something happens between this call and second call where ptid={pid = 42000, lwp = 0, tid = 0}.

I'm going to figure out which command is followed by wrong pid. I'm suspecting memory corruption.

I have to note that I didn't see some of these commands while debugging gdbServer as a Java code (CDT debugger) from Eclipse. I've seen the following commands that are common for both cases:
info threads
-stack-info-depth
-stack-list-frames
-data-list-changed-registers
-stack-list-arguments 0 0 0
info signal SIGHUP
-data-disassemble -f <my_file> -l 333 -n 100 -- 1
-data-disassemble -s 0x8c4a8e -e 0x8c4af2 -- 0
-stack-list-locals 0

I will try to reproduce these commands from the console (without Eclipse), so if you know the equivalent commands for the GDB console, please let me know.

BTW, command 'info threads' gives me 
  (gdb) info threads
  warning: RMT ERROR : failed to get remote thread list.

Dmitry

(gdb) attach 4760
Attaching to program `/cygdrive/d/Install/GDB/gdb-6.8.50.20080620/gdb-6.8.50.200
80620/gdb/gdb.exe', process 4760
[Switching to thread 4760.0xcbc]
(gdb) ni
0x7c9507a8 in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507bb in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507bf in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
0x7c9507c1 in ntdll!KiIntSystemCall () from /c/WINDOWS/system32/ntdll.dll
(gdb)
[Switching to thread 4760.0x1150]

Breakpoint 5, mi_execute_command (
    cmd=0x113a4610 "182-interpreter-exec console \"info b\"", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x1392c8c8 "183-exec-continue",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 4, mi_on_resume (ptid={pid = -1, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:335
335       if (PIDGET (ptid) == -1)
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x138c2700 "184 info threads",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x13931a38 "185-stack-info-depth",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x10d2abf0 "186-stack-list-frames 0 11", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x10c3d120 "187-data-list-changed-registers", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10c3d2c0 "188 info sharedlibrary",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10cfd660 "189 info signal SIGHUP",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x1144e0a8 "190-data-disassemble -f <my_file> -l 333 -n 100 -- 1", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x10d438b8 "191-stack-list-arguments 0 0 0", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x10d2a770 "192-data-disassemble -s 0x8c4a8e -e 0x8c4af2 -- 0",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10d2a820 "193-stack-list-locals 0",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10d2a8d0 "194 whatis i", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10f3ea98 "195 whatis rt_status",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10f34780 "196 whatis __result",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (cmd=0x10f3e748 "197-var-create - * i",
    from_tty=1) at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x10f39150 "198-var-evaluate-expression var1", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x1131f9e8 "199-var-create - * rt_status", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x11244350 "200-var-evaluate-expression var2", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x1127bb70 "201-var-create - * __result", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x1127bdc8 "202-var-evaluate-expression var3", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb)
Continuing.

Breakpoint 5, mi_execute_command (
    cmd=0x112d6500 "203-exec-next-instruction 1", from_tty=1)
    at .././gdb/mi/mi-main.c:1135
1135    {
(gdb) c
Continuing.

Breakpoint 4, mi_on_resume (ptid={pid = 42000, lwp = 0, tid = 0})
    at .././gdb/mi/mi-interp.c:335
335       if (PIDGET (ptid) == -1)
(gdb) c
Continuing.

Breakpoint 2, 0x61084819 in cygwin1!abort () from /usr/bin/cygwin1.dll
(gdb)



* Re: How to catch GDB crash
@ 2008-06-24  8:52 Dmitry Smirnov
  0 siblings, 0 replies; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-24  8:52 UTC (permalink / raw)
  To: gdb

Hi,

I'm sorry guys, but I believe we are going the wrong way.
First, I suppose I do not need JIT: as I said, I can attach to the 
running arm-elf-gdb before the crash. I supposed that GDB is smart 
enough to catch system exceptions. Moreover, I have some indication 
of that: my skyeye is Cygwin-compiled program and when I run it in 
Eclipse (which uses Cygwin GDB as a debugger for this program), I can 
see the following stack:
-----------------
Skyeye_1.2.4 Cygwin GCC [C/C++ Local Application]	
	Cygwin gdb Debugger (23.06.08 20:12) (Suspended)	
		Thread [1] (Suspended: Signal 'SIGSEGV' received. Description: Segmentation fault.)	
			3 RpcRaiseException()  0x77ea27ea	
			2 h_errno()  0x662b7258	
			1 <symbol is not available> 0x00000000	
		Thread [2] (Suspended)	
	gdb (23.06.08 20:12)	
	D:\Dvs\Project\Skyeye_1.2.4\binary\skyeye.exe (23.06.08 20:12)	
----------------------
Here is the stack of command-line GDB:
----------------------
Program received signal SIGSEGV, Segmentation fault.
0x77ea27ea in RpcRaiseException () from /c/WINDOWS/system32/rpcrt4.dll
(gdb) info stack
#0  0x77ea27ea in RpcRaiseException () from /c/WINDOWS/system32/rpcrt4.dll
#1  0x662b7258 in h_errno () from /c/WINDOWS/system32/hnetcfg.dll
#2  0x00000000 in ?? () from
------------------------

I'm 99% sure that Cygwin GDB can intercept these signals.

Am I wrong? Perhaps this is a signal generated by the Cygwin libraries?

Second, as I mentioned in first mail, someone is trying to save the 
crashdump file. Who's that guy? Why it fails to save? Can I set a breakpoint 
just before he starts to save (e.g. when it detects the crash)?

Is it possible that arm-elf-gdb itself is handling some signals thus preventing 
Cygwin GDB from handling it? Where is that code?

Dmitry
	




* Re: How to catch GDB crash
  2008-06-23 20:50 ` Dr. Rolf Jansen
@ 2008-06-23 20:59   ` Dr. Rolf Jansen
  0 siblings, 0 replies; 30+ messages in thread
From: Dr. Rolf Jansen @ 2008-06-23 20:59 UTC (permalink / raw)
  To: gdb; +Cc: Dmitry Smirnov

Sorry,

the code that needs to be added at line 43 of gdb/main.c should read:

#if defined (POOR_MAN_DEBUG)
  #include <time.h>
  double diff;
  time_t t0;
#endif

Before adding this at line 43, you should make the addition at line
661; otherwise it's line 667.

Best regards

Rolf Jansen


Am 24.06.2008 um 03:50 schrieb Dr. Rolf Jansen:

> Hi,
>
> Here comes a poor man's approach.
>
> 1. add in the file gdb/main.c at line 661 the
>   following code:
>
> #if defined (POOR_MAN_DEBUG)
>  printf_filtered ("[GDB PID %d]\n", getpid());
>  t0 = time(NULL);
>  while ((diff = difftime(time(NULL), t0)) <= 60);
> #endif
>
> and at line 43
>
> #if defined (POOR_MAN_DEBUG)
>  #include <sys/time.h>
>  double diff;
>  time_t t0 = time(NULL);
> #endif
>
>
> 2. issue before the configure command:
>   export CFLAGS="-g -O0 -DPOOR_MAN_DEBUG"
>   export CXXFLAGS=$CFLAGS
>
> 3. configure and make your cross-686-pc-cygwin/arm-elf-gdb
>
> 4. start the cross-debugging session
>
> 5. now you have 60 seconds to attach your host-gdb
>   to the just printed PID of your cross-gdb, and to
>   set a reasonable breakpoint somewhere near to the
>   location where you expect the crash to happen.
>
> 6. after 60 s the cross-gdb will continue running and
>   your host-gdb shall stop the cross-gdb at the
>   breakpoint that you set.
>
> 7. Now step through until the crash occurs.
>
>
> Hopefully this helps.
>
> Best regards
>
> Rolf Jansen
>
>
> Am 23.06.2008 um 23:31 schrieb Dmitry Smirnov:
>
>> Hi,
>>
>> I've encountered the very annoying problem with GDB. While  
>> debugging, it crashes for some reason and I cannot catch this  
>> moment. I have a possibility to attach to the running process with  
>> another GDB, but it does not help. Perhaps, there is some way to  
>> catch some system exception or something similar?
>>
>> I'm using a somewhat complex setup, so there is not much freedom
>> (at least I cannot simplify the situation). First, I'm debugging
>> from Eclipse (and it looks like this crash happens only when running
>> from Eclipse; I've tried from the command line and there is no crash).
>> The debugger I'm using is a cross-compiled arm-elf-gdb. It is compiled
>> on the Windows (i686) platform:
>> GNU gdb (GDB) 6.8.50.20080620
>> ...
>> This GDB was configured as "--host=i686-pc-cygwin --target=arm-elf".
>>
>> I doubt this can be linked to the GDB version. I recall seeing this
>> crash earlier with arm-elf-gdb 6.5 from GNUARM. I just avoided
>> this crash somehow, but now I cannot continue my work.
>>
>> This arm-elf-gdb is running against skyeye simulator as a remote  
>> target.
>>
>> As you can see there are many possibilities for malfunctioning
>> software (Eclipse, GDB, skyeye). But the only way I can find the  
>> root cause is to debug the crash in arm-elf-gdb. While crashing it  
>> attempts to create the crashdump file, but it is incomplete and  
>> Cygwin gdb cannot recognize it. Typically it contains just three  
>> lines:
>>
>> Stack trace:
>> Frame     Function  Args
>> 0022E268  7C802532  (00000058
>>
>> If attached, Cygwin GDB just reports:
>> Program exited with code 037777777777.
>>
>> Is there any way to stop arm-elf-gdb on some critical error?
>>
>> Dmitry
>>
>>
>
>



* Re: How to catch GDB crash
  2008-06-23 16:32 Dmitry Smirnov
  2008-06-23 16:57 ` Aleksandar Ristovski
  2008-06-23 17:12 ` Michael Snyder
@ 2008-06-23 20:50 ` Dr. Rolf Jansen
  2008-06-23 20:59   ` Dr. Rolf Jansen
  2 siblings, 1 reply; 30+ messages in thread
From: Dr. Rolf Jansen @ 2008-06-23 20:50 UTC (permalink / raw)
  To: gdb; +Cc: Dmitry Smirnov

Hi,

Here comes a poor man's approach.

1. add in the file gdb/main.c at line 661 the
    following code:

#if defined (POOR_MAN_DEBUG)
   printf_filtered ("[GDB PID %d]\n", getpid());
   t0 = time(NULL);
   while ((diff = difftime(time(NULL), t0)) <= 60);
#endif

and at line 43

#if defined (POOR_MAN_DEBUG)
   #include <sys/time.h>
   double diff;
   time_t t0 = time(NULL);
#endif


2. issue before the configure command:
    export CFLAGS="-g -O0 -DPOOR_MAN_DEBUG"
    export CXXFLAGS=$CFLAGS

3. configure and make your cross-686-pc-cygwin/arm-elf-gdb

4. start the cross-debugging session

5. now you have 60 seconds to attach your host-gdb
    to the just printed PID of your cross-gdb, and to
    set a reasonable breakpoint somewhere near to the
    location where you expect the crash to happen.

6. after 60 s the cross-gdb will continue running and
    your host-gdb shall stop the cross-gdb at the
    breakpoint that you set.

7. Now step through until the crash occurs.
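
A self-contained variant of the delay from step 1 might look like the
sketch below. It is an assumption-laden illustration rather than the
exact code to paste into gdb/main.c: wait_for_debugger() is a made-up
helper, it uses <time.h> and <unistd.h>, initialises t0 inside the
function (a call to time() cannot initialise a file-scope variable in
C), and sleeps instead of spinning.

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Print our PID, then wait until SECONDS have elapsed so that another
   debugger can attach.  Returns the number of one-second sleeps
   performed.  */
int
wait_for_debugger (int seconds)
{
  time_t t0 = time (NULL);
  int waited = 0;

  printf ("[GDB PID %d]\n", (int) getpid ());
  fflush (stdout);

  /* Poll once per second rather than busy-waiting.  */
  while (difftime (time (NULL), t0) < seconds)
    {
      sleep (1);
      waited++;
    }
  return waited;
}
```

Calling wait_for_debugger (60) early in startup gives the same 60-second
window as the loop in step 1.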


Hopefully this helps.

Best regards

Rolf Jansen


Am 23.06.2008 um 23:31 schrieb Dmitry Smirnov:

> Hi,
>
> I've encountered the very annoying problem with GDB. While  
> debugging, it crashes for some reason and I cannot catch this  
> moment. I have a possibility to attach to the running process with  
> another GDB, but it does not help. Perhaps, there is some way to  
> catch some system exception or something similar?
>
> I'm using a somewhat complex setup, so there is not much freedom (at
> least I cannot simplify the situation). First, I'm debugging from
> Eclipse (and it looks like this crash happens only when running from
> Eclipse; I've tried from the command line and there is no crash). The
> debugger I'm using is a cross-compiled arm-elf-gdb. It is compiled on
> the Windows (i686) platform:
> GNU gdb (GDB) 6.8.50.20080620
> ...
> This GDB was configured as "--host=i686-pc-cygwin --target=arm-elf".
>
> I doubt this can be linked to the GDB version. I recall seeing this
> crash earlier with arm-elf-gdb 6.5 from GNUARM. I just avoided
> this crash somehow, but now I cannot continue my work.
>
> This arm-elf-gdb is running against skyeye simulator as a remote  
> target.
>
> As you can see there are many possibilities for malfunctioning
> software (Eclipse, GDB, skyeye). But the only way I can find the  
> root cause is to debug the crash in arm-elf-gdb. While crashing it  
> attempts to create the crashdump file, but it is incomplete and  
> Cygwin gdb cannot recognize it. Typically it contains just three  
> lines:
>
> Stack trace:
> Frame     Function  Args
> 0022E268  7C802532  (00000058
>
> If attached, Cygwin GDB just reports:
> Program exited with code 037777777777.
>
> Is there any way to stop arm-elf-gdb on some critical error?
>
> Dmitry
>
>



* Re: How to catch GDB crash
  2008-06-23 18:36     ` Pedro Alves
@ 2008-06-23 19:37       ` Brian Dessent
  0 siblings, 0 replies; 30+ messages in thread
From: Brian Dessent @ 2008-06-23 19:37 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb, Eli Zaretskii, Michael Snyder, divis1969

Pedro Alves wrote:

> for native Windows apps, there is some registry key (I don't
> remember which) you can set to point to a JIT debugger.  Probably

For the sake of the archives it's HKLM\SOFTWARE\Microsoft\Windows
NT\CurrentVersion\AeDebug, see <http://support.microsoft.com/kb/103861>.

> a little exe wrapper is needed to translate the incoming args
> to GDB args, that's all.

The first %ld in the command expands to the faulting PID so
"path-to\gdb.exe -p %ld" ought to work without need for a wrapper.
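
To make the expansion concrete: Windows substitutes the faulting process ID
for the first %ld in the AeDebug command string, much as printf would. The
sketch below only mimics that substitution with snprintf. The function name
expand_jit_command is made up for illustration, "path-to" is left as the
same placeholder used above, and the real expansion is done by Windows
itself, not by user code.

```c
#include <stdio.h>

/* Hypothetical helper that mimics how the first %ld in an AeDebug-style
   command string is replaced by the faulting process ID.  Returns the
   number of characters that the full command needs (as snprintf does). */
int expand_jit_command(char *out, size_t size, long pid)
{
    /* "path-to" is a placeholder, exactly as in the message above. */
    return snprintf(out, size, "path-to\\gdb.exe -p %ld", pid);
}
```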

> I can't see what changes in GDB
> would be required?

The main problem is that gdb thinks that it's attaching to a normal
running process, rather than a faulted app.  Thus the current thread is
an artificial one with %eip in ntdll!DbgUiConnectToDbg which is just a
convenient 'int3' that lives in ntdll.dll as discussed.  That's easily
fixed just by switching to the thread of interest; however, the state of
that thread often looks bogus, something like this:

(gdb) bt
#0  0x7c90eb94 in ntdll!LdrAccessResource ()
   from C:\WINXP\system32\ntdll.dll
#1  0x7c90e9ab in ntdll!ZwWaitForMultipleObjects ()
   from C:\WINXP\system32\ntdll.dll
#2  0x7c86372c in UnhandledExceptionFilter ()
   from C:\WINXP\system32\kernel32.dll
#3  0x00000002 in ?? ()
#4  0x0022f560 in ?? ()
#5  0x00000001 in ?? ()
#6  0x00000001 in ?? ()
#7  0x00000000 in ?? ()

The actual location of the fault in the user code is nowhere evident (in
this testcase, it was at eip 00401304) because gdb doesn't know that it
has picked up in the middle of a fault.  You can get things back on
track by re-triggering the same fault again by continuing, but this
really isn't pretty and doesn't always work.  For this to work correctly
gdb would need some command line switch to tell it that it's taking over
a faulted process as the JIT, rather than breaking in on a normally
executing one.

Brian


* Re: How to catch GDB crash
  2008-06-23 18:23   ` Eli Zaretskii
  2008-06-23 18:32     ` Michael Snyder
@ 2008-06-23 18:36     ` Pedro Alves
  2008-06-23 19:37       ` Brian Dessent
  1 sibling, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2008-06-23 18:36 UTC (permalink / raw)
  To: gdb, Eli Zaretskii; +Cc: Michael Snyder, divis1969

On Monday 23 June 2008 19:23:03, Eli Zaretskii wrote:
> > From: Michael Snyder <msnyder@specifix.com>
> > Cc: gdb@sourceware.org
> > Date: Mon, 23 Jun 2008 10:12:04 -0700
> >
> > You're running on a Windows host, right?  Doesn't Windows have
> > some mechanism for automatically catching a program that is
> > crashing, and holding it for the debugger?

> That's true, but you need special code in the debugger to be able to
> work like that (it's called JIT debugging, btw).  And GDB doesn't
> (yet) have such code.


You can set error_start in the CYGWIN environment variable
to point at GDB's executable to have GDB start automatically
on an exception (the OP reported --host=i686-pc-cygwin), and,
for native Windows apps, there is some registry key (I don't
remember which) you can set to point to a JIT debugger.  Probably
a little exe wrapper is needed to translate the incoming args
to GDB args, that's all.  I don't see what changes in GDB
would be required.

-- 
Pedro Alves



* Re: How to catch GDB crash
  2008-06-23 18:23   ` Eli Zaretskii
@ 2008-06-23 18:32     ` Michael Snyder
  2008-06-23 18:36     ` Pedro Alves
  1 sibling, 0 replies; 30+ messages in thread
From: Michael Snyder @ 2008-06-23 18:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: divis1969, gdb

On Mon, 2008-06-23 at 21:23 +0300, Eli Zaretskii wrote:
> > From: Michael Snyder <msnyder@specifix.com>
> > Cc: gdb@sourceware.org
> > Date: Mon, 23 Jun 2008 10:12:04 -0700
> > 
> > You're running on a Windows host, right?  Doesn't Windows have
> > some mechanism for automatically catching a program that is
> > crashing, and holding it for the debugger?
> 
> That's true, but you need special code in the debugger to be able to
> work like that (it's called JIT debugging, btw).  And GDB doesn't
> (yet) have such code.

My failing memory.  Didn't I publish a patch for glibc about
a year ago that would allow that?  I'm going to hate myself
if I never posted it...





* Re: How to catch GDB crash
  2008-06-23 17:12 ` Michael Snyder
@ 2008-06-23 18:23   ` Eli Zaretskii
  2008-06-23 18:32     ` Michael Snyder
  2008-06-23 18:36     ` Pedro Alves
  0 siblings, 2 replies; 30+ messages in thread
From: Eli Zaretskii @ 2008-06-23 18:23 UTC (permalink / raw)
  To: Michael Snyder; +Cc: divis1969, gdb

> From: Michael Snyder <msnyder@specifix.com>
> Cc: gdb@sourceware.org
> Date: Mon, 23 Jun 2008 10:12:04 -0700
> 
> You're running on a Windows host, right?  Doesn't Windows have
> some mechanism for automatically catching a program that is
> crashing, and holding it for the debugger?

That's true, but you need special code in the debugger to be able to
work like that (it's called JIT debugging, btw).  And GDB doesn't
(yet) have such code.



* Re: How to catch GDB crash
  2008-06-23 16:32 Dmitry Smirnov
  2008-06-23 16:57 ` Aleksandar Ristovski
@ 2008-06-23 17:12 ` Michael Snyder
  2008-06-23 18:23   ` Eli Zaretskii
  2008-06-23 20:50 ` Dr. Rolf Jansen
  2 siblings, 1 reply; 30+ messages in thread
From: Michael Snyder @ 2008-06-23 17:12 UTC (permalink / raw)
  To: Dmitry Smirnov; +Cc: gdb

On Mon, 2008-06-23 at 20:31 +0400, Dmitry Smirnov wrote:
> Hi,
> 
> I've encountered a very annoying problem with GDB. While debugging, it crashes for some reason and I cannot catch that moment. I can attach to the running process with another GDB, but that does not help. Perhaps there is some way to catch a system exception or something similar?
> 
> My setup is a little bit complex, so there is not much freedom (at least I cannot simplify the situation). First, I'm debugging from Eclipse (and it looks like this crash happens only when running from Eclipse; when I tried from the command line, there was no crash). The debugger I'm using is a cross-compiled arm-elf-gdb. It is compiled on a Windows (i686) platform:
> GNU gdb (GDB) 6.8.50.20080620
> ...
> This GDB was configured as "--host=i686-pc-cygwin --target=arm-elf".
> 
> I doubt this is linked to the GDB version. I recall seeing this crash earlier with arm-elf-gdb 6.5 from GNUARM. Back then I just skipped past the crash somehow, but now I cannot continue my work.
> 
> This arm-elf-gdb is running against the skyeye simulator as a remote target.
> 
> As you can see, there are many possibilities for malfunctioning software (Eclipse, GDB, skyeye). But the only way I can find the root cause is to debug the crash in arm-elf-gdb. While crashing, it attempts to create a crash dump file, but the file is incomplete and Cygwin gdb cannot recognize it. Typically it contains just three lines:
> 
> Stack trace:
> Frame     Function  Args
> 0022E268  7C802532  (00000058
> 
> If attached, Cygwin GDB just reports:
> Program exited with code 037777777777.
> 
> Is there any way to stop arm-elf-gdb on some critical error?

Sounds annoying.

You're running on a Windows host, right?  Doesn't Windows have
some mechanism for automatically catching a program that is
crashing, and holding it for the debugger?  Like on a Mac?

If you can attach to the gdb before the crash, you might
try setting breakpoints on exit and _exit, abort, things
like that, and see if you can intercept it that way.

What about libSegFault?  Is something like that available
on Windows?
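
For illustration only, an in-process variant of the same idea is a signal
handler that notices the abort before the process goes away. The sketch
below assumes a POSIX-like signal API (which Cygwin provides); the names
abort_seen, on_abort and install_abort_hook are made up for this example and
are not taken from GDB or from libSegFault.

```c
#include <signal.h>
#include <unistd.h>

/* Set to 1 once SIGABRT has been delivered. */
static volatile sig_atomic_t abort_seen = 0;

/* Signal handler: restrict ourselves to async-signal-safe calls,
   so use write() rather than printf(). */
static void on_abort(int signo)
{
    static const char msg[] = "caught SIGABRT\n";

    (void) signo;
    abort_seen = 1;
    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* Nothing useful can be done about a failed write here. */
    }
}

/* Hypothetical helper: install the handler.
   Returns 0 on success, -1 on failure. */
int install_abort_hook(void)
{
    return signal(SIGABRT, on_abort) == SIG_ERR ? -1 : 0;
}
```

Note that abort() still terminates the process once the handler returns, so
a handler meant to leave time for attaching a debugger would have to block
(for example by sleeping) instead of returning.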




* Re: How to catch GDB crash
  2008-06-23 16:32 Dmitry Smirnov
@ 2008-06-23 16:57 ` Aleksandar Ristovski
  2008-06-23 17:12 ` Michael Snyder
  2008-06-23 20:50 ` Dr. Rolf Jansen
  2 siblings, 0 replies; 30+ messages in thread
From: Aleksandar Ristovski @ 2008-06-23 16:57 UTC (permalink / raw)
  To: gdb

Dmitry Smirnov wrote:
> Hi,
> 
> I've encountered a very annoying problem with GDB. While debugging, it crashes for some reason and I cannot catch that moment. I can attach to the running process with another GDB, but that does not help. Perhaps there is some way to catch a system exception or something similar?
> 
> My setup is a little bit complex, so there is not much freedom (at least I cannot simplify the situation). First, I'm debugging from Eclipse (and it looks like this crash happens only when running from Eclipse; when I tried from the command line, there was no crash). The debugger I'm using is a cross-compiled arm-elf-gdb. It is compiled on a Windows (i686) platform:
> GNU gdb (GDB) 6.8.50.20080620
> ...
> This GDB was configured as "--host=i686-pc-cygwin --target=arm-elf".
> 
> I doubt this is linked to the GDB version. I recall seeing this crash earlier with arm-elf-gdb 6.5 from GNUARM. Back then I just skipped past the crash somehow, but now I cannot continue my work.
> 

Hello, I am working on a problem that might be related (although the gdb version I am using is 6.7).

Could you try this sample code? (From the command line, set a breakpoint in printSimple and, once the breakpoint is hit, do 'print p' and see if the output makes sense.)

Thanks,

Aleksandar Ristovski
QNX Software Systems



#include <stdlib.h>
#include <stdio.h>

struct participant {
        char name[15];
        char country[15];
        float score;
        int age;
};

void printSimpleP(struct participant *p)
{
  printf("Name: %s, Country: %s, Score: %f, Age: %d\n",
         p->name, p->country, p->score, p->age);
}

void printSimple(struct participant p)
{
  printSimpleP(&p);
}

int main(int argc, char *argv[]) {
        struct participant p = { "Foo", "Bar", 1.2, 45 };
        printSimple(p);
        return 0;
}



* How to catch GDB crash
@ 2008-06-23 16:32 Dmitry Smirnov
  2008-06-23 16:57 ` Aleksandar Ristovski
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Dmitry Smirnov @ 2008-06-23 16:32 UTC (permalink / raw)
  To: gdb

Hi,

I've encountered a very annoying problem with GDB. While debugging, it crashes for some reason and I cannot catch that moment. I can attach to the running process with another GDB, but that does not help. Perhaps there is some way to catch a system exception or something similar?

My setup is a little bit complex, so there is not much freedom (at least I cannot simplify the situation). First, I'm debugging from Eclipse (and it looks like this crash happens only when running from Eclipse; when I tried from the command line, there was no crash). The debugger I'm using is a cross-compiled arm-elf-gdb. It is compiled on a Windows (i686) platform:
GNU gdb (GDB) 6.8.50.20080620
...
This GDB was configured as "--host=i686-pc-cygwin --target=arm-elf".

I doubt this is linked to the GDB version. I recall seeing this crash earlier with arm-elf-gdb 6.5 from GNUARM. Back then I just skipped past the crash somehow, but now I cannot continue my work.

This arm-elf-gdb is running against the skyeye simulator as a remote target.

As you can see, there are many possibilities for malfunctioning software (Eclipse, GDB, skyeye). But the only way I can find the root cause is to debug the crash in arm-elf-gdb. While crashing, it attempts to create a crash dump file, but the file is incomplete and Cygwin gdb cannot recognize it. Typically it contains just three lines:

Stack trace:
Frame     Function  Args
0022E268  7C802532  (00000058

If attached, Cygwin GDB just reports:
Program exited with code 037777777777.

Is there any way to stop arm-elf-gdb on some critical error?

Dmitry


