From: Ivo Raisr
To: gdb@sourceware.org
Cc: Philippe Waroquiers
Subject: Fwd: Multi-packet gdb server request/response - problem?
Date: Thu, 12 Dec 2013 15:14:00 -0000
References: <1386711973.2226.30.camel@soleil>

Hello Philippe,

So we meet again! The pleasure is mine, and thank you for the quick response.

> Does this also happen with the gdbserver included in the GDB
> distribution ?
> (be sure GDB is in all stop mode :
> show non-stop
> )

Unfortunately gdbserver is not built with the GDB distribution on Solaris :-(
I suspect no one has ever needed it, and my quick attempts to build my own failed.
It needs to be properly ported first...

> When using the Valgrind gdbserver, does the problem also happens
> if you first instruct GDB to keep the "ack mode" ?

Yes, it does:
------------------------------
...
Sending packet: $mfe7d44e0,4#30...Ack
Packet received: 00000000
Sending packet: $s#73...Ack
Sending packet: $mfe7d1af4,4#5f...Packet instead of Ack, ignoring it
Packet instead of Ack, ignoring it
(and the last message is repeated a dozen times; gdb hangs)
------------------------------

which corresponds to what is observed on the remote stub (with a timestamp):
------------------------------
...
1386767417.658724 from gdb $mfe7d44e0,4#30
1386767417.658901 to gdb +
1386767417.659166 to gdb $0*"00#dc
1386767417.659257 from gdb +
1386767417.659321 from gdb $s#73
1386767417.659402 to gdb +
1386767417.659532 from gdb $mfe7d1af4,4#5f
1386767417.659651 to gdb $T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;#ae
1386767417.659786 from gdb +
1386767417.659854 to gdb $T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;#ae
(and the last response to gdb is repeated a dozen times)
------------------------------

So this confirms that gdb does not wait for the remote stub's response to the
"s" packet but issues another request, and is then surprised to receive a
response instead of an ack...

I also suspected that the remote stub might have taken too long to respond
to an "s" packet, but this is not the case, as confirmed by another exchange
quite early in the same debugging session:
-----------
...
1386767415.992930 from gdb $s#73
1386767415.993112 to gdb +
1386767415.993469 to gdb $T0505:78f7fe37;04:58f7fe37;08:89e31100;thread:1;#b2
1386767415.993633 from gdb +$qTStatus#49
1386767415.993800 to gdb +
1386767415.994078 to gdb $#00
...
-----------
and gdb waited for the response here even though it took longer than in the
failing case above.

I also tried issuing several "step" commands in gdb. Plenty of "s" packet
request/response pairs were exchanged successfully between gdb and the
remote stub.
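
The ordering argument above can be checked mechanically against the stub log. What follows is only a minimal sketch, assuming the log format shown in this message ("<timestamp> from gdb ..." / "<timestamp> to gdb ...") and assuming that every request packet should be answered by one reply packet before the next request is sent. The function name and the regular expressions are illustrative and are not part of gdb or Valgrind.

```python
import re

# One stub log entry: "<timestamp> from gdb <data>" or "<timestamp> to gdb <data>".
ENTRY = re.compile(r"^(\d+\.\d+) (from|to) gdb (.*)$")
# One packet inside the data: "$<payload>#<two hex checksum digits>".
PACKET = re.compile(r"\$([^#]*)#([0-9a-fA-F]{2})")


def find_overlapping_requests(log_lines):
    """Return (pending, new) payload pairs where gdb sent `new` before the
    stub had sent a reply packet for `pending`."""
    overlaps = []
    pending = None  # payload of the request still awaiting a reply packet
    for line in log_lines:
        entry = ENTRY.match(line.strip())
        if entry is None:
            continue  # skip "..." and other non-entry lines
        direction, data = entry.group(2), entry.group(3)
        # Bare acks ("+") contain no "$...#xx" packet and are ignored here.
        for packet in PACKET.finditer(data):
            payload = packet.group(1)
            if direction == "from":
                if pending is not None:
                    overlaps.append((pending, payload))
                pending = payload
            else:
                pending = None  # a reply packet answers the pending request
    return overlaps


FAILING_EXCERPT = '''
1386767417.658724 from gdb $mfe7d44e0,4#30
1386767417.658901 to gdb +
1386767417.659166 to gdb $0*"00#dc
1386767417.659257 from gdb +
1386767417.659321 from gdb $s#73
1386767417.659402 to gdb +
1386767417.659532 from gdb $mfe7d1af4,4#5f
1386767417.659651 to gdb $T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;#ae
1386767417.659786 from gdb +
1386767417.659854 to gdb $T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;#ae
'''

HEALTHY_EXCERPT = '''
1386767415.992930 from gdb $s#73
1386767415.993112 to gdb +
1386767415.993469 to gdb $T0505:78f7fe37;04:58f7fe37;08:89e31100;thread:1;#b2
1386767415.993633 from gdb +$qTStatus#49
1386767415.993800 to gdb +
1386767415.994078 to gdb $#00
'''

if __name__ == "__main__":
    print(find_overlapping_requests(FAILING_EXCERPT.splitlines()))
    # [('s', 'mfe7d1af4,4')]
    print(find_overlapping_requests(HEALTHY_EXCERPT.splitlines()))
    # []
```

On the two excerpts from this message, the sketch flags only the "m" request that follows the unanswered "s" packet, which matches the reading given above.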

I managed to capture a backtrace of gdb at the moment it printed the error
message "Reply contains invalid hex digit 84". It looks as follows:
-----------
ffff80ffbfffe7c0 fromhex+0x3f()
ffff80ffbfffe810 hex2bin+0x60()
ffff80ffbfffe8a0 remote_xfer_partial+0x538()
ffff80ffbfffe930 sol_thread_xfer_partial+0xdf()
ffff80ffbfffe9e0 memory_xfer_partial_1+0x139()
ffff80ffbfffea70 target_xfer_partial+0x191()
ffff80ffbfffead0 target_read+0x67()
ffff80ffbfffeaf0 target_read_memory+0x28()
ffff80ffbfffeb70 rw_common.isra.4+0x9c()
ffff80ffbfffebb0 libc_db.so.1`td_read_bootstrap_data+0xf3()
ffff80ffbfffebe0 libc_db.so.1`ph_lock_ta+0x4a()
ffff80ffbffff2f0 libc_db.so.1`td_ta_thr_iter+0x5d()
ffff80ffbffff360 libc_db.so.1`td_ta_map_id2thr+0x97()
ffff80ffbffff450 thread_to_lwp+0xa9()
ffff80ffbffff530 sol_thread_wait+0xaa()
ffff80ffbffff5a0 target_wait+0x74()
ffff80ffbffff690 wait_for_inferior+0x155()
ffff80ffbffff730 proceed+0x1e5()
ffff80ffbffff7d0 continue_command+0x104()
ffff80ffbffff830 execute_command+0x28f()
ffff80ffbffff850 command_handler+0x5b()
ffff80ffbffff890 command_line_handler+0x27c()
ffff80ffbffff8c0 rl_callback_read_char+0xf9()
ffff80ffbffff8d0 rl_callback_read_char_wrapper+9()
ffff80ffbffff8f0 process_event+0x8d()
ffff80ffbffff930 gdb_do_one_event+0x117()
ffff80ffbffff960 start_event_loop+0x47()
ffff80ffbffff970 captured_command_loop+0x13()
ffff80ffbffff9d0 catch_errors+0x64()
ffff80ffbffffaa0 captured_main+0x696()
ffff80ffbffffb00 catch_errors+0x64()
ffff80ffbffffb10 gdb_main+0x24()
ffff80ffbffffb40 main+0x30()
ffff80ffbffffb50 _start+0x6c()
-----------

Judging from the backtrace alone, gdb seems to be processing an "m" packet
(target_read_memory()) rather than waiting for the response to the previous
"s" packet.

Then I enabled 'infrun' debugging and gdb shows the following:
-----------
(gdb) continue
Continuing.
infrun: clear_proceed_status_thread (Thread 1)
infrun: proceed (addr=0xffffffff, signal=144, step=0)
infrun: resume (step=0, signal=0), trap_expected=0, current thread [Thread 1] at 0x111652
infrun: wait_for_inferior ()
infrun: target_wait (-1, status) =
infrun: 42000 [Thread 1],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x11e388
infrun: BPSTAT_WHAT_SINGLE
infrun: no stepping, continue
infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 1] at 0x11e388
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 42000 [Thread 1],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x11e389
infrun: no stepping, continue
infrun: resume (step=0, signal=0), trap_expected=0, current thread [Thread 1] at 0x11e389
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 42000 [Thread 1],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x11e388
[Thread debugging using libthread_db enabled]
infrun: BPSTAT_WHAT_SINGLE
infrun: no stepping, continue
infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 1 (defunct)] at 0x11e388
infrun: prepare_to_wait
(and then the error message)
-----------

I wonder how gdb determined that Thread 1 is defunct now?
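
A side note on the error text itself: 84 is the ASCII code of "T", so the message "Reply contains invalid hex digit 84" is consistent with gdb trying to decode the "T05..." stop reply as the hex payload of a memory read. The sketch below illustrates this reading. It is not gdb's code: the helper names merely echo the fromhex and hex2bin frames in the backtrace, and the program counter decoding assumes a little-endian 32-bit x86 target where register 8 is the program counter.

```python
import string


def fromhex(ch):
    """Value of one hex digit; bad input is reported by its character code,
    echoing the wording of the gdb error quoted in this message."""
    if len(ch) == 1 and ch in string.hexdigits:
        return int(ch, 16)
    raise ValueError(f"Reply contains invalid hex digit {ord(ch)}")


def hex2bin(payload):
    """Decode a memory-read reply payload (pairs of hex digits) into bytes."""
    return bytes(
        fromhex(payload[i]) * 16 + fromhex(payload[i + 1])
        for i in range(0, len(payload) - 1, 2)
    )


def parse_stop_reply(payload):
    """Split a 'T' stop reply into (signal number, {key: value})."""
    if not payload.startswith("T"):
        raise ValueError("not a T stop reply")
    signal = int(payload[1:3], 16)  # two hex digits after the 'T'
    fields = {}
    for item in payload[3:].split(";"):
        if item:
            key, _, value = item.partition(":")
            fields[key] = value
    return signal, fields


STOP_REPLY = "T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;"

if __name__ == "__main__":
    print(hex2bin("00000000"))  # b'\x00\x00\x00\x00'
    try:
        hex2bin(STOP_REPLY)  # what happens if this is taken for memory contents
    except ValueError as err:
        print(err)  # Reply contains invalid hex digit 84
    signal, fields = parse_stop_reply(STOP_REPLY)
    # Assumption: register 08 is the PC and values are little-endian.
    pc = int.from_bytes(bytes.fromhex(fields["08"]), "little")
    print(signal, hex(pc))  # 5 0x11e389
```

Under those assumptions the stop reply reports signal 5 and a program counter of 0x11e389, which agrees with the SIGTRAP and the stop_pc value in the infrun output.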

Combined with 'set debug remote 1', the last few debugging messages are:
-----------
Sending packet: $mfe7d4e80,38#6b...Packet received:
Sending packet: $mfe7d4580,40#34...Packet received:
Sending packet: $mfe7d44dc,4#62...Packet received: 00000000
Sending packet: $mfe7d44e0,4#30...Packet received: 00000000
Sending packet: $s#73...infrun: prepare_to_wait
Sending packet: $mfe7d1af4,4#5f...Packet received: T0505:48f8fe37;04:28f8fe37;08:89e31100;thread:1;
(and then the error message)
-----------

Now the question is: how can I debug the issue further?
I am not familiar with gdb's remote debugging internals...
Could the problem be that gdb determined that Thread 1 is defunct and,
based on some different internal state, skipped waiting for the response
to the "s" packet?
Where should I be looking next?

Thanks for any suggestions.

I.