From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cagney
To: Nick Roberts
Cc: gdb@sources.redhat.com
Subject: Re: internal-error: insert_step_resume_breakpoint_at_sal
Date: Sun, 06 Feb 2005 07:16:00 -0000
Message-id: <4202BABF.3090605@gnu.org>
References: <16804.1142.136766.593493@farnswood.snap.net.nz>
 <16874.17961.812024.375273@farnswood.snap.net.nz>
 <41ED5C15.7070401@gnu.org>
 <16877.33664.336543.446168@farnswood.snap.net.nz>
 <41EE82EA.7010803@gnu.org>
 <16879.224.997596.521183@farnswood.snap.net.nz>
 <41EFE531.9060605@gnu.org>
 <16880.27555.756855.225654@farnswood.snap.net.nz>
 <41F56FCD.6090101@gnu.org>
 <16887.27939.790997.257717@farnswood.snap.net.nz>
 <41F7C46D.6040702@gnu.org>
 <16888.105.309770.857500@farnswood.snap.net.nz>
 <41F80DD9.6030503@gnu.org>
 <16888.15318.746005.711735@farnswood.snap.net.nz>
X-SW-Source: 2005-02/msg00018.html

[chug chug chug]

This is what I've noticed ...

(gdb) n
infrun: proceed (addr=0xffffffff, signal=144, step=1)

GDB tries to single step off the breakpoint @0x80a05f6 (when doing this all breakpoints are removed). GDB is expecting to get a SIGTRAP back ...

infrun: resume (step=1, signal=0)
target_terminal_inferior ()
child:target_xfer_partial (2, (null), 0xbffff18a, 0x0, 0x80a05f6, 2) = 2, bytes = 8b 45
target_resume (2896, step, 0)
infrun: wait_for_inferior

... but instead the inferior executes _no_ instructions and instead returns a [pending] SIGIO (still @0x80a05f6) ...

target_wait (-1, status) = 2896, status->kind = stopped, signal = SIGIO
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
target_fetch_registers (eip) = f6050a08 0x80a05f6 134874614
infrun: stop_pc = 0x80a05f6
infrun: random signal 23

... so GDB again tries to step the inferior, this time passing in the SIGNAL, and expecting back a SIGTRAP when the inferior reaches the signal handler's first instruction ...
[wonder if, for this case, GDB can insert all breakpoints and do a continue]

infrun: resume (step=1, signal=23)
target_terminal_inferior ()
child:target_xfer_partial (2, (null), 0xbfffeffa, 0x0, 0x80a05f6, 2) = 2, bytes = 8b 45
target_resume (2896, step, SIGIO)
infrun: prepare_to_wait

... but instead GDB gets back another pending SIGIO with the PC still back at the original BP instruction!

=> that's a known kernel bug. Instead of single-stepping into the signal handler, the kernel runs the signal handler to completion. Fixed in very recent 2.6 kernels.

... so GDB tries again to step into the signal handler (remember, GDB's expecting the inferior to return a SIGTRAP from the process reaching the signal handler's first instruction) ...

target_wait (-1, status) = 2896, status->kind = stopped, signal = SIGIO
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
target_fetch_registers (eip) = f6050a08 0x80a05f6 134874614
infrun: stop_pc = 0x80a05f6
infrun: random signal 23
infrun: resume (step=1, signal=23)
target_terminal_inferior ()
child:target_xfer_partial (2, (null), 0xbfffeffa, 0x0, 0x80a05f6, 2) = 2, bytes = 8b 45
target_resume (2896, step, SIGIO)
infrun: prepare_to_wait

... but instead the signal handler runs, returns, and then executes _one_ instruction before trapping!

=> that's part of the same kernel bug

target_wait (-1, status) = 2896, status->kind = stopped, signal = SIGTRAP
target_fetch_registers (eip) = f9050a08 0x80a05f9 134874617
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x80a05f9
infrun: trap expected
infrun: step-resume breakpoint

... so I think we've now got GDB thinking that the inferior got a SIGTRAP [tick] indicating that the process reached the handler [cross] and hence that the inferior should continue until the signal handler returns ...
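For anyone following the register lines in the trace: the first field after "(eip) =" is the raw register contents in target byte order (little-endian on i386), followed by the same value in hex and decimal. A minimal sketch of that decoding, assuming a 32-bit little-endian target; decode_le32 is a hypothetical helper name, not anything in GDB:

```python
def decode_le32(hex_bytes: str) -> int:
    """Interpret a raw 4-byte register dump (hex string) as a little-endian integer."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, byteorder="little")


# Raw eip dumps copied from the trace above.
pc_at_sigio = decode_le32("f6050a08")    # stop reported with SIGIO
pc_at_sigtrap = decode_le32("f9050a08")  # stop reported with SIGTRAP

print(hex(pc_at_sigio), pc_at_sigio)      # 0x80a05f6 134874614
print(hex(pc_at_sigtrap), pc_at_sigtrap)  # 0x80a05f9 134874617
print(pc_at_sigtrap - pc_at_sigio)        # 3
```

The 3-byte difference fits the "executes _one_ instruction" reading: the bytes 8b 45 at 0x80a05f6 look like the start of a mov with an 8-bit displacement, which would be three bytes long, although the third byte does not appear in the log.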
child:target_xfer_partial (2, (null), 0x84ff020, 0x0, 0x810bdc0, 1) = 1, bytes = 83
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x810bdc0, 1) = 1, bytes = cc
target_insert_breakpoint (0x810bdc0, xxx) = 0
child:target_xfer_partial (2, (null), 0x85004c8, 0x0, 0x80e74c1, 1) = 1, bytes = 68
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x80e74c1, 1) = 1, bytes = cc
target_insert_breakpoint (0x80e74c1, xxx) = 0
child:target_xfer_partial (2, (null), 0x874d058, 0x0, 0x80a05f6, 1) = 1, bytes = 8b
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x80a05f6, 1) = 1, bytes = cc
target_insert_breakpoint (0x80a05f6, xxx) = 0
child:target_xfer_partial (2, (null), 0x874d348, 0x0, 0x40009bd0, 1) = 1, bytes = c3
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x40009bd0, 1) = 1, bytes = cc
target_insert_breakpoint (0x40009bd0, xxx) = 0
child:target_xfer_partial (2, (null), 0x8aacb20, 0x0, 0x4000c2e0, 1) = 1, bytes = 55
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x4000c2e0, 1) = 1, bytes = cc
target_insert_breakpoint (0x4000c2e0, xxx) = 0
child:target_xfer_partial (2, (null), 0x87f2118, 0x0, 0x4032e860, 1) = 1, bytes = 55
child:target_xfer_partial (2, (null), 0x0, 0x8271b10, 0x4032e860, 1) = 1, bytes = cc
target_insert_breakpoint (0x4032e860, xxx) = 0
infrun: resume (step=0, signal=0)

... consequently GDB inserts breakpoints, including the step-resume breakpoint at the expected signal return address, and continues the target. The inferior is meant to next run the handler, return, and hit the step-resume breakpoint. Funnily enough this doesn't work (it's well past that step-resume breakpoint).

First thing I'd do is run the sig*.exp tests to see which are failing - if the kernel's working correctly they all pass.

The next thing we can consider is, for the case of GDB wanting to "next" over a signal, optimizing things so that instead it does a "continue/signal". That might help a little, but then it might not.
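To make the paired target_xfer_partial lines easier to read: each breakpoint insertion in the trace is a one-byte read of the original instruction byte followed by a one-byte write of cc (the x86 int3 opcode). A toy model of that save-and-patch pattern, assuming nothing about GDB's actual data structures; ToyTarget and its methods are hypothetical names used only for illustration:

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode, the "bytes = cc" writes in the trace


class ToyTarget:
    """Hypothetical stand-in for inferior memory; not GDB's implementation."""

    def __init__(self, memory):
        self.memory = dict(memory)  # address -> byte currently in memory
        self.shadow = {}            # address -> original byte saved at insert time

    def insert_breakpoint(self, addr):
        self.shadow[addr] = self.memory[addr]  # read and remember the original byte
        self.memory[addr] = INT3               # patch in the trap instruction

    def remove_breakpoint(self, addr):
        self.memory[addr] = self.shadow.pop(addr)  # restore the original byte


# Addresses and original bytes as they appear in the trace above.
original = {
    0x810bdc0: 0x83,
    0x80e74c1: 0x68,
    0x80a05f6: 0x8B,
    0x40009bd0: 0xC3,
    0x4000c2e0: 0x55,
    0x4032e860: 0x55,
}

target = ToyTarget(original)
for addr in original:
    target.insert_breakpoint(addr)
patched = dict(target.memory)  # every listed address now holds 0xCC

for addr in original:
    target.remove_breakpoint(addr)
restored = dict(target.memory)  # back to the original bytes
```

In this model, removing all breakpoints before a single step (as in the first resume of the trace) amounts to calling remove_breakpoint on every entry, so the inferior executes the real instruction at 0x80a05f6 rather than the trap.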
Andrew