From: Andrew Cagney
Date: Wed, 10 Mar 2004 23:25:00 -0000
To: Ulrich Weigand
Cc: gdb-patches@sources.redhat.com
Subject: Re: [PATCH] Fix signals.exp test case on S/390
Message-ID: <404FA3DD.8030801@gnu.org>
References: <200403102032.VAA05524@faui1d.informatik.uni-erlangen.de>
In-Reply-To: <200403102032.VAA05524@faui1d.informatik.uni-erlangen.de>

> Hello,
>
> I was trying to get the signals.exp test case to pass on s390.
> This exposed a number of problems in the Linux kernel and gdb;
> in particular I think I've found a bug in gdb common code
> (step_over_function in infrun.c).
>
> However, I'm not completely sure I fully understand the intended
> flow of control throughout infrun.c, so I'd appreciate comments
> on what I found.
>
> Basically, the test case arranges for a signal (SIGALRM) to
> arrive while the inferior is stopped, and then does a 'next'.
> When the inferior is continued with PT_SINGLESTEP due to the
> 'next', the kernel tries to deliver the pending SIGALRM.
>
> This in turn causes another ptrace intercept.
> gdb gets active,
> notices that it doesn't want to do anything special with the
> signal, and continues with PT_SINGLESTEP giving the signal
> number.
>
> At this point, the Linux kernel delivers the signal, runs to
> the first instruction of the signal handler, and gets the
> single step trap there.  (Note that there was a bug in the
> kernel that caused this not to work, but I've fixed that one.)
>
> Now the interesting aspect starts.  handle_inferior_event
> needs to figure out what has happened here.  There is special
> code in handle_inferior_event that tries to recognize the
> 'we've just stepped into a signal handler' situation:
>
>       /* Did we just take a signal?  */
>       if (pc_in_sigtramp (stop_pc)
>           && !pc_in_sigtramp (prev_pc)
>           && INNER_THAN (read_sp (), step_sp))

[anything in infrun.c using INNER_THAN is always suspect]

>         {
>           /* We've just taken a signal; go until we are back to
>              the point where we took it and one more.  */
>
> Note, however, that this fundamentally does not work on
> Linux because we use a signal trampoline only to *exit*
> from a signal handler -- the kernel *starts* signal handlers
> by jumping to them directly without any trampoline.
>
> So this test doesn't trigger, and the event turns out to
> be interpreted as a call to a subroutine, and is passed on
> to handle_step_into_function, which decides to step over
> that function call using step_over_function.  This would
> basically work just fine for this case -- we'd continue
> just after the signal handler terminates, which is what
> we want here.
>
> However, step_over_function continues to run not only
> until a specific PC has been reached, but at the same time
> a specific *frame* needs to be reached.  In the situation
> we're in right now:
>
>   PC                                        frame
>   1st instruction of signal handler         sig. handler frame
>   1st instruction of sigreturn trampoline   sigtramp frame
>   interrupted main routine                  main routine frame
>
> step_over_function tries to continue running until it
> reaches the current frame's return address (i.e. the 1st
> instruction of the sigreturn trampoline) while at the same
> time reaching the frame currently being stepped (i.e. the
> main routine's frame).  Of course, these two events never
> coincide, and hence gdb steps until the program terminates.

I'm lost here.  What happens with:

- get_frame_id (get_prev_frame (signal handler))
- get_frame_id (sigreturn trampoline)

Hopefully you can see the values by grubbing around in the output from
"set debug frame 1".  They should match (check my tramp branch, it
contains, er, interesting ideas on how to better handle trampolines).

Andrew
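
For anyone following along, the PC-plus-frame stop condition described in
the quoted text can be modelled with a toy example.  This is only a sketch
in plain C under stated assumptions, not gdb code: struct toy_frame_id,
struct toy_frame, toy_stop_here and toy_step_over_stops are hypothetical
names, the addresses are made up, and the three frames simply mirror the
PC/frame table above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a frame id: a stack address
   plus a code address.  This is not gdb's real struct frame_id.  */
struct toy_frame_id
{
  unsigned long stack_addr;
  unsigned long code_addr;
};

/* One row of the PC/frame table from the message.  */
struct toy_frame
{
  const char *name;
  unsigned long pc;              /* PC at which this frame resumes */
  struct toy_frame_id id;
  const struct toy_frame *prev;  /* next-outer (caller) frame, or NULL */
};

/* Made-up addresses; only their distinctness matters.  */
const struct toy_frame toy_main =
  { "main routine", 0x1000UL, { 0x7000UL, 0x1000UL }, NULL };
const struct toy_frame toy_sigtramp =
  { "sigreturn trampoline", 0x2000UL, { 0x6000UL, 0x2000UL }, &toy_main };
const struct toy_frame toy_handler =
  { "signal handler", 0x3000UL, { 0x5000UL, 0x3000UL }, &toy_sigtramp };

bool
toy_frame_id_eq (struct toy_frame_id a, struct toy_frame_id b)
{
  return a.stack_addr == b.stack_addr && a.code_addr == b.code_addr;
}

/* Stop only when BOTH the PC and the frame match what we wait for.  */
bool
toy_stop_here (unsigned long pc, struct toy_frame_id frame,
               unsigned long want_pc, struct toy_frame_id want_frame)
{
  return pc == want_pc && toy_frame_id_eq (frame, want_frame);
}

/* Wait at the handler's return address (the PC of its caller frame)
   for WANT_FRAME, and report whether any outer frame ever satisfies
   the combined condition as the handler unwinds.  */
bool
toy_step_over_stops (const struct toy_frame *handler,
                     struct toy_frame_id want_frame)
{
  assert (handler != NULL && handler->prev != NULL);
  unsigned long want_pc = handler->prev->pc;

  for (const struct toy_frame *f = handler->prev; f != NULL; f = f->prev)
    if (toy_stop_here (f->pc, f->id, want_pc, want_frame))
      return true;
  return false;
}
```

In this model, waiting at the trampoline's PC for the main routine's frame
never stops, while waiting there for the trampoline's own frame does.  That
is the same comparison as the two get_frame_id values asked about above:
the id of the handler's previous frame against the id of the trampoline
frame.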