RE:[rfa] Handle amd64-linux %orig

Mirror of the gdb-patches mailing list

* RE:[rfa] Handle amd64-linux %orig_rax
@ 2006-10-31 18:17 Datoda
  2006-10-31 18:22 ` [rfa] " Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Datoda @ 2006-10-31 18:17 UTC (permalink / raw)
  To: gdb-patches

I just downloaded the GDB CVS snapshot from 20061030, which contains Daniel's patch, and built a gdb with it on x86_64 Linux. Here's a transcript of a debug session:

--- Transcript begins ---

GNU gdb 6.5.50.20061030
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Setting up the environment for debugging gdb.
Function "internal_error" not defined.
Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
Function "info_command" not defined.
Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
/site/spt/usr7/cchen15/gdb-ftp/gdb-6.5.50.20061030/gdb/.gdbinit:8: Error in sourced command file:
No breakpoint number 0.
(gdb) l 1
1       #include <stdio.h>
2       #include <unistd.h>
3
4       int func1 (int a, int b, int c, int d)
5       {
6           return a + b + c + d;
7       }
8           
9       int main ()
10      {
(gdb) l
11          printf ("Sleeping...\n");
12          sleep(100);
13          func1(1,2,3,4);
14          return 0;
15      }
16
(gdb) r
Starting program: /site/spt/usr7/cchen15/em64t/interrupt 
Sleeping...
Program received signal SIGINT, Interrupt.
0x0000003cad08e992 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
(gdb) p func1(1,2,3,4)
During symbol reading, incomplete CFI data; unspecified registers (e.g., rax) at 0x3cad08e99f.
$1 = 4195998
(gdb) p func1(1,2,3,4)
$2 = 10
(gdb) 

--- Transcript ends ---

As you can see, the first inferior call returns the wrong value for a function that takes four (as shown here) or more arguments when the process has been interrupted inside a system call.

Here's my explanation of the cause of this problem: according to the AMD64 ABI, Section A.2, Item 2, the kernel destroys %rcx in a syscall. Meanwhile, the calling convention uses %rcx to pass the fourth integer argument to the function being called. Therefore, the kernel may trash the fourth argument to an inferior call even when it's not restarting the interrupted system call (i.e., when %orig_rax is set to a negative value), because the kernel is still in the "syscall mode".
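The register overlap described above can be sketched in C. This is an illustration only, not GDB or kernel code: the function names `int_arg_register` and `clobbered_by_syscall` are made up for this sketch, and %r11 is included because the same ABI item lists it alongside %rcx, although the message above only mentions %rcx.

```c
#include <stddef.h>
#include <string.h>

/* Integer argument registers of the System V AMD64 calling convention,
   in argument order. */
static const char *const int_arg_regs[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Hypothetical helper: register carrying integer argument `index`
   (0-based), or NULL when the argument is passed on the stack. */
const char *int_arg_register(size_t index)
{
    return index < 6 ? int_arg_regs[index] : NULL;
}

/* Hypothetical helper: nonzero if the kernel may destroy `reg` across a
   system call (AMD64 ABI, Section A.2: %rcx and %r11). */
int clobbered_by_syscall(const char *reg)
{
    return strcmp(reg, "rcx") == 0 || strcmp(reg, "r11") == 0;
}
```

With these helpers, `int_arg_register(3)` names the register of the fourth argument, and it is the only one of the six argument registers for which `clobbered_by_syscall` is nonzero, which matches the observation that only calls with four or more arguments misbehave.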

A repeat inferior call returns the correct value because the kernel has already left that syscall mode by the time the first inferior call has run.
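The two results in the transcript can be checked against this explanation with a little arithmetic. The sketch below assumes that only the fourth argument was corrupted; `implied_fourth_arg` is a name invented for this illustration.

```c
/* func1 returns a + b + c + d. Given the observed result of
   func1(1, 2, 3, d), recover the value d must have held on entry. */
long implied_fourth_arg(long result)
{
    return result - (1 + 2 + 3);
}
```

For the second call, `implied_fourth_arg(10)` is 4, as expected. For the first call, `implied_fourth_arg(4195998)` is 4195992, i.e. 0x400698, which looks like a code address in the executable rather than random garbage. That would be consistent with %rcx having been loaded with a return address on the way out of the kernel, though the transcript does not show any addresses in the program, so this remains an inference.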

It appears to me that instead of telling the kernel not to restart a syscall by setting %orig_rax to -1, gdb should tell the kernel to forget about the syscall altogether when initiating an inferior call, and restore the kernel's memory of the interrupted syscall when the inferior call finishes. I don't know how that can be achieved, though.

Alternatively, the kernel could be made to suppress trashing %rcx when the process is being debugged and %orig_rax is -1. But I don't know the ramifications of this change.

Or perhaps it's just a kernel bug that needs to be fixed....

Any other insight on this?

===========================================
[rfa] Handle amd64-linux %orig_rax
From: Daniel Jacobowitz <drow at false dot org> 
To: gdb-patches at sourceware dot org 
Date: Sat, 19 Aug 2006 00:20:40 -0400 
Subject: [rfa] Handle amd64-linux %orig_rax 
--------------------------------------------------------------------------------
x86_64-pc-linux-gnu has an %orig_rax "register", exactly parallel to the
i386-pc-linux-gnu %orig_eax.  It's set to a syscall number to trigger
system call restarting, and has to be set to -1 when we want to prevent
restarting.  It appears to be in every regard the same as the i386
version; this copies the i386 handling, and fits it into the existing
amd64 infrastructure.
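
The %orig_rax convention described here can be modelled with a small C sketch. It is not the patch and not kernel code: `struct regs_sketch` is a hypothetical stand-in for the ptrace-visible register set, both function names are invented, and the restart test is a simplification (it assumes the kernel considers a restart only when %orig_rax is non-negative).

```c
#include <stdint.h>

/* Hypothetical stand-in for the registers that matter in this sketch. */
struct regs_sketch {
    int64_t  orig_rax;  /* syscall number on entry, or -1 for "none" */
    int64_t  rax;       /* syscall return value */
    uint64_t rip;       /* where the inferior resumes */
};

/* Simplified model of the kernel's check: a restart is only considered
   while orig_rax still holds a syscall number. */
int kernel_may_restart(const struct regs_sketch *r)
{
    return r->orig_rax >= 0;
}

/* What a debugger would do before redirecting the inferior to `target`:
   mark "no syscall to restart", then move the program counter. */
void prepare_inferior_call(struct regs_sketch *r, uint64_t target)
{
    r->orig_rax = -1;
    r->rip = target;
}
```

Without the store to `orig_rax`, the kernel in this model could still treat the stop as an interrupted system call and adjust the registers for a restart after the debugger has already pointed `rip` somewhere else.
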
Tested on x86_64-pc-linux-gnu; this fixes the interrupt.exp timeouts,
finally.  I've known what this was for ages but never had the patience to
make the mechanical change before, and someone reported this as a bug on the
Eclipse CDT list recently.
Does this look OK?
-- 
Daniel Jacobowitz
CodeSourcery


* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 18:17 RE:[rfa] Handle amd64-linux %orig_rax Datoda
@ 2006-10-31 18:22 ` Daniel Jacobowitz
  2006-10-31 18:40   ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-10-31 18:22 UTC (permalink / raw)
  To: Datoda, Andi Kleen; +Cc: gdb-patches

Andi, have you got any opinion on this?  The problem arises when GDB
sets %orig_rax to -1 to indicate that the interrupted syscall should
not be resumed, and then sets %rip to some other address; the kernel is
still changing %rcx on the way out to userspace.  I think this sounds
like a kernel bug.

On Tue, Oct 31, 2006 at 10:17:01AM -0800, Datoda wrote:
> Here's my explanation of the cause of this problem: According to the
> AMD64 ABI, Section A.2, Item 2., the kernel destroys %rcx in a
> syscall. Meanwhile, the calling convention uses %rcx to pass in the
> fourth argument to the function being called. Therefore, the kernel
> may trash the fourth argument to an inferior call even when it's not
> restarting the interrupted system call (i.e., when %orig_rax is set
> to a negative value) because the kernel is still in the "syscall
> mode".
> 
> A repeat inferior call returns the correct value because the kernel
> has left that syscall mode when doing the first inferior call.
> 
> It appears to me that instead of telling the kernel not to restart a
> syscall by setting %orig_rax to -1, gdb should tell the kernel
> to forget about the syscall altogether when initiating an inferior
> call, and restore the kernel's memory of the interrupted syscall
> when the inferior call finishes. I don't know how that can be
> achieved, though.

We don't need to restore it; we can just restart the syscall from
scratch.

> Alternatively, the kernel could be made to suppress trashing %rcx when
> the process is being debugged and %orig_rax is -1. But I don't know
> the ramifications of this change.
> 
> Or perhaps it's just a kernel bug that needs to be fixed....
> 
> Any other insight on this?

-- 
Daniel Jacobowitz
CodeSourcery



* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 18:22 ` [rfa] " Daniel Jacobowitz
@ 2006-10-31 18:40   ` Andi Kleen
  2006-10-31 18:49     ` Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-10-31 18:40 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Datoda, gdb-patches

On Tuesday 31 October 2006 19:22, Daniel Jacobowitz wrote:
> Andi, have you got any opinion on this?  The problem arises when GDB
> sets %orig_rax to -1 to indicate that the interrupted syscall should
> not be resumed, and then sets %rip to some other address; the kernel is
> still changing %rcx on the way out to userspace.  I think this sounds
> like a kernel bug.

You would need to complain to the x86 ISA designers.

SYSRET requires us to trash %rcx; there is no other way to use it.
IRET, by contrast, won't clobber any registers (and it is used in a few
situations where this is critical), but it is significantly slower.
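
The difference between the two return paths can be modelled in C. This is a sketch under stated assumptions, not kernel code: `struct cpu_sketch`, `enum return_path` and `return_to_user` are hypothetical names. It relies on the documented behaviour of the instruction: SYSRET takes the user-mode RIP from %rcx and RFLAGS from %r11, so the kernel has to overwrite both registers before executing it.

```c
#include <stdint.h>

/* Hypothetical view of the user registers the kernel is about to restore. */
struct cpu_sketch {
    uint64_t rip, rflags;  /* where and how user space resumes */
    uint64_t rcx, r11;     /* user-visible values after the return */
};

enum return_path { RETURN_IRET, RETURN_SYSRET };

/* Model of leaving the kernel. IRET restores every register from the saved
   frame. SYSRET reads the target RIP from rcx and RFLAGS from r11, so any
   values a debugger stored in those two registers are lost. */
void return_to_user(struct cpu_sketch *regs, enum return_path path)
{
    if (path == RETURN_SYSRET) {
        regs->rcx = regs->rip;
        regs->r11 = regs->rflags;
    }
}
```

In this model a debugger that writes 4 into `rcx` and redirects `rip` sees the 4 survive on the IRET path, while on the SYSRET path `rcx` ends up equal to the new `rip`.
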

OK, in theory we could check if the process is traced and then
always use IRET, but then you would get different behaviour
depending on being traced or not, which is probably not
a good idea.

BTW, on i386, which sometimes uses SYSEXIT, there are likely similar
problems. SYSEXIT also requires clobbering registers.

-Andi


* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 18:40   ` Andi Kleen
@ 2006-10-31 18:49     ` Daniel Jacobowitz
  2006-10-31 19:11       ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-10-31 18:49 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Datoda, gdb-patches

On Tue, Oct 31, 2006 at 07:40:38PM +0100, Andi Kleen wrote:
> On Tuesday 31 October 2006 19:22, Daniel Jacobowitz wrote:
> > Andi, have you got any opinion on this?  The problem arises when GDB
> > sets %orig_rax to -1 to indicate that the interrupted syscall should
> > not be resumed, and then sets %rip to some other address; the kernel is
> > still changing %rcx on the way out to userspace.  I think this sounds
> > like a kernel bug.
> 
> You would need to complain to the x86 ISA designers.
> 
> SYSRET requires us to trash %rcx; there is no other way to use it.
> IRET, by contrast, won't clobber any registers (and it is used in a few
> situations where this is critical), but it is significantly slower.

Oh dear.  So if we set registers on the syscall exit path, the
kernel/ISA may just eat them.  And we have no reliable way to know
whether we're stopped on the syscall exit path.  There's gotta be a
better way, but I don't know what it might be...

-- 
Daniel Jacobowitz
CodeSourcery



* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 18:49     ` Daniel Jacobowitz
@ 2006-10-31 19:11       ` Andi Kleen
  2006-10-31 19:30         ` Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-10-31 19:11 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Datoda, gdb-patches


> Oh dear.  So if we set registers on the syscall exit path, the
> kernel/ISA may just eat them.  And we have no reliable way to know
> whether we're stopped on the syscall exit path.

If you're single-stepping over it, you can remember it from
one instruction before (check if the opcode is SYSCALL or SYSENTER;
these are unique two-byte opcodes each).

If someone sets a breakpoint directly on the return point
and doesn't single-step, that wouldn't work, but then you shouldn't care
about the previous register state anyway.
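
The opcode check suggested above could look roughly like this in C. The function name `is_syscall_insn` is invented for the sketch, and the caller is assumed to have already read at least two bytes of the previous instruction from the inferior. The byte values are the standard encodings: SYSCALL is 0F 05 and SYSENTER is 0F 34.

```c
/* Return nonzero if the two bytes at `insn` encode SYSCALL (0F 05) or
   SYSENTER (0F 34). `insn` must point at two or more readable bytes. */
int is_syscall_insn(const unsigned char *insn)
{
    if (insn[0] != 0x0f)
        return 0;
    return insn[1] == 0x05 || insn[1] == 0x34;
}
```

A debugger that single-steps would call this on the instruction it is about to step over and remember the answer, so that at the next stop it knows whether it has just come back from the kernel.
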

-Andi



* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 19:11       ` Andi Kleen
@ 2006-10-31 19:30         ` Daniel Jacobowitz
  2006-10-31 19:33           ` Daniel Jacobowitz
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-10-31 19:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Datoda, gdb-patches

On Tue, Oct 31, 2006 at 08:11:20PM +0100, Andi Kleen wrote:
> 
> > Oh dear.  So if we set registers on the syscall exit path, the
> > kernel/ISA may just eat them.  And we have no reliable way to know
> > whether we're stopped on the syscall exit path.
> 
> If you're single-stepping over it, you can remember it from
> one instruction before (check if the opcode is SYSCALL or SYSENTER;
> these are unique two-byte opcodes each).
> 
> If someone sets a breakpoint directly on the return point
> and doesn't single-step, that wouldn't work, but then you shouldn't care
> about the previous register state anyway.

This case is usually SIGINT while inside a syscall, e.g. nanosleep.
That gives us a prompt, and if the user changes $rcx there, we write
into the register, and later it gets overwritten. That is, we're at the
ptrace_stop call in kernel/signal.c:get_signal_to_deliver.

I'm not quite sure how we're getting into the problem case, though.
I'd have guessed we were in sysret_signal, and that uses iret.

-- 
Daniel Jacobowitz
CodeSourcery



* Re: [rfa] Handle amd64-linux %orig_rax
  2006-10-31 19:30         ` Daniel Jacobowitz
@ 2006-10-31 19:33           ` Daniel Jacobowitz
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2006-10-31 19:33 UTC (permalink / raw)
  To: Andi Kleen, Datoda, gdb-patches

On Tue, Oct 31, 2006 at 02:30:35PM -0500, Daniel Jacobowitz wrote:
> On Tue, Oct 31, 2006 at 08:11:20PM +0100, Andi Kleen wrote:
> > 
> > > Oh dear.  So if we set registers on the syscall exit path, the
> > > kernel/ISA may just eat them.  And we have no reliable way to know
> > > whether we're stopped on the syscall exit path.
> > 
> > > If you're single-stepping over it, you can remember it from
> > > one instruction before (check if the opcode is SYSCALL or SYSENTER;
> > > these are unique two-byte opcodes each).
> > > 
> > > If someone sets a breakpoint directly on the return point
> > > and doesn't single-step, that wouldn't work, but then you shouldn't care
> > > about the previous register state anyway.
> 
> This case is usually SIGINT while inside a syscall, e.g. nanosleep.
> That gives us a prompt, and if the user changes $rcx there, we write
> into the register, and later it gets overwritten. That is, we're at the
> ptrace_stop call in kernel/signal.c:get_signal_to_deliver.
> 
> I'm not quite sure how we're getting into the problem case though?
> I'd have guessed we were in sysret_signal and that uses iret.

Datoda, what kernel version were you using?  I wonder if this fixed it
as a side effect:

Commit: 7bf36bbc5e0c09271f9efe22162f8cc3f8ebd3d2 
Author: Andi Kleen <ak@suse.de> Fri, 07 Apr 2006 19:50:00 +0200 

    [PATCH] x86_64: When user could have changed RIP always force IRET

-- 
Daniel Jacobowitz
CodeSourcery

