Mirror of the gdb mailing list
* Non-stop mode disfunctional ?
@ 2011-03-14 14:20 Chris Hall
  2011-03-14 14:52 ` Pedro Alves
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Hall @ 2011-03-14 14:20 UTC (permalink / raw)
  To: gdb


Sorry to follow up my own posting...

... I'm hoping to discover if the fact that non-stop doesn't work is
(a) because I'm not using it correctly, or (b) a well known problem I
should know about, or (c) an actual bug, or (d) something else ?

Thanks,

Chris

> -----Original Message-----
> From: gdb-owner@sourceware.org [mailto:gdb-owner@sourceware.org] On
> Behalf Of Chris Hall
> Sent: 07 March 2011 11:01
> To: gdb@sourceware.org
> Subject: SIGSEGV on exit from subroutines -- problem with non-stop ?
> 
> Hi,
> 
> I am using gdb 7.2-14.fc14 to work on a large multi-threaded
> application, in C, x86-64.
> 
> I have .gdbinit, per the book:
> 
>   set target-async 1
>   set pagination off
>   set non-stop on
> 
> When I step using 's' or 'n', as it leaves some subroutines I keep
> getting SIGSEGV, such as:
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   signal_set (signo=Cannot access memory at address 0xffffffffffffff5c)
>   at ...
> 
> When I 'disass' the current instruction is a leaveq.  Examining the
> registers I observe that rbp is zero, which is clearly nonsense.
> 
> I found one instance which was repeatable, which happened to be before
> any threads were started: if I 'ni' through a particular function, it
> gets to the leaveq, and gets stuck there.  Each time I do ni, the rsp
> and the rbp are updated by the repeated leaveq, until it goes bang.
> 
> So... I began to think this isn't something complicated to do with
> multiple threads... so here is a test:
> 
> <<--test.c-----------------------------------------------
> #include <stdio.h>
> #include <stdlib.h>
> 
> static void
> target(const char* message) {
> 	printf("%s ...BANG!\n", message) ;
> }
> 
> int main(int argc, char* argv[]) {
> 
> 	target("Light the blue touch paper") ;
> 
> 	return 0 ;
> }
> ------------------------------------------------------->>
> 
> Compiled by gcc 4.5.1 "-g -O0".
> 
> If I do "gdb test", stepping by "n":
> 
> <<-------------------------------------------------------
> (gdb) show non-stop
> Controlling the inferior in non-stop mode is on.
> (gdb) b target
> Breakpoint 1 at 0x4004d0: file test.c, line 6.
> (gdb) run
> Starting program: ...........test
> 
> Breakpoint 1, target (message=0x400615 "Light the blue touch paper")
> at test.c:6
> 6		printf("%s ...BANG!\n", message) ;
> (gdb) n
> Light the blue touch paper ...BANG!
> 7	}
> (gdb) n
> 
> Program received signal SIGSEGV, Segmentation fault.
> target (message=Cannot access memory at address 0xfffffffffffffff8
> ) at test.c:7
> 7	}
> (gdb) info reg
> ....
> rbp   0x0         	0x0
> rsp   0x7fffffffe248	0x7fffffffe248
> ....
> rip   0x4004e9		0x4004e9 	<target+37>
> ....
> ------------------------------------------------------->>
> 
> Or, stepping by 'ni':
> 
> <<-------------------------------------------------------
> (gdb) show non-stop
> Controlling the inferior in non-stop mode is on.
> (gdb) b target
> Breakpoint 1 at 0x4004d0: file test.c, line 6.
> (gdb) disass target
> Dump of assembler code for function target:
>    0x00000000004004c4 <+0>:	push   %rbp
>    0x00000000004004c5 <+1>:	mov    %rsp,%rbp
>    0x00000000004004c8 <+4>:	sub    $0x10,%rsp
>    0x00000000004004cc <+8>:	mov    %rdi,-0x8(%rbp)
>    0x00000000004004d0 <+12>:	mov    $0x400608,%eax
>    0x00000000004004d5 <+17>:	mov    -0x8(%rbp),%rdx
>    0x00000000004004d9 <+21>:	mov    %rdx,%rsi
>    0x00000000004004dc <+24>:	mov    %rax,%rdi
>    0x00000000004004df <+27>:	mov    $0x0,%eax
>    0x00000000004004e4 <+32>:	callq  0x4003b8 <printf@plt>
>    0x00000000004004e9 <+37>:	leaveq
>    0x00000000004004ea <+38>:	retq
> End of assembler dump.
> (gdb) disp/i $pc
> (gdb) run
> Starting program: .......test
> 
> Breakpoint 1, target (message=0x400615 "Light the blue touch paper")
> at test.c:6
> 6		printf("%s ...BANG!\n", message) ;
> .....
> 1: x/i $pc
> => 0x4004e4 <target+32>:	callq  0x4003b8 <printf@plt>
> (gdb) ni
> Light the blue touch paper ...BANG!
> 7	}
> 1: x/i $pc
> => 0x4004e9 <target+37>:	leaveq
> (gdb) ni
> target (message=0x100000000 <Address 0x100000000 out of bounds>) at
> test.c:7
> 7	}
> 1: x/i $pc
> => 0x4004e9 <target+37>:	leaveq
> (gdb) ni
> Cannot access memory at address 0x8
> (gdb) ni
> The program is not being run.
> ------------------------------------------------------->>
> 
> I note that if I turn off the "non-stop" option, it works.  So this is
> something to do with multi-threaded debugging !
> 
> I note also that if I change the target to:
> 
>   static int
>   target(const char* message) {
>           printf("%s ...BANG!\n", message) ;
>           return 0 ;
>   }
> 
> the problem goes away... so one extra instruction between the callq
> and the leaveq makes a difference:
> 
>    0x00000000004004dc <+24>:	mov    %rax,%rdi
>    0x00000000004004df <+27>:	mov    $0x0,%eax
>    0x00000000004004e4 <+32>:	callq  0x4003b8 <printf@plt>
>    0x00000000004004e9 <+37>:	mov    $0x0,%eax
>    0x00000000004004ee <+42>:	leaveq
>    0x00000000004004ef <+43>:	retq
> 
> This goes some way to explaining why it appeared to be a sporadic
> problem.
> 
> Is this me, or is this a bug ?  It used to work :-(
> 
> Thanks,
> 
> Chris



* Re: Non-stop mode disfunctional ?
  2011-03-14 14:20 Non-stop mode disfunctional ? Chris Hall
@ 2011-03-14 14:52 ` Pedro Alves
  2011-03-14 16:17   ` chris.hall.list
  2011-03-24 10:16   ` 'Chris Hall'
  0 siblings, 2 replies; 4+ messages in thread
From: Pedro Alves @ 2011-03-14 14:52 UTC (permalink / raw)
  To: gdb; +Cc: Chris Hall

On Monday 14 March 2011 14:20:11, Chris Hall wrote:
> 
> Sorry to follow up my own posting...

It'd be better to not start a new thread for each post.

> ... I'm hoping to discover if the fact that non-stop doesn't work is
> (a) because I'm not using it correctly, or (b) a well known problem I
> should know about, or (c) an actual bug, or (d) something else ?

I only read your original post diagonally, but I'd suspect
a problem with displaced-stepping.  Try with both non-stop and
target-async off (the default), but enabling
"set breakpoint always-inserted on", and "set displaced-stepping on".

-- 
Pedro Alves




* RE: Non-stop mode disfunctional ?
  2011-03-14 14:52 ` Pedro Alves
@ 2011-03-14 16:17   ` chris.hall.list
  2011-03-24 10:16   ` 'Chris Hall'
  1 sibling, 0 replies; 4+ messages in thread
From: chris.hall.list @ 2011-03-14 16:17 UTC (permalink / raw)
  To: 'Pedro Alves', gdb; +Cc: 'Chris Hall'

Pedro Alves wrote (on Mon 14-Mar-2011 at 14:52):
> > ... I'm hoping to discover if the fact that non-stop doesn't work
> > is (a) because I'm not using it correctly, or (b) a well known
> > problem I should know about, or (c) an actual bug, or
> > (d) something else ?

> I only read your original post diagonally, but I'd suspect a problem
> with displaced-stepping.  Try with both non-stop and target-async
> off (the default), but enabling "set breakpoint always-inserted on",
> and "set displaced-stepping on".

I tried:

  set non-stop off
  set target-async off
  set breakpoint always-inserted on
  set displaced-stepping on

and gdb worked just fine.

I note that "breakpoint always-inserted" and "displaced-stepping"
default to "auto", which behaves as "on" under "non-stop", so I tried:

  set non-stop on
  set target-async on
  set breakpoint always-inserted off
  set displaced-stepping off

and gdb goes BANG (SIGSEGV).

As far as I can tell, with "non-stop" "on", single stepping is borked
(at least on leaving a void function immediately after calling another
function).

In case it makes a difference, I am running AMD Phenom II X6 1090T
(Stepping 0).

Thanks,

Chris




* RE: Non-stop mode disfunctional ?
  2011-03-14 14:52 ` Pedro Alves
  2011-03-14 16:17   ` chris.hall.list
@ 2011-03-24 10:16   ` 'Chris Hall'
  1 sibling, 0 replies; 4+ messages in thread
From: 'Chris Hall' @ 2011-03-24 10:16 UTC (permalink / raw)
  To: gdb

> Sorry to follow up my own posting...

Sorry again.... AFAICS non-stop is *broken*....

> ... I'm hoping to discover if the fact that non-stop doesn't work is
> (a) because I'm not using it correctly, or (b) a well known problem I
> should know about, or (c) an actual bug, or (d) something else ?

....and I could really do with it working again :-(

Am I asking in the wrong place ?

Thanks,

Chris

