* Problem with threaded program
@ 2001-12-02  8:44 ` David Relson
  2001-12-02 10:41   ` Andrew Cagney
  2001-12-06 16:04   ` Kevin Buettner
  0 siblings, 2 replies; 9+ messages in thread
From: David Relson @ 2001-12-02 8:44 UTC
To: gdb

Greetings,

The problem below was originally reported to the Linux Kernel Mailing List. It looks to me to be a gdb problem.

I used a freshly compiled and installed copy of gdb-5.1 (configured as "i686-pc-linux-gnu") for this test on a Pentium III 500mhz running the 2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to work fine.

Here's the test program, test.c:

#include <stdlib.h>
int main() {
  char *t="1.0";
  double d=0;
  d=strtod(t,(char **)NULL);
  printf( "%f\n", d );
  return 0;
}

Build using "gcc -g -lpthread test.c"; run using "gdb a.out".

If you step through the program one line at a time and display variable d after each assignment, the strtod() call seems to return "nan(0x8000000000000)", which is also shown by print().

If you restart the program with a breakpoint at printf(), let it run, and display d at the breakpoint, the value shown is "1.000000" which is correct.

Is this a defect in gdb, or is my analysis wrong?

David

P.S.
--------------------------------------------------------
David Relson                   Osage Software Systems, Inc.
relson@osagesoftware.com       Ann Arbor, MI 48103
www.osagesoftware.com          tel: 734.821.8800
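For readers wondering what "nan(0x8000000000000)" encodes, here is a minimal sketch, assuming the hex number in gdb's output is the 52-bit fraction field of an IEEE 754 double. Under that assumption the value 0x8000000000000 is just bit 51, the quiet-NaN bit, with no other payload. The names split_double, join_double and is_quiet_nan are hypothetical helpers written for this illustration; they are not part of gdb or glibc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 binary64 value. */
struct double_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, biased by 1023 */
    uint64_t fraction;  /* 52 bits */
};

/* Mask covering the 52 fraction bits. */
#define FRACTION_MASK ((UINT64_C(1) << 52) - 1)

/* Hypothetical helper: split a double into its raw fields. */
struct double_fields split_double(double value)
{
    struct double_fields f;
    uint64_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);  /* well-defined type punning */

    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7ff);
    f.fraction = bits & FRACTION_MASK;
    return f;
}

/* Hypothetical helper: build a double from raw fields. */
double join_double(unsigned sign, unsigned exponent, uint64_t fraction)
{
    double value;
    uint64_t bits = ((uint64_t)(sign & 1u) << 63)
                  | ((uint64_t)(exponent & 0x7ffu) << 52)
                  | (fraction & FRACTION_MASK);

    memcpy(&value, &bits, sizeof value);
    return value;
}

/* A quiet NaN has all exponent bits set and fraction bit 51 set. */
int is_quiet_nan(double value)
{
    struct double_fields f = split_double(value);

    return f.exponent == 0x7ff && ((f.fraction >> 51) & 1) != 0;
}
```

With these helpers, join_double(0, 0x7ff, UINT64_C(1) << 51) builds a quiet NaN whose fraction field is 0x8000000000000, while split_double(1.0) gives exponent 0x3ff and fraction 0. In other words the value displayed while stepping is not a slightly wrong 1.0 but an entirely different bit pattern.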
* Re: Problem with threaded program
  2001-12-02  8:44 ` Problem with threaded program David Relson
@ 2001-12-02 10:41   ` Andrew Cagney
  2001-12-02 15:48     ` Daniel Jacobowitz
  2001-12-06 16:04   ` Kevin Buettner
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Cagney @ 2001-12-02 10:41 UTC
To: David Relson; +Cc: gdb

> Greetings,
>
> The problem below was originally reported to the Linux Kernel Mailing List. It looks to me to be a gdb problem.
>
> I used a freshly compiled and installed copy of gdb-5.1 (configured as "i686-pc-linux-gnu") for this test on a Pentium III 500mhz running the 2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to work fine.
>
> Here's the test program, test.c:
>
> #include <stdlib.h>
> int main() {
>   char *t="1.0";
>   double d=0;
>   d=strtod(t,(char **)NULL);
>   printf( "%f\n", d );
>   return 0;
> }
>
> Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
>
> If you step through the program one line at a time and display variable d after each assignment, the strtod() call seems to return "nan(0x8000000000000)", which is also shown by print().
>
> If you restart the program with a breakpoint at printf(), let it run, and display d at the breakpoint, the value shown is "1.000000" which is correct.
>
> Is this a defect in gdb, or is my analysis wrong?

Ah, this looks like the ``GDB is corrupting a threaded program's FP registers'' problem.

I'm 99% certain this is in the thread-db/kernel interface that GDB is using. Each time this crops up, the problem gets resolved with a kernel/library update.

If someone can point out a definitive explanation I'll add it to the 5.1.1 PROBLEMS file. That way it is at least clearly documented.

The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB switched to using the thread-db/kernel interface.

enjoy,
Andrew
* Re: Problem with threaded program
  2001-12-02 10:41 ` Andrew Cagney
@ 2001-12-02 15:48   ` Daniel Jacobowitz
  2001-12-02 17:24     ` David Relson
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2001-12-02 15:48 UTC
To: Andrew Cagney; +Cc: David Relson, gdb

On Sun, Dec 02, 2001 at 01:40:25PM -0500, Andrew Cagney wrote:
> >Greetings,
> >
> >The problem below was originally reported to the Linux Kernel Mailing List.
> >It looks to me to be a gdb problem.
> >
> >I used a freshly compiled and installed copy of gdb-5.1 (configured as
> >"i686-pc-linux-gnu") for this test on a Pentium III 500mhz running the
> >2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to
> >work fine.
> >
> >Here's the test program, test.c:
> >
> >#include <stdlib.h>
> >int main() {
> >  char *t="1.0";
> >  double d=0;
> >  d=strtod(t,(char **)NULL);
> >  printf( "%f\n", d );
> >  return 0;
> >}
> >
> >Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
> >
> >If you step through the program one line at a time and display variable d
> >after each assignment, the strtod() call seems to return
> >"nan(0x8000000000000)", which is also shown by print().
> >
> >If you restart the program with a breakpoint at printf(), let it run, and
> >display d at the breakpoint, the value shown is "1.000000" which is
> >correct.
> >
> >Is this a defect in gdb, or is my analysis wrong?
>
> Ah, this looks like the ``GDB is corrupting a threaded program's FP
> registers'' problem.
>
> I'm 99% certain this is in the thread-db/kernel interface that GDB is
> using. Each time this crops up, the problem gets resolved with a
> kernel/library update.
>
> If someone can point out a definitive explanation I'll add it to the
> 5.1.1 PROBLEMS file. That way it is at least clearly documented.
>
> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> switched to using the thread-db/kernel interface.
Well, it happens every time we try to step over an fstpl instruction. We never call any of the SETREGS or POKE variants, only GETREGS and GETFPXREGS; I don't see how it could really be our bug.

Note that in the non-threaded case we never call PTRACE_GETFPXREGS at all. That's:
 - an inefficiency in the thread code, not surprisingly
 - highly suggestive of a kernel bug.

My money's on the kernel, but I don't have time to debug this just now.

--
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer
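To make the distinction between the read-only and the writing ptrace requests concrete, here is a rough sketch of a tracer that only fetches a stopped child's floating-point registers and never writes anything back. It is an illustration under assumptions, not gdb's code: it uses PTRACE_GETFPREGS (the message mentions GETFPXREGS, which exists only on 32-bit x86), it compiles its real body only on Linux/x86, and read_child_fp_registers is a hypothetical name.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__linux__) && (defined(__i386__) || defined(__x86_64__))
#include <sys/ptrace.h>
#include <sys/user.h>
#define HAVE_FPREGS_SKETCH 1
#endif

/*
 * Hypothetical helper: fork a child, let it stop itself, and read its FP
 * registers with a read-only ptrace request.  Returns 0 if the registers
 * were read, -1 if any step failed or the platform is not Linux/x86.
 */
int read_child_fp_registers(void)
{
#ifdef HAVE_FPREGS_SKETCH
    struct user_fpregs_struct fpregs;
    int status = 0;
    int result = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced, then stop so the parent can look. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    /* Parent from here on. */
    assert(pid > 0);
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    if (WIFSTOPPED(status)) {
        /* Read-only request: no SETREGS or POKE call is ever made. */
        if (ptrace(PTRACE_GETFPREGS, pid, NULL, &fpregs) == 0)
            result = 0;

        /* The child has served its purpose; kill and reap it. */
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
    }
    return result;
#else
    return -1;
#endif
}
```

A caller would simply check the return value. The sketch says nothing about where the corruption discussed in this thread actually came from; it only shows the kind of request the message describes as the sole ones issued.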
* Re: Problem with threaded program
  2001-12-02 15:48 ` Daniel Jacobowitz
@ 2001-12-02 17:24   ` David Relson
  2001-12-02 21:28     ` Andrew Cagney
  0 siblings, 1 reply; 9+ messages in thread
From: David Relson @ 2001-12-02 17:24 UTC
To: gdb

At 06:49 PM 12/2/01, you wrote:
>On Sun, Dec 02, 2001 at 01:40:25PM -0500, Andrew Cagney wrote:
> >
> > Ah, this looks like the ``GDB is corrupting a threaded program's FP
> > registers'' problem.
> >
> > I'm 99% certain this is in the thread-db/kernel interface that GDB is
> > using. Each time this crops up, the problem gets resolved with a
> > kernel/library update.
> >
> > If someone can point out a definitive explanation I'll add it to the
> > 5.1.1 PROBLEMS file. That way it is at least clearly documented.
> >
> > The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> > switched to using the thread-db/kernel interface.
>
>Well, it happens every time we try to step over an fstpl instruction.
>We never call any of the SETREGS or POKE variants, only GETREGS and
>GETFPXREGS; I don't see how it could really be our bug.
>
>Note that in the non-threaded case we never call PTRACE_GETFPXREGS at
>all. That's:
> - an inefficiency in the thread code, not surprisingly
> - highly suggestive of a kernel bug.
>
>My money's on the kernel, but I don't have time to debug this just now.

Daniel,

I don't know what the division of labor is between gdb and the kernel for situations like this, so I can't comment on that. I can report, however, that the problem doesn't occur with gdb-4.18. I had previously tested with an old copy of gdb-4.18. To remove one possible difference, I have tested for the problem using copies of both gdb-4.18 and gdb-5.1 built today under the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows the problem. If the problem is in the kernel, then 4.18 and 5.1 must be using different capabilities ...

David
* Re: Problem with threaded program
  2001-12-02 17:24 ` David Relson
@ 2001-12-02 21:28   ` Andrew Cagney
  2001-12-03  4:14     ` David Relson
  2001-12-03  7:49     ` Trond Eivind Glomsrød
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cagney @ 2001-12-02 21:28 UTC
To: David Relson; +Cc: gdb

> Daniel,
>
> I don't know what the division of labor is between gdb and the kernel for situations like this, so I can't comment on that. I can report, however, that the problem doesn't occur with gdb-4.18. I had previously tested with an old copy of gdb-4.18. To remove one possible difference, I have tested for the problem using copies of both gdb-4.18 and gdb-5.1 built today under the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows the problem. If the problem is in the kernel, then 4.18 and 5.1 must be using different capabilities ...
>

Yes, very, very different:

>> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
>> switched to using the thread-db/kernel interface.

The best way to describe this is, unfortunately, ``radical surgery''. While the new implementation is significantly better and definitely worth the effort, it is showing up a few teething problems. Just make certain that your kernel/glibc are recent.

sigh,
Andrew
* Re: Problem with threaded program
  2001-12-02 21:28 ` Andrew Cagney
@ 2001-12-03  4:14   ` David Relson
  2001-12-03  7:49   ` Trond Eivind Glomsrød
  1 sibling, 0 replies; 9+ messages in thread
From: David Relson @ 2001-12-03 4:14 UTC
To: gdb

At 12:28 AM 12/3/01, you wrote:
>>Daniel,
>>I don't know what the division of labor is between gdb and the kernel for
>>situations like this, so I can't comment on that. I can report, however,
>>that the problem doesn't occur with gdb-4.18. I had previously tested
>>with an old copy of gdb-4.18. To remove one possible difference, I have
>>tested for the problem using copies of both gdb-4.18 and gdb-5.1 built
>>today under the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows
>>the problem. If the problem is in the kernel, then 4.18 and 5.1 must be
>>using different capabilities ...
>
>Yes, very, very different:
>
>>>The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
>>>switched to using the thread-db/kernel interface.
>
>The best way to describe this is, unfortunately, ``radical surgery''. While
>the new implementation is significantly better and definitely worth the
>effort, it is showing up a few teething problems.

I believe the phrase goes "you have to break an egg to make an omelet", or is it that progress requires "two steps forward and one step backwards"?

> Just make certain that your kernel/glibc are recent.

They are current: the kernel is 2.4.16 and glibc is 2.2.4-6mdk. I'd be willing to update to 2.4.17-pre2 and 2.2.4-7 if you thought it'd make a difference.

Is this something that ought to be brought to the attention of the kernel maintainers? If so, would you notify them? Your deeper knowledge and understanding of the issues involved would be helpful.

Cheers!

David
* Re: Problem with threaded program
  2001-12-02 21:28 ` Andrew Cagney
  2001-12-03  4:14   ` David Relson
@ 2001-12-03  7:49   ` Trond Eivind Glomsrød
  1 sibling, 0 replies; 9+ messages in thread
From: Trond Eivind Glomsrød @ 2001-12-03 7:49 UTC
To: Andrew Cagney; +Cc: David Relson, gdb

Andrew Cagney <ac131313@cygnus.com> writes:

> > Daniel,
> > I don't know what the division of labor is between gdb and the
> > kernel for situations like this, so I can't comment on that. I can
> > report, however, that the problem doesn't occur with gdb-4.18. I
> > had previously tested with an old copy of gdb-4.18. To remove one
> > possible difference, I have tested for the problem using copies of
> > both gdb-4.18 and gdb-5.1 built today under the 2.4.16 kernel.
> > gdb-4.18 works properly and gdb-5.1 shows the problem. If the
> > problem is in the kernel, then 4.18 and 5.1 must be using different
> > capabilities ...
> >
>
> Yes, very, very different:
>
> >> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> >> switched to using the thread-db/kernel interface.
>
> The best way to describe this is, unfortunately, ``radical
> surgery''. While the new implementation is significantly better and
> definitely worth the effort, it is showing up a few teething
> problems. Just make certain that your kernel/glibc are recent.

Doesn't help in this case.

--
Trond Eivind Glomsrød
Red Hat, Inc.
* Re: Problem with threaded program
  2001-12-02  8:44 ` Problem with threaded program David Relson
  2001-12-02 10:41   ` Andrew Cagney
@ 2001-12-06 16:04   ` Kevin Buettner
  2001-12-06 16:37     ` David Relson
  1 sibling, 1 reply; 9+ messages in thread
From: Kevin Buettner @ 2001-12-06 16:04 UTC
To: David Relson, gdb

On Dec 2, 11:44am, David Relson wrote:

> Here's the test program, test.c:
>
> #include <stdlib.h>
> int main() {
>   char *t="1.0";
>   double d=0;
>   d=strtod(t,(char **)NULL);
>   printf( "%f\n", d );
>   return 0;
> }
>
> Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
>
> If you step through the program one line at a time and display variable d
> after each assignment, the strtod() call seems to return
> "nan(0x8000000000000)", which is also shown by print().
>
> If you restart the program with a breakpoint at printf(), let it run, and
> display d at the breakpoint, the value shown is "1.000000" which is correct.
>
> Is this a defect in gdb, or is my analysis wrong?

It's a defect in gdb.

I've just posted a patch which fixes this bug. See

    http://sources.redhat.com/ml/gdb-patches/2001-12/msg00183.html

It hasn't been approved yet, but once it has, I'll push for getting it into 5.1.1 too.

Thanks,

Kevin
* Re: Problem with threaded program
  2001-12-06 16:04 ` Kevin Buettner
@ 2001-12-06 16:37   ` David Relson
  0 siblings, 0 replies; 9+ messages in thread
From: David Relson @ 2001-12-06 16:37 UTC
To: Kevin Buettner; +Cc: gdb

Kevin,

Way to go, fella! I patched my gdb-5.1 source code, rebuilt, tested, and got the right result!

I've forwarded the news to Emmanuel Blindauer, who discovered the problem. I'm sure he'll be pleased.

David

At 07:03 PM 12/6/01, Kevin Buettner wrote:
>On Dec 2, 11:44am, David Relson wrote:
>
> > Here's the test program, test.c:
> >
> > #include <stdlib.h>
> > int main() {
> >   char *t="1.0";
> >   double d=0;
> >   d=strtod(t,(char **)NULL);
> >   printf( "%f\n", d );
> >   return 0;
> > }
> >
> > Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
> >
> > If you step through the program one line at a time and display variable d
> > after each assignment, the strtod() call seems to return
> > "nan(0x8000000000000)", which is also shown by print().
> >
> > If you restart the program with a breakpoint at printf(), let it run, and
> > display d at the breakpoint, the value shown is "1.000000" which is
> > correct.
> >
> > Is this a defect in gdb, or is my analysis wrong?
>
>It's a defect in gdb.
>
>I've just posted a patch which fixes this bug. See
>
>    http://sources.redhat.com/ml/gdb-patches/2001-12/msg00183.html
>
>It hasn't been approved yet, but once it has, I'll push for getting
>it into 5.1.1 too.
>
>Thanks,
>
>Kevin