* Problem with threaded program
@ 2001-12-02 8:44 ` David Relson
2001-12-02 10:41 ` Andrew Cagney
2001-12-06 16:04 ` Kevin Buettner
0 siblings, 2 replies; 9+ messages in thread
From: David Relson @ 2001-12-02 8:44 UTC
To: gdb
Greetings,
The problem below was originally reported to the Linux Kernel Mailing
List. It looks to me to be a gdb problem.
I used a freshly compiled and installed copy of gdb-5.1 (configured as
"i686-pc-linux-gnu") for this test on a 500 MHz Pentium III running the
2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to
work fine.
Here's the test program, test.c:
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* strtod */

int main() {
    char *t = "1.0";
    double d = 0;
    d = strtod(t, (char **)NULL);
    printf("%f\n", d);
    return 0;
}
Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
If you step through the program one line at a time and display variable d
after each assignment, the strtod() call seems to return
"nan(0x8000000000000)", which is also shown by print().
If you restart the program with a breakpoint at printf(), let it run, and
display d at the breakpoint, the value shown is "1.000000" which is correct.
Is this a defect in gdb, or is my analysis wrong?
David
P.S.
--------------------------------------------------------
David Relson Osage Software Systems, Inc.
relson@osagesoftware.com Ann Arbor, MI 48103
www.osagesoftware.com tel: 734.821.8800
* Re: Problem with threaded program
2001-12-02 8:44 ` Problem with threaded program David Relson
@ 2001-12-02 10:41 ` Andrew Cagney
2001-12-02 15:48 ` Daniel Jacobowitz
2001-12-06 16:04 ` Kevin Buettner
1 sibling, 1 reply; 9+ messages in thread
From: Andrew Cagney @ 2001-12-02 10:41 UTC
To: David Relson; +Cc: gdb
> Greetings,
>
> The problem below was originally reported to the Linux Kernel Mailing List. It looks to me to be a gdb problem.
>
> I used a freshly compiled and installed copy of gdb-5.1 (configured as "i686-pc-linux-gnu") for this test on a Pentium III 500mhz running the 2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to work fine.
>
> Here's the test program, test.c:
>
> #include <stdlib.h>
> int main() {
> char *t="1.0";
> double d=0;
> d=strtod(t,(char **)NULL);
> printf( "%f\n", d );
> return 0;
> }
>
> Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
>
> If you step through the program one line at a time and display variable d after each assignment, the strtod() call seems to return "nan(0x8000000000000)", which is also shown by print().
>
> If you restart the program with a breakpoint at printf(), let it run, and display d at the breakpoint, the value shown is "1.000000" which is correct.
>
> Is this a defect in gdb, or is my analysis wrong?
Ah, this looks like the ``GDB is corrupting a threaded program's FP
registers'' problem.
I'm 99% certain this is in the thread-db/kernel interface that GDB is
using. Each time this crops up, the problem gets resolved with a
kernel/library update.
If someone can point out a definitive explanation I'll add it to the
5.1.1 PROBLEMS file. That way it is at least clearly documented.
The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
switched to using the thread-db/kernel interface.
enjoy,
Andrew
* Re: Problem with threaded program
2001-12-02 10:41 ` Andrew Cagney
@ 2001-12-02 15:48 ` Daniel Jacobowitz
2001-12-02 17:24 ` David Relson
0 siblings, 1 reply; 9+ messages in thread
From: Daniel Jacobowitz @ 2001-12-02 15:48 UTC
To: Andrew Cagney; +Cc: David Relson, gdb
On Sun, Dec 02, 2001 at 01:40:25PM -0500, Andrew Cagney wrote:
> >Greetings,
> >
> >The problem below was originally reported to the Linux Kernel Mailing List.
> >It looks to me to be a gdb problem.
> >
> >I used a freshly compiled and installed copy of gdb-5.1 (configured as
> >"i686-pc-linux-gnu") for this test on a Pentium III 500mhz running the
> >2.4.16 kernel. The same problem happens with gdb-5.0. gdb-4.18 appears to
> >work fine.
> >
> >Here's the test program, test.c:
> >
> >#include <stdlib.h>
> >int main() {
> > char *t="1.0";
> > double d=0;
> > d=strtod(t,(char **)NULL);
> > printf( "%f\n", d );
> > return 0;
> >}
> >
> >Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
> >
> >If you step through the program one line at a time and display variable d
> >after each assignment, the strtod() call seems to return
> >"nan(0x8000000000000)", which is also shown by print().
> >
> >If you restart the program with a breakpoint at printf(), let it run, and
> >display d at the breakpoint, the value shown is "1.000000" which is
> >correct.
> >
> >Is this a defect in gdb, or is my analysis wrong?
>
> Ah, this looks like the ``GDB is corrupting a threaded program's FP
> registers'' problem.
>
> I'm 99% certain this is in the thread-db/kernel interface that GDB is
> using. Each time this crops up, the problem gets resolved with a
> kernel/library update.
>
> If someone can point out a definitive explanation I'll add it to the
> 5.1.1 PROBLEMS file. That way it is at least clearly documented.
>
> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> switched to using the thread-db/kernel interface.
Well, it happens every time we try to step over an fstpl instruction.
We never call any of the SETREGS or POKE variants, only GETREGS and
GETFPXREGS; I don't see how it could really be our bug.
Note that in the non-threaded case we never call PTRACE_GETFPXREGS at
all. That's:
 - an inefficiency in the thread code, not surprisingly
- highly suggestive of a kernel bug.
My money's on the kernel, but I don't have time to debug this just now.
--
Daniel Jacobowitz Carnegie Mellon University
MontaVista Software Debian GNU/Linux Developer
* Re: Problem with threaded program
2001-12-02 15:48 ` Daniel Jacobowitz
@ 2001-12-02 17:24 ` David Relson
2001-12-02 21:28 ` Andrew Cagney
0 siblings, 1 reply; 9+ messages in thread
From: David Relson @ 2001-12-02 17:24 UTC
To: gdb
At 06:49 PM 12/2/01, you wrote:
>On Sun, Dec 02, 2001 at 01:40:25PM -0500, Andrew Cagney wrote:
>
> >
> > Ah, this looks like the ``GDB is corrupting a threaded program's FP
> > registers'' problem.
> >
> > I'm 99% certain this is in the thread-db/kernel interface that GDB is
> > using. Each time this crops up, the problem gets resolved with a
> > kernel/library update.
> >
> > If someone can point out a definitive explanation I'll add it to the
> > 5.1.1 PROBLEMS file. That way it is at least clearly documented.
> >
> > The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> > switched to using the thread-db/kernel interface.
>
>Well, it happens every time we try to step over an fstpl instruction.
>We never call any of the SETREGS or POKE variants, only GETREGS and
>GETFPXREGS; I don't see how it could really be our bug.
>
>Note that in the non-threaded case we never call PTRACE_GETFPXREGS at
>all. That's:
> - an inefficiency in the thread code, not surprisingly
> - highly suggestive of a kernel bug.
>
>My money's on the kernel, but I don't have time to debug this just now.
Daniel,
I don't know what the division of labor is between gdb and the kernel for
situations like this, so I can't comment on that. I can report, however,
that the problem doesn't occur with gdb-4.18. I had previously tested with
an old copy of gdb-4.18. To remove one possible difference, I have tested
for the problem using copies of both gdb-4.18 and gdb-5.1 built today under
the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows the
problem. If the problem is in the kernel, then 4.18 and 5.1 must be using
different capabilities ...
David
* Re: Problem with threaded program
2001-12-02 17:24 ` David Relson
@ 2001-12-02 21:28 ` Andrew Cagney
2001-12-03 4:14 ` David Relson
2001-12-03 7:49 ` Trond Eivind Glomsrød
0 siblings, 2 replies; 9+ messages in thread
From: Andrew Cagney @ 2001-12-02 21:28 UTC
To: David Relson; +Cc: gdb
> Daniel,
>
> I don't know what the division of labor is between gdb and the kernel for situations like this, so I can't comment on that. I can report, however, that the problem doesn't occur with gdb-4.18. I had previously tested with an old copy of gdb-4.18. To remove one possible difference, I have tested for the problem using copies of both gdb-4.18 and gdb-5.1 built today under the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows the problem. If the problem is in the kernel, then 4.18 and 5.1 must be using different capabilities ...
>
Yes, very, very different:
>> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
>> switched to using the thread-db/kernel interface.
The best way to describe this is, unfortunately, ``radical surgery''.
While the new implementation is significantly better and definitely worth
the effort, it is showing up a few teething problems. Just
make certain that your kernel/glibc are recent.
sigh,
Andrew
* Re: Problem with threaded program
2001-12-02 21:28 ` Andrew Cagney
@ 2001-12-03 4:14 ` David Relson
2001-12-03 7:49 ` Trond Eivind Glomsrød
1 sibling, 0 replies; 9+ messages in thread
From: David Relson @ 2001-12-03 4:14 UTC
To: gdb
At 12:28 AM 12/3/01, you wrote:
>>Daniel,
>>I don't know what the division of labor is between gdb and the kernel for
>>situations like this, so I can't comment on that. I can report, however,
>>that the problem doesn't occur with gdb-4.18. I had previously tested
>>with an old copy of gdb-4.18. To remove one possible difference, I have
>>tested for the problem using copies of both gdb-4.18 and gdb-5.1 built
>>today under the 2.4.16 kernel. gdb-4.18 works properly and gdb-5.1 shows
>>the problem. If the problem is in the kernel, then 4.18 and 5.1 must be
>>using different capabilities ...
>
>Yes, very, very different:
>
>>>The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
>>>switched to using the thread-db/kernel interface.
>
>The best way to describe this is, unfortunately, ``radical surgery''. While
>the new implementation is significantly better and definitely worth the
>effort, it is showing up a few teething problems.
I believe the phrase goes "you have to break an egg to make an omelet" or
is it that progress requires "two steps forward and one step backwards"?
> Just make certain that your kernel/glibc are recent.
They are current: the kernel is 2.4.16 and glibc is 2.2.4-6mdk. I'd be
willing to update to 2.4.17-pre2 and 2.2.4-7 if you think it would make a
difference.
Is this something that ought to be brought to the attention of the kernel
maintainers? If so, would you notify them? Your deeper knowledge and
understanding of the issues involved would be helpful.
Cheers!
David
* Re: Problem with threaded program
2001-12-02 21:28 ` Andrew Cagney
2001-12-03 4:14 ` David Relson
@ 2001-12-03 7:49 ` Trond Eivind Glomsrød
1 sibling, 0 replies; 9+ messages in thread
From: Trond Eivind Glomsrød @ 2001-12-03 7:49 UTC
To: Andrew Cagney; +Cc: David Relson, gdb
Andrew Cagney <ac131313@cygnus.com> writes:
> > Daniel,
> > I don't know what the division of labor is between gdb and the
> > kernel for situations like this, so I can't comment on that. I can
> > report, however, that the problem doesn't occur with gdb-4.18. I
> > had previously tested with an old copy of gdb-4.18. To remove one
> > possible difference, I have tested for the problem using copies of
> > both gdb-4.18 and gdb-5.1 built today under the 2.4.16 kernel.
> > gdb-4.18 works properly and gdb-5.1 shows the problem. If the
> > problem is in the kernel, then 4.18 and 5.1 must be using different
> > capabilities ...
> >
>
> Yes, very, very different:
>
> >> The apparent 4.18 -> 5.0 ``breakage'' would have occurred because GDB
> >> switched to using the thread-db/kernel interface.
>
> The best way to describe this is, unfortunately, ``radical
> surgery''. While the new implementation is significantly better and
> definitely worth the effort, it is showing up a few teething
> problems. Just make certain that your kernel/glibc are recent.
Doesn't help in this case.
--
Trond Eivind Glomsrød
Red Hat, Inc.
* Re: Problem with threaded program
2001-12-02 8:44 ` Problem with threaded program David Relson
2001-12-02 10:41 ` Andrew Cagney
@ 2001-12-06 16:04 ` Kevin Buettner
2001-12-06 16:37 ` David Relson
1 sibling, 1 reply; 9+ messages in thread
From: Kevin Buettner @ 2001-12-06 16:04 UTC
To: David Relson, gdb
On Dec 2, 11:44am, David Relson wrote:
> Here's the test program, test.c:
>
> #include <stdlib.h>
> int main() {
> char *t="1.0";
> double d=0;
> d=strtod(t,(char **)NULL);
> printf( "%f\n", d );
> return 0;
> }
>
> Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
>
> If you step through the program one line at a time and display variable d
> after each assignment, the strtod() call seems to return
> "nan(0x8000000000000)", which is also shown by print().
>
> If you restart the program with a breakpoint at printf(), let it run, and
> display d at the breakpoint, the value shown is "1.000000" which is correct.
>
> Is this a defect in gdb, or is my analysis wrong?
It's a defect in gdb.
I've just posted a patch which fixes this bug. See
http://sources.redhat.com/ml/gdb-patches/2001-12/msg00183.html
It hasn't been approved yet, but once it has, I'll push for getting
it into 5.1.1 too.
Thanks,
Kevin
* Re: Problem with threaded program
2001-12-06 16:04 ` Kevin Buettner
@ 2001-12-06 16:37 ` David Relson
0 siblings, 0 replies; 9+ messages in thread
From: David Relson @ 2001-12-06 16:37 UTC
To: Kevin Buettner; +Cc: gdb
Kevin,
Way to go, fella! I patched my gdb-5.1 source code, rebuilt, tested, and
got the right result!
I've forwarded the news to Emmanuel Blindauer who discovered the
problem. I'm sure he'll be pleased.
David
At 07:03 PM 12/6/01, Kevin Buettner wrote:
>On Dec 2, 11:44am, David Relson wrote:
>
> > Here's the test program, test.c:
> >
> > #include <stdlib.h>
> > int main() {
> > char *t="1.0";
> > double d=0;
> > d=strtod(t,(char **)NULL);
> > printf( "%f\n", d );
> > return 0;
> > }
> >
> > Build using "gcc -g -lpthread test.c"; run using "gdb a.out".
> >
> > If you step through the program one line at a time and display variable d
> > after each assignment, the strtod() call seems to return
> > "nan(0x8000000000000)", which is also shown by print().
> >
> > If you restart the program with a breakpoint at printf(), let it run, and
> > display d at the breakpoint, the value shown is "1.000000" which is
> > correct.
> >
> > Is this a defect in gdb, or is my analysis wrong?
>
>It's a defect in gdb.
>
>I've just posted a patch which fixes this bug. See
>
> http://sources.redhat.com/ml/gdb-patches/2001-12/msg00183.html
>
>It hasn't been approved yet, but once it has, I'll push for getting
>it into 5.1.1 too.
>
>Thanks,
>
>Kevin