From mboxrd@z Thu Jan  1 00:00:00 1970
From: Fernando Nasser
To: Michael Snyder
Cc: Daniel Jacobowitz, gdb-patches@sources.redhat.com, fnasser@cygnus.com
Subject: Re: [rfa/testsuite] Make pthreads test more robust
Date: Mon, 01 Oct 2001 08:17:00 -0000
Message-id: <3BB887DF.D4452C00@redhat.com>
References: <20010928114642.A913@nevyn.them.org> <3BB4C175.42072289@cygnus.com>
X-SW-Source: 2001-10/msg00008.html

(please see below)

Michael Snyder wrote:
> 
> Daniel Jacobowitz wrote:
> > 
> > I've been seeing about 90% failure rate for gdb.threads/pthreads.exp lately.
> > The test which fails is always stopping threads with a control-c.
> > I spent some time debugging this today; it seems that the course of events
> > looks something like this:
> >  - send "continue\n"
> >  - wait
> >  - send "\003"
> >  - read back "Continuing."
> >  - timeout
> > 
> > Note that, among other things, we never see "continue\n" echoed back to us,
> > and yet gdb continues anyway.  Obviously something is fishy in timing.  We
> > also do no reads between the continue and the \003.  My best guess is that
> > gdb does not get scheduled between the two sends; so waiting for the output
> > of continue seems like a good idea.
> > 
> > Also, I'm not entirely sure what 'after 1000 [ ... ]' is supposed to do, but
> > it doesn't seem to delay by any measurable amount of time.  Adding an
> > additional sleep, so that the process has actually continued before we send
> > the ^C, lets the test pass.
> > 
> > Is this OK to commit?  With it, the rest of pthreads.exp passes (except for
> > the last test:
> > break common_routine thread 4
> > Breakpoint 6 at 0x8048666: file ../../../../src-hashtest/gdb/testsuite/gdb.threads/pthreads.c, line 50.
> > (gdb) PASS: gdb.threads/pthreads.exp: set break at common_routine in thread 2
> > continue
> > Continuing.
> > Cannot find thread 1024: generic error
> > (gdb) FAIL: gdb.threads/pthreads.exp: continue to bkpt at common_routine in thread 2
> > )
> 
> The patch looks sane.  I'd like Fernando's blessing, but I'm inclined
> to suggest checking it in and just watching out to see if it breaks
> on any other platform.
> 

I thought I had already responded to that, but I can't find the answer...

I agree with Michael -- let's try.

There is definitely a race condition in there.  Whether or not GDB has
said "Continuing." within 1 second, we send it a Ctrl-C, and I am not
sure what the output will look like if the Ctrl-C arrives before GDB
says "Continuing".  With your change we will be sure that the program is
running before sending the interrupt request.  I guess it is the right
thing to do.

P.S.: The "after" command schedules something to be done after a certain
time.  In this case, after a second (1000 milliseconds), a "\003" will be
sent to GDB.  So, we make it run and then interrupt it.

P.S.2: I wonder if there isn't a second race condition between the 1 sec
to interrupt and the timeout of gdb_expect...

> > 2001-09-28  Daniel Jacobowitz
> > 
> >         * gdb.threads/pthreads.exp: Wait for output and delay
> >         before sending ^C.
> > 
> > Index: pthreads.exp
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/testsuite/gdb.threads/pthreads.exp,v
> > retrieving revision 1.6
> > diff -u -r1.6 pthreads.exp
> > --- pthreads.exp        2001/06/06 18:34:53     1.6
> > +++ pthreads.exp        2001/09/28 15:30:28
> > @@ -248,6 +248,15 @@
> > 
> >      # Send a continue followed by ^C to the process to stop it.
> >      send_gdb "continue\n"
> > +    gdb_expect {
> > +       -re "Continuing." {
> > +           pass "Continue with all threads running"
> > +       }
> > +       timeout {
> > +           fail "Continue with all threads running (timeout)"
> > +       }
> > +    }
> > +    sleep 1
> >      set description "Stopped with a ^C"
> >      after 1000 [send_gdb "\003"]
> >      gdb_expect {

-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9
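
A note on the "after" line discussed above: the P.S. describes what "after" is meant to do, while Daniel reports that 'after 1000 [ ... ]' does not seem to delay at all. One possible explanation, offered as an assumption and not something established in this thread, is ordinary Tcl substitution: a command in square brackets is evaluated before "after" is invoked, so only its result gets scheduled, whereas a script in braces is stored and run later by the event loop. The sketch below is plain Tcl for tclsh, not testsuite code; its send_gdb is a hypothetical stub that only records when it was called, and the delay is shortened to 200 ms.

```tcl
# Hypothetical stand-in for the testsuite's send_gdb: it only records
# how many milliseconds after startup it was called.
set start [clock milliseconds]
set log {}
proc send_gdb {text} {
    global start log
    lappend log [expr {[clock milliseconds] - $start}]
    return ""
}

# Brackets: send_gdb runs right now, during command substitution, and
# "after" merely schedules its (empty) result as the script.
after 200 [send_gdb "\003"]

# Braces: the script is stored unevaluated and is run by the event loop
# once roughly 200 ms have passed.
after 200 {send_gdb "\003"; set done 1}

# Enter the event loop until the deferred script has run.
vwait done

puts "bracketed call ran at [lindex $log 0] ms"
puts "braced call ran at [lindex $log 1] ms"
```

The braced form only fires while an event loop is running (here, vwait). Whether gdb_expect services timers in the same way is not checked here, so this is an illustration of the substitution rule and not a proposed change to pthreads.exp.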