From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-21894-listarch-gdb-patches=sources.redhat.com@sources.redhat.com>
Received: (qmail 29245 invoked by alias); 16 Jan 2003 17:05:49 -0000
Mailing-List: contact gdb-patches-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-patches-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sources.redhat.com>
List-Help: <mailto:gdb-patches-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-patches-owner@sources.redhat.com
Received: (qmail 29234 invoked from network); 16 Jan 2003 17:05:45 -0000
Received: from unknown (HELO crack.them.org) (65.125.64.184)
  by sources.redhat.com with SMTP; 16 Jan 2003 17:05:45 -0000
Received: from nevyn.them.org ([66.93.61.169] ident=mail)
	by crack.them.org with asmtp (Exim 3.12 #1 (Debian))
	id 18ZFLP-0005qm-00; Thu, 16 Jan 2003 13:06:27 -0600
Received: from drow by nevyn.them.org with local (Exim 3.36 #1 (Debian))
	id 18ZDSa-00059y-00; Thu, 16 Jan 2003 12:05:44 -0500
Date: Thu, 16 Jan 2003 17:05:00 -0000
From: Daniel Jacobowitz <drow@mvista.com>
To: Andrew Cagney <ac131313@redhat.com>
Cc: Fernando Nasser <fnasser@redhat.com>, gdb-patches@sources.redhat.com
Subject: Re: [patch/rfc] Remove all setup_xfail's from testsuite/gdb.mi/
Message-ID: <20030116170544.GA19605@nevyn.them.org>
Mail-Followup-To: Andrew Cagney <ac131313@redhat.com>,
	Fernando Nasser <fnasser@redhat.com>,
	gdb-patches@sources.redhat.com
References: <3DB83EC1.6070609@redhat.com> <3E25845A.4030405@redhat.com> <3E259999.8000503@redhat.com> <3E26E389.7040704@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3E26E389.7040704@redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2003-01/txt/msg00609.txt.bz2

On Thu, Jan 16, 2003 at 11:53:29AM -0500, Andrew Cagney wrote:
> >Andrew Cagney wrote:
> >Fernando,
> >
> >What is the resolution of this thread?
> >http://sources.redhat.com/ml/gdb-patches/2002-10/msg00508.html
> >(you'll probably want to read all of it :-)
> >(I'm now going back and emptying my mail box :-()
> >
> >Andrew
> >
> >
> >Hi Andrew,
> >
> >I am just starting to think about this, so bear with me while I think 
> >aloud a little bit :-)  There are lots of good points made on that 
> >discussion and I don't think this is a clear-cut yes or no, let's vote, or 
> >anything like that.  I am trying to see if I can collect all good 
> >suggestions and propose a course of action.
> >
> >It seems that a widespread concern is not to throw the baby out with the 
> >bath water. On the other hand there is the feeling that if we do not act 
> >now nothing will ever happen.  And we all agree that many xfails were added 
> >incorrectly.
> >
> >Independently of the "big plan", one thing is clear.  If you know that a 
> >xfail is incorrect (I mean, know for good), then it should be ripped off. 
> >But read on...
> >
> >Maybe we may not be in full agreement here, but I think it would be nice 
> >to enter a bug report even if it does not include a full analysis -- that 
> >should come eventually.  Something like "something is wrong with the 
> >'--xxx-yyy' command output" may eventually cause someone to look into it.  
> >And you can make the xfail a kfail in that case, which is much better.  
> >But I know it is much more work than just removing the xfail and letting 
> >the test fail -- you would have to enter 28 bug reports just for the MI 
> >tests.  Still sounds like a better approach to me, but there is no 
> >consensus on it.
> 
> Some things I think worth noting.
> 
> In this process, I don't think it is reasonable to make it a requirement 
> that people fully kfail/analyze old badly specified xfails.  Rather I 
> think it is only reasonable to ask people to ensure that new tests, or 
> changes to existing tests, meet the new requirement (there must be a bug 
> report).
> 
> An initial bar is established, and then, over time, that bar is raised.
> 
> The difference is important.  It avoids the problem of development 
> stalling because of the impossible requirement of fixing all the old code 
> before the new code can be approved.

I don't think making it a requirement that we go out and analyze all the
existing XFAILs is reasonable, although it is patently something we
need to do.  That's not the same as ripping them out and introducing
failures in the test results without addressing those failures.

> As a specific example, the i386 has an apparently low failure rate. 
> That rate is badly misleading and the real number of failures is much 
> higher :-(  It's just that those failures have been [intentionally] 
> camouflaged using xfail.  It would be unfortunate if people, for the 
> i386, tried to use that false result (almost zero fails) when initially 
> setting the bar.

Have you reviewed the list of XFAILs?  None of them are related to the
i386.  One, in signals.exp, is either related to GDB's handling of
signals or to a longstanding limitation in most operating system
kernels, depending on how you look at it.  The rest are pretty much
platform independent.

> This is also why I think the xfail's should simply be yanked.  It acts 
> as a one time reset of gdb's test results, restoring them to their true 
> values.   While this may cause the bar to start out lower than some 
> would like,  I think that is far better and far more realistic than 
> trying to start with a bar falsely set too high.

This is a _regression_ testsuite.  I've been trying for months to get
it down to zero failures without compromising its integrity, and I've
just about done it for one target, by judicious use of KFAILs (and
fixing bugs!).  The existing XFAILs all look to me like either
legitimate XFAILs or things that should be KFAILed.  If you're going
to rip up my test results, please sort them accordingly first.

It doesn't need to be done all at once.  We can put markers in .exp
files saying "xfails audited".  But I think that we should audit
individual files, not yank madly.

Am I the only one who considers well-categorized results important?  If
you introduce seventy failures, then that's another couple of weeks I
can't just look at the results, see "oh, two failures in threads and
that's it, I didn't break anything".

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer