From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 105773 invoked by alias); 1 Oct 2018 09:36:34 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 105761 invoked by uid 89); 1 Oct 2018 09:36:32 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=meanwhile, vintage, spuriously, row X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 01 Oct 2018 09:36:30 +0000 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BF8BF44BD5; Mon, 1 Oct 2018 09:36:28 +0000 (UTC) Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id A9E1E308BDAF; Mon, 1 Oct 2018 09:36:27 +0000 (UTC) Subject: dg-extract-results's bad sorting behavior (Re: Update dg-extract-results.* from gcc) To: Rainer Orth References: <515bfbff-e2f7-3574-2726-ee1592786891@redhat.com> <400a6540-c7fb-8de8-d5dc-3f07e2d632ca@redhat.com> Cc: gdb-patches@sourceware.org From: Pedro Alves Message-ID: Date: Mon, 01 Oct 2018 09:36:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <400a6540-c7fb-8de8-d5dc-3f07e2d632ca@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-SW-Source: 2018-10/txt/msg00007.txt.bz2 On 08/08/2018 04:24 PM, Pedro Alves wrote: > On 08/08/2018 03:36 PM, Rainer Orth wrote: > >>> On 07/20/2018 12:02 PM, Rainer Orth wrote: >>>> When looking at the gdb.sum file produced by dg-extract-results.sh on >>>> Solaris 11/x86, I noticed some wrong sorting, like this: >>>> >>>> PASS: gdb.ada/addr_arith.exp: print something'address + 0 >>>> PASS: gdb.ada/addr_arith.exp: print 0 + something'address >>>> PASS: gdb.ada/addr_arith.exp: print something'address - 0 >>>> PASS: gdb.ada/addr_arith.exp: print 0 - something'address >>>> >>>> Looking closer, I noticed that while dg-extract-results.sh had been >>>> copied over from contrib in the gcc repo, the corresponding >>>> dg-extract-results.py file had not. The latter not only fixes the >>>> sorting problem I'd observed, but is also way faster than the shell >>>> version (like a factor of 50 faster). >>> >>> We used to have the dg-extract-results.py file, but we deleted it >>> because it caused (funnily enough, sorting) problems. See: >>> >>> https://sourceware.org/ml/gdb-patches/2015-02/msg00333.html >>> >>> Has that sorting stability issue been meanwhile fixed upstream? >> >> not that I can see: between the version of dg-extract-results.py removed >> in early 2015 and the one in current gcc trunk, there's only added >> handling for DejaGnu ERRORs and another minor change to do with >> summaries that doesn't seem to change anything wrt. sorting on first >> blush. > > OK. > >> Howver, I've just run make -j16 check three times in a row on >> amd64-pc-solaris2.11, followed by make -j48 check, and the only >> differences were to to 200+ racy tests, the vast majority of them in >> gdb.threads. > > Thanks for testing. > >> Maybe the prior problems have been due to bugs in older >> versions of python? > Might be. > > IIRC, the sorting that would change would be the order that the different > individual gdb.sum results would be merged (the per-gdb.sum order was > stable). So comparing two runs, you'd get something like, in one run this: > > gdb.base/foo.exp:PASS: test1 > gdb.base/foo.exp:PASS: test2 > gdb.base/foo.exp:PASS: test3 > gdb.base/bar.exp:PASS: testA > gdb.base/bar.exp:PASS: testB > gdb.base/bar.exp:PASS: testC > > and another run this: > > gdb.base/bar.exp:PASS: testA > gdb.base/bar.exp:PASS: testB > gdb.base/bar.exp:PASS: testC > gdb.base/foo.exp:PASS: test1 > gdb.base/foo.exp:PASS: test2 > gdb.base/foo.exp:PASS: test3 > > which would result in all those tests spuriously showing up in a > gdb.sum old/new diff. > > I'm not sure whether we were seeing that if you compared runs > of the same tree multiple times. It could be that it only happened > when comparing the results of different trees, which contained > a slightly different set of tests and testcases, like for example > comparing testresults of a patched master against the testresults > of master from a month or week ago, which is something I frequently > do, for example. > > *time passes* > > Wait wait wait, can you clarify what you meant by wrong sorting in: > > PASS: gdb.ada/addr_arith.exp: print something'address + 0 > PASS: gdb.ada/addr_arith.exp: print 0 + something'address > PASS: gdb.ada/addr_arith.exp: print something'address - 0 > PASS: gdb.ada/addr_arith.exp: print 0 - something'address > > ? > > Why do you think those results _should_ be sorted? And in what order? > > Typically, the order/sequence in which the tests of a given exp > file is executed is important. The order in the gdb.sum file must > be the order in which the fail/pass calls are written/issued in the .exp file. > It'd be absolutely incorrect to alphabetically sort the gdb.sum output. > Is that what the .py version does? That's not what I recall, though. > I guess I may be confused. Getting back to this, because I just diffed testresults between runs of different vintage, and got bitten by the sorting problems. I'm diffing testresults between a run on 20180713 and a run against today's gdb, and I got a _ton_ of spurious diffs like these: -PASS: gdb.ada/complete.exp: complete p my_glob -PASS: gdb.ada/complete.exp: complete p insi -PASS: gdb.ada/complete.exp: complete p inner.insi -PASS: gdb.ada/complete.exp: complete p pck.inne -PASS: gdb.ada/complete.exp: complete p pck__inner__ins -PASS: gdb.ada/complete.exp: complete p pck.inner.ins -PASS: gdb.ada/complete.exp: complete p side -PASS: gdb.ada/complete.exp: complete p exported +PASS: gdb.ada/complete.exp: complete break ada PASS: gdb.ada/complete.exp: complete p -PASS: gdb.ada/complete.exp: p Exported_Capitalized -PASS: gdb.ada/complete.exp: p exported_capitalized -PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra PASS: gdb.ada/complete.exp: complete p <__gnat_ada_main_prog -PASS: gdb.ada/complete.exp: complete p some +PASS: gdb.ada/complete.exp: complete p +PASS: gdb.ada/complete.exp: p Exported_Capitalized +PASS: gdb.ada/complete.exp: p exported_capitalized PASS: gdb.ada/complete.exp: set max-completions unlimited -PASS: gdb.ada/complete.exp: complete break ada Given the earlier discussions about sorting, I could immediately recognize what is wrong. It's that while testsuite/outputs/gdb.ada/complete/gdb.sum lists the test results in chronological order, preserving execution sequence: Running src/gdb/testsuite/gdb.ada/complete.exp ... PASS: gdb.ada/complete.exp: compilation foo.adb PASS: gdb.ada/complete.exp: complete p my_glob PASS: gdb.ada/complete.exp: complete p insi PASS: gdb.ada/complete.exp: complete p inner.insi PASS: gdb.ada/complete.exp: complete p pck.inne PASS: gdb.ada/complete.exp: complete p pck__inner__ins PASS: gdb.ada/complete.exp: complete p pck.inner.ins PASS: gdb.ada/complete.exp: complete p side PASS: gdb.ada/complete.exp: complete p exported PASS: gdb.ada/complete.exp: complete p PASS: gdb.ada/complete.exp: p Exported_Capitalized PASS: gdb.ada/complete.exp: p exported_capitalized PASS: gdb.ada/complete.exp: complete p __gnat_ada_main_progra PASS: gdb.ada/complete.exp: complete p <__gnat_ada_main_prog PASS: gdb.ada/complete.exp: complete p some PASS: gdb.ada/complete.exp: complete p not_in_sco PASS: gdb.ada/complete.exp: complete p pck.ins PASS: gdb.ada/complete.exp: complete p pck.my PASS: gdb.ada/complete.exp: complete p pck.inne PASS: gdb.ada/complete.exp: complete p pck.inner. PASS: gdb.ada/complete.exp: complete p local_ident PASS: gdb.ada/complete.exp: complete p pck.local_ident PASS: gdb.ada/complete.exp: complete p pck__local_ident PASS: gdb.ada/complete.exp: complete p external_ident PASS: gdb.ada/complete.exp: complete p pck PASS: gdb.ada/complete.exp: complete p pck. PASS: gdb.ada/complete.exp: complete p PASS: gdb.ada/complete.exp: p Exported_Capitalized PASS: gdb.ada/complete.exp: p exported_capitalized PASS: gdb.ada/complete.exp: set max-completions unlimited ... which is clearly incorrect. So you won't see the problem if you compare test results of two runs that both postdate the dg-extract-results update, and if they're both run in parallel mode. I assume the problem is visible if you compare a parallel mode run against a serial mode run, since the latter won't sort. Is this something that can be easily fixed? Thanks, Pedro Alves