From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 19115 invoked by alias); 11 Jun 2018 16:37:00 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 19096 invoked by uid 89); 11 Jun 2018 16:37:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: =?ISO-8859-1?Q?No, score=-26.1 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,MIME_BASE64_BLANKS,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy=Tuesday, tuesday, Hx-languages-length:2239, It=e2?= X-HELO: EUR01-HE1-obe.outbound.protection.outlook.com Received: from mail-he1eur01on0077.outbound.protection.outlook.com (HELO EUR01-HE1-obe.outbound.protection.outlook.com) (104.47.0.77) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Jun 2018 16:36:57 +0000 Received: from DB6PR0802MB2133.eurprd08.prod.outlook.com (10.172.226.148) by DB6PR0802MB2264.eurprd08.prod.outlook.com (10.172.227.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.841.15; Mon, 11 Jun 2018 16:36:54 +0000 Received: from DB6PR0802MB2133.eurprd08.prod.outlook.com ([fe80::d984:bdee:1856:c64]) by DB6PR0802MB2133.eurprd08.prod.outlook.com ([fe80::d984:bdee:1856:c64%7]) with mapi id 15.20.0841.019; Mon, 11 Jun 2018 16:36:54 +0000 From: Alan Hayward To: Simon Marchi CC: "gdb-patches@sourceware.org" , nd Subject: Re: [PATCH v2 10/10] Remove reg2 section from Aarch64 SVE cores Date: Mon, 11 Jun 2018 16:37:00 -0000 Message-ID: <0E69B756-49D8-4802-8EBC-FCB27859F981@arm.com> References: <20180606151629.36602-1-alan.hayward@arm.com> <20180606151629.36602-11-alan.hayward@arm.com> In-Reply-To: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Alan.Hayward@arm.com; x-ms-publictraffictype: Email 
x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-ht: Tenant x-ms-traffictypediagnostic: DB6PR0802MB2264: nodisclaimer: True x-exchange-antispam-report-test: UriScan:(180628864354917); x-ms-exchange-senderadcheck: 1 x-forefront-prvs: 070092A9D3 received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="utf-8" Content-ID: <09B53C68022C2A4888680864049E0CA9@eurprd08.prod.outlook.com> Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-MS-Office365-Filtering-Correlation-Id: 2b504b99-5eed-4cb4-8ff7-08d5cfb98a98 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2b504b99-5eed-4cb4-8ff7-08d5cfb98a98 X-MS-Exchange-CrossTenant-originalarrivaltime: 11 Jun 2018 16:36:54.2875 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR0802MB2264 X-IsSubscribed: yes X-SW-Source: 2018-06/txt/msg00283.txt.bz2 DQoNCj4gT24gMTEgSnVuIDIwMTgsIGF0IDAzOjQ3LCBTaW1vbiBNYXJjaGkg PHNpbWFya0BzaW1hcmsuY2E+IHdyb3RlOg0KPiANCj4gT24gMjAxOC0wNi0w NiAxMToxNiBBTSwgQWxhbiBIYXl3YXJkIHdyb3RlOg0KPj4gcmVnMiBzZWN0 aW9ucyBpbiBTVkUgYmluYXJpZXMgd2lsbCBjYXVzZSBnZGIgdG8gc2VnZmF1 bHQgb24gbG9hZGluZw0KPj4gZHVlIHRvIG1pc2NhbGN1bGF0aW5nIHRoZSBy ZWdpc3RlciBzaXplLg0KPj4gDQo+PiBGb3Igbm93LCBzaW1wbHkgcmVtb3Zl IHJlZzIgZnJvbSBTVkUgY29yZSBmaWxlcy4gVGhpcyByZXN1bHRzIGluDQo+ PiBjb3JlIGZpbGVzIHdpdGhvdXQgYW55IHZlY3Rvci9mbG9hdCByZWdpc3Rl ci4gRnVsbCBjb3JlIHN1cHBvcnQNCj4+IGZvciBTVkUgd2lsbCBjb21lIGlu IGEgbGF0ZXIgc2V0IG9mIHBhdGNoZXMuDQo+PiANCj4+IDIwMTgtMDYtMDYg IEFsYW4gSGF5d2FyZCAgPGFsYW4uaGF5d2FyZEBhcm0uY29tPg0KPj4gDQo+ PiBnZGIvDQo+PiAJKiBhYXJjaDY0LWxpbnV4LXRkZXAuYw0KPj4gCShhYXJj aDY0X2xpbnV4X2l0ZXJhdGVfb3Zlcl9yZWdzZXRfc2VjdGlvbnMpOiBDaGVj ayBmb3IgU1ZFLg0KPj4gLS0tDQo+PiBnZGIvYWFyY2g2NC1saW51eC10ZGVw 
LmMgfCA3ICsrKysrLS0NCj4+IDEgZmlsZSBjaGFuZ2VkLCA1IGluc2VydGlv bnMoKyksIDIgZGVsZXRpb25zKC0pDQo+PiANCj4+IGRpZmYgLS1naXQgYS9n ZGIvYWFyY2g2NC1saW51eC10ZGVwLmMgYi9nZGIvYWFyY2g2NC1saW51eC10 ZGVwLmMNCj4+IGluZGV4IDk2ZGM4YTExMzIuLmIwNWJiNDlhZTggMTAwNjQ0 DQo+PiAtLS0gYS9nZGIvYWFyY2g2NC1saW51eC10ZGVwLmMNCj4+ICsrKyBi L2dkYi9hYXJjaDY0LWxpbnV4LXRkZXAuYw0KPj4gQEAgLTIyNywxMCArMjI3 LDEzIEBAIGFhcmNoNjRfbGludXhfaXRlcmF0ZV9vdmVyX3JlZ3NldF9zZWN0 aW9ucyAoc3RydWN0IGdkYmFyY2ggKmdkYmFyY2gsDQo+PiAJCQkJCSAgICB2 b2lkICpjYl9kYXRhLA0KPj4gCQkJCQkgICAgY29uc3Qgc3RydWN0IHJlZ2Nh Y2hlICpyZWdjYWNoZSkNCj4+IHsNCj4+ICsgIHN0cnVjdCBnZGJhcmNoX3Rk ZXAgKnRkZXAgPSBnZGJhcmNoX3RkZXAgKGdkYmFyY2gpOw0KPj4gKw0KPj4g ICBjYiAoIi5yZWciLCBBQVJDSDY0X0xJTlVYX1NJWkVPRl9HUkVHU0VULCAm YWFyY2g2NF9saW51eF9ncmVnc2V0LA0KPj4gICAgICAgTlVMTCwgY2JfZGF0 YSk7DQo+PiAtICBjYiAoIi5yZWcyIiwgQUFSQ0g2NF9MSU5VWF9TSVpFT0Zf RlBSRUdTRVQsICZhYXJjaDY0X2xpbnV4X2ZwcmVnc2V0LA0KPj4gLSAgICAg IE5VTEwsIGNiX2RhdGEpOw0KPj4gKyAgaWYgKCF0ZGVwLT5oYXNfc3ZlICgp KQ0KPj4gKyAgICBjYiAoIi5yZWcyIiwgQUFSQ0g2NF9MSU5VWF9TSVpFT0Zf RlBSRUdTRVQsICZhYXJjaDY0X2xpbnV4X2ZwcmVnc2V0LA0KPj4gKwlOVUxM LCBjYl9kYXRhKTsNCj4+IH0NCj4+IA0KPj4gLyogSW1wbGVtZW50IHRoZSAi Y29yZV9yZWFkX2Rlc2NyaXB0aW9uIiBnZGJhcmNoIG1ldGhvZC4gIFNWRSBu b3QgeWV0DQo+PiANCj4gDQo+IElJVUMsIHRoaXMgZG9lc24ndCByZW1vdmUg ZXhpc3RpbmcgZmVhdHVyZXM/ICBJZiBhIHByb2dyYW0gd2FzIHVzaW5nIHRo ZSBuZW9uDQo+IHJlZ2lzdGVycywgdGhleSB3aWxsIHN0aWxsIGJlIGF2YWls YWJsZT8NCj4gDQo+IElmIHNvLCB0aGlzIExHVE0uDQo+IA0KDQpXaXRoIHRo aXMgcGF0Y2gsIG9uIFNWRSBzeXN0ZW1zLCBnZGIgd2lsbCBjcmVhdGUgY29y ZXMgd2l0aG91dCBhbnkgc3ZlIE9SIG5lb24NCnJlZ2lzdGVycy4gV2hpbHN0 IG5vdCBpZGVhbCB0aGF04oCZcyBiZXR0ZXIgdGhhbiBnZGIgc2VnZmF1bHRp bmcuDQoNCi4uLkhvd2V2ZXIsIHRoaXMgZ290IG1lIHRoaW5raW5nLg0KDQpT VkUgY29yZSBmaWxlcyBwcm9kdWNlZCBieSB0aGUga2VybmVsIHdpbGwgaGF2 ZSBib3RoIEZQIGFuZCBTVkUgc2VjdGlvbnMsIGV2ZW4NCnRob3VnaCB0aGUg RlAgc2VjdGlvbiBpcyBlc3NlbnRpYWxseSByZWR1bmRhbnQuIFNvLCBnZGIg ZG9lcyBuZWVkIHRvIHN1cHBvcnQNCnRoYXQuDQoNCkknbSB3b3JraW5nIG15 
IHdheSB0aHJvdWdoIGZpeGluZyB0aGF0LiBJdOKAmWxsIG1vc3RseSBiZSBh IGNoYW5nZSB0byByZWdjYWNoZQ0KdG8gaGFuZGxlIG1pc21hdGNoZWQgcmVn aXN0ZXIgc2l6ZXMgdG8gY29yZSBmaWxlIHNsb3Qgc2l6ZXMuIEnigJlsbCBk cm9wIHRoaXMgcGF0Y2gNCmZvciBub3csIGFuZCB3aWxsIGhvcGVmdWxseSBo YXZlIGEgbmV3IHBhdGNoIHBvc3RlZCBUdWVzZGF5Lg0KDQoNCkFsYW4u >From gdb-patches-return-148141-listarch-gdb-patches=sources.redhat.com@sourceware.org Mon Jun 11 16:41:01 2018 Return-Path: Delivered-To: listarch-gdb-patches@sources.redhat.com Received: (qmail 23218 invoked by alias); 11 Jun 2018 16:41:01 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 23197 invoked by uid 89); 11 Jun 2018 16:41:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.3 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=dropping X-HELO: mx1.redhat.com Received: from mx3-rdu2.redhat.com (HELO mx1.redhat.com) (66.187.233.73) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Jun 2018 16:40:59 +0000 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7E68D400225D; Mon, 11 Jun 2018 16:40:57 +0000 (UTC) Received: from [127.0.0.1] (ovpn04.gateway.prod.ext.ams2.redhat.com [10.39.146.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7BF2B42226; Mon, 11 Jun 2018 16:40:56 +0000 (UTC) Subject: Re: [PATCH] gdb: Don't drop SIGSTOP during stop_all_threads To: Andrew Burgess , gdb-patches@sourceware.org References: <20180518105654.6932-1-andrew.burgess@embecosm.com> Cc: Richard Bunt From: Pedro Alves Message-ID: 
<6841cae5-f4d4-a1d4-be74-4db2718bb949@redhat.com>
Date: Mon, 11 Jun 2018 16:41:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <20180518105654.6932-1-andrew.burgess@embecosm.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2018-06/txt/msg00284.txt.bz2
Content-length: 6248

On 05/18/2018 11:56 AM, Andrew Burgess wrote:
> This patch fixes an issues where GDB would sometimes hang when attaching
> to a multi-threaded process. This issues was especially likely to
> trigger if the machine (running the inferior) was under load.

"issues" -> "issue" two times.

> The target_wait call ends up in linux_nat_wait_1, which first checks to
> see if any threads already have stashed stop events to report, and if
> there are non then we enter a loop fetching as many events as possible

"non" -> "none"

> out of the kernel. This event fetching is non-blocking, and we give up
> once the kernel has no more events ready to give us.
>
> All of the events from the kernel are passed through

> What we really want to do though is simulate the kernel being slow to
> report events through waitpid during the initial attach. The solution I
> came up with was to write an LD_PRELOAD library that intercepts (some)
> waitpid calls and rate limits them to one per-second. Any more than
> that simply return 0 indicating there's no event available. Obviously
> this can only be applied to waitpid calls that have the WNOHANG flag
> set.
>
> Unfortunately, this library does break the "rest" of GDB, I suspect that
> the issue is that in some places GDB knows that there's an event
> pending, calls non-blocking waitpid and when the event fails to arrive
> GDB moves into some unexpected/broken state from which it can't recover.
> Still, the preload library does, at the moment, trigger the bug during
> attach.

The fix sounds good to me. Thanks so much for adding a testcase.
That adds a lot of value. Some more comments below.

>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 445b59fa4ad..d1d0ba66d2e 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -2527,17 +2527,23 @@ stop_wait_callback (struct lwp_info *lp, void *data)
>          }
>        else
>          {
> -          /* We caught the SIGSTOP that we intended to catch, so
> -             there's no SIGSTOP pending. */
> +          /* We caught the SIGSTOP that we intended to catch. */
>
>            if (debug_linux_nat)
>              fprintf_unfiltered (gdb_stdlog,
>                                  "SWC: Expected SIGSTOP caught for %s.\n",
>                                  target_pid_to_str (lp->ptid));
>
> -          /* Reset SIGNALLED only after the stop_wait_callback call
> -             above as it does gdb_assert on SIGNALLED. */
>            lp->signalled = 0;
> +
> +          /* If we are waiting for this stop so we can report the thread
> +             stopped then we need to record that this status. Otherwise,

Looks like one "that" too many in "that this"?

> +             we can now discard this stop event. */
> +          if (lp->last_resume_kind == resume_stop)
> +            {
> +              lp->status = status;
> +              save_stop_reason (lp);
> +            }
>          }
>      }

> +# This test script tries to expose a bug in some of the use of waitpid

"some of the uses" ?

> +# in the linux native support within GDB. The problem was spotted on

"linux" -> "Linux".

> +# systems which were heavily loaded when attaching to threaded test
> +# programs. What happened was that during the initial attach, the
> +# loop of waitpid calls that normally received the stop events from
> +# each of the threads in the inferior was not receiving a stop event
> +# for some threads (the kernel just hadn't sent the stop event yet).
> +#
> +# GDB would then trigger a call to stop_all_threads which would
> +# continue to wait for all of the outstanding threads to stop, when
> +# the outstanding stop events finally arrived GDB would then
> +# (incorrectly) discard the stop event, resume the thread, and
> +# continue to wait for the thread to stop.... which it now never
> +# would.
> +#
> +# In order to try and expose this issue reliably, this test preloads a
> +# library that intercepts waitpid calls. And waitpid calls targeting
> +# pid -1, and with the WNOHANG flag are rate limited so that only 1
> +# per second can complete, any additional calls are forced to return 0
> +# indicating no event waiting.

I find this last sentence starting with "And" hard to parse.
Maybe you meant "All" instead of "And". I'd also suggest
splitting it in two and dropping the comma after "-1". Like:

  All waitpid calls targeting pid -1 with the WNOHANG flag are rate
  limited so that only 1 per second can complete. Additional calls
  are forced to return 0 indicating no event waiting.

> +
> +# Spawn GDB with LIB preloaded with LD_PRELOAD. CMDLINE_OPTS are
> +# command line options passed to GDB.
> +
> +proc gdb_spawn_with_ld_preload {lib cmdline_opts} {
> +    global env
> +
> +    save_vars { env(LD_PRELOAD) } {
> +        if { ![info exists env(LD_PRELOAD) ]
> +             || $env(LD_PRELOAD) == "" } {
> +            set env(LD_PRELOAD) "$lib"
> +        } else {
> +            append env(LD_PRELOAD) ":$lib"
> +        }
> +
> +        gdb_spawn_with_cmdline_opts $cmdline_opts
> +    }
> +}

The testcase currently fails with --target_board=native-extended-gdbserver:

 (gdb) attach 487
 Don't know how to attach. Try "help target".
 (gdb) FAIL: gdb.threads/attach-slow-waitpid.exp: attach to target

It'd be really nice to make it work there too, considering that
at some point we may switch to using gdbserver's linux backend
even for native debugging.

The test fails there because that board overrides gdb_start to
automatically start gdbserver and have gdb connect to it, but not
gdb_spawn_with_cmdline_opts. Can we simply use gdb_start instead?
It seems like no cmdline opts are passed down anyway. In principle,
the LD_PRELOAD will end up injecting the slow-waitpid lib in gdbserver
too.

> +static int
> +should_skip_waitpid (void)
> +{
> +  /* When we last allowed a waitpid to complete. */
> +  static struct timeval last_waitpid_time = { 0, 0 };
> +
> +  struct timeval *tv = &last_waitpid_time;
> +  if (tv->tv_sec == 0)
> +    {
> +      if (gettimeofday (tv, NULL) < 0)
> +        error ("error: gettimeofday failed\n");
> +      return 0; /* Don't skip. */
> +    }
> +  else
> +    {
> +      struct timeval new_tv;
> +
> +      if (gettimeofday (&new_tv, NULL) < 0)
> +        error ("error: gettimeofday failed\n");
> +
> +      if ((new_tv.tv_sec - tv->tv_sec) < 1)
> +        return 1; /* Skip. */
> +
> +      memcpy (tv, &new_tv, sizeof (new_tv));

Write:

  *tv = new_tv;

?

Thanks,
Pedro Alves
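
For reference, a minimal sketch of how the rate-limit check in the quoted
should_skip_waitpid could look with the struct assignment applied. This is
not the code from the patch. The helper rate_limit_should_skip and its
explicit NOW argument are hypothetical, introduced only so the logic can be
exercised without the real clock, and returning -1 when gettimeofday fails
is an assumption rather than what the test library does. The waitpid
interposition itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper, not part of the patch.  Decide whether a call
   arriving at NOW should be skipped, given the time LAST at which a
   call was last allowed through.  Returns 1 to skip, 0 to allow.
   LAST is updated whenever a call is allowed.  */
static int
rate_limit_should_skip (struct timeval *last, const struct timeval *now)
{
  assert (last != NULL && now != NULL);

  /* First call ever: record the time and let it through.  */
  if (last->tv_sec == 0)
    {
      *last = *now;
      return 0;
    }

  /* Same test as the quoted code: compare whole seconds only.  */
  if ((now->tv_sec - last->tv_sec) < 1)
    return 1;

  /* Plain struct assignment instead of memcpy.  */
  *last = *now;
  return 0;
}

/* Wrapper using the real clock, mirroring the shape of the quoted
   should_skip_waitpid.  Returns -1 if the clock cannot be read
   (an assumption made for this sketch).  */
static int
should_skip_waitpid (void)
{
  static struct timeval last_waitpid_time = { 0, 0 };
  struct timeval now;

  if (gettimeofday (&now, NULL) < 0)
    return -1;

  return rate_limit_should_skip (&last_waitpid_time, &now);
}
```

Like the quoted code, this compares tv_sec only, so two calls less than a
second apart can both be allowed if they straddle a second boundary. That
is probably fine for a test helper whose only job is to slow event delivery
down, but it may be worth a comment.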