From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id CPRWAjt13GMyjCgAWB0awg (envelope-from ) for ; Thu, 02 Feb 2023 21:45:15 -0500 Received: by simark.ca (Postfix, from userid 112) id 074C61E128; Thu, 2 Feb 2023 21:45:15 -0500 (EST) Authentication-Results: simark.ca; dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.a=rsa-sha256 header.s=default header.b=UEEIEsVD; dkim-atps=neutral X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, RDNS_DYNAMIC,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 83F3B1E110 for ; Thu, 2 Feb 2023 21:45:14 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6A1BF385828D for ; Fri, 3 Feb 2023 02:45:12 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6A1BF385828D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1675392312; bh=WcRRoxIZ0MSAFzutxBDN1dMuc7h/hoVjJeqXm8Yc6fs=; h=References:To:Cc:Subject:In-reply-to:Date:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=UEEIEsVDekEm8HdfWO/IUg3s0MdQEvsb7QWgg+72/wixHvrXkICLQ7QftMX4DQy9b Ahz255pICJZiU8grbt5zouKGCDA3S9EuB9DzSuhaMkQPFAjBcUBZEb2UqqSnO50ak8 A2o1F4ww34rANhWvWszchNqW1rJzKKuoJTCVAB9w= Received: from mail-oi1-x229.google.com (mail-oi1-x229.google.com [IPv6:2607:f8b0:4864:20::229]) by sourceware.org (Postfix) with ESMTPS id 77B1E3858C5E for ; Fri, 3 Feb 2023 02:44:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 77B1E3858C5E Received: by mail-oi1-x229.google.com with SMTP id r205so3102726oib.9 for ; Thu, 02 Feb 2023 18:44:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:in-reply-to :subject:cc:to:from:user-agent:references:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=WcRRoxIZ0MSAFzutxBDN1dMuc7h/hoVjJeqXm8Yc6fs=; b=yZmYZySXoVTfr4DfxkzzopIWLfuqilM7zwyhI2psIFkj40A0qKaoX7N9Lnc1pxlir1 VJyjalp1B0yEn1WJHk/7egR/EVHPHttPbr7egiQy0nIIgGkIdK0eFouh3mV9ATAoC6cR Qa77J7HjToSsrW2wqZz9JZkfoRjDoj0HNkUUsylDCotdsYv24oDWkeESRq0pFKoimYy1 Fwt7XasrmBtifPtHXNZ2xT75V1/Qm+TQIzE1GfJJdEmSdzLNpbFe/XSfQJDMJ7RPit8s alotWnU1zKNOsLGoVcYkfjgdjmYD0kLu7vHftpkYrhJVrmtkYuiC8X2ZURZ1+51FSgiE EENA== X-Gm-Message-State: AO0yUKWL/Qr5RZR0164tFzKxHxxvojT+3BZQ3WVL5+KV4yK/TlS20I9v GPs60ZwMyl5ezNUUqgEAhDq2/aP9tyNGXycO X-Google-Smtp-Source: AK7set+PXnX27xCxOV8mJ8aS6Hcan2l5kPuS5XLhmFogqjs2S171w+uIzLg1TL+9urVgOdjFQq4WaQ== X-Received: by 2002:a05:6808:208f:b0:377:27da:5ed7 with SMTP id s15-20020a056808208f00b0037727da5ed7mr5232482oiw.19.1675392288803; Thu, 02 Feb 2023 18:44:48 -0800 (PST) Received: from localhost ([2804:14d:7e39:8470:7132:fbe:2b2e:ae3e]) by smtp.gmail.com with ESMTPSA id bl24-20020a056808309800b003785a948b27sm400577oib.16.2023.02.02.18.44.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Feb 2023 18:44:47 -0800 (PST) References: <20230130044518.3322695-1-thiago.bauermann@linaro.org> <20230130044518.3322695-6-thiago.bauermann@linaro.org> <87bkmdtp4n.fsf@redhat.com> <502f0bd0-9b18-9a31-0094-7a9bd4778bd2@simark.ca> <87r0v7alq1.fsf@linaro.org> <45bd55f5-d33b-58f7-0ac6-13d31c4d03f5@simark.ca> User-agent: mu4e 1.8.13; emacs 28.2 To: Simon Marchi Cc: Andrew Burgess , Thiago Jung Bauermann via Gdb-patches , Simon Marchi Subject: Re: [PATCH v3 5/8] gdbserver: Transmit target description ID in thread list and stop reply In-reply-to: <45bd55f5-d33b-58f7-0ac6-13d31c4d03f5@simark.ca> Date: Fri, 03 Feb 2023 02:44:45 +0000 Message-ID: <875ycja2n6.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Thiago Jung Bauermann via Gdb-patches Reply-To: Thiago Jung Bauermann Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" Simon Marchi writes: > On 2/2/23 14:52, Thiago Jung Bauermann wrote: >>=20 >> Simon Marchi writes: >>=20 >>>> Having read some of the later patches, I have some additional thoughts >>>> here: >>>> >>>> I think we should make it explicit here that IDs are connection wide, >>>> not per-process. We should also make it clear that GDB might[1] cache >>>> target descriptions per remote connection, and so a remote target shou= ld >>>> not reuse a target description ID except where the target description = is >>>> identical. >>=20 >> Ah, good point. I will make that clarification. >>=20 >>>> [1] I say "GDB might" here because if we say "GDB will" then this would >>>> imply each target description will only be asked for once. And I >>>> figure, why be overly restrictive. >>> >>> Thanks for pointing this out, I had the same thought while reading the >>> patch. >>> >>> In my original idea, I imagined that target description IDs could be >>> some hashes computed from the XML content (a bit like git hashes or ELF >>> build IDs), such that a given target description would always have the >>> same ID. This would give clients the possibility to cache target >>> descriptions locally, a bit like the index cache. It did sound nice, >>> but perhaps it's not really important. >>=20 >> Ah, sorry I misunderstood this part of your suggestion. I thought that >> the caching was supposed to be limited to the duration of the connection, >> and thus the m_tdescs map in struct remote state would be enough to >> provide that functionality. Do you mean that the cache should be on >> disk, so that it survives GDB quitting? I can look into that if you want >> and implement it, if not complicated. > > To be clear, I'm not asking you to implement an on-disk cache, I'm just > trying to think about what the limitations of the proposed solution are. > Because mistakes or shortcomings introduced in the remote protocol (like > any API / ABI meant to be stable) are difficult to fix afterwards. If > there is a chance we want to do an on-disk cache (or share a cache > between multiple concurrent connections), we should think about that > now. Indeed, that's a good idea. > On one hand, perhaps we consider that target descriptions are small > enough that there's no point in having a persistent cache like that. > The time to fetch each target description once per connection is > probably insignificant. I don't have a good intuition about that. My uneducated guess is that it's not worth it, but as you say I think it's a good idea to leave the door open if we can. > On the other hand, perhaps doing the hash approach is easy enough that > we might as well do it, and that just leaves more doors open. In my > mind, as a start, we could just pass the XML of the tdesc through a > sha256 function, possibly doing something more sophisticated later (this > means the hash of a given tdesc could change over time, but still two > identical hashes mean identical tdescs). However, I don't think there > are sha256 functions built in on the OSes we want to support, so we'd > have to depend on some external library. And that complicates things > quite a bit... We already have a CRC32 function in gdbserver, but I > don't know if that's good enough to be confident there won't be > collisions. I tried to research this a bit, but I'm not confident enough in my knowledge of the subject to reach a conclusion. I only know enough about hashes to be dangerous. I did find a hash testsuite called SMhasher which seems to be well regarded, and its website=C2=B9 has reports for CRC32=C2=B2, which shows that it fails many collision and distribution bias tests. So it doesn't look like it is very reliable for using as a hash. We do support libxxhash as an optional dependency, so we could use that if gdbserver is linked with it. Its SMhasher report=C2=B3 looks good, IUUC (but not the 32 bits version though=E2=81=B4). We could also vendor coreutils's lib/sha256.[ch] files. They're 660 lines in total. >> In this case, I think the tdesc ID should be of the format >> :, so that can be =E2=80=9Csha1=E2=80=9D, = =E2=80=9Csha256=E2=80=9D etc to >> indicate that is a hash with the given algo, or even >> =E2=80=9Cnumber=E2=80=9D to indicate that it's a simple integer like it = is today. >>=20 >> Perhaps we can do the prefix thing even if not implementing the cache, >> to leave the possibility of adding it in the future? > > I think I would choose between hash and number, but not do both, I don't > see the need to have both. I suggested a prefix to designate an integer ID as a way to implement only this possibility for now, but which would allow implementing hash IDs later, as additional prefixes. Or another possibility is that if we later decide to support hashes, we could add a separate protocol feature to signal that. --=20 Thiago =C2=B9 https://rurban.github.io/smhasher/ =C2=B2 https://rurban.github.io/smhasher/doc/crc32.txt =C2=B3 https://rurban.github.io/smhasher/doc/xxHash64.txt =E2=81=B4 https://rurban.github.io/smhasher/doc/xxHash32.txt