From: David Spickett via Gdb-patches
Reply-To: David Spickett
To: Luis Machado
Cc: gdb-patches@sourceware.org
Date: Tue, 17 Nov 2020 15:16:10 +0000
Subject: Re: [PATCH v3 07/24] Documentation for memory tagging remote packets
List-Id: Gdb-patches mailing list

> Right now you can infer the technology and logical/allocation from the
> type. There is no need for a two step process. I don't anticipate a need
> for that in the future either, as long as we prevent the tag type ENUM
> from having overlapping values.
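To make the non-overlap requirement in the quoted paragraph concrete, here is a minimal Python sketch. The table and the helper name overlapping_values are hypothetical, and the numbers only mirror the ones floated in this thread; none of this is GDB's actual definition.

```python
from collections import Counter

# Hypothetical protocol-level tag type table. The names and numbers are
# illustrative only and are not taken from GDB.
TAG_TYPES = [
    ("mte_logical", 0),
    ("mte_allocation", 1),
    ("future_logical", 2),
    ("future_allocation", 3),
]


def overlapping_values(table):
    """Return the sorted type values that are claimed by more than one name."""
    counts = Counter(value for _name, value in table)
    return sorted(value for value, count in counts.items() if count > 1)


# Every value is used once, so the remote side can tell the types apart.
print(overlapping_values(TAG_TYPES))  # prints []

# Two definitions of type 1 cannot be distinguished on the wire.
BROKEN = [("mte_logical", 0), ("mte_allocation", 1), ("future_logical", 1)]
print(overlapping_values(BROKEN))  # prints [1]
```

A check of this kind only shows that values are unique; it says nothing about how the values are allocated between technologies.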
Right, so to not overlap, a value of that tag type ENUM must be a
combination of technology and tag...kind. (words are hard)
E.g. MTE logical tag, allocation tag , strange tag

So if were to also have "logical" and "allocation" tags, they would be
new type values, yes?
0 = MTE logical
1 = MTE allocation
2 = logical
3 = allocation

Because if they are still 0 and 1, they overlap and you can't tell MTE
and apart.
I get that at the user interface layer you have some mapping back to a
set of generic types; I'm talking just about the packet contents.

> So this is wrong, as the remote/native side won't be able to tell both
> definitions of type "1" apart.

Sorry, that was a typo. Should be:
0 = mte logical, 1 = mte allocation, 2 = future logical, 3 = future
allocation, 4 = future tag form (not logical/allocation) etc...
(basically what I restated above)

> If we want to address this and make it a bit more flexible, then we will
> need to let architectures define their own tag type values. Then generic
> code will have to go through an arch-specific hook to fetch the tag type.
> Would that address your concerns?

I think so. I thought of this working like the breakpoint types I referenced.
Breakpoint type 1 for MIPS isn't in any way the same as type 1 for Arm etc.
(ok, they might be similar, but the docs make it clear that the list of
types is per architecture)

> As long as GDB sends a unique tag type identifier (non-overlapping
> values), we will be able to tell them apart from the remote side.

Yes, but depending on how those IDs are allocated, we may only be able
to tell logical vs allocation, not MTE vs CHERI vs .

On Tue, 17 Nov 2020 at 14:44, Luis Machado wrote:
>
> On 11/17/20 9:29 AM, David Spickett wrote:
> >> Right. The type is really telling you what specific kind of tag you are
> >> requesting, not the technology.
> >> So it may be perfectly valid to request
> >> a MTE logical tag from the remote target, but the remote target doesn't
> >> know how to reply to that at the moment (nor does it make much sense, IMO).
> >>
> >> The tag types don't overlap at the moment, given they are an ENUM in
> >> generic code. So the server will tell them apart by their values.
> >
> > Ok but my concern here is that there are two aspects to type. (which I
> > probably confused earlier tbf)
> >
> > 1. logical vs allocation
> > 2. MTE vs
> >
> > So we have three scenarios:
> > 1. Server uses type to decide between logical and allocation
>
> Right now 0 maps to logical and 1 maps to allocation for all
> architectures. For AArch64 0 maps to MTE logical and 1 maps to MTE
> allocation.
>
> But, if more tag types are supported in the future, we will need to map
> types to logical, allocation or something else.
>
> > 2. Server uses type to decide between MTE and
>
> Right now 0 and 1 map to MTE for AArch64 and ADI for SPARC (not yet
> supported). In the future, if there are more tagging technologies, we
> will need to provide a mapping from tag types to tag technology.
>
> > 3. The combination of the two, should a system have both technologies
>
> In this case, we will use two mappings: type to tag type and type to tag
> technology.
>
> >
> > What I want to avoid is a two step process:
> > 1. Somehow select tagging technology you want to interact with
> > 2. You send memtag packets with the logical vs allocation "type"
> >
> > Which would be needed if the protocol "type" is just allocation vs
> > logical. If the "type" includes the technology
> > then we can do it in one step.
>
> Right now you can infer the technology and logical/allocation from the
> type. There is no need for a two step process. I don't anticipate a need
> for that in the future either, as long as we prevent the tag type ENUM
> from having overlapping values.
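A sketch of the one-step lookup described above, assuming each protocol type value is unique: two tables map the same value to a tag kind and to a tagging technology. TagType, KIND, TECHNOLOGY, decode and the "future" entries are hypothetical names used for illustration, and the sketch is in Python rather than GDB's own C++.

```python
from enum import IntEnum


class TagType(IntEnum):
    """Hypothetical, non-overlapping protocol tag type values."""
    MTE_LOGICAL = 0
    MTE_ALLOCATION = 1
    FUTURE_LOGICAL = 2
    FUTURE_ALLOCATION = 3


# Mapping 1: type value -> kind of tag.
KIND = {
    TagType.MTE_LOGICAL: "logical",
    TagType.MTE_ALLOCATION: "allocation",
    TagType.FUTURE_LOGICAL: "logical",
    TagType.FUTURE_ALLOCATION: "allocation",
}

# Mapping 2: type value -> tagging technology.
TECHNOLOGY = {
    TagType.MTE_LOGICAL: "mte",
    TagType.MTE_ALLOCATION: "mte",
    TagType.FUTURE_LOGICAL: "future",
    TagType.FUTURE_ALLOCATION: "future",
}


def decode(value):
    """Resolve a wire value to (technology, kind) in a single step.

    TagType(value) raises ValueError for values that are not defined.
    """
    tag_type = TagType(value)
    return TECHNOLOGY[tag_type], KIND[tag_type]


print(decode(1))  # prints ('mte', 'allocation')
print(decode(3))  # prints ('future', 'allocation')
```

Because both tables are keyed by the same unique value, no separate "select the technology" step is needed before sending a request.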
> >
> > So I prefer:
> > 0 = mte logical, 1 = mte allocation, 1 = future logical, 2 = future
> > allocation, 3 = future tag form (not logical/allocation) etc...
>
> So this is wrong, as the remote/native side won't be able to tell both
> definitions of type "1" apart.
>
> > Over:
> > 0 = logical, 1 = allocation, 2 = future term for other tag form etc...
>
> This is correct in my view. Everything is unique.
>
> The complicating factor here is that generic code/UI needs to use a
> generic tag type so all architectures supporting memory tagging can use
> the commands out of the box.
>
> For AArch64, tag_logical/tag_allocation means MTE logical/MTE
> allocation. For SPARC ADI, this means whatever their tagging mechanism is.
>
> If we turn tag_logical into mte_logical, SPARC ADI won't be able to use
> that. We will, again, need a mapping.
>
> If we want to address this and make it a bit more flexible, then we will
> need to let architectures define their own tag type values. Then generic
> code will have to go through an arch-specific hook to fetch the tag type.
>
> Would that address your concerns?
> >
> > I see the need for the second form in gdb internally, but my concern
> > is only with the protocol side.
>
> As long as GDB sends a unique tag type identifier (non-overlapping
> values), we will be able to tell them apart from the remote side.
>
> >
> >> I just want to keep that option open
> >> if someone wants to do it, or if some other type of tag shows up that
> >> would require such support.
> >
> > I hadn't thought of that. I agree that type should include that too.
> >
> > On Tue, 17 Nov 2020 at 12:01, Luis Machado wrote:
> >>
> >> Hi,
> >>
> >> On 11/17/20 7:05 AM, David Spickett wrote:
> >>>> Right now the design makes these types architecture-specific.
> >>>
> >>> This works too, in fact it matches the breakpoint types example better that way.
> >>>
> >>>> But there's one catch right now.
> >>>> The user-visible commands know about
> >>>> two types of tags (logical and allocation). The native/remote side of
> >>>> GDB only sees one type, the allocation one, as it doesn't make sense to
> >>>> ask the native/remote target about logical tags.
> >>>>
> >>>> This is slightly messy and, in my opinion, should be an implementation
> >>>> detail.
> >>>
> >>> Tell me if I have this right.
> >>>
> >>> In gdb in overall you have these two types but the server only uses
> >>> one of them, the allocation tag type.
> >>> So only the allocation tag type will ever go over the protocol. (for
> >>> MTE at least)
> >>
> >> That's correct. Only GDB knows about logical tags. Those don't make
> >> their way to the remote via the remote protocol.
> >>
> >>>
> >>> Given that, if we assume that "mte allocation" type is 1. A future
> >>> AArch64 kind of memory tagging could allocate 2 and on for its tag
> >>> types.
> >>> Something like:
> >>> AArch64 Memory Tag types -
> >>> 0 : MTE logical (which is internal only, reserved but documented as
> >>> unused, or left out completely?)
> >>> 1 : MTE allocation (the one we use at present)
> >>> 2: logical tag (because maybe there is some server
> >>> component for this kind of tagging extension?)
> >>> 3: allocation tag
> >>>
> >>> The reason I want to clarify is that I understood the type to
> >>> differentiate tagging technologies, not the kind of tag within them.
> >>> (the type tells you MTE vs instead of allocation vs logical)
> >>> The use case being what if you have MTE and active
> >>> in the same target and I want to set an MTE allocation tag,
> >>> how can the server tell them apart?
> >>
> >> Right. The type is really telling you what specific kind of tag you are
> >> requesting, not the technology. So it may be perfectly valid to request
> >> a MTE logical tag from the remote target, but the remote target doesn't
> >> know how to reply to that at the moment (nor does it make much sense, IMO).
> >>
> >> The tag types don't overlap at the moment, given they are an ENUM in
> >> generic code. So the server will tell them apart by their values.
> >>
> >>>
> >>> If the type numbers overlap between tagging technologies, we can't
> >>> tell them apart.
> >>> However if they encode what extension they are for and the
> >>> logical/allocation type (as in the example above) then we can.
> >>>
> >>> A lot of that is probably academic given there's one relevant type but
> >>> we can at least document the intent of the field.
> >>> E.g. "these types are global to AArch64 so new types should not
> >>> overlap existing ones"
> >>
> >> I suppose. I chose to have generic ENUM's without specific references to
> >> MTE for that reason. Architectures can use logical/allocation tags as
> >> they see fit, but the ENUM values will not overlap.
> >>
> >> We need to support the UI as well, so there needs to be some generic
> >> definitions so commands can query different tag types.
> >>
> >>>
> >>>> Otherwise we'd need to standardize on particular tag type names across
> >>>> different architectures, like "hw memory tag", "sw memory tag",
> >>>> "capability tag" etc.
> >>>
> >>> Well I was thinking of type more as a single value like "mte". Anyway
> >>> I'm fine with the integer route.
> >>
> >> Though we don't have a use for requesting logical tags from the remote
> >> targets, it is possible to support that. Passing "mte" or any other
> >> technology name would close that option.
> >>
> >> If, for example, we decide to have a dumb GDB client and a smart
> >> GDBserver (unlikely at this point), then it would make sense to pass
> >> down logical tag requests I think. I just want to keep that option open
> >> if someone wants to do it, or if some other type of tag shows up that
> >> would require such support.
> >>
> >>>
> >>> On Mon, 16 Nov 2020 at 17:23, Luis Machado wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/16/20 1:04 PM, David Spickett wrote:
> >>>>> Also with regard to the "type" field.
> >>>>>
> >>>>>> +@var{type} is the type of tag the request wants to fetch. The typeis a signed
> >>>>>> +integer.
> >>>>>
> >>>>> (typo aside) Is this field architecture specific and will there be a
> >>>>> list of these type numbers documented anywhere? (or already is)
> >>>>> For example would 1 on AArch64 be MTE, and on be
> >>>>> tag type>. Or would that be 2.
> >>>>>
> >>>>> My assumption has been that it is the latter and that a value means a
> >>>>> kind of tagging extension. So for example 1=MTE rather than
> >>>>> 1= mte logical and 2 = mte allocation. Correct me if I am wrong there.
> >>>>
> >>>> Right now the design makes these types architecture-specific. It would
> >>>> be nice to have more documentation about them, for sure.
> >>>>
> >>>> But there's one catch right now. The user-visible commands know about
> >>>> two types of tags (logical and allocation). The native/remote side of
> >>>> GDB only sees one type, the allocation one, as it doesn't make sense to
> >>>> ask the native/remote target about logical tags.
> >>>>
> >>>> This is slightly messy and, in my opinion, should be an implementation
> >>>> detail.
> >>>>
> >>>> So, in summary... We have a couple generic tag types GDB knows about:
> >>>> logical and allocation.
> >>>>
> >>>> Those types get translated to an arch/a target-specific type when they
> >>>> cross the native/remote target boundary.
> >>>>
> >>>> In theory we could have generic tag types 1 and 2 in generic code, but
> >>>> tag type 2 gets translated to type 1 in a remote packet.
> >>>>
> >>>> Maybe we could improve this a little.
> >>>>
> >>>>>
> >>>>> A page like:
> >>>>> https://sourceware.org/gdb/current/onlinedocs/gdb/ARM-Breakpoint-Kinds.html#ARM-Breakpoint-Kinds
> >>>>>
> >>>>> Or just a short note, given that there's only one type right now.
> >>>>
> >>>> Yes, that would be nice to expand for the tag types.
> >>>>
> >>>>>
> >>>>> Also, I may have suggested the type be a string at some point. However
> >>>>> based on examples like the link above
> >>>>> I don't see much advantage to it apart from making packet dumps easier
> >>>>> to read. Just wanted to close the loop on that
> >>>>> if I didn't before.
> >>>>
> >>>> I don't have a strong preference here. I'm just forwarding the tag type
> >>>> from generic code.
> >>>>
> >>>> If we want to pass strings, we will need a gdbarch hook that maps a type
> >>>> to a string in the remote target layer.
> >>>>
> >>>> Otherwise we'd need to standardize on particular tag type names across
> >>>> different architectures, like "hw memory tag", "sw memory tag",
> >>>> "capability tag" etc.
> >>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, 16 Nov 2020 at 15:44, David Spickett wrote:
> >>>>>>
> >>>>>> Minor thing, there is a missing space here in "typeis".
> >>>>>>
> >>>>>>> +@var{type} is the type of tag the request wants to fetch. The typeis a signed
> >>>>>>> +integer.
> >>>>>>
> >>>>>> On Mon, 9 Nov 2020 at 17:08, Eli Zaretskii wrote:
> >>>>>>>
> >>>>>>>> Date: Mon, 9 Nov 2020 14:04:18 -0300
> >>>>>>>> From: Luis Machado via Gdb-patches
> >>>>>>>> Cc: david.spickett@linaro.org
> >>>>>>>>
> >>>>>>>> gdb/doc/ChangeLog:
> >>>>>>>>
> >>>>>>>> YYYY-MM-DD Luis Machado
> >>>>>>>>
> >>>>>>>> * gdb.texinfo (General Query Packets): Document qMemTags and
> >>>>>>>> QMemTags. Document the "memory-tagging" feature.
> >>>>>>>> ---
> >>>>>>>> gdb/doc/gdb.texinfo | 96 +++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>> 1 file changed, 96 insertions(+)
> >>>>>>>
> >>>>>>> OK for this part, thanks.
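
The translation step discussed in this thread, where generic tag types are mapped to an architecture-specific value when they cross the native/remote boundary, could be sketched roughly as follows. This is a Python illustration, not GDB code: ARCH_HOOKS, aarch64_remote_tag_type and remote_tag_type are hypothetical names, and treating generic type 2 as the allocation tag is an assumption based on the "tag type 2 gets translated to type 1" example.

```python
# Generic tag types as seen by user-visible commands (numbering assumed).
GENERIC_LOGICAL = 1
GENERIC_ALLOCATION = 2


def aarch64_remote_tag_type(generic_type):
    """Hypothetical per-architecture hook: generic type -> wire value.

    Returns None for types that never cross the remote boundary.
    """
    if generic_type == GENERIC_ALLOCATION:
        return 1  # assumed wire value for the MTE allocation tag
    return None  # logical tags are handled on the GDB side only


# Each architecture registers its own hook, so value lists are per-arch.
ARCH_HOOKS = {"aarch64": aarch64_remote_tag_type}


def remote_tag_type(arch, generic_type):
    """Translate a generic tag type into the value placed in a packet."""
    wire_value = ARCH_HOOKS[arch](generic_type)
    if wire_value is None:
        raise ValueError("tag type is not sent to the remote target")
    return wire_value


print(remote_tag_type("aarch64", GENERIC_ALLOCATION))  # prints 1
```

Keeping the table per architecture mirrors the breakpoint-kinds approach mentioned in the thread: the same number may mean different things on different architectures, as long as it is unique within one.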