Subject: Re: [PATCH v3 07/24] Documentation for memory tagging remote packets
To: David Spickett
References: <20201109170435.15766-1-luis.machado@linaro.org>
 <20201109170435.15766-8-luis.machado@linaro.org>
 <83ft5i4fwl.fsf@gnu.org>
 <08cb075e-018c-474e-abd4-b76ba1ed6a52@linaro.org>
 <27e9cc8e-95e6-4ac3-bd05-0f3846c96916@linaro.org>
Message-ID: <7b5b77c2-1bc4-8810-93df-79343e67d1be@linaro.org>
Date: Tue, 17 Nov 2020 14:29:40 -0300
From: Luis Machado via Gdb-patches
Reply-To: Luis Machado
Cc: gdb-patches@sourceware.org

On 11/17/20 12:16 PM, David Spickett wrote:
>> Right now you can infer the technology and logical/allocation from the
>> type. There is no need for a two step process. I don't anticipate a need
>> for that in the future either, as long as we prevent the tag type ENUM
>> from having overlapping values.
>
> Right, so to not overlap, a value of that tag type ENUM must be a
> combination of technology and tag...kind. (words are hard)
> E.g. MTE logical tag, allocation tag , strange tag

Why? It only needs to be a unique value that represents a unique
combination. See below.

>
> So if were to also have "logical" and
> "allocation" tags, they would be new type values, yes?
> 0 = MTE logical
> 1 = MTE allocation
> 2 = logical
> 3 = allocation
>
> Because if they are still 0 and 1, they overlap and you can't tell MTE
> and apart.

I think there is some confusion here. Let me try to clarify.

Yes, we currently only have tag types 0 and 1 (AArch64's MTE and SPARC
ADI technologies, logical/allocation kind).

If we add more tag technologies, we will also add new code/commands that
will pass the appropriate tag types down to the target/remote layer.

For example, if we end up supporting capability tags as a new type of
tag (say, type 2, given we already have 0 and 1), we will have to
modify the code to handle that. It will be either a new command or new
code to handle it.

The target/remote layer will only see unique numbers 0, 1 and 2.

For AArch64, these numbers would mean the following:

0 - MTE logical
1 - MTE allocation
2 - Capability

For SPARC ADI, the following:

0 - ADI logical
1 - ADI allocation
2 - Unknown/Invalid

>
> I get that at the user interface layer you have some mapping back to a
> set of generic types, talking just about the
> packet contents.
>
>> So this is wrong, as the remote/native side won't be able to tell both
>> definitions of type "1" apart.
>
> Sorry that was a typo. Should be:
> 0 = mte logical, 1 = mte allocation, 1 = future logical, 2 = future
> allocation, 3 = future tag form (not logical/allocation) etc...
> (basically what I restated above)

I think the typo is still there. You meant 0, 1, 2, 3 and 4, right? But
I see what you mean.

Even though the tag type name is "tag_logical" right now, the
implementation choice is to interpret that as MTE for AArch64 and ADI
for SPARC. Same thing for "tag_allocation".

The rationale for having these two generic tag types ("tag_logical" and
"tag_allocation") is that this technology may be common to multiple
architectures (AArch64 and SPARC right now).

Personally, I don't think having 4 type enums (mte_logical,
mte_allocation, adi_logical and adi_allocation) would be better. That
would require extra code for a similar technology, just so we could pass
down, say, mte_logical instead of adi_logical, when those are in fact
the same kind of tag/technology.

Does that make sense?

>
>> If we want to address this and make it a bit more flexible, then we will
>> need to let architectures define their own tag type values. Then generic
>> code will have to go through an arch-specific hook to fetch the tag type.
>
>> Would that address your concerns?
>
> I think so. I thought of this working like the breakpoint types I
> referenced. Breakpoint type 1 for MIPS isn't in any way the same as
> type 1 for Arm etc.
> (ok they might be similar but the docs make it clear that the list of
> types is per architecture)
>

I think the situation is similar here. I opted to group similar things
together in the type names.

mte_allocation is similar to adi_allocation, but they're different in
that mte_allocation has granule size 16, while adi_allocation has
granule size 64.

mte_logical is exactly the same as adi_logical. They are both 4 bits in
size.
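
To make that grouping a bit more concrete, here is a rough sketch in C.
It is not the code from the patch series: the names (memtag_type,
memtag_info, memtag_lookup) and the table layout are made up for
illustration. Only the type values 0/1 and the granule sizes come from
the discussion above, and using a granule size of 0 for logical tags is
just a convention of this sketch.

```c
#include <stddef.h>

/* Hypothetical generic tag types shared by all architectures.
   The values 0 and 1 are the ones mentioned in this thread.  */
enum memtag_type { TAG_LOGICAL = 0, TAG_ALLOCATION = 1 };

/* Hypothetical per-architecture description of one generic tag type.  */
struct memtag_info {
    const char *name;      /* what the generic type means on this arch */
    unsigned granule_size; /* bytes per allocation tag; 0 = not applicable
                              (assumption of this sketch) */
};

/* AArch64: generic types are interpreted as MTE tags.  */
static const struct memtag_info aarch64_tags[] = {
    [TAG_LOGICAL]    = { "MTE logical",     0 },
    [TAG_ALLOCATION] = { "MTE allocation", 16 },
};

/* SPARC: the same generic types are interpreted as ADI tags.  */
static const struct memtag_info sparc_tags[] = {
    [TAG_LOGICAL]    = { "ADI logical",     0 },
    [TAG_ALLOCATION] = { "ADI allocation", 64 },
};

/* Stand-in for an arch-specific hook: map a tag type number to its
   meaning on one architecture, or NULL for an unknown/invalid type.  */
static const struct memtag_info *
memtag_lookup(const struct memtag_info *table, size_t count, int type)
{
    if (type < 0 || (size_t) type >= count)
        return NULL;
    return &table[type];
}
```

With something along these lines, generic code keeps passing 0 or 1 and
each architecture decides what those values mean. A value the
architecture does not know about, such as 2 on SPARC in the example
above, simply comes back as invalid.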

>> As long as GDB sends a unique tag type identifier (non-overlapping
>> values), we will be able to tell them apart from the remote side.
>
> Yes but depending on how those IDs are allocated we may only be able
> to tell logical vs allocation not MTE vs CHERI vs .
>
> On Tue, 17 Nov 2020 at 14:44, Luis Machado wrote:
>>
>> On 11/17/20 9:29 AM, David Spickett wrote:
>>>> Right. The type is really telling you what specific kind of tag you are
>>>> requesting, not the technology. So it may be perfectly valid to request
>>>> a MTE logical tag from the remote target, but the remote target doesn't
>>>> know how to reply to that at the moment (nor does it make much sense, IMO).
>>>>
>>>> The tag types don't overlap at the moment, given they are an ENUM in
>>>> generic code. So the server will tell them apart by their values.
>>>
>>> Ok but my concern here is that there are two aspects to type. (which I
>>> probably confused earlier tbf)
>>>
>>> 1. logical vs allocation
>>> 2. MTE vs
>>>
>>> So we have three scenarios:
>>> 1. Server uses type to decide between logical and allocation
>>
>> Right now 0 maps to logical and 1 maps to allocation for all
>> architectures. For AArch64 0 maps to MTE logical and 1 maps to MTE
>> allocation.
>>
>> But, if more tag types are supported in the future, we will need to map
>> types to logical, allocation or something else.
>>
>>> 2. Server uses type to decide between MTE and
>>
>> Right now 0 and 1 map to MTE for AArch64 and ADI for SPARC (not yet
>> supported). In the future, if there are more tagging technologies, we
>> will need to provide a mapping from tag types to tag technology.
>>
>>> 3. The combination of the two, should a system have both technologies
>>
>> In this case, we will use two mappings: type to tag type and type to tag
>> technology.
>>
>>>
>>> What I want to avoid is a two step process:
>>> 1. Somehow select tagging technology you want to interact with
>>> 2. You send memtag packets with the logical vs allocation "type"
>>>
>>> Which would be needed if the protocol "type" is just allocation vs
>>> logical. If the "type" includes the technology
>>> then we can do it in one step.
>>
>> Right now you can infer the technology and logical/allocation from the
>> type. There is no need for a two step process. I don't anticipate a need
>> for that in the future either, as long as we prevent the tag type ENUM
>> from having overlapping values.
>>
>>>
>>> So I prefer:
>>> 0 = mte logical, 1 = mte allocation, 1 = future logical, 2 = future
>>> allocation, 3 = future tag form (not logical/allocation) etc...
>>
>> So this is wrong, as the remote/native side won't be able to tell both
>> definitions of type "1" apart.
>>
>>> Over:
>>> 0 = logical, 1 = allocation, 2 = future term for other tag form etc...
>>
>> This is correct in my view. Everything is unique.
>>
>> The complicating factor here is that generic code/UI needs to use a
>> generic tag type so all architectures supporting memory tagging can use
>> the commands out of the box.
>>
>> For AArch64, tag_logical/tag_allocation means MTE logical/MTE
>> allocation. For SPARC ADI, this means whatever their tagging mechanism is.
>>
>> If we turn tag_logical into mte_logical, SPARC ADI won't be able to use
>> that. We will, again, need a mapping.
>>
>> If we want to address this and make it a bit more flexible, then we will
>> need to let architectures define their own tag type values. Then generic
>> code will have to go through an arch-specific hook to fetch the tag type.
>>
>> Would that address your concerns?
>>
>>>
>>> I see the need for the second form in gdb internally, but my concern
>>> is only with the protocol side.
>>
>> As long as GDB sends a unique tag type identifier (non-overlapping
>> values), we will be able to tell them apart from the remote side.
>>
>>>
>>>> I just want to keep that option open
>>>> if someone wants to do it, or if some other type of tag shows up that
>>>> would require such support.
>>>
>>> I hadn't thought of that. I agree that type should include that too.
>>>
>>> On Tue, 17 Nov 2020 at 12:01, Luis Machado wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 11/17/20 7:05 AM, David Spickett wrote:
>>>>>> Right now the design makes these types architecture-specific.
>>>>>
>>>>> This works too, in fact it matches the breakpoint types example better that way.
>>>>>
>>>>>> But there's one catch right now. The user-visible commands know about
>>>>>> two types of tags (logical and allocation). The native/remote side of
>>>>>> GDB only sees one type, the allocation one, as it doesn't make sense to
>>>>>> ask the native/remote target about logical tags.
>>>>>>
>>>>>> This is slightly messy and, in my opinion, should be an implementation
>>>>>> detail.
>>>>>
>>>>> Tell me if I have this right.
>>>>>
>>>>> In gdb in overall you have these two types but the server only uses
>>>>> one of them, the allocation tag type.
>>>>> So only the allocation tag type will ever go over the protocol. (for
>>>>> MTE at least)
>>>>
>>>> That's correct. Only GDB knows about logical tags. Those don't make
>>>> their way to the remote via the remote protocol.
>>>>
>>>>>
>>>>> Given that, if we assume that "mte allocation" type is 1. A future
>>>>> AArch64 kind of memory tagging could allocate 2 and on for its tag
>>>>> types.
>>>>> Something like:
>>>>> AArch64 Memory Tag types -
>>>>> 0 : MTE logical (which is internal only, reserved but documented as
>>>>> unused, or left out completely?)
>>>>> 1 : MTE allocation (the one we use at present)
>>>>> 2: logical tag (because maybe there is some server
>>>>> component for this kind of tagging extension?)
>>>>> 3: allocation tag
>>>>>
>>>>> The reason I want to clarify is that I understood the type to
>>>>> differentiate tagging technologies, not the kind of tag within them.
>>>>> (the type tells you MTE vs instead of allocation vs logical)
>>>>> The use case being what if you have MTE and active
>>>>> in the same target and I want to set an MTE allocation tag,
>>>>> how can the server tell them apart?
>>>>
>>>> Right. The type is really telling you what specific kind of tag you are
>>>> requesting, not the technology. So it may be perfectly valid to request
>>>> a MTE logical tag from the remote target, but the remote target doesn't
>>>> know how to reply to that at the moment (nor does it make much sense, IMO).
>>>>
>>>> The tag types don't overlap at the moment, given they are an ENUM in
>>>> generic code. So the server will tell them apart by their values.
>>>>
>>>>>
>>>>> If the type numbers overlap between tagging technologies, we can't
>>>>> tell them apart.
>>>>> However if they encode what extension they are for and the
>>>>> logical/allocation type (as in the example above) then we can.
>>>>>
>>>>> A lot of that is probably academic given there's one relevant type but
>>>>> we can at least document the intent of the field.
>>>>> E.g. "these types are global to AArch64 so new types should not
>>>>> overlap existing ones"
>>>>
>>>> I suppose. I chose to have generic ENUM's without specific references to
>>>> MTE for that reason. Architectures can use logical/allocation tags as
>>>> they see fit, but the ENUM values will not overlap.
>>>>
>>>> We need to support the UI as well, so there needs to be some generic
>>>> definitions so commands can query different tag types.
>>>>
>>>>>
>>>>>> Otherwise we'd need to standardize on particular tag type names across
>>>>>> different architectures, like "hw memory tag", "sw memory tag",
>>>>>> "capability tag" etc.
>>>>>
>>>>> Well I was thinking of type more as a single value like "mte". Anyway
>>>>> I'm fine with the integer route.
>>>>
>>>> Though we don't have a use for requesting logical tags from the remote
>>>> targets, it is possible to support that. Passing "mte" or any other
>>>> technology name would close that option.
>>>>
>>>> If, for example, we decide to have a dumb GDB client and a smart
>>>> GDBserver (unlikely at this point), then it would make sense to pass
>>>> down logical tag requests I think. I just want to keep that option open
>>>> if someone wants to do it, or if some other type of tag shows up that
>>>> would require such support.
>>>>
>>>>>
>>>>> On Mon, 16 Nov 2020 at 17:23, Luis Machado wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/16/20 1:04 PM, David Spickett wrote:
>>>>>>> Also with regard to the "type" field.
>>>>>>>
>>>>>>>> +@var{type} is the type of tag the request wants to fetch. The typeis a signed
>>>>>>>> +integer.
>>>>>>>
>>>>>>> (typo aside) Is this field architecture specific and will there be a
>>>>>>> list of these type numbers documented anywhere? (or already is)
>>>>>>> For example would 1 on AArch64 be MTE, and on be
>>>>>>> tag type>. Or would that be 2.
>>>>>>>
>>>>>>> My assumption has been that it is the latter and that a value means a
>>>>>>> kind of tagging extension. So for example 1=MTE rather than
>>>>>>> 1= mte logical and 2 = mte allocation. Correct me if I am wrong there.
>>>>>>
>>>>>> Right now the design makes these types architecture-specific. It would
>>>>>> be nice to have more documentation about them, for sure.
>>>>>>
>>>>>> But there's one catch right now. The user-visible commands know about
>>>>>> two types of tags (logical and allocation). The native/remote side of
>>>>>> GDB only sees one type, the allocation one, as it doesn't make sense to
>>>>>> ask the native/remote target about logical tags.
>>>>>>
>>>>>> This is slightly messy and, in my opinion, should be an implementation
>>>>>> detail.
>>>>>>
>>>>>> So, in summary... We have a couple generic tag types GDB knows about:
>>>>>> logical and allocation.
>>>>>>
>>>>>> Those types get translated to an arch/a target-specific type when they
>>>>>> cross the native/remote target boundary.
>>>>>>
>>>>>> In theory we could have generic tag types 1 and 2 in generic code, but
>>>>>> tag type 2 gets translated to type 1 in a remote packet.
>>>>>>
>>>>>> Maybe we could improve this a little.
>>>>>>
>>>>>>>
>>>>>>> A page like:
>>>>>>> https://sourceware.org/gdb/current/onlinedocs/gdb/ARM-Breakpoint-Kinds.html#ARM-Breakpoint-Kinds
>>>>>>>
>>>>>>> Or just a short note, given that there's only one type right now.
>>>>>>
>>>>>> Yes, that would be nice to expand for the tag types.
>>>>>>
>>>>>>>
>>>>>>> Also, I may have suggested the type be a string at some point. However
>>>>>>> based on examples like the link above
>>>>>>> I don't see much advantage to it apart from making packet dumps easier
>>>>>>> to read. Just wanted to close the loop on that
>>>>>>> if I didn't before.
>>>>>>
>>>>>> I don't have a strong preference here. I'm just forwarding the tag type
>>>>>> from generic code.
>>>>>>
>>>>>> If we want to pass strings, we will need a gdbarch hook that maps a type
>>>>>> to a string in the remote target layer.
>>>>>>
>>>>>> Otherwise we'd need to standardize on particular tag type names across
>>>>>> different architectures, like "hw memory tag", "sw memory tag",
>>>>>> "capability tag" etc.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 16 Nov 2020 at 15:44, David Spickett wrote:
>>>>>>>>
>>>>>>>> Minor thing, there is a missing space here in "typeis".
>>>>>>>>
>>>>>>>>> +@var{type} is the type of tag the request wants to fetch. The typeis a signed
>>>>>>>>> +integer.
>>>>>>>>
>>>>>>>> On Mon, 9 Nov 2020 at 17:08, Eli Zaretskii wrote:
>>>>>>>>>
>>>>>>>>>> Date: Mon, 9 Nov 2020 14:04:18 -0300
>>>>>>>>>> From: Luis Machado via Gdb-patches
>>>>>>>>>> Cc: david.spickett@linaro.org
>>>>>>>>>>
>>>>>>>>>> gdb/doc/ChangeLog:
>>>>>>>>>>
>>>>>>>>>> YYYY-MM-DD Luis Machado
>>>>>>>>>>
>>>>>>>>>> * gdb.texinfo (General Query Packets): Document qMemTags and
>>>>>>>>>> QMemTags. Document the "memory-tagging" feature.
>>>>>>>>>> ---
>>>>>>>>>> gdb/doc/gdb.texinfo | 96 +++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> 1 file changed, 96 insertions(+)
>>>>>>>>>
>>>>>>>>> OK for this part, thanks.