From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id y2fSJboL6198XgAAWB0awg (envelope-from ) for ; Tue, 29 Dec 2020 05:58:02 -0500 Received: by simark.ca (Postfix, from userid 112) id 8B1471F0AA; Tue, 29 Dec 2020 05:58:02 -0500 (EST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,RDNS_NONE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.2 Received: from sourceware.org (unknown [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id BF9C71E99A for ; Tue, 29 Dec 2020 05:58:01 -0500 (EST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B8B5F38350BA; Tue, 29 Dec 2020 10:58:00 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B8B5F38350BA DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1609239480; bh=FP9YLDnosfqu0gTsftWB7w4L3IOCtX1aFdQtzG2vC5E=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=VLGU8mgcZP5TaH8qVz1ohJqH3xvsxepEBCgyp3eHni76FgdzbI2by8+rQoGj1Z3S4 SeYCC9aumL/ikDmGQanuIfHrHph9A1wpOkBuoms8UXjoBbzRdgPtHw3RQLxq5mAzlT Sg1caRE6iYPt2qhog7XEKHjj5XgWp6KgmvFjpUoQ= Received: from mail-qk1-x72e.google.com (mail-qk1-x72e.google.com [IPv6:2607:f8b0:4864:20::72e]) by sourceware.org (Postfix) with ESMTPS id 4F7BF38438A3 for ; Tue, 29 Dec 2020 10:57:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 4F7BF38438A3 Received: by mail-qk1-x72e.google.com with SMTP id b64so11006249qkc.12 for ; Tue, 29 Dec 2020 02:57:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=FP9YLDnosfqu0gTsftWB7w4L3IOCtX1aFdQtzG2vC5E=; b=PLCArPRHA+g31c6BEuaox19N+NFQPT8HW03VBhdquwd5HWgCdWuXxHYaCRfY8Ryfsp 1YcHUhLXu74EEA+76kBvykaaV/j8esmkdpAiK5Mvkg6p+f86kWX/FIJxoHEZehC6Oav1 QAhz8OYW+7Uj+S+FbZP8iOWcw3Hb9AXPsnyzCPrNmFM5ukd6YT/5ECIU8eYQ1CwBLfMg qv7H9xl2th/RTcz6PEzK4PM680Ys0qZR4xVsTOZqpX+H8iktyXf2Mn3eocDT4ilpyf2q TrWU0xTKby5nyn36wFc2MufaTIpufTloIJhUIaPIkDbsA7gkYeSGzbb2gvGqzm+3F1RK 2DjA== X-Gm-Message-State: AOAM531j3iJ6Lmy1hqzEKGPGKQ7m/rq0tRPMwKjuSLcgiL0nwXgGtAvv 7B0V/uWeOs9e+vlpWsrkJMnLubWELf5PuA== X-Google-Smtp-Source: ABdhPJxjyr/neJsBZcLpDnIl6YE34gSVogWcvLgKdVTE8k+BW6ROVeJDSqwCixkK6sq7TT/K38EJmA== X-Received: by 2002:a37:d13:: with SMTP id 19mr19651665qkn.93.1609239476657; Tue, 29 Dec 2020 02:57:56 -0800 (PST) Received: from ?IPv6:2804:7f0:8284:370e:19e7:527f:e109:2734? ([2804:7f0:8284:370e:19e7:527f:e109:2734]) by smtp.gmail.com with ESMTPSA id p128sm3771396qkb.101.2020.12.29.02.57.54 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 29 Dec 2020 02:57:56 -0800 (PST) Subject: Re: [PATCH v3 13/24] Refactor parsing of /proc//smaps To: Simon Marchi , gdb-patches@sourceware.org References: <20201109170435.15766-1-luis.machado@linaro.org> <20201109170435.15766-14-luis.machado@linaro.org> Message-ID: Date: Tue, 29 Dec 2020 07:57:53 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Luis Machado via Gdb-patches Reply-To: Luis Machado Cc: david.spickett@linaro.org Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" On 12/26/20 3:36 AM, Simon Marchi wrote: > > > On 2020-11-09 12:04 p.m., Luis Machado via Gdb-patches wrote: >> The Linux kernel exposes the information about MTE-protected pages via the >> proc filesystem, more specifically through the smaps file. >> >> What we're looking for is a mapping with the 'mt' flag, which tells us that >> mapping was created with a PROT_MTE flag and, thus, is capable of using memory >> tagging. >> >> We already parse that file for other purposes (core file >> generation/filtering), so this patch refactors the code to make the parsing >> of the smaps file reusable for memory tagging. >> >> The function linux_address_in_memtag_page uses the refactored code to allow >> querying for memory tag support in a particular address, and it gets used in the >> next patch. >> >> gdb/ChangeLog: >> >> YYYY-MM-DD Luis Machado >> >> * linux-tdep.c (struct smaps_vmflags) : New flag >> bit. >> (struct smaps_data): New struct. >> (decode_vmflags): Handle the 'mt' flag. >> (parse_smaps_data): New function, refactored from >> linux_find_memory_regions_full. >> (linux_address_in_memtag_page): New function. >> (linux_find_memory_regions_full): Refactor into parse_smaps_data. >> * linux-tdep.h (linux_address_in_memtag_page): New prototype. >> --- >> gdb/linux-tdep.c | 358 +++++++++++++++++++++++++++++++---------------- >> gdb/linux-tdep.h | 4 + >> 2 files changed, 241 insertions(+), 121 deletions(-) >> >> diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c >> index bacb61398f..91bdcc133b 100644 >> --- a/gdb/linux-tdep.c >> +++ b/gdb/linux-tdep.c >> @@ -86,8 +86,33 @@ struct smaps_vmflags >> /* Is this a MAP_SHARED mapping (VM_SHARED, "sh"). */ >> >> unsigned int shared_mapping : 1; >> + >> + /* Memory map has memory tagging enabled. */ >> + >> + unsigned int memory_tagging : 1; >> }; >> >> +/* Data structure that holds the information contained in the >> + /proc//smaps file. */ >> + >> +struct smaps_data >> +{ >> + ULONGEST start_address; >> + ULONGEST end_address; >> + std::string filename; >> + struct smaps_vmflags vmflags; >> + bool read; >> + bool write; >> + bool exec; >> + bool priv; >> + bool has_anonymous; >> + bool mapping_anon_p; >> + bool mapping_file_p; >> + >> + ULONGEST inode; >> + ULONGEST offset; >> +}; >> + >> /* Whether to take the /proc/PID/coredump_filter into account when >> generating a corefile. */ >> >> @@ -472,6 +497,8 @@ decode_vmflags (char *p, struct smaps_vmflags *v) >> v->exclude_coredump = 1; >> else if (strcmp (s, "sh") == 0) >> v->shared_mapping = 1; >> + else if (strcmp (s, "mt") == 0) >> + v->memory_tagging = 1; >> } >> } >> >> @@ -1267,6 +1294,184 @@ typedef int linux_dump_mapping_p_ftype (filter_flags filterflags, >> const char *filename, >> ULONGEST addr, >> ULONGEST offset); >> +/* Helper function to parse the contents of /proc//smaps into a data >> + structure, for easy access. >> + >> + DATA is the contents of the smaps file. The parsed contents are stored >> + into the SMAPS vector. */ > > Add a blank line above this comment. > Fixed. >> + >> +static int >> +parse_smaps_data (const char *data, >> + std::vector &smaps, >> + const char *mapsfilename) > > The return value of that function is unused, it always return 0. I'd suggest > making it return the std::vector directly. > There has been some changes to this function. I recall it returning 1 as well. I've changed it as suggested. It makes it simpler. >> +{ >> + char *line, *t; >> + >> + gdb_assert (data != nullptr); >> + >> + smaps.clear (); >> + >> + line = strtok_r ((char *) data, "\n", &t); >> + >> + while (line != NULL) >> + { >> + ULONGEST addr, endaddr, offset, inode; >> + const char *permissions, *device, *filename; >> + struct smaps_vmflags v; >> + size_t permissions_len, device_len; >> + int read, write, exec, priv; >> + int has_anonymous = 0; >> + int mapping_anon_p; >> + int mapping_file_p; >> + >> + memset (&v, 0, sizeof (v)); >> + read_mapping (line, &addr, &endaddr, &permissions, &permissions_len, >> + &offset, &device, &device_len, &inode, &filename); >> + mapping_anon_p = mapping_is_anonymous_p (filename); >> + /* If the mapping is not anonymous, then we can consider it >> + to be file-backed. These two states (anonymous or >> + file-backed) seem to be exclusive, but they can actually >> + coexist. For example, if a file-backed mapping has >> + "Anonymous:" pages (see more below), then the Linux >> + kernel will dump this mapping when the user specified >> + that she only wants anonymous mappings in the corefile >> + (*even* when she explicitly disabled the dumping of >> + file-backed mappings). */ >> + mapping_file_p = !mapping_anon_p; >> + >> + /* Decode permissions. */ >> + read = (memchr (permissions, 'r', permissions_len) != 0); >> + write = (memchr (permissions, 'w', permissions_len) != 0); >> + exec = (memchr (permissions, 'x', permissions_len) != 0); >> + /* 'private' here actually means VM_MAYSHARE, and not >> + VM_SHARED. In order to know if a mapping is really >> + private or not, we must check the flag "sh" in the >> + VmFlags field. This is done by decode_vmflags. However, >> + if we are using a Linux kernel released before the commit >> + 834f82e2aa9a8ede94b17b656329f850c1471514 (3.10), we will >> + not have the VmFlags there. In this case, there is >> + really no way to know if we are dealing with VM_SHARED, >> + so we just assume that VM_MAYSHARE is enough. */ >> + priv = memchr (permissions, 'p', permissions_len) != 0; >> + >> + /* Try to detect if region should be dumped by parsing smaps >> + counters. */ >> + for (line = strtok_r (NULL, "\n", &t); >> + line != NULL && line[0] >= 'A' && line[0] <= 'Z'; >> + line = strtok_r (NULL, "\n", &t)) >> + { >> + char keyword[64 + 1]; >> + >> + if (sscanf (line, "%64s", keyword) != 1) >> + { >> + warning (_("Error parsing {s,}maps file '%s'"), mapsfilename); >> + break; >> + } >> + >> + if (strcmp (keyword, "Anonymous:") == 0) >> + { >> + /* Older Linux kernels did not support the >> + "Anonymous:" counter. Check it here. */ >> + has_anonymous = 1; >> + } >> + else if (strcmp (keyword, "VmFlags:") == 0) >> + decode_vmflags (line, &v); >> + >> + if (strcmp (keyword, "AnonHugePages:") == 0 >> + || strcmp (keyword, "Anonymous:") == 0) >> + { >> + unsigned long number; >> + >> + if (sscanf (line, "%*s%lu", &number) != 1) >> + { >> + warning (_("Error parsing {s,}maps file '%s' number"), >> + mapsfilename); >> + break; >> + } >> + if (number > 0) >> + { >> + /* Even if we are dealing with a file-backed >> + mapping, if it contains anonymous pages we >> + consider it to be *also* an anonymous >> + mapping, because this is what the Linux >> + kernel does: >> + >> + // Dump segments that have been written to. >> + if (vma->anon_vma && FILTER(ANON_PRIVATE)) >> + goto whole; >> + >> + Note that if the mapping is already marked as >> + file-backed (i.e., mapping_file_p is >> + non-zero), then this is a special case, and >> + this mapping will be dumped either when the >> + user wants to dump file-backed *or* anonymous >> + mappings. */ >> + mapping_anon_p = 1; >> + } >> + } >> + } >> + /* Save the smaps entry to the vector. */ >> + struct smaps_data map; >> + std::string fname (filename); > > You don't need to create the string here, just do > > map.filename = filename; > > below. > Fixed. >> + >> + smaps.emplace_back (map); >> + } >> + >> + return 0; >> +} >> + >> +/* See linux-tdep.h. */ >> + >> +bool >> +linux_address_in_memtag_page (CORE_ADDR address) >> +{ >> + if (current_inferior ()->fake_pid_p) >> + return false;> + >> + pid_t pid = current_inferior ()->pid; >> + >> + std::string smaps_file = string_printf ("/proc/%d/smaps", pid); >> + >> + gdb::unique_xmalloc_ptr data >> + = target_fileio_read_stralloc (NULL, smaps_file.c_str ()); >> + >> + if (data == nullptr) >> + return false; >> + >> + std::vector smaps; >> + >> + /* Parse the contents of smaps into a vector. */ >> + parse_smaps_data (data.get (), smaps, smaps_file.c_str ()); >> + >> + if (!smaps.empty ()) >> + { > > No need for this `if`. > >> + for (struct smaps_data map : smaps) > > Iterate by const reference: > > for (const smaps_data &map : smaps) > Fixed now. >> + { >> + /* Is the address within [start_address, end_address) in a page >> + mapped with memory tagging? */ >> + if (address >= map.start_address >> + && address < map.end_address >> + && map.vmflags.memory_tagging) >> + return true; >> + } >> + } >> + >> + return false; >> +} >> >> /* List memory regions in the inferior for a corefile. */ >> >> @@ -1276,8 +1481,7 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch, >> linux_find_memory_region_ftype *func, >> void *obfd) >> { >> - char mapsfilename[100]; >> - char coredumpfilter_name[100]; >> + std::string coredumpfilter_name; > > Declare this on first use (we could push some preparatory patches to change > these to std::string, to make this patch simpler). > Yeah. That looks like it will benefit from a cleanup. I'll do as a separate patch. >> pid_t pid; >> /* Default dump behavior of coredump_filter (0x33), according to >> Documentation/filesystems/proc.txt from the Linux kernel >> @@ -1295,10 +1499,9 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch, >> >> if (use_coredump_filter) >> { >> - xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name), >> - "/proc/%d/coredump_filter", pid); >> + coredumpfilter_name = string_printf ("/proc/%d/coredump_filter", pid); >> gdb::unique_xmalloc_ptr coredumpfilterdata >> - = target_fileio_read_stralloc (NULL, coredumpfilter_name); >> + = target_fileio_read_stralloc (NULL, coredumpfilter_name.c_str ()); >> if (coredumpfilterdata != NULL) >> { >> unsigned int flags; >> @@ -1308,125 +1511,39 @@ linux_find_memory_regions_full (struct gdbarch *gdbarch, >> } >> } >> >> - xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid); >> + std::string mapsfilename = string_printf ("/proc/%d/smaps", pid); >> gdb::unique_xmalloc_ptr data >> - = target_fileio_read_stralloc (NULL, mapsfilename); >> + = target_fileio_read_stralloc (NULL, mapsfilename.c_str ()); >> if (data == NULL) >> { >> /* Older Linux kernels did not support /proc/PID/smaps. */ >> - xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid); >> - data = target_fileio_read_stralloc (NULL, mapsfilename); >> + mapsfilename = string_printf ("/proc/%d/maps", pid); >> + data = target_fileio_read_stralloc (NULL, mapsfilename.c_str ()); >> } >> >> - if (data != NULL) >> - { >> - char *line, *t; >> - >> - line = strtok_r (data.get (), "\n", &t); >> - while (line != NULL) >> - { >> - ULONGEST addr, endaddr, offset, inode; >> - const char *permissions, *device, *filename; >> - struct smaps_vmflags v; >> - size_t permissions_len, device_len; >> - int read, write, exec, priv; >> - int has_anonymous = 0; >> - int should_dump_p = 0; >> - int mapping_anon_p; >> - int mapping_file_p; >> - >> - memset (&v, 0, sizeof (v)); >> - read_mapping (line, &addr, &endaddr, &permissions, &permissions_len, >> - &offset, &device, &device_len, &inode, &filename); >> - mapping_anon_p = mapping_is_anonymous_p (filename); >> - /* If the mapping is not anonymous, then we can consider it >> - to be file-backed. These two states (anonymous or >> - file-backed) seem to be exclusive, but they can actually >> - coexist. For example, if a file-backed mapping has >> - "Anonymous:" pages (see more below), then the Linux >> - kernel will dump this mapping when the user specified >> - that she only wants anonymous mappings in the corefile >> - (*even* when she explicitly disabled the dumping of >> - file-backed mappings). */ >> - mapping_file_p = !mapping_anon_p; >> - >> - /* Decode permissions. */ >> - read = (memchr (permissions, 'r', permissions_len) != 0); >> - write = (memchr (permissions, 'w', permissions_len) != 0); >> - exec = (memchr (permissions, 'x', permissions_len) != 0); >> - /* 'private' here actually means VM_MAYSHARE, and not >> - VM_SHARED. In order to know if a mapping is really >> - private or not, we must check the flag "sh" in the >> - VmFlags field. This is done by decode_vmflags. However, >> - if we are using a Linux kernel released before the commit >> - 834f82e2aa9a8ede94b17b656329f850c1471514 (3.10), we will >> - not have the VmFlags there. In this case, there is >> - really no way to know if we are dealing with VM_SHARED, >> - so we just assume that VM_MAYSHARE is enough. */ >> - priv = memchr (permissions, 'p', permissions_len) != 0; >> - >> - /* Try to detect if region should be dumped by parsing smaps >> - counters. */ >> - for (line = strtok_r (NULL, "\n", &t); >> - line != NULL && line[0] >= 'A' && line[0] <= 'Z'; >> - line = strtok_r (NULL, "\n", &t)) >> - { >> - char keyword[64 + 1]; >> + if (data == nullptr) >> + return 1; >> >> - if (sscanf (line, "%64s", keyword) != 1) >> - { >> - warning (_("Error parsing {s,}maps file '%s'"), mapsfilename); >> - break; >> - } >> + std::vector smaps; >> >> - if (strcmp (keyword, "Anonymous:") == 0) >> - { >> - /* Older Linux kernels did not support the >> - "Anonymous:" counter. Check it here. */ >> - has_anonymous = 1; >> - } >> - else if (strcmp (keyword, "VmFlags:") == 0) >> - decode_vmflags (line, &v); >> + /* Parse the contents of smaps into a vector. */ >> + parse_smaps_data (data.get (), smaps, mapsfilename.c_str ()); >> >> - if (strcmp (keyword, "AnonHugePages:") == 0 >> - || strcmp (keyword, "Anonymous:") == 0) >> - { >> - unsigned long number; >> - >> - if (sscanf (line, "%*s%lu", &number) != 1) >> - { >> - warning (_("Error parsing {s,}maps file '%s' number"), >> - mapsfilename); >> - break; >> - } >> - if (number > 0) >> - { >> - /* Even if we are dealing with a file-backed >> - mapping, if it contains anonymous pages we >> - consider it to be *also* an anonymous >> - mapping, because this is what the Linux >> - kernel does: >> - >> - // Dump segments that have been written to. >> - if (vma->anon_vma && FILTER(ANON_PRIVATE)) >> - goto whole; >> - >> - Note that if the mapping is already marked as >> - file-backed (i.e., mapping_file_p is >> - non-zero), then this is a special case, and >> - this mapping will be dumped either when the >> - user wants to dump file-backed *or* anonymous >> - mappings. */ >> - mapping_anon_p = 1; >> - } >> - } >> - } >> + if (!smaps.empty ()) >> + { >> + for (struct smaps_data map : smaps) > > Same comments as above. > > Simon > Fixed. Thanks.