From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 82013 invoked by alias); 23 Mar 2015 18:40:25 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 81952 invoked by uid 89); 23 Mar 2015 18:40:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 23 Mar 2015 18:40:20 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 6BBFB8E67B for ; Mon, 23 Mar 2015 18:40:19 +0000 (UTC) Received: from localhost (unused-10-15-17-126.yyz.redhat.com [10.15.17.126]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t2NIeIR1020498 (version=TLSv1/SSLv3 cipher=AES128-GCM-SHA256 bits=128 verify=NO); Mon, 23 Mar 2015 14:40:18 -0400 From: Sergio Durigan Junior To: Pedro Alves Cc: GDB Patches Subject: Re: [PATCH 1/2] Implement support for checking /proc/PID/coredump_filter References: <1426807358-18295-1-git-send-email-sergiodj@redhat.com> <1426807358-18295-2-git-send-email-sergiodj@redhat.com> <550C7118.6080605@redhat.com> <87vbhvwi7x.fsf@redhat.com> <550C98E9.1030802@redhat.com> X-URL: http://blog.sergiodj.net Date: Mon, 23 Mar 2015 18:40:00 -0000 In-Reply-To: <550C98E9.1030802@redhat.com> (Pedro Alves's message of "Fri, 20 Mar 2015 22:02:17 +0000") Message-ID: <87bnjj8smm.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-03/txt/msg00732.txt.bz2 On Friday, March 20 2015, Pedro Alves wrote: > On 03/20/2015 08:02 PM, Sergio Durigan Junior wrote: >>>> >> + It is worth mentioning that, from all those checks described >>>> >> + above, the most fragile is the one to see if the file name en= ds >>>> >> + with " (deleted)". This does not necessarily mean that the >>>> >> + mapping is anonymous, because the deleted file associated with >>>> >> + the mapping may have been a hard link to another file, for >>>> >> + example. The Linux kernel checks to see if "i_nlink =3D=3D 0= ", but >>>> >> + GDB cannot easily do this check. Therefore, we made a compro= mise >>> > >>> > Cannot do it easily, or cannot do it at all? >> It cannot do it easily. If GDB is run by root, then it could (in >> theory) inspect the contents of /proc/PID/map_files and determine >> whether the " (deleted)" file was a hard link or not. > > Ah, could you include that in the comment please? > Otherwise every time I read that I'll wonder what wasn't easy. :-) > > Like: > > "(..) example. The Linux kernel checks to see if "i_nlink =3D=3D 0", but > GDB cannot easily (and normally) do this check (iff running as root, > it could find the mapping in /proc/PID/map_files/ and determine > whether there still are other hard links to the inode/file). > Therefore, we made a compromise (...)" Done. Patch attached. I will wait until you look at the second patch (and approve it) in order to commit everything together. > I quickly tried something like that with "/proc/pid/fd/*", and > unfortunately the link count is always 1, at least on F20's kernel: > > $ tail > a& > [1] 12343 > > [1]+ Stopped tail > a > $ ls -l a > -rw-rw-r--. 1 pedro pedro 0 Mar 20 21:29 a > $ ln a b > $ ls -l a > -rw-rw-r--. 2 pedro pedro 0 Mar 20 21:29 a > $ ls -l /proc/12343/fd/1 > l-wx------. 1 pedro pedro 64 Mar 20 21:30 /proc/12343/fd/1 -> /tmp/a > > If map_files/ works the same, I think we'd have to get the > file's inode number, and look over the whole filesystem for > other files with that inode number (the equivalent > of 'ls -i' 'find / -inum'). Urgh! Jan has provided a good example of it before. When you use stat -L, you can see if the 'Links' count is zero (which means the real file has been really deleted). For example: lrw-------. 1 sergio sergio 64 Mar 23 14:34 7f2c5911b000-7f2c5911c000 -> = /tmp/t2 (deleted) [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000=20 File: =E2=80=987f2c5911b000-7f2c5911c000=E2=80=99 Size: 0 Blocks: 0 IO Block: 4096 regular empty= file Device: 23h/35d Inode: 217633770 Links: 1 The 'Links' is 1 because, despite the fact that 't2' has been deleted, the original file 't1' still exists. However, if I remove 't1': [root@bla map_files]# stat -L 7f2c5911b000-7f2c5911c000=20 File: =E2=80=987f2c5911b000-7f2c5911c000=E2=80=99 Size: 0 Blocks: 0 IO Block: 4096 regular empty= file Device: 23h/35d Inode: 217633770 Links: 0 Therefore, it should be possible to use stat even in this case. But anyway, this was just FYI, we're not doing that. --=20 Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/ diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c index ea0d4cd..26edd5d 100644 --- a/gdb/linux-tdep.c +++ b/gdb/linux-tdep.c @@ -35,9 +35,61 @@ #include "observer.h" #include "objfiles.h" #include "infcall.h" +#include "gdbcmd.h" +#include "gdb_regex.h" =20 #include =20 +/* This enum represents the values that the user can choose when + informing the Linux kernel about which memory mappings will be + dumped in a corefile. They are described in the file + Documentation/filesystems/proc.txt, inside the Linux kernel + tree. */ + +enum + { + COREFILTER_ANON_PRIVATE =3D 1 << 0, + COREFILTER_ANON_SHARED =3D 1 << 1, + COREFILTER_MAPPED_PRIVATE =3D 1 << 2, + COREFILTER_MAPPED_SHARED =3D 1 << 3, + COREFILTER_ELF_HEADERS =3D 1 << 4, + COREFILTER_HUGETLB_PRIVATE =3D 1 << 5, + COREFILTER_HUGETLB_SHARED =3D 1 << 6, + }; + +/* This struct is used to map flags found in the "VmFlags:" field (in + the /proc//smaps file). */ + +struct smaps_vmflags + { + /* Zero if this structure has not been initialized yet. It + probably means that the Linux kernel being used does not emit + the "VmFlags:" field on "/proc/PID/smaps". */ + + unsigned int initialized_p : 1; + + /* Memory mapped I/O area (VM_IO, "io"). */ + + unsigned int io_page : 1; + + /* Area uses huge TLB pages (VM_HUGETLB, "ht"). */ + + unsigned int uses_huge_tlb : 1; + + /* Do not include this memory region on the coredump (VM_DONTDUMP, "dd= "). */ + + unsigned int exclude_coredump : 1; + + /* Is this a MAP_SHARED mapping (VM_SHARED, "sh"). */ + + unsigned int shared_mapping : 1; + }; + +/* Whether to take the /proc/PID/coredump_filter into account when + generating a corefile. */ + +static int use_coredump_filter =3D 1; + /* This enum represents the signals' numbers on a generic architecture running the Linux kernel. The definition of "generic" comes from the file , from the Linux kernel @@ -381,6 +433,231 @@ read_mapping (const char *line, *filename =3D p; } =20 +/* Helper function to decode the "VmFlags" field in /proc/PID/smaps. + + This function was based on the documentation found on + , on the Linux kernel. + + Linux kernels before commit + 834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have this + field on smaps. */ + +static void +decode_vmflags (char *p, struct smaps_vmflags *v) +{ + char *saveptr; + const char *s; + + v->initialized_p =3D 1; + p =3D skip_to_space (p); + p =3D skip_spaces (p); + + for (s =3D strtok_r (p, " ", &saveptr); + s !=3D NULL; + s =3D strtok_r (NULL, " ", &saveptr)) + { + if (strcmp (s, "io") =3D=3D 0) + v->io_page =3D 1; + else if (strcmp (s, "ht") =3D=3D 0) + v->uses_huge_tlb =3D 1; + else if (strcmp (s, "dd") =3D=3D 0) + v->exclude_coredump =3D 1; + else if (strcmp (s, "sh") =3D=3D 0) + v->shared_mapping =3D 1; + } +} + +/* Return 1 if the memory mapping is anonymous, 0 otherwise. + + FILENAME is the name of the file present in the first line of the + memory mapping, in the "/proc/PID/smaps" output. For example, if + the first line is: + + 7fd0ca877000-7fd0d0da0000 r--p 00000000 fd:02 2100770 /path/to/file + + Then FILENAME will be "/path/to/file". */ + +static int +mapping_is_anonymous_p (const char *filename) +{ + static regex_t dev_zero_regex, shmem_file_regex, file_deleted_regex; + static int init_regex_p =3D 0; + + if (!init_regex_p) + { + struct cleanup *c =3D make_cleanup (null_cleanup, NULL); + + /* Let's be pessimistic and assume there will be an error while + compiling the regex'es. */ + init_regex_p =3D -1; + + /* DEV_ZERO_REGEX matches "/dev/zero" filenames (with or + without the "(deleted)" string in the end). We know for + sure, based on the Linux kernel code, that memory mappings + whose associated filename is "/dev/zero" are guaranteed to be + MAP_ANONYMOUS. */ + compile_rx_or_error (&dev_zero_regex, "^/dev/zero\\( (deleted)\\)\\?= $", + _("Could not compile regex to match /dev/zero " + "filename")); + /* SHMEM_FILE_REGEX matches "/SYSV%08x" filenames (with or + without the "(deleted)" string in the end). These filenames + refer to shared memory (shmem), and memory mappings + associated with them are MAP_ANONYMOUS as well. */ + compile_rx_or_error (&shmem_file_regex, + "^/\\?SYSV[0-9a-fA-F]\\{8\\}\\( (deleted)\\)\\?$", + _("Could not compile regex to match shmem " + "filenames")); + /* FILE_DELETED_REGEX is a heuristic we use to try to mimic the + Linux kernel's 'n_link =3D=3D 0' code, which is responsible to + decide if it is dealing with a 'MAP_SHARED | MAP_ANONYMOUS' + mapping. In other words, if FILE_DELETED_REGEX matches, it + does not necessarily mean that we are dealing with an + anonymous shared mapping. However, there is no easy way to + detect this currently, so this is the best approximation we + have. + + As a result, GDB will dump readonly pages of deleted + executables when using the default value of coredump_filter + (0x33), while the Linux kernel will not dump those pages. + But we can live with that. */ + compile_rx_or_error (&file_deleted_regex, " (deleted)$", + _("Could not compile regex to match " + "' (deleted)'")); + /* We will never release these regexes, so just discard the + cleanups. */ + discard_cleanups (c); + + /* If we reached this point, then everything succeeded. */ + init_regex_p =3D 1; + } + + if (init_regex_p =3D=3D -1) + { + const char deleted[] =3D " (deleted)"; + size_t del_len =3D sizeof (deleted) - 1; + size_t filename_len =3D strlen (filename); + + /* There was an error while compiling the regex'es above. In + order to try to give some reliable information to the caller, + we just try to find the string " (deleted)" in the filename. + If we managed to find it, then we assume the mapping is + anonymous. */ + return (filename_len >=3D del_len + && strcmp (filename + filename_len - del_len, deleted) =3D=3D 0); + } + + if (*filename =3D=3D '\0' + || regexec (&dev_zero_regex, filename, 0, NULL, 0) =3D=3D 0 + || regexec (&shmem_file_regex, filename, 0, NULL, 0) =3D=3D 0 + || regexec (&file_deleted_regex, filename, 0, NULL, 0) =3D=3D 0) + return 1; + + return 0; +} + +/* Return 0 if the memory mapping (which is related to FILTERFLAGS, V, + MAYBE_PRIVATE_P, and MAPPING_ANONYMOUS_P) should not be dumped, or + greater than 0 if it should. + + In a nutshell, this is the logic that we follow in order to decide + if a mapping should be dumped or not. + + - If the mapping is associated to a file whose name ends with + " (deleted)", or if the file is "/dev/zero", or if it is + "/SYSV%08x" (shared memory), or if there is no file associated + with it, or if the AnonHugePages: or the Anonymous: fields in the + /proc/PID/smaps have contents, then GDB considers this mapping to + be anonymous. Otherwise, GDB considers this mapping to be a + file-backed mapping (because there will be a file associated with + it). +=20 + It is worth mentioning that, from all those checks described + above, the most fragile is the one to see if the file name ends + with " (deleted)". This does not necessarily mean that the + mapping is anonymous, because the deleted file associated with + the mapping may have been a hard link to another file, for + example. The Linux kernel checks to see if "i_nlink =3D=3D 0", but + GDB cannot easily (and normally) do this check (iff running as + root, it could find the mapping in /proc/PID/map_files/ and + determine whether there still are other hard links to the + inode/file). Therefore, we made a compromise here, and we assume + that if the file name ends with " (deleted)", then the mapping is + indeed anonymous. FWIW, this is something the Linux kernel could + do better: expose this information in a more direct way. +=20 + - If we see the flag "sh" in the "VmFlags:" field (in + /proc/PID/smaps), then certainly the memory mapping is shared + (VM_SHARED). If we have access to the VmFlags, and we don't see + the "sh" there, then certainly the mapping is private. However, + Linux kernels before commit + 834f82e2aa9a8ede94b17b656329f850c1471514 (3.10) do not have the + "VmFlags:" field; in that case, we use another heuristic: if we + see 'p' in the permission flags, then we assume that the mapping + is private, even though the presence of the 's' flag there would + mean VM_MAYSHARE, which means the mapping could still be private. + This should work OK enough, however. */ + +static int +dump_mapping_p (unsigned int filterflags, const struct smaps_vmflags *v, + int maybe_private_p, int mapping_anon_p, const char *filename) +{ + /* Initially, we trust in what we received from our caller. This + value may not be very precise (i.e., it was probably gathered + from the permission line in the /proc/PID/smaps list, which + actually refers to VM_MAYSHARE, and not VM_SHARED), but it is + what we have until we take a look at the "VmFlags:" field + (assuming that the version of the Linux kernel being used + supports it, of course). */ + int private_p =3D maybe_private_p; + + /* We always dump vDSO and vsyscall mappings, because it's likely that + there'll be no file to read the contents from at core load time. + The kernel does the same. */ + if (strcmp ("[vdso]", filename) =3D=3D 0 + || strcmp ("[vsyscall]", filename) =3D=3D 0) + return 1; + + if (v->initialized_p) + { + /* We never dump I/O mappings. */ + if (v->io_page) + return 0; + + /* Check if we should exclude this mapping. */ + if (v->exclude_coredump) + return 0; + + /* Update our notion of whether this mapping is shared or + private based on a trustworthy value. */ + private_p =3D !v->shared_mapping; + + /* HugeTLB checking. */ + if (v->uses_huge_tlb) + { + if ((private_p && (filterflags & COREFILTER_HUGETLB_PRIVATE)) + || (!private_p && (filterflags & COREFILTER_HUGETLB_SHARED))) + return 1; + + return 0; + } + } + + if (private_p) + { + if (mapping_anon_p) + return (filterflags & COREFILTER_ANON_PRIVATE) !=3D 0; + else + return (filterflags & COREFILTER_MAPPED_PRIVATE) !=3D 0; + } + else + { + if (mapping_anon_p) + return (filterflags & COREFILTER_ANON_SHARED) !=3D 0; + else + return (filterflags & COREFILTER_MAPPED_SHARED) !=3D 0; + } +} + /* Implement the "info proc" command. */ =20 static void @@ -819,48 +1096,86 @@ linux_find_memory_regions_full (struct gdbarch *gdba= rch, void *obfd) { char mapsfilename[100]; - char *data; + char coredumpfilter_name[100]; + char *data, *coredumpfilterdata; + pid_t pid; + /* Default dump behavior of coredump_filter (0x33), according to + Documentation/filesystems/proc.txt from the Linux kernel + tree. */ + unsigned int filterflags =3D (COREFILTER_ANON_PRIVATE + | COREFILTER_ANON_SHARED + | COREFILTER_ELF_HEADERS + | COREFILTER_HUGETLB_PRIVATE); =20 /* We need to know the real target PID to access /proc. */ if (current_inferior ()->fake_pid_p) return 1; =20 - xsnprintf (mapsfilename, sizeof mapsfilename, - "/proc/%d/smaps", current_inferior ()->pid); + pid =3D current_inferior ()->pid; + + if (use_coredump_filter) + { + xsnprintf (coredumpfilter_name, sizeof (coredumpfilter_name), + "/proc/%d/coredump_filter", pid); + coredumpfilterdata =3D target_fileio_read_stralloc (coredumpfilter_n= ame); + if (coredumpfilterdata !=3D NULL) + { + sscanf (coredumpfilterdata, "%x", &filterflags); + xfree (coredumpfilterdata); + } + } + + xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/smaps", pid); data =3D target_fileio_read_stralloc (mapsfilename); if (data =3D=3D NULL) { /* Older Linux kernels did not support /proc/PID/smaps. */ - xsnprintf (mapsfilename, sizeof mapsfilename, - "/proc/%d/maps", current_inferior ()->pid); + xsnprintf (mapsfilename, sizeof mapsfilename, "/proc/%d/maps", pid); data =3D target_fileio_read_stralloc (mapsfilename); } - if (data) + + if (data !=3D NULL) { struct cleanup *cleanup =3D make_cleanup (xfree, data); - char *line; + char *line, *t; =20 - line =3D strtok (data, "\n"); - while (line) + line =3D strtok_r (data, "\n", &t); + while (line !=3D NULL) { ULONGEST addr, endaddr, offset, inode; const char *permissions, *device, *filename; + struct smaps_vmflags v; size_t permissions_len, device_len; - int read, write, exec; - int modified =3D 0, has_anonymous =3D 0; + int read, write, exec, private; + int has_anonymous =3D 0; + int should_dump_p =3D 0; + int mapping_anon_p; =20 + memset (&v, 0, sizeof (v)); read_mapping (line, &addr, &endaddr, &permissions, &permissions_len, &offset, &device, &device_len, &inode, &filename); + mapping_anon_p =3D mapping_is_anonymous_p (filename); =20 /* Decode permissions. */ read =3D (memchr (permissions, 'r', permissions_len) !=3D 0); write =3D (memchr (permissions, 'w', permissions_len) !=3D 0); exec =3D (memchr (permissions, 'x', permissions_len) !=3D 0); - - /* Try to detect if region was modified by parsing smaps counters. */ - for (line =3D strtok (NULL, "\n"); - line && line[0] >=3D 'A' && line[0] <=3D 'Z'; - line =3D strtok (NULL, "\n")) + /* 'private' here actually means VM_MAYSHARE, and not + VM_SHARED. In order to know if a mapping is really + private or not, we must check the flag "sh" in the + VmFlags field. This is done by decode_vmflags. However, + if we are using a Linux kernel released before the commit + 834f82e2aa9a8ede94b17b656329f850c1471514 (3.10), we will + not have the VmFlags there. In this case, there is + really no way to know if we are dealing with VM_SHARED, + so we just assume that VM_MAYSHARE is enough. */ + private =3D memchr (permissions, 'p', permissions_len) !=3D 0; + + /* Try to detect if region should be dumped by parsing smaps + counters. */ + for (line =3D strtok_r (NULL, "\n", &t); + line !=3D NULL && line[0] >=3D 'A' && line[0] <=3D 'Z'; + line =3D strtok_r (NULL, "\n", &t)) { char keyword[64 + 1]; =20 @@ -869,11 +1184,17 @@ linux_find_memory_regions_full (struct gdbarch *gdba= rch, warning (_("Error parsing {s,}maps file '%s'"), mapsfilename); break; } + if (strcmp (keyword, "Anonymous:") =3D=3D 0) - has_anonymous =3D 1; - if (strcmp (keyword, "Shared_Dirty:") =3D=3D 0 - || strcmp (keyword, "Private_Dirty:") =3D=3D 0 - || strcmp (keyword, "Swap:") =3D=3D 0 + { + /* Older Linux kernels did not support the + "Anonymous:" counter. Check it here. */ + has_anonymous =3D 1; + } + else if (strcmp (keyword, "VmFlags:") =3D=3D 0) + decode_vmflags (line, &v); + + if (strcmp (keyword, "AnonHugePages:") =3D=3D 0 || strcmp (keyword, "Anonymous:") =3D=3D 0) { unsigned long number; @@ -884,19 +1205,38 @@ linux_find_memory_regions_full (struct gdbarch *gdba= rch, mapsfilename); break; } - if (number !=3D 0) - modified =3D 1; + if (number > 0) + { + /* Even if we are dealing with a file-backed + mapping, if it contains anonymous pages we + consider it to be an anonymous mapping, + because this is what the Linux kernel does: + + // Dump segments that have been written to. + if (vma->anon_vma && FILTER(ANON_PRIVATE)) + goto whole; + */ + mapping_anon_p =3D 1; + } } } =20 - /* Older Linux kernels did not support the "Anonymous:" counter. - If it is missing, we can't be sure - dump all the pages. */ - if (!has_anonymous) - modified =3D 1; + if (has_anonymous) + should_dump_p =3D dump_mapping_p (filterflags, &v, private, + mapping_anon_p, filename); + else + { + /* Older Linux kernels did not support the "Anonymous:" counter. + If it is missing, we can't be sure - dump all the pages. */ + should_dump_p =3D 1; + } =20 /* Invoke the callback function to create the corefile segment. */ - func (addr, endaddr - addr, offset, inode, - read, write, exec, modified, filename, obfd); + if (should_dump_p) + func (addr, endaddr - addr, offset, inode, + read, write, exec, 1, /* MODIFIED is true because we + want to dump the mapping. */ + filename, obfd); } =20 do_cleanups (cleanup); @@ -1972,6 +2312,17 @@ linux_infcall_mmap (CORE_ADDR size, unsigned prot) return retval; } =20 +/* Display whether the gcore command is using the + /proc/PID/coredump_filter file. */ + +static void +show_use_coredump_filter (struct ui_file *file, int from_tty, + struct cmd_list_element *c, const char *value) +{ + fprintf_filtered (file, _("Use of /proc/PID/coredump_filter file to gene= rate" + " corefiles is %s.\n"), value); +} + /* To be called from the various GDB_OSABI_LINUX handlers for the various GNU/Linux architectures and machine types. */ =20 @@ -2008,4 +2359,16 @@ _initialize_linux_tdep (void) /* Observers used to invalidate the cache when needed. */ observer_attach_inferior_exit (invalidate_linux_cache_inf); observer_attach_inferior_appeared (invalidate_linux_cache_inf); + + add_setshow_boolean_cmd ("use-coredump-filter", class_files, + &use_coredump_filter, _("\ +Set whether gcore should consider /proc/PID/coredump_filter."), + _("\ +Show whether gcore should consider /proc/PID/coredump_filter."), + _("\ +Use this command to set whether gcore should consider the contents\n\ +of /proc/PID/coredump_filter when generating the corefile. For more infor= mation\n\ +about this file, refer to the manpage of core(5)."), + NULL, show_use_coredump_filter, + &setlist, &showlist); }