From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12304 invoked by alias); 13 Apr 2017 18:30:50 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 12195 invoked by uid 89); 13 Apr 2017 18:30:47 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 13 Apr 2017 18:30:43 +0000 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 200927D0C9 for ; Thu, 13 Apr 2017 18:30:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 200927D0C9 Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=sergiodj@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 200927D0C9 Received: from localhost (unused-10-15-17-193.yyz.redhat.com [10.15.17.193]) by smtp.corp.redhat.com (Postfix) with ESMTPS id CB75B179C2; Thu, 13 Apr 2017 18:30:43 +0000 (UTC) From: Sergio Durigan Junior To: Pedro Alves Cc: GDB Patches Subject: Re: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string copying (Re: [PATCH v5 3/5] C++-fy and prepare for sharing fork_inferior) References: <1482464361-4068-1-git-send-email-sergiodj@redhat.com> <20170330014915.4894-1-sergiodj@redhat.com> <20170330014915.4894-4-sergiodj@redhat.com> <877f2q3223.fsf@redhat.com> <87o9w2z068.fsf@redhat.com> <36f28afa-1174-df6a-cb54-fdcf129fa816@redhat.com> <878tn5w9d7.fsf@redhat.com> <01d7a597-ac60-9764-142b-bcaefd269271@redhat.com> <87y3v5szc6.fsf@redhat.com> Date: Thu, 13 Apr 2017 18:30:00 -0000 In-Reply-To: (Pedro Alves's message of "Thu, 13 Apr 2017 11:51:04 +0100") Message-ID: <874lxstb1o.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2017-04/txt/msg00448.txt.bz2 On Thursday, April 13 2017, Pedro Alves wrote: > On 04/13/2017 05:31 AM, Sergio Durigan Junior wrote: >> On Wednesday, April 12 2017, Pedro Alves wrote: > >> Missing comments on the functions. > > Yup, it was a prototype patch, it was too late already. > I'd either finish it or not sleep. :-) Yeah, I noticed, sorry for being picky. >>> + /* The argument vector built. Holds non-owning pointers. */ >>> + std::vector m_argv; >> >> OOC, I notice you're prefixing a lot (if not all) of your class >> variables with "m_". I understand this is a code convention to signal >> when a variable is a member of a class, but are we following this >> officially? Because (again!) I'm not really fond of this :-). > > Yes, we are: > > https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards#C.2B-.2B--specific_coding_conventions > => > https://gcc.gnu.org/codingconventions.html > > "When structs and/or classes have member functions, prefer to name data members with a leading m_ and static data members with a leading s_. " OK, thanks for pointing that out. >>> @@ -155,7 +308,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, >>> static const char *exec_file; >>> char **save_our_env; >>> int shell = 0; >>> - std::vector argv; >> >> I believe you can also remove the 'static' keyword from the 'shell_file' >> declaration. > > And exec_file too. It's not entirely clear to me why they > were ever made static. The rationale about vfork looks bogus, > since the child does not modify these. Anyway, better leave > that to a separate patch, IMO. Yeah, the 'static' rationale got me thinking too. I think this comes from long time ago, maybe. I'll send a patch to clean these up, then. > Thanks. Below's what I pushed, after another round of testing. Awesome, thanks for doing that. > From 808480f667e41e2fdb66bfdc9d5e047f1aa34a68 Mon Sep 17 00:00:00 2001 > From: Pedro Alves > Date: Thu, 13 Apr 2017 11:46:07 +0100 > Subject: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string > copying > > The previous change to fork-child.c converted the argv building from > an alloca-allocated array of non-owning arg pointers, to a std::vector > of owning pointers, which results in N string dups, with N being the > number of arguments in the vector, and then requires manually > releasing the pointers owned by the vector. > > This patch makes the vector hold non-owning pointers, and avoids the > string dups, by doing one single string copy of the arguments upfront, > and replacing separators with NULL terminators in place, like we used > to. All the logic to do that is encapsulated in a new class. > > With this, there's no need to remember to manually release the argv > elements with free_vector_argv either. > > gdb/ChangeLog: > 2017-04-13 Pedro Alves > > * fork-child.c (execv_argv): New class. > (breakup_args): Refactored as ... > (execv_argv::init_for_no_shell): .. this method of execv_argv. > Copy arguments to storage and replace separators with NULL > terminators in place. > (escape_bang_in_quoted_argument): Adjust to return bool. > (execv_argv::execv_argv): New ctor. > (execv_argv::init_for_shell): New method, factored out from > fork_inferior. Don't strdup strings into the vector. > (fork_inferior): Eliminate "shell" local and use execv_argv. Use > Remove free_vector_argv call. > --- > gdb/ChangeLog | 14 +++ > gdb/fork-child.c | 317 +++++++++++++++++++++++++++++++++++-------------------- > 2 files changed, 215 insertions(+), 116 deletions(-) > > diff --git a/gdb/ChangeLog b/gdb/ChangeLog > index ed71880..90ed21c 100644 > --- a/gdb/ChangeLog > +++ b/gdb/ChangeLog > @@ -1,3 +1,17 @@ > +2017-04-13 Pedro Alves > + > + * fork-child.c (execv_argv): New class. > + (breakup_args): Refactored as ... > + (execv_argv::init_for_no_shell): .. this method of execv_argv. > + Copy arguments to storage and replace separators with NULL > + terminators in place. > + (escape_bang_in_quoted_argument): Adjust to return bool. > + (execv_argv::execv_argv): New ctor. > + (execv_argv::init_for_shell): New method, factored out from > + fork_inferior. Don't strdup strings into the vector. > + (fork_inferior): Eliminate "shell" local and use execv_argv. Use > + Remove free_vector_argv call. > + > 2017-04-13 Yao Qi > > * rx-tdep.c (rx_fpsw_type): Check tdep->rx_fpsw_type instead of > diff --git a/gdb/fork-child.c b/gdb/fork-child.c > index 6b7386e..11ffee9 100644 > --- a/gdb/fork-child.c > +++ b/gdb/fork-child.c > @@ -43,62 +43,235 @@ extern char **environ; > > static char *exec_wrapper; > > -/* Break up SCRATCH into an argument vector suitable for passing to > - execvp and store it in ARGV. E.g., on "run a b c d" this routine > - would get as input the string "a b c d", and as output it would > - fill in ARGV with the four arguments "a", "b", "c", "d". */ > +/* Build the argument vector for execv(3). */ > > -static void > -breakup_args (const std::string &scratch, std::vector &argv) > +class execv_argv > +{ > +public: > + /* EXEC_FILE is the file to run. ALLARGS is a string containing the > + arguments to the program. If starting with a shell, SHELL_FILE > + is the shell to run. Otherwise, SHELL_FILE is NULL. */ > + execv_argv (const char *exec_file, const std::string &allargs, > + const char *shell_file); > + > + /* Return a pointer to the built argv, in the type expected by > + execv. The result is (only) valid for as long as this execv_argv > + object is live. We return a "char **" because that's the type > + that the execv functions expect. Note that it is guaranteed that > + the execv functions do not modify the argv[] array nor the > + strings to which the array point. */ > + char **argv () > + { > + return const_cast (&m_argv[0]); > + } > + > +private: > + /* Disable copying. */ > + execv_argv (const execv_argv &) = delete; > + void operator= (const execv_argv &) = delete; > + > + /* Helper methods for constructing the argument vector. */ > + > + /* Used when building an argv for a straight execv call, without > + going via the shell. */ > + void init_for_no_shell (const char *exec_file, > + const std::string &allargs); > + > + /* Used when building an argv for execing a shell that execs the > + child program. */ > + void init_for_shell (const char *exec_file, > + const std::string &allargs, > + const char *shell_file); > + > + /* The argument vector built. Holds non-owning pointers. Elements > + either point to the strings passed to the execv_argv ctor, or > + inside M_STORAGE. */ > + std::vector m_argv; > + > + /* Storage. In the no-shell case, this contains a copy of the > + arguments passed to the ctor, split by '\0'. In the shell case, > + this contains the quoted shell command. I.e., SHELL_COMMAND in > + {"$SHELL" "-c", SHELL_COMMAND, NULL}. */ > + std::string m_storage; > +}; > + > +/* Create argument vector for straight call to execvp. Breaks up > + ALLARGS into an argument vector suitable for passing to execvp and > + stores it in M_ARGV. E.g., on "run a b c d" this routine would get > + as input the string "a b c d", and as output it would fill in > + M_ARGV with the four arguments "a", "b", "c", "d". Each argument > + in M_ARGV points to a substring of a copy of ALLARGS stored in > + M_STORAGE. */ > + > +void > +execv_argv::init_for_no_shell (const char *exec_file, > + const std::string &allargs) > { > - for (size_t cur_pos = 0; cur_pos < scratch.size ();) > + > + /* Save/work with a copy stored in our storage. The pointers pushed > + to M_ARGV point directly into M_STORAGE, which is modified in > + place with the necessary NULL terminators. This avoids N heap > + allocations and string dups when 1 is sufficient. */ > + std::string &args_copy = m_storage = allargs; > + > + m_argv.push_back (exec_file); > + > + for (size_t cur_pos = 0; cur_pos < args_copy.size ();) > { > /* Skip whitespace-like chars. */ > - std::size_t pos = scratch.find_first_not_of (" \t\n", cur_pos); > + std::size_t pos = args_copy.find_first_not_of (" \t\n", cur_pos); > > if (pos != std::string::npos) > cur_pos = pos; > > /* Find the position of the next separator. */ > - std::size_t next_sep = scratch.find_first_of (" \t\n", cur_pos); > + std::size_t next_sep = args_copy.find_first_of (" \t\n", cur_pos); > > - /* No separator found, which means this is the last > - argument. */ > if (next_sep == std::string::npos) > - next_sep = scratch.size (); > + { > + /* No separator found, which means this is the last > + argument. */ > + next_sep = args_copy.size (); > + } > + else > + { > + /* Replace the separator with a terminator. */ > + args_copy[next_sep++] = '\0'; > + } > > - char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos); > - argv.push_back (arg); > + m_argv.push_back (&args_copy[cur_pos]); > > cur_pos = next_sep; > } > > /* NULL-terminate the vector. */ > - argv.push_back (NULL); > + m_argv.push_back (NULL); > } > > -/* When executing a command under the given shell, return non-zero if > - the '!' character should be escaped when embedded in a quoted > +/* When executing a command under the given shell, return true if the > + '!' character should be escaped when embedded in a quoted > command-line argument. */ > > -static int > +static bool > escape_bang_in_quoted_argument (const char *shell_file) > { > - const int shell_file_len = strlen (shell_file); > + size_t shell_file_len = strlen (shell_file); > > /* Bang should be escaped only in C Shells. For now, simply check > that the shell name ends with 'csh', which covers at least csh > and tcsh. This should be good enough for now. */ > > if (shell_file_len < 3) > - return 0; > + return false; > > if (shell_file[shell_file_len - 3] == 'c' > && shell_file[shell_file_len - 2] == 's' > && shell_file[shell_file_len - 1] == 'h') > - return 1; > + return true; > > - return 0; > + return false; > +} > + > +/* See declaration. */ > + > +execv_argv::execv_argv (const char *exec_file, > + const std::string &allargs, > + const char *shell_file) > +{ > + if (shell_file == NULL) > + init_for_no_shell (exec_file, allargs); > + else > + init_for_shell (exec_file, allargs, shell_file); > +} > + > +/* See declaration. */ > + > +void > +execv_argv::init_for_shell (const char *exec_file, > + const std::string &allargs, > + const char *shell_file) > +{ > + /* We're going to call a shell. */ > + bool escape_bang = escape_bang_in_quoted_argument (shell_file); > + > + /* We need to build a new shell command string, and make argv point > + to it. So build it in the storage. */ > + std::string &shell_command = m_storage; > + > + shell_command = "exec "; > + > + /* Add any exec wrapper. That may be a program name with arguments, > + so the user must handle quoting. */ > + if (exec_wrapper) > + { > + shell_command += exec_wrapper; > + shell_command += ' '; > + } > + > + /* Now add exec_file, quoting as necessary. */ > + > + /* Quoting in this style is said to work with all shells. But csh > + on IRIX 4.0.1 can't deal with it. So we only quote it if we need > + to. */ > + bool need_to_quote; > + const char *p = exec_file; > + while (1) > + { > + switch (*p) > + { > + case '\'': > + case '!': > + case '"': > + case '(': > + case ')': > + case '$': > + case '&': > + case ';': > + case '<': > + case '>': > + case ' ': > + case '\n': > + case '\t': > + need_to_quote = true; > + goto end_scan; > + > + case '\0': > + need_to_quote = false; > + goto end_scan; > + > + default: > + break; > + } > + ++p; > + } > + end_scan: > + if (need_to_quote) > + { > + shell_command += '\''; > + for (p = exec_file; *p != '\0'; ++p) > + { > + if (*p == '\'') > + shell_command += "'\\''"; > + else if (*p == '!' && escape_bang) > + shell_command += "\\!"; > + else > + shell_command += *p; > + } > + shell_command += '\''; > + } > + else > + shell_command += exec_file; > + > + shell_command += ' ' + allargs; > + > + /* If we decided above to start up with a shell, we exec the shell. > + "-c" says to interpret the next arg as a shell command to > + execute, and this command is "exec ". */ > + m_argv.reserve (4); > + m_argv.push_back (shell_file); > + m_argv.push_back ("-c"); > + m_argv.push_back (shell_command.c_str ()); > + m_argv.push_back (NULL); > } > > /* See inferior.h. */ > @@ -154,8 +327,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > static char *shell_file; > static const char *exec_file; > char **save_our_env; > - int shell = 0; > - std::vector argv; > const char *inferior_io_terminal = get_inferior_io_terminal (); > struct inferior *inf; > int i; > @@ -172,106 +343,20 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > /* 'startup_with_shell' is declared in inferior.h and bound to the > "set startup-with-shell" option. If 0, we'll just do a > fork/exec, no shell, so don't bother figuring out what shell. */ > - shell_file = shell_file_arg; > if (startup_with_shell) > { > + shell_file = shell_file_arg; > /* Figure out what shell to start up the user program under. */ > if (shell_file == NULL) > shell_file = getenv ("SHELL"); > if (shell_file == NULL) > shell_file = default_shell_file; > - shell = 1; > - } > - > - if (!shell) > - { > - /* We're going to call execvp. Create argument vector. */ > - argv.push_back (xstrdup (exec_file)); > - breakup_args (allargs, argv); > } > else > - { > - /* We're going to call a shell. */ > - std::string shell_command; > - const char *p; > - int need_to_quote; > - const int escape_bang = escape_bang_in_quoted_argument (shell_file); > - > - shell_command = std::string ("exec "); > - > - /* Add any exec wrapper. That may be a program name with arguments, so > - the user must handle quoting. */ > - if (exec_wrapper) > - { > - shell_command += exec_wrapper; > - shell_command += ' '; > - } > + shell_file = NULL; > > - /* Now add exec_file, quoting as necessary. */ > - > - /* Quoting in this style is said to work with all shells. But > - csh on IRIX 4.0.1 can't deal with it. So we only quote it if > - we need to. */ > - p = exec_file; > - while (1) > - { > - switch (*p) > - { > - case '\'': > - case '!': > - case '"': > - case '(': > - case ')': > - case '$': > - case '&': > - case ';': > - case '<': > - case '>': > - case ' ': > - case '\n': > - case '\t': > - need_to_quote = 1; > - goto end_scan; > - > - case '\0': > - need_to_quote = 0; > - goto end_scan; > - > - default: > - break; > - } > - ++p; > - } > - end_scan: > - if (need_to_quote) > - { > - shell_command += '\''; > - for (p = exec_file; *p != '\0'; ++p) > - { > - if (*p == '\'') > - shell_command += "'\\''"; > - else if (*p == '!' && escape_bang) > - shell_command += "\\!"; > - else > - shell_command += *p; > - } > - shell_command += '\''; > - } > - else > - shell_command += exec_file; > - > - shell_command += " " + allargs; > - > - /* If we decided above to start up with a shell, we exec the > - shell, "-c" says to interpret the next arg as a shell command > - to execute, and this command is "exec > - ". We xstrdup all the strings here because they will > - be free'd later in the code. */ > - argv.push_back (xstrdup (shell_file)); > - argv.push_back (xstrdup ("-c")); > - argv.push_back (xstrdup (shell_command.c_str ())); > - argv.push_back (NULL); > - } > + /* Build the argument vector. */ > + execv_argv child_argv (exec_file, allargs, shell_file); > > /* Retain a copy of our environment variables, since the child will > replace the value of environ and if we're vforked, we have to > @@ -376,6 +461,8 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > path to find $SHELL. Rich Pixley says so, and I agree. */ > environ = env; > > + char **argv = child_argv.argv (); > + > if (exec_fun != NULL) > (*exec_fun) (argv[0], &argv[0], env); > else > @@ -393,8 +480,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > _exit (0177); > } > > - free_vector_argv (argv); > - > /* Restore our environment in case a vforked child clob'd it. */ > environ = save_our_env; > > -- > 2.5.5 -- Sergio GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36 Please send encrypted e-mail if possible http://sergiodj.net/