From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 74493 invoked by alias); 12 Apr 2017 22:26:31 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 74009 invoked by uid 89); 12 Apr 2017 22:26:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 12 Apr 2017 22:26:28 +0000 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 00C888046A for ; Wed, 12 Apr 2017 22:26:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 00C888046A Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=sergiodj@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 00C888046A Received: from localhost (unused-10-15-17-193.yyz.redhat.com [10.15.17.193]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BA1CC785D8; Wed, 12 Apr 2017 22:26:28 +0000 (UTC) From: Sergio Durigan Junior To: Pedro Alves Cc: GDB Patches Subject: Re: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string copying (Re: [PATCH v5 3/5] C++-fy and prepare for sharing fork_inferior) References: <1482464361-4068-1-git-send-email-sergiodj@redhat.com> <20170330014915.4894-1-sergiodj@redhat.com> <20170330014915.4894-4-sergiodj@redhat.com> <877f2q3223.fsf@redhat.com> <87o9w2z068.fsf@redhat.com> <36f28afa-1174-df6a-cb54-fdcf129fa816@redhat.com> Date: Wed, 12 Apr 2017 22:26:00 -0000 In-Reply-To: <36f28afa-1174-df6a-cb54-fdcf129fa816@redhat.com> (Pedro Alves's message of "Wed, 12 Apr 2017 11:14:35 +0100") Message-ID: <878tn5w9d7.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2017-04/txt/msg00374.txt.bz2 On Wednesday, April 12 2017, Pedro Alves wrote: > On 04/12/2017 06:04 AM, Sergio Durigan Junior wrote: > >>>>> static void >>>>> -breakup_args (char *scratch, char **argv) >>>>> +breakup_args (const std::string &scratch, std::vector &argv) >>>>> { >>>> >>>> ... >>>> >>>>> + >>>>> + std::string arg = scratch.substr (cur_pos, next_sep - cur_pos); >>>>> + >>>> >>>> This creates a temporary string (heap allocates) ... >>>> >>>>> + argv.push_back (xstrdup (arg.c_str ())); >>>> >>>> ... and here you create yet another copy. >>>> >>>> You should be able to avoid it by using e.g., savestring: >>>> >>>> char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos); >>>> argv.push_back (arg); >>> >>> Fair enough. I had my mind on "C++-only mode" when writing this code. > > Yup, in C++, it's good to keep unnecessary temporaries > and hidden heap allocations in mind. Actually, now that I look a bit > deeper, I think we can avoid a "premature pessimization" here, keeping > the same level of clarity. > > I think it'd be good to push this patch below. WDYT? Hm, I don't know. I mean, I understand where you're coming from and what you're trying to achieve, but somehow I thought the old code (especially the part that modified the argument string in place, using \0's as separators) was somewhat hacky. Even though the new code uses more memory, it it also more readable, at least IMNSHO. And also more const-correct, since we can make breakup_args_for_exec receive a const as the first argument. But I liked a few bits of your patch. See below. > From 64d0035762344ed33858a3b8930a83e7071ea4a8 Mon Sep 17 00:00:00 2001 > From: Pedro Alves > Date: Wed, 12 Apr 2017 10:55:36 +0100 > Subject: [PATCH] fork-child.c: Avoid unnecessary heap-allocation / string > copying > > The previous patch converted the argv building from an > alloca-allocated array of non-owning arg pointers, to a std::vector of > owning pointers, which results in N string dups, with N being the > number of arguments in the vector, and then requires manually > releasing the pointers owned by the vector. > > This patch makes the vector hold non-owning pointers, and avoids the > string dups, by doing one single string copy of the arguments upfront, > and replacing separators with NULL terminators in place, like we used > to. > > With this, there's no need to remember to call free_vector_argv either. > > Tested on x86_64 Fedora 23. > > gdb/ChangeLog: > yyyy-mm-dd Pedro Alves > > * fork-child.c (breakup_args): Make 'scratch' non-const. > Replace separators with NULL terminators in place. Change > type of vector. (fork_inferior): The argument vector now > holds non-owning pointers. Don't strdup strings into the > vector. Remove free_vector_argv call. > --- > gdb/fork-child.c | 57 ++++++++++++++++++++++++++++++++++++-------------------- > 1 file changed, 37 insertions(+), 20 deletions(-) > > diff --git a/gdb/fork-child.c b/gdb/fork-child.c > index 6b7386e..8a6f9e6 100644 > --- a/gdb/fork-child.c > +++ b/gdb/fork-child.c > @@ -46,10 +46,12 @@ static char *exec_wrapper; > /* Break up SCRATCH into an argument vector suitable for passing to > execvp and store it in ARGV. E.g., on "run a b c d" this routine > would get as input the string "a b c d", and as output it would > - fill in ARGV with the four arguments "a", "b", "c", "d". */ > + fill in ARGV with the four arguments "a", "b", "c", "d". The > + pointers pushed to ARGV point directly into SCRATCH, which is > + modified in place with the necessary NULL terminators. */ > > static void > -breakup_args (const std::string &scratch, std::vector &argv) > +breakup_args (std::string &scratch, std::vector &argv) > { > for (size_t cur_pos = 0; cur_pos < scratch.size ();) > { > @@ -62,13 +64,19 @@ breakup_args (const std::string &scratch, std::vector &argv) > /* Find the position of the next separator. */ > std::size_t next_sep = scratch.find_first_of (" \t\n", cur_pos); > > - /* No separator found, which means this is the last > - argument. */ > if (next_sep == std::string::npos) > - next_sep = scratch.size (); > + { > + /* No separator found, which means this is the last > + argument. */ > + next_sep = scratch.size (); > + } This comment should really be inside the block. Thanks. > + else > + { > + /* Replace the separator with a terminator. */ > + scratch[next_sep++] = '\0'; > + } > > - char *arg = savestring (scratch.c_str () + cur_pos, next_sep - cur_pos); > - argv.push_back (arg); > + argv.push_back (&scratch[cur_pos]); > > cur_pos = next_sep; > } > @@ -155,7 +163,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > static const char *exec_file; > char **save_our_env; > int shell = 0; > - std::vector argv; > const char *inferior_io_terminal = get_inferior_io_terminal (); > struct inferior *inf; > int i; > @@ -183,21 +190,29 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > shell = 1; > } > > + /* The argument vector. Holds non-owning pointers. */ > + std::vector c_argv; > + > + /* These must be live up to the exec call, because the arguments in > + C_ARGV point inside them. */ > + std::string broken_up_args; > + std::string shell_command; > + > if (!shell) > { > /* We're going to call execvp. Create argument vector. */ > - argv.push_back (xstrdup (exec_file)); > - breakup_args (allargs, argv); > + c_argv.push_back (exec_file); > + broken_up_args = allargs; > + breakup_args (broken_up_args, c_argv); > } > else > { > /* We're going to call a shell. */ > - std::string shell_command; > const char *p; > int need_to_quote; > const int escape_bang = escape_bang_in_quoted_argument (shell_file); > > - shell_command = std::string ("exec "); > + shell_command = "exec "; > > /* Add any exec wrapper. That may be a program name with arguments, so > the user must handle quoting. */ > @@ -265,12 +280,12 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > /* If we decided above to start up with a shell, we exec the > shell, "-c" says to interpret the next arg as a shell command > to execute, and this command is "exec > - ". We xstrdup all the strings here because they will > - be free'd later in the code. */ > - argv.push_back (xstrdup (shell_file)); > - argv.push_back (xstrdup ("-c")); > - argv.push_back (xstrdup (shell_command.c_str ())); > - argv.push_back (NULL); > + ". */ > + c_argv.reserve (4); I liked the idea of reserving the exact amount of space needed in the vector. > + c_argv.push_back (shell_file); > + c_argv.push_back ("-c"); > + c_argv.push_back (shell_command.c_str ()); > + c_argv.push_back (NULL); > } > > /* Retain a copy of our environment variables, since the child will > @@ -376,6 +391,10 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > path to find $SHELL. Rich Pixley says so, and I agree. */ > environ = env; > > + /* It is guaranteed that the exec functions do not modify the > + arguments, but they nevertheless expect "char **". */ > + char **argv = const_cast (&c_argv[0]); Also, I liked your approach to this. > + > if (exec_fun != NULL) > (*exec_fun) (argv[0], &argv[0], env); > else > @@ -393,8 +412,6 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs, > _exit (0177); > } > > - free_vector_argv (argv); > - > /* Restore our environment in case a vforked child clob'd it. */ > environ = save_our_env; > > -- > 2.5.5 Anyway, long story short: I won't oppose if you want to go ahead and push this patch, but it's not my preferred approach. Cheers, -- Sergio GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36 Please send encrypted e-mail if possible http://sergiodj.net/