From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 97639 invoked by alias); 1 May 2017 02:22:12 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 97623 invoked by uid 89); 1 May 2017 02:22:10 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=no version=3.3.2 spammy=Speaking, Wednesday, wednesday, ify X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 01 May 2017 02:22:08 +0000 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1CDC2C054908; Mon, 1 May 2017 02:22:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 1CDC2C054908 Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=sergiodj@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 1CDC2C054908 Received: from localhost (unused-10-15-17-193.yyz.redhat.com [10.15.17.193]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D5A6A7E46E; Mon, 1 May 2017 02:22:08 +0000 (UTC) From: Sergio Durigan Junior To: Pedro Alves Cc: GDB Patches , Simon Marchi Subject: Re: [PATCH v3] C++ify gdb/common/environ.c References: <20170413040455.23996-1-sergiodj@redhat.com> <20170418030319.12637-1-sergiodj@redhat.com> Date: Mon, 01 May 2017 02:22:00 -0000 In-Reply-To: (Pedro Alves's message of "Wed, 19 Apr 2017 19:14:17 +0100") Message-ID: <87o9vdjoz4.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2017-05/txt/msg00000.txt.bz2 Hey, Thanks for the review, and really sorry about the delay. I have a few comments and questions below. On Wednesday, April 19 2017, Pedro Alves wrote: > On 04/18/2017 04:03 AM, Sergio Durigan Junior wrote: > >> - if (e->allocated < i) >> +const char * >> +gdb_environ::get (const std::string &var) const >> +{ >> + try >> { >> - e->allocated = std::max (i, e->allocated + 10); >> - e->vector = (char **) xrealloc ((char *) e->vector, >> - (e->allocated + 1) * sizeof (char *)); >> + return (char *) this->m_environ_map.at (var).c_str (); >> } >> - >> - memcpy (e->vector, environ, (i + 1) * sizeof (char *)); >> - >> - while (--i >= 0) >> + catch (const std::out_of_range &ex) > > Please use unordered_map::find instead. In general, > use "at" with exceptions when you're "sure" the element > exists. It'll be simpler, more efficient, and less > roundabout, since at() essentially does a find() and then > throws if not found. I decided to go the easier way and not use unordered_map. I confess I was having second thoughts myself when I posted the patch, and your message only made me more confident that I don't want to follow this route. >> { >> - int len = strlen (e->vector[i]); >> - char *newobj = (char *) xmalloc (len + 1); >> - >> - memcpy (newobj, e->vector[i], len + 1); >> - e->vector[i] = newobj; >> + return NULL; >> } >> } >> > >> void >> -set_in_environ (struct gdb_environ *e, const char *var, const char *value) >> +gdb_environ::set (const std::string &var, const std::string &value) >> { >> - int i; >> - int len = strlen (var); >> - char **vector = e->vector; >> - char *s; >> + std::unordered_map::iterator el; >> >> - for (i = 0; (s = vector[i]) != NULL; i++) >> - if (strncmp (s, var, len) == 0 && s[len] == '=') >> - break; >> + el = this->m_environ_map.find (var); > > In C++, it's a good practice to avoid first initializing, > and then assigning separately, because the default constructor > will then do useless work. (And in some cases, there may not even > be a default constructor.) > > I'm not a fan or "auto" everywhere, but with iterators, > I find it OK: > > auto el = this->m_environ_map.find (var); Without the unordered_map, this is no longer a problem. But of course, thanks for the review. >> >> - if (s == 0) >> + if (el != this->m_environ_map.end ()) >> { >> - if (i == e->allocated) >> + if (el->second == value) >> + return; >> + else >> { >> - e->allocated += 10; >> - vector = (char **) xrealloc ((char *) vector, >> - (e->allocated + 1) * sizeof (char *)); >> - e->vector = vector; >> + /* If we found this item, it means that we have to update >> + its value on the map. */ >> + el->second = value; >> + /* And we also have to update its value on the >> + environ_vector. For that, we just delete the item here >> + and recreate it (with the new value) later. */ >> + unset_in_vector (this->m_environ_vector, var); >> } >> - vector[i + 1] = 0; >> } >> else >> - xfree (s); >> - >> - s = (char *) xmalloc (len + strlen (value) + 2); >> - strcpy (s, var); >> - strcat (s, "="); >> - strcat (s, value); >> - vector[i] = s; >> - >> - /* This used to handle setting the PATH and GNUTARGET variables >> - specially. The latter has been replaced by "set gnutarget" >> - (which has worked since GDB 4.11). The former affects searching >> - the PATH to find SHELL, and searching the PATH to find the >> - argument of "symbol-file" or "exec-file". Maybe we should have >> - some kind of "set exec-path" for that. But in any event, having >> - "set env" affect anything besides the inferior is a bad idea. >> - What if we want to change the environment we pass to the program >> - without afecting GDB's behavior? */ >> - >> - return; >> + this->m_environ_map.insert (std::make_pair (var, value)); >> + >> + this->m_environ_vector.insert (this->m_environ_vector.end () - 1, >> + xstrdup (std::string (var >> + + '=' >> + + value).c_str ())); > > I don't really understand the "m_environ_vector.end () - 1" here. This is needed because the last element of m_environ_vector needs to be always NULL. Therefore, this code is basically inserting the new variable in the second-to-last position of the vector. I made a comment on top of it to clarify this part. Also, there are other places where I need to iterate through the elements of the vector, and I'm also using the "- 1" in these places. I'll put comments where applicable. > Also, that's a lot of unnecessary copying/duping in that last > statement. Using concat instead of xstrdup + operator+ would > easily avoid it. You're right, I've replaced it by a concat and it's simpler now. > And you could easily avoid having to dup the string into both the > map and the vector by making the map's value be a const char * > instead of a std::string: > > std::unordered_map > > and then make a map's value point directly into the value > part of the string that is owned by the vector (as in, > the substring that starts at VALUE in "VAR=VALUE"). That was a good idea. >> } >> >> -/* Remove the setting for variable VAR from environment E. */ >> +/* See common/environ.h. */ >> >> void >> -unset_in_environ (struct gdb_environ *e, const char *var) >> +gdb_environ::unset (const std::string &var) >> { >> - int len = strlen (var); >> - char **vector = e->vector; >> - char *s; >> + this->m_environ_map.erase (var); >> + unset_in_vector (this->m_environ_vector, var); >> +} > > >> -extern struct gdb_environ *make_environ (void); >> +class gdb_environ >> +{ >> +public: >> + /* Regular constructor and destructor. */ >> + gdb_environ (); >> + ~gdb_environ (); >> >> -extern void free_environ (struct gdb_environ *); >> + /* Reinitialize the environment stored. This is used when the user >> + wants to delete all environment variables added by him/her. */ >> + void reinit (); > > This should mention that the initial state is copied from the host's > environ. I think the default ctor could use a similar comment. > I was going to suggest to call this clear() instead, to go > with the standard containers' terminology, until I realized what > "init" really does. Or maybe even find a more explicit name, > reset_from_host_environ or some such. Sorry, I guess I wasn't aware of the importance of specifying that the variables come from the host. Somehow I thought this was implied. I guess reset_from_host_environ is a good name for the method; my other option would be "reinit_using_host_environ", but that's longer. > Perhaps an even clearer approach would be to make the default ctor > not do a deep copy of the host's "environ", but instead add a > static factor method that returned such a new gdb_environ, like: > > static gdb_environ > gdb_environ::from_host_environ () > { > // build/return a gdb_environ that wraps the host's environ global. > } > > Not sure. Only experimenting would tell. Sorry, I'm not sure I understand your suggestion. Please correct me if I'm wrong. You're basically saying that the default ctor shouldn't actually do anything; instead, the class should have this new from_host_environ method which would be the "official" ctor. The users would then call from_host_environ directly when they wanted an instance of the class, and the method would be responsible for initializing the object. Is that correct? If yes, I think I fail to see the advantage of this method over having a normal ctor (aside from explicitly naming the new ctor after the fact that we're using the host environ to build the object). ... after some reading ... Oh, I think I see what you're suggesting. Instead of building one gdb_environ every time someone requests it, we build just one and pass it along. OK, now it makes sense. > Speaking of copying the host environ: > >> _initialize_mi_cmd_env (void) >> { >> - struct gdb_environ *environment; >> + gdb_environ environment; >> const char *env; >> >> /* We want original execution path to reset to, if desired later. >> @@ -278,13 +278,10 @@ _initialize_mi_cmd_env (void) >> current_inferior ()->environment. Also, there's no obvious >> place where this code can be moved such that it surely run >> before any code possibly mangles original PATH. */ >> - environment = make_environ (); >> - init_environ (environment); >> - env = get_in_environ (environment, path_var_name); >> + env = environment.get (path_var_name); >> >> /* Can be null if path is not set. */ >> if (!env) >> env = ""; >> orig_path = xstrdup (env); >> - free_environ (environment); >> } > > This usage of gdb_environ looks like pointless > wrapping / dupping / freeing to me -- I don't see why we > need to dup the whole host environment just to get at some > env variable. Using good old getenv(3) directly should do, > and end up trimming off a bit of work from gdb's startup. Absolutely. I didn't pay close attention to this bit; I was just mechanically converting the code. > I could see perhaps wanting to avoid / optimize the linear > walks that getenv must do, if we have many getenv calls in > gdb, but then that suggests keeping a gdb_environ global that > is initialized early on and is reused. But that would miss > any setenv call that changes the environment since that > gdb_environ is created, so I'd prefer not do that unless > we find a real need. I'll make the code use getenv instead. >> + >> +private: >> + /* Helper function that initializes our data structures with the >> + environment variables. */ >> + void init (); >> + >> + /* Helper function that clears our data structures. */ >> + void clear (); > > s/function/method/g throughout (ChangeLog too). Ops, thanks. Fixed. >> + >> + /* A map representing the environment variables and their values. >> + It is easier to store them using a map because of the faster >> + lookups. */ > > It's not that it's easier to store them. Not having a map > would be even easier. Just do a linear walk on the vector, > just like getenv does. You want to instead say that we're > optimizing for get operations. (I'm not sure this is the > right trade off for this class, but I'll go along with it.) Well, no need to worry about this anymore. The new code doesn't use unordered_map, as mentioned above. >> + std::unordered_map m_environ_map; >> + >> + /* A vector containing the environment variables. This is useful >> + for when we need to obtain a 'char **' with all the existing >> + variables. */ > > This should say that the entries are in VAR=VALUE form, and what > is the typical user that needs that format. I'll expand the comment. >> + std::vector m_environ_vector; >> +}; >> >> #endif /* defined (ENVIRON_H) */ > >> else >> { >> - char **vector = environ_vector (current_inferior ()->environment); >> + const std::vector &vector = >> + current_inferior ()->environment.get_vector (); > > = goes on the next line. Fixed. >> >> - while (*vector) >> + for (char *elem : vector) >> { >> - puts_filtered (*vector++); >> + puts_filtered (elem); >> puts_filtered ("\n"); >> } >> } > > I'm not sure it's worth it to expose the std::vector implementation > detail just for this loop. I don't really understand how this > can work though, given that the vector is NULL terminated. I'm not exposing the std::vector anymore. Instead, I'm just using the regular 'char **' and looping until I find a NULL. > I agree with Simon -- this is begging for some unit tests. :-) I'm writing them. It's taking some time because it's the first time I'm using the unittest framework. Thanks, -- Sergio GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36 Please send encrypted e-mail if possible http://sergiodj.net/