From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 2006 invoked by alias); 2 Aug 2011 20:37:20 -0000 Received: (qmail 1996 invoked by uid 22791); 2 Aug 2011 20:37:18 -0000 X-SWARE-Spam-Status: No, hits=-6.1 required=5.0 tests=AWL,BAYES_00,KAM_STOCKGEN,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,SPF_HELO_PASS,TW_CP X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 02 Aug 2011 20:36:57 +0000 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p72KavNr009960 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 2 Aug 2011 16:36:57 -0400 Received: from host1.jankratochvil.net (ovpn-116-16.ams2.redhat.com [10.36.116.16]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p72Katgh005232 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 2 Aug 2011 16:36:56 -0400 Received: from host1.jankratochvil.net (localhost [127.0.0.1]) by host1.jankratochvil.net (8.14.4/8.14.4) with ESMTP id p72KasIr015411; Tue, 2 Aug 2011 22:36:54 +0200 Received: (from jkratoch@localhost) by host1.jankratochvil.net (8.14.4/8.14.4/Submit) id p72KasLA015410; Tue, 2 Aug 2011 22:36:54 +0200 Date: Tue, 02 Aug 2011 20:37:00 -0000 From: Jan Kratochvil To: Keith Seitz Cc: gdb-patches@sourceware.org Subject: Re: [RFA] c++/12266 (again) [cp_canonicalize_no_typedefs-4.patch] Message-ID: <20110802203654.GA8670@host1.jankratochvil.net> References: <4E31EE1A.5040204@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E31EE1A.5040204@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2011-08/txt/msg00038.txt.bz2 On Fri, 29 Jul 2011 01:17:46 +0200, Keith Seitz wrote: > --- a/gdb/cp-name-parser.y > +++ b/gdb/cp-name-parser.y > @@ -1979,6 +1979,27 @@ cp_demangled_name_parse_free (struct demangle_parse_info *parse_info) > free (parse_info); > } > > +/* Merge the two parse trees given by DEST and SRC. The parse tree > + in SRC is attached to DEST at the node represented by TARGET. > + SRC is then freed. */ It would need to state that DEST will then still possibly reference the original string of SRC. Commented more at `struct demangle_parse_info'. > + > +void > +cp_merge_demangle_parse_infos (struct demangle_parse_info *dest, > + struct demangle_component *target, > + struct demangle_parse_info *src) > + > +{ > + struct demangle_info *di; > + > + memcpy (target, src->tree, sizeof (struct demangle_component)); Why not just: *target = *src->tree; > + di = dest->info; > + while (di->next != NULL) > + di = di->next; > + di->next = src->info; > + src->info = NULL; > + cp_demangled_name_parse_free (src); > +} > + > /* A cleanup wrapper for cp_demangled_name_parse_free. */ > > static void [...] > +static int > +inspect_type (struct demangle_parse_info *info, > + struct demangle_component *ret_comp, > + VEC (namep) **free_list) > +{ > + int i; > + char *name; > + struct symbol *sym; > + > + /* Copy the symbol's name from RET_COMP and look it up > + in the symbol table. */ > + name = (char *) alloca (ret_comp->u.s_name.len + 1); > + memcpy (name, ret_comp->u.s_name.s, ret_comp->u.s_name.len); > + name[ret_comp->u.s_name.len] = '\0'; > + > + /* Ignore any typedefs that should not be substituted. */ > + for (i = 0; i < ARRAY_SIZE (ignore_typedefs); ++i) > + { > + if (strcmp (name, ignore_typedefs[i]) == 0) > + return 0; > + } > + > + sym = lookup_symbol (name, 0, VAR_DOMAIN, 0); > + if (sym != NULL) > + { > + struct type *otype = SYMBOL_TYPE (sym); > + > + /* If the type is a typedef, replace it. */ > + if (TYPE_CODE (otype) == TYPE_CODE_TYPEDEF) > + { > + long len; > + int is_anon; > + struct type *last, *type; > + struct demangle_parse_info *i; > + struct ui_file *buf = mem_fileopen (); > + struct cleanup *cleanup = make_cleanup_ui_file_delete (buf); > + > + /* If the final typedef points to an anonymous struct, union, > + or enum, keep the (last) typedef. */ > + > + /* Find the last typedef for the type. */ > + last = otype; > + while (TYPE_CODE (TYPE_TARGET_TYPE (last)) == TYPE_CODE_TYPEDEF) > + last = TYPE_TARGET_TYPE (last); You can calculate LAST only if it gets used - move the code more down. > + > + /* Get the real type of the typedef. */ > + type = check_typedef (otype); > + > + is_anon = (TYPE_TAG_NAME (type) == NULL > + && (TYPE_CODE (type) == TYPE_CODE_ENUM > + || TYPE_CODE (type) == TYPE_CODE_STRUCT > + || TYPE_CODE (type) == TYPE_CODE_UNION)); > + if (is_anon) > + { > + /* If there is only one typedef for this anonymous type, > + do not substitute it. */ > + if (type == otype) > + { > + do_cleanups (cleanup); > + return 0; > + } > + else > + /* Use the last typedef seen as the type for this > + anonymous type. */ > + type = last; > + } In some cases type == otype here, such as if there was only one typedef of anonymous type or of typedef failed to be resolved. Here could be also just return 0. if (type == otype) { do_cleanups; return 0; } > + > + > + type_print (type, "", buf, -1); > + name = ui_file_xstrdup (buf, &len); > + VEC_safe_push (namep, *free_list, name); > + do_cleanups (cleanup); > + > + /* Turn the result into a new tree. Note that this > + tree will contain pointers into NAME, so NAME cannot > + be free'd until all typedef conversion is done and > + the final result is converted into a string. */ > + i = cp_demangled_name_to_comp (name, NULL); > + if (i != NULL) > + { > + /* Merge the two trees. */ > + cp_merge_demangle_parse_infos (info, ret_comp, i); > + > + /* Replace any newly introduced typedefs -- but not > + if the type is anonymous (that would lead to infinite > + looping). */ > + if (!is_anon) > + replace_typedefs (info, ret_comp, free_list); > + } > + else > + { > + /* This shouldn't happen unless the type printer has > + output something that the name parser cannot grok. > + Nonetheless, an ounce of prevention... > + > + Canonicalize the name again, and store it in the > + current node (RET_COMP). */ > + char *canon = cp_canonicalize_string_no_typedefs (name); > + > + if (canon != NULL) > + { > + xfree (name); double-free, NAME is already in FREE_LIST. > + name = canon; LEN is not set here for NAME. And CANON is leaked. > + } > + > + ret_comp->u.s_name.s = name; > + ret_comp->u.s_name.len = len; > + } > + > + return 1; > + } > + } > + > + return 0; > +} > + > +/* Replace any typedefs appearing in the qualified name > + (DEMANGLE_COMPONENT_QUAL_NAME) represented in RET_COMP for the name parse > + given in INFO. Store any allocated memory in FREE_LIST. */ > + > +static void > +replace_typedefs_qualified_name (struct demangle_parse_info *info, > + struct demangle_component *ret_comp, > + VEC (namep) **free_list) > +{ > + long len; > + char *name; > + struct ui_file *buf = mem_fileopen (); > + struct demangle_component *comp = ret_comp; > + > + /* Walk each node of the qualified name, reconstructing the name of > + this element. With every node, check for any typedef substitutions. > + If a substitution has occurred, replace the qualified name node > + with a DEMANGLE_COMPONENT_NAME node representing the new, typedef- > + substituted name. */ > + while (comp->type == DEMANGLE_COMPONENT_QUAL_NAME) > + { > + if (d_left (comp)->type == DEMANGLE_COMPONENT_NAME) > + { > + struct demangle_component new; > + > + ui_file_write (buf, d_left (comp)->u.s_name.s, > + d_left (comp)->u.s_name.len); > + name = ui_file_xstrdup (buf, &len); > + new.type = DEMANGLE_COMPONENT_NAME; > + new.u.s_name.s = name; > + new.u.s_name.len = len; > + if (inspect_type (info, &new, free_list)) > + { > + char *n; > + > + /* A typedef was substituted in NEW. Convert it to a > + string and replace the top DEMANGLE_COMPONENT_QUAL_NAME > + node. */ > + n = cp_comp_to_string (&new, 100); > + xfree (name); > + if (n != NULL) > + { > + ui_file_rewind (buf); ui_file_rewind should happen even if N is NULL, somehow probably to rather abort the operation on NULL N. > + VEC_safe_push (namep, *free_list, n); > + IMO here is missing: ret_comp->type = DEMANGLE_COMPONENT_NAME; > + d_left (ret_comp)->u.s_name.s = n; > + d_left (ret_comp)->u.s_name.len = strlen (n); > + d_right (ret_comp) = d_right (comp); > + comp = ret_comp; > + continue; > + } > + } > + } > + else > + { > + /* The current node is not a name, so simply replace any > + typedefs in it. Then print it to the stream to continue > + checking for more typedefs in the tree. */ > + replace_typedefs (info, d_left (comp), free_list); > + name = cp_comp_to_string (d_left (comp), 100); If it returns NULL I would prefer some sort of abort. It risks now to quietly corrupt the name. The later operations will probably fail not modifying anything but still it is a bit fragile. > + if (name != NULL) > + { > + fputs_unfiltered (name, buf); > + xfree (name); > + } > + } > + ui_file_write (buf, "::", 2); > + comp = d_right (comp); > + } > + > + /* If the next component is DEMANGLE_COMPONENT_NAME, save the qualified > + name assembled above and append the name given by COMP. Then use this > + reassembled name to check for a typedef. */ > + > + if (comp->type == DEMANGLE_COMPONENT_NAME) > + { > + ui_file_write (buf, comp->u.s_name.s, comp->u.s_name.len); > + name = ui_file_xstrdup (buf, &len); > + VEC_safe_push (namep, *free_list, name); > + > + /* Replace the top (DEMANGLE_COMPONENT_QUAL_NAME) node > + with a DEMANGLE_COMPONENT_NAME node containing the whole > + name. */ > + ret_comp->type = DEMANGLE_COMPONENT_NAME; > + ret_comp->u.s_name.s = name; > + ret_comp->u.s_name.len = len; > + (void) inspect_type (info, ret_comp, free_list); > + } > + else > + replace_typedefs (info, comp, free_list); > + > + ui_file_delete (buf); > +} [...] > +static void > +replace_typedefs (struct demangle_parse_info *info, > + struct demangle_component *ret_comp, > + VEC (namep) **free_list) > +{ > + if (ret_comp) > + { > + switch (ret_comp->type) > + { > + case DEMANGLE_COMPONENT_ARGLIST: > + check_cv_qualifiers (ret_comp); > + /* Fall through */ > + > + case DEMANGLE_COMPONENT_FUNCTION_TYPE: > + case DEMANGLE_COMPONENT_TEMPLATE: > + case DEMANGLE_COMPONENT_TEMPLATE_ARGLIST: > + replace_typedefs (info, d_left (ret_comp), free_list); > + replace_typedefs (info, d_right (ret_comp), free_list); > + break; > + > + case DEMANGLE_COMPONENT_TYPED_NAME: > + { > + struct demangle_component *comp = d_right (ret_comp); > + > + while (comp != NULL > + && (comp->type == DEMANGLE_COMPONENT_VOLATILE > + || comp->type == DEMANGLE_COMPONENT_RESTRICT > + || comp->type == DEMANGLE_COMPONENT_CONST > + || comp->type == DEMANGLE_COMPONENT_VOLATILE_THIS > + || comp->type == DEMANGLE_COMPONENT_RESTRICT_THIS > + || comp->type == DEMANGLE_COMPONENT_CONST_THIS)) > + comp = d_left (comp); > + > + if (d_left (ret_comp)->type != DEMANGLE_COMPONENT_NAME > + || comp->type != DEMANGLE_COMPONENT_FUNCTION_TYPE) > + replace_typedefs (info, d_left (ret_comp), free_list); I admit I do not understand the goal of the COMP computation and the conditional. replace_typedefs unconditionally does not break gdb.cp/meth-typedefs.exp. At least a comment of the purpose would be nice. > + replace_typedefs (info, d_right (ret_comp), free_list); > + } > + break; > + > + case DEMANGLE_COMPONENT_NAME: > + (void) inspect_type (info, ret_comp, free_list); > + break; > + > + case DEMANGLE_COMPONENT_QUAL_NAME: > + replace_typedefs_qualified_name (info, ret_comp, free_list); > + break; > + > + case DEMANGLE_COMPONENT_LOCAL_NAME: > + case DEMANGLE_COMPONENT_CTOR: > + case DEMANGLE_COMPONENT_ARRAY_TYPE: > + case DEMANGLE_COMPONENT_PTRMEM_TYPE: > + replace_typedefs (info, d_right (ret_comp), free_list); > + break; > + > + case DEMANGLE_COMPONENT_CONST: > + case DEMANGLE_COMPONENT_RESTRICT: > + case DEMANGLE_COMPONENT_VOLATILE: > + case DEMANGLE_COMPONENT_VOLATILE_THIS: > + case DEMANGLE_COMPONENT_CONST_THIS: > + case DEMANGLE_COMPONENT_RESTRICT_THIS: > + case DEMANGLE_COMPONENT_POINTER: > + case DEMANGLE_COMPONENT_REFERENCE: > + replace_typedefs (info, d_left (ret_comp), free_list); > + break; > + > + default: > + break; > + } > + } > +} > + > +/* Parse STRING and convert it to canonical form, resolving any typedefs. > + If parsing fails, or if STRING is already canonical, return NULL. > + Otherwise return the canonical form. The return value is allocated via > + xmalloc. */ > + > +char * > +cp_canonicalize_string_no_typedefs (const char *string) > +{ > + char *ret; > + unsigned int estimated_len; > + struct demangle_parse_info *info; > + VEC (namep) *free_list; obstack is easier for the freeing but it may be rather moved to struct demangle_parse_info. > + > + ret = NULL; > + free_list = VEC_alloc (namep, 10); > + estimated_len = strlen (string) * 2; > + info = cp_demangled_name_to_comp (string, NULL); I have tried to run some strings to it and for example jsRegExpFree(struct JSRegExp *) fails to parse by cp-name-parser.y. This is outside of scope of this patch, it should cause no regressions. It reduces the PR 12266 functionality and it is a blocker for the possible future drop of DW_AT_MIPS_linkage_name. > + if (info != NULL) > + { > + /* Replace all the typedefs in the tree. */ > + replace_typedefs (info, info->tree, &free_list); > + > + /* Convert the tree back into a string. */ > + ret = cp_comp_to_string (info->tree, estimated_len); > + gdb_assert (ret != NULL); > + > + /* Free the parse information. */ > + cp_demangled_name_parse_free (info); > + > + /* Free any memory allocated during typedef replacement. */ > + if (!VEC_empty (namep, free_list)) > + { > + int i; > + char *iter; > + > + for (i = 0; VEC_iterate (namep, free_list, i, iter); ++i) > + xfree (iter); > + } > + > + /* Free the vector used for the free list. */ > + VEC_free (namep, free_list); > + > + /* Finally, compare the original string with the computed > + name, returning NULL if they are the same. */ > + if (strcmp (string, ret) == 0) > + { > + xfree (ret); > + return NULL; > + } > + } > + > + return ret; > +} > + [...] > @@ -1365,7 +1365,7 @@ decode_compound (char **argptr, int funfirstline, > char *the_real_saved_arg, char *p) > { > struct symtabs_and_lines values; > - char *p2; > + char *p2, *name; > char *saved_arg2 = *argptr; > char *temp_end; > struct symbol *sym; > @@ -1373,6 +1373,7 @@ decode_compound (char **argptr, int funfirstline, > struct symbol *sym_class; > struct type *t; > char *saved_arg; > + struct cleanup *cleanup; > > /* If the user specified any completer quote characters in the input, > strip them. They are superfluous. */ > @@ -1597,7 +1598,22 @@ decode_compound (char **argptr, int funfirstline, > *argptr = (*p == '\'') ? p + 1 : p; > > /* Look up entire name. */ > - sym = lookup_symbol (copy, get_selected_block (0), VAR_DOMAIN, 0); > + name = copy; > + > + cleanup = make_cleanup (null_cleanup, NULL); > + if (current_language->la_language == language_cplus) > + { > + char *canon = cp_canonicalize_string_no_typedefs (copy); You are now using cp_canonicalize_string_no_typedefs at the consumers but it is not used at the producer - at dwarf2_canonicalize_name. It is not needed there as long as GDB depends on DW_AT_MIPS_linkage_name. But do you plan to use it at dwarf2_canonicalize_name? That is it would fix the `set debug check-physname yes' warnings. I guess you were referring to the dwarf2_canonicalize_name application by your text: # I've pruned the original patchset down substantially: a lot of the # previous three (!) patches dealt with dwarf2_physname fixes which are # now no longer necessary for GCC. I will attempt to clean these up for # later submission for the benefit of other non-GNU compilers. > + > + if (canon != NULL) > + { > + name = canon; > + make_cleanup (xfree, name); > + } > + } > + > + sym = lookup_symbol (name, get_selected_block (0), VAR_DOMAIN, 0); > + do_cleanups (cleanup); > if (sym) > return symbol_found (funfirstline, canonical, copy, sym, NULL, NULL); > else [...] Thanks, Jan