From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12248 invoked by alias); 24 Jun 2010 02:25:20 -0000 Received: (qmail 12224 invoked by uid 22791); 24 Jun 2010 02:25:18 -0000 X-SWARE-Spam-Status: No, hits=-1.1 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,KAM_STOCKGEN,SPF_HELO_PASS,TW_BJ,TW_CP,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.44.51) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 24 Jun 2010 02:25:13 +0000 Received: from hpaq6.eem.corp.google.com (hpaq6.eem.corp.google.com [172.25.149.6]) by smtp-out.google.com with ESMTP id o5O2PAjm005825 for ; Wed, 23 Jun 2010 19:25:11 -0700 Received: from vws1 (vws1.prod.google.com [10.241.21.129]) by hpaq6.eem.corp.google.com with ESMTP id o5O2P8S1021893 for ; Wed, 23 Jun 2010 19:25:09 -0700 Received: by vws1 with SMTP id 1so1936667vws.36 for ; Wed, 23 Jun 2010 19:25:08 -0700 (PDT) MIME-Version: 1.0 Received: by 10.220.124.18 with SMTP id s18mr4568968vcr.149.1277346308323; Wed, 23 Jun 2010 19:25:08 -0700 (PDT) Received: by 10.220.203.141 with HTTP; Wed, 23 Jun 2010 19:25:08 -0700 (PDT) In-Reply-To: <4C20DA1A.10301@redhat.com> References: <4C20103B.1080906@redhat.com> <201006221045.23196.pedro@codesourcery.com> <4C20DA1A.10301@redhat.com> Date: Thu, 24 Jun 2010 02:25:00 -0000 Message-ID: Subject: Re: [RFA] c++/11734 From: Doug Evans To: Keith Seitz Cc: gdb-patches@sourceware.org, Pedro Alves , Tom Tromey Content-Type: text/plain; charset=ISO-8859-1 X-System-Of-Record: true X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2010-06/txt/msg00517.txt.bz2 On Tue, Jun 22, 2010 at 8:43 AM, Keith Seitz wrote: > I noticed that I also missed adding this logic to the linear search case. > I've attached a revised patch which addresses this missed bit and moves the > alloca out of the loop. I looked at this a bit. One thing I noticed is that "break c::foo" is only working by accident. "break c::foo" needs full symbols because we look up each overloaded method, so we need a lookup of "c::foo()" to succeed. "c::foo" looks enough like an objc symbol that decode_objc will call lookup_symbol ("c::foo") which will expand the psymtab and hence "c::foo()" is found. With "break c::foo()", "c::foo()" doesn't look enough like an objc symbol so decode_objc ignores it - so we don't get full symtabs and thus "c::foo()" isn't found. [I'm not suggesting we fix this by making decode_objc recognize "c::foo()" as a potentially valid objc symbol though. :-)] I have a few comments on your patch. diff -u -p -r1.103 linespec.c --- linespec.c 14 May 2010 23:41:04 -0000 1.103 +++ linespec.c 22 Jun 2010 15:39:18 -0000 @@ -1217,6 +1217,19 @@ decode_compound (char **argptr, int funf struct type *t; char *saved_java_argptr = NULL; + /* Strip single quotes from SAVED_ARG. This interferes with this function + which might, e.g., later call strcmp_iw with SYMBOL_LINKAGE_NAME + (which never contains quotes). */ + if (*saved_arg == '\'') + { + char *close = strrchr (saved_arg, '\''); + if (close) + { + ++saved_arg; + *close = '\0'; + } + } + /* First check for "global" namespace specification, of the form "::foo". If found, skip over the colons and jump to normal symbol processing. I.e. the whole line specification starts with I think the change to linespec.c should be a separate patch. And I think decode_compound shouldn't be modifying what saved_arg points to. Plus the caller is expecting the trailing quote to be there when decode_compound returns. At least that's what it seems. [Time goes by ...] Looking deeper I see that for break 'c::foo()', is_quote_enclosed is zero. That's because locate_first_half only looks for double-quotes, but it will remove them! (which is what you're doing in decode_compound). So it seems like a better way to go is to teach locate_first_half about all quote characters. There may be more going on here though, I don't understand why decode_line_1 has *both* is_quoted and is_quote_enclosed (is there some cleanup that can be done here, or is there a technical reason to handle " different from '?). For the lookup_partial_symbol patch: diff -u -p -r1.4 psymtab.c --- psymtab.c 16 May 2010 01:27:02 -0000 1.4 +++ psymtab.c 22 Jun 2010 15:39:20 -0000 @@ -432,6 +432,7 @@ lookup_partial_symbol (struct partial_sy struct partial_symbol **top, **real_top, **bottom, **center; int length = (global ? pst->n_global_syms : pst->n_static_syms); int do_linear_search = 1; + char *simple_name = NULL, *paren; if (length == 0) { @@ -441,6 +442,14 @@ lookup_partial_symbol (struct partial_sy pst->objfile->global_psymbols.list + pst->globals_offset : pst->objfile->static_psymbols.list + pst->statics_offset); + paren = strchr (name, '('); + if (paren) + { + simple_name = alloca (strlen (name)); + memcpy (simple_name, name, paren - name); + simple_name[name - paren] = '\0'; + } + if (global) /* This means we can use a binary search. */ { do_linear_search = 0; @@ -476,12 +485,32 @@ lookup_partial_symbol (struct partial_sy if (!(top == bottom)) internal_error (__FILE__, __LINE__, _("failed internal consistency check")); - while (top <= real_top - && SYMBOL_MATCHES_SEARCH_NAME (*top, name)) + while (top <= real_top) { - if (symbol_matches_domain (SYMBOL_LANGUAGE (*top), - SYMBOL_DOMAIN (*top), domain)) - return (*top); + if (SYMBOL_MATCHES_SEARCH_NAME (*top, name)) + { + if (symbol_matches_domain (SYMBOL_LANGUAGE (*top), + SYMBOL_DOMAIN (*top), domain)) + return (*top); + } + else + { + if (simple_name) + { + /* NAME has overload information. Partial symbols, however, + do not. This is a case of mistaken identity. + + Although hacky, this is fixed by expanding this psymtab, + which will allow any subsequent symtab search to succeed. + + For more details/test case, please refer to c++/11734. */ + + if (SYMBOL_MATCHES_SEARCH_NAME (*top, simple_name)) + PSYMTAB_TO_SYMTAB (pst); + } + else + break; + } top++; } } @@ -497,6 +526,11 @@ lookup_partial_symbol (struct partial_sy SYMBOL_DOMAIN (*psym), domain) && SYMBOL_MATCHES_SEARCH_NAME (*psym, name)) return (*psym); + else if (simple_name && SYMBOL_MATCHES_SEARCH_NAME (*psym, simple_name)) + { + PSYMTAB_TO_SYMTAB (pst); + break; + } } } There are several callers and rather than changing all of them to strip method overloading of the name to search for, it seems reasonable to handle it in lookup_partial_symbol. [But there's more to the story here - expanding the psymtab here feels wrong, see below.] In the case of simple_name != NULL, do we need to call SYMBOL_MATCHES_SEARCH_NAME twice? IWBN if psymtabs didn't require that complexity and *only* recorded the un-overloaded name. Although, note that if a psymtab did record an overloaded name, because of the order of arguments to strcmp_iw, we have to worry about matching c::foo(int) in the psymtab with c::foo(char*) in the search string. strcmp_iw ("c::foo(int)", "c::foo") is a match. Could we set name to simple_name when simple_name is created, and then have only one call to SYMBOL_MATCHES_SEARCH_NAME? There's still the issue of handling the "found" case differently for non-overloaded names vs overloaded-names. AIUI, what you're doing here is having the lookup "fail" if an overloaded-name is found, so that a subsequent search in the full symtab will be done and, having expanded the psymtab here, that search will succeed. However psymtabs are searched *after* symtabs. This patch works because it turns out that we will later do a search in the full symtab, but that's more by accident than design: struct symbol * cp_lookup_symbol_nonlocal (const char *name, const struct block *block, const domain_enum domain) { struct symbol *sym; const char *scope = block_scope (block); sym = lookup_namespace_scope (name, block, domain, scope, 0); //<<< psymtab expanded here if (sym != NULL) return sym; return cp_lookup_symbol_namespace (scope, name, block, domain); //<<< overloaded symbol found here } [Unless I'm missing something of course. That's what I see as I single step through gdb watching what's happening.] This situation is not well solved by our normal psymtabs->symtabs lazy expansion design. I don't have a specific proposal for a better fix at the moment. Comments?