Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
* [rfa] tweak sorting of partial symbols
@ 2003-02-22  1:55 David Carlton
  2003-02-24  0:36 ` Elena Zannoni
  0 siblings, 1 reply; 5+ messages in thread
From: David Carlton @ 2003-02-22  1:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Elena Zannoni, Daniel Jacobowitz, Jim Blandy

Right now, partial symbols are sorted via strcmp.  But if you look up
NAME in the partial symbols, then what lookup_partial_symbol does is:

* Figure out where NAME fits under the strcmp ordering.

* Then, starting from that place in the partial symbols, look at all
  the partial symbols until we either find the right one or until we
  find one such that strcmp_iw(SYM_NAME,NAME) doesn't match.

In particular, there's the assumption that all symbols for which
strcmp_iw(SYM_NAME,NAME) matches are sorted via strcmp into a group
that starts out at the position where strcmp would put NAME.

But I can't see why this should be true.  So this patch adds a new
function strcmp_iw_ordered (not the best name, perhaps) that is
suitable for use via qsort (unlike strcmp_iw, which doesn't provide an
ordering or even an equivalence relation) and which does have the
desired property.  It does this by ignoring whitespace and by treating
'(' as the smallest non-NULL character (because, in some
circumstances, strcmp_iw wants to ignore matches where the difference
starts with an open parenthesis).

In practice, this should rarely if ever pose a problem, but it's a
good idea for the sake of pedanticism.  After this one goes in, I'll
also commit that patch to make_symbol_overload_list that we talked
about earlier today.

Tested on i686-pc-linux-gnu/GCC3.1/DWARF-2; OK to apply?

David Carlton
carlton@math.stanford.edu

2003-02-21  David Carlton  <carlton@math.stanford.edu>

	* symtab.c (lookup_partial_symbol): Use strcmp_iw_ordered to
	do the comparison, not strcmp.
	* symfile.c (compare_psymbols): Ditto.
	* defs.h: Declare strcmp_iw_ordered.
	* utils.c (strcmp_iw_ordered): New function.

Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.90
diff -u -p -r1.90 symtab.c
--- symtab.c	20 Feb 2003 22:07:38 -0000	1.90
+++ symtab.c	22 Feb 2003 01:05:45 -0000
@@ -1379,9 +1379,10 @@ lookup_partial_symbol (struct partial_sy
       do_linear_search = 0;
 
       /* Binary search.  This search is guaranteed to end with center
-         pointing at the earliest partial symbol with the correct
-         name.  At that point *all* partial symbols with that name
-         will be checked against the correct namespace. */
+         pointing at the earliest partial symbol whose name might be
+         correct.  At that point *all* partial symbols with an
+         appropriate name will be checked against the correct
+         namespace.  */
 
       bottom = start;
       top = start + length - 1;
@@ -1396,7 +1397,7 @@ lookup_partial_symbol (struct partial_sy
 	    {
 	      do_linear_search = 1;
 	    }
-	  if (strcmp (SYMBOL_PRINT_NAME (*center), name) >= 0)
+	  if (strcmp_iw_ordered (SYMBOL_PRINT_NAME (*center), name) >= 0)
 	    {
 	      top = center;
 	    }
Index: symfile.c
===================================================================
RCS file: /cvs/src/src/gdb/symfile.c,v
retrieving revision 1.89
diff -u -p -r1.89 symfile.c
--- symfile.c	20 Feb 2003 17:17:25 -0000	1.89
+++ symfile.c	22 Feb 2003 01:05:40 -0000
@@ -213,52 +213,17 @@ compare_symbols (const void *s1p, const 
   return (strcmp (SYMBOL_PRINT_NAME (*s1), SYMBOL_PRINT_NAME (*s2)));
 }
 
-/*
-
-   LOCAL FUNCTION
-
-   compare_psymbols -- compare two partial symbols by name
-
-   DESCRIPTION
-
-   Given pointers to pointers to two partial symbol table entries,
-   compare them by name and return -N, 0, or +N (ala strcmp).
-   Typically used by sorting routines like qsort().
-
-   NOTES
-
-   Does direct compare of first two characters before punting
-   and passing to strcmp for longer compares.  Note that the
-   original version had a bug whereby two null strings or two
-   identically named one character strings would return the
-   comparison of memory following the null byte.
-
- */
+/* This compares two partial symbols by names, using strcmp_iw_ordered
+   for the comparison.  */
 
 static int
 compare_psymbols (const void *s1p, const void *s2p)
 {
-  register struct partial_symbol **s1, **s2;
-  register char *st1, *st2;
-
-  s1 = (struct partial_symbol **) s1p;
-  s2 = (struct partial_symbol **) s2p;
-  st1 = SYMBOL_PRINT_NAME (*s1);
-  st2 = SYMBOL_PRINT_NAME (*s2);
-
+  struct partial_symbol *const *s1 = s1p;
+  struct partial_symbol *const *s2 = s2p;
 
-  if ((st1[0] - st2[0]) || !st1[0])
-    {
-      return (st1[0] - st2[0]);
-    }
-  else if ((st1[1] - st2[1]) || !st1[1])
-    {
-      return (st1[1] - st2[1]);
-    }
-  else
-    {
-      return (strcmp (st1, st2));
-    }
+  return strcmp_iw_ordered (SYMBOL_PRINT_NAME (*s1),
+			    SYMBOL_PRINT_NAME (*s2));
 }
 
 void
Index: defs.h
===================================================================
RCS file: /cvs/src/src/gdb/defs.h,v
retrieving revision 1.113
diff -u -p -r1.113 defs.h
--- defs.h	14 Feb 2003 13:58:05 -0000	1.113
+++ defs.h	22 Feb 2003 01:05:31 -0000
@@ -305,6 +305,8 @@ extern void notice_quit (void);
 
 extern int strcmp_iw (const char *, const char *);
 
+extern int strcmp_iw_ordered (const char *, const char *);
+
 extern int streq (const char *, const char *);
 
 extern int subset_compare (char *, char *);
Index: utils.c
===================================================================
RCS file: /cvs/src/src/gdb/utils.c,v
retrieving revision 1.96
diff -u -p -r1.96 utils.c
--- utils.c	7 Feb 2003 00:27:30 -0000	1.96
+++ utils.c	22 Feb 2003 01:05:36 -0000
@@ -2358,6 +2358,64 @@ strcmp_iw (const char *string1, const ch
   return (*string1 != '\0' && *string1 != '(') || (*string2 != '\0');
 }
 
+/* This is like strcmp except that it ignores whitespace and treats
+   '(' as the first non-NULL character in terms of ordering.  Like
+   strcmp (and unlike strcmp_iw), it returns negative if STRING1 <
+   STRING2, 0 if STRING2 = STRING2, and positive if STRING1 > STRING2
+   according to that ordering.
+
+   If a list is sorted according to this function and if you want to
+   find names in the list that match some fixed NAME according to
+   strcmp_iw(LIST_ELT, NAME), then the place to start looking is right
+   where this function would put NAME.  */
+
+int
+strcmp_iw_ordered (const char *string1, const char *string2)
+{
+  while ((*string1 != '\0') && (*string2 != '\0'))
+    {
+      while (isspace (*string1))
+	{
+	  string1++;
+	}
+      while (isspace (*string2))
+	{
+	  string2++;
+	}
+      if (*string1 != *string2)
+	{
+	  break;
+	}
+      if (*string1 != '\0')
+	{
+	  string1++;
+	  string2++;
+	}
+    }
+
+  switch (*string1)
+    {
+      /* Characters are non-equal unless they're both '\0'; we want to
+	 make sure we get the comparison right according to our
+	 comparison in the cases where one of them is '\0' or '('.  */
+    case '\0':
+      if (*string2 == '\0')
+	return 0;
+      else
+	return -1;
+    case '(':
+      if (*string2 == '\0')
+	return 1;
+      else
+	return -1;
+    default:
+      if (*string2 == '(')
+	return 1;
+      else
+	return *string1 - *string2;
+    }
+}
+
 /* A simple comparison function with opposite semantics to strcmp.  */
 
 int


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfa] tweak sorting of partial symbols
  2003-02-22  1:55 [rfa] tweak sorting of partial symbols David Carlton
@ 2003-02-24  0:36 ` Elena Zannoni
  2003-02-24 18:13   ` David Carlton
  0 siblings, 1 reply; 5+ messages in thread
From: Elena Zannoni @ 2003-02-24  0:36 UTC (permalink / raw)
  To: David Carlton; +Cc: gdb-patches, Elena Zannoni, Daniel Jacobowitz, Jim Blandy

David Carlton writes:
 > Right now, partial symbols are sorted via strcmp.  But if you look up
 > NAME in the partial symbols, then what lookup_partial_symbol does is:
 > 
 > * Figure out where NAME fits under the strcmp ordering.
 > 
 > * Then, starting from that place in the partial symbols, look at all
 >   the partial symbols until we either find the right one or until we
 >   find one such that strcmp_iw(SYM_NAME,NAME) doesn't match.
 > 
 > In particular, there's the assumption that all symbols for which
 > strcmp_iw(SYM_NAME,NAME) matches are sorted via strcmp into a group
 > that starts out at the position where strcmp would put NAME.
 > 

It seems innocuous enough, and intuitively it makes sense, but can you
show me a case where it makes a difference? I.e some set of strings
for which the order would change?

I tried a few sets of strings and couldn't trigger any differences.

elena


 > But I can't see why this should be true.  So this patch adds a new
 > function strcmp_iw_ordered (not the best name, perhaps) that is
 > suitable for use via qsort (unlike strcmp_iw, which doesn't provide an
 > ordering or even an equivalence relation) and which does have the
 > desired property.  It does this by ignoring whitespace and by treating
 > '(' as the smallest non-NULL character (because, in some
 > circumstances, strcmp_iw wants to ignore matches where the difference
 > starts with an open parenthesis).
 > 
 > In practice, this should rarely if ever pose a problem, but it's a
 > good idea for the sake of pedanticism.  After this one goes in, I'll
 > also commit that patch to make_symbol_overload_list that we talked
 > about earlier today.
 > 
 > Tested on i686-pc-linux-gnu/GCC3.1/DWARF-2; OK to apply?
 > 
 > David Carlton
 > carlton@math.stanford.edu
 > 
 > 2003-02-21  David Carlton  <carlton@math.stanford.edu>
 > 
 > 	* symtab.c (lookup_partial_symbol): Use strcmp_iw_ordered to
 > 	do the comparison, not strcmp.
 > 	* symfile.c (compare_psymbols): Ditto.
 > 	* defs.h: Declare strcmp_iw_ordered.
 > 	* utils.c (strcmp_iw_ordered): New function.
 > 
 > Index: symtab.c
 > ===================================================================
 > RCS file: /cvs/src/src/gdb/symtab.c,v
 > retrieving revision 1.90
 > diff -u -p -r1.90 symtab.c
 > --- symtab.c	20 Feb 2003 22:07:38 -0000	1.90
 > +++ symtab.c	22 Feb 2003 01:05:45 -0000
 > @@ -1379,9 +1379,10 @@ lookup_partial_symbol (struct partial_sy
 >        do_linear_search = 0;
 >  
 >        /* Binary search.  This search is guaranteed to end with center
 > -         pointing at the earliest partial symbol with the correct
 > -         name.  At that point *all* partial symbols with that name
 > -         will be checked against the correct namespace. */
 > +         pointing at the earliest partial symbol whose name might be
 > +         correct.  At that point *all* partial symbols with an
 > +         appropriate name will be checked against the correct
 > +         namespace.  */
 >  
 >        bottom = start;
 >        top = start + length - 1;
 > @@ -1396,7 +1397,7 @@ lookup_partial_symbol (struct partial_sy
 >  	    {
 >  	      do_linear_search = 1;
 >  	    }
 > -	  if (strcmp (SYMBOL_PRINT_NAME (*center), name) >= 0)
 > +	  if (strcmp_iw_ordered (SYMBOL_PRINT_NAME (*center), name) >= 0)
 >  	    {
 >  	      top = center;
 >  	    }
 > Index: symfile.c
 > ===================================================================
 > RCS file: /cvs/src/src/gdb/symfile.c,v
 > retrieving revision 1.89
 > diff -u -p -r1.89 symfile.c
 > --- symfile.c	20 Feb 2003 17:17:25 -0000	1.89
 > +++ symfile.c	22 Feb 2003 01:05:40 -0000
 > @@ -213,52 +213,17 @@ compare_symbols (const void *s1p, const 
 >    return (strcmp (SYMBOL_PRINT_NAME (*s1), SYMBOL_PRINT_NAME (*s2)));
 >  }
 >  
 > -/*
 > -
 > -   LOCAL FUNCTION
 > -
 > -   compare_psymbols -- compare two partial symbols by name
 > -
 > -   DESCRIPTION
 > -
 > -   Given pointers to pointers to two partial symbol table entries,
 > -   compare them by name and return -N, 0, or +N (ala strcmp).
 > -   Typically used by sorting routines like qsort().
 > -
 > -   NOTES
 > -
 > -   Does direct compare of first two characters before punting
 > -   and passing to strcmp for longer compares.  Note that the
 > -   original version had a bug whereby two null strings or two
 > -   identically named one character strings would return the
 > -   comparison of memory following the null byte.
 > -
 > - */
 > +/* This compares two partial symbols by names, using strcmp_iw_ordered
 > +   for the comparison.  */
 >  
 >  static int
 >  compare_psymbols (const void *s1p, const void *s2p)
 >  {
 > -  register struct partial_symbol **s1, **s2;
 > -  register char *st1, *st2;
 > -
 > -  s1 = (struct partial_symbol **) s1p;
 > -  s2 = (struct partial_symbol **) s2p;
 > -  st1 = SYMBOL_PRINT_NAME (*s1);
 > -  st2 = SYMBOL_PRINT_NAME (*s2);
 > -
 > +  struct partial_symbol *const *s1 = s1p;
 > +  struct partial_symbol *const *s2 = s2p;
 >  
 > -  if ((st1[0] - st2[0]) || !st1[0])
 > -    {
 > -      return (st1[0] - st2[0]);
 > -    }
 > -  else if ((st1[1] - st2[1]) || !st1[1])
 > -    {
 > -      return (st1[1] - st2[1]);
 > -    }
 > -  else
 > -    {
 > -      return (strcmp (st1, st2));
 > -    }
 > +  return strcmp_iw_ordered (SYMBOL_PRINT_NAME (*s1),
 > +			    SYMBOL_PRINT_NAME (*s2));
 >  }
 >  
 >  void
 > Index: defs.h
 > ===================================================================
 > RCS file: /cvs/src/src/gdb/defs.h,v
 > retrieving revision 1.113
 > diff -u -p -r1.113 defs.h
 > --- defs.h	14 Feb 2003 13:58:05 -0000	1.113
 > +++ defs.h	22 Feb 2003 01:05:31 -0000
 > @@ -305,6 +305,8 @@ extern void notice_quit (void);
 >  
 >  extern int strcmp_iw (const char *, const char *);
 >  
 > +extern int strcmp_iw_ordered (const char *, const char *);
 > +
 >  extern int streq (const char *, const char *);
 >  
 >  extern int subset_compare (char *, char *);
 > Index: utils.c
 > ===================================================================
 > RCS file: /cvs/src/src/gdb/utils.c,v
 > retrieving revision 1.96
 > diff -u -p -r1.96 utils.c
 > --- utils.c	7 Feb 2003 00:27:30 -0000	1.96
 > +++ utils.c	22 Feb 2003 01:05:36 -0000
 > @@ -2358,6 +2358,64 @@ strcmp_iw (const char *string1, const ch
 >    return (*string1 != '\0' && *string1 != '(') || (*string2 != '\0');
 >  }
 >  
 > +/* This is like strcmp except that it ignores whitespace and treats
 > +   '(' as the first non-NULL character in terms of ordering.  Like
 > +   strcmp (and unlike strcmp_iw), it returns negative if STRING1 <
 > +   STRING2, 0 if STRING2 = STRING2, and positive if STRING1 > STRING2
 > +   according to that ordering.
 > +
 > +   If a list is sorted according to this function and if you want to
 > +   find names in the list that match some fixed NAME according to
 > +   strcmp_iw(LIST_ELT, NAME), then the place to start looking is right
 > +   where this function would put NAME.  */
 > +
 > +int
 > +strcmp_iw_ordered (const char *string1, const char *string2)
 > +{
 > +  while ((*string1 != '\0') && (*string2 != '\0'))
 > +    {
 > +      while (isspace (*string1))
 > +	{
 > +	  string1++;
 > +	}
 > +      while (isspace (*string2))
 > +	{
 > +	  string2++;
 > +	}
 > +      if (*string1 != *string2)
 > +	{
 > +	  break;
 > +	}
 > +      if (*string1 != '\0')
 > +	{
 > +	  string1++;
 > +	  string2++;
 > +	}
 > +    }
 > +
 > +  switch (*string1)
 > +    {
 > +      /* Characters are non-equal unless they're both '\0'; we want to
 > +	 make sure we get the comparison right according to our
 > +	 comparison in the cases where one of them is '\0' or '('.  */
 > +    case '\0':
 > +      if (*string2 == '\0')
 > +	return 0;
 > +      else
 > +	return -1;
 > +    case '(':
 > +      if (*string2 == '\0')
 > +	return 1;
 > +      else
 > +	return -1;
 > +    default:
 > +      if (*string2 == '(')
 > +	return 1;
 > +      else
 > +	return *string1 - *string2;
 > +    }
 > +}
 > +
 >  /* A simple comparison function with opposite semantics to strcmp.  */
 >  
 >  int


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfa] tweak sorting of partial symbols
  2003-02-24  0:36 ` Elena Zannoni
@ 2003-02-24 18:13   ` David Carlton
  2003-02-24 21:36     ` Elena Zannoni
  0 siblings, 1 reply; 5+ messages in thread
From: David Carlton @ 2003-02-24 18:13 UTC (permalink / raw)
  To: Elena Zannoni; +Cc: gdb-patches, Daniel Jacobowitz, Jim Blandy

On Sun, 23 Feb 2003 19:40:25 -0500, Elena Zannoni <ezannoni@redhat.com> said:

> It seems innocuous enough, and intuitively it makes sense, but can you
> show me a case where it makes a difference? I.e some set of strings
> for which the order would change?

Sure, that's a good question; I should have included such examples
with my original message.

Whitespace example:

Partial symtab contains: "foo<char *>", "goo".

Then, if we try to do a search for "foo<char*>", strcmp will locate
this after "foo<char *>" and before "goo".  Then lookup_partial_symbol
will start looking at strings beginning with "goo", and will never see
the correct match of "foo<char *>".

Parenthesis example:

In practice, this is less like to be an issue, but I'll give it a
shot.  Let's assume that '$' is a legitimate character to occur in
symbols.  (Which may well even be the case on some systems.)  Then say
that the partial symbol table contains "foo$" and "foo(int)".  strcmp
will put them in this order, since '$' < '('.

Now, if the user searches for "foo", then strcmp will sort "foo"
before "foo$".  Then lookup_partial_symbol will notice that
strcmp_iw("foo$", "foo") is false, so it won't proceed to the actual
match of "foo(int)" with "foo".

David Carlton
carlton@math.stanford.edu


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfa] tweak sorting of partial symbols
  2003-02-24 18:13   ` David Carlton
@ 2003-02-24 21:36     ` Elena Zannoni
  2003-02-24 23:39       ` David Carlton
  0 siblings, 1 reply; 5+ messages in thread
From: Elena Zannoni @ 2003-02-24 21:36 UTC (permalink / raw)
  To: David Carlton; +Cc: Elena Zannoni, gdb-patches, Daniel Jacobowitz, Jim Blandy

David Carlton writes:
 > On Sun, 23 Feb 2003 19:40:25 -0500, Elena Zannoni <ezannoni@redhat.com> said:
 > 
 > > It seems innocuous enough, and intuitively it makes sense, but can you
 > > show me a case where it makes a difference? I.e some set of strings
 > > for which the order would change?
 > 
 > Sure, that's a good question; I should have included such examples
 > with my original message.
 > 
 > Whitespace example:
 > 
 > Partial symtab contains: "foo<char *>", "goo".
 > 
 > Then, if we try to do a search for "foo<char*>", strcmp will locate
 > this after "foo<char *>" and before "goo".  Then lookup_partial_symbol
 > will start looking at strings beginning with "goo", and will never see
 > the correct match of "foo<char *>".
 > 
 > Parenthesis example:
 > 
 > In practice, this is less like to be an issue, but I'll give it a
 > shot.  Let's assume that '$' is a legitimate character to occur in
 > symbols.  (Which may well even be the case on some systems.)  Then say
 > that the partial symbol table contains "foo$" and "foo(int)".  strcmp
 > will put them in this order, since '$' < '('.
 > 
 > Now, if the user searches for "foo", then strcmp will sort "foo"
 > before "foo$".  Then lookup_partial_symbol will notice that
 > strcmp_iw("foo$", "foo") is false, so it won't proceed to the actual
 > match of "foo(int)" with "foo".


Ah, yes. I was trying with some dumber examples, like foo, foo(int).

Ok, approved, but could you find a place to put some of the examples
above in comments?

elena


 > 
 > David Carlton
 > carlton@math.stanford.edu


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [rfa] tweak sorting of partial symbols
  2003-02-24 21:36     ` Elena Zannoni
@ 2003-02-24 23:39       ` David Carlton
  0 siblings, 0 replies; 5+ messages in thread
From: David Carlton @ 2003-02-24 23:39 UTC (permalink / raw)
  To: Elena Zannoni; +Cc: gdb-patches, Daniel Jacobowitz, Jim Blandy

On Mon, 24 Feb 2003 16:40:22 -0500, Elena Zannoni <ezannoni@redhat.com> said:

> Ah, yes. I was trying with some dumber examples, like foo, foo(int).

> Ok, approved, but could you find a place to put some of the examples
> above in comments?

Thanks; I committed this after adding the examples to the comment
above the definition of strcmp_iw_ordered.

David Carlton
carlton@math.stanford.edu


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2003-02-24 23:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-02-22  1:55 [rfa] tweak sorting of partial symbols David Carlton
2003-02-24  0:36 ` Elena Zannoni
2003-02-24 18:13   ` David Carlton
2003-02-24 21:36     ` Elena Zannoni
2003-02-24 23:39       ` David Carlton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox