* Re: (patch) hpjyg09: bcache optimizations
  [not found] <Pine.LNX.4.10.9911041529510.15357-100000@hpcll168.cup.hp.com>
@ 1999-11-04 17:31 ` Andrew Cagney
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Cagney @ 1999-11-04 17:31 UTC (permalink / raw)
  To: Jimmy Guo; +Cc: gdb-patches

Jimmy Guo wrote:
>
> Andrew,
>
> 1) Not all compilers support the inline keyword (HP C compiler for
> example), so it will be a nop change in such cases. Therefore the
> macro def might be the only way to optimize this away.

BTW, how do you get an HP compiler to inline a function? Will it just do
it if -O<very-large-number> is given?

In the case of the simulators, the decision was to build a fast correct
program when built with GCC and a slower correct program when built with
other compilers.

> 2) The macro def is duplicating the hash() implementation. I agree that
> one implementation is much more maintainable. While hash() could use
> the BCACHE_HASH macro, what about the 1st point you made about debugging
> and use of macros?

Yes, I know the two comments appear contradictory. I'm assuming the
macro is approved. In that case I think a single (un-debuggable?)
implementation is better than a macro and a function duplicating code.

Even having a function with a macro as its body helps - at least you can
set breakpoints on that hash function and watch what is being returned.

> 3) In the actual use of BCACHE_HASH vs. the hash() call, yes a default
> BCACHE_HASH can be provided which calls hash(). However, this would
> introduce yet another macro wrapper (to address both 2 and 3 ... one
> for the hash implementation -- 2; another for the hash call wrapper
> using the macro implementation directly or through an actual hash()
> wrapper).

True. However, I believe it is more important to avoid littering C
files with further:

	#if ...
	one variant
	#else
	another variant
	#endif

If we did that, we would need to ensure that each alternative continued
to be compilable.
> > o I'm in two minds over the first argument
> >   (HASHVAL). I recently spent time eliminating
> >   several really bad macro definitions that
> >   abused the fact that they had direct access to
> >   arguments - changing them to multi-arch functions
> >   was, er, non-trivial.
> >
> >   However, in this case, the rationale and use is
> >   clean.
>
> I'm not sure I understand your last point about changing macros to
> multi-arch functions.

(Er, yes, sorry, my quandary isn't very clear.) Some of the old GDB
macros would do things like:

	#define X(Y) (memset (&(Y), sizeof (Y), 0); (Y).A = 1; (Y).B = 2)

I'd still prefer to see retval = BCACHE_HASH (bytes, temp_count).
Unfortunately, that can't be implemented as an ISO C macro so it won't
happen :-)

> Since:
> a) the macro def is pretty much necessary to make the change
>    meaningful (see 1),
> b) to have one implementation of hash and make the usage clean (see 2 &
>    3),
> I think we should just use the macro def and blow away hash() altogether.

I think there first needs to be a technical decision made about how much
of a price we're willing to pay for performance vs. maintainability. As
you can probably tell, I have a very strong preference for the latter :-)

If the decision is to use the macro then the corresponding hash()
function should be deleted.

	enjoy,
	Andrew

From guo@cup.hp.com Thu Nov 04 18:07:00 1999
From: Jimmy Guo <guo@cup.hp.com>
To: Andrew Cagney <ac131313@cygnus.com>
Cc: gdb-patches@sourceware.cygnus.com
Subject: Revision: (patch) hpjyg09: bcache optimizations
Date: Thu, 04 Nov 1999 18:07:00 -0000
Message-id: <Pine.LNX.4.10.9911041739340.15357-100000@hpcll168.cup.hp.com>
References: <382232AF.7E3ED811@cygnus.com>
X-SW-Source: 1999-q4/msg00182.html
Content-length: 10394

***
Following my reply to Andrew is the revision of this patch.
Patch dependency: hpjyg05 (config/pa/tm-hppa.h)
***

>BTW, how do you get an HP compiler to inline a function? Will it just do
>it if -O<very-large-number> is given?
The HP C compiler will inline functions at +O3 and +O4. And
+Oinlinebudget=<n> controls the aggressiveness of inlining:

	= 100	Default level of inlining.
	> 100	More aggressive inlining.
	2 - 99	Less aggressive inlining.
	= 1	Only inline if it reduces code size.

>> 2) The macro def is duplicating the hash() implementation. I agree that
>> one implementation is much more maintainable. While hash() could use
>> the BCACHE_HASH macro, what about the 1st point you made about debugging
>> and use of macros?
>
>Yes, I know the two comments appear contradictory. I'm assuming the
>macro is approved. In that case I think a single (un-debuggable?)
>implementation is better than a macro and a function duplicating code.
>
>Even having a function with a macro as its body helps - at least you can
>set breakpoints on that hash function and watch what is being returned.

I look forward to the same (that the use of the macro is approved) :)

>If the decision is to use the macro then the corresponding hash()
>function should be deleted.

I'm attaching a revision of this patch to use the macro only. Given the
two choices of using only the macro vs. providing a macro and also a
function wrapping around the macro, I also prefer the first choice,
assuming the decision is to use the macro. Thanks!

Here is the revision of the patch:

ChangeLog:

1999-11-04  Jimmy Guo  <guo@cup.hp.com>

	* config/pa/tm-hppa.h (BCACHE_ALLOW_DUPLICATES): Declare.

	* bcache.c (BCACHE_HASH): Define, the macro equivalent of the
	hash () function this macro is replacing.
	(lookup_cache): Bypass if ignore_duplicates (new static var) is
	set; use BCACHE_HASH macro instead of hash() function call;
	optimize (reduce) memcmp() call.
	(bcache_ignore_duplicates,bcache_eliminate_duplicates): New
	functions, to control turning on/off duplicate handling at
	runtime.

	* bcache.h (bcache_ignore_duplicates,bcache_eliminate_duplicates):
	Declare.
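Andrew's earlier remark that `retval = BCACHE_HASH (bytes, temp_count)` cannot be an ISO C macro comes down to statements vs. expressions: a brace-bodied macro such as BCACHE_HASH is a statement, not an expression, so its result has to come back through a parameter. A minimal stand-alone illustration (the names here are hypothetical, not from the patch):

```c
/* A statement-style macro must return its result through a parameter;
   a braced block is not an expression in ISO C, so it cannot appear on
   the right-hand side of an assignment.  */
#define SQUARE_STMT(retval, x)			\
  {						\
    int square_stmt_tmp = (x);			\
    (retval) = square_stmt_tmp * square_stmt_tmp; \
  }

/* The function form, by contrast, composes freely in expressions.  */
static int
square_fn (int x)
{
  return x * x;
}
```

This is exactly the trade-off in the patch: the macro form needs the extra `retval` argument, while the function form keeps call sites clean but cannot be inlined by every compiler.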
Index: gdb/config/pa/tm-hppa.h
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/config/pa/tm-hppa.h gdb/config/pa/tm-hppa.h
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/config/pa/tm-hppa.h Tue Nov 2 16:04:12 1999
--- gdb/config/pa/tm-hppa.h Thu Nov 4 13:49:12 1999
***************
*** 811,816 ****
--- 811,818 ----
|| ((symname)[0] == '$' && isdigit ((symname)[1])) \
))
+ #define BCACHE_ALLOW_DUPLICATES
+
/* Here's how to step off a permanent breakpoint. */
#define SKIP_PERMANENT_BREAKPOINT (hppa_skip_permanent_breakpoint)
extern void hppa_skip_permanent_breakpoint (void);
Index: gdb/bcache.c
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.c gdb/bcache.c
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.c Thu Nov 4 15:13:44 1999
--- gdb/bcache.c Thu Nov 4 17:56:03 1999
***************
*** 24,62 ****
#include "bcache.h"
#include "gdb_string.h" /* For memcpy declaration */
! /* Prototypes for local functions. */
!
! static unsigned int hash PARAMS ((void *, int));
!
! static void *lookup_cache PARAMS ((void *, int, int, struct bcache *));
!
/* FIXME: Incredibly simplistic hash generator. Probably way too expensive
(consider long strings) and unlikely to have good distribution across hash
values for typical input. */
! static unsigned int
! hash (bytes, count)
! void *bytes;
! int count;
! {
! unsigned int len;
! unsigned long hashval;
! unsigned int c;
! const unsigned char *data = bytes;
!
! hashval = 0;
! len = 0;
! while (count-- > 0)
! {
! c = *data++;
! hashval += c + (c << 17);
! hashval ^= hashval >> 2;
! ++len;
! }
! hashval += len + (len << 17);
! hashval ^= hashval >> 2;
! return (hashval % BCACHE_HASHSIZE);
! }
static void *
lookup_cache (bytes, count, hashval, bcachep)
--- 24,87 ----
#include "bcache.h"
#include "gdb_string.h" /* For memcpy declaration */
! /* CM: The hash function is called a LOT, so have an inlined version.
! This should help scaling to larger apps. */
!
! /* guo: Replaced hash() implementation with the equivalent BCACHE_HASH
! macro definition, since having two implementations (macro vs. function)
! of the same functionality is harder to maintain, and specific function
! inlining by compiler is not supported by all compilers (i.e. the use of
! the 'inline' keyword supported by GCC). */
/* FIXME: Incredibly simplistic hash generator. Probably way too expensive
(consider long strings) and unlikely to have good distribution across hash
values for typical input. */
! #define BCACHE_HASH(retval, bytes, temp_count) \
! { \
! int bcache_hash_count = (temp_count); \
! unsigned int bcache_hash_len; \
! unsigned long bcache_hash_hashval; \
! unsigned int bcache_hash_c; \
! const unsigned char *bcache_hash_data = (bytes); \
! \
! bcache_hash_hashval = 0; \
! bcache_hash_len = 0; \
! while (bcache_hash_count-- > 0) \
! { \
! bcache_hash_c = *bcache_hash_data++; \
! bcache_hash_hashval += bcache_hash_c + (bcache_hash_c << 17); \
! bcache_hash_hashval ^= bcache_hash_hashval >> 2; \
! ++bcache_hash_len; \
! } \
! bcache_hash_hashval += bcache_hash_len + (bcache_hash_len << 17); \
! bcache_hash_hashval ^= bcache_hash_hashval >> 2; \
! (retval) = (bcache_hash_hashval % BCACHE_HASHSIZE); \
! }
!
! /* Prototypes for local functions. */
!
! static void *lookup_cache PARAMS ((void *, int, int, struct bcache *));
!
! /* srikanth, 990314, JAGaa80452, the bcache seems to be an extremely high
! overhead data structure. See that the overhead (which is really every
! byte that is not used to store GDB's data i.e., cells used for
! housekeeping info like pointers, hash chain heads etc.,) is of the order
! of O(m * n * 64k) where m is the number of load modules compiled with -g,
! and n is the number of strings that have unique length. This spells doom
! for applications with a large number of shared libraries (m increases)
! and C++ (n increases since we also stick demangled names into the cache.)
! When debugging HP's C compiler more than 140 MB or about 48% of memory is
! due to the bcache overhead. Not the data, just the overhead!
!
! Further research, communication with Cygnus and Fred Fish (the original
! implementor) shows that this module is extremely effective, saving as much
! as 60% in the gcc + stabs + linux combination. So I am disabling it just
! for the Wildebeest and only for non-doom mode. */
!
! #ifdef BCACHE_ALLOW_DUPLICATES
! static int ignore_duplicates = 1;
! #else
! static int ignore_duplicates = 0;
! #endif
static void *
lookup_cache (bytes, count, hashval, bcachep)
***************
*** 69,81 ****
struct hashlink **hashtablep;
struct hashlink *linkp;
hashtablep = bcachep->indextable[count];
if (hashtablep != NULL)
{
linkp = hashtablep[hashval];
while (linkp != NULL)
{
! if (memcmp (BCACHE_DATA (linkp), bytes, count) == 0)
{
location = BCACHE_DATA (linkp);
break;
--- 94,118 ----
struct hashlink **hashtablep;
struct hashlink *linkp;
+ if (ignore_duplicates) /* don't care if it exists already */
+ return NULL;
+
hashtablep = bcachep->indextable[count];
if (hashtablep != NULL)
{
linkp = hashtablep[hashval];
while (linkp != NULL)
{
! /* CM: The following helps to avoid excessive (possibly shared
! library--i.e. indirect) function calls. This function is
! called a LOT, so this should help scaling to larger apps.
!
! The original line was:
! if (memcmp (BCACHE_DATA (linkp), bytes, count) == 0) */
!
! if ((*((char *) (BCACHE_DATA (linkp))) == *((char *) bytes))
! ? (memcmp (BCACHE_DATA (linkp), bytes, count) == 0)
! : 0)
{
location = BCACHE_DATA (linkp);
break;
***************
*** 108,114 ****
}
else
{
! hashval = hash (bytes, count);
location = lookup_cache (bytes, count, hashval, bcachep);
if (location != NULL)
{
--- 145,152 ----
}
else
{
! BCACHE_HASH (hashval, bytes, count);
!
location = lookup_cache (bytes, count, hashval, bcachep);
if (location != NULL)
{
***************
*** 139,144 ****
--- 177,195 ----
return (location);
}
+ /* srikanth, 990323, now we can toggle the bcache behavior on the fly */
+ void
+ bcache_ignore_duplicates ()
+ {
+ ignore_duplicates = 1;
+ }
+
+ void
+ bcache_eliminate_duplicates ()
+ {
+ ignore_duplicates = 0;
+ }
+
void
print_bcache_statistics (bcachep, id)
struct bcache *bcachep;
***************
*** 146,152 ****
{
struct hashlink **hashtablep;
struct hashlink *linkp;
! int tidx, tcount, hidx, hcount, lcount, lmax, temp, lmaxt, lmaxh;
for (lmax = lcount = tcount = hcount = tidx = 0; tidx < BCACHE_MAXLENGTH; tidx++)
{
--- 197,203 ----
{
struct hashlink **hashtablep;
struct hashlink *linkp;
! int tidx, tcount, hidx, hcount, lcount, lmax, temp, lmaxt = 0, lmaxh = 0;
for (lmax = lcount = tcount = hcount = tidx = 0; tidx < BCACHE_MAXLENGTH; tidx++)
{
Index: gdb/bcache.h
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.h gdb/bcache.h
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.h Thu Nov 4 11:11:13 1999
--- gdb/bcache.h Thu Nov 4 13:49:11 1999
***************
*** 68,73 ****
--- 68,79 ----
bcache PARAMS ((void *bytes, int count, struct bcache * bcachep));
extern void
+ bcache_ignore_duplicates PARAMS (());
+
+ extern void
+ bcache_eliminate_duplicates PARAMS (());
+
+ extern void
print_bcache_statistics PARAMS ((struct bcache *, char *));
#endif /* BCACHE_H */

^ permalink raw reply	[flat|nested] 7+ messages in thread
* (patch) hpjyg09: bcache optimizations
@ 1999-11-04 13:48 Jimmy Guo
1999-11-05 9:59 ` Jim Blandy
0 siblings, 1 reply; 7+ messages in thread
From: Jimmy Guo @ 1999-11-04 13:48 UTC (permalink / raw)
To: gdb-patches
***
Patch dependency: hpjyg05 (config/pa/tm-hppa.h)
***
A couple of bcache.c optimizations:
1) Introduction and use of an inlined hash function, and an attempt to
reduce the cost of the memcmp call.
2) Ability to bypass bcache duplicate handling (initial behavior
controlled by a target-dependent macro def), and to adjust such
behavior at runtime through two convenience functions.
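The second half of item (1), reducing the cost of the memcmp call, can be sketched stand-alone. This is a hypothetical helper illustrating the first-byte early-reject trick the patch applies inline in lookup_cache, not code from the patch itself:

```c
#include <string.h>

/* Hypothetical helper sketching the patch's memcmp reduction: test the
   first byte inline, and only pay for a full (possibly cross-library,
   hence indirect) memcmp call when that cheap check passes.  */
static int
bytes_equal (const void *a, const void *b, int count)
{
  if (count > 0
      && *(const unsigned char *) a != *(const unsigned char *) b)
    return 0;			/* cheap early reject on the first byte */
  return memcmp (a, b, count) == 0;
}
```

Whether this wins in practice depends on how often entries in the same hash chain share a first byte; when they rarely do, most full memcmp calls are skipped.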
ChangeLog:
1999-11-04 Jimmy Guo <guo@cup.hp.com>
* config/pa/tm-hppa.h (BCACHE_ALLOW_DUPLICATES): Declare.
* bcache.c (BCACHE_USE_INLINED_HASH): Define, enables use of
the macro equivalent of the hash () function instead.
(BCACHE_HASH): Define, the macro equivalent of the hash () function.
(lookup_cache): Bypass if ignore_duplicates (new static var) is
set; use BCACHE_HASH macro ifdef BCACHE_USE_INLINED_HASH;
optimize (reduce) memcmp() call.
(bcache_ignore_duplicates,bcache_eliminate_duplicates): New
functions, to control turning on/off duplicate handling at
runtime.
* bcache.h (bcache_ignore_duplicates,bcache_eliminate_duplicates):
Declare.
Index: gdb/config/pa/tm-hppa.h
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/config/pa/tm-hppa.h gdb/config/pa/tm-hppa.h
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/config/pa/tm-hppa.h Tue Nov 2 16:04:12 1999
--- gdb/config/pa/tm-hppa.h Thu Nov 4 13:28:28 1999
***************
*** 811,816 ****
--- 811,818 ----
|| ((symname)[0] == '$' && isdigit ((symname)[1])) \
))
+ #define BCACHE_ALLOW_DUPLICATES
+
/* Here's how to step off a permanent breakpoint. */
#define SKIP_PERMANENT_BREAKPOINT (hppa_skip_permanent_breakpoint)
extern void hppa_skip_permanent_breakpoint (void);
Index: gdb/bcache.c
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.c gdb/bcache.c
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.c Thu Nov 4 10:57:42 1999
--- gdb/bcache.c Thu Nov 4 13:28:20 1999
***************
*** 24,35 ****
--- 24,71 ----
#include "bcache.h"
#include "gdb_string.h" /* For memcpy declaration */
+ /* CM: The hash function is called a LOT, so have an inlined version.
+ This should help scaling to larger apps. Comment out the following
+ define to use the old method. */
+
+ #define BCACHE_USE_INLINED_HASH
+
/* Prototypes for local functions. */
+ #ifndef BCACHE_USE_INLINED_HASH
static unsigned int hash PARAMS ((void *, int));
+ #endif
static void *lookup_cache PARAMS ((void *, int, int, struct bcache *));
+ #ifdef BCACHE_USE_INLINED_HASH
+
+ /* CM: The hash method should exactly match the function defined below. */
+
+ #define BCACHE_HASH(retval, bytes, temp_count) \
+ { \
+ int bcache_hash_count = (temp_count); \
+ unsigned int bcache_hash_len; \
+ unsigned long bcache_hash_hashval; \
+ unsigned int bcache_hash_c; \
+ const unsigned char *bcache_hash_data = (bytes); \
+ \
+ bcache_hash_hashval = 0; \
+ bcache_hash_len = 0; \
+ while (bcache_hash_count-- > 0) \
+ { \
+ bcache_hash_c = *bcache_hash_data++; \
+ bcache_hash_hashval += bcache_hash_c + (bcache_hash_c << 17); \
+ bcache_hash_hashval ^= bcache_hash_hashval >> 2; \
+ ++bcache_hash_len; \
+ } \
+ bcache_hash_hashval += bcache_hash_len + (bcache_hash_len << 17); \
+ bcache_hash_hashval ^= bcache_hash_hashval >> 2; \
+ (retval) = (bcache_hash_hashval % BCACHE_HASHSIZE); \
+ }
+
+ #else
+
/* FIXME: Incredibly simplistic hash generator. Probably way too expensive
(consider long strings) and unlikely to have good distribution across hash
values for typical input. */
***************
*** 58,63 ****
--- 94,123 ----
return (hashval % BCACHE_HASHSIZE);
}
+ #endif /* BCACHE_USE_INLINED_HASH */
+
+ /* srikanth, 990314, JAGaa80452, the bcache seems to be an extremely high
+ overhead data structure. See that the overhead (which is really every
+ byte that is not used to store GDB's data i.e., cells used for
+ housekeeping info like pointers, hash chain heads etc.,) is of the order
+ of O(m * n * 64k) where m is the number of load modules compiled with -g,
+ and n is the number of strings that have unique length. This spells doom
+ for applications with a large number of shared libraries (m increases)
+ and C++ (n increases since we also stick demangled names into the cache.)
+ When debugging HP's C compiler more than 140 MB or about 48% of memory is
+ due to the bcache overhead. Not the data, just the overhead!
+
+ Further research, communication with Cygnus and Fred Fish (the original
+ implementor) shows that this module is extremely effective, saving as much
+ as 60% in gcc + stabs + linux combination. So I am disabling it just for
+ the Wildebeest and only for non-doom mode. */
+
+ #ifdef BCACHE_ALLOW_DUPLICATES
+ static int ignore_duplicates = 1;
+ #else
+ static int ignore_duplicates = 0;
+ #endif
+
static void *
lookup_cache (bytes, count, hashval, bcachep)
void *bytes;
***************
*** 69,81 ****
struct hashlink **hashtablep;
struct hashlink *linkp;
hashtablep = bcachep->indextable[count];
if (hashtablep != NULL)
{
linkp = hashtablep[hashval];
while (linkp != NULL)
{
! if (memcmp (BCACHE_DATA (linkp), bytes, count) == 0)
{
location = BCACHE_DATA (linkp);
break;
--- 129,153 ----
struct hashlink **hashtablep;
struct hashlink *linkp;
+ if (ignore_duplicates) /* don't care if it exists already */
+ return NULL;
+
hashtablep = bcachep->indextable[count];
if (hashtablep != NULL)
{
linkp = hashtablep[hashval];
while (linkp != NULL)
{
! /* CM: The following helps to avoid excessive (possibly shared
! library--i.e. indirect) function calls. This function is
! called a LOT, so this should help scaling to larger apps.
!
! The original line was:
! if (memcmp (BCACHE_DATA (linkp), bytes, count) == 0) */
!
! if ((*((char *) (BCACHE_DATA (linkp))) == *((char *) bytes))
! ? (memcmp (BCACHE_DATA (linkp), bytes, count) == 0)
! : 0)
{
location = BCACHE_DATA (linkp);
break;
***************
*** 108,114 ****
--- 180,192 ----
}
else
{
+ /* CM: Use the inlined hash function if requested. */
+ #ifdef BCACHE_USE_INLINED_HASH
+ BCACHE_HASH (hashval, bytes, count);
+ #else
hashval = hash (bytes, count);
+ #endif
+
location = lookup_cache (bytes, count, hashval, bcachep);
if (location != NULL)
{
***************
*** 139,144 ****
--- 217,235 ----
return (location);
}
+ /* srikanth, 990323, now we can toggle the bcache behavior on the fly */
+ void
+ bcache_ignore_duplicates ()
+ {
+ ignore_duplicates = 1;
+ }
+
+ void
+ bcache_eliminate_duplicates ()
+ {
+ ignore_duplicates = 0;
+ }
+
void
print_bcache_statistics (bcachep, id)
struct bcache *bcachep;
***************
*** 146,152 ****
{
struct hashlink **hashtablep;
struct hashlink *linkp;
! int tidx, tcount, hidx, hcount, lcount, lmax, temp, lmaxt, lmaxh;
for (lmax = lcount = tcount = hcount = tidx = 0; tidx < BCACHE_MAXLENGTH; tidx++)
{
--- 237,243 ----
{
struct hashlink **hashtablep;
struct hashlink *linkp;
! int tidx, tcount, hidx, hcount, lcount, lmax, temp, lmaxt = 0, lmaxh = 0;
for (lmax = lcount = tcount = hcount = tidx = 0; tidx < BCACHE_MAXLENGTH; tidx++)
{
Index: gdb/bcache.h
/opt/gnu/bin/diff -r -c -N /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.h gdb/bcache.h
*** /view/guo.wdb.c//CLO/Components/WDB/Src/gnu/gdb/bcache.h Thu Nov 4 11:11:13 1999
--- gdb/bcache.h Thu Nov 4 11:12:53 1999
***************
*** 68,73 ****
--- 68,79 ----
bcache PARAMS ((void *bytes, int count, struct bcache * bcachep));
extern void
+ bcache_ignore_duplicates PARAMS (());
+
+ extern void
+ bcache_eliminate_duplicates PARAMS (());
+
+ extern void
print_bcache_statistics PARAMS ((struct bcache *, char *));
#endif /* BCACHE_H */
^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: (patch) hpjyg09: bcache optimizations
  1999-11-04 13:48 Jimmy Guo
@ 1999-11-05  9:59 ` Jim Blandy
  1999-11-05 10:50   ` Srikanth Adayapalam
  1999-12-16  0:26   ` Jeffrey A Law
  0 siblings, 2 replies; 7+ messages in thread
From: Jim Blandy @ 1999-11-05 9:59 UTC (permalink / raw)
  To: Jimmy Guo; +Cc: gdb-patches

The first red flag here is that this patch adds a new target macro,
BCACHE_ALLOW_DUPLICATES (in config/pa/tm-hppa.h), which is actually
dependent on the application being debugged, not the architecture.
Whether the bcache helps you depends on the contents of your debug
info, not whether it's running on a PA microprocessor. Hmm.

I'd like to look at this problem on a larger scale: I've read
Srikanth's comment in the patch, explaining why the bcache imposes too
much overhead for some code while it is still quite effective for
other code.

But this situation still sounds weird. The bcache is a very simple
thing --- bcache.h and bcache.c total 289 lines of code. You give it a
string of bytes, and it either adds a copy of your string to its hash
table, or finds an identical copy already present. Either way, it
hands you back a pointer to its stashed copy. So, a bcache is helpful
if you expect a lot of duplicates. In Linux's libc.so, the bcache
reduces the space required for the psymtabs by half.

What I can't understand is why such a simple data structure needs to
have such a terrible overhead. Srikanth writes:

    See that the overhead (which is really every byte that is not used
    to store GDB's data i.e., cells used for housekeeping info like
    pointers, hash chain heads etc.,) is of the order of O(m * n * 64k)
    where m is the number of load modules compiled with -g, and n is
    the number of strings that have unique length. This spells doom
    for applications with a large number of shared libraries (m
    increases) and C++ (n increases since we also stick demangled names
    into the cache.)
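The byte cache described in this thread, hand it a byte string and get back a pointer to the single stashed copy, can be sketched as a toy intern table. All names and sizes below are illustrative only, not GDB's actual bcache.c:

```c
#include <stdlib.h>
#include <string.h>

/* Toy byte cache: one chained hash table that interns byte strings.
   Illustrative sketch only; GDB's real implementation lives in
   bcache.c and additionally indexes by string length.  */
#define TOY_NBUCKETS 1024

struct toy_entry
{
  struct toy_entry *next;
  int count;
  unsigned char data[1];	/* 'count' bytes, allocated inline */
};

static struct toy_entry *toy_buckets[TOY_NBUCKETS];

static unsigned int
toy_hash (const unsigned char *data, int count)
{
  unsigned long h = 0;
  while (count-- > 0)
    {
      h += *data + (*data << 17);	/* same mixing step as bcache.c */
      data++;
      h ^= h >> 2;
    }
  return h % TOY_NBUCKETS;
}

void *
toy_bcache (const void *bytes, int count)
{
  unsigned int h = toy_hash (bytes, count);
  struct toy_entry *e;

  for (e = toy_buckets[h]; e != NULL; e = e->next)
    if (e->count == count && memcmp (e->data, bytes, count) == 0)
      return e->data;		/* duplicate: hand back the stashed copy */

  e = malloc (sizeof *e + count);
  e->count = count;
  memcpy (e->data, bytes, count);
  e->next = toy_buckets[h];
  toy_buckets[h] = e;
  return e->data;
}
```

Interning the same bytes twice returns the same pointer, which is why the structure pays off only when the input actually contains duplicates.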
    When debugging HP's C compiler more than 140 MB or about 48% of
    memory is due to the bcache overhead. Not the data, just the
    overhead!

I'm sure that's true, but it seems completely unnecessary. One should
be able to design a bcache with much less overhead.

Looking at the output of "maint print statistics" from running GDB on
itself, I see:

    Average hash table population: 8%
    Average hash table population: (not applicable)
    Average hash table population: (not applicable)
    Average hash table population: 0%
    Average hash table population: 1%
    Average hash table population: 3%
    Average hash table population: 1%

(The "not applicable" items are for bcaches with no entries.) I don't
think those lower-level hash tables are being used very effectively,
if the hash function is distributing the items across ~3% of the
buckets. A good hash function distributes items across as many
different buckets as possible. I would guess that the bcache has never
been tuned.

So, while I understand the impulse to just disable the bcache, because
it's causing you trouble in your driving uses of the debugger, I think
the better solution, for both Cygnus and HP, is to fix the data
structure so it actually works. I would recommend:

- using a single-level hash table, with an initial size of about 32k
  buckets (based on looking at GDB debugging GDB)

- growing the hash table when the average chain length grows beyond a
  certain limit, so the time overhead remains the same as the problem
  size grows

- choosing a hash function by running GDB on your C compiler, doing
  "maint print statistics" to see how it's distributing items across
  buckets, and then tweaking the function to minimize your maximum
  chain length (this is kind of fun, and takes less time than you'd
  think, since hash functions have almost no *required* semantics)

- adding code to print_bcache_statistics to print the number of bytes
  spent on bcache overhead

Yes, this is more work than commenting it out.
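The growth rule in the second recommendation might look roughly like this sketch; the struct layout, field names, and threshold are invented for illustration and are not a proposed GDB interface:

```c
#include <stdlib.h>

/* Sketch of a grow-on-demand chained hash table: when the average
   chain length exceeds a limit, double the bucket count and re-bucket
   every node, so lookup cost stays flat as the table fills.  Nodes
   remember their full hash so rehashing needs no recomputation.  */
struct node
{
  struct node *next;
  unsigned long hash;		/* full (un-modded) hash value */
};

struct table
{
  struct node **buckets;
  size_t nbuckets;
  size_t nentries;
};

#define MAX_AVG_CHAIN 3

static void
maybe_grow (struct table *t)
{
  if (t->nentries / t->nbuckets < MAX_AVG_CHAIN)
    return;			/* chains still short enough */

  size_t newsize = t->nbuckets * 2;
  struct node **nb = calloc (newsize, sizeof *nb);

  for (size_t i = 0; i < t->nbuckets; i++)
    for (struct node *n = t->buckets[i], *next; n != NULL; n = next)
      {
	next = n->next;
	size_t b = n->hash % newsize;	/* re-bucket under the new size */
	n->next = nb[b];
	nb[b] = n;
      }

  free (t->buckets);
  t->buckets = nb;
  t->nbuckets = newsize;
}
```

Calling maybe_grow after each insertion keeps the average chain length bounded by the threshold, which is the property being asked for above.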
But we can't perpetually choose the quick fix over the real cure.
Especially when the quick fix adds complexity, like more CPP
conditionals and target parameters that aren't target parameters.

In this light, I don't think we should apply this patch.

From guo@cup.hp.com Fri Nov 05 10:24:00 1999
From: Jimmy Guo <guo@cup.hp.com>
To: Jim Blandy <jimb@cygnus.com>
Cc: gdb-patches@sourceware.cygnus.com
Subject: Re: (patch) hpjyg09: bcache optimizations
Date: Fri, 05 Nov 1999 10:24:00 -0000
Message-id: <Pine.LNX.4.10.9911051016470.11755-100000@hpcll168.cup.hp.com>
References: <npaeoskevw.fsf@zwingli.cygnus.com>
X-SW-Source: 1999-q4/msg00189.html
Content-length: 462

Given the questions and concerns over this patch, I will 'hash' it a
bit more in the direction as suggested here ... just ignore this patch.

- Jimmy Guo

>Yes, this is more work than commenting it out. But we can't
>perpetually choose the quick fix over the real cure. Especially when
>the quick fix adds complexity, like more CPP conditionals and target
>parameters that aren't target parameters.
>
>In this light, I don't think we should apply this patch.

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: (patch) hpjyg09: bcache optimizations
  1999-11-05  9:59 ` Jim Blandy
@ 1999-11-05 10:50 ` Srikanth Adayapalam
  1999-11-05 13:29   ` Jim Blandy
  1999-12-16  0:26 ` Jeffrey A Law
  1 sibling, 1 reply; 7+ messages in thread
From: Srikanth Adayapalam @ 1999-11-05 10:50 UTC (permalink / raw)
  To: jimb; +Cc: gdb-patches

> The first red flag here is that this patch adds a new target macro,
> BCACHE_ALLOW_DUPLICATES (in config/pa/tm-hppa.h), which is actually
> dependent on the application being debugged, not the architecture.
> Whether the bcache helps you depends on the contents of your debug
> info, not whether it's running on a PA microprocessor. Hmm.

While it is true that it is not tied to the microprocessor (and the
placement of the macro is not ideal, perhaps), it is tied somewhat to
the HP-UX platform in that:

o The bcache is beneficial when there is a high degree of duplication
  and redundancy in the debug info generated by a set of compilers for
  a platform (e.g. stabs + gcc + linux),

o and is a useless and high-overhead item if there is little or no
  duplication (e.g. som + HP cc + HP-UX).

On HP-UX systems, we have a tool called pxdb, which is a linker post
processor (or a debugger preprocessor, if you want to look at it that
way) which runs through the a.out and removes all duplicates. So when
gdb enters the picture there is no duplicate debug info at all.
Further, this overhead is not particular to the C compiler; it is seen
for all applications compiled with HP compilers. The C compiler's case
just happened to be mentioned. That we waste 140 MB trying to
eliminate the non-existent dups was a big concern for us.

Hope this clarifies the picture. As for how to proceed with the patch,
I'll leave it to Jimmy and this forum to evolve the best course.

Thanks
Srikanth

From jtc@redback.com Fri Nov 05 11:17:00 1999
From: jtc@redback.com (J.T. Conklin)
To: Jim Blandy <jimb@cygnus.com>
Cc: Jimmy Guo <guo@cup.hp.com>, gdb-patches@sourceware.cygnus.com
Subject: Re: (patch) hpjyg09: bcache optimizations
Date: Fri, 05 Nov 1999 11:17:00 -0000
Message-id: <5mwvrw92gx.fsf@jtc.redbacknetworks.com>
References: <Pine.LNX.4.10.9911041332020.15719-100000@hpcll168.cup.hp.com> <npaeoskevw.fsf@zwingli.cygnus.com>
X-SW-Source: 1999-q4/msg00191.html
Content-length: 2247

>>>>> "Jim" == Jim Blandy <jimb@cygnus.com> writes:

Jim> But this situation still sounds weird. The bcache is a very
Jim> simple thing --- bcache.h and bcache.c total 289 lines of code.
Jim> You give it a string of bytes, and it either adds a copy of your
Jim> string to its hash table, or finds an identical copy already
Jim> present. Either way, it hands you back a pointer to its stashed
Jim> copy. So, a bcache is helpful if you expect a lot of duplicates.

This brings back memories. Many years ago, I discovered CVS had
terrible hash behavior. Date and Revision strings were used as keys,
and each character (of which there were only digits, '.', and '/') was
weighted equally. I can't recall exactly, but I think the original
hash function resulted in ~10% bucket utilization --- the remaining
~90% would not be used by any valid keys. And even among the remaining
~10%, the distribution was non-uniform.

I changed it to one of the functions compared in the Red Dragon book's
discussion of hash functions and their suitability for hashing
identifiers (which tend to have a lot of duplication/redundancy). This
change resulted in a uniform distribution across all buckets and a
corresponding performance improvement.

For the longest time, I used to keep a histogram plot of the original
and new hash functions in a folder, and would show them to anyone who
would listen to the tale: switching from a naive to a complex data
structure will not automatically result in a performance improvement.
So I agree with Jim, fixing bcache to use a better hash table is a
much better approach than micro-optimizing the existing hash function
by making it an inline function or a macro. Many thanks for spending
the time and tracking down the root cause.

For reference, here is the hash function I used:

    unsigned int
    hashpjw (const char *x)
    {
      unsigned int h = 0;
      unsigned int g;

      while (*x != 0)
        {
          h = (h << 4) + *x++;
          if ((g = h & 0xf0000000) != 0)
            h = (h ^ (g >> 24)) ^ g;
        }
      return h;
    }

It will have to be modified slightly for use in bcache, but it may
result in acceptable behavior in this case as well.

	--jtc

--
J.T. Conklin
RedBack Networks

From tromey@cygnus.com Fri Nov 05 12:00:00 1999
From: Tom Tromey <tromey@cygnus.com>
To: gdb-patches@sourceware.cygnus.com
Subject: Re: (patch) hpjyg09: bcache optimizations
Date: Fri, 05 Nov 1999 12:00:00 -0000
Message-id: <87u2n0lnbn.fsf@cygnus.com>
References: <Pine.LNX.4.10.9911041332020.15719-100000@hpcll168.cup.hp.com> <npaeoskevw.fsf@zwingli.cygnus.com>
X-SW-Source: 1999-q4/msg00192.html
Content-length: 427

>>>>> "Jim" == Jim Blandy <jimb@cygnus.com> writes:

Jim> - growing the hash table when the average chain length grows beyond
Jim>   a certain limit, so the time overhead remains the same as the
Jim>   problem size grows

Greg McGary recently pointed out to me that libiberty now contains a
hash table implementation. You still have to write the hash function,
but the growing, etc., are handled automatically.

Tom

From guo@cup.hp.com Fri Nov 05 12:38:00 1999
From: Jimmy Guo <guo@cup.hp.com>
To: Srikanth Adayapalam <srikanth@cup.hp.com>
Cc: jimb@cygnus.com, gdb-patches@sourceware.cygnus.com
Subject: Re: (patch) hpjyg09: bcache optimizations
Date: Fri, 05 Nov 1999 12:38:00 -0000
Message-id: <Pine.LNX.4.10.9911051142210.12517-100000@hpcll168.cup.hp.com>
References: <199911051850.KAA28555@flytrap.cup.hp.com>
X-SW-Source: 1999-q4/msg00193.html
Content-length: 1485

Given Jim Blandy's comments:
  http://sourceware.cygnus.com/ml/gdb-patches/1999-q4/msg00188.html
and Srikanth's clarification:
  http://sourceware.cygnus.com/ml/gdb-patches/1999-q4/msg00190.html
I think we should look at the issue of bypassing and the issue of the
bcache hash implementation separately.

Bcache is introduced to deal with duplicate symbols. Where it is not
necessary, it just adds time and space overheads, and that could be
significant for HP customers debugging large applications. The tuning
of the bcache hash is probably necessary, but that's a general and
separate issue.

I'd like to tackle the bypass scheme initially to introduce an
acceptable bypass mechanism which gives gdb users with platform and
compiler combinations like HP-UX + HP cc a faster / smaller footprint
gdb for large apps -- Controlling the bypass through a target-dependent
macro def is out. The bcache_ignore_duplicates() and
bcache_filter_duplicates() calls still need to be added, and be used to
bypass bcache when detecting that the HP compiler is used (and if gcc
is used, gdb will still use bcache). Comments?

As to the macro implementation of hash(), that is a smaller issue and
will be looked at with some more performance benchmarking to answer
whether or not "Inlining such a relatively large body of code is
definitely not a definite win". Again, the focus will be debugging
large applications, as typical for HP customers.

- Jimmy Guo, guo@cup.hp.com

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: (patch) hpjyg09: bcache optimizations
  1999-11-05 10:50   ` Srikanth Adayapalam
@ 1999-11-05 13:29     ` Jim Blandy
  1999-12-15  1:16       ` Jeffrey A Law
  0 siblings, 1 reply; 7+ messages in thread
From: Jim Blandy @ 1999-11-05 13:29 UTC (permalink / raw)
To: Srikanth Adayapalam; +Cc: gdb-patches

> While it is true that it is not tied to the microprocessor,
> (and the placement of the macro is not ideal perhaps,) it is tied
> somewhat to the HP-UX platform in that:

Ahh.  That, I didn't know.  So it would be correct to put the flag in
a file associated with a particular toolchain.

> o The bcache is beneficial when there is a high degree of
>   duplication and redundancy in the debug info generated by a set
>   of compilers for a platform (e.g. stabs + gcc + linux),
>
> o and is a useless and high-overhead item if there is little or no
>   duplication (e.g. som + HP cc + HP-UX).
>
> On HP-UX systems, we have a tool called pxdb, which is a linker
> post-processor (or a debugger preprocessor, if you want to look at
> it that way) which runs through the a.out and removes all
> duplicates.  So when gdb enters the picture there is no duplicate
> debug info at all.  Further, this overhead is not particular to the
> C compiler; it is seen for all applications compiled with HP
> compilers.  The C compiler's case just happened to be mentioned.
> That we waste 140 MB trying to eliminate the non-existent dups was a
> big concern for us.

It's a big concern for us, too --- I'll explain why.

The memory consumed by the bcache depends only on the number of unique
strings it contains --- not on the presence or absence of duplicates.
Using pxdb has no effect on the amount of bcache memory overhead.  So
if HP is having problems with bcache memory overhead, others will have
problems, too.  It's not a toolchain-specific problem.  But disabling
it is a toolchain-specific fix.
From guo@cup.hp.com Fri Nov 05 13:36:00 1999
From: Jimmy Guo <guo@cup.hp.com>
To: gdb-patches@sourceware.cygnus.com
Subject: (patch) hpjyg13: dejagnu/lib/framework.exp
Date: Fri, 05 Nov 1999 13:36:00 -0000
Message-id: <Pine.LNX.4.10.9911051330510.12973-100000@hpcll168.cup.hp.com>
X-SW-Source: 1999-q4/msg00197.html
Content-length: 2049

Extends setup_xfail PRMS handling.  See ChangeLog.

ChangeLog:

1999-11-05  Jimmy Guo  <guo@cup.hp.com>

	* framework.exp (setup_xfail): extend PRMS number pattern to
	anything not containing '-' (target triplet spec).  This
	enables cleaner and more precise xfail message specification,
	where target-specific 'PRMS' numbers are specified in the
	corresponding setup_xfail call instead of in the literal fail
	string (e.g. HP's bug tracking numbers associated with HP
	compilers on HP targets).

Index: dejagnu/lib/framework.exp
/opt/gnu/bin/diff -r -c -N /view/guo.import//CLO/Components/WDB/Src/gnu/dejagnu/lib/framework.exp dejagnu/lib/framework.exp
*** /view/guo.import//CLO/Components/WDB/Src/gnu/dejagnu/lib/framework.exp	Tue Jun 22 22:46:11 1999
--- dejagnu/lib/framework.exp	Fri Jul 16 13:30:44 1999
***************
*** 417,424 ****
  # Setup a flag to control whether a failure is expected or not
  #
  # Multiple target triplet patterns can be specified for targets
! # for which the test fails. A decimal number can be specified,
! # which is the PRMS number.
  #
  proc setup_xfail { args } {
      global xfail_flag
--- 417,424 ----
  # Setup a flag to control whether a failure is expected or not
  #
  # Multiple target triplet patterns can be specified for targets
! # for which the test fails. A bug report ID can be specified,
! # which is a string without '-'.
  #
  proc setup_xfail { args } {
      global xfail_flag
***************
*** 428,435 ****
      set argc [ llength $args ]
      for { set i 0 } { $i < $argc } { incr i } {
  	set sub_arg [ lindex $args $i ]
! 	# is a prms number. we assume this is a number with no characters
! 	if [regexp "^\[0-9\]+$" $sub_arg] {
  	    set xfail_prms $sub_arg
  	    continue
  	}
--- 428,435 ----
      set argc [ llength $args ]
      for { set i 0 } { $i < $argc } { incr i } {
  	set sub_arg [ lindex $args $i ]
! 	# is a prms number. we assume this is a string with no '-' characters
! 	if [regexp "^\[^\-\]+$" $sub_arg] {
  	    set xfail_prms $sub_arg
  	    continue
  	}

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: (patch) hpjyg09: bcache optimizations
  1999-11-05 13:29     ` Jim Blandy
@ 1999-12-15  1:16       ` Jeffrey A Law
  0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1999-12-15 1:16 UTC (permalink / raw)
To: Jim Blandy; +Cc: Srikanth Adayapalam, gdb-patches

In message <npyacciqf8.fsf@zwingli.cygnus.com> you write:
>
> > While it is true that it is not tied to the microprocessor,
> > (and the placement of the macro is not ideal perhaps,) it is tied
> > somewhat to the HP-UX platform in that:
>
> Ahh.  That, I didn't know.  So it would be correct to put the flag in
> a file associated with a particular toolchain.

Not in this case.  While HP's tools arrange (via pxdb) to avoid
duplicates, other HP toolchains do not.

Short term, none of the GNU toolchains on the PA remove duplicates.
That will likely change for the PA64 toolchain (dwarf2 + elf), but it
is unlikely to ever change for the PA32 toolchain (stabs + som).

In general, if we find ourselves tuning towards one particular
toolchain, then we are probably going down the wrong road.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: (patch) hpjyg09: bcache optimizations
  1999-11-05  9:59   ` Jim Blandy
  1999-11-05 10:50     ` Srikanth Adayapalam
@ 1999-12-16  0:26   ` Jeffrey A Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1999-12-16 0:26 UTC (permalink / raw)
To: Jim Blandy; +Cc: Jimmy Guo, gdb-patches

In message <npaeoskevw.fsf@zwingli.cygnus.com> you write:
> So, while I understand the impulse to just disable the bcache, because
> it's causing you trouble in your driving uses of the debugger, I think
> the better solution, for both Cygnus and HP, is to fix the data
> structure so it actually works.  I would recommend:
> [ ... ]
> - growing the hash table when the average chain length grows beyond
>   a certain limit, so the time overhead remains the same as the
>   problem size grows

FWIW, libiberty now includes some generic hash table capabilities,
including automatic expansion of the hash table once it hits a certain
capacity.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread
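[Editorial note: the growth policy Jim Blandy recommends, and which
libiberty's hash table automates, can be sketched in a few dozen lines:
a chained table that doubles its bucket array whenever the average
chain length crosses a limit, so lookup cost stays roughly constant as
the symbol count grows.  Names and thresholds below are illustrative;
this is not libiberty's actual htab interface.]

```c
#include <stdlib.h>
#include <string.h>

struct entry
{
  char *key;
  struct entry *next;
};

struct htab_sketch
{
  struct entry **buckets;
  size_t n_buckets;
  size_t n_entries;
};

#define MAX_AVG_CHAIN 4   /* grow when n_entries / n_buckets reaches this */

static unsigned int
hash_string (const char *s)
{
  unsigned int h = 0, g;
  while (*s)
    {
      h = (h << 4) + (unsigned char) *s++;
      if ((g = h & 0xf0000000) != 0)
        h = (h ^ (g >> 24)) ^ g;
    }
  return h;
}

static void
htab_init (struct htab_sketch *t)
{
  t->n_buckets = 8;
  t->n_entries = 0;
  t->buckets = calloc (t->n_buckets, sizeof *t->buckets);
}

/* Double the bucket array and rehash every entry into it.  */
static void
htab_grow (struct htab_sketch *t)
{
  size_t new_n = t->n_buckets * 2;
  struct entry **nb = calloc (new_n, sizeof *nb);
  size_t i;

  for (i = 0; i < t->n_buckets; i++)
    while (t->buckets[i])
      {
        struct entry *e = t->buckets[i];
        size_t slot = hash_string (e->key) % new_n;
        t->buckets[i] = e->next;
        e->next = nb[slot];
        nb[slot] = e;
      }
  free (t->buckets);
  t->buckets = nb;
  t->n_buckets = new_n;
}

/* Return the unique cached copy of KEY, inserting it first if absent.
   Duplicate interns return the same pointer, which is the property
   bcache relies on.  */
static const char *
htab_intern (struct htab_sketch *t, const char *key)
{
  size_t slot = hash_string (key) % t->n_buckets;
  struct entry *e;

  for (e = t->buckets[slot]; e; e = e->next)
    if (strcmp (e->key, key) == 0)
      return e->key;

  if (t->n_entries / t->n_buckets >= MAX_AVG_CHAIN)
    {
      htab_grow (t);
      slot = hash_string (key) % t->n_buckets;
    }

  e = malloc (sizeof *e);
  e->key = malloc (strlen (key) + 1);
  strcpy (e->key, key);
  e->next = t->buckets[slot];
  t->buckets[slot] = e;
  t->n_entries++;
  return e->key;
}
```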
end of thread, other threads:[~1999-12-16 0:26 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <Pine.LNX.4.10.9911041529510.15357-100000@hpcll168.cup.hp.com>
1999-11-04 17:31 ` (patch) hpjyg09: bcache optimizations Andrew Cagney
1999-11-04 13:48 Jimmy Guo
1999-11-05 9:59 ` Jim Blandy
1999-11-05 10:50 ` Srikanth Adayapalam
1999-11-05 13:29 ` Jim Blandy
1999-12-15 1:16 ` Jeffrey A Law
1999-12-16 0:26 ` Jeffrey A Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox