From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jim Blandy To: Jimmy Guo Cc: gdb-patches@sourceware.cygnus.com Subject: Re: (patch) hpjyg09: bcache optimizations Date: Fri, 05 Nov 1999 09:59:00 -0000 Message-id: References: X-SW-Source: 1999-q4/msg00188.html The first red flag here is that this patch adds a new target macro, BCACHE_ALLOW_DUPLICATES (in config/pa/tm-hppa.h), which is actually dependent on the application being debugged, not the architecture. Whether the bcache helps you depends on the contents of your debug info, not whether it's running on a PA microprocessor. Hmm. I'd like to look at this problem on a larger scale: I've read Srikanth's comment in the patch, explaining why the bcache imposes too much overhead for some code while it is still quite effective for other code. But this situation still sounds weird. The bcache is a very simple thing --- bcache.h and bcache.c total 289 lines of code. You give it a string of bytes, and it either adds a copy of your string to its hash table, or finds an identical copy already present. Either way, it hands you back a pointer to its stashed copy. So, a bcache is helpful if you expect a lot of duplicates. In Linux's libc.so, the bcache reduces the space required for the psymtabs by half. What I can't understand is why such a simple data structure needs to have such a terrible overhead. Srikanth writes: See that the overhead (which is really every byte that is not used to store GDB's data i.e., cells used for housekeeping info like pointers, hash chain heads etc.,) is of the order of O(m * n * 64k) where m is the number of load modules compiled with -g, and n is the number of strings that have unique length. This spells doom for applications wih a large number of shared libraries (m increases) and C++ (n increases since we also stick demangled names into the cache.) When debugging HP's C compiler more than 140 MB or about 48% of memory is due to the bcache overhead. Not the data just the overhead ! I'm sure that's true, but it seems completely unnecessary. One should be able to design a bcache with much less overhead. Looking at the output of "maint print statistics" from running GDB on itself, I see: Average hash table population: 8% Average hash table population: (not applicable) Average hash table population: (not applicable) Average hash table population: 0% Average hash table population: 1% Average hash table population: 3% Average hash table population: 1% (The "not applicable" items are for bcaches with no entries.) I don't think those lower-level hash tables are being used very effectively, if the hash function is distributing the items across ~3% of the buckets. A good hash function distributes items across as many different buckets as possible. I would guess that the bcache has never been tuned. So, while I understand the impulse to just disable the bcache, because it's causing you trouble in your driving uses of the debugger, I think the better solution, for both Cygnus and HP, is to fix the data structure so it actually works. I would recommend: - using a single-level hash table, with an initial size of about 32k buckets (based on looking at GDB debugging GDB) - growing the hash table when the average chain length grows beyond a certain limit, so the time overhead remains the same as the problem size grows - choosing a hash function by running GDB on your C compiler, doing "maint print statistics" to see how it's distributing items across buckets, and then tweaking the function to minimize your maximum chain length (this is kind of fun, and takes less time than you'd think, since hash functions have almost no *required* semantics) - adding code to print_bcache_statistics to print the number of bytes spend on bcache overhead Yes, this is more work than commenting it out. But we can't perpetually choose the quick fix over the real cure. Especially when the quick fix adds complexity, like more CPP conditionals and target parameters that aren't target parameters. In this light, I don't think we should apply this patch. >From guo@cup.hp.com Fri Nov 05 10:24:00 1999 From: Jimmy Guo To: Jim Blandy Cc: gdb-patches@sourceware.cygnus.com Subject: Re: (patch) hpjyg09: bcache optimizations Date: Fri, 05 Nov 1999 10:24:00 -0000 Message-id: References: X-SW-Source: 1999-q4/msg00189.html Content-length: 462 Given the questions and concerns over this patch, I will 'hash' it a bit more in the direction as suggested here ... just ignore this patch. - Jimmy Guo >Yes, this is more work than commenting it out. But we can't >perpetually choose the quick fix over the real cure. Especially when >the quick fix adds complexity, like more CPP conditionals and target >parameters that aren't target parameters. > >In this light, I don't think we should apply this patch.