To: Kevin Buettner
Cc: gdb-patches@sources.redhat.com, Jim Blandy, Elena Zannoni
Subject: Re: [rfc] split up symtab.h
From: David Carlton
Date: Tue, 08 Oct 2002 15:05:00 -0000
In-Reply-To: <1021008205446.ZM16832@localhost.localdomain>

On Tue, 8 Oct 2002 13:54:46 -0700, Kevin Buettner said:

> Aside from the build time issue, are there other reasons why
> splitting up symtab.h is desirable?

Honestly, I have a hard time answering this question: it's easy enough
to weigh the practical considerations, but the philosophical side of
things isn't so clear to me.

I guess that, if you pinned me down, I'd say that I start with a
default assumption that an include file should normally correspond to
a single construct, which typically means a single structure. But I'm
not sure I really believe that: I haven't thought enough about its
implications.

To look at it another way: why should 'struct minimal_symbol' and
'struct linetable' both be in the same header file?
The best answer that I can come up with is:

* 'struct symtab' has a member that is a pointer to a 'struct
  linetable'.

* 'struct symtab' also stores a bunch of 'struct symbol's.
  (Indirectly: via 'struct blockvector' and then 'struct block'.)

* 'struct minimal_symbol' is kind of like 'struct symbol'.

That, to me, is not a very good reason for both of those structs to be
in the same header file.

> Here are several reasons for not splitting it:

> 1) The list of includes for many .c files will (I suspect) grow
> quite a bit. If it turns out that you'll be replacing one
> #include statement with five or six (per source file), I
> can't really see that making the split was an advantage.

I honestly don't know how many includes, on average, each #include
"symtab.h" would turn into. Five might be right, or it might be just a
tad high. (It also depends on whether or not the #include files for
minimal_symbol, symbol, and partial_symbol are allowed to include the
one for general_symbol_info.)

I've generated various correlations between uses of different
structures (I knew my number theory background would come in useful
somehow!), but they don't give me a clear answer to this. One example
is that minimal_symbol, symtab, and symbol are the most commonly used
structures in symtab.h, but while 111 files refer to at least one of
these structures, only 29 refer to all three. (And it's clear that the
conceptual link from minimal_symbol to symtab passes via symbol: only
4 files refer to minimal_symbol and symtab but _not_ to symbol, and
only 8 files refer only to symbol but not to either of minimal_symbol
or symtab. But there are tons of files that mention either
minimal_symbol or symtab but not both.)

I'm certainly not looking forward to changing existing files.
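Counts like the ones above could be gathered with a short script along
the following lines. This is only a sketch, not the method actually
used for the figures in this message: the function names
(structs_used, tally, tally_directory) are hypothetical, and treating
the literal text 'struct NAME' as a "use" is an assumption that will
also count mentions in comments.

```python
import re
from collections import Counter
from pathlib import Path

# Struct names taken from the message; everything else here is illustrative.
STRUCTS = ("minimal_symbol", "symtab", "symbol")


def structs_used(source_text, structs=STRUCTS):
    """Return the set of names that appear as 'struct NAME' in the text."""
    used = set()
    for name in structs:
        # The trailing \b stops 'struct symtab' from matching
        # 'struct symtab_and_line', since '_' counts as a word character.
        if re.search(r"\bstruct\s+" + re.escape(name) + r"\b", source_text):
            used.add(name)
    return used


def tally(files):
    """Count files per exact combination of structs used.

    files maps a file name to its source text. Files that use none of
    the structs are left out of the result.
    """
    counts = Counter()
    for text in files.values():
        used = structs_used(text)
        if used:
            counts[frozenset(used)] += 1
    return counts


def tally_directory(root):
    """Run tally() over every .c file directly inside the directory root."""
    files = {p.name: p.read_text(errors="replace") for p in Path(root).glob("*.c")}
    return tally(files)
```

Summing the counts over all combinations gives the number of files
that refer to at least one of the structures, and the entry for the
full three-element set gives the number that refer to all three.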
Having said that, I don't think it would impose a large future
maintenance burden: if somebody, say, adds a function to an existing
file that causes one of the new headers to have to be pulled in, the
compiler will let that person know, and it's easy enough to use grep
to figure out which file to include.

> 2) One could argue that modifying symtab.h *should* be a heavyweight
> operation. I.e., you're modifying something that's at the very
> heart of gdb and you need to take great care.

This is, to me, a really important issue. But I'm honestly not sure
whether this argues for or against breaking up symtab.h.

For example, since symbol stuff is so important, I like to make small
changes and recompile GDB after each change (and even to run a subset
of the test suite after each change) to help reassure me that I didn't
screw anything up. And having long compilation times really works
against me there: if it takes longer to recompile the program than to
make the change, then I'll wait until I've made several changes before
recompiling.

Or, to give another example, sometimes I make a change, and then after
using it for a little while, I realize that the change is a little
subtler than I first thought. So I want to include a comment that
explains the situation a little better: but, just by including a
comment, I've doomed myself to a large recompilation. (I certainly
wouldn't fix a typo in a comment in symtab.h, even if it's one that
I've made myself, unless I'm doing other changes to symtab.h: it's
just not worth the pain.)

Of course, one answer is to do the thinking before making the change
in the first place, and that's the best situation: but, alas, I'm not
a good enough programmer to always be able to foresee the implications
of my changes that way. (Obviously I should give changes time to
settle before submitting them as an RFA, but that's another matter
entirely.)
So, basically, the long compilation times are making it more painful
to follow what seems to me like good software engineering practices.

> 3) Makefile.in maintenance becomes harder due to the larger
> number of header files.

I guess; I don't have a lot of experience with that. I suppose the
problem there is that, if Makefile.in gets out of sync, then you might
not recompile when you're supposed to, and, unlike your first point,
this could lead to problems that programmers were unaware of. That
would be very bad. Too bad there's no way to generate those
dependencies automatically...

> I should note that I don't find any of the above reasons to be
> overly compelling. I just think that we need a better reason for
> making such a split than the build time consideration.

Yeah, I totally agree. I think that there should be some sort of
philosophical justification for when to split a header file, and I
certainly wouldn't argue with you when you say that changes to core
structures in symtab.h shouldn't be made lightly.

David Carlton
carlton@math.stanford.edu