From mboxrd@z Thu Jan 1 00:00:00 1970 From: Kevin Buettner To: Andrew Cagney Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 00:13:00 -0000 Message-id: <1000603071258.ZM32408@ocotillo.lan> References: <1000602075018.ZM29997@ocotillo.lan> <3937816C.E66B9AE0@cygnus.com> X-SW-Source: 2000-06/msg00028.html On Jun 2, 7:42pm, Andrew Cagney wrote: > The only concern I have is, given the slightly more complex nature of > the script (compared to PARAMS) there is a possibility that the > conversion re-orders or re-types the argument list. With that in mind, > should a pre-cursor to this be to least have prototypes for all > (global?) functions (-Wmissing-prototypes?) or only do the conversion > when there is a prototype visible? I've given this matter a lot of thought. I agree that it would be desirable to have prototypes for all functions. Unfortunately, while it is easy to generate prototypes, it's not so easy to know where to stick them. Also, even if we had prototypes in place, there's no guarantee that we'd catch the errors after a few builds because I think there's some code in gdb (though I don't know how much) that never gets built! (Due to ifdefs and near obsolete ports.) What we really need is a method for vetting all of the changes immediately after running the script. I.e, we need to make sure that the conversion does no reordering or retyping of any argument list. Also, we need to make sure that the rewritten function declaration is syntactically correct. While examining the diffs (made with the -u switch) an idea occurred to me. Consider the following example: diff -ur ../../orig/gdb-sourceware/wrapper.c ./wrapper.c --- ../../orig/gdb-sourceware/wrapper.c Sat May 27 17:10:27 2000 +++ ./wrapper.c Thu Jun 1 23:33:16 2000 @@ -57,11 +57,8 @@ static int wrap_parse_and_eval_type (char *); int -gdb_parse_exp_1 (stringptr, block, comma, expression) - char **stringptr; - struct block *block; - int comma; - struct expression **expression; +gdb_parse_exp_1 (char **stringptr, struct block *block, int comma, + struct expression **expression) { struct gdb_wrapper_arguments args; args.args[0].pointer = stringptr; In the above diff, the lines prepended with `-' represent the original K&R definition. And the lines prepended with `+' represent the transformed code. Moreover, the diff is extremely regular in this respect. So... If you take the lines which begin with `+', prepend the type on the line before the `-' lines and tack a semicolon onto the end, you end up with a prototype declaration. And, if you take the lines beginning with `-', again tack the type onto the front and put a function body underneath it, you have a K&R style (traditional) function definition. Now if you put these into a file with the prototype first and the K&R definition later on, you can run "gcc -Wall" on it to see if any warnings are produced. Obviously, if we get warnings, we need to look closer to see if something went wrong with the fix-decls conversion. Of course, there are other details to consider, like making sure that all of the types, structs, unions, and enums are declared. Also, in a source tree as big as gdb, we'll likely wind up with a number of functions with the same name, so some method of disambiguating these will be necessary. And then of course, there's the matter of no declared return type and other oddments. I've written a script called ``check-decls'' which performs these transformations on the diff output. When I run it on the above diff, it produces the following output (indented by four spaces by me for readability) struct block { int f0; }; struct expression { int f1; }; #define INLINE #define private #define CONST const #define NORETURN void init___ (void *); int gdb_parse_exp_1 (char **stringptr, struct block *block, int comma, struct expression **expression); int gdb_parse_exp_1 (stringptr, block, comma, expression) char **stringptr; struct block *block; int comma; struct expression **expression; { int ret; init___ (&ret); return ret; } void use___ (void) { } The use___ () function isn't interesting in this example, but it would be if there had been a static declaration. Here's what happens when I run it on *all* of the diffs: ocotillo:ptests$ ./check-decls prog.c ocotillo:ptests$ wc prog.c 50235 112228 960827 prog.c ocotillo:ptests$ gcc -c -Wall prog.c prog.c: In function `exit': prog.c:39303: warning: function declared `noreturn' has a `return' statement prog.c: At top level: prog.c:45903: parse error before `arg_type' prog.c: In function `value_primitive_field': prog.c:45907: declaration for parameter `arg_type' but no such parameter prog.c:45906: declaration for parameter `fieldno' but no such parameter prog.c:45905: declaration for parameter `offset' but no such parameter prog.c:45904: declaration for parameter `arg1' but no such parameter prog.c:45908: argument `arg_type' doesn't match prototype prog.c:5886: prototype declaration prog.c:45908: argument `arg1' doesn't match prototype prog.c:5886: prototype declaration The `exit' warning is due to the fact that there's a declaration and definition of exit() from standalone.c. It is of no concern. The error following this warning looks more serious. Here's the declaration and the definition of the function involved: value_ptr value_primitive_field (register value_ptr arg1, int offset, register int fieldno, register struct type *arg_type); value_ptr value_primitive_field (arg1, offset, fieldno, arg_type) register value_ptr arg1; int offset; register int fieldno; register struct type *arg_type; { value_ptr ret; init___ (&ret); return ret; } I looked at this for a long, long time and didn't see anything wrong. Finally, I realized that arg_type was a type from a different file. (Which is one of the problems with throwing everything into one big pot.) Anyway, here's the type that the script declared: typedef struct t44 { int f44; } arg_type; And here's the (transformed) definition which caused it to be defined: bool_t xdr_arg_type(xdrs, objp) XDR *xdrs; arg_type *objp; { bool_t ret; init___ (&ret); return ret; } So it turns out that it's nothing to worry about. And that's it. There are no other warnings or errors. Which means that the transformation was successful and didn't mess up any of the parameter types. The check-decls script is below. One might argue that it is about as complex as the fix-decls script. This is true, but the code which actually extracts the `-' and `+' lines is fairly simple. Also, after being extracted, there are no transformations made to these lines aside from appending ___ to the function name if the script detects that the function name has already been seen. Most importantly, the parameter lists are not rewritten in any way. Most of the complexity is in the analysis and generation of the type, struct, enum, and union declarations. But uniqueness of these is easy to verify. Plus, if something is screwed up, the compiler complains. --- check-decls --- #!/usr/bin/perl -w # Feed this script a unidiff after running fix-decls and it generates # (on stdout) a program which may be used to test the validity of the # conversion. Just run the result through gcc -Wall and if it # generates any warnings, there's a problem... undef $/; # slurp mode my $diff = <>; # read entire diff in $diff; my $decls = ''; my $defns = ''; my %userstructs = (); my %userenums = (); my %usertypes = (); my %funcnames = (); my $funcname_gensym = 0; # for names that clash my @needuse; while ($diff =~ / ( ^ # beginning of line [^\n]+ # everything til the end of line ) \n # newline ( (?: ^ # beginning of line - # minus sign (?: \n # either just a newline | # -- or -- [^-\n] # any character but minus and newline [^\n]* # the rest of the line \n # including the newline ) )+ # one or more of the above ) ( (?: ^ # beginning of line \+ # plus sign [^+] # any character but plus [^\n]* # the rest of the line \n # including the newline )+ # one or more of the above ) /mgx) { my ($rettype, $traddecl, $isodecl) = ($1, $2, $3); # Remove leading diff character from the lines extracted foreach ($rettype, $traddecl, $isodecl) { s/^.//mg; } # Find type names in parameter list my $parmdecls = $traddecl; $parmdecls =~ s/^\w+\s*\([^)]*\)//; foreach my $parm (split /\s*;\s*/, $parmdecls) { $parm =~ s/\s*\**\w+(,|$).*$//; analyze_type($parm); } # Resolve collisions between function name (either due to statics # or due to the names being in different branches of an ifdef) my ($funcname) = $traddecl =~ /^(\w+)/; if (defined $funcnames{$funcname}) { foreach ($traddecl, $isodecl) { s/\b$funcname\b/${funcname}___$funcname_gensym/; } $funcname .= "___$funcname_gensym"; $funcname_gensym++; } $funcnames{$funcname} = $funcname; # Nuke comments in the return type $rettype =~ s#/\*.*?\*/##g; # Nuke partial comment in return type $rettype =~ s#^.*?\*/##; # Eliminate ``CALLBACK'' from return type $rettype =~ s/\bCALLBACK\b//; # Eliminate ``extern'' from return type $rettype =~ s/\bextern\b//; # Eliminate leading and trailing spaces from return type $rettype =~ s/^\s*//; $rettype =~ s/\s*$//; if (($rettype =~ /^#/) || ($rettype eq '')) { # preprocessor line or empty string $rettype = 'int'; } elsif ($rettype eq "static") { $rettype = 'static int'; } elsif ($rettype eq "private") { $rettype = 'static int'; } else { analyze_type($rettype); } $isodecl =~ s/\n\Z/;\n/; $decls .= "$rettype $isodecl"; if ($rettype =~ /\bvoid$/) { $defns .= "$rettype\n$traddecl\{\n}\n\n"; } else { $defns .= "$rettype\n$traddecl\{\n $rettype ret;\n" . " init___ (&ret);\n return ret;\n}\n\n"; } if ($rettype =~/\bstatic\b/) { push @needuse, $funcname; } } my $typeidx = 0; foreach $key (sort keys %usertypes) { print "typedef struct t$typeidx { int f$typeidx; } $key;\n"; $typeidx++; } foreach $key (sort keys %userstructs) { print "$key { int f$typeidx; };\n"; $typeidx++; } foreach $key (sort keys %userenums) { print "$key { e$typeidx };\n"; $typeidx++; } print "#define INLINE\n"; print "#define private\n"; print "#define CONST const\n"; print "#define NORETURN\n"; print "void init___ (void *);\n"; print $decls; print "\n"; print $defns; print "void\nuse___ (void)\n{\n"; foreach (@needuse) { print " init___ ($_);\n"; } print "}\n"; sub analyze_type { my ($parm) = @_; $parm =~ s/\s*\**\s*$//; my $type; if ($parm =~ /\b(struct|union)\b/) { $parm =~ s/\A.*\b(struct|union)\b/$1/s; $parm =~ s/\s*\**\s*\Z//s; $userstructs{$parm} = $parm; } elsif ($parm =~ /\b(enum)\b/) { $parm =~ s/\A.*\b(enum)\b/$1/s; $parm =~ s/\s*\**\s*\Z//s; $userenums{$parm} = $parm; } elsif ((($type) = $parm =~ /(\w+)$/) && ($type !~ /^(int|char|long|short|unsigned|double |register|void|const|static)$/x)) { $usertypes{$type} = $type; } } --- end check-decls --- >From kevinb@cygnus.com Sat Jun 03 00:21:00 2000 From: Kevin Buettner To: Andrew Cagney Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 00:21:00 -0000 Message-id: <1000603072049.ZM32430@ocotillo.lan> References: <1000602075018.ZM29997@ocotillo.lan> <3937816C.E66B9AE0@cygnus.com> X-SW-Source: 2000-06/msg00029.html Content-length: 198 On Jun 2, 7:42pm, Andrew Cagney wrote: > PS: You may want to add gdb/*-share to the list of directories to avoid. I can certainly do this, but I'd like to know why they shouldn't be converted... >From eliz@delorie.com Sat Jun 03 03:58:00 2000 From: Eli Zaretskii To: kevinb@cygnus.com Cc: taylor@cygnus.com, gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 03:58:00 -0000 Message-id: <200006031058.GAA12885@indy.delorie.com> References: <200006021539.LAA25912@texas.cygnus.com> <1000602191042.ZM30936@ocotillo.lan> X-SW-Source: 2000-06/msg00030.html Content-length: 2307 > Date: Fri, 2 Jun 2000 12:10:42 -0700 > From: Kevin Buettner > > > I've used protoize before with good results. It was a fairly > > substantial project, though not as big as gdb. > > Okay. Out of curiousity, did the project in question have a large > number of active developers? I think the real problem is not the number of developers, but the number of different configurations, and also different data types and functions concealed behind macros. `protoize' needs everything to be explicit, so it is not easy to run it on multi-platform project that uses macros to hide system dependencies (since you want the same macros back in the reformatted sources). This is a disadvantage of `protoize'. Its significant advantage is that its output is *always* correct, because it takes the info ``from the horse's mouth'': the compiler itself. In contrast, a script is simply a text-processing tool: it really doesn't understand the semantics of the source. In fact, it doesn't really understand the syntax very well. So with a script, we will always need a verification tool that can be trusted to find any potential bugs introduced by reformatting. > > I'd be tempted to do a build before running your script; stash away the > > object files; run the script; do another build; compare the object > > files... > > Good idea. I'll have to see what gcc does to compare object files. > (I don't think a simple cmp works for ELF files.) Comparing object files generally doesn't work. COFF (at least the variety used by DJGPP) is another case: if I compile the same source twice in a row, I get different object files (IIRC, the time stamp is recorded inside). One method I can suggest is to compile the source without optimizations, then run "objdump --disassemble" and compare the output of `objdump' for the two source files (original and reformatted). (I suggest to disable optimizations because it's possible that ANSI source allows the compiler to produce more optimal code that the K&R source.) Note that, since you need to compile the source to make sure reformatting didn't screw up anything, you essentially get the same problems you had with `protoize', albeit through the back door: you need to build for all supported configurations to make sure nothing's broken. >From ac131313@cygnus.com Sat Jun 03 04:48:00 2000 From: Andrew Cagney To: Kevin Buettner Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 04:48:00 -0000 Message-id: <3938F055.42A9FEA3@cygnus.com> References: <1000602075018.ZM29997@ocotillo.lan> X-SW-Source: 2000-06/msg00031.html Content-length: 747 Kevin Buettner wrote: > > As many of you know, I'm in the midst of purging the use of ``PARAMS'' > in prototyped function declarations from the gdb sources. After this > activity is concluded, I'd like to begin converting function > definitions whose parameters are declared using the traditional C > style to the ISO C prototyped form. I.e, I'd like to convert > functions of the form Something to consider with the timing. There are Pascal, ObjectiveC.* and random other Apple files pending. It might be good to wait until the buil of the work is in the repository so that we reduce the number of contributors / maintainers that get hit for six by this one :-) Unlike params, this one will really hurt people with private diffs. Andrew >From ac131313@cygnus.com Sat Jun 03 04:52:00 2000 From: Andrew Cagney To: Kevin Buettner Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 04:52:00 -0000 Message-id: <3938F161.943C45F2@cygnus.com> References: <1000602075018.ZM29997@ocotillo.lan> <3937816C.E66B9AE0@cygnus.com> <1000603072049.ZM32430@ocotillo.lan> X-SW-Source: 2000-06/msg00032.html Content-length: 511 Kevin Buettner wrote: > > On Jun 2, 7:42pm, Andrew Cagney wrote: > > > PS: You may want to add gdb/*-share to the list of directories to avoid. > > I can certainly do this, but I'd like to know why they shouldn't be > converted... Some of the *-share files are based on code from third parties. We should probably try to avoid munging that code - it will make it harder for us merge back patches. rdi-share comes to mind. As you note, the TUI which will be cleaned up, can be handed separatly. Andrew >From ac131313@cygnus.com Sat Jun 03 05:17:00 2000 From: Andrew Cagney To: jtc@redback.com Cc: Kevin Buettner , Mark Kettenis , gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 05:17:00 -0000 Message-id: <3938F700.509FD4AA@cygnus.com> References: <1000602075018.ZM29997@ocotillo.lan> <200006021226.e52CQ2I01239@delius.kettenis.local> <1000602151553.ZM30578@ocotillo.lan> <5mya4om115.fsf@jtc.redback.com> X-SW-Source: 2000-06/msg00033.html Content-length: 1164 "J.T. Conklin" wrote: > > >>>>> "Kevin" == Kevin Buettner writes: > Kevin> I noticed that. The space was put there by ``indent''. I > Kevin> would very much like to get rid of that space and it would be > Kevin> easy to make the perl script postprocess the ``indent'' output. > Kevin> But in doing so, we (obviously) generate different output than > Kevin> that of ``indent''. > Kevin> > Kevin> I suppose the other solution is to fix indent. :-) > > You can tell indent about all the types defined by typedef with -T > option, and then it won't add the extra space. It shouldn't be too > difficult to identify all the types. > > It might be useful for us to maintain an indent.pro file that has > these definitions so that additional runs of indent don't add back > the space. Given that people are currently editing headers to remove spaces by hand this sounds like a good idea. One thing, how does one pass ``indent.pro'' rather than have it pick up ``.indent.pro''? I suspect it can be maintained largely by hand (please not another perl script to maintain it - we can't make perl a requirement on developers :-) enjoy, Andrew >From kevinb@cygnus.com Sat Jun 03 10:50:00 2000 From: Kevin Buettner To: Eli Zaretskii Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 10:50:00 -0000 Message-id: <1000603175039.ZM738@ocotillo.lan> References: <200006021539.LAA25912@texas.cygnus.com> <1000602191042.ZM30936@ocotillo.lan> <200006031058.GAA12885@indy.delorie.com> X-SW-Source: 2000-06/msg00034.html Content-length: 1970 On Jun 3, 6:58am, Eli Zaretskii wrote: > So with a script, we will always need a verification tool that can be > trusted to find any potential bugs introduced by reformatting. Right. That's why I wrote check-decls (see http://sourceware.cygnus.com/ml/gdb/2000-06/msg00028.html ) which takes the result of comparing (via diff -u) the original sources with the protoized sources and produces a C source file in which the portions from the protoized sources are used to construct prototypes and the portions from the original sources are used to construct (potentially?) corresponding function definitions. We can then invoke the C compiler (gcc -c -Wall) on the result and see what kinds of warnings and errors are produced. E.g, in an earlier (than the one I posted) version of fix-decls, I hadn't yet handled comma separated parameter lists and so the following: foo (a, b, c) int a; char *b, *c; was getting transformed into foo (int a, int b, char *b, *c) This type of mistake would've been quickly caught by check-decls + gcc. (As it was, I caught it myself because I was looking for it.) Also, since my fix-decls script merely looks for patterns which appear to be function definitions, it was finding if (overload_debug) { in find_overload_match() in valops.c and turning this into if (int overload_debug) { (Note that the ``if'' is at the beginning of the line in this function.) I found this one when I did a test build, but check-decls + the C compiler would have caught this one too. (fix-decls was quickly changed so that it no longer gets tripped up by this code.) Also, note that check-decls would've caught this mistake even if the the construct in question had appeared in some #if 0'd code whereas doing a build wouldn't have. I think it could still happen that something might slip by that won't work in the real code, but now that I've written check-decls, I think it is much, much less likely. Kevin >From eliz@delorie.com Sat Jun 03 11:37:00 2000 From: Eli Zaretskii To: kevinb@cygnus.com Cc: gdb@sourceware.cygnus.com Subject: Re: Proposal: convert function definitions to prototyped form Date: Sat, 03 Jun 2000 11:37:00 -0000 Message-id: <200006031837.OAA13278@indy.delorie.com> References: <200006021539.LAA25912@texas.cygnus.com> <1000602191042.ZM30936@ocotillo.lan> <200006031058.GAA12885@indy.delorie.com> <1000603175039.ZM738@ocotillo.lan> X-SW-Source: 2000-06/msg00035.html Content-length: 870 > Date: Sat, 3 Jun 2000 10:50:39 -0700 > From: Kevin Buettner > > That's why I wrote check-decls (see > http://sourceware.cygnus.com/ml/gdb/2000-06/msg00028.html > ) which takes the result of comparing (via diff -u) the original > sources with the protoized sources and produces a C source file in > which the portions from the protoized sources are used to construct > prototypes and the portions from the original sources are used to > construct (potentially?) corresponding function definitions. We can > then invoke the C compiler (gcc -c -Wall) on the result and see what > kinds of warnings and errors are produced. I saw that script, but I don't like the idea to depend on a human to judge what GCC warnings are okay to ignore and what aren't. I'd prefer an automated tool that would give a definite yes/no answer, if that's possible. >From pavenis@latnet.lv Sat Jun 03 11:47:00 2000 From: Andris Pavenis To: gdb@sourceware.cygnus.com Cc: linux-kernel@vger.rutgers.edu Subject: Problems with GDB-5.0 and recent Linux kernels (2.4.0-test1-ac[47]) Date: Sat, 03 Jun 2000 11:47:00 -0000 Message-id: <00060320465900.00261@hal> X-SW-Source: 2000-06/msg00036.html Content-length: 2559 Have somebody tried GDB commands 'info float' and 'info reg' on a system running latest ac kernels. I'm getting coredump from gdb-5.0 on these commands. 2.4.0-test1 - seems that all works, no such problem 2.4.0-test1-ac4 and 2.4.0-test1-ac7 - gdb coredumps on these commands It seems to be some stack corruption. Andris hal:/usr/src/build/gdb$ gdb gdb GNU gdb 5.0 Copyright 2000 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-slackware-linux"... Setting up the environment for debugging gdb. Breakpoint 1 at 0x80bf9cd: file ../../gdb-5.0/gdb/utils.c, line 723. Breakpoint 2 at 0x80bd3af: file ../../gdb-5.0/gdb/top.c, line 2953. Breakpoint 3 at 0x80a0563: file ../../gdb-5.0/gdb/i386-linux-nat.c, line 522. (top-gdb) r Starting program: /usr/src/build/gdb/gdb GNU gdb 5.0 Copyright 2000 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i586-pc-linux-gnu". Setting up the environment for debugging gdb. .gdbinit:5: Error in sourced command file: No symbol table is loaded. Use the "file" command. (gdb) file test1 Reading symbols from test1...done. (gdb) tb main Breakpoint 1 at 0x80485d6: file test1.c, line 10. (gdb) r Starting program: /usr/src/build/gdb/test1 main () at test1.c:10 10 rc = system ("ls -l"); (gdb) info float Breakpoint 3, fetch_fpregs (tid=319) at ../../gdb-5.0/gdb/i386-linux-nat.c:522 522 if (ret < 0) (top-gdb) n 528 supply_fpregset (&fpregs); (top-gdb) n 529 } (top-gdb) fetch_inferior_registers (regno=Cannot access memory at address 0xffff0008 ) at ../../gdb-5.0/gdb/i386-linux-nat.c:824 824 dummy_sse_values (); (top-gdb) 825 return; (top-gdb) 830 } (top-gdb) Program received signal SIGSEGV, Segmentation fault. 0x80a06c9 in fetch_inferior_registers (regno=Cannot access memory at address 0xffff0008 ) at ../../gdb-5.0/gdb/i386-linux-nat.c:830 830 }