* text file formats @ 2006-04-05 22:31 Bob Rossi 2006-04-05 23:39 ` Daniel Jacobowitz 2006-04-06 3:43 ` Eli Zaretskii 0 siblings, 2 replies; 23+ messages in thread From: Bob Rossi @ 2006-04-05 22:31 UTC (permalink / raw) To: gdb

Hi,

While trying to display a source file to the user, it has become increasingly obvious to me how complicated such a simple task can be.

Unix-formatted text files use "\n" for a newline, DOS-formatted text files use "\r\n", and Mac-formatted text files use "\r". In these three cases it is obvious how to determine exactly which line is which.

However, it is easy to mix these file formats. In that case, any particular file can use any combination of "\r", "\r\n" and "\n" for newlines. I'm not even sure how to display such a file. I'm guessing that it's ambiguous, and that I can only make a best guess as to what the newline sequence should be. Is this correct?

One thing I have determined is that, in order to know what the file format is, the entire text file needs to be parsed. After that, either the file format is defined (unix/dos/mac) or it is undefined (a mix of them).

I would like to make sure that the algorithm CGDB uses to determine the line number from a file is the same algorithm that GDB uses. Can anyone point me in the correct direction?

Thanks,
Bob Rossi

^ permalink raw reply [flat|nested] 23+ messages in thread
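[Editorial sketch, not part of the original message.] The whole-file scan described above could look roughly like the following. The function name `classify_newlines` and its return labels are hypothetical, chosen only for illustration; this is not code from CGDB or GDB.

```python
def classify_newlines(data: bytes) -> str:
    # Scan every byte and record which newline conventions occur.
    # Returns 'unix', 'dos', 'mac', 'mixed', or 'none' (no newline at all).
    seen = set()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte == 0x0D:  # carriage return
            if i + 1 < n and data[i + 1] == 0x0A:
                seen.add("dos")  # CR LF pair counts as one DOS newline
                i += 2
                continue
            seen.add("mac")  # lone CR
        elif byte == 0x0A:  # lone line feed
            seen.add("unix")
        i += 1
    if not seen:
        return "none"
    if len(seen) == 1:
        return next(iter(seen))
    return "mixed"
```

A file is reported as 'mixed' as soon as two different conventions appear anywhere in it, which is why the scan cannot stop early for the consistent cases.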
* Re: text file formats 2006-04-05 22:31 text file formats Bob Rossi @ 2006-04-05 23:39 ` Daniel Jacobowitz 2006-04-06 0:14 ` Bob Rossi 2006-04-06 3:47 ` Eli Zaretskii 2006-04-06 3:43 ` Eli Zaretskii 1 sibling, 2 replies; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-05 23:39 UTC (permalink / raw) To: gdb On Wed, Apr 05, 2006 at 06:31:22PM -0400, Bob Rossi wrote: > One thing I have determined, is that in order to know what the file > format is, the entire text file needs to be parsed. After that, either > the file format is defined (unix/dos/mac) or it is undefined (mix of > them). > > I would like to make sure that the algorithm CGDB uses to determine > the line number from a file is the same algorithm that GDB uses. Can > anyone point me in the correct direction? GDB does something much simpler. It opens the file in text mode and lets the C library sort it out. Well, usually. In search and reverse search it sometimes uses a similar but slightly simpler algorithm: ignore '\r' if followed by '\n'. I'm not sure why those are done in binary mode. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
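[Editorial sketch, not part of the original message.] The rule "ignore '\r' if followed by '\n'" can be illustrated as below. This is only a sketch of the rule as stated in the message, not GDB's actual search code; the function name `split_lines_ignoring_crlf` is hypothetical.

```python
def split_lines_ignoring_crlf(data: bytes) -> list[bytes]:
    # Split on LF only, as a Unix reader would.
    lines = data.split(b"\n")
    # A trailing LF produces an empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    # Drop a CR that sat immediately before the LF; any other CR is kept.
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]
```

Note that a lone "\r" in the middle of a line is left in place, so an old-Mac-style file still comes out as a single line under this rule.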
* Re: text file formats 2006-04-05 23:39 ` Daniel Jacobowitz @ 2006-04-06 0:14 ` Bob Rossi 2006-04-06 1:17 ` Daniel Jacobowitz 2006-04-06 3:47 ` Eli Zaretskii 1 sibling, 1 reply; 23+ messages in thread From: Bob Rossi @ 2006-04-06 0:14 UTC (permalink / raw) To: gdb On Wed, Apr 05, 2006 at 07:39:38PM -0400, Daniel Jacobowitz wrote: > On Wed, Apr 05, 2006 at 06:31:22PM -0400, Bob Rossi wrote: > > One thing I have determined, is that in order to know what the file > > format is, the entire text file needs to be parsed. After that, either > > the file format is defined (unix/dos/mac) or it is undefined (mix of > > them). > > > > I would like to make sure that the algorithm CGDB uses to determine > > the line number from a file is the same algorithm that GDB uses. Can > > anyone point me in the correct direction? > > GDB does something much simpler. It opens the file in text mode and > lets the C library sort it out. > > Well, usually. In search and reverse search it sometimes uses a > similar but slightly simpler algorithm: ignore '\r' if followed by > '\n'. I'm not sure why those are done in binary mode. OK, so now I'm confused. If the user looks at the text file through my viewer, and set's a breakpoint at line 100, how can I be sure it's the same 100 that GDB will actually set a breakpoint at? Obviously this works for unix and dos file formats. But from the algorithm you stated above, it doesn't look like GDB will work with mac file formats. I mean, the C library on unix won't be able to read a file that was created on a mac (at least with the mac file format). Is GDB responsible for mapping the file line numbers to the actual lines? or is this the responsibility of GCC via the debug info? For instance, if foo () is defined at line 100 according to gcc and 101 according to GDB, does CGDB have to think foo () is at line 100 or 101? Thanks, Bob Rossi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 0:14 ` Bob Rossi @ 2006-04-06 1:17 ` Daniel Jacobowitz 2006-04-06 3:27 ` Bob Rossi 0 siblings, 1 reply; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-06 1:17 UTC (permalink / raw) To: gdb On Wed, Apr 05, 2006 at 08:14:55PM -0400, Bob Rossi wrote: > OK, so now I'm confused. If the user looks at the text file through my > viewer, and set's a breakpoint at line 100, how can I be sure it's the > same 100 that GDB will actually set a breakpoint at? Obviously this > works for unix and dos file formats. But from the algorithm you stated > above, it doesn't look like GDB will work with mac file formats. > > I mean, the C library on unix won't be able to read a file that was > created on a mac (at least with the mac file format). The manual algorithm is only used while searching. It will work on any file format recognized by the C library. The C library is generally used. If you want to handle a text file, including a source file, it had better be in the native format. Full stop. I don't think any recent version of MacOS uses the old \r format, anyway? I thought OSX had switched to the Unix convention. > Is GDB responsible for mapping the file line numbers to the actual lines? > or is this the responsibility of GCC via the debug info? For instance, > if foo () is defined at line 100 according to gcc and 101 according to > GDB, does CGDB have to think foo () is at line 100 or 101? I have no idea what you mean. GDB gets line numbers from debug info, of course; where else would it get them? -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 1:17 ` Daniel Jacobowitz @ 2006-04-06 3:27 ` Bob Rossi 2006-04-06 3:35 ` Eli Zaretskii 0 siblings, 1 reply; 23+ messages in thread From: Bob Rossi @ 2006-04-06 3:27 UTC (permalink / raw) To: gdb

On Wed, Apr 05, 2006 at 09:17:32PM -0400, Daniel Jacobowitz wrote:
> On Wed, Apr 05, 2006 at 08:14:55PM -0400, Bob Rossi wrote:
> > OK, so now I'm confused. If the user looks at the text file through my
> > viewer, and set's a breakpoint at line 100, how can I be sure it's the
> > same 100 that GDB will actually set a breakpoint at? Obviously this
> > works for unix and dos file formats. But from the algorithm you stated
> > above, it doesn't look like GDB will work with mac file formats.
> >
> > I mean, the C library on unix won't be able to read a file that was
> > created on a mac (at least with the mac file format).
>
> The manual algorithm is only used while searching. It will work on any
> file format recognized by the C library.
>
> The C library is generally used. If you want to handle a text file,
> including a source file, it had better be in the native format. Full
> stop.
>
> I don't think any recent version of MacOS uses the old \r format,
> anyway? I thought OSX had switched to the Unix convention.

Sure, but for some reason the file '/usr/include/g++-3/sstream' on Red Hat Enterprise 3 has mixed file formats. So this becomes a practical issue, not just a theoretical one. Here is an example from 'od -a /usr/include/g++-3/sstream':

    0007400   h   )  cr  nl  sp  sp  sp  sp   {  sp   }  cr  nl  nl  sp  sp

The 'cr nl' is "\r\n" and the extra "nl" is "\n". Opening this file in vim shows the ^M at the end of each line.

> > Is GDB responsible for mapping the file line numbers to the actual lines?
> > or is this the responsibility of GCC via the debug info? For instance,
> > if foo () is defined at line 100 according to gcc and 101 according to
> > GDB, does CGDB have to think foo () is at line 100 or 101?
>
> I have no idea what you mean. GDB gets line numbers from debug info,
> of course; where else would it get them?

Does the debug info actually say, "at line 100 symbol foo() exists?" Meaning, does gcc calculate that information, or does GDB derive the line number from the debug info?

OK, I'm sorry, as usual I'm describing this badly. Let's start out by trying to agree on an assumption that I have. Since the file is in a mixed format, is it true that the line number is determined ambiguously by the program reading the file?

Thanks,
Bob Rossi

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 3:27 ` Bob Rossi @ 2006-04-06 3:35 ` Eli Zaretskii 2006-04-06 5:06 ` Daniel Jacobowitz 2006-04-06 14:01 ` Bob Rossi 0 siblings, 2 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 3:35 UTC (permalink / raw) To: gdb > Date: Wed, 5 Apr 2006 23:27:02 -0400 > From: Bob Rossi <bob_rossi@cox.net> > > > I have no idea what you mean. GDB gets line numbers from debug info, > > of course; where else would it get them? > > Does the debug info actually say, "at line 100 symbol foo() exists?" No, it says, for every source line, which PC addresses correspond to that source line. That is all GDB needs to know, because it manipulates PC addresses (i.e. addresses in the .text section). For symbols, the debug info says that symbol `foo' is stored in the .text or .data section (or .bss or something else) at address NNN. > OK, I'm sorry, as usual I'm describing this bad. Let's start out by > trying to agree with an assumption that I have. Since the file is in a > mixed format, is it true that the line number is determined ambiguously > by the program reading the file? Not necessarily, see my other message. ^ permalink raw reply [flat|nested] 23+ messages in thread
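[Editorial sketch, not part of the original message.] The idea that debug info maps each source line to PC addresses can be modelled as a simple lookup table. The structure below is a deliberately simplified illustration, not the DWARF format or GDB's internal representation; the names `LINE_ROWS`, `build_line_table` and `addresses_for` are hypothetical. The three rows reuse the file names, line numbers and addresses from the breakpoint messages Bob posts later in this thread.

```python
from collections import defaultdict

# (file, line, address) rows, as a compiler might record them.
# Values are taken from the breakpoint output quoted later in this thread.
LINE_ROWS = [
    ("unix.c", 4, 0x8048313),
    ("dos.c", 4, 0x804831B),
    ("mac.c", 4, 0x8048323),
]

def build_line_table(rows):
    # Map (file, line) to the list of addresses generated for that line.
    table = defaultdict(list)
    for filename, line, address in rows:
        table[(filename, line)].append(address)
    return table

def addresses_for(table, filename, line):
    # Resolve "file:line" without ever opening the source file.
    return table.get((filename, line), [])
```

The point of the sketch is that resolving a line number to an address never reads the source text, so the compiler's notion of "line 4" is what counts, whatever a viewer later makes of the file's newlines.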
* Re: text file formats 2006-04-06 3:35 ` Eli Zaretskii @ 2006-04-06 5:06 ` Daniel Jacobowitz 2006-04-06 13:03 ` Daniel Jacobowitz 2006-04-06 14:01 ` Bob Rossi 1 sibling, 1 reply; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-06 5:06 UTC (permalink / raw) To: gdb, gdb On Thu, Apr 06, 2006 at 06:35:41AM +0300, Eli Zaretskii wrote: > > Does the debug info actually say, "at line 100 symbol foo() exists?" > > No, it says, for every source line, which PC addresses correspond to > that source line. That is all GDB needs to know, because it > manipulates PC addresses (i.e. addresses in the .text section). > > For symbols, the debug info says that symbol `foo' is stored in the > .text or .data section (or .bss or something else) at address NNN. Well, this is true, but what Bob wrote is often true also. One of them is the line corresponding to the first PC instruction of the function; the other is the line of declaration of the function, which may be different (e.g. before the leading brace). I don't remember offhand if GDB takes advantage of the latter. (One comes from DWARF .debug_line, the other from .debug_info DW_AT_decl_line). -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 3:35 ` Eli Zaretskii 2006-04-06 5:06 ` Daniel Jacobowitz @ 2006-04-06 14:01 ` Bob Rossi 2006-04-06 14:41 ` Daniel Jacobowitz 2006-04-06 19:07 ` Eli Zaretskii 1 sibling, 2 replies; 23+ messages in thread From: Bob Rossi @ 2006-04-06 14:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb

[-- Attachment #1: Type: text/plain, Size: 4095 bytes --]

On Thu, Apr 06, 2006 at 06:35:41AM +0300, Eli Zaretskii wrote:
> > Date: Wed, 5 Apr 2006 23:27:02 -0400
> > From: Bob Rossi <bob_rossi@cox.net>
> >
> > > I have no idea what you mean. GDB gets line numbers from debug info,
> > > of course; where else would it get them?
> >
> > Does the debug info actually say, "at line 100 symbol foo() exists?"
>
> No, it says, for every source line, which PC addresses correspond to
> that source line. That is all GDB needs to know, because it
> manipulates PC addresses (i.e. addresses in the .text section).
>
> For symbols, the debug info says that symbol `foo' is stored in the
> .text or .data section (or .bss or something else) at address NNN.

OK, this is interesting; it brings up 2 cases. (They may be the same though.)

The first is when I have a source file displayed. I need to make sure that what the user sees as line N is what GDB/GCC think is line N. For instance, 'b foo.c:N' must be the same line N that GDB/GCC think is line N.

The second case is when the user types 'b main'. GDB will find the symbol and determine the line number.

    (gdb) b main
    Breakpoint 1 at 0x8048320: file main.c, line 4.

I need to make sure that line 4 is the same in GDB as it is in CGDB.

I've created a small example, using unix/dos/mac text formats. The files are attached for your viewing purposes.

The output of cat:

    $ cat unix.c
    int
    unixf (int i)
    {
      return i;
    }
    bar@adam ~/tmp/foo
    $ cat dos.c
    int
    dosf (int i)
    {
      return i;
    }
    bar@adam ~/tmp/foo
    $ cat mac.c
    bar@adam ~/tmp/foo

The output of GDB's list and source commands:

    (gdb) list unix.c:1
    1       int
    2       unixf (int i)
    3       {
    4         return i;
    5       }
    (gdb) info source
    Current source file is unix.c
    Compilation directory is /home/bar/tmp/foo
    Located in /home/ADAM/bar/tmp/foo/unix.c
    Contains 5 lines.
    Source language is c.
    Compiled with unknown debugging format.
    Does not include preprocessor macro info.

    (gdb) list dos.c:1
    1       int
    2       dosf (int i)
    3       {
    4         return i;
    5       }
    (gdb) info source
    Current source file is dos.c
    Compilation directory is /home/bar/tmp/foo
    Located in /home/ADAM/bar/tmp/foo/dos.c
    Contains 5 lines.
    Source language is c.
    Compiled with unknown debugging format.
    Does not include preprocessor macro info.

    (gdb) list mac.c:1
    (gdb) rn i;)
    (gdb) info source
    Current source file is mac.c
    Compilation directory is /home/bar/tmp/foo
    Located in /home/ADAM/bar/tmp/foo/mac.c
    Contains 1 line.
    Source language is c.
    Compiled with unknown debugging format.
    Does not include preprocessor macro info.
    (gdb)

GDB writes every line to the current line when listing the mac file. It is overwritten via the "\r". Notice that the 'info source' command thinks the file is 1 line long. This isn't correct IMO. Is it to anyone else?

The breakpoint command on symbols. GDB apparently thinks the macf function is at line 4, but thinks there is only 1 line in the file.

    (gdb) b unixf
    Breakpoint 2 at 0x8048313: file unix.c, line 4.
    (gdb) b dosf
    Breakpoint 3 at 0x804831b: file dos.c, line 4.
    (gdb) b macf
    Breakpoint 4 at 0x8048323: file mac.c, line 4.

Executing the program:

    (gdb) b unixf
    Breakpoint 1 at 0x8048313: file unix.c, line 4.
    (gdb) b dosf
    Breakpoint 2 at 0x804831b: file dos.c, line 4.
    (gdb) b macf
    Breakpoint 3 at 0x8048323: file mac.c, line 4.
    (gdb) r

    Breakpoint 1, unixf (i=1) at unix.c:4
    4         return i;
    (gdb) c

    Breakpoint 2, dosf (i=1) at dos.c:4
    4         return i;
    (gdb)

    Breakpoint 3, macf (i=1) at mac.c:4
    Line number 4 out of range; mac.c has 1 lines.
    (gdb)

I don't know what this warning message means. Should CGDB think there are 5 lines in the file, or 1, to be consistent with GDB or with GCC?

None of this really has anything to do with mixed file formats. That step is even more confusing.

Thanks,
Bob Rossi

[-- Attachment #2: files.tar --]
[-- Type: application/x-tar, Size: 10240 bytes --]

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 14:01 ` Bob Rossi @ 2006-04-06 14:41 ` Daniel Jacobowitz 2006-04-06 19:20 ` Eli Zaretskii 2006-04-06 19:07 ` Eli Zaretskii 1 sibling, 1 reply; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-06 14:41 UTC (permalink / raw) To: Eli Zaretskii, gdb On Thu, Apr 06, 2006 at 09:38:29AM -0400, Bob Rossi wrote: > OK, this is interesting in brings up 2 cases. (They may be the same > though). Why are you going to tremendous lengths to accomodate non-native newline conventions? Is there some good reason I've missed? Your example about the slightly mangled RHEL3 header doesn't cut it. That's the least problematic form of mixing. You'll just get stray non-printables at the end of lines. You can go to all the trouble you want, but the fact is, GDB only supports files using the native convention. So no wonder you can't get it to match. If you really need anything more, I recommend just detecting the case and warning. > GDB writes every line to the current line when listing the mac file. > It is overwritten via the "\r". Notice that the 'info source' command > thinks the file is 1 line long. This isn't correct IMO. Is it to anyone > else? That is a correct interpretation of a file containing only '\r' line separators on a Unix platform. > Breakpoint 3, macf (i=1) at mac.c:4 > Line number 4 out of range; mac.c has 1 lines. That is GCC's interpretation of a file containing only '\r' separators on a Unix platform. It is also valid. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
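[Editorial sketch, not part of the original message.] The "1 line" result for a '\r'-only file, and the suggestion to detect the case and warn, can be illustrated as follows. The byte strings in the usage note are assumptions about what the attached unix.c and mac.c contain (the attachment itself is not shown in the thread), and the function names are hypothetical.

```python
def count_lines_unix(data: bytes) -> int:
    # Count lines the way a reader that only recognises LF would.
    if not data:
        return 0
    count = data.count(b"\n")
    # A final line without a terminating LF still counts as a line.
    if not data.endswith(b"\n"):
        count += 1
    return count

def non_native_warning(filename: str, data: bytes):
    # On a Unix host, any CR suggests a non-native newline convention.
    # Returns a warning string, or None if the file looks native.
    if b"\r" in data:
        return f"{filename}: contains CR characters; line numbers may not match"
    return None
```

With an assumed mac.c of `b"int\rmacf (int i)\r{\r  return i;\r}\r"`, `count_lines_unix` gives 1, and the same text with "\n" separators gives 5, which matches the two 'info source' reports quoted in this thread.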
* Re: text file formats 2006-04-06 14:41 ` Daniel Jacobowitz @ 2006-04-06 19:20 ` Eli Zaretskii 2006-04-06 19:32 ` Bob Rossi 2006-04-06 20:55 ` Paul Koning 0 siblings, 2 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 19:20 UTC (permalink / raw) To: gdb > Date: Thu, 6 Apr 2006 10:05:42 -0400 > From: Daniel Jacobowitz <drow@false.org> > > Why are you going to tremendous lengths to accomodate non-native > newline conventions? Is there some good reason I've missed? I don't know if that's Bob's reason, but one good reason is that nowadays you can never know where (on what machine) the source files live, and who edits them on what platform. For example, some developers are so used to Microsoft's Visual Studio that they use it to edit sources to be compiled on Unix (via the network). So it does make sense to support non-native formats, although adding that to GDB would be a non-trivial job. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 19:20 ` Eli Zaretskii @ 2006-04-06 19:32 ` Bob Rossi 2006-04-06 23:55 ` Daniel Jacobowitz 2006-04-06 20:55 ` Paul Koning 1 sibling, 1 reply; 23+ messages in thread From: Bob Rossi @ 2006-04-06 19:32 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb On Thu, Apr 06, 2006 at 10:01:34PM +0300, Eli Zaretskii wrote: > > Date: Thu, 6 Apr 2006 10:05:42 -0400 > > From: Daniel Jacobowitz <drow@false.org> > > > > Why are you going to tremendous lengths to accomodate non-native > > newline conventions? Is there some good reason I've missed? > > I don't know if that's Bob's reason, but one good reason is that > nowadays you can never know where (on what machine) the source files > live, and who edits them on what platform. For example, some > developers are so used to Microsoft's Visual Studio that they use it > to edit sources to be compiled on Unix (via the network). > > So it does make sense to support non-native formats, although adding > that to GDB would be a non-trivial job. This is exactly my reasoning. Usually I agree 100% with Daniel, but in this circumstance, I just think it's wrong to say "sorry, your out of luck" to the user. Bob Rossi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 19:32 ` Bob Rossi @ 2006-04-06 23:55 ` Daniel Jacobowitz 2006-04-07 13:33 ` Eli Zaretskii 0 siblings, 1 reply; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-06 23:55 UTC (permalink / raw) To: Eli Zaretskii, gdb On Thu, Apr 06, 2006 at 03:07:42PM -0400, Bob Rossi wrote: > This is exactly my reasoning. Usually I agree 100% with Daniel, but in > this circumstance, I just think it's wrong to say "sorry, your out of > luck" to the user. Native formats are the only sort we can support reliably. We get line numbers from the debug information, which was produced by a compiler - any compiler. If the file is in native format, the compiler can be presumed to have gotten the line numbers right. If it isn't, then they could be totally out of whack. And the compiler could have been run on a different platform. Supporting all of the possible combinations is simply impossible. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 23:55 ` Daniel Jacobowitz @ 2006-04-07 13:33 ` Eli Zaretskii 0 siblings, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-07 13:33 UTC (permalink / raw) To: gdb > Date: Thu, 6 Apr 2006 15:31:55 -0400 > From: Daniel Jacobowitz <drow@false.org> > > Native formats are the only sort we can support reliably. > > We get line numbers from the debug information, which was produced by a > compiler - any compiler. If the file is in native format, the compiler > can be presumed to have gotten the line numbers right. If it isn't, > then they could be totally out of whack. At least with GCC and with Unix and DOS style of EOLs, there's no basis to assume that line numbers will be ``totally out of wack''. The Mac case obviously is harder, but it sounds like that style is dying anyway. > And the compiler could have been run on a different platform. GCC supports both Unix and DOS EOLs on Windows as well. So, at least for these two styles and platforms, it's possible to support line numbers reliably. > Supporting all of the possible combinations is simply impossible. All of them might be impossible, but some of them could be quite possible. Anyway, unless we have a volunteer to add this kind of support to GDB, this dispute is purely academic. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 19:20 ` Eli Zaretskii 2006-04-06 19:32 ` Bob Rossi @ 2006-04-06 20:55 ` Paul Koning 2006-04-07 11:54 ` Eli Zaretskii 1 sibling, 1 reply; 23+ messages in thread From: Paul Koning @ 2006-04-06 20:55 UTC (permalink / raw) To: eliz; +Cc: gdb >>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes: >> Date: Thu, 6 Apr 2006 10:05:42 -0400 From: Daniel Jacobowitz >> <drow@false.org> >> >> Why are you going to tremendous lengths to accomodate non-native >> newline conventions? Is there some good reason I've missed? Eli> I don't know if that's Bob's reason, but one good reason is that Eli> nowadays you can never know where (on what machine) the source Eli> files live, and who edits them on what platform. For example, Eli> some developers are so used to Microsoft's Visual Studio that Eli> they use it to edit sources to be compiled on Unix (via the Eli> network). Eli> So it does make sense to support non-native formats, although Eli> adding that to GDB would be a non-trivial job. This is only an issue if the source control system doesn't cure it. Subversion does, for example. paul ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 20:55 ` Paul Koning @ 2006-04-07 11:54 ` Eli Zaretskii 0 siblings, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-07 11:54 UTC (permalink / raw) To: Paul Koning; +Cc: gdb > Date: Thu, 6 Apr 2006 15:20:51 -0400 > From: Paul Koning <pkoning@equallogic.com> > Cc: gdb@sources.redhat.com > > Eli> So it does make sense to support non-native formats, although > Eli> adding that to GDB would be a non-trivial job. > > This is only an issue if the source control system doesn't cure it. I don't think we can rely on the assumption that an arbitrary source file is under source control system. > Subversion does, for example. I don't think we can rely on the assumption that everyone uses Subversion. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 14:01 ` Bob Rossi 2006-04-06 14:41 ` Daniel Jacobowitz @ 2006-04-06 19:07 ` Eli Zaretskii 1 sibling, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 19:07 UTC (permalink / raw) To: gdb > Date: Thu, 6 Apr 2006 09:38:29 -0400 > From: Bob Rossi <bob_rossi@cox.net> > Cc: gdb@sources.redhat.com > > The first is when I have a source file displayed, I need to make sure > that what the user see's as line N is what GDB/GCC think is line N. For > instance, 'b foo.c:N' must be the same line N that GDB/GCC think is line N. Then you must do _exactly_ what GDB does: support only native EOL formats. > The second case is when the user types 'b main'. > GDB will find the symbol and determine the line number. > (gdb) b main > Breakpoint 1 at 0x8048320: file main.c, line 4. > I need to make sure that line 4, is the same in GDB, as it is in CGDB. This only works in GDB if the source has native EOLs. You must do the same in CGDB, or else change GDB to support non-native formats, and make CGDB do the same. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-05 23:39 ` Daniel Jacobowitz 2006-04-06 0:14 ` Bob Rossi @ 2006-04-06 3:47 ` Eli Zaretskii 2006-04-06 4:29 ` Daniel Jacobowitz 1 sibling, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 3:47 UTC (permalink / raw) To: gdb > Date: Wed, 5 Apr 2006 19:39:38 -0400 > From: Daniel Jacobowitz <drow@false.org> > > GDB does something much simpler. It opens the file in text mode and > lets the C library sort it out. > > Well, usually. In search and reverse search it sometimes uses a > similar but slightly simpler algorithm: ignore '\r' if followed by > '\n'. I'm not sure why those are done in binary mode. I think it's because GDB counts characters and then lseeks to the point it thinks it should display. If the library's text-mode I/O converts \r\n to \n, this seeks will only work reliably in binary mode, since most DOS/Windows libraries don't seek to the correct place (the only exception from this rule I know of is the DJGPP library). ^ permalink raw reply [flat|nested] 23+ messages in thread
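[Editorial sketch, not part of the original message.] The reason for counting characters in binary mode can be shown with a small line-offset index: byte offsets recorded from a binary stream can be passed straight back to seek, whereas offsets counted after a text-mode "\r\n" to "\n" translation would be too small. This is an illustration only, not GDB's source code; the function names are hypothetical.

```python
import io

def line_start_offsets(stream) -> list[int]:
    # Record the byte offset at which each LF-terminated line starts.
    # Iterating a binary stream splits on LF only and keeps every byte.
    offsets = []
    position = 0
    for raw_line in stream:
        offsets.append(position)
        position += len(raw_line)
    return offsets

def read_line_at(stream, offsets, line_number):
    # Seek directly to a 1-based line and return it without its EOL bytes.
    stream.seek(offsets[line_number - 1])
    return stream.readline().rstrip(b"\r\n")

# Usage with an in-memory DOS-style file (assumed content, for illustration).
dos_file = io.BytesIO(b"int\r\ndosf (int i)\r\n{\r\n  return i;\r\n}\r\n")
dos_offsets = line_start_offsets(dos_file)
```

Here every offset includes the CR bytes, so seeking to the fourth entry lands exactly on the start of line 4.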
* Re: text file formats 2006-04-06 3:47 ` Eli Zaretskii @ 2006-04-06 4:29 ` Daniel Jacobowitz 2006-04-06 4:30 ` Daniel Jacobowitz 0 siblings, 1 reply; 23+ messages in thread From: Daniel Jacobowitz @ 2006-04-06 4:29 UTC (permalink / raw) To: gdb, gdb On Thu, Apr 06, 2006 at 06:47:42AM +0300, Eli Zaretskii wrote: > > Well, usually. In search and reverse search it sometimes uses a > > similar but slightly simpler algorithm: ignore '\r' if followed by > > '\n'. I'm not sure why those are done in binary mode. > > I think it's because GDB counts characters and then lseeks to the > point it thinks it should display. If the library's text-mode I/O > converts \r\n to \n, this seeks will only work reliably in binary > mode, since most DOS/Windows libraries don't seek to the correct place > (the only exception from this rule I know of is the DJGPP library). Oh, thanks. That's surely it. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-05 22:31 text file formats Bob Rossi 2006-04-05 23:39 ` Daniel Jacobowitz @ 2006-04-06 3:43 ` Eli Zaretskii 2006-04-06 13:35 ` Bob Rossi 1 sibling, 1 reply; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 3:43 UTC (permalink / raw) To: gdb

> Date: Wed, 5 Apr 2006 18:31:22 -0400
> From: Bob Rossi <bob_rossi@cox.net>
>
> However, it is easy to mix these file formats. In this case, any particular
> file can use any combination of "\r", "\r\n" and "\n" for newlines. I'm not
> even sure how to display such a file. I'm guessing that's it's
> ambiguous, and i can make a best guess as to what the newline sequence
> should be. Is this correct?
>
> One thing I have determined, is that in order to know what the file
> format is, the entire text file needs to be parsed. After that, either
> the file format is defined (unix/dos/mac) or it is undefined (mix of
> them).

(a) For native end-of-line (EOL) format, use the native C library and
    specify the text-mode I/O when you open the file.

(b) For non-native but consistent EOL format, read the file in binary
    mode, analyze its first chunk, and then manually convert the
    original EOL markers into literal \n.

The only two methods I know of to handle the mixed case are:

(1) Fall back to Unix-style EOL and show the ^M literally.

(2) Let the user specify the EOL and then apply the (b) strategy
    above.

> I would like to make sure that the algorithm CGDB uses to determine
> the line number from a file is the same algorithm that GDB uses.

GDB doesn't solve any of these problems. But I think that your motivation for doing the same as GDB was based on incorrect assumptions, see Daniel's and my responses elsewhere in this thread.

^ permalink raw reply [flat|nested] 23+ messages in thread
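[Editorial sketch, not part of the original message.] Strategy (b) and fallback (1) from the message above might be sketched as below, working on bytes read in binary mode. The chunk size of 4096 and the names `detect_eol`, `to_lf` and `show_cr` are assumptions made for illustration; the message does not specify them.

```python
def detect_eol(data: bytes, chunk_size: int = 4096) -> bytes:
    # Guess the EOL marker from the first chunk; the first marker found decides.
    chunk = data[:chunk_size]
    if chunk.endswith(b"\r"):
        chunk = data[:chunk_size + 1]  # avoid splitting a CR LF pair
    first_cr = chunk.find(b"\r")
    first_lf = chunk.find(b"\n")
    if first_cr == -1:
        return b"\n"  # no CR seen: treat as Unix
    if first_lf == first_cr + 1:
        return b"\r\n"  # CR immediately followed by LF: DOS
    if first_lf != -1 and first_lf < first_cr:
        return b"\n"  # an LF came first: Unix, with stray CRs later
    return b"\r"  # lone CR: old Mac style

def to_lf(data: bytes, eol: bytes) -> bytes:
    # Strategy (b): convert the detected marker into a literal LF.
    return data if eol == b"\n" else data.replace(eol, b"\n")

def show_cr(data: bytes) -> bytes:
    # Fallback (1): keep Unix-style line breaks and display CR as ^M.
    return data.replace(b"\r", b"^M")
```

Because only the first chunk is inspected, a file that turns out to be mixed further on would be converted with the wrong marker in places; that is the case the two fallbacks in the message are meant to cover.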
* Re: text file formats 2006-04-06 3:43 ` Eli Zaretskii @ 2006-04-06 13:35 ` Bob Rossi 2006-04-06 19:01 ` Eli Zaretskii 0 siblings, 1 reply; 23+ messages in thread From: Bob Rossi @ 2006-04-06 13:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: gdb On Thu, Apr 06, 2006 at 06:43:48AM +0300, Eli Zaretskii wrote: > > Date: Wed, 5 Apr 2006 18:31:22 -0400 > > From: Bob Rossi <bob_rossi@cox.net> > > > > However, it is easy to mix these file formats. In this case, any particular > > file can use any combination of "\r", "\r\n" and "\n" for newlines. I'm not > > even sure how to display such a file. I'm guessing that's it's > > ambiguous, and i can make a best guess as to what the newline sequence > > should be. Is this correct? > > > > One thing I have determined, is that in order to know what the file > > format is, the entire text file needs to be parsed. After that, either > > the file format is defined (unix/dos/mac) or it is undefined (mix of > > them). > > (a) For native end-of-line (EOL) format, use the native C library and > specify the text-mode I/O when you open the file. > > (b) For non-native but consistent EOL format, read the file in binary > mode, analyze its first chunk, and then manually convert the > original EOL markers into literal \n. OK, that's fine, except, you don't know if the file is native/non-native EOL until you open it and process the entire file. > The only two methods I know of to handle the mixed case are: > > (1) Fall back to Unix-style EOL and show the ^M literally. OK. > (2) Let the user specify the EOL and then apply the (b) strategy > above. OK, that's fine, but is this what GDB, GCC do? Bob Rossi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: text file formats 2006-04-06 13:35 ` Bob Rossi @ 2006-04-06 19:01 ` Eli Zaretskii 0 siblings, 0 replies; 23+ messages in thread From: Eli Zaretskii @ 2006-04-06 19:01 UTC (permalink / raw) To: gdb > Date: Thu, 6 Apr 2006 09:15:14 -0400 > From: Bob Rossi <bob_rossi@cox.net> > Cc: gdb@sources.redhat.com > > > (a) For native end-of-line (EOL) format, use the native C library and > > specify the text-mode I/O when you open the file. > > > > (b) For non-native but consistent EOL format, read the file in binary > > mode, analyze its first chunk, and then manually convert the > > original EOL markers into literal \n. > > OK, that's fine, except, you don't know if the file is native/non-native > EOL until you open it and process the entire file. You do know that if all you want to handle is the native format. If you want to handle non-native formats as well, you must do (b). > > The only two methods I know of to handle the mixed case are: > > > > (1) Fall back to Unix-style EOL and show the ^M literally. > > OK. > > (2) Let the user specify the EOL and then apply the (b) strategy > > above. > > OK, that's fine, but is this what GDB, GCC do? No, that's what Emacs does. Daniel told you what GDB does. As for GCC, I simply don't know, but I think it does handle DOS-style CR-LF EOLs on non-Windows machines. Not sure about the (old) Mac style. ^ permalink raw reply [flat|nested] 23+ messages in thread