Mirror of the gdb-patches mailing list
From: Simon Marchi <simon.marchi@polymtl.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: gdb-patches@sourceware.org
Subject: Re: GDB internal error in pc_in_thread_step_range
Date: Thu, 20 Dec 2018 23:03:00 -0000
Message-ID: <988ca92d2c5c976fbea57c2381eb6279@polymtl.ca>
In-Reply-To: <837eg4cick.fsf@gnu.org>

On 2018-12-20 10:45, Eli Zaretskii wrote:
>> Date: Wed, 19 Dec 2018 19:16:15 -0500
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Cc: gdb-patches@sourceware.org
>> 
>> >   (top-gdb) p msymbol
>> >   $3 = {minsym = 0x10450d38, objfile = 0x10443b48}
>> >   (top-gdb) p msymbol.minsym.mginfo.name
>> >   $4 = 0x104485cd "__register_frame_info"
>> >   (top-gdb) p msymbol.minsym.mginfo
>> >   $5 = {name = 0x104485cd "__register_frame_info", value = {ivalue = 0,
>> >       block = 0x0, bytes = 0x0, address = 0x0, common_block = 0x0,
>> >       chain = 0x0}, language_specific = {obstack = 0x0, demangled_name = 0x0},
>> >     language = language_auto, ada_mangled = 0, section = 0}
>> 
>> Ok.  Well this is already strange.  Why is there an mst_text (code)
>> symbol with a value of 0?
> 
> Its address is zero because it's an unresolved symbol:
> 
>   d:\usr\eli>nm -A hello0.exe | fgrep " U "
>   hello0.exe:         U ___deregister_frame_info
>   hello0.exe:         U ___register_frame_info
>   hello0.exe:         U __Jv_RegisterClasses

Huh, interesting.  I looked at elfread, and similar undefined symbols 
are skipped:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/elfread.c;h=71e6fcca6ec62ec57f93f06d8a9913612be6f9e2;hb=HEAD#l270

> This symbol comes from a weak symbol defined in MinGW crtbegin.o:
> 
>   d:\usr\eli>nm -A lib/gcc/mingw32/6.3.0/crtbegin.o | fgrep _frame_info
>   lib/gcc/mingw32/6.3.0/crtbegin.o:00000000 A .weak.___deregister_frame_info.___EH_FRAME_BEGIN__
>   lib/gcc/mingw32/6.3.0/crtbegin.o:00000000 A .weak.___register_frame_info.___EH_FRAME_BEGIN__
>   lib/gcc/mingw32/6.3.0/crtbegin.o:         w ___deregister_frame_info
>   lib/gcc/mingw32/6.3.0/crtbegin.o:         w ___register_frame_info

Is crtbegin.o an object file you link with statically when compiling 
with mingw?  If so, why would you end up with an undefined reference in 
the final executable?  Or is it linked dynamically at runtime?

>> If your binary is anything like those I can
>> produce with x86_64-w64-mingw32-gcc (and it looks similar, given the
>> addresses you show), your "image base" is likely 0x400000, and "base of
>> code" 0x1000 (0x401000 in absolute).  I found this information using
>> "objdump -x", in the header somewhere.  I therefore expect all text
>> symbols to be >= 0x401000.  I would start digging why this text symbol
>> with a value of 0 exists.
> 
> See above.  But please note that I use mingw.org's MinGW, and my
> executables are 32-bit, whereas you use MinGW64 and 64-bit
> executables.  So some details might be different; in particular, I
> don't think MinGW64 has this problematic symbol, because it's specific
> to the DWARF2 exception unwinding implemented in libgcc, which 64-bit
> Windows executables don't use.

Indeed, they are similar but not identical.  The file you provided as 
an attachment is very useful to see what's in your binary.

>> It would be interesting to look at some other symbols in the msymbols
>> vector.  Are the other mst_text symbols >= 0x401000?
> 
> There are 2 more unresolved mst_text symbols, see above; they all have
> a zero address.  All the others are above 0x401000, indeed.
> 
> The lowest-address resolved minimal symbol whose type is mst_text is
> this:
> 
>   (top-gdb) p msymbol[22]
>   $112 = {mginfo = {name = 0x10447d95 "_mingw32_init_mainargs", value = {
> 	ivalue = 4199072, block = 0x4012a0 <_mingw32_init_mainargs>,
> 	bytes = 0x4012a0 <_mingw32_init_mainargs> "Æ\222?<\215D$,\307D$\004",
> 	address = 0x4012a0, common_block = 0x4012a0 <_mingw32_init_mainargs>,
> 	chain = 0x4012a0 <_mingw32_init_mainargs>}, language_specific = {
> 	obstack = 0x0, demangled_name = 0x0}, language = language_auto,
>       ada_mangled = 0, section = 0}, size = 0, filename = 0x0, type = mst_text,
>     created_by_gdb = 0, target_flag_1 = 0, target_flag_2 = 0, has_size = 0,
>     hash_next = 0x0, demangled_hash_next = 0x0}
> 
> Interestingly, objdump shows this symbol in section 1:
> 
>   [  0](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x000002a0 __mingw32_init_mainargs
> 
> whereas the above minsym information shows section = 0.  Is this
> expected?  If "real" symbols were to have section > 0, we could
> perhaps reject the unresolved ones.

Indeed.  I think the objdump output is misleading.  The indices in the 
"Sections:" section are 0-based.  But the indices in the "SYMBOL TABLE:" 
section look 1-based, as described here:

https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format#coff-symbol-table
https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format#section-number-values

So it looks like we should indeed skip symbols whose COFF section 
number is 0 (IMAGE_SYM_UNDEFINED).  We may also want to skip symbols 
with section number -1 (IMAGE_SYM_ABSOLUTE), such as 
__major_subsystem_version__.  I don't know whether anything relies on 
some of those symbols, though.
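
To make the filtering idea concrete, here is a minimal sketch in Python 
(not GDB code).  The special section-number constants are the ones 
documented in the PE format pages linked above; the symbol rows, and in 
particular the value given for __major_subsystem_version__, are made up 
for illustration.

```python
# Special COFF section-number values, as documented in the PE format spec.
IMAGE_SYM_UNDEFINED = 0   # symbol has not been assigned a section (unresolved)
IMAGE_SYM_ABSOLUTE = -1   # value is a plain number, not an address
IMAGE_SYM_DEBUG = -2      # debugging information only


def is_address_symbol(section_number):
    """True if the 1-based COFF section number names a real section."""
    return section_number > 0


# Hypothetical (name, section_number, value) rows modelled on this thread.
# The value 0x4 for the absolute symbol is invented for the example.
symbols = [
    ("___register_frame_info", IMAGE_SYM_UNDEFINED, 0x0),
    ("__major_subsystem_version__", IMAGE_SYM_ABSOLUTE, 0x4),
    ("__mingw32_init_mainargs", 1, 0x2A0),
]

# Keep only the symbols that actually describe a location in a section.
kept = [name for name, section, _value in symbols if is_address_symbol(section)]
print(kept)
```

With a filter like this, both the unresolved symbols and the absolute 
"value" symbols would never reach the minimal symbol table.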

>> Assuming this minimal symbol is wrong and assuming it wasn't there, then
>> I guess the search would fail and we would fall in the "Cannot find
>> bounds of current function" case of prepare_one_step?  That would be
>> appropriate in this case.
> 
> It's not wrong, but perhaps lookup_minimal_symbol_by_pc_section should
> reject unresolved symbols for this purpose.  However, the question is
> how?  One possibility is by their zero address.  (I don't see the weak
> attribute, or any other indication of its being unresolved, in the
> minimal symbol attributes.)
> 
> In any case, if we do call the "Cannot find bounds of current
> function" error, that will throw to the command loop, which I think is
> undesirable in this case.  We want GDB to step out of this code, not
> to error out.

When we have no line information for the current PC and the user asks us 
to step, we fall back to "single step until out of the current 
function".  But if the minimal symbol information doesn't let us know 
the bounds of the current function, then we can't "single step until out 
of the current function", because we don't know where it starts and ends.

In your binary, the lowest .text function symbol is 
__mingw32_init_mainargs at 0x000002a0 (0x4012a0 once relocated).  Your 
pc is 0x40126d, which is lower (that value comes from an earlier 
message; reading further down, I realize it may not be valid anymore).  
So there's just no minimal symbol for GDB to find.  In that case, it 
sounds right to error out and say "I can't step, I don't have enough 
information".  The user can still use stepi.
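
As a rough illustration of that lookup (a Python sketch, not how GDB 
implements it), assume an image base of 0x400000 and a base of code of 
0x1000, as discussed earlier in the thread.  The helper names are 
hypothetical.

```python
import bisect

IMAGE_BASE = 0x400000    # assumed, per the "objdump -x" discussion
BASE_OF_CODE = 0x1000    # assumed start of the code section


def relocated(offset):
    """Absolute address of a .text symbol given its section offset."""
    return IMAGE_BASE + BASE_OF_CODE + offset


def preceding_symbol(sorted_syms, pc):
    """Return the (address, name) entry with the highest address <= pc.

    sorted_syms must be sorted by address.  Returns None when every
    symbol lies above pc, i.e. there is nothing to anchor a function on.
    """
    addresses = [addr for addr, _name in sorted_syms]
    index = bisect.bisect_right(addresses, pc)
    return sorted_syms[index - 1] if index > 0 else None


text_syms = [(relocated(0x2A0), "__mingw32_init_mainargs")]

# No symbol at or below the pc: the "cannot find bounds" situation.
print(preceding_symbol(text_syms, 0x40126D))

# With the unresolved zero-address symbol present, it wins by default.
with_bogus = sorted(text_syms + [(0x0, "__register_frame_info")])
print(preceding_symbol(with_bogus, 0x40126D))
```

The second lookup shows why the zero-address symbol gets picked: it is 
simply the only entry that sits below the pc.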

Side-question, are there some debug symbols in the binary that could 
describe this location?  Which debug format (DWARF or equivalent) is 
generated when you use -g with mingw?

>> Ok, from what I understand, all these "mst_abs" symbols do not represent
>> addresses.  They just represent numerical "values", like version
>> numbers, alignment sizes, etc.  So it seems right to skip them when
>> looking for the minimal symbol preceding pc.
>> 
>> It looks like minimal_symbol_upper_bound is buggy, in that it should not
>> consider these mst_abs.  If we are looking for the end of a memory
>> range, we should not consider those symbols that do not even represent
>> memory addresses...
> 
> Indeed, the following change is enough to avoid the internal error:
> 
> --- gdb/minsyms.c~0	2018-07-04 18:41:59.000000000 +0300
> +++ gdb/minsyms.c	2018-12-20 08:06:11.516834500 +0200
> @@ -1514,7 +1514,8 @@ minimal_symbol_upper_bound (struct bound
>      {
>        if ((MSYMBOL_VALUE_RAW_ADDRESS (msymbol + i)
>  	   != MSYMBOL_VALUE_RAW_ADDRESS (msymbol))
> -	  && MSYMBOL_SECTION (msymbol + i) == section)
> +	  && MSYMBOL_SECTION (msymbol + i) == section
> +	  && MSYMBOL_TYPE (msymbol + i) != mst_abs)
>  	break;
>      }

Note that if we implement the solution of rejecting the symbols with 
section == -1, those mst_abs symbols won't be there anymore.

> However, it still shows the incorrect function name from the
> zero-address symbol:
> 
>   7       }
>   (gdb) n
>   0x00401288 in __register_frame_info ()
>   (gdb) n
>   Single stepping until exit from function __register_frame_info,
>   which has no line number information.
>   [Inferior 1 (process 10424) exited normally]
> 
> I think if we want to avoid showing __register_frame_info, we need
> further changes in lookup_minimal_symbol_by_pc_section.  But I don't
> see how this will help us, unless we also allow displaying the above
> message for functions whose names we don't know, perhaps saying
> something like
> 
>   Single stepping until exit from function <unknown>

The problem is not only that we are missing the name, but most 
importantly that we are missing the bounds of the current function.  
With what you've implemented here, GDB thinks there is a function that 
occupies the range [0, 0x401000) (something like that), so it single 
steps until it gets out of that range, but the process exits before that 
happens.

So it kind of works for your use case, but it's not right, IMO.  If the 
process did not exit as it does here, the behavior would be erratic.  If 
GDB doesn't have the data to do something right, it should bail instead 
of doing something random.
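
Here is a simplified model of what I mean, in Python rather than GDB 
code.  It assumes a function's end is the next symbol with a higher 
address, so the exact upper bound it produces may differ from what GDB 
computes; step_range is a hypothetical helper.

```python
def step_range(sorted_syms, pc):
    """Return (start, end) of the function containing pc, or None.

    sorted_syms is a list of (address, name) sorted by address.  None
    means the bounds are unknown and the caller should report an error
    instead of stepping.
    """
    start = None
    for addr, _name in sorted_syms:
        if addr <= pc:
            start = addr          # last symbol at or below pc
    if start is None:
        return None               # nothing precedes pc
    for addr, _name in sorted_syms:
        if addr > start:
            return (start, addr)  # next distinct address ends the range
    return None                   # no following symbol in this toy model


with_bogus = [
    (0x0, "__register_frame_info"),          # unresolved, zero address
    (0x4012A0, "__mingw32_init_mainargs"),
]
without_bogus = with_bogus[1:]

for table in (with_bogus, without_bogus):
    bounds = step_range(table, 0x401288)
    if bounds is None:
        print("cannot find bounds of current function")
    else:
        print("step until pc leaves [%#x, %#x)" % bounds)
```

With the zero-address symbol in the table, the "function" spans 
everything from 0 up to the first real symbol, which is why the stepping 
only appears to work.  Without it, the sketch reports that it has no 
bounds, which is the behavior I'm arguing for.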

>> 2. investigate if there should be some text symbol that should really
>> contain 0x0040126d, that for some reason does not end up in GDB's
>> minimal symbol table.
> 
> The function in which the PC value of 0x401288 lives is
> __mingw_CRTStartup, which ends like this:
> 
>   /* Call the main() function. If the user does not supply one
>    * the one in the 'libmingw32.a' library will be linked in, and
>    * that one calls WinMain().  See main.c in the 'lib' directory
>    * for more details.
>    */
>   nRet = main (_argc, _argv, environ);
> 
>   /* Perform exit processing for the C library. This means flushing
>    * output and calling atexit() registered functions.
>    */
>   _cexit ();
> 
>   ExitProcess (nRet);
> }
> 
> This function is declared in the MinGW runtime sources as follows:
> 
>   static __MINGW_ATTRIB_NORETURN void __mingw_CRTStartup (void);
> 
> But its symbol is not in the symbol table.  Not sure why, perhaps
> because it's a static function?  But the code is there, starting at
> the address 0x4011b0.  The last part, after exiting 'main', which
> corresponds to the above source snippet is this:
> 
>   (gdb) disassemble 0x401283,0x401294
>   Dump of assembler code from 0x401283 to 0x401294:
>      0x00401283 <__register_frame_info+4199043>:  call   0x401460 <main>
>      0x00401288 <__register_frame_info+4199048>:  mov    %eax,%ebx
>      0x0040128a <__register_frame_info+4199050>:  call   0x403a90 <_cexit>
>      0x0040128f <__register_frame_info+4199055>:  mov    %ebx,(%esp)
>      0x00401292 <__register_frame_info+4199058>:  call   0x403b28 <ExitProcess@4>
> 
> So when this problem happens, we are at the "mov %eax,%ebx"
> instruction after exiting 'main', as I'd expect.

Ok, well I think it shows the problem quite clearly: some symbol is 
missing for GDB to work properly in that context.  I think that we 
should improve GDB to handle it better and error out clearly (instead of 
hitting a failed assert), but I don't think it can do much more.

I guess that having debug info for the file containing 
__mingw_CRTStartup would help, if you really needed to step past main?

Simon

