On 10/16/24 7:50 AM, Andrew Burgess wrote:
Guinevere Larsen <guinevere@redhat.com> writes:

I managed to reproduce a crash on Linux with a similar setup, but my 
backtrace looks a little different. In my version, GDB is trying to 
setup solib hooks as a "post start inferior" step, whereas on yours it 
seems that GDB is trying to print the location at which the inferior is 
stopped.

In both cases, though, what happens is that we're calling 
find_pc_section, which will try to create a map of sections for quick 
lookup, but since we don't understand the file format, GDB thinks we 
have no sections (and at least in the linux case, the pointer to the 
sections is null). so the issue is not that the offset is -1, but rather 
that this->the_bfd_section is 0x0.

If I'm correct that this is the cause of the crash on windows as well as 
linux, I think this should fix the crash. Could you sanity check it for me?


diff --git a/gdb/objfiles.c b/gdb/objfiles.c
index 0e076fe36be..cc8bb253d11 100644
--- a/gdb/objfiles.c
+++ b/gdb/objfiles.c
@@ -1053,6 +1053,11 @@ update_section_map (struct program_space *pspace,
    gdb_assert (pspace_info->section_map_dirty != 0
               || pspace_info->new_objfiles_available != 0);

+  /* If there are 0 sections, or the point to the sections is null, it 
makes
+     no sense to try and have a section map at all. */
+  if (pspace_info->num_sections == 0 || pspace_info->sections == nullptr)
+ return;
+
    map = *pmap;
    xfree (map);

This does fix the crash in my setup (Windows gdbserver with Linux GDB
configured to only support ELF), but I'm not sure this is the approach
that I would take.

While it is true that this change allows GDB to handle an objfile that
is can't otherwise understand, I wonder, how useful is it to even keep
the objfile around at all?

GDB realises that it can't handle the objfile in find_sym_fns, this is
where the "I'm sorry, Dave, I can't do that" message comes from.  The
backtrace at this point is:

  #0  find_sym_fns (abfd=0x21b4fa0) at ../../src/gdb/symfile.c:1808
  #1  0x0000000000c13002 in syms_from_objfile_1 (objfile=0x21d3bf0, addrs=0x0, add_flags=...) at ../../src/gdb/symfile.c:916
  #2  0x0000000000c1330c in syms_from_objfile (objfile=0x21d3bf0, addrs=0x0, add_flags=...) at ../../src/gdb/symfile.c:998
  #3  0x0000000000c13838 in symbol_file_add_with_addrs (abfd=..., name=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", add_flags=..., addrs=0x0, flags=..., parent=0x0) at ../../src/gdb/symfile.c:1104
  #4  0x0000000000c13bd1 in symbol_file_add_from_bfd (abfd=..., name=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", add_flags=..., addrs=0x0, flags=..., parent=0x0) at ../../src/gdb/symfile.c:1180
  #5  0x0000000000c13c20 in symbol_file_add (name=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", add_flags=..., addrs=0x0, flags=...) at ../../src/gdb/symfile.c:1193
  #6  0x0000000000c13ce6 in symbol_file_add_main_1 (args=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", add_flags=..., flags=..., reloff=0x0) at ../../src/gdb/symfile.c:1217
  #7  0x0000000000c13c8d in symbol_file_add_main (args=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", add_flags=...) at ../../src/gdb/symfile.c:1208
  #8  0x000000000081cfa3 in try_open_exec_file (exec_file_host=0x21aeab0 "target:C:/msys64/home/andrew/segv.exe", inf=0x1a1e9f0, add_flags=...) at ../../src/gdb/exec.c:202
  #9  0x000000000081d863 in exec_file_locate_attach (pid=42000, defer_bp_reset=0, from_tty=1) at ../../src/gdb/exec.c:344

The exception thrown in find_sym_fns is not caught until
try_open_exec_file.

The objfile itself is created in symbol_file_add_with_addrs, and
interestingly, if we checkout the end of this function we'll find this
line:

  gdb::observers::new_objfile.notify (objfile);

which is going to be completely skipped if we throw an exception as we
do in this case.

I'm tempted to suggest that what we should consider is removing the
objfile if it turns out that GDB doesn't understand it.  Attached at the
end of this email is a **very** rough patch which does this.  After
creating the objfile we immediately wrap it in an struct, now if we
throw an exception then the objfile is auto-removed from the program
space (and deleted).  Only if we reach the end of
symbol_file_add_with_addrs do we call a method on the local variable to
mark the objfile as successfully created.
This makes sense. I was a bit worried about what would happen if there were no objfiles for a program space (since that is likely to happen if we don't support the main objfile format), but testing with your rough patch doesn't show any problems in a basic "read, break, run, change memory" test session, so it should be ok enough, considering this isn't meant to be supported.

I'd be tempted to convert the `if` checks you added into `gdb_assert`
calls, at least initially.  I'm not sure if it's possible to have a
"valid" objfile that we understand that would otherwise trigger that
`if`, hence asserts for now.

I tested this and, it doesn't work. Adding the asserts give me hundreds of errors in the gdb.base subfolder alone. Which must also mean that my patch is incorrect.

I'm going to take a quick look that the errors makes sense and aren't an unrelated bug, but I think we should just go with your solution...


The patch below is really **very** rough.  I'm tempted to think that
objfile::make should return the object which auto-removes the objfile,
that way anyone who creates an objfile is forced to think about this
pattern of creation, validation, accepting...
I tested it, things also seem to work. I'll work a bit on the change to objfile::make that you suggested, this looks like a good direction to go on.
-- 
Cheers,
Guinevere Larsen
She/Her/Hers

Anyway, would be interested to hear your thoughts.

Thanks,
Andrew

---

diff --git a/gdb/symfile.c b/gdb/symfile.c
index 1502fdbe500..17a7be6093b 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -854,6 +854,33 @@ init_entry_point_info (struct objfile *objfile)
     }
 }
 
+struct scoped_objfile_remover
+{
+  explicit scoped_objfile_remover (struct objfile *objfile)
+    : m_objfile (objfile)
+  {
+    /* Nothing.  */
+  }
+
+  ~scoped_objfile_remover ()
+  {
+    if (m_objfile != nullptr)
+      {
+	fprintf (stderr, "APB: Removing the objfile, something went wrong\n");
+	m_objfile->unlink ();
+      }
+  }
+
+  void release_objfile ()
+  {
+    m_objfile = nullptr;
+  }
+
+private:
+
+  struct objfile *m_objfile;
+};
+
 /* Process a symbol file, as either the main file or as a dynamically
    loaded file.
 
@@ -1061,6 +1088,7 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name,
 
   objfile *objfile
     = objfile::make (abfd, current_program_space, name, flags, parent);
+  scoped_objfile_remover remove_objfile_on_error (objfile);
 
   /* We either created a new mapped symbol table, mapped an existing
      symbol table file which has not had initial symbol reading
@@ -1112,6 +1140,9 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name,
   if (objfile->sf != nullptr)
     finish_new_objfile (objfile, add_flags);
 
+  /* No errors.  The objfile is officially added.  */
+  remove_objfile_on_error.release_objfile ();
+
   gdb::observers::new_objfile.notify (objfile);
 
   return objfile;