[PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr
@ 2025-04-25 17:18 Tom de Vries
  2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
  2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
  0 siblings, 2 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
  To: gdb-patches

While investigating PR tui/32906, which is about how DWARF errors interact
with TUI, I ran into a two problems unrelated to the PR.

The problems are due to gdb trying to use gdb_stdout/gdb_stderr while they're
nullptr.

This series fixes those problems.

Tom de Vries (2):
  [gdb] Fix sig_write for null gdb_stderr
  [gdb] Handle nullptr stream in gdb_flush

 gdb/bt-utils.c  | 30 +++++++++++++++++++++---------
 gdb/event-top.c | 13 +++++++++++--
 gdb/utils.c     |  3 ++-
 3 files changed, 34 insertions(+), 12 deletions(-)


base-commit: a1537331abaf66fddd3553dd827381a01bcda46c
-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
@ 2025-04-25 17:18 ` Tom de Vries
  2025-04-28 15:17   ` Simon Marchi
  2025-04-29 15:34   ` Tom Tromey
  2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
  1 sibling, 2 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
  To: gdb-patches

When running test-case gdb.tui/tui-layout-asm.exp with target board
dwarf5-fission-debug-types, the test-case fails and I get a core dump:
...
 # of unexpected core files      1
...

Looking at the backtrace of the core file, what seems to be happening is that:
- gdbpy_flush attempts to flush gdb_stdout, which is nullptr
- that causes a segfault
- gdb intercepts this and starts to handle it using handle_fatal_signal
- handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
  which is nullptr,
- that causes another segfault
- gdb exits

I managed to reproduce the problem by the following trigger patch in
stdin_event_handler:
...
-  if (error)
+  if (1 || error)
     {
       current_ui = main_ui;
       ui->unregister_file_handler ();
-      if (main_ui == ui)
+      if (1 || main_ui == ui)
 	{
 	  gdb_printf (gdb_stderr, _("error detected on stdin\n"));
+	  gdb_stderr = nullptr;
+	  gdb_stdout = nullptr;
+	  gdb_stdlog = nullptr;
 	  quit_command ((char *) 0, 0);
 	}
...
which gives us:
...
$ gdb
(gdb) <q> error detected on stdin
Segmentation fault (core dumped)
$ q
...

Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
instead:
...
$ gdb
(gdb) <q> error detected on stdin

Fatal signal: Segmentation fault
----- Backtrace -----
  ...
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

Segmentation fault (core dumped)
$ q
...

Tested on x86_64-linux.
---
 gdb/bt-utils.c  | 30 +++++++++++++++++++++---------
 gdb/event-top.c | 13 +++++++++++--
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
index 8e782450ae9..9e9f680f9ea 100644
--- a/gdb/bt-utils.c
+++ b/gdb/bt-utils.c
@@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
 
   const auto sig_write = [] (const char *msg) -> void
   {
-    gdb_stderr->write_async_safe (msg, strlen (msg));
+    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+      std::ignore = ::write (2, msg, strlen (msg));
+    else
+      gdb_stderr->write_async_safe (msg, strlen (msg));
   };
 
   sig_write ("error creating backtrace: ");
@@ -79,7 +82,10 @@ libbacktrace_print (void *data, uintptr_t pc, const char *filename,
 {
   const auto sig_write = [] (const char *msg) -> void
   {
-    gdb_stderr->write_async_safe (msg, strlen (msg));
+    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+      std::ignore = ::write (2, msg, strlen (msg));
+    else
+      gdb_stderr->write_async_safe (msg, strlen (msg));
   };
 
   /* Buffer to print addresses and line numbers into.  An 8-byte address
@@ -130,14 +136,20 @@ gdb_internal_backtrace_1 ()
 {
   const auto sig_write = [] (const char *msg) -> void
   {
-    gdb_stderr->write_async_safe (msg, strlen (msg));
+    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+      std::ignore = ::write (2, msg, strlen (msg));
+    else
+      gdb_stderr->write_async_safe (msg, strlen (msg));
   };
 
   /* Allow up to 25 frames of backtrace.  */
   void *buffer[25];
   int frames = backtrace (buffer, ARRAY_SIZE (buffer));
 
-  backtrace_symbols_fd (buffer, frames, gdb_stderr->fd ());
+  int fd = ((gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+	    ? 2
+	    : gdb_stderr->fd ());
+  backtrace_symbols_fd (buffer, frames, fd);
   if (frames == ARRAY_SIZE (buffer))
     sig_write (_("Backtrace might be incomplete.\n"));
 }
@@ -173,15 +185,15 @@ gdb_internal_backtrace ()
 #ifdef GDB_PRINT_INTERNAL_BACKTRACE
   const auto sig_write = [] (const char *msg) -> void
   {
-    gdb_stderr->write_async_safe (msg, strlen (msg));
+    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+      std::ignore = ::write (2, msg, strlen (msg));
+    else
+      gdb_stderr->write_async_safe (msg, strlen (msg));
   };
 
   sig_write (str_backtrace);
 
-  if (gdb_stderr->fd () > -1)
-    gdb_internal_backtrace_1 ();
-  else
-    sig_write (str_backtrace_unavailable);
+  gdb_internal_backtrace_1 ();
 
   sig_write ("---------------------\n");
 #endif
diff --git a/gdb/event-top.c b/gdb/event-top.c
index c533e74811d..0c4e06b7749 100644
--- a/gdb/event-top.c
+++ b/gdb/event-top.c
@@ -982,7 +982,10 @@ handle_fatal_signal (int sig)
 #ifdef GDB_PRINT_INTERNAL_BACKTRACE
   const auto sig_write = [] (const char *msg) -> void
   {
-    gdb_stderr->write_async_safe (msg, strlen (msg));
+    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+      std::ignore = ::write (2, msg, strlen (msg));
+    else
+      gdb_stderr->write_async_safe (msg, strlen (msg));
   };
 
   if (bt_on_fatal_signal)
@@ -1027,7 +1030,13 @@ handle_fatal_signal (int sig)
 	}
       sig_write ("\n\n");
 
-      gdb_stderr->flush ();
+      if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+	{
+	  /* Writing to file descriptor instead of stream, no flush
+	     required.  */
+	}
+      else
+	gdb_stderr->flush ();
     }
 #endif
 
-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
  2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
  2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
@ 2025-04-25 17:18 ` Tom de Vries
  2025-04-28 15:21   ` Simon Marchi
  1 sibling, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
  To: gdb-patches

Using the trigger patch described in the previous commit, I get:
...
$ gdb
(gdb) <q>error detected on stdin

Fatal signal: Segmentation fault
----- Backtrace -----
0x64c7b3 gdb_internal_backtrace_1
	/data/vries/gdb/src/gdb/bt-utils.c:127
0x64c937 _Z22gdb_internal_backtracev
	/data/vries/gdb/src/gdb/bt-utils.c:196
0x94db83 handle_fatal_signal
	/data/vries/gdb/src/gdb/event-top.c:1021
0x94dd48 handle_sigsegv
	/data/vries/gdb/src/gdb/event-top.c:1098
0x7f372be578ff ???
0x10b7c0a _Z9gdb_flushP7ui_file
	/data/vries/gdb/src/gdb/utils.c:1527
0xd4b938 gdbpy_flush
	/data/vries/gdb/src/gdb/python/python.c:1624
0x7f372d73b276 _PyCFunction_FastCallDict
	Objects/methodobject.c:231
0x7f372d73b276 _PyCFunction_FastCallKeywords
	Objects/methodobject.c:294
0x7f372d794a09 call_function
	Python/ceval.c:4851
0x7f372d78e838 _PyEval_EvalFrameDefault
	Python/ceval.c:3351
0x7f372d796e6e PyEval_EvalFrameEx
	Python/ceval.c:754
0x7f372d796e6e _PyFunction_FastCall
	Python/ceval.c:4933
0x7f372d796e6e _PyFunction_FastCallDict
	Python/ceval.c:5035
0x7f372d6fefc8 _PyObject_FastCallDict
	Objects/abstract.c:2310
0x7f372d6fefc8 _PyObject_Call_Prepend
	Objects/abstract.c:2373
0x7f372d6fe162 _PyObject_FastCallDict
	Objects/abstract.c:2331
0x7f372d700705 callmethod
	Objects/abstract.c:2583
0x7f372d700705 _PyObject_CallMethodId
	Objects/abstract.c:2640
0x7f372d812a41 flush_std_files
	Python/pylifecycle.c:699
0x7f372d81281d Py_FinalizeEx
	Python/pylifecycle.c:768
0xd4d49b finalize_python
	/data/vries/gdb/src/gdb/python/python.c:2308
0x9587eb _Z17ext_lang_shutdownv
	/data/vries/gdb/src/gdb/extension.c:330
0xfd98df _Z10quit_forcePii
	/data/vries/gdb/src/gdb/top.c:1817
0x6b3080 _Z12quit_commandPKci
	/data/vries/gdb/src/gdb/cli/cli-cmds.c:483
0x1056577 stdin_event_handler
	/data/vries/gdb/src/gdb/ui.c:131
0x1986970 handle_file_event
	/data/vries/gdb/src/gdbsupport/event-loop.cc:551
0x1986f4b gdb_wait_for_event
	/data/vries/gdb/src/gdbsupport/event-loop.cc:672
0x1985e0c _Z16gdb_do_one_eventi
	/data/vries/gdb/src/gdbsupport/event-loop.cc:263
0xb66f2e start_event_loop
	/data/vries/gdb/src/gdb/main.c:402
0xb670ba captured_command_loop
	/data/vries/gdb/src/gdb/main.c:466
0xb68b9b captured_main
	/data/vries/gdb/src/gdb/main.c:1344
0xb68c44 _Z8gdb_mainP18captured_main_args
	/data/vries/gdb/src/gdb/main.c:1363
0x41a3b1 main
	/data/vries/gdb/src/gdb/gdb.c:38
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

Segmentation fault (core dumped)
$ q
...

Fix this in gdb_flush by checking for nullptr stream parameter, such that we
get instead:
...
$ gdb
(gdb) <q>error detected on stdin
$ q
...

Tested on x86_64-linux.
---
 gdb/utils.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gdb/utils.c b/gdb/utils.c
index ce3c26ef140..90063e4a96d 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -1524,7 +1524,8 @@ pager_file::flush ()
 void
 gdb_flush (struct ui_file *stream)
 {
-  stream->flush ();
+  if (stream != nullptr)
+    stream->flush ();
 }
 
 /* See utils.h.  */
-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
@ 2025-04-28 15:17   ` Simon Marchi
  2025-04-29 11:50     ` Tom de Vries
  2025-04-29 15:34   ` Tom Tromey
  1 sibling, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 15:17 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 4/25/25 1:18 PM, Tom de Vries wrote:
> When running test-case gdb.tui/tui-layout-asm.exp with target board
> dwarf5-fission-debug-types, the test-case fails and I get a core dump:
> ...
>  # of unexpected core files      1
> ...
> 
> Looking at the backtrace of the core file, what seems to be happening is that:
> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
> - that causes a segfault
> - gdb intercepts this and starts to handle it using handle_fatal_signal
> - handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
>   which is nullptr,
> - that causes another segfault
> - gdb exits
> 
> I managed to reproduce the problem by the following trigger patch in
> stdin_event_handler:
> ...
> -  if (error)
> +  if (1 || error)
>      {
>        current_ui = main_ui;
>        ui->unregister_file_handler ();
> -      if (main_ui == ui)
> +      if (1 || main_ui == ui)
>  	{
>  	  gdb_printf (gdb_stderr, _("error detected on stdin\n"));
> +	  gdb_stderr = nullptr;
> +	  gdb_stdout = nullptr;
> +	  gdb_stdlog = nullptr;
>  	  quit_command ((char *) 0, 0);
>  	}
> ...
> which gives us:
> ...
> $ gdb
> (gdb) <q> error detected on stdin
> Segmentation fault (core dumped)
> $ q
> ...
> 
> Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
> instead:
> ...
> $ gdb
> (gdb) <q> error detected on stdin
> 
> Fatal signal: Segmentation fault
> ----- Backtrace -----
>   ...
> ---------------------
> A fatal error internal to GDB has been detected, further
> debugging is not possible.  GDB will now terminate.
> 
> This is a bug, please report it.  For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
> 
> Segmentation fault (core dumped)
> $ q
> ...
> 
> Tested on x86_64-linux.
> ---
>  gdb/bt-utils.c  | 30 +++++++++++++++++++++---------
>  gdb/event-top.c | 13 +++++++++++--
>  2 files changed, 32 insertions(+), 11 deletions(-)
> 
> diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
> index 8e782450ae9..9e9f680f9ea 100644
> --- a/gdb/bt-utils.c
> +++ b/gdb/bt-utils.c
> @@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
>  
>    const auto sig_write = [] (const char *msg) -> void
>    {
> -    gdb_stderr->write_async_safe (msg, strlen (msg));
> +    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
> +      std::ignore = ::write (2, msg, strlen (msg));

TIL about std::ignore.

Is it possible to factor out the lambdas:

  const auto sig_write = [] (const char *msg) -> void
  {
    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
      std::ignore = ::write (2, msg, strlen (msg));
    else
      gdb_stderr->write_async_safe (msg, strlen (msg));
  };

to a function?  At first glance they all seem to be the same.

Otherwise, LGTM.

Approved-By: Simon Marchi <simon.marchi@efficios.com>

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
  2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
@ 2025-04-28 15:21   ` Simon Marchi
  2025-04-28 16:32     ` Tom de Vries
  0 siblings, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 15:21 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 4/25/25 1:18 PM, Tom de Vries wrote:
> Using the trigger patch described in the previous commit, I get:
> ...
> $ gdb
> (gdb) <q>error detected on stdin
> 
> Fatal signal: Segmentation fault
> ----- Backtrace -----
> 0x64c7b3 gdb_internal_backtrace_1
> 	/data/vries/gdb/src/gdb/bt-utils.c:127
> 0x64c937 _Z22gdb_internal_backtracev
> 	/data/vries/gdb/src/gdb/bt-utils.c:196
> 0x94db83 handle_fatal_signal
> 	/data/vries/gdb/src/gdb/event-top.c:1021
> 0x94dd48 handle_sigsegv
> 	/data/vries/gdb/src/gdb/event-top.c:1098
> 0x7f372be578ff ???
> 0x10b7c0a _Z9gdb_flushP7ui_file
> 	/data/vries/gdb/src/gdb/utils.c:1527
> 0xd4b938 gdbpy_flush
> 	/data/vries/gdb/src/gdb/python/python.c:1624
> 0x7f372d73b276 _PyCFunction_FastCallDict
> 	Objects/methodobject.c:231
> 0x7f372d73b276 _PyCFunction_FastCallKeywords
> 	Objects/methodobject.c:294
> 0x7f372d794a09 call_function
> 	Python/ceval.c:4851
> 0x7f372d78e838 _PyEval_EvalFrameDefault
> 	Python/ceval.c:3351
> 0x7f372d796e6e PyEval_EvalFrameEx
> 	Python/ceval.c:754
> 0x7f372d796e6e _PyFunction_FastCall
> 	Python/ceval.c:4933
> 0x7f372d796e6e _PyFunction_FastCallDict
> 	Python/ceval.c:5035
> 0x7f372d6fefc8 _PyObject_FastCallDict
> 	Objects/abstract.c:2310
> 0x7f372d6fefc8 _PyObject_Call_Prepend
> 	Objects/abstract.c:2373
> 0x7f372d6fe162 _PyObject_FastCallDict
> 	Objects/abstract.c:2331
> 0x7f372d700705 callmethod
> 	Objects/abstract.c:2583
> 0x7f372d700705 _PyObject_CallMethodId
> 	Objects/abstract.c:2640
> 0x7f372d812a41 flush_std_files
> 	Python/pylifecycle.c:699
> 0x7f372d81281d Py_FinalizeEx
> 	Python/pylifecycle.c:768
> 0xd4d49b finalize_python
> 	/data/vries/gdb/src/gdb/python/python.c:2308
> 0x9587eb _Z17ext_lang_shutdownv
> 	/data/vries/gdb/src/gdb/extension.c:330
> 0xfd98df _Z10quit_forcePii
> 	/data/vries/gdb/src/gdb/top.c:1817
> 0x6b3080 _Z12quit_commandPKci
> 	/data/vries/gdb/src/gdb/cli/cli-cmds.c:483
> 0x1056577 stdin_event_handler
> 	/data/vries/gdb/src/gdb/ui.c:131
> 0x1986970 handle_file_event
> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:551
> 0x1986f4b gdb_wait_for_event
> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:672
> 0x1985e0c _Z16gdb_do_one_eventi
> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:263
> 0xb66f2e start_event_loop
> 	/data/vries/gdb/src/gdb/main.c:402
> 0xb670ba captured_command_loop
> 	/data/vries/gdb/src/gdb/main.c:466
> 0xb68b9b captured_main
> 	/data/vries/gdb/src/gdb/main.c:1344
> 0xb68c44 _Z8gdb_mainP18captured_main_args
> 	/data/vries/gdb/src/gdb/main.c:1363
> 0x41a3b1 main
> 	/data/vries/gdb/src/gdb/gdb.c:38
> ---------------------
> A fatal error internal to GDB has been detected, further
> debugging is not possible.  GDB will now terminate.
> 
> This is a bug, please report it.  For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
> 
> Segmentation fault (core dumped)
> $ q
> ...
> 
> Fix this in gdb_flush by checking for nullptr stream parameter, such that we
> get instead:
> ...
> $ gdb
> (gdb) <q>error detected on stdin
> $ q
> ...
> 
> Tested on x86_64-linux.
> ---
>  gdb/utils.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/utils.c b/gdb/utils.c
> index ce3c26ef140..90063e4a96d 100644
> --- a/gdb/utils.c
> +++ b/gdb/utils.c
> @@ -1524,7 +1524,8 @@ pager_file::flush ()
>  void
>  gdb_flush (struct ui_file *stream)
>  {
> -  stream->flush ();
> +  if (stream != nullptr)
> +    stream->flush ();

Can we add the nullptr check only where it's needed?

I'm thinking that we should move towards removing gdb_flush (replacing
the call sites to call ui_file::flush directly).  And when we do so, we
wouldn't want to add nullptr checks everywhere that gdb_flush is
currently called.

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
  2025-04-28 15:21   ` Simon Marchi
@ 2025-04-28 16:32     ` Tom de Vries
  2025-04-28 16:38       ` Simon Marchi
  0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-28 16:32 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 4/28/25 17:21, Simon Marchi wrote:
> On 4/25/25 1:18 PM, Tom de Vries wrote:
>> Using the trigger patch described in the previous commit, I get:
>> ...
>> $ gdb
>> (gdb) <q>error detected on stdin
>>
>> Fatal signal: Segmentation fault
>> ----- Backtrace -----
>> 0x64c7b3 gdb_internal_backtrace_1
>> 	/data/vries/gdb/src/gdb/bt-utils.c:127
>> 0x64c937 _Z22gdb_internal_backtracev
>> 	/data/vries/gdb/src/gdb/bt-utils.c:196
>> 0x94db83 handle_fatal_signal
>> 	/data/vries/gdb/src/gdb/event-top.c:1021
>> 0x94dd48 handle_sigsegv
>> 	/data/vries/gdb/src/gdb/event-top.c:1098
>> 0x7f372be578ff ???
>> 0x10b7c0a _Z9gdb_flushP7ui_file
>> 	/data/vries/gdb/src/gdb/utils.c:1527
>> 0xd4b938 gdbpy_flush
>> 	/data/vries/gdb/src/gdb/python/python.c:1624
>> 0x7f372d73b276 _PyCFunction_FastCallDict
>> 	Objects/methodobject.c:231
>> 0x7f372d73b276 _PyCFunction_FastCallKeywords
>> 	Objects/methodobject.c:294
>> 0x7f372d794a09 call_function
>> 	Python/ceval.c:4851
>> 0x7f372d78e838 _PyEval_EvalFrameDefault
>> 	Python/ceval.c:3351
>> 0x7f372d796e6e PyEval_EvalFrameEx
>> 	Python/ceval.c:754
>> 0x7f372d796e6e _PyFunction_FastCall
>> 	Python/ceval.c:4933
>> 0x7f372d796e6e _PyFunction_FastCallDict
>> 	Python/ceval.c:5035
>> 0x7f372d6fefc8 _PyObject_FastCallDict
>> 	Objects/abstract.c:2310
>> 0x7f372d6fefc8 _PyObject_Call_Prepend
>> 	Objects/abstract.c:2373
>> 0x7f372d6fe162 _PyObject_FastCallDict
>> 	Objects/abstract.c:2331
>> 0x7f372d700705 callmethod
>> 	Objects/abstract.c:2583
>> 0x7f372d700705 _PyObject_CallMethodId
>> 	Objects/abstract.c:2640
>> 0x7f372d812a41 flush_std_files
>> 	Python/pylifecycle.c:699
>> 0x7f372d81281d Py_FinalizeEx
>> 	Python/pylifecycle.c:768
>> 0xd4d49b finalize_python
>> 	/data/vries/gdb/src/gdb/python/python.c:2308
>> 0x9587eb _Z17ext_lang_shutdownv
>> 	/data/vries/gdb/src/gdb/extension.c:330
>> 0xfd98df _Z10quit_forcePii
>> 	/data/vries/gdb/src/gdb/top.c:1817
>> 0x6b3080 _Z12quit_commandPKci
>> 	/data/vries/gdb/src/gdb/cli/cli-cmds.c:483
>> 0x1056577 stdin_event_handler
>> 	/data/vries/gdb/src/gdb/ui.c:131
>> 0x1986970 handle_file_event
>> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:551
>> 0x1986f4b gdb_wait_for_event
>> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:672
>> 0x1985e0c _Z16gdb_do_one_eventi
>> 	/data/vries/gdb/src/gdbsupport/event-loop.cc:263
>> 0xb66f2e start_event_loop
>> 	/data/vries/gdb/src/gdb/main.c:402
>> 0xb670ba captured_command_loop
>> 	/data/vries/gdb/src/gdb/main.c:466
>> 0xb68b9b captured_main
>> 	/data/vries/gdb/src/gdb/main.c:1344
>> 0xb68c44 _Z8gdb_mainP18captured_main_args
>> 	/data/vries/gdb/src/gdb/main.c:1363
>> 0x41a3b1 main
>> 	/data/vries/gdb/src/gdb/gdb.c:38
>> ---------------------
>> A fatal error internal to GDB has been detected, further
>> debugging is not possible.  GDB will now terminate.
>>
>> This is a bug, please report it.  For instructions, see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>>
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Fix this in gdb_flush by checking for nullptr stream parameter, such that we
>> get instead:
>> ...
>> $ gdb
>> (gdb) <q>error detected on stdin
>> $ q
>> ...
>>
>> Tested on x86_64-linux.
>> ---
>>   gdb/utils.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/utils.c b/gdb/utils.c
>> index ce3c26ef140..90063e4a96d 100644
>> --- a/gdb/utils.c
>> +++ b/gdb/utils.c
>> @@ -1524,7 +1524,8 @@ pager_file::flush ()
>>   void
>>   gdb_flush (struct ui_file *stream)
>>   {
>> -  stream->flush ();
>> +  if (stream != nullptr)
>> +    stream->flush ();
> 

Hi Simon,

thanks for the review.

> Can we add the nullptr check only where it's needed?
> 

AFAIU, I've added it where it's needed.

Can you tell me where you think it's needed?  I cannot tell from your 
comments.

Thanks,
- Tom

> I'm thinking that we should move towards removing gdb_flush (replacing
> the call sites to call ui_file::flush directly).  And when we do so, we
> wouldn't want to add nullptr checks everywhere that gdb_flush is
> currently called.
> 
> Simon


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
  2025-04-28 16:32     ` Tom de Vries
@ 2025-04-28 16:38       ` Simon Marchi
  2025-04-29 11:52         ` Tom de Vries
  0 siblings, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 16:38 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 4/28/25 12:32 PM, Tom de Vries wrote:
> AFAIU, I've added it where it's needed.
> 
> Can you tell me where you think it's needed?  I cannot tell from your comments.

Based on the stack trace in your commit message, it would be in
gdbpy_flush.  If I understand correctly, this call site is special
because it can be called when GDB is getting torn down, and I suppose
because the streams are closed before Python gets finalized.  Most calls
to gdb_flush are done on streams that we know are not nullptr/not
closed.

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-28 15:17   ` Simon Marchi
@ 2025-04-29 11:50     ` Tom de Vries
  0 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-29 11:50 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 4/28/25 17:17, Simon Marchi wrote:
> On 4/25/25 1:18 PM, Tom de Vries wrote:
>> When running test-case gdb.tui/tui-layout-asm.exp with target board
>> dwarf5-fission-debug-types, the test-case fails and I get a core dump:
>> ...
>>   # of unexpected core files      1
>> ...
>>
>> Looking at the backtrace of the core file, what seems to be happening is that:
>> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
>> - that causes a segfault
>> - gdb intercepts this and starts to handle it using handle_fatal_signal
>> - handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
>>    which is nullptr,
>> - that causes another segfault
>> - gdb exits
>>
>> I managed to reproduce the problem by the following trigger patch in
>> stdin_event_handler:
>> ...
>> -  if (error)
>> +  if (1 || error)
>>       {
>>         current_ui = main_ui;
>>         ui->unregister_file_handler ();
>> -      if (main_ui == ui)
>> +      if (1 || main_ui == ui)
>>   	{
>>   	  gdb_printf (gdb_stderr, _("error detected on stdin\n"));
>> +	  gdb_stderr = nullptr;
>> +	  gdb_stdout = nullptr;
>> +	  gdb_stdlog = nullptr;
>>   	  quit_command ((char *) 0, 0);
>>   	}
>> ...
>> which gives us:
>> ...
>> $ gdb
>> (gdb) <q> error detected on stdin
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
>> instead:
>> ...
>> $ gdb
>> (gdb) <q> error detected on stdin
>>
>> Fatal signal: Segmentation fault
>> ----- Backtrace -----
>>    ...
>> ---------------------
>> A fatal error internal to GDB has been detected, further
>> debugging is not possible.  GDB will now terminate.
>>
>> This is a bug, please report it.  For instructions, see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>>
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Tested on x86_64-linux.
>> ---
>>   gdb/bt-utils.c  | 30 +++++++++++++++++++++---------
>>   gdb/event-top.c | 13 +++++++++++--
>>   2 files changed, 32 insertions(+), 11 deletions(-)
>>
>> diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
>> index 8e782450ae9..9e9f680f9ea 100644
>> --- a/gdb/bt-utils.c
>> +++ b/gdb/bt-utils.c
>> @@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
>>   
>>     const auto sig_write = [] (const char *msg) -> void
>>     {
>> -    gdb_stderr->write_async_safe (msg, strlen (msg));
>> +    if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
>> +      std::ignore = ::write (2, msg, strlen (msg));
> 
> TIL about std::ignore.
> 
> Is it possible to factor out the lambdas:
> 
>    const auto sig_write = [] (const char *msg) -> void
>    {
>      if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
>        std::ignore = ::write (2, msg, strlen (msg));
>      else
>        gdb_stderr->write_async_safe (msg, strlen (msg));
>    };
> 
> to a function?  At first glance they all seem to be the same.
> 

I've done this in a v2 ( 
https://sourceware.org/pipermail/gdb-patches/2025-April/217564.html ) in 
a separate patch.

Thanks,
- Tom

> Otherwise, LGTM.
> 
> Approved-By: Simon Marchi <simon.marchi@efficios.com>
> 
> Simon


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
  2025-04-28 16:38       ` Simon Marchi
@ 2025-04-29 11:52         ` Tom de Vries
  0 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-29 11:52 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 4/28/25 18:38, Simon Marchi wrote:
> On 4/28/25 12:32 PM, Tom de Vries wrote:
>> AFAIU, I've added it where it's needed.
>>
>> Can you tell me where you think it's needed?  I cannot tell from your comments.
> 
> Based on the stack trace in your commit message, it would be in
> gdbpy_flush.  If I understand correctly, this call site is special
> because it can be called when GDB is getting torn down, and I suppose
> because the streams are closed before Python gets finalized.  Most calls
> to gdb_flush are done on streams that we know are not nullptr/not
> closed.

Done in a v2 ( 
https://sourceware.org/pipermail/gdb-patches/2025-April/217565.html ).

This also required fixing ioscm_flush.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
  2025-04-28 15:17   ` Simon Marchi
@ 2025-04-29 15:34   ` Tom Tromey
  2025-04-30 18:49     ` Tom de Vries
  1 sibling, 1 reply; 13+ messages in thread
From: Tom Tromey @ 2025-04-29 15:34 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
Tom> - that causes a segfault

When can this be nullptr?
That seems surprising to me.

Tom

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-29 15:34   ` Tom Tromey
@ 2025-04-30 18:49     ` Tom de Vries
  2025-04-30 19:11       ` Tom de Vries
  0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-30 18:49 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 4/29/25 17:34, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
> Tom> - that causes a segfault
> 
> When can this be nullptr?
> That seems surprising to me.

I managed to recognize the situation using:
...
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 1b4cc82cce8..fa4186aea3b 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -869,6 +869,7 @@ tui_setup_io (int mode)
        /* Restore gdb output.  */
        gdb_stdout = tui_old_stdout;
        gdb_stderr = tui_old_stderr;
+      gdb_assert (gdb_stderr != nullptr);
        gdb_stdlog = tui_old_stdlog;
        gdb_stdtarg = gdb_stderr;
        current_uiout = tui_old_uiout;
...

And the backtrace (in an unusual form because of recursion) is:
...
#19000 0x000000000041a3b2 in main (argc=14, argv=0x7ffc5c28c168)
     at /data/vries/gdb/src/gdb/gdb.c:38
38	  return gdb_main (&args);
(gdb) down
#18999 0x0000000000b68937 in gdb_main (args=0x7ffc5c28c030)
     at /data/vries/gdb/src/gdb/main.c:1363
1363	      captured_main (args);
(gdb)
#18998 0x0000000000b6888e in captured_main (data=0x7ffc5c28c030)
     at /data/vries/gdb/src/gdb/main.c:1344
1344		  captured_command_loop ();
(gdb)
#18997 0x0000000000b66dad in captured_command_loop ()
     at /data/vries/gdb/src/gdb/main.c:466
466	  start_event_loop ();
(gdb)
#18996 0x0000000000b66c21 in start_event_loop ()
     at /data/vries/gdb/src/gdb/main.c:402
402		  result = gdb_do_one_event ();
(gdb)
#18995 0x0000000001985e31 in gdb_do_one_event (mstimeout=-1)
     at /data/vries/gdb/src/gdbsupport/event-loop.cc:263
263	  return gdb_wait_for_event (1);
(gdb)
#18994 0x0000000001986f70 in gdb_wait_for_event (block=1)
     at /data/vries/gdb/src/gdbsupport/event-loop.cc:672
672	      handle_file_event (file_ptr, mask);
(gdb)
#18993 0x0000000001986995 in handle_file_event (file_ptr=0x3b26aff0,
     ready_mask=25) at /data/vries/gdb/src/gdbsupport/event-loop.cc:551
551	      file_ptr->proc (file_ptr->error, file_ptr->client_data);
(gdb)
#18992 0x00000000010563ad in stdin_event_handler (error=1,
     client_data=0x3aec7120) at /data/vries/gdb/src/gdb/ui.c:128
128		  quit_command ((char *) 0, 0);
(gdb)
#18991 0x00000000006b2e63 in quit_command (args=0x0, from_tty=0)
     at /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
483	  quit_force (args ? &exit_code : NULL, from_tty);
(gdb)
#18990 0x0000000000fd94dc in quit_force (exit_arg=0x0, from_tty=0)
     at /data/vries/gdb/src/gdb/top.c:1750
1750	  undo_terminal_modifications_before_exit ();
(gdb)
#18989 0x0000000000fd942f in undo_terminal_modifications_before_exit ()
     at /data/vries/gdb/src/gdb/top.c:1718
1718	  tui_disable ();
(gdb)
#18988 0x000000000103fa11 in tui_disable ()
     at /data/vries/gdb/src/gdb/tui/tui.c:562
562	  tui_setup_io (0);
(gdb)
#18987 0x00000000010129d4 in tui_setup_io (mode=0)
     at /data/vries/gdb/src/gdb/tui/tui-io.c:872
872	      gdb_assert (gdb_stderr != nullptr);
(gdb)
...

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-30 18:49     ` Tom de Vries
@ 2025-04-30 19:11       ` Tom de Vries
  2025-05-02 18:45         ` Tom Tromey
  0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-30 19:11 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 4/30/25 20:49, Tom de Vries wrote:
> On 4/29/25 17:34, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>
>> Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
>> Tom> - that causes a segfault
>>
>> When can this be nullptr?
>> That seems surprising to me.
> 
> I managed to recognize the situation using:
> ...
> diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
> index 1b4cc82cce8..fa4186aea3b 100644
> --- a/gdb/tui/tui-io.c
> +++ b/gdb/tui/tui-io.c
> @@ -869,6 +869,7 @@ tui_setup_io (int mode)
>         /* Restore gdb output.  */
>         gdb_stdout = tui_old_stdout;
>         gdb_stderr = tui_old_stderr;
> +      gdb_assert (gdb_stderr != nullptr);
>         gdb_stdlog = tui_old_stdlog;
>         gdb_stdtarg = gdb_stderr;
>         current_uiout = tui_old_uiout;
> ...
> 
> And the backtrace (in an unusual form because of recursion) is:
> ...
> #19000 0x000000000041a3b2 in main (argc=14, argv=0x7ffc5c28c168)
>      at /data/vries/gdb/src/gdb/gdb.c:38
> 38      return gdb_main (&args);
> (gdb) down
> #18999 0x0000000000b68937 in gdb_main (args=0x7ffc5c28c030)
>      at /data/vries/gdb/src/gdb/main.c:1363
> 1363          captured_main (args);
> (gdb)
> #18998 0x0000000000b6888e in captured_main (data=0x7ffc5c28c030)
>      at /data/vries/gdb/src/gdb/main.c:1344
> 1344          captured_command_loop ();
> (gdb)
> #18997 0x0000000000b66dad in captured_command_loop ()
>      at /data/vries/gdb/src/gdb/main.c:466
> 466      start_event_loop ();
> (gdb)
> #18996 0x0000000000b66c21 in start_event_loop ()
>      at /data/vries/gdb/src/gdb/main.c:402
> 402          result = gdb_do_one_event ();
> (gdb)
> #18995 0x0000000001985e31 in gdb_do_one_event (mstimeout=-1)
>      at /data/vries/gdb/src/gdbsupport/event-loop.cc:263
> 263      return gdb_wait_for_event (1);
> (gdb)
> #18994 0x0000000001986f70 in gdb_wait_for_event (block=1)
>      at /data/vries/gdb/src/gdbsupport/event-loop.cc:672
> 672          handle_file_event (file_ptr, mask);
> (gdb)
> #18993 0x0000000001986995 in handle_file_event (file_ptr=0x3b26aff0,
>      ready_mask=25) at /data/vries/gdb/src/gdbsupport/event-loop.cc:551
> 551          file_ptr->proc (file_ptr->error, file_ptr->client_data);
> (gdb)
> #18992 0x00000000010563ad in stdin_event_handler (error=1,
>      client_data=0x3aec7120) at /data/vries/gdb/src/gdb/ui.c:128
> 128          quit_command ((char *) 0, 0);
> (gdb)
> #18991 0x00000000006b2e63 in quit_command (args=0x0, from_tty=0)
>      at /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
> 483      quit_force (args ? &exit_code : NULL, from_tty);
> (gdb)
> #18990 0x0000000000fd94dc in quit_force (exit_arg=0x0, from_tty=0)
>      at /data/vries/gdb/src/gdb/top.c:1750
> 1750      undo_terminal_modifications_before_exit ();
> (gdb)
> #18989 0x0000000000fd942f in undo_terminal_modifications_before_exit ()
>      at /data/vries/gdb/src/gdb/top.c:1718
> 1718      tui_disable ();
> (gdb)
> #18988 0x000000000103fa11 in tui_disable ()
>      at /data/vries/gdb/src/gdb/tui/tui.c:562
> 562      tui_setup_io (0);
> (gdb)
> #18987 0x00000000010129d4 in tui_setup_io (mode=0)
>      at /data/vries/gdb/src/gdb/tui/tui-io.c:872
> 872          gdb_assert (gdb_stderr != nullptr);
> (gdb)
> ...

OK, I understand now how this happens.

The relevant code is in tui_enable.

While initializing, we set tui_active to true, because:
...
       /* We must mark the tui sub-system active before trying to setup
          the
          current layout as tui windows defined by an extension language 

          rely on this flag being true in order to know that the window 

          they are creating is currently valid.  */
       tui_active = true;
...

After that we call tui_set_initial_layout, and the DWARF error is thrown.

It's not caught early enough to continue executing tui_enable.

Consequently, tui_setup_io (1) is not called.

Then when undo_terminal_modifications_before_exit is called, it calls 
tui_disable, which checks for tui_active, and then calls setup_io (0) 
assuming setup_io (1) has been called, which is not the case.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
  2025-04-30 19:11       ` Tom de Vries
@ 2025-05-02 18:45         ` Tom Tromey
  0 siblings, 0 replies; 13+ messages in thread
From: Tom Tromey @ 2025-05-02 18:45 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom>       tui_active = true;
Tom> ...

Tom> After that we call tui_set_initial_layout, and the DWARF error is thrown.

Tom> It's not caught early enough to continue executing tui_enable.

Tom> Consequently, tui_setup_io (1) is not called.

Tom> Then when undo_terminal_modifications_before_exit is called, it calls
Tom> tui_disable, which checks for tui_active, and then calls setup_io (0)
Tom> assuming setup_io (1) has been called, which is not the case.

It sounds like an invariant isn't being kept here, I wonder if there
should be code to reset tui_active if tui_set_initial_layout throws?

Tom

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-05-02 18:45 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
2025-04-28 15:17   ` Simon Marchi
2025-04-29 11:50     ` Tom de Vries
2025-04-29 15:34   ` Tom Tromey
2025-04-30 18:49     ` Tom de Vries
2025-04-30 19:11       ` Tom de Vries
2025-05-02 18:45         ` Tom Tromey
2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
2025-04-28 15:21   ` Simon Marchi
2025-04-28 16:32     ` Tom de Vries
2025-04-28 16:38       ` Simon Marchi
2025-04-29 11:52         ` Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox