* [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr
@ 2025-04-25 17:18 Tom de Vries
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
0 siblings, 2 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
To: gdb-patches
While investigating PR tui/32906, which is about how DWARF errors interact
with TUI, I ran into a two problems unrelated to the PR.
The problems are due to gdb trying to use gdb_stdout/gdb_stderr while they're
nullptr.
This series fixes those problems.
Tom de Vries (2):
[gdb] Fix sig_write for null gdb_stderr
[gdb] Handle nullptr stream in gdb_flush
gdb/bt-utils.c | 30 +++++++++++++++++++++---------
gdb/event-top.c | 13 +++++++++++--
gdb/utils.c | 3 ++-
3 files changed, 34 insertions(+), 12 deletions(-)
base-commit: a1537331abaf66fddd3553dd827381a01bcda46c
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
@ 2025-04-25 17:18 ` Tom de Vries
2025-04-28 15:17 ` Simon Marchi
2025-04-29 15:34 ` Tom Tromey
2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
1 sibling, 2 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
To: gdb-patches
When running test-case gdb.tui/tui-layout-asm.exp with target board
dwarf5-fission-debug-types, the test-case fails and I get a core dump:
...
# of unexpected core files 1
...
Looking at the backtrace of the core file, what seems to be happening is that:
- gdbpy_flush attempts to flush gdb_stdout, which is nullptr
- that causes a segfault
- gdb intercepts this and starts to handle it using handle_fatal_signal
- handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
which is nullptr,
- that causes another segfault
- gdb exits
I managed to reproduce the problem by the following trigger patch in
stdin_event_handler:
...
- if (error)
+ if (1 || error)
{
current_ui = main_ui;
ui->unregister_file_handler ();
- if (main_ui == ui)
+ if (1 || main_ui == ui)
{
gdb_printf (gdb_stderr, _("error detected on stdin\n"));
+ gdb_stderr = nullptr;
+ gdb_stdout = nullptr;
+ gdb_stdlog = nullptr;
quit_command ((char *) 0, 0);
}
...
which gives us:
...
$ gdb
(gdb) <q> error detected on stdin
Segmentation fault (core dumped)
$ q
...
Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
instead:
...
$ gdb
(gdb) <q> error detected on stdin
Fatal signal: Segmentation fault
----- Backtrace -----
...
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible. GDB will now terminate.
This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.
Segmentation fault (core dumped)
$ q
...
Tested on x86_64-linux.
---
gdb/bt-utils.c | 30 +++++++++++++++++++++---------
gdb/event-top.c | 13 +++++++++++--
2 files changed, 32 insertions(+), 11 deletions(-)
diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
index 8e782450ae9..9e9f680f9ea 100644
--- a/gdb/bt-utils.c
+++ b/gdb/bt-utils.c
@@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
const auto sig_write = [] (const char *msg) -> void
{
- gdb_stderr->write_async_safe (msg, strlen (msg));
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ std::ignore = ::write (2, msg, strlen (msg));
+ else
+ gdb_stderr->write_async_safe (msg, strlen (msg));
};
sig_write ("error creating backtrace: ");
@@ -79,7 +82,10 @@ libbacktrace_print (void *data, uintptr_t pc, const char *filename,
{
const auto sig_write = [] (const char *msg) -> void
{
- gdb_stderr->write_async_safe (msg, strlen (msg));
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ std::ignore = ::write (2, msg, strlen (msg));
+ else
+ gdb_stderr->write_async_safe (msg, strlen (msg));
};
/* Buffer to print addresses and line numbers into. An 8-byte address
@@ -130,14 +136,20 @@ gdb_internal_backtrace_1 ()
{
const auto sig_write = [] (const char *msg) -> void
{
- gdb_stderr->write_async_safe (msg, strlen (msg));
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ std::ignore = ::write (2, msg, strlen (msg));
+ else
+ gdb_stderr->write_async_safe (msg, strlen (msg));
};
/* Allow up to 25 frames of backtrace. */
void *buffer[25];
int frames = backtrace (buffer, ARRAY_SIZE (buffer));
- backtrace_symbols_fd (buffer, frames, gdb_stderr->fd ());
+ int fd = ((gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ ? 2
+ : gdb_stderr->fd ());
+ backtrace_symbols_fd (buffer, frames, fd);
if (frames == ARRAY_SIZE (buffer))
sig_write (_("Backtrace might be incomplete.\n"));
}
@@ -173,15 +185,15 @@ gdb_internal_backtrace ()
#ifdef GDB_PRINT_INTERNAL_BACKTRACE
const auto sig_write = [] (const char *msg) -> void
{
- gdb_stderr->write_async_safe (msg, strlen (msg));
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ std::ignore = ::write (2, msg, strlen (msg));
+ else
+ gdb_stderr->write_async_safe (msg, strlen (msg));
};
sig_write (str_backtrace);
- if (gdb_stderr->fd () > -1)
- gdb_internal_backtrace_1 ();
- else
- sig_write (str_backtrace_unavailable);
+ gdb_internal_backtrace_1 ();
sig_write ("---------------------\n");
#endif
diff --git a/gdb/event-top.c b/gdb/event-top.c
index c533e74811d..0c4e06b7749 100644
--- a/gdb/event-top.c
+++ b/gdb/event-top.c
@@ -982,7 +982,10 @@ handle_fatal_signal (int sig)
#ifdef GDB_PRINT_INTERNAL_BACKTRACE
const auto sig_write = [] (const char *msg) -> void
{
- gdb_stderr->write_async_safe (msg, strlen (msg));
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ std::ignore = ::write (2, msg, strlen (msg));
+ else
+ gdb_stderr->write_async_safe (msg, strlen (msg));
};
if (bt_on_fatal_signal)
@@ -1027,7 +1030,13 @@ handle_fatal_signal (int sig)
}
sig_write ("\n\n");
- gdb_stderr->flush ();
+ if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
+ {
+ /* Writing to file descriptor instead of stream, no flush
+ required. */
+ }
+ else
+ gdb_stderr->flush ();
}
#endif
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
@ 2025-04-25 17:18 ` Tom de Vries
2025-04-28 15:21 ` Simon Marchi
1 sibling, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-25 17:18 UTC (permalink / raw)
To: gdb-patches
Using the trigger patch described in the previous commit, I get:
...
$ gdb
(gdb) <q>error detected on stdin
Fatal signal: Segmentation fault
----- Backtrace -----
0x64c7b3 gdb_internal_backtrace_1
/data/vries/gdb/src/gdb/bt-utils.c:127
0x64c937 _Z22gdb_internal_backtracev
/data/vries/gdb/src/gdb/bt-utils.c:196
0x94db83 handle_fatal_signal
/data/vries/gdb/src/gdb/event-top.c:1021
0x94dd48 handle_sigsegv
/data/vries/gdb/src/gdb/event-top.c:1098
0x7f372be578ff ???
0x10b7c0a _Z9gdb_flushP7ui_file
/data/vries/gdb/src/gdb/utils.c:1527
0xd4b938 gdbpy_flush
/data/vries/gdb/src/gdb/python/python.c:1624
0x7f372d73b276 _PyCFunction_FastCallDict
Objects/methodobject.c:231
0x7f372d73b276 _PyCFunction_FastCallKeywords
Objects/methodobject.c:294
0x7f372d794a09 call_function
Python/ceval.c:4851
0x7f372d78e838 _PyEval_EvalFrameDefault
Python/ceval.c:3351
0x7f372d796e6e PyEval_EvalFrameEx
Python/ceval.c:754
0x7f372d796e6e _PyFunction_FastCall
Python/ceval.c:4933
0x7f372d796e6e _PyFunction_FastCallDict
Python/ceval.c:5035
0x7f372d6fefc8 _PyObject_FastCallDict
Objects/abstract.c:2310
0x7f372d6fefc8 _PyObject_Call_Prepend
Objects/abstract.c:2373
0x7f372d6fe162 _PyObject_FastCallDict
Objects/abstract.c:2331
0x7f372d700705 callmethod
Objects/abstract.c:2583
0x7f372d700705 _PyObject_CallMethodId
Objects/abstract.c:2640
0x7f372d812a41 flush_std_files
Python/pylifecycle.c:699
0x7f372d81281d Py_FinalizeEx
Python/pylifecycle.c:768
0xd4d49b finalize_python
/data/vries/gdb/src/gdb/python/python.c:2308
0x9587eb _Z17ext_lang_shutdownv
/data/vries/gdb/src/gdb/extension.c:330
0xfd98df _Z10quit_forcePii
/data/vries/gdb/src/gdb/top.c:1817
0x6b3080 _Z12quit_commandPKci
/data/vries/gdb/src/gdb/cli/cli-cmds.c:483
0x1056577 stdin_event_handler
/data/vries/gdb/src/gdb/ui.c:131
0x1986970 handle_file_event
/data/vries/gdb/src/gdbsupport/event-loop.cc:551
0x1986f4b gdb_wait_for_event
/data/vries/gdb/src/gdbsupport/event-loop.cc:672
0x1985e0c _Z16gdb_do_one_eventi
/data/vries/gdb/src/gdbsupport/event-loop.cc:263
0xb66f2e start_event_loop
/data/vries/gdb/src/gdb/main.c:402
0xb670ba captured_command_loop
/data/vries/gdb/src/gdb/main.c:466
0xb68b9b captured_main
/data/vries/gdb/src/gdb/main.c:1344
0xb68c44 _Z8gdb_mainP18captured_main_args
/data/vries/gdb/src/gdb/main.c:1363
0x41a3b1 main
/data/vries/gdb/src/gdb/gdb.c:38
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible. GDB will now terminate.
This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.
Segmentation fault (core dumped)
$ q
...
Fix this in gdb_flush by checking for nullptr stream parameter, such that we
get instead:
...
$ gdb
(gdb) <q>error detected on stdin
$ q
...
Tested on x86_64-linux.
---
gdb/utils.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/gdb/utils.c b/gdb/utils.c
index ce3c26ef140..90063e4a96d 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -1524,7 +1524,8 @@ pager_file::flush ()
void
gdb_flush (struct ui_file *stream)
{
- stream->flush ();
+ if (stream != nullptr)
+ stream->flush ();
}
/* See utils.h. */
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
@ 2025-04-28 15:17 ` Simon Marchi
2025-04-29 11:50 ` Tom de Vries
2025-04-29 15:34 ` Tom Tromey
1 sibling, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 15:17 UTC (permalink / raw)
To: Tom de Vries, gdb-patches
On 4/25/25 1:18 PM, Tom de Vries wrote:
> When running test-case gdb.tui/tui-layout-asm.exp with target board
> dwarf5-fission-debug-types, the test-case fails and I get a core dump:
> ...
> # of unexpected core files 1
> ...
>
> Looking at the backtrace of the core file, what seems to be happening is that:
> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
> - that causes a segfault
> - gdb intercepts this and starts to handle it using handle_fatal_signal
> - handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
> which is nullptr,
> - that causes another segfault
> - gdb exits
>
> I managed to reproduce the problem by the following trigger patch in
> stdin_event_handler:
> ...
> - if (error)
> + if (1 || error)
> {
> current_ui = main_ui;
> ui->unregister_file_handler ();
> - if (main_ui == ui)
> + if (1 || main_ui == ui)
> {
> gdb_printf (gdb_stderr, _("error detected on stdin\n"));
> + gdb_stderr = nullptr;
> + gdb_stdout = nullptr;
> + gdb_stdlog = nullptr;
> quit_command ((char *) 0, 0);
> }
> ...
> which gives us:
> ...
> $ gdb
> (gdb) <q> error detected on stdin
> Segmentation fault (core dumped)
> $ q
> ...
>
> Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
> instead:
> ...
> $ gdb
> (gdb) <q> error detected on stdin
>
> Fatal signal: Segmentation fault
> ----- Backtrace -----
> ...
> ---------------------
> A fatal error internal to GDB has been detected, further
> debugging is not possible. GDB will now terminate.
>
> This is a bug, please report it. For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
>
> Segmentation fault (core dumped)
> $ q
> ...
>
> Tested on x86_64-linux.
> ---
> gdb/bt-utils.c | 30 +++++++++++++++++++++---------
> gdb/event-top.c | 13 +++++++++++--
> 2 files changed, 32 insertions(+), 11 deletions(-)
>
> diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
> index 8e782450ae9..9e9f680f9ea 100644
> --- a/gdb/bt-utils.c
> +++ b/gdb/bt-utils.c
> @@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
>
> const auto sig_write = [] (const char *msg) -> void
> {
> - gdb_stderr->write_async_safe (msg, strlen (msg));
> + if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
> + std::ignore = ::write (2, msg, strlen (msg));
TIL about std::ignore.
Is it possible to factor out the lambdas:
const auto sig_write = [] (const char *msg) -> void
{
if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
std::ignore = ::write (2, msg, strlen (msg));
else
gdb_stderr->write_async_safe (msg, strlen (msg));
};
to a function? At first glance they all seem to be the same.
Otherwise, LGTM.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Simon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
@ 2025-04-28 15:21 ` Simon Marchi
2025-04-28 16:32 ` Tom de Vries
0 siblings, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 15:21 UTC (permalink / raw)
To: Tom de Vries, gdb-patches
On 4/25/25 1:18 PM, Tom de Vries wrote:
> Using the trigger patch described in the previous commit, I get:
> ...
> $ gdb
> (gdb) <q>error detected on stdin
>
> Fatal signal: Segmentation fault
> ----- Backtrace -----
> 0x64c7b3 gdb_internal_backtrace_1
> /data/vries/gdb/src/gdb/bt-utils.c:127
> 0x64c937 _Z22gdb_internal_backtracev
> /data/vries/gdb/src/gdb/bt-utils.c:196
> 0x94db83 handle_fatal_signal
> /data/vries/gdb/src/gdb/event-top.c:1021
> 0x94dd48 handle_sigsegv
> /data/vries/gdb/src/gdb/event-top.c:1098
> 0x7f372be578ff ???
> 0x10b7c0a _Z9gdb_flushP7ui_file
> /data/vries/gdb/src/gdb/utils.c:1527
> 0xd4b938 gdbpy_flush
> /data/vries/gdb/src/gdb/python/python.c:1624
> 0x7f372d73b276 _PyCFunction_FastCallDict
> Objects/methodobject.c:231
> 0x7f372d73b276 _PyCFunction_FastCallKeywords
> Objects/methodobject.c:294
> 0x7f372d794a09 call_function
> Python/ceval.c:4851
> 0x7f372d78e838 _PyEval_EvalFrameDefault
> Python/ceval.c:3351
> 0x7f372d796e6e PyEval_EvalFrameEx
> Python/ceval.c:754
> 0x7f372d796e6e _PyFunction_FastCall
> Python/ceval.c:4933
> 0x7f372d796e6e _PyFunction_FastCallDict
> Python/ceval.c:5035
> 0x7f372d6fefc8 _PyObject_FastCallDict
> Objects/abstract.c:2310
> 0x7f372d6fefc8 _PyObject_Call_Prepend
> Objects/abstract.c:2373
> 0x7f372d6fe162 _PyObject_FastCallDict
> Objects/abstract.c:2331
> 0x7f372d700705 callmethod
> Objects/abstract.c:2583
> 0x7f372d700705 _PyObject_CallMethodId
> Objects/abstract.c:2640
> 0x7f372d812a41 flush_std_files
> Python/pylifecycle.c:699
> 0x7f372d81281d Py_FinalizeEx
> Python/pylifecycle.c:768
> 0xd4d49b finalize_python
> /data/vries/gdb/src/gdb/python/python.c:2308
> 0x9587eb _Z17ext_lang_shutdownv
> /data/vries/gdb/src/gdb/extension.c:330
> 0xfd98df _Z10quit_forcePii
> /data/vries/gdb/src/gdb/top.c:1817
> 0x6b3080 _Z12quit_commandPKci
> /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
> 0x1056577 stdin_event_handler
> /data/vries/gdb/src/gdb/ui.c:131
> 0x1986970 handle_file_event
> /data/vries/gdb/src/gdbsupport/event-loop.cc:551
> 0x1986f4b gdb_wait_for_event
> /data/vries/gdb/src/gdbsupport/event-loop.cc:672
> 0x1985e0c _Z16gdb_do_one_eventi
> /data/vries/gdb/src/gdbsupport/event-loop.cc:263
> 0xb66f2e start_event_loop
> /data/vries/gdb/src/gdb/main.c:402
> 0xb670ba captured_command_loop
> /data/vries/gdb/src/gdb/main.c:466
> 0xb68b9b captured_main
> /data/vries/gdb/src/gdb/main.c:1344
> 0xb68c44 _Z8gdb_mainP18captured_main_args
> /data/vries/gdb/src/gdb/main.c:1363
> 0x41a3b1 main
> /data/vries/gdb/src/gdb/gdb.c:38
> ---------------------
> A fatal error internal to GDB has been detected, further
> debugging is not possible. GDB will now terminate.
>
> This is a bug, please report it. For instructions, see:
> <https://www.gnu.org/software/gdb/bugs/>.
>
> Segmentation fault (core dumped)
> $ q
> ...
>
> Fix this in gdb_flush by checking for nullptr stream parameter, such that we
> get instead:
> ...
> $ gdb
> (gdb) <q>error detected on stdin
> $ q
> ...
>
> Tested on x86_64-linux.
> ---
> gdb/utils.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/utils.c b/gdb/utils.c
> index ce3c26ef140..90063e4a96d 100644
> --- a/gdb/utils.c
> +++ b/gdb/utils.c
> @@ -1524,7 +1524,8 @@ pager_file::flush ()
> void
> gdb_flush (struct ui_file *stream)
> {
> - stream->flush ();
> + if (stream != nullptr)
> + stream->flush ();
Can we add the nullptr check only where it's needed?
I'm thinking that we should move towards removing gdb_flush (replacing
the call sites to call ui_file::flush directly). And when we do so, we
wouldn't want to add nullptr checks everywhere that gdb_flush is
currently called.
Simon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
2025-04-28 15:21 ` Simon Marchi
@ 2025-04-28 16:32 ` Tom de Vries
2025-04-28 16:38 ` Simon Marchi
0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-28 16:32 UTC (permalink / raw)
To: Simon Marchi, gdb-patches
On 4/28/25 17:21, Simon Marchi wrote:
> On 4/25/25 1:18 PM, Tom de Vries wrote:
>> Using the trigger patch described in the previous commit, I get:
>> ...
>> $ gdb
>> (gdb) <q>error detected on stdin
>>
>> Fatal signal: Segmentation fault
>> ----- Backtrace -----
>> 0x64c7b3 gdb_internal_backtrace_1
>> /data/vries/gdb/src/gdb/bt-utils.c:127
>> 0x64c937 _Z22gdb_internal_backtracev
>> /data/vries/gdb/src/gdb/bt-utils.c:196
>> 0x94db83 handle_fatal_signal
>> /data/vries/gdb/src/gdb/event-top.c:1021
>> 0x94dd48 handle_sigsegv
>> /data/vries/gdb/src/gdb/event-top.c:1098
>> 0x7f372be578ff ???
>> 0x10b7c0a _Z9gdb_flushP7ui_file
>> /data/vries/gdb/src/gdb/utils.c:1527
>> 0xd4b938 gdbpy_flush
>> /data/vries/gdb/src/gdb/python/python.c:1624
>> 0x7f372d73b276 _PyCFunction_FastCallDict
>> Objects/methodobject.c:231
>> 0x7f372d73b276 _PyCFunction_FastCallKeywords
>> Objects/methodobject.c:294
>> 0x7f372d794a09 call_function
>> Python/ceval.c:4851
>> 0x7f372d78e838 _PyEval_EvalFrameDefault
>> Python/ceval.c:3351
>> 0x7f372d796e6e PyEval_EvalFrameEx
>> Python/ceval.c:754
>> 0x7f372d796e6e _PyFunction_FastCall
>> Python/ceval.c:4933
>> 0x7f372d796e6e _PyFunction_FastCallDict
>> Python/ceval.c:5035
>> 0x7f372d6fefc8 _PyObject_FastCallDict
>> Objects/abstract.c:2310
>> 0x7f372d6fefc8 _PyObject_Call_Prepend
>> Objects/abstract.c:2373
>> 0x7f372d6fe162 _PyObject_FastCallDict
>> Objects/abstract.c:2331
>> 0x7f372d700705 callmethod
>> Objects/abstract.c:2583
>> 0x7f372d700705 _PyObject_CallMethodId
>> Objects/abstract.c:2640
>> 0x7f372d812a41 flush_std_files
>> Python/pylifecycle.c:699
>> 0x7f372d81281d Py_FinalizeEx
>> Python/pylifecycle.c:768
>> 0xd4d49b finalize_python
>> /data/vries/gdb/src/gdb/python/python.c:2308
>> 0x9587eb _Z17ext_lang_shutdownv
>> /data/vries/gdb/src/gdb/extension.c:330
>> 0xfd98df _Z10quit_forcePii
>> /data/vries/gdb/src/gdb/top.c:1817
>> 0x6b3080 _Z12quit_commandPKci
>> /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
>> 0x1056577 stdin_event_handler
>> /data/vries/gdb/src/gdb/ui.c:131
>> 0x1986970 handle_file_event
>> /data/vries/gdb/src/gdbsupport/event-loop.cc:551
>> 0x1986f4b gdb_wait_for_event
>> /data/vries/gdb/src/gdbsupport/event-loop.cc:672
>> 0x1985e0c _Z16gdb_do_one_eventi
>> /data/vries/gdb/src/gdbsupport/event-loop.cc:263
>> 0xb66f2e start_event_loop
>> /data/vries/gdb/src/gdb/main.c:402
>> 0xb670ba captured_command_loop
>> /data/vries/gdb/src/gdb/main.c:466
>> 0xb68b9b captured_main
>> /data/vries/gdb/src/gdb/main.c:1344
>> 0xb68c44 _Z8gdb_mainP18captured_main_args
>> /data/vries/gdb/src/gdb/main.c:1363
>> 0x41a3b1 main
>> /data/vries/gdb/src/gdb/gdb.c:38
>> ---------------------
>> A fatal error internal to GDB has been detected, further
>> debugging is not possible. GDB will now terminate.
>>
>> This is a bug, please report it. For instructions, see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>>
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Fix this in gdb_flush by checking for nullptr stream parameter, such that we
>> get instead:
>> ...
>> $ gdb
>> (gdb) <q>error detected on stdin
>> $ q
>> ...
>>
>> Tested on x86_64-linux.
>> ---
>> gdb/utils.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/utils.c b/gdb/utils.c
>> index ce3c26ef140..90063e4a96d 100644
>> --- a/gdb/utils.c
>> +++ b/gdb/utils.c
>> @@ -1524,7 +1524,8 @@ pager_file::flush ()
>> void
>> gdb_flush (struct ui_file *stream)
>> {
>> - stream->flush ();
>> + if (stream != nullptr)
>> + stream->flush ();
>
Hi Simon,
thanks for the review.
> Can we add the nullptr check only where it's needed?
>
AFAIU, I've added it where it's needed.
Can you tell me where you think it's needed? I cannot tell from your
comments.
Thanks,
- Tom
> I'm thinking that we should move towards removing gdb_flush (replacing
> the call sites to call ui_file::flush directly). And when we do so, we
> wouldn't want to add nullptr checks everywhere that gdb_flush is
> currently called.
>
> Simon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
2025-04-28 16:32 ` Tom de Vries
@ 2025-04-28 16:38 ` Simon Marchi
2025-04-29 11:52 ` Tom de Vries
0 siblings, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2025-04-28 16:38 UTC (permalink / raw)
To: Tom de Vries, gdb-patches
On 4/28/25 12:32 PM, Tom de Vries wrote:
> AFAIU, I've added it where it's needed.
>
> Can you tell me where you think it's needed? I cannot tell from your comments.
Based on the stack trace in your commit message, it would be in
gdbpy_flush. If I understand correctly, this call site is special
because it can be called when GDB is getting torn down, and I suppose
because the streams are closed before Python gets finalized. Most calls
to gdb_flush are done on streams that we know are not nullptr/not
closed.
Simon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-28 15:17 ` Simon Marchi
@ 2025-04-29 11:50 ` Tom de Vries
0 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-29 11:50 UTC (permalink / raw)
To: Simon Marchi, gdb-patches
On 4/28/25 17:17, Simon Marchi wrote:
> On 4/25/25 1:18 PM, Tom de Vries wrote:
>> When running test-case gdb.tui/tui-layout-asm.exp with target board
>> dwarf5-fission-debug-types, the test-case fails and I get a core dump:
>> ...
>> # of unexpected core files 1
>> ...
>>
>> Looking at the backtrace of the core file, what seems to be happening is that:
>> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
>> - that causes a segfault
>> - gdb intercepts this and starts to handle it using handle_fatal_signal
>> - handle_fatal_signal calls sig_write, which attempts to write to gdb_stderr,
>> which is nullptr,
>> - that causes another segfault
>> - gdb exits
>>
>> I managed to reproduce the problem by the following trigger patch in
>> stdin_event_handler:
>> ...
>> - if (error)
>> + if (1 || error)
>> {
>> current_ui = main_ui;
>> ui->unregister_file_handler ();
>> - if (main_ui == ui)
>> + if (1 || main_ui == ui)
>> {
>> gdb_printf (gdb_stderr, _("error detected on stdin\n"));
>> + gdb_stderr = nullptr;
>> + gdb_stdout = nullptr;
>> + gdb_stdlog = nullptr;
>> quit_command ((char *) 0, 0);
>> }
>> ...
>> which gives us:
>> ...
>> $ gdb
>> (gdb) <q> error detected on stdin
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Fix sig_write to handle the case that gdb_stderr == nullptr, such that we get
>> instead:
>> ...
>> $ gdb
>> (gdb) <q> error detected on stdin
>>
>> Fatal signal: Segmentation fault
>> ----- Backtrace -----
>> ...
>> ---------------------
>> A fatal error internal to GDB has been detected, further
>> debugging is not possible. GDB will now terminate.
>>
>> This is a bug, please report it. For instructions, see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>>
>> Segmentation fault (core dumped)
>> $ q
>> ...
>>
>> Tested on x86_64-linux.
>> ---
>> gdb/bt-utils.c | 30 +++++++++++++++++++++---------
>> gdb/event-top.c | 13 +++++++++++--
>> 2 files changed, 32 insertions(+), 11 deletions(-)
>>
>> diff --git a/gdb/bt-utils.c b/gdb/bt-utils.c
>> index 8e782450ae9..9e9f680f9ea 100644
>> --- a/gdb/bt-utils.c
>> +++ b/gdb/bt-utils.c
>> @@ -55,7 +55,10 @@ libbacktrace_error (void *data, const char *errmsg, int errnum)
>>
>> const auto sig_write = [] (const char *msg) -> void
>> {
>> - gdb_stderr->write_async_safe (msg, strlen (msg));
>> + if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
>> + std::ignore = ::write (2, msg, strlen (msg));
>
> TIL about std::ignore.
>
> Is it possible to factor out the lambdas:
>
> const auto sig_write = [] (const char *msg) -> void
> {
> if (gdb_stderr == nullptr || gdb_stderr->fd () == -1)
> std::ignore = ::write (2, msg, strlen (msg));
> else
> gdb_stderr->write_async_safe (msg, strlen (msg));
> };
>
> to a function? At first glance they all seem to be the same.
>
I've done this in a v2 (
https://sourceware.org/pipermail/gdb-patches/2025-April/217564.html ) in
a separate patch.
Thanks,
- Tom
> Otherwise, LGTM.
>
> Approved-By: Simon Marchi <simon.marchi@efficios.com>
>
> Simon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush
2025-04-28 16:38 ` Simon Marchi
@ 2025-04-29 11:52 ` Tom de Vries
0 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2025-04-29 11:52 UTC (permalink / raw)
To: Simon Marchi, gdb-patches
On 4/28/25 18:38, Simon Marchi wrote:
> On 4/28/25 12:32 PM, Tom de Vries wrote:
>> AFAIU, I've added it where it's needed.
>>
>> Can you tell me where you think it's needed? I cannot tell from your comments.
>
> Based on the stack trace in your commit message, it would be in
> gdbpy_flush. If I understand correctly, this call site is special
> because it can be called when GDB is getting torn down, and I suppose
> because the streams are closed before Python gets finalized. Most calls
> to gdb_flush are done on streams that we know are not nullptr/not
> closed.
Done in a v2 (
https://sourceware.org/pipermail/gdb-patches/2025-April/217565.html ).
This also required fixing ioscm_flush.
Thanks,
- Tom
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
2025-04-28 15:17 ` Simon Marchi
@ 2025-04-29 15:34 ` Tom Tromey
2025-04-30 18:49 ` Tom de Vries
1 sibling, 1 reply; 13+ messages in thread
From: Tom Tromey @ 2025-04-29 15:34 UTC (permalink / raw)
To: Tom de Vries; +Cc: gdb-patches
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
Tom> - that causes a segfault
When can this be nullptr?
That seems surprising to me.
Tom
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-29 15:34 ` Tom Tromey
@ 2025-04-30 18:49 ` Tom de Vries
2025-04-30 19:11 ` Tom de Vries
0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-30 18:49 UTC (permalink / raw)
To: Tom Tromey; +Cc: gdb-patches
On 4/29/25 17:34, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>
> Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
> Tom> - that causes a segfault
>
> When can this be nullptr?
> That seems surprising to me.
I managed to recognize the situation using:
...
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 1b4cc82cce8..fa4186aea3b 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -869,6 +869,7 @@ tui_setup_io (int mode)
/* Restore gdb output. */
gdb_stdout = tui_old_stdout;
gdb_stderr = tui_old_stderr;
+ gdb_assert (gdb_stderr != nullptr);
gdb_stdlog = tui_old_stdlog;
gdb_stdtarg = gdb_stderr;
current_uiout = tui_old_uiout;
...
And the backtrace (in an unusual form because of recursion) is:
...
#19000 0x000000000041a3b2 in main (argc=14, argv=0x7ffc5c28c168)
at /data/vries/gdb/src/gdb/gdb.c:38
38 return gdb_main (&args);
(gdb) down
#18999 0x0000000000b68937 in gdb_main (args=0x7ffc5c28c030)
at /data/vries/gdb/src/gdb/main.c:1363
1363 captured_main (args);
(gdb)
#18998 0x0000000000b6888e in captured_main (data=0x7ffc5c28c030)
at /data/vries/gdb/src/gdb/main.c:1344
1344 captured_command_loop ();
(gdb)
#18997 0x0000000000b66dad in captured_command_loop ()
at /data/vries/gdb/src/gdb/main.c:466
466 start_event_loop ();
(gdb)
#18996 0x0000000000b66c21 in start_event_loop ()
at /data/vries/gdb/src/gdb/main.c:402
402 result = gdb_do_one_event ();
(gdb)
#18995 0x0000000001985e31 in gdb_do_one_event (mstimeout=-1)
at /data/vries/gdb/src/gdbsupport/event-loop.cc:263
263 return gdb_wait_for_event (1);
(gdb)
#18994 0x0000000001986f70 in gdb_wait_for_event (block=1)
at /data/vries/gdb/src/gdbsupport/event-loop.cc:672
672 handle_file_event (file_ptr, mask);
(gdb)
#18993 0x0000000001986995 in handle_file_event (file_ptr=0x3b26aff0,
ready_mask=25) at /data/vries/gdb/src/gdbsupport/event-loop.cc:551
551 file_ptr->proc (file_ptr->error, file_ptr->client_data);
(gdb)
#18992 0x00000000010563ad in stdin_event_handler (error=1,
client_data=0x3aec7120) at /data/vries/gdb/src/gdb/ui.c:128
128 quit_command ((char *) 0, 0);
(gdb)
#18991 0x00000000006b2e63 in quit_command (args=0x0, from_tty=0)
at /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
483 quit_force (args ? &exit_code : NULL, from_tty);
(gdb)
#18990 0x0000000000fd94dc in quit_force (exit_arg=0x0, from_tty=0)
at /data/vries/gdb/src/gdb/top.c:1750
1750 undo_terminal_modifications_before_exit ();
(gdb)
#18989 0x0000000000fd942f in undo_terminal_modifications_before_exit ()
at /data/vries/gdb/src/gdb/top.c:1718
1718 tui_disable ();
(gdb)
#18988 0x000000000103fa11 in tui_disable ()
at /data/vries/gdb/src/gdb/tui/tui.c:562
562 tui_setup_io (0);
(gdb)
#18987 0x00000000010129d4 in tui_setup_io (mode=0)
at /data/vries/gdb/src/gdb/tui/tui-io.c:872
872 gdb_assert (gdb_stderr != nullptr);
(gdb)
...
Thanks,
- Tom
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-30 18:49 ` Tom de Vries
@ 2025-04-30 19:11 ` Tom de Vries
2025-05-02 18:45 ` Tom Tromey
0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2025-04-30 19:11 UTC (permalink / raw)
To: Tom Tromey; +Cc: gdb-patches
On 4/30/25 20:49, Tom de Vries wrote:
> On 4/29/25 17:34, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>
>> Tom> - gdbpy_flush attempts to flush gdb_stdout, which is nullptr
>> Tom> - that causes a segfault
>>
>> When can this be nullptr?
>> That seems surprising to me.
>
> I managed to recognize the situation using:
> ...
> diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
> index 1b4cc82cce8..fa4186aea3b 100644
> --- a/gdb/tui/tui-io.c
> +++ b/gdb/tui/tui-io.c
> @@ -869,6 +869,7 @@ tui_setup_io (int mode)
> /* Restore gdb output. */
> gdb_stdout = tui_old_stdout;
> gdb_stderr = tui_old_stderr;
> + gdb_assert (gdb_stderr != nullptr);
> gdb_stdlog = tui_old_stdlog;
> gdb_stdtarg = gdb_stderr;
> current_uiout = tui_old_uiout;
> ...
>
> And the backtrace (in an unusual form because of recursion) is:
> ...
> #19000 0x000000000041a3b2 in main (argc=14, argv=0x7ffc5c28c168)
> at /data/vries/gdb/src/gdb/gdb.c:38
> 38 return gdb_main (&args);
> (gdb) down
> #18999 0x0000000000b68937 in gdb_main (args=0x7ffc5c28c030)
> at /data/vries/gdb/src/gdb/main.c:1363
> 1363 captured_main (args);
> (gdb)
> #18998 0x0000000000b6888e in captured_main (data=0x7ffc5c28c030)
> at /data/vries/gdb/src/gdb/main.c:1344
> 1344 captured_command_loop ();
> (gdb)
> #18997 0x0000000000b66dad in captured_command_loop ()
> at /data/vries/gdb/src/gdb/main.c:466
> 466 start_event_loop ();
> (gdb)
> #18996 0x0000000000b66c21 in start_event_loop ()
> at /data/vries/gdb/src/gdb/main.c:402
> 402 result = gdb_do_one_event ();
> (gdb)
> #18995 0x0000000001985e31 in gdb_do_one_event (mstimeout=-1)
> at /data/vries/gdb/src/gdbsupport/event-loop.cc:263
> 263 return gdb_wait_for_event (1);
> (gdb)
> #18994 0x0000000001986f70 in gdb_wait_for_event (block=1)
> at /data/vries/gdb/src/gdbsupport/event-loop.cc:672
> 672 handle_file_event (file_ptr, mask);
> (gdb)
> #18993 0x0000000001986995 in handle_file_event (file_ptr=0x3b26aff0,
> ready_mask=25) at /data/vries/gdb/src/gdbsupport/event-loop.cc:551
> 551 file_ptr->proc (file_ptr->error, file_ptr->client_data);
> (gdb)
> #18992 0x00000000010563ad in stdin_event_handler (error=1,
> client_data=0x3aec7120) at /data/vries/gdb/src/gdb/ui.c:128
> 128 quit_command ((char *) 0, 0);
> (gdb)
> #18991 0x00000000006b2e63 in quit_command (args=0x0, from_tty=0)
> at /data/vries/gdb/src/gdb/cli/cli-cmds.c:483
> 483 quit_force (args ? &exit_code : NULL, from_tty);
> (gdb)
> #18990 0x0000000000fd94dc in quit_force (exit_arg=0x0, from_tty=0)
> at /data/vries/gdb/src/gdb/top.c:1750
> 1750 undo_terminal_modifications_before_exit ();
> (gdb)
> #18989 0x0000000000fd942f in undo_terminal_modifications_before_exit ()
> at /data/vries/gdb/src/gdb/top.c:1718
> 1718 tui_disable ();
> (gdb)
> #18988 0x000000000103fa11 in tui_disable ()
> at /data/vries/gdb/src/gdb/tui/tui.c:562
> 562 tui_setup_io (0);
> (gdb)
> #18987 0x00000000010129d4 in tui_setup_io (mode=0)
> at /data/vries/gdb/src/gdb/tui/tui-io.c:872
> 872 gdb_assert (gdb_stderr != nullptr);
> (gdb)
> ...
OK, I understand now how this happens.
The relevant code is in tui_enable.
While initializing, we set tui_active to true, because:
...
/* We must mark the tui sub-system active before trying to setup
the
current layout as tui windows defined by an extension language
rely on this flag being true in order to know that the window
they are creating is currently valid. */
tui_active = true;
...
After that we call tui_set_initial_layout, and the DWARF error is thrown.
It's not caught early enough to continue executing tui_enable.
Consequently, tui_setup_io (1) is not called.
Then when undo_terminal_modifications_before_exit is called, it calls
tui_disable, which checks for tui_active, and then calls setup_io (0)
assuming setup_io (1) has been called, which is not the case.
Thanks,
- Tom
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr
2025-04-30 19:11 ` Tom de Vries
@ 2025-05-02 18:45 ` Tom Tromey
0 siblings, 0 replies; 13+ messages in thread
From: Tom Tromey @ 2025-05-02 18:45 UTC (permalink / raw)
To: Tom de Vries; +Cc: Tom Tromey, gdb-patches
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
Tom> tui_active = true;
Tom> ...
Tom> After that we call tui_set_initial_layout, and the DWARF error is thrown.
Tom> It's not caught early enough to continue executing tui_enable.
Tom> Consequently, tui_setup_io (1) is not called.
Tom> Then when undo_terminal_modifications_before_exit is called, it calls
Tom> tui_disable, which checks for tui_active, and then calls setup_io (0)
Tom> assuming setup_io (1) has been called, which is not the case.
It sounds like an invariant isn't being kept here, I wonder if there
should be code to reset tui_active if tui_set_initial_layout throws?
Tom
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-05-02 18:45 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-25 17:18 [PATCH 0/2] [gdb] Fix error handling with nullptr gdb_stdout/gdb_stderr Tom de Vries
2025-04-25 17:18 ` [PATCH 1/2] [gdb] Fix sig_write for null gdb_stderr Tom de Vries
2025-04-28 15:17 ` Simon Marchi
2025-04-29 11:50 ` Tom de Vries
2025-04-29 15:34 ` Tom Tromey
2025-04-30 18:49 ` Tom de Vries
2025-04-30 19:11 ` Tom de Vries
2025-05-02 18:45 ` Tom Tromey
2025-04-25 17:18 ` [PATCH 2/2] [gdb] Handle nullptr stream in gdb_flush Tom de Vries
2025-04-28 15:21 ` Simon Marchi
2025-04-28 16:32 ` Tom de Vries
2025-04-28 16:38 ` Simon Marchi
2025-04-29 11:52 ` Tom de Vries
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox