* [RFC/ia64] memory error when reading wrong core file
@ 2010-01-29 16:02 Joel Brobecker
2010-01-29 17:53 ` Jan Kratochvil
2010-02-01 1:25 ` Jan Kratochvil
0 siblings, 2 replies; 6+ messages in thread
From: Joel Brobecker @ 2010-01-29 16:02 UTC (permalink / raw)
To: gdb-patches
[-- Attachment #1: Type: text/plain, Size: 4358 bytes --]
Hello,
We noticed a slight change of behavior in the scenario below.
The general idea of our testcase is to verify that GDB warns the user
when it notices that the user tries to debug a core file that has not
been produced by the same executable as the one already loaded:
% ./crash
zsh: 344 abort (core dumped) ./crash
% gdb call_crash <<<--- executable name is different
(gdb) core core <<<--- so wrong core file used here
We expect GDB to print a warning in that case, since the user probably
picked the wrong core:
(gdb) core core
warning: core file may not match specified executable file.
[...]
However, the testcase that we used at AdaCore also tested various
other things, while we're testing core file support. In particular,
we were testing the fact that GDB was able to print the name of
the executable that produced this core file. See the output produced
by GDB as of a couple of weeks ago:
(gdb) core core
warning: core file may not match specified executable file.
[New Thread 5437]
[traces about symbols being read from shared libs]
Core was generated by `./crash'.
Program terminated with signal 6, Aborted.
#0 0xa000000000010640 in __kernel_syscall_via_break ()
Contrast this with what we get today, on ia64-linux:
(gdb) core core
warning: core file may not match specified executable file.
[New Thread 5437]
Cannot access memory at address 0x1000000000009
The change of behavior is related to a patch that changed the way
the solib base address gets computed (that was for PIE, patch 12/15
I believe). Prior to that patch, the computed base address was zero.
But I believe that this was totally by accident: Since the solib base
is computed using the .dynamic data of the executable, and because
there is a discrepancy between the executable and the core file,
the base address means nothing in any case.
What I think is happening in this case is that we're being a little
less lucky than before, and end up tripping a memory error while we were
lucky before to return a null base address, which itself allowed us to
fallback on:
/* If we can't find the dynamic linker's base structure, this
must not be a dynamically linked executable. Hmm. */
if (! info->debug_base)
return svr4_default_sos ();
This explains the change of behavior, which is not entirely unreasonable.
That being said, I think that the new behavior is less useful for the user.
It was nice that GDB was able to print the name of the executable,
particularly in this case where the user probably just picked the wrong
core file.
What I suggest, is to catch all errors while trying to read the shared
library map, and try to continue without. Something like the attached?
It gives me the following output:
(gdb) core core
warning: exec file is newer than core file.
[New Thread 5437]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/tls/libc.so.6.1...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6.1
Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux-ia64.so.2
Core was generated by `./crash'.
Program terminated with signal 6, Aborted.
#0 0xa000000000010640 in __kernel_syscall_via_break ()
It pretty much brings back the previous behavior. Although we still
compute a non-null solib base address, solib_svr4_r_map now notices
the memory error while trying to load the map, and returns zero.
As a result, the caller (svr4_current_sos) finds that its list of
SOs is empty, and thus falls back again on svr4_default_sos.
gdb/ChangeLog:
* solib-svr4.c (solib_svr4_r_map): catch all exceptions while
reading the inferior memory, and return zero if an exception
was raised.
As mentioned on IRC, I could not test this on ia64-linux with the official
testsuite, as the testsuite run immediately crashes the two machines on
which I tried (after which I was firmly prohibited from making any additional
attempt on any other machines). I did test it on this platform (ia64-linux)
but with AdaCore's testsuite. I also ran the testsuite on x86_64-linux.
I asked Jan, if he has a moment, to try to run the testsuite on his side.
Any thoughts on this?
--
Joel
[-- Attachment #2: solib-svr4.diff --]
[-- Type: text/x-diff, Size: 744 bytes --]
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index e497364..38ae126 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -835,9 +835,15 @@ solib_svr4_r_map (struct svr4_info *info)
{
struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
struct type *ptr_type = builtin_type (target_gdbarch)->builtin_data_ptr;
+ CORE_ADDR addr = 0;
+ volatile struct gdb_exception ex;
- return read_memory_typed_address (info->debug_base + lmo->r_map_offset,
- ptr_type);
+ TRY_CATCH (ex, RETURN_MASK_ERROR)
+ {
+ addr = read_memory_typed_address (info->debug_base + lmo->r_map_offset,
+ ptr_type);
+ }
+ return addr;
}
/* Find r_brk from the inferior's debug base. */
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [RFC/ia64] memory error when reading wrong core file
2010-01-29 16:02 [RFC/ia64] memory error when reading wrong core file Joel Brobecker
@ 2010-01-29 17:53 ` Jan Kratochvil
2010-02-01 1:25 ` Jan Kratochvil
1 sibling, 0 replies; 6+ messages in thread
From: Jan Kratochvil @ 2010-01-29 17:53 UTC (permalink / raw)
To: Joel Brobecker; +Cc: gdb-patches
Hi Joel,
On Fri, 29 Jan 2010 17:02:22 +0100, Joel Brobecker wrote:
> Contrast this with what we get today, on ia64-linux:
>
> (gdb) core core
> warning: core file may not match specified executable file.
> [New Thread 5437]
> Cannot access memory at address 0x1000000000009
>
> The change of behavior is related to a patch that changed the way
> the solib base address gets computed (that was for PIE, patch 12/15
> I believe). Prior to that patch, the computed base address was zero.
before verifying your patch I would like to check if the PIE change should not
be rather somehow fixed. Unfortunately I am unable to reproduce the problem
on ia64 (RHEL-5.4) GDB HEAD (ea72145d5db8c66253c9b21b785a5311f69ae99e).
GNU gdb (GDB) 7.0.50.20100129-cvs
This GDB was configured as "ia64-unknown-linux-gnu".
Reading symbols from /root/.../pause...done.
[New Thread 30776]
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6.1
Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux-ia64.so.2
Core was generated by `./pause'.
Program terminated with signal 11, Segmentation fault.
#0 0xa000000000010620 in __kernel_syscall_via_break ()
(gdb)
Could you please provide some binary/core file as reproducers?
Thanks,
Jan
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC/ia64] memory error when reading wrong core file
2010-01-29 16:02 [RFC/ia64] memory error when reading wrong core file Joel Brobecker
2010-01-29 17:53 ` Jan Kratochvil
@ 2010-02-01 1:25 ` Jan Kratochvil
2010-02-01 6:54 ` Joel Brobecker
1 sibling, 1 reply; 6+ messages in thread
From: Jan Kratochvil @ 2010-02-01 1:25 UTC (permalink / raw)
To: Joel Brobecker; +Cc: gdb-patches
Hi Joel,
this specific problem gets also fixed by:
[patch] Sanity check PIE displacement (like the PIC one)
http://sourceware.org/ml/gdb-patches/2010-02/msg00000.html
which produces
(gdb) core core
warning: core file may not match specified executable file.
[New Thread 5437]
warning: PIE (Position Independent Executable) displacement 0xffffffffffffe580 is not aligned to the minimal page size 0x4000 for "/root/wrong-core-for-exe/call_crash" (wrong executable or version mismatch?)
Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux-ia64.so.2
warning: PIE (Position Independent Executable) displacement 0xffffffffffffe580 is not aligned to the minimal page size 0x4000 for "/root/wrong-core-for-exe/call_crash" (wrong executable or version mismatch?)
Core was generated by `./crash'.
Program terminated with signal 6, Aborted.
#0 0xa000000000010640 in __kernel_syscall_via_break ()
Still I think you have found a problem of its own which should be fixed
anyway.
On Fri, 29 Jan 2010 17:02:22 +0100, Joel Brobecker wrote:
> This explains the change of behavior, which is not entirely unreasonable.
> That being said, I think that the new behavior is less useful for the user.
I think the PASS->FAIL turn is in fact just accident, it could even turn
FAIL->PASS etc.
> What I suggest, is to catch all errors while trying to read the shared
> library map, and try to continue without. Something like the attached?
>
> It gives me the following output:
>
> (gdb) core core
> warning: exec file is newer than core file.
> [New Thread 5437]
> warning: Can't read pathname for load map: Input/output error.
> Reading symbols from /lib/tls/libc.so.6.1...(no debugging symbols found)...done.
> Loaded symbols for /lib/tls/libc.so.6.1
> Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib/ld-linux-ia64.so.2
> Core was generated by `./crash'.
> Program terminated with signal 6, Aborted.
> #0 0xa000000000010640 in __kernel_syscall_via_break ()
When some memory access fails it should be IMO printed as it is some sort of
incorrect operation. Therefore I would prefer the attached patch producing:
(gdb) core core
warning: core file may not match specified executable file.
[New Thread 5437]
Cannot access memory at address 0x1000000000009
Cannot access memory at address 0x1000000000009
Cannot access memory at address 0x1000000000009
Reading symbols from /lib/tls/libc.so.6.1...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6.1
Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux-ia64.so.2
Cannot access memory at address 0x1000000000009
Core was generated by `./crash'.
Program terminated with signal 6, Aborted.
#0 0xa000000000010640 in __kernel_syscall_via_break ()
(this dump has been made up a bit)
> As mentioned on IRC, I could not test this on ia64-linux with the official
> testsuite, as the testsuite run immediately crashes the two machines on
> which I tried (after which I was firmly prohibited from making any additional
> attempt on any other machines).
Tested your original patch on RHEL-5.4 ia64 with no regressions using:
make check-single RUNTESTFLAGS="--ignore 'watch-vfork.exp interrupt.exp'"
Tested the patch attached below on {x86_64,x86_64-m32,i686}-fedora12-linux-gnu
(as it is IMO fully arch-independent).
Technical details:
This patch could be made more simple just by extending your patch by:
print_any_exception (gdb_stderr, errstring, exception);
Just if there exists catch_errors shouldn't it be used instead of TRY_CATCH?
Moreover catch_errors is considered deprecated in favor of catch_exceptions.
But I am open for inlining catch_errors there as you did, with that new
print_any_exception call, if you agree.
Moreover this patch is not required if a new patch for PAGESIZE sanity checking
PIE displacement
Thanks,
Jan
2010-02-01 Jan Kratochvil <jan.kratochvil@redhat.com>
Joel Brobecker <brobecker@adacore.com>
* solib-svr4.c
(struct solib_svr4_r_map_args): New.
(solib_svr4_r_map): Call solib_svr4_r_map_stub and move the former
code content to ...
(solib_svr4_r_map_stub): ... a new function.
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -823,6 +823,30 @@ locate_base (struct svr4_info *info)
return info->debug_base;
}
+/* Arguments for catch_exceptions's call of solib_svr4_r_map_stub. */
+
+struct solib_svr4_r_map_args
+ {
+ struct svr4_info *info;
+ CORE_ADDR retval;
+ };
+
+/* Stub interface for catch_exceptions's called from solib_svr4_r_map. */
+
+static int
+solib_svr4_r_map_stub (struct ui_out *ui_out, void *args1)
+{
+ struct solib_svr4_r_map_args *args = args1;
+ struct svr4_info *info = args->info;
+ struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
+ struct type *ptr_type = builtin_type (target_gdbarch)->builtin_data_ptr;
+
+ args->retval = read_memory_typed_address ((info->debug_base
+ + lmo->r_map_offset),
+ ptr_type);
+ return 1;
+}
+
/* Find the first element in the inferior's dynamic link map, and
return its address in the inferior.
@@ -833,11 +857,16 @@ locate_base (struct svr4_info *info)
static CORE_ADDR
solib_svr4_r_map (struct svr4_info *info)
{
- struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
- struct type *ptr_type = builtin_type (target_gdbarch)->builtin_data_ptr;
+ struct solib_svr4_r_map_args args;
- return read_memory_typed_address (info->debug_base + lmo->r_map_offset,
- ptr_type);
+ args.info = info;
+
+ /* Return 0 as a failure result if there is a memory read error. */
+ args.retval = 0;
+
+ catch_exceptions (uiout, solib_svr4_r_map_stub, &args, RETURN_MASK_ERROR);
+
+ return args.retval;
}
/* Find r_brk from the inferior's debug base. */
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [RFC/ia64] memory error when reading wrong core file
2010-02-01 1:25 ` Jan Kratochvil
@ 2010-02-01 6:54 ` Joel Brobecker
2010-02-01 19:20 ` Jan Kratochvil
2010-03-08 7:46 ` Joel Brobecker
0 siblings, 2 replies; 6+ messages in thread
From: Joel Brobecker @ 2010-02-01 6:54 UTC (permalink / raw)
To: Jan Kratochvil; +Cc: gdb-patches
[-- Attachment #1: Type: text/plain, Size: 1876 bytes --]
Thanks for taking the time to test my patch!
> this specific problem gets also fixed by:
> [patch] Sanity check PIE displacement (like the PIC one)
> http://sourceware.org/ml/gdb-patches/2010-02/msg00000.html
[...]
> Still I think you have found a problem of its own which should be fixed
> anyway.
Agreed.
> When some memory access fails it should be IMO printed as it is some sort of
> incorrect operation. Therefore I would prefer the attached patch producing:
Agreed as well.
> Just if there exists catch_errors shouldn't it be used instead of TRY_CATCH?
TRY_CATCH was introduced after catch_error, and I *think* that it was
intended as a simpler way to call some code while being able to handle
gdb exceptions. I don't know about others, but I do not really like
the catch_errors/catch_exceptions interface - I think it's awkward to
use: You almost always have to define an "args" struct type that contains
the arguments, and most of the time a developer will also separate
the stub (the function called by catch_exceptions which unmarshalls the
function arguments) from the real implementation. TRY_CATCH greatly
simplifies the use process, which is why I'm trying to use it in preference
to catch errors as much as I can.
The downside, as you are pointing out, is the lack of error message
when an exception is raised. The attached patch fixes this.
That made me wonder, however, why we have 2 routines to do the exception
printing. There's probably some cleanup we could do, there...
gdb/ChangeLog:
* solib-svr4.c (solib_svr4_r_map): catch and print all exception
errors while reading the inferior memory, and return zero if
an exception was raised.
This patch should be strictly equivalent to yours, I believe. I've tested
it against the testcase described in this thread as well as the rest of
AdaCore's testsuite.
--
Joel
[-- Attachment #2: solib-svr4.diff --]
[-- Type: text/x-diff, Size: 781 bytes --]
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index e497364..dc9a685 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -835,9 +835,16 @@ solib_svr4_r_map (struct svr4_info *info)
{
struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
struct type *ptr_type = builtin_type (target_gdbarch)->builtin_data_ptr;
+ CORE_ADDR addr = 0;
+ volatile struct gdb_exception ex;
- return read_memory_typed_address (info->debug_base + lmo->r_map_offset,
- ptr_type);
+ TRY_CATCH (ex, RETURN_MASK_ERROR)
+ {
+ addr = read_memory_typed_address (info->debug_base + lmo->r_map_offset,
+ ptr_type);
+ }
+ exception_print (gdb_stderr, ex);
+ return addr;
}
/* Find r_brk from the inferior's debug base. */
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [RFC/ia64] memory error when reading wrong core file
2010-02-01 6:54 ` Joel Brobecker
@ 2010-02-01 19:20 ` Jan Kratochvil
2010-03-08 7:46 ` Joel Brobecker
1 sibling, 0 replies; 6+ messages in thread
From: Jan Kratochvil @ 2010-02-01 19:20 UTC (permalink / raw)
To: Joel Brobecker; +Cc: gdb-patches
On Mon, 01 Feb 2010 07:53:28 +0100, Joel Brobecker wrote:
> > Just if there exists catch_errors shouldn't it be used instead of TRY_CATCH?
>
> TRY_CATCH was introduced after catch_error, and I *think* that it was
> intended as a simpler way to call some code while being able to handle
> gdb exceptions.
OK, thanks for the info. The code looks better with TRY_CATCH and the longjmp
magic is there either way.
> That made me wonder, however, why we have 2 routines to do the exception
> printing. There's probably some cleanup we could do, there...
Yes... :-)
> gdb/ChangeLog:
>
> * solib-svr4.c (solib_svr4_r_map): catch and print all exception
> errors while reading the inferior memory, and return zero if
> an exception was raised.
>
> This patch should be strictly equivalent to yours, I believe. I've tested
> it against the testcase described in this thread as well as the rest of
> AdaCore's testsuite.
Retested on RHEL-5.4.ia64 as you asked before.
Thanks,
Jan
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC/ia64] memory error when reading wrong core file
2010-02-01 6:54 ` Joel Brobecker
2010-02-01 19:20 ` Jan Kratochvil
@ 2010-03-08 7:46 ` Joel Brobecker
1 sibling, 0 replies; 6+ messages in thread
From: Joel Brobecker @ 2010-03-08 7:46 UTC (permalink / raw)
To: gdb-patches
> gdb/ChangeLog:
>
> * solib-svr4.c (solib_svr4_r_map): catch and print all exception
> errors while reading the inferior memory, and return zero if
> an exception was raised.
This is a patch that I should have checked in a while ago, but somehow
forgot. Now in (HEAD only, since hardly critical).
--
Joel
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-03-08 7:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-29 16:02 [RFC/ia64] memory error when reading wrong core file Joel Brobecker
2010-01-29 17:53 ` Jan Kratochvil
2010-02-01 1:25 ` Jan Kratochvil
2010-02-01 6:54 ` Joel Brobecker
2010-02-01 19:20 ` Jan Kratochvil
2010-03-08 7:46 ` Joel Brobecker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox