From: Christian Biesinger <cbiesinger@google.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Andrew Burgess <andrew.burgess@embecosm.com>,
"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [PATCHv3] Fix range end handling of inlined subroutines
Date: Thu, 12 Mar 2020 13:27:51 -0500
Message-ID: <CAPTJ0XG8Po2633xXdFY4E7Bxp1sws_SC-GmKZy3g00NH+SVF7A@mail.gmail.com>
In-Reply-To: <AM6PR03MB51701431E9A888715E98F704E4FD0@AM6PR03MB5170.eurprd03.prod.outlook.com>
On Thu, Mar 12, 2020 at 1:21 PM Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
>
> On 3/11/20 11:02 PM, Andrew Burgess wrote:
> > * Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-03-08 14:57:35 +0000]:
> >
> >> On 2/22/20 7:38 AM, Bernd Edlinger wrote:
> >>> On 2/9/20 10:07 PM, Bernd Edlinger wrote:
> >>>> Hi,
> >>>>
> >>>> this is based on Andrew's patch here:
> >>>>
> >>>> https://sourceware.org/ml/gdb-patches/2020-02/msg00092.html
> >>>>
> >>>> This adds a heuristic to fix the case where caller
> >>>> and callee reside in the same subfile, it uses
> >>>> the recorded is-stmt bits and locates end of
> >>>> range infos, including the previously ignored empty
> >>>> range, and adjusting is-stmt info at this
> >>>> same pc, when the last item is not-is-stmt, the
> >>>> previous line table entries are dubious and
> >>>> we reset the is-stmt bit there.
> >>>> This fixes the new test case in Andrew's patch.
> >>>>
> >>>> It is understood that this is just a heuristic
> >>>> approach, since we do not have exactly the data,
> >>>> which is needed to determine at which of the identical
> >>>> PCs in the line table the subroutine actually ends.
> >>>> So, this tries to be as conservative as possible,
> >>>> just changing what is absolutely necessary.
> >>>>
> >>>> This patch itself is preliminary, since the is-stmt patch
> >>>> needs to be rebased after the refactoring of
> >>>> dwarf2read.c from yesterday, so I will have to rebase
> >>>> this patch as well, but decided to wait for Andrew.
> >>>>
> >>>>
> >>>
> >>> So, this is an update to my previous patch above:
> >>> https://sourceware.org/ml/gdb-patches/2020-02/msg00262.html
> >>>
> >>> It improves performance on big data, by using binary
> >>> search to locate the bogus line table entries.
> >>> Otherwise it behaves identical to the previous version,
> >>> and is still waiting for Andrew's patch before it can
> >>> be applied.
> >>>
> >>>
> >>
> >> Rebased to match Andrew's updated patch of today.
> >
> >
> > Bernd,
> >
> > Thanks for working on this. This looks like a great improvement, but
> > I have a couple of questions / comments inline below.
> >
> > Thanks,
> > Andrew
> >
> >
> >>
> >>
> >> Thanks
> >> Bernd.
> >
> >> From 010f1774ec69d5cab48db9b7a5fb9de3d5860f97 Mon Sep 17 00:00:00 2001
> >> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
> >> Date: Sun, 9 Feb 2020 21:13:17 +0100
> >> Subject: [PATCH] Fix range end handling of inlined subroutines
> >>
> >> Since the is_stmt is now handled, it becomes
> >> possible to locate dubious is_stmt line entries
> >> at the end of an inlined function, even if the
> >> called inline function is in the same subfile.
> >>
> >> When there is a sequence of line entries at the
> >> same address where an inline range ends, and the
> >> last item has is_stmt = 0, we force all previous
> >> items to have is_stmt = 0 as well.
> >
> > This describes _what_ we're doing...
> >
> >>
> >> If the last line at that address has is_stmt = 1,
> >> there is no need to change anything, since a step
> >> over will always stop at that last line from the
> >> same address, which is fine, since it is outside
> >> the subroutine.
> >
> > This describes what we're _not_ doing...
> >
> > ... it just feels like this description would benefit from a bit more
> > _why_. How does setting all the is_stmt fields to 0 help us? In which
> > step/next cases should we expect to see improvement?
> >
> > I'm also worried that setting is_stmt to false will mean we lose the
> > ability to set breakpoints on some lines using file:line syntax. I
> > wonder if we should consider coming up with a solution that both
> > preserves the ability to set breakpoints, while also giving the
> > benefits for step/next.
> >
>
> Maybe as a followup. The breakpoints on these lines are *very*
> problematic, as the call stack is almost always wrong. From the block
> info they appear to be part of the caller, but from the line table they
> are located in the subroutine, when you are at this breakpoint, an
> ordinary user can't make sense of the call stack, as one call frame is
> missing.
>
> The is-stmt patch does the same, removing line infos that could
> in theory be used as breakpoints. See buildsym_compunit::record_line:
>
>   /* Normally, we treat lines as unsorted. But the end of sequence
>      marker is special. We sort line markers at the same PC by line
>      number, so end of sequence markers (which have line == 0) appear
>      first. This is right if the marker ends the previous function,
>      and there is no padding before the next function. But it is
>      wrong if the previous line was empty and we are now marking a
>      switch to a different subfile. We must leave the end of sequence
>      marker at the end of this group of lines, not sort the empty line
>      to after the marker. The easiest way to accomplish this is to
>      delete any empty lines from our table, if they are followed by
>      end of sequence markers. All we lose is the ability to set
>      breakpoints at some lines which contain no instructions
>      anyway. */
>   if (line == 0 && subfile->line_vector->nitems > 0)
>     {
>       e = subfile->line_vector->item + subfile->line_vector->nitems - 1;
>       while (subfile->line_vector->nitems > 0 && e->pc == pc)
>         {
>           e--;
>           subfile->line_vector->nitems--;
>         }
>     }
>
>
> This is invoked if an is-stmt line is followed by
> a non-is-stmt line of another CU, at the same PC.
> Previously the non-is-stmt lines were all ignored,
> but now the non-is-stmt line makes an end marker in the current
> CU and a non-is-stmt line in the next CU.
> This deletion is exactly what makes the other patch work.
>
> But I think it is not necessary to remove those lines altogether,
> instead we can just make them non-is-stmt lines, which would
> better match to what this patch does.
>
> While writing these lines, I see UB in the above code:
> e-- can decrement the pointer to before the beginning of the
> line_vector when nitems becomes 0. The value is not used, but
> the pointer value e is indeterminate, if I see that right.
>
> I will send a patch for that, when I find time for it.
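
One way to avoid forming a pointer before the start of the array, as
described above, is to test the item count first and index from the
count instead of walking a pointer backwards. The following is only a
sketch: `entry` and `drop_trailing_entries_at` are hypothetical
stand-ins, not gdb's actual types and not the eventual fix.

```cpp
#include <cassert>

// Hypothetical stand-in for gdb's linetable_entry; only the fields
// needed for this illustration.
struct entry
{
  int line;
  unsigned long pc;
};

// Drop trailing entries whose pc equals PC and return the new item
// count.  The count is checked before an element is accessed, so no
// pointer or index below the start of the array is ever formed.
static int
drop_trailing_entries_at (const entry *items, int nitems, unsigned long pc)
{
  assert (nitems >= 0);

  while (nitems > 0 && items[nitems - 1].pc == pc)
    nitems--;

  return nitems;
}
```

The behaviour is meant to match the quoted loop; only the order of the
bounds check and the element access changes.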
>
> >>
> >> This is about the best we can do at the moment,
> >> unless location view information are added to the
> >> block ranges debug info structure, and location
> >> views are implemented in gdb in general.
> >>
> >> gdb:
> >> 2020-03-08 Bernd Edlinger <bernd.edlinger@hotmail.de>
> >>
> >> * buildsym.c (buildsym_compunit::record_inline_range_end,
> >> patch_inline_end_pos): New helper functions.
> >> (buildsym_compunit::end_symtab_with_blockvector): Patch line table.
> >> (buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.
> >> * buildsym.h (buildsym_compunit::record_inline_range_end): Declare.
> >> (buildsym_compunit::m_inline_end_vector,
> >> buildsym_compunit::m_inline_end_vector_length,
> >> buildsym_compunit::m_inline_end_vector_nitems): New data items.
> >> * dwarf2/read.c (dwarf2_rnglists_process,
> >> dwarf2_ranges_process): Don't ignore empty ranges here.
> >> (dwarf2_ranges_read): Ignore empty ranges here.
> >> (dwarf2_record_block_ranges): Pass end of range PC to
> >> record_inline_range_end for inline functions.
> >>
> >> gdb/testsuite:
> >> 2020-03-08 Bernd Edlinger <bernd.edlinger@hotmail.de>
> >>
> >> * gdb.cp/step-and-next-inline.exp: Adjust test.
> >> ---
> >> gdb/buildsym.c | 84 +++++++++++++++++++++++++++
> >> gdb/buildsym.h | 11 ++++
> >> gdb/dwarf2/read.c | 22 ++++---
> >> gdb/testsuite/gdb.cp/step-and-next-inline.exp | 17 ------
> >> 4 files changed, 108 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> >> index 24aeba8..f02b7ce 100644
> >> --- a/gdb/buildsym.c
> >> +++ b/gdb/buildsym.c
> >> @@ -113,6 +113,8 @@ struct pending_block
> >> next1 = next->next;
> >> xfree ((void *) next);
> >> }
> >> +
> >> + xfree (m_inline_end_vector);
> >> }
> >>
> >> struct macro_table *
> >> @@ -732,6 +734,84 @@ struct blockvector *
> >> }
> >>
> >>
> >> +/* Record a PC where a inlined subroutine ends. */
> >> +
> >> +void
> >> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)
> >> +{
> >> + if (m_inline_end_vector == nullptr)
> >> + {
> >> + m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;
> >> + m_inline_end_vector = (CORE_ADDR *)
> >> + xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);
> >> + m_inline_end_vector_nitems = 0;
> >> + }
> >> + else if (m_inline_end_vector_nitems == m_inline_end_vector_length)
> >> + {
> >> + m_inline_end_vector_length *= 2;
> >> + m_inline_end_vector = (CORE_ADDR *)
> >> + xrealloc ((char *) m_inline_end_vector,
> >> + sizeof (CORE_ADDR) * m_inline_end_vector_length);
> >> + }
> >> +
> >> + m_inline_end_vector[m_inline_end_vector_nitems++] = end;
> >> +}
> >> +
> >> +
> >> +/* Patch the is_stmt bits at the given inline end address.
> >> + The line table has to be already sorted. */
> >> +
> >> +static void
> >> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)
> >> +{
> >> + int a = 2, b = table->nitems - 1;
> >> + struct linetable_entry *items = table->item;
> >> +
> >> + /* We need at least two items with pc = end in the table.
> >> + The lowest usable items are at pos 0 and 1, the highest
> >> + usable items are at pos b - 2 and b - 1. */
> >> + if (a > b || end < items[1].pc || end > items[b - 2].pc)
> >> + return;
> >> +
> >> + /* Look for the first item with pc > end in the range [a,b].
> >> + The previous element has pc = end or there is no match.
> >> + We set a = 2, since we need at least two consecutive elements
> >> + with pc = end to do anything useful.
> >> + We set b = nitems - 1, since we are not interested in the last
> >> + element which should be an end of sequence marker with line = 0
> >> + and is_stmt = 1. */
> >> + while (a < b)
> >> + {
> >> + int c = (a + b) / 2;
> >> +
> >> + if (end < items[c].pc)
> >> + b = c;
> >> + else
> >> + a = c + 1;
> >> + }
> >> +
> >> + a--;
> >> + if (items[a].pc != end || items[a].is_stmt)
> >> + return;
> >> +
> >> + /* When there is a sequence of line entries at the same address
> >> + where an inline range ends, and the last item has is_stmt = 0,
> >> + we force all previous items to have is_stmt = 0 as well.
> >> + Setting breakpoints at those addresses is currently not
> >> + supported, since it is unclear if the previous addresses are
> >> + part of the subroutine or the calling program. */
> >> + while (a-- > 0)
> >
> > I know this is a personal preference, but I really dislike having
> > these inc/dec adjustments within expressions like this. I feel it's
> > too easy to overlook, and I always end up having to think twice just
> > to make sure I've parsed the intention correctly. I'd like to make a
> > plea to have the adjustment moved into the body of the loop please.
> >
>
> No problem, I can do that better, the first iteration does
> not need to check for a > 0, so I will make that a
> do { a--; .... } while (a > 0), so that will be even more
> efficient this way.
>
> >> + {
> >> + /* We stop at the first line entry with a different address,
> >> + or when we see an end of sequence marker. */
> >
> > I think there's a whitespace error at the start of this line.
>
> Yes, indeed.
>
> >
> >> + if (items[a].pc != end || items[a].line == 0)
> >> + break;
> >> +
> >> + items[a].is_stmt = 0;
> >> + }
> >> +}
> >> +
> >> +
> >> /* Subroutine of end_symtab to simplify it. Look for a subfile that
> >> matches the main source file's basename. If there is only one, and
> >> if the main source file doesn't have any symbol or line number
> >> @@ -965,6 +1045,10 @@ struct compunit_symtab *
> >> subfile->line_vector->item
> >> + subfile->line_vector->nitems,
> >> lte_is_less_than);
> >> +
> >> + for (int i = 0; i < m_inline_end_vector_nitems; i++)
> >> + patch_inline_end_pos (subfile->line_vector,
> >> + m_inline_end_vector[i]);
> >> }
> >>
> >> /* Allocate a symbol table if necessary. */
> >> diff --git a/gdb/buildsym.h b/gdb/buildsym.h
> >> index c768a4c..2845789 100644
> >> --- a/gdb/buildsym.h
> >> +++ b/gdb/buildsym.h
> >> @@ -190,6 +190,8 @@ struct buildsym_compunit
> >> void record_line (struct subfile *subfile, int line, CORE_ADDR pc,
> >> bool is_stmt);
> >>
> >> + void record_inline_range_end (CORE_ADDR end);
> >> +
> >> struct compunit_symtab *get_compunit_symtab ()
> >> {
> >> return m_compunit_symtab;
> >> @@ -397,6 +399,15 @@ struct buildsym_compunit
> >>
> >> /* Pending symbols that are local to the lexical context. */
> >> struct pending *m_local_symbols = nullptr;
> >> +
> >> + /* Pending inline end range addresses. */
> >> + CORE_ADDR * m_inline_end_vector = nullptr;
> >> +
> >> + /* Number of allocated inline end range addresses. */
> >> + int m_inline_end_vector_length = 0;
> >> +
> >> + /* Number of pending inline end range addresses. */
> >> + int m_inline_end_vector_nitems = 0;
> >> };
> >
> > I think we should probably just use a std::vector<CORE_ADDR> for
> > m_inline_end_vector - unless I'm missing a good reason. I think this
> > would simplify some of the code in buildsym.c.
> >
>
> I am worried that the vector is not as efficient as is necessary here.
> I know for instance that a straightforward
>
> m_inline_end_vector.push_back(pc);
>
> often has O(n^2) complexity just for adding new elements
> (and it can throw, which makes error handling a mess).
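
push_back is not O(n^2). A single call is amortized constant time, so
appending n elements costs O(n) in total. See
https://en.cppreference.com/w/cpp/container/vector/push_back:
"Complexity
Amortized constant"

That's because the capacity grows geometrically (typically it doubles)
whenever the vector has to resize.

Christian
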
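
A rough way to check this locally is to count how often the capacity
changes while appending. This is only an illustrative sketch;
`growth_stats` and `measure_push_back` are made-up names, and the exact
growth factor is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counters collected while appending; hypothetical helper type.
struct growth_stats
{
  std::size_t reallocations;   // How often the capacity changed.
  std::size_t elements_moved;  // Elements that existed at each change.
};

// Append N values one at a time and record every capacity change.
static growth_stats
measure_push_back (std::size_t n)
{
  std::vector<unsigned long> v;
  growth_stats stats = {0, 0};

  for (std::size_t i = 0; i < n; i++)
    {
      std::size_t old_capacity = v.capacity ();
      std::size_t old_size = v.size ();

      v.push_back (i);

      if (v.capacity () != old_capacity)
        {
          stats.reallocations++;
          stats.elements_moved += old_size;
        }
    }

  return stats;
}
```

With geometric growth the number of reallocations grows with log(n),
and the total number of moved elements stays proportional to n. That
is what "amortized constant" means for a single push_back.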
> But the performance of this table needs to be O(n * log(n))
> since n is often around 1,000,000 (when I debug large
> apps like gcc).
>
> A previous version of this patch used O(n^2) and was exactly doubling
> the execution time for loading of the debug info (for debugging gcc's cc1).
> Although that appears to be already done incrementally.
>
>
> Bernd.
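
For comparison, the std::vector approach suggested earlier in the
thread could look roughly like the sketch below. It loosely follows
the logic of patch_inline_end_pos from the quoted patch, but uses
std::upper_bound for the binary search. `line_entry` and
`patch_inline_end` are hypothetical names chosen for this example, and
the table is assumed to be sorted by pc with an end-of-sequence marker
as its last element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for gdb's linetable_entry.
struct line_entry
{
  int line;
  bool is_stmt;
  unsigned long pc;
};

// If the last entry at address END is not a statement, clear is_stmt
// on the earlier entries at the same address.  TABLE must be sorted
// by pc; its final element is skipped as the end-of-sequence marker.
static void
patch_inline_end (std::vector<line_entry> &table, unsigned long end)
{
  if (table.size () < 3)
    return;

  auto last = table.end () - 1;

  // Find the first entry with pc > END, ignoring the final marker.
  auto it = std::upper_bound (table.begin (), last, end,
                              [] (unsigned long pc, const line_entry &e)
                              { return pc < e.pc; });
  if (it == table.begin ())
    return;

  // Step back to the last entry with pc <= END.
  --it;
  if (it->pc != end || it->is_stmt)
    return;

  // Walk backwards over the remaining entries at END, stopping at a
  // different address or at an end-of-sequence marker (line == 0).
  while (it != table.begin ())
    {
      --it;
      if (it->pc != end || it->line == 0)
        break;
      it->is_stmt = false;
    }
}
```

Recording the end addresses would then be a plain push_back onto a
std::vector<CORE_ADDR>, followed by one call per recorded address
after the line table has been sorted, giving O(n * log(n)) overall for
the lookups.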