Mirror of the gdb-patches mailing list
From: Christian Biesinger <cbiesinger@google.com>
To: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Andrew Burgess <andrew.burgess@embecosm.com>,
	 "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Subject: Re: [PATCHv3] Fix range end handling of inlined subroutines
Date: Thu, 12 Mar 2020 13:27:51 -0500
Message-ID: <CAPTJ0XG8Po2633xXdFY4E7Bxp1sws_SC-GmKZy3g00NH+SVF7A@mail.gmail.com>
In-Reply-To: <AM6PR03MB51701431E9A888715E98F704E4FD0@AM6PR03MB5170.eurprd03.prod.outlook.com>

On Thu, Mar 12, 2020 at 1:21 PM Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
>
> On 3/11/20 11:02 PM, Andrew Burgess wrote:
> > * Bernd Edlinger <bernd.edlinger@hotmail.de> [2020-03-08 14:57:35 +0000]:
> >
> >> On 2/22/20 7:38 AM, Bernd Edlinger wrote:
> >>> On 2/9/20 10:07 PM, Bernd Edlinger wrote:
> >>>> Hi,
> >>>>
> >>>> this is based on Andrew's patch here:
> >>>>
> >>>> https://sourceware.org/ml/gdb-patches/2020-02/msg00092.html
> >>>>
> >>>> This adds a heuristic to fix the case where caller
> >>>> and callee reside in the same subfile, it uses
> >>>> the recorded is-stmt bits and locates end of
> >>>> range infos, including the previously ignored empty
> >>>> range, and adjusting is-stmt info at this
> >>>> same pc, when the last item is not-is-stmt, the
> >>>> previous line table entries are dubious and
> >>>> we reset the is-stmt bit there.
> >>>> This fixes the new test case in Andrew's patch.
> >>>>
> >>>> It is understood that this is just a heuristic
> >>>> approach, since we do not have exactly the data,
> >>>> which is needed to determine at which of the identical
> >>>> PCs in the line table the subroutine actually ends.
> >>>> So, this tries to be as conservative as possible,
> >>>> just changing what is absolutely necessary.
> >>>>
> >>>> This patch itself is preliminary, since the is-stmt patch
> >>>> needs to be rebased after the refactoring of
> >>>> dwarf2read.c from yesterday, so I will have to rebase
> >>>> this patch as well, but decided to wait for Andrew.
> >>>>
> >>>>
> >>>
> >>> So, this is an update to my previous patch above:
> >>> https://sourceware.org/ml/gdb-patches/2020-02/msg00262.html
> >>>
> >>> It improves performance on big data, by using binary
> >>> search to locate the bogus line table entries.
> >>> Otherwise it behaves identical to the previous version,
> >>> and is still waiting for Andrew's patch before it can
> >>> be applied.
> >>>
> >>>
> >>
> >> Rebased to match Andrew's updated patch of today.
> >
> >
> > Bernd,
> >
> > Thanks for working on this.  This looks like a great improvement, but
> > I have a couple of questions / comments inline below.
> >
> > Thanks,
> > Andrew
> >
> >
> >>
> >>
> >> Thanks
> >> Bernd.
> >
> >> From 010f1774ec69d5cab48db9b7a5fb9de3d5860f97 Mon Sep 17 00:00:00 2001
> >> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
> >> Date: Sun, 9 Feb 2020 21:13:17 +0100
> >> Subject: [PATCH] Fix range end handling of inlined subroutines
> >>
> >> Since the is_stmt is now handled, it becomes
> >> possible to locate dubious is_stmt line entries
> >> at the end of an inlined function, even if the
> >> called inline function is in the same subfile.
> >>
> >> When there is a sequence of line entries at the
> >> same address where an inline range ends, and the
> >> last item has is_stmt = 0, we force all previous
> >> items to have is_stmt = 0 as well.
> >
> > This describes _what_ we're doing...
> >
> >>
> >> If the last line at that address has is_stmt = 1,
> >> there is no need to change anything, since a step
> >> over will always stop at that last line from the
> >> same address, which is fine, since it is outside
> >> the subroutine.
> >
> > This describes what we're _not_ doing...
> >
> > ... it just feels like this description would benefit from a bit more
> > _why_.  How does setting all the is_stmt fields to 0 help us?  In
> > which step/next cases should we expect to see improvement?
> >
> > I'm also worried that setting is_stmt to false will mean we lose the
> > ability to set breakpoints on some lines using file:line syntax.  I
> > wonder if we should consider coming up with a solution that both
> > preserves the ability to set breakpoints, while also giving the
> > benefits for step/next.
> >
>
> Maybe as a followup.  The breakpoints on these lines are *very*
> problematic, as the call stack is almost always wrong.  From the block
> info they appear to be part of the caller, but from the line table they
> are located in the subroutine, when you are at this breakpoint, an
> ordinary user can't make sense of the call stack, as one call frame is
> missing.
>
> The is-stmt patch does the same, removing line infos that could
> in theory be used as breakpoints. See buildsym_compunit::record_line:
>
>   /* Normally, we treat lines as unsorted.  But the end of sequence
>      marker is special.  We sort line markers at the same PC by line
>      number, so end of sequence markers (which have line == 0) appear
>      first.  This is right if the marker ends the previous function,
>      and there is no padding before the next function.  But it is
>      wrong if the previous line was empty and we are now marking a
>      switch to a different subfile.  We must leave the end of sequence
>      marker at the end of this group of lines, not sort the empty line
>      to after the marker.  The easiest way to accomplish this is to
>      delete any empty lines from our table, if they are followed by
>      end of sequence markers.  All we lose is the ability to set
>      breakpoints at some lines which contain no instructions
>      anyway.  */
>   if (line == 0 && subfile->line_vector->nitems > 0)
>     {
>       e = subfile->line_vector->item + subfile->line_vector->nitems - 1;
>       while (subfile->line_vector->nitems > 0 && e->pc == pc)
>         {
>           e--;
>           subfile->line_vector->nitems--;
>         }
>     }
>
>
> This is invoked if an is-stmt line is followed by
> a non-is-stmt line of another CU, at the same PC.
> Previously the non-is-stmt lines were all ignored,
> but now the non-is-stmt line makes an end marker in the current
> CU and a non-is-stmt line in the next CU.
> This deletion is exactly what makes the other patch work.
>
> But I think it is not necessary to remove those lines altogether,
> instead we can just make them non-is-stmt lines, which would
> better match to what this patch does.
>
> While I write these lines, I see UB in the above code,
> since e-- can decrement the pointer before the beginning of the
> line_vector when nitems becomes 0.  The value is not used, but
> the pointer value e is indeterminate, if I see that right.
>
> I will send a patch for that, when I find time for it.
>
> >>
> >> This is about the best we can do at the moment,
> >> unless location view information are added to the
> >> block ranges debug info structure, and location
> >> views are implemented in gdb in general.
> >>
> >> gdb:
> >> 2020-03-08  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> >>
> >>      * buildsym.c (buildsym_compunit::record_inline_range_end,
> >>      patch_inline_end_pos): New helper functions.
> >>      (buildsym_compunit::end_symtab_with_blockvector): Patch line table.
> >>      (buildsym_compunit::~buildsym_compunit): Cleanup m_inline_end_vector.
> >>      * buildsym.h (buildsym_compunit::record_inline_range_end): Declare.
> >>      (buildsym_compunit::m_inline_end_vector,
> >>      buildsym_compunit::m_inline_end_vector_length,
> >>      buildsym_compunit::m_inline_end_vector_nitems): New data items.
> >>      * dwarf2/read.c (dwarf2_rnglists_process,
> >>      dwarf2_ranges_process): Don't ignore empty ranges here.
> >>      (dwarf2_ranges_read): Ignore empty ranges here.
> >>      (dwarf2_record_block_ranges): Pass end of range PC to
> >>      record_inline_range_end for inline functions.
> >>
> >> gdb/testsuite:
> >> 2020-03-08  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> >>
> >>      * gdb.cp/step-and-next-inline.exp: Adjust test.
> >> ---
> >>  gdb/buildsym.c                                | 84 +++++++++++++++++++++++++++
> >>  gdb/buildsym.h                                | 11 ++++
> >>  gdb/dwarf2/read.c                             | 22 ++++---
> >>  gdb/testsuite/gdb.cp/step-and-next-inline.exp | 17 ------
> >>  4 files changed, 108 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> >> index 24aeba8..f02b7ce 100644
> >> --- a/gdb/buildsym.c
> >> +++ b/gdb/buildsym.c
> >> @@ -113,6 +113,8 @@ struct pending_block
> >>        next1 = next->next;
> >>        xfree ((void *) next);
> >>      }
> >> +
> >> +  xfree (m_inline_end_vector);
> >>  }
> >>
> >>  struct macro_table *
> >> @@ -732,6 +734,84 @@ struct blockvector *
> >>  }
> >>
> >>
> >> +/* Record a PC where a inlined subroutine ends.  */
> >> +
> >> +void
> >> +buildsym_compunit::record_inline_range_end (CORE_ADDR end)
> >> +{
> >> +  if (m_inline_end_vector == nullptr)
> >> +    {
> >> +      m_inline_end_vector_length = INITIAL_LINE_VECTOR_LENGTH;
> >> +      m_inline_end_vector = (CORE_ADDR *)
> >> +    xmalloc (sizeof (CORE_ADDR) * m_inline_end_vector_length);
> >> +      m_inline_end_vector_nitems = 0;
> >> +    }
> >> +  else if (m_inline_end_vector_nitems == m_inline_end_vector_length)
> >> +    {
> >> +      m_inline_end_vector_length *= 2;
> >> +      m_inline_end_vector = (CORE_ADDR *)
> >> +    xrealloc ((char *) m_inline_end_vector,
> >> +              sizeof (CORE_ADDR) * m_inline_end_vector_length);
> >> +    }
> >> +
> >> +  m_inline_end_vector[m_inline_end_vector_nitems++] = end;
> >> +}
> >> +
> >> +
> >> +/* Patch the is_stmt bits at the given inline end address.
> >> +   The line table has to be already sorted.  */
> >> +
> >> +static void
> >> +patch_inline_end_pos (struct linetable *table, CORE_ADDR end)
> >> +{
> >> +  int a = 2, b = table->nitems - 1;
> >> +  struct linetable_entry *items = table->item;
> >> +
> >> +  /* We need at least two items with pc = end in the table.
> >> +     The lowest usable items are at pos 0 and 1, the highest
> >> +     usable items are at pos b - 2 and b - 1.  */
> >> +  if (a > b || end < items[1].pc || end > items[b - 2].pc)
> >> +    return;
> >> +
> >> +  /* Look for the first item with pc > end in the range [a,b].
> >> +     The previous element has pc = end or there is no match.
> >> +     We set a = 2, since we need at least two consecutive elements
> >> +     with pc = end to do anything useful.
> >> +     We set b = nitems - 1, since we are not interested in the last
> >> +     element which should be an end of sequence marker with line = 0
> >> +     and is_stmt = 1.  */
> >> +  while (a < b)
> >> +    {
> >> +      int c = (a + b) / 2;
> >> +
> >> +      if (end < items[c].pc)
> >> +    b = c;
> >> +      else
> >> +    a = c + 1;
> >> +    }
> >> +
> >> +  a--;
> >> +  if (items[a].pc != end || items[a].is_stmt)
> >> +    return;
> >> +
> >> +  /* When there is a sequence of line entries at the same address
> >> +     where an inline range ends, and the last item has is_stmt = 0,
> >> +     we force all previous items to have is_stmt = 0 as well.
> >> +     Setting breakpoints at those addresses is currently not
> >> +     supported, since it is unclear if the previous addresses are
> >> +     part of the subroutine or the calling program.  */
> >> +  while (a-- > 0)
> >
> > I know this is a personal preference, but I really dislike having
> > these inc/dec adjustments within expressions like this.  I feel it's
> > too easy to overlook, and I always end up having to think twice just
> > to make sure I've parsed the intention correctly.  I'd like to make a
> > plea to have the adjustment moved into the body of the loop please.
> >
>
> No problem, I can do that better, the first iteration does
> not need to check for a > 0, so I will make that a
> do { a--; .... } while (a > 0), so that will be even more
> efficient this way.
>
> >> +    {
> >> +      /* We stop at the first line entry with a different address,
> >> +         or when we see an end of sequence marker.  */
> >
> > I think there's a whitespace error at the start of this line.
>
> Yes, indeed.
>
> >
> >> +      if (items[a].pc != end || items[a].line == 0)
> >> +    break;
> >> +
> >> +      items[a].is_stmt = 0;
> >> +    }
> >> +}
> >> +
> >> +
> >>  /* Subroutine of end_symtab to simplify it.  Look for a subfile that
> >>     matches the main source file's basename.  If there is only one, and
> >>     if the main source file doesn't have any symbol or line number
> >> @@ -965,6 +1045,10 @@ struct compunit_symtab *
> >>                            subfile->line_vector->item
> >>                            + subfile->line_vector->nitems,
> >>                            lte_is_less_than);
> >> +
> >> +       for (int i = 0; i < m_inline_end_vector_nitems; i++)
> >> +         patch_inline_end_pos (subfile->line_vector,
> >> +                               m_inline_end_vector[i]);
> >>      }
> >>
> >>        /* Allocate a symbol table if necessary.  */
> >> diff --git a/gdb/buildsym.h b/gdb/buildsym.h
> >> index c768a4c..2845789 100644
> >> --- a/gdb/buildsym.h
> >> +++ b/gdb/buildsym.h
> >> @@ -190,6 +190,8 @@ struct buildsym_compunit
> >>    void record_line (struct subfile *subfile, int line, CORE_ADDR pc,
> >>                  bool is_stmt);
> >>
> >> +  void record_inline_range_end (CORE_ADDR end);
> >> +
> >>    struct compunit_symtab *get_compunit_symtab ()
> >>    {
> >>      return m_compunit_symtab;
> >> @@ -397,6 +399,15 @@ struct buildsym_compunit
> >>
> >>    /* Pending symbols that are local to the lexical context.  */
> >>    struct pending *m_local_symbols = nullptr;
> >> +
> >> +  /* Pending inline end range addresses.  */
> >> +  CORE_ADDR * m_inline_end_vector = nullptr;
> >> +
> >> +  /* Number of allocated inline end range addresses.  */
> >> +  int m_inline_end_vector_length = 0;
> >> +
> >> +  /* Number of pending inline end range addresses.  */
> >> +  int m_inline_end_vector_nitems = 0;
> >>  };
> >
> > I think we should probably just use a std::vector<CORE_ADDR> for
> > m_inline_end_vector - unless I'm missing a good reason.  I think this
> > would simplify some of the code in buildsym.c.
> >
>
> I am worried that the vector is not as efficient as is necessary here.
> I know for instance that a straightforward
>
> m_inline_end_vector.push_back(pc);
>
> often has O(n^2) complexity just for adding new elements
> (and it can throw, which makes error handling a mess).

push_back is not O(n^2). See
https://en.cppreference.com/w/cpp/container/vector/push_back:
"Complexity
Amortized constant"

That's because it grows the capacity geometrically (libstdc++ doubles
it) whenever it has to resize.
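
A minimal sketch of how one could check the amortized cost, not taken
from the patch: it appends n addresses with push_back and counts how
often the capacity changes and how many elements had to be moved in
total.  The names core_addr, growth_stats and measure_push_back are
made up for this illustration (core_addr stands in for gdb's
CORE_ADDR), and the exact growth factor is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for gdb's CORE_ADDR, used only in this sketch.
using core_addr = std::uint64_t;

struct growth_stats
{
  std::size_t reallocations = 0;   // Number of times the capacity changed.
  std::size_t elements_moved = 0;  // Elements relocated by those changes.
};

// Append N addresses one at a time and record how often the vector grew.
growth_stats
measure_push_back (std::size_t n)
{
  std::vector<core_addr> v;
  growth_stats stats;

  for (std::size_t i = 0; i < n; i++)
    {
      std::size_t old_capacity = v.capacity ();
      std::size_t old_size = v.size ();

      v.push_back (static_cast<core_addr> (i));

      // A capacity change means the old contents were moved to a new buffer.
      if (v.capacity () != old_capacity)
        {
          stats.reallocations++;
          stats.elements_moved += old_size;
        }
    }

  return stats;
}
```

With geometric growth the number of reallocations is logarithmic in n
and the total number of moved elements stays within a small constant
multiple of n, so the whole sequence of appends is O(n) rather than
O(n^2).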

Christian

> But the performance of this table needs to be O(n * log(n)),
> since n is often around 1,000,000 (when I debug large
> apps like gcc).
>
> A previous version of this patch used O(n^2) and was exactly doubling
> the execution time for loading of the debug info (for debugging gcc's cc1).
> Although that appears to be already done incrementally.
>
>
> Bernd.

