From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-149455-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 100183 invoked by alias); 1 Aug 2018 02:01:16 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 100123 invoked by uid 89); 1 Aug 2018 02:01:07 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.4 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN autolearn=ham version=3.3.2 spammy=sk:overlay, 28333
X-HELO: sessmg22.ericsson.net
Received: from sessmg22.ericsson.net (HELO sessmg22.ericsson.net) (193.180.251.58) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 01 Aug 2018 02:01:04 +0000
DKIM-Signature: v=1; a=rsa-sha256; d=ericsson.com; s=mailgw201801; c=relaxed/simple;	q=dns/txt; i=@ericsson.com; t=1533088862;	h=From:Sender:Reply-To:Subject:Date:Message-ID:To:Cc:MIME-Version:Content-Type:	Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date:Resent-From:	Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References:List-Id:	List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;	bh=MpfDb7JH+5/9nsH3ldHnPH7r9+QQh0PFdDrMwYywMbg=;	b=e+yBt9q4pAqv4CqhPK9M8bmohCTiLMi21OwmylQZJrdXg5svJXT1mIMwbjW3q4wg	KOvVsVHok5ufh8FCeLg+pBroikzxF48nps9Dntf/lOFsei5ztrcfTC8YzCdLotdL	EhY+XMxRoWhVzs+9QM62+kYF0gvOXCaCkvehVf55rr0=;
Received: from ESESSMB501.ericsson.se (Unknown_Domain [153.88.183.119])	by sessmg22.ericsson.net (Symantec Mail Security) with SMTP id 8A.E3.31169.E54116B5; Wed,  1 Aug 2018 04:01:02 +0200 (CEST)
Received: from ESESBMB505.ericsson.se (153.88.183.172) by ESESSMB501.ericsson.se (153.88.183.162) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1466.3; Wed, 1 Aug 2018 04:00:10 +0200
Received: from NAM01-SN1-obe.outbound.protection.outlook.com (153.88.183.157) by ESESBMB505.ericsson.se (153.88.183.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1466.3 via Frontend Transport; Wed, 1 Aug 2018 04:00:10 +0200
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ericsson.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yFpq6VTMFtgS+Qn1XfIQYGbtwKSILDL2U2pOjBtEedA=; b=Bn5nkatZleEJ763oUybs9jnAu971iD9BhRQW6W3rT6DCcdTP4BzJE9iJr0qKh6SqB6zRIisoMTfOWZR6mfViImSt7A7W/3V+S3UjfLJFh2NHkqTcsiIC65cEYSizC24klWZi5/Syr46fcQQEsnyLppuytpVTxY+rziARIgptaAs=
Received: from [10.0.0.110] (192.222.164.54) by BN7PR15MB2387.namprd15.prod.outlook.com (2603:10b6:406:8c::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.995.19; Wed, 1 Aug 2018 02:00:07 +0000
Subject: Re: [PATCH 3/8] Add support for non-contiguous blocks to find_pc_partial_function
To: Kevin Buettner <kevinb@redhat.com>, <gdb-patches@sourceware.org>
References: <20180625233239.49dc52ea@pinnacle.lan> <20180625234703.08a2e8ca@pinnacle.lan> <20180719115216.6a08ad9a@pinnacle.lan>
From: Simon Marchi <simon.marchi@ericsson.com>
Message-ID: <b0d4910e-e6a3-a678-46a6-d0accbd5c2a1@ericsson.com>
Date: Wed, 01 Aug 2018 02:01:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <20180719115216.6a08ad9a@pinnacle.lan>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Return-Path: simon.marchi@ericsson.com
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=simon.marchi@ericsson.com;
Received-SPF: None (protection.outlook.com: ericsson.com does not designate permitted sender hosts)
X-IsSubscribed: yes
X-SW-Source: 2018-08/txt/msg00004.txt.bz2

On 2018-07-19 02:52 PM, Kevin Buettner wrote:
> The description and patch below are intended as a replacement for my
> original patch.  It uses the approach outlined by Simon Marchi for
> checking the find_pc_partial_function cache.
> 
> - - - -
> 
> This change adds an optional output parameter BLOCK to
> find_pc_partial_function.  If BLOCK is non-null, then *BLOCK will be
> set to the address of the block corresponding to the function symbol
> if such a symbol was found during lookup.  Otherwise it's set to a
> NULL value.  Callers may wish to use the block information to
> determine whether the block contains any non-contiguous ranges.  The
> caller may also iterate over or examine those ranges.
> 
> When I first started looking at the broken stepping behavior associated
> with functions w/ non-contiguous ranges, I found that I could "fix"
> the problem by disabling the find_pc_partial_function cache.  It would
> sometimes happen that the PC passed in would be between the low and
> high cache values, but would be in some other function that happens to
> be placed in between the ranges for the cached function.  This caused
> incorrect values to be returned.
> 
> So dealing with this cache turns out to be very important for fixing
> this problem.  I explored three different ways of dealing with the cache.
> 
> My first approach was to clear the cache when a block was encountered
> with more than one range.  This would cause the non-cache pathway to
> be executed on the next call to find_pc_partial_function.
> 
> Another approach, which I suspect is slightly faster, checks to see
> whether the PC is within one of the ranges associated with the cached
> block.  If so, then the cached values can be used.  It falls back to
> the original behavior if there is no cached block.
> 
> The current approach, suggested by Simon Marchi, is to restrict the
> low/high pc values recorded for the cache to the beginning and end of
> the range containing the PC value under consideration.  This allows us
> to retain the simple (and fast) test for determining whether the
> memoized (cached) values apply to the PC passed to
> find_pc_partial_function.
> 
> Another choice that had to be made was whether to have ADDRESS
> continue to represent the lowest address associated with the function
> or with the entry pc associated with the function.  For the moment,
> I've decided to keep the current behavior of having it represent the
> lowest address.  In cases where the entry pc is needed, this can be
> found by passing a non-NULL value for BLOCK which will cause the block
> (pointer) associated with the function to be returned.  BLOCK_ENTRY_PC
> can then be used on that block to obtain the entry pc.

This LGTM overall, just a few nits/suggestions/questions.

> diff --git a/gdb/blockframe.c b/gdb/blockframe.c
> index b3c9aa3..a3b2a11 100644
> --- a/gdb/blockframe.c
> +++ b/gdb/blockframe.c
> @@ -159,6 +159,7 @@ static CORE_ADDR cache_pc_function_low = 0;
>  static CORE_ADDR cache_pc_function_high = 0;
>  static const char *cache_pc_function_name = 0;
>  static struct obj_section *cache_pc_function_section = NULL;
> +static const struct block *cache_pc_function_block = nullptr;
>  
>  /* Clear cache, e.g. when symbol table is discarded.  */
>  
> @@ -169,24 +170,14 @@ clear_pc_function_cache (void)
>    cache_pc_function_high = 0;
>    cache_pc_function_name = (char *) 0;
>    cache_pc_function_section = NULL;
> +  cache_pc_function_block = nullptr;

We might want to rename cache_pc_function_low and cache_pc_function_high to
reflect their new usage (since they don't represent the low/high bounds of
the function anymore, but the range).

Alternatively, it might make things simpler to keep cache_pc_function_low
and cache_pc_function_high as they are right now, and introduce two
new variables for the bounds of the matched range.

>  }
>  
> -/* Finds the "function" (text symbol) that is smaller than PC but
> -   greatest of all of the potential text symbols in SECTION.  Sets
> -   *NAME and/or *ADDRESS conditionally if that pointer is non-null.
> -   If ENDADDR is non-null, then set *ENDADDR to be the end of the
> -   function (exclusive), but passing ENDADDR as non-null means that
> -   the function might cause symbols to be read.  This function either
> -   succeeds or fails (not halfway succeeds).  If it succeeds, it sets
> -   *NAME, *ADDRESS, and *ENDADDR to real information and returns 1.
> -   If it fails, it sets *NAME, *ADDRESS and *ENDADDR to zero and
> -   returns 0.  */
> -
> -/* Backward compatibility, no section argument.  */
> +/* See symtab.h.  */
>  
>  int
>  find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
> -			  CORE_ADDR *endaddr)
> +			  CORE_ADDR *endaddr, const struct block **block)
>  {
>    struct obj_section *section;
>    struct symbol *f;
> @@ -232,13 +223,37 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
>        f = find_pc_sect_function (mapped_pc, section);
>        if (f != NULL
>  	  && (msymbol.minsym == NULL
> -	      || (BLOCK_START (SYMBOL_BLOCK_VALUE (f))
> +	      || (BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (f))

I don't understand this change, can you explain it briefly?

>  		  >= BMSYMBOL_VALUE_ADDRESS (msymbol))))
>  	{
> -	  cache_pc_function_low = BLOCK_START (SYMBOL_BLOCK_VALUE (f));
> -	  cache_pc_function_high = BLOCK_END (SYMBOL_BLOCK_VALUE (f));
> +	  const struct block *b = SYMBOL_BLOCK_VALUE (f);
> +
> +	  if (BLOCK_CONTIGUOUS_P (b))
> +	    {
> +	      cache_pc_function_low = BLOCK_START (b);
> +	      cache_pc_function_high = BLOCK_END (b);
> +	    }
> +	  else
> +	    {
> +	      int i;
> +	      for (i = 0; i < BLOCK_NRANGES (b); i++)
> +	        {
> +		  if (BLOCK_RANGE_START (b, i) <= mapped_pc
> +		      && mapped_pc < BLOCK_RANGE_END (b, i))
> +		    {
> +		      cache_pc_function_low = BLOCK_RANGE_START (b, i);
> +		      cache_pc_function_high = BLOCK_RANGE_END (b, i);
> +		      break;
> +		    }
> +		}
> +	      /* Above loop should exit via the break.  */
> +	      gdb_assert (i < BLOCK_NRANGES (b));
> +	    }
> +
>  	  cache_pc_function_name = SYMBOL_LINKAGE_NAME (f);
>  	  cache_pc_function_section = section;
> +	  cache_pc_function_block = b;
> +
>  	  goto return_cached_value;
>  	}
>      }
> @@ -268,15 +283,33 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
>    cache_pc_function_name = MSYMBOL_LINKAGE_NAME (msymbol.minsym);
>    cache_pc_function_section = section;
>    cache_pc_function_high = minimal_symbol_upper_bound (msymbol);
> +  cache_pc_function_block = nullptr;
>  
>   return_cached_value:
>  
> +  CORE_ADDR f_low, f_high;
> +
> +  /* The low and high addresses for the cache do not necessarily
> +     correspond to the low and high addresses for the function.
> +     Extract the function low/high addresses from the cached block
> +     if there is one; otherwise use the cached low & high values.  */
> +  if (cache_pc_function_block)

if (cache_pc_function_block != nullptr)

> +    {
> +      f_low = BLOCK_START (cache_pc_function_block);
> +      f_high = BLOCK_END (cache_pc_function_block);
> +    }
> +  else
> +    {
> +      f_low = cache_pc_function_low;
> +      f_high = cache_pc_function_high;
> +    }

This, for example, could probably be kept simple if new variables were
introduced for the matched block range, and cache_pc_function_{low,high}
stayed as-is.  I haven't actually tried, so it may not be a good idea at
all.

> +
>    if (address)
>      {
>        if (pc_in_unmapped_range (pc, section))
> -	*address = overlay_unmapped_address (cache_pc_function_low, section);
> +	*address = overlay_unmapped_address (f_low, section);
>        else
> -	*address = cache_pc_function_low;
> +	*address = f_low;
>      }
>  
>    if (name)
> @@ -291,13 +324,15 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address,
>  	     the overlay), we must actually convert (high - 1) and
>  	     then add one to that.  */
>  
> -	  *endaddr = 1 + overlay_unmapped_address (cache_pc_function_high - 1,
> -						   section);
> +	  *endaddr = 1 + overlay_unmapped_address (f_high - 1, section);
>  	}
>        else
> -	*endaddr = cache_pc_function_high;
> +	*endaddr = f_high;
>      }
>  
> +  if (block)

if (block != nullptr)

Thanks,

Simon