From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-bounces@sourceware.org>
Received: from simark.ca
	by simark.ca with LMTP
	id 1jnkIPNjhF/PaQAAWB0awg
	(envelope-from <gdb-patches-bounces@sourceware.org>)
	for <public-inbox@simark.ca>; Mon, 12 Oct 2020 10:10:59 -0400
Received: by simark.ca (Postfix, from userid 112)
	id 7A65B1EF6F; Mon, 12 Oct 2020 10:10:59 -0400 (EDT)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca
X-Spam-Level: 
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI,
	URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPS id 3015A1E58D
	for <public-inbox@simark.ca>; Mon, 12 Oct 2020 10:10:58 -0400 (EDT)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id AD817386F41C;
	Mon, 12 Oct 2020 14:10:57 +0000 (GMT)
Received: from simark.ca (simark.ca [158.69.221.121])
 by sourceware.org (Postfix) with ESMTPS id 3737B3857C45
 for <gdb-patches@sourceware.org>; Mon, 12 Oct 2020 14:10:55 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3737B3857C45
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark@simark.ca
Received: from [10.0.0.11] (173-246-6-90.qc.cable.ebox.net [173.246.6.90])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by simark.ca (Postfix) with ESMTPSA id A77021E58D;
 Mon, 12 Oct 2020 10:10:54 -0400 (EDT)
Subject: Re: [PATCHv5 4/4] gdb/fortran: Add support for Fortran array slices
 at the GDB prompt
To: Andrew Burgess <andrew.burgess@embecosm.com>, gdb-patches@sourceware.org
References: <cover.1601285823.git.andrew.burgess@embecosm.com>
 <cover.1602439661.git.andrew.burgess@embecosm.com>
 <7832c05de858cfc8bf4b6abba4332523d0547805.1602439661.git.andrew.burgess@embecosm.com>
From: Simon Marchi <simark@simark.ca>
Message-ID: <b231b34d-7d08-48db-67f7-6e740669cce5@simark.ca>
Date: Mon, 12 Oct 2020 10:10:54 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.12.0
MIME-Version: 1.0
In-Reply-To: <7832c05de858cfc8bf4b6abba4332523d0547805.1602439661.git.andrew.burgess@embecosm.com>
Content-Type: text/plain; charset=utf-8
Content-Language: fr
Content-Transfer-Encoding: 7bit
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
Errors-To: gdb-patches-bounces@sourceware.org
Sender: "Gdb-patches" <gdb-patches-bounces@sourceware.org>

On 2020-10-11 2:12 p.m., Andrew Burgess wrote:
> This commit brings array slice support to GDB.
>
> WARNING: This patch contains a rather big hack which is limited to
> Fortran arrays, this can be seen in gdbtypes.c and f-lang.c.  More
> details on this below.
>
> This patch rewrites two areas of GDB's Fortran support, the code to
> extract an array slice, and the code to print an array.
>
> After this commit a user can, from the GDB prompt, ask for a slice of
> a Fortran array and should get the correct result back.  Slices can
> (optionally) have the lower bound, upper bound, and a stride
> specified.  Slices can also have a negative stride.
>
> Fortran has the concept of repacking array slices.  Within a compiled
> Fortran program if a user passes a non-contiguous array slice to a
> function then the compiler may have to repack the slice, this involves
> copying the elements of the slice to a new area of memory before the
> call, and copying the elements back to the original array after the
> call.  Whether repacking occurs will depend on which version of
> Fortran is being used, and what type of function is being called.
>
> This commit adds support for both packed, and unpacked array slicing,
> with the default being unpacked.
>
> With an unpacked array slice, when the user asks for a slice of an
> array GDB creates a new type that accurately describes where the
> elements of the slice can be found within the original array, a
> value of this type is then returned to the user.  The address of an
> element within the slice will be equal to the address of an element
> within the original array.
>
> A user can choose to selected packed array slices instead using:

"to selected"

>
>   (gdb) set fortran repack-array-slices on|off
>   (gdb) show fortran repack-array-slices
>
> With packed array slices GDB creates a new type that reflects how the
> elements of the slice would look if they were laid out in contiguous
> memory, allocates a value of this type, and then fetches the elements
> from the original array and places then into the contents buffer of
> the new value.
>
> One benefit of using packed slices over unapacked slices is the memory

"unapacked"

> diff --git a/gdb/f-array-walker.h b/gdb/f-array-walker.h
> new file mode 100644
> index 00000000000..395c26e5350
> --- /dev/null
> +++ b/gdb/f-array-walker.h
> @@ -0,0 +1,255 @@
> +/* Copyright (C) 2020 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +/* Support classes to wrap up the process of iterating over a
> +   multi-dimensional Fortran array.  */
> +
> +#ifndef F_ARRAY_WALKER_H
> +#define F_ARRAY_WALKER_H
> +
> +#include "defs.h"
> +#include "gdbtypes.h"
> +#include "f-lang.h"
> +
> +/* Class for calculating the byte offset for elements within a single
> +   dimension of a Fortran array.  */
> +class fortran_array_offset_calculator
> +{
> +public:
> +  /* Create a new offset calculator for TYPE, which is either an array or a
> +     string.  */
> +  fortran_array_offset_calculator (struct type *type)
> +  {
> +    /* Validate the type.  */
> +    type = check_typedef (type);
> +    if (type->code () != TYPE_CODE_ARRAY
> +	&& (type->code () != TYPE_CODE_STRING))
> +      error (_("can only compute offsets for arrays and strings"));
> +
> +    /* Get the range, and extract the bounds.  */
> +    struct type *range_type = type->index_type ();
> +    if (get_discrete_bounds (range_type, &m_lowerbound, &m_upperbound) < 0)
> +      error ("unable to read array bounds");
> +
> +    /* Figure out the stride for this array.  */
> +    struct type *elt_type = check_typedef (TYPE_TARGET_TYPE (type));
> +    m_stride = type->index_type ()->bounds ()->bit_stride ();
> +    if (m_stride == 0)
> +      m_stride = type_length_units (elt_type);
> +    else
> +      {
> +	struct gdbarch *arch = get_type_arch (elt_type);
> +	int unit_size = gdbarch_addressable_memory_unit_size (arch);
> +	m_stride /= (unit_size * 8);
> +      }
> +  };
> +
> +  /* Get the byte offset for element INDEX within the type we are working
> +     on.  There is no bounds checking done on INDEX.  If the stride is
> +     negative then we still assume that the base address (for the array
> +     object) points to the element with the lowest memory address, we then
> +     calculate an offset assuming that index 0 will be the element at the
> +     highest address, index 1 the next highest, and so on.  This is not
> +     quite how Fortran works in reality; in reality the base address of
> +     the object would point at the element with the highest address, and
> +     we would index backwards from there in the "normal" way, however,
> +     GDB's current value contents model doesn't support having the base
> +     address be near to the end of the value contents, so we currently
> +     adjust the base address of Fortran arrays with negative strides so
> +     their base address points at the lowest memory address.  This code
> +     here is part of working around this weirdness.  */

I didn't quite catch this, but I think it's just because I'm missing
some knowledge about how Fortran arrays work.  But it looks well
explained enough that if I needed to fix something here, I could
understand what is happening.

> +  LONGEST index_offset (LONGEST index)
> +  {
> +    LONGEST offset;
> +    if (m_stride < 0)
> +      offset = std::abs (m_stride) * (m_upperbound - index);
> +    else
> +      offset = std::abs (m_stride) * (index - m_lowerbound);
> +    return offset;
> +  }
> +
> +private:
> +
> +  /* The stride for the type we are working with.  */
> +  LONGEST m_stride;
> +
> +  /* The upper bound for the type we are working with.  */
> +  LONGEST m_upperbound;
> +
> +  /* The lower bound for the type we are working with.  */
> +  LONGEST m_lowerbound;
> +};
> +
> +/* A base class used by fortran_array_walker.  */
> +class fortran_array_walker_base_impl
> +{
> +public:
> +  /* Constructor.  */
> +  explicit fortran_array_walker_base_impl ()
> +  { /* Nothing.  */ }
> +
> +  /* Called when iterating between the lower and upper bounds of each
> +     dimension of the array.  Return true if GDB should continue iterating,
> +     otherwise, return false.
> +
> +     SHOULD_CONTINUE indicates if GDB is going to stop anyway, and should
> +     be taken into consideration when deciding what to return.  If
> +     SHOULD_CONTINUE is false then this function must also return false,
> +     the function is still called though in case extra work needs to be
> +     done as part of the stopping process.  */
> +  bool continue_walking (bool should_continue)
> +  { return should_continue; }
> +
> +  /* Called when GDB starts iterating over a dimension of the array.  This
> +     function will be called once for each of the bounds in this dimension.
> +     DIM is the current dimension number, NDIM is the total number of
> +     dimensions, and FIRST_P is true for the first bound of this
> +     dimension, and false in all other cases.  */
> +  void start_dimension (int dim, int ndim, bool first_p)
> +  { /* Nothing.  */ }
> +
> +  /* Called when GDB finishes iterating over a dimension of the array.
> +     This function will be called once for each of the bounds in this
> +     dimension.  DIM is the current dimension number, NDIM is the total
> +     number of dimensions, and LAST_P is true for the last bound of this
> +     dimension, and false in all other cases.  */
> +  void finish_dimension (int dim, int ndim, bool last_p)
> +  { /* Nothing.  */ }
> +
> +  /* Called when processing the inner most dimension of the array, for
> +     every element in the array.  PARENT_VALUE is the value from which
> +     elements are being extracted, ELT_TYPE is the type of the element
> +     being extracted, and ELT_OFF is the offset of the element from the
> +     start of PARENT_VALUE.  */
> +  void process_element (struct value *parent_value, struct type *elt_type,
> +			LONGEST elt_off)
> +  { /* Nothing.  */ }
> +};

I don't understand when start_dimension and finish_dimension get called
exactly.  Looking at the implementation, it looks like it will be
something like this:

  start_dimension(1, 3, true)
    start_dimension(2, 3, true)
      start_dimension(3, 3, true)
	process_element(element at (1,1,1))
      finish_dimension(3, 3, false)
      start_dimension(3, 3, false)
        process_element(element (1,1,2))
      finish_dimension(3, 3, true)
    finish_dimension(2, 3, false)
    start_dimension(2, 3, false)
      start_dimension(3, 3, true)
	process_element(element at (1,2,1))
      finish_dimension(3, 3, false)
      start_dimension(3, 3, false)
        process_element(element (1,2,2))
      finish_dimension(3, 3, true)
    finish_dimension(2, 3, false)
  finish_dimension(1, 3, false)
  start_dimension(1, 3, false)
    start_dimension(2, 3, true)
      start_dimension(3, 3, true)
	process_element(element at (2,1,1))
      finish_dimension(3, 3, false)
      start_dimension(3, 3, false)
        process_element(element (2,1,2))
      finish_dimension(3, 3, true)
    finish_dimension(2, 3, false)
    start_dimension(2, 3, false)
      start_dimension(3, 3, true)
	process_element(element at (2,2,1))
      finish_dimension(3, 3, false)
      start_dimension(3, 3, false)
        process_element(element (2,2,2))
      finish_dimension(3, 3, true)
    finish_dimension(2, 3, false)
  finish_dimension(1, 3, false)

Essentially for the inner dimention, walk_1 calls start_dimension and
finish_dimension around each single element.  That's more than I
expected to given my understanding of start_dimension and
finish_dimension.  I expected them to be called just once around once
"scan" of the inner dimension, so every two elements.  For the outer
dimension, I would expect a single call to start and finish, as there is
a single "scan" in this dimension.  I would expect basically this:

  start_dimension(1, 3, ?)
    start_dimension(2, 3, ?)
      start_dimension(3, 3, ?)
	process_element(element at (1,1,1))
        process_element(element (1,1,2))
      finish_dimension(3, 3, ?)
      start_dimension(3, 3, ?)
	process_element(element at (1,2,1))
        process_element(element (1,2,2))
      finish_dimension(3, 3, ?)
    finish_dimension(2, 3, ?)
    start_dimension(2, 3, ?)
      start_dimension(3, 3, ?)
	process_element(element at (2,1,1))
        process_element(element (2,1,2))
      finish_dimension(3, 3, ?)
      start_dimension(3, 3, ?)
	process_element(element at (2,2,1))
        process_element(element (2,2,2))
      finish_dimension(3, 3, ?)
    finish_dimension(2, 3, ?)
  finish_dimension(1, 3, ?)

I omitted the first_p/last_p values, because I am not sure what they
would be.  Did I understand the meaning of
start_dimension/finish_dimension wrong?

I don't have time to look at the rest of the patch now, so that is all
for now :).

Simon