From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-41589-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 26033 invoked by alias); 29 Nov 2012 14:49:11 -0000
Received: (qmail 26006 invoked by uid 22791); 29 Nov 2012 14:49:08 -0000
X-SWARE-Spam-Status: No, hits=-6.5 required=5.0	tests=AWL,BAYES_00,KHOP_RCVD_UNTRUST,RCVD_IN_DNSWL_HI,RCVD_IN_HOSTKARMA_W,RP_MATCHES_RCVD,SPF_HELO_PASS,TW_BJ
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 29 Nov 2012 14:49:01 +0000
Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id qATEn0j5017488	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)	for <gdb@sourceware.org>; Thu, 29 Nov 2012 09:49:00 -0500
Received: from host2.jankratochvil.net (ovpn-116-104.ams2.redhat.com [10.36.116.104])	by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id qATEmtZ2001924	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO)	for <gdb@sourceware.org>; Thu, 29 Nov 2012 09:48:58 -0500
Date: Thu, 29 Nov 2012 14:49:00 -0000
From: Jan Kratochvil <jan.kratochvil@redhat.com>
To: gdb@sourceware.org
Subject: plan: VLA (Variable Length Arrays) and Fortran dynamic types
Message-ID: <20121129144855.GA16288@host2.jankratochvil.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
X-SW-Source: 2012-11/txt/msg00094.txt.bz2

Hi all,

the work on implementing the plan below has not started yet.  It describes how
to correctly upstream the existing branch archer-jankratochvil-vla from
	http://sourceware.org/gdb/wiki/ArcherBranchManagement

which primarily enables GDB to debug non-trivial Fortran programs.

Originally shortly described at:
	Re: fortran multidimensional arrays and pointers
	http://sourceware.org/ml/gdb/2011-03/msg00021.html
	http://sourceware.org/ml/gdb/2011-03/msg00037.html

Regards,
Jan

------------------------------------------------------------------------------
(1)
Resolve the check_typedef problem - the primary problem of this patchset:

Currently, for example, when one forgets to call check_typedef the code
sometimes works (but sometimes it does not).

Rename "struct type" to "struct abstract_type".  Make all the macros that
access concrete sizes (TYPE_LENGTH, TYPE_ARRAY_UPPER_BOUND_VALUE etc.) require
a "struct value *" argument as well.  Remove check_typedef, that is, hide it
under the TYPE_LENGTH etc. macros.

This would also include work to pass "struct value *" in most places where
"struct type *" is passed today, as one needs the inferior memory to find out
the concrete dimensions of an inferior type.
This then makes the current *-vla implementation of DW_AT_object_pointer easy.
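
A minimal sketch of what such a value-taking accessor could look like.  All
names here (value_type_length, the reduced struct layouts, the type codes) are
hypothetical stand-ins and not existing GDB definitions; the sketch only
handles one-dimensional arrays of scalars:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real GDB definitions.  */
enum type_code { CODE_INT, CODE_TYPEDEF, CODE_ARRAY };

struct abstract_type
{
  enum type_code code;
  size_t length;                        /* Valid for CODE_INT only.  */
  const struct abstract_type *target;   /* Typedef target or array element.  */
};

/* A value carries the bounds that were read from inferior memory.  */
struct value
{
  const struct abstract_type *type;
  long low_bound, high_bound;   /* Used when the stripped type is an array.  */
};

/* Strip typedefs.  Callers can no longer forget this step because it is
   hidden inside the accessor below.  */
static const struct abstract_type *
strip_typedefs (const struct abstract_type *type)
{
  while (type->code == CODE_TYPEDEF)
    type = type->target;
  return type;
}

/* Sketch of a TYPE_LENGTH replacement that requires the value.  */
size_t
value_type_length (const struct value *val)
{
  const struct abstract_type *type = strip_typedefs (val->type);

  if (type->code == CODE_ARRAY)
    {
      const struct abstract_type *elt = strip_typedefs (type->target);

      assert (elt->code == CODE_INT);   /* 1-D arrays of scalars only.  */
      assert (val->high_bound >= val->low_bound - 1);
      return (size_t) (val->high_bound - val->low_bound + 1) * elt->length;
    }
  return type->length;
}
```

With a typedef to an array of 4-byte elements and bounds 1..10 taken from the
value, value_type_length would give 40 without the caller ever touching
check_typedef.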

This could be done incrementally but it gets useful only after it gets
completed.

It is questionable whether this would cause a measurable performance
regression.  If so, some caching could be added (cleared on each return to the
mainloop).
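
One possible shape for such a cache is a generation counter that is bumped on
each return to the mainloop, so that stale entries get recomputed lazily.  This
is only a sketch; invalidate_type_caches, cached_length and the probe callback
are hypothetical names, not GDB functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not GDB code.  */
static unsigned long mainloop_generation = 1;

/* Would be called each time control returns to the mainloop.  */
void
invalidate_type_caches (void)
{
  mainloop_generation++;
}

struct length_cache
{
  unsigned long generation;   /* 0 means never filled.  */
  size_t length;
};

/* Return the cached length, calling COMPUTE again only when the cache
   predates the last return to the mainloop.  */
size_t
cached_length (struct length_cache *cache,
               size_t (*compute) (void *), void *arg)
{
  assert (compute != NULL);
  if (cache->generation != mainloop_generation)
    {
      cache->length = compute (arg);
      cache->generation = mainloop_generation;
    }
  return cache->length;
}

/* Example callback: counts how often the expensive path really runs.  */
struct probe
{
  int calls;
  size_t result;
};

size_t
probe_compute (void *arg)
{
  struct probe *p = arg;

  p->calls++;
  return p->result;
}
```

Two lookups in a row hit the expensive path once; after
invalidate_type_caches the next lookup recomputes.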

Alternative approach would be to "concretize" abstract types by check_typedef
(which would be kept there).  In such case there still would be
"struct abstract_type" but there would be also "struct concrete_type" which
would automatically cache all the values for better performance.
Unlike nowadays, check_typedef would still be impossible to forget, due to the
incompatible GDB types "struct abstract_type" vs. "struct concrete_type".
archer-jankratochvil-vla does it this way (but still keeping "struct type"
being both the input and output GDB type of check_typedef).
I do not think this approach is worth the pain.
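
For comparison, the alternative could be sketched as follows.  The struct
layouts, check_typedef_sketch and the idea of reading the element count from
a byte of inferior memory are simplifying assumptions for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced types; not the GDB definitions.  */
struct abstract_type
{
  const struct abstract_type *typedef_target;   /* NULL if not a typedef.  */
  size_t elt_size;
  int bound_offset;   /* Assumed: where the element count lives in memory.  */
};

struct concrete_type
{
  const struct abstract_type *origin;
  size_t length;      /* Cached at concretization time.  */
};

/* The only way to obtain a concrete type: strip typedefs and resolve the
   size against (a local copy of) inferior memory.  */
struct concrete_type
check_typedef_sketch (const struct abstract_type *type,
                      const unsigned char *inferior_memory)
{
  struct concrete_type result;

  assert (type != NULL && inferior_memory != NULL);
  while (type->typedef_target != NULL)
    type = type->typedef_target;
  result.origin = type;
  result.length = type->elt_size * inferior_memory[type->bound_offset];
  return result;
}

/* Accepts only the concrete type; passing a "struct abstract_type *"
   here is diagnosed by the compiler as an incompatible pointer type.  */
size_t
concrete_type_length (const struct concrete_type *type)
{
  return type->length;
}
```

The protection comes purely from the two struct types being distinct, so the
size accessor cannot be reached without going through the concretizing call.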

Time estimate: completely unknown, but in the range of one to several months.

------------------------------------------------------------------------------
(2)
Revive the inferior types reference counting vs. garbage collector.
	[patch 0/8] Types Garbage Collector (for VLA+Python)
	http://sourceware.org/ml/gdb-patches/2009-05/msg00543.html

Probably not much of a problem, but TBH I have not found it that important
compared to the other GDB problems.

I am just sorry that this types garbage collector / reference counting is also
needed by PythonGDB; I took over that part myself back then but never checked
it in.  Therefore AFAIK FSF GDB now leaks inferior types created by Python
(just as archer-jankratochvil-vla does, but *-vla is not upstreamed).

Time estimation 2 weeks.  Partially not on the critical path.
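
The reference counting half of this can be illustrated with a small sketch.
It is not the code from the patch series above; counted_type and the helper
functions are hypothetical.  Plain reference counting like this does not
reclaim types that reference each other in a cycle, which is where a garbage
collector pass would come in:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced type node; not a GDB structure.  */
struct counted_type
{
  int refcount;
  struct counted_type *target;   /* Owned reference, may be NULL.  */
};

static int live_types;   /* Only for demonstrating that nothing leaks.  */

/* Create a type holding one reference for the caller and, if TARGET is
   given, one additional reference on TARGET.  */
struct counted_type *
type_new (struct counted_type *target)
{
  struct counted_type *t = malloc (sizeof *t);

  if (t == NULL)
    abort ();
  t->refcount = 1;
  t->target = target;
  if (target != NULL)
    target->refcount++;
  live_types++;
  return t;
}

/* Drop one reference; free the type and release its target chain once
   nothing refers to it any more.  */
void
type_decref (struct counted_type *t)
{
  while (t != NULL && --t->refcount == 0)
    {
      struct counted_type *next = t->target;

      free (t);
      live_types--;
      t = next;
    }
}

int
live_type_count (void)
{
  return live_types;
}
```

A type created on behalf of Python would be released by a type_decref when
the Python object goes away, instead of leaking.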

------------------------------------------------------------------------------
(3) - very optional
Merge "struct type" and "struct main_type" as with types gc/refc it has no
longer any benefits.

This is just a code cleanup, not on the critical path.  Also, by that time one
may rather redo the whole inferior type system as normal C++ inherited types
with accessors.

------------------------------------------------------------------------------
(4) - optional for critical path
value->contents should be discontiguous so that Fortran array splices (just
a row or just a column, that is, displaying only every 10th byte for example)
can be stored into GDB $conveniencevariables.

Also, with the dynamic bounds calculation (with no explicit check_typedef)
proposed in (1), such discontiguous memory would store the few bytes of the
Fortran array descriptor, where the array bounds are stored in inferior memory.
The array descriptor stores lower-bound, upper-bound and stride for each array
dimension (gcc/fortran/trans-array.c).

archer-jankratochvil-vla does not implement this.  Therefore, if I display
a one-column splice of a 10MB array with 10 rows and 1000000 columns, GDB
will allocate 10MB and not just the 10 bytes it needs (assuming 1-byte
elements here).
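
To make the size difference concrete, here is a sketch of gathering only the
selected elements of one dimension.  struct dim_desc merely mirrors the
lower-bound / upper-bound / stride triple mentioned above; it is a
hypothetical layout and not the actual gfortran descriptor.  BASE stands for
a buffer playing the role of inferior memory:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-dimension descriptor; not the real gfortran layout.  */
struct dim_desc
{
  long lower_bound;
  long upper_bound;
  long stride;   /* Distance between consecutive indices, in elements.  */
};

/* Number of elements selected by DIM, zero for an empty range.  */
size_t
dim_element_count (const struct dim_desc *dim)
{
  if (dim->upper_bound < dim->lower_bound)
    return 0;
  return (size_t) (dim->upper_bound - dim->lower_bound) + 1;
}

/* Copy only the elements selected by DIM from BASE into OUT and return
   the number of bytes stored, instead of copying the whole array.  */
size_t
gather_splice (const unsigned char *base, const struct dim_desc *dim,
               size_t elt_size, unsigned char *out)
{
  size_t count = dim_element_count (dim);
  size_t i;

  assert (dim->stride > 0);   /* Negative strides are left out here.  */
  for (i = 0; i < count; i++)
    memcpy (out + i * elt_size,
            base + i * (size_t) dim->stride * elt_size, elt_size);
  return count * elt_size;
}
```

For bounds 1..10 and 1-byte elements the result occupies 10 bytes regardless
of the stride, which is the allocation the text above asks for.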

Time estimation 3 weeks.

------------------------------------------------------------------------------
(5)
Clean up LA_VAL_PRINT vs. LA_VALUE_PRINT, where too many assumptions are made
about the inferior memory layout of arrays.  Possibly more similar cleanups
are needed around code that is currently hacked around in
archer-jankratochvil-vla.

Some (or all) of the problems were described by Pedro in:
	Re: [RFA] valprint.c / *-valprint.c: Don't lose `embedded_offset'
	http://sourceware.org/ml/gdb-patches/2010-10/msg00127.html

This is not done, only hacked around in an ugly way, in
archer-jankratochvil-vla; it is also a reason why some more complicated
reference/splice/array cases do not work in *-vla.

Time estimation 2-3 weeks.

------------------------------------------------------------------------------
(6)
Cleanup of struct value, fields like OFFSET, ENCLOSING_TYPE, EMBEDDED_OFFSET
and POINTED_TO_OFFSET should no longer be needed.

I find this part very difficult to do; I have even tried once already.

The current struct value tries to make $conveniencevariables standalone enough
so that they work even if the objfile gets deleted.  But in practice they do
not, as the C++ virtual method pointer is no longer accessible.

So TBH I would no longer falsely pretend $conveniencevariables can work for
complicated inferior types which nobody uses anyway and then we can drop those
fields.

Otherwise, if we want to make $conveniencevariables really work, store all the
stuff like virtual method tables in the discontiguous memory implemented by
(4).  Then make the in-GDB copy shared between multiple struct values and get
rid of all those enclosing/embedded offsets.

Time estimation 4 weeks as long as one is able to cope with it.

------------------------------------------------------------------------------
(7) - optional for critical path
Make GDB inferior objects access piecemeal, that is no longer depend on
value->contents containing all the memory at once.

"print array_of_4gb" would then display the first page of elements asking for
--More-- instead of just "locking up" and/or erroring/crashing out of memory.

Somehow related to how value->contents gets represented after (4).
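
A rough sketch of page-wise access, assuming a read callback in place of the
real target memory interface.  lazy_value, read_memory_fn and pattern_read
are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for a target memory read; returns
   the number of bytes stored into BUF.  */
typedef size_t (*read_memory_fn) (void *ctx, size_t offset,
                                  unsigned char *buf, size_t len);

/* A value that knows its size but holds none of its contents.  */
struct lazy_value
{
  size_t total_length;
  read_memory_fn read;
  void *ctx;
};

/* Fetch page PAGE (0-based) of at most PAGE_SIZE bytes into BUF.  Returns
   the number of bytes fetched, or 0 once PAGE is past the end.  */
size_t
lazy_value_fetch_page (const struct lazy_value *val, size_t page,
                       size_t page_size, unsigned char *buf)
{
  size_t offset, len;

  assert (page_size > 0);
  if (page > val->total_length / page_size)
    return 0;
  offset = page * page_size;   /* Cannot overflow after the check above.  */
  if (offset >= val->total_length)
    return 0;
  len = val->total_length - offset;
  if (len > page_size)
    len = page_size;
  return val->read (val->ctx, offset, buf, len);
}

/* Example reader: byte N of the fake inferior object has value N & 0xff.  */
size_t
pattern_read (void *ctx, size_t offset, unsigned char *buf, size_t len)
{
  size_t i;

  (void) ctx;
  for (i = 0; i < len; i++)
    buf[i] = (unsigned char) ((offset + i) & 0xff);
  return len;
}
```

A printer built on this would fetch page 0, print it, and fetch the next page
only when the user continues past --More--, so memory use is bounded by the
page size and not by the object size.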

The time estimate is very uncertain to me, but it could probably be done in an
incremental way; it depends a lot on how (4) gets implemented.

Time estimation 3 weeks.

------------------------------------------------------------------------------
(8)
Port the Fortran / VLA DWARF parsing.

This could be reused from the existing archer-jankratochvil-vla code.

Time estimation 1 week, nothing difficult with all the infrastructure already
in place.

------------------------------------------------------------------------------

Total estimation is 16 weeks + the unknown estimate for (1).
Maybe I overestimated the weeks a bit.