From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-54151-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 26394 invoked by alias); 18 Feb 2008 16:28:39 -0000
Received: (qmail 26349 invoked by uid 22791); 18 Feb 2008 16:28:38 -0000
X-Spam-Check-By: sourceware.org
Received: from dmz.mips-uk.com (HELO dmz.mips-uk.com) (194.74.144.194)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 18 Feb 2008 16:28:21 +0000
Received: from internal-mx1 ([192.168.192.240] helo=ukservices1.mips.com) 	by dmz.mips-uk.com with esmtp (Exim 3.35 #1 (Debian)) 	id 1JR8qc-0002wN-00; Mon, 18 Feb 2008 16:28:06 +0000
Received: from perivale.mips.com ([192.168.192.200]) 	by ukservices1.mips.com with esmtp (Exim 3.36 #1 (Debian)) 	id 1JR8qT-0001oi-00; Mon, 18 Feb 2008 16:27:57 +0000
Received: from macro (helo=localhost) 	by perivale.mips.com with local-esmtp (Exim 4.63) 	(envelope-from <macro@mips.com>) 	id 1JR8qT-0001NJ-7z; Mon, 18 Feb 2008 16:27:57 +0000
Date: Mon, 18 Feb 2008 16:28:00 -0000
From: "Maciej W. Rozycki" <macro@mips.com>
To: Nigel Stephens <nigel@mips.com>
cc: Jim Blandy <jimb@red-bean.com>, Daniel Jacobowitz <drow@false.org>,      gdb-patches@sourceware.org, "Maciej W. Rozycki" <macro@linux-mips.org>
Subject: Re: MIPS: Handle manual calls of MIPS16 functions with a call stub
In-Reply-To: <47B988D5.3060605@mips.com>
Message-ID: <Pine.LNX.4.61.0802181347120.13151@perivale.mips.com>
References: <Pine.LNX.4.61.0801311703570.22816@perivale.mips.com>   <20080131220315.GC5085@caradoc.them.org>  <Pine.LNX.4.61.0802011009460.14889@perivale.mips.com>   <Pine.LNX.4.61.0802041149560.28589@perivale.mips.com>   <20080204163929.GA6642@caradoc.them.org>  <Pine.LNX.4.61.0802041642480.28589@perivale.mips.com>   <8f2776cb0802081007v3978bbfbs4c91835ddd8c3885@mail.gmail.com>   <Pine.LNX.4.61.0802131810250.19972@perivale.mips.com>  <8f2776cb0802131254p7fd04ba5lf908414940b7f622@mail.gmail.com>  <Pine.LNX.4.61.0802151113430.26997@perivale.mips.com> <47B988D5.3060605@mips.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-MIPS-Technologies-UK-MailScanner: Found to be clean
X-MIPS-Technologies-UK-MailScanner-From: macro@mips.com
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2008-02/txt/msg00297.txt.bz2

On Mon, 18 Feb 2008, Nigel Stephens wrote:

> >> - It's contrary to the DWARF spec for producers to put arch-specific
> >>   information in the low bits of the addresses in the line number
> >>   table, function and block address ranges, and so on.  Existing
> >>   toolchains that do this are buggy, but that's life.
> >>     
> 
> Hmm. The ELF symbol table has an "out of band" mechanism to distinguish
> symbols which refer to different architectural encodings, using the
> architecture-specific bits in the st_info field. But how would you
> represent that in  DWARF's object file encoding, noting that the
> "MIPS16-ness" of an undefined symbol is not known at compile time, only
> at final link time.

 As I see it there is really no need to notice the difference of the 
MIPS16 mode text references in DWARF records.  The architecture already 
observes the bit #0 in text references is not a part of the address and is 
merely a mode bit.  We have, in particular, joint text and data address 
space and where a MIPS16 instruction is treated as data, e.g. when 
inserting a breakpoint or with self-modifying code the corresponding data 
address has its bit #0 clear.

 Therefore it is safe and intuitive to actually mask it out everywhere but 
where the "pc" is to be addressed.  We already have the infrastructure in 
place: gdbarch_read_pc(), gdbarch_write_pc() and gdbarch_unwind_pc() are 
enough to get things in order as my proof-of-concept patch has shown.

> And this is not just a way of sneaking architecture-specific
> "information" through the object file: it is a fundamental aspect of the
> MIPS architecture. From the running software's point of view bit 0 of
> the PC must be set to execute MIPS16 instructions (e.g. when performing
> an indirect jump or function call) and it will always be viewed as set
> when made visible to software (e.g. in the return address register after
> a function call, or saved on the stack, or in the exception program
> counter). Bit 0 must therefore be set in any data which points to a
> MIPS16 instruction. Because of the way that DWARF information is
> generated using assembler directives, then by the time a program is
> fully linked bit 0 of a to a MIPS16 address within a DWARF section will
> "by definition" be set: it would otherwise not be a valid pointer to a
> MIPS16 instruction. I suppose that we could invent new relocation types
> for DWARF data, or get the linker to relocate data within .dwarf*
> sections differently, but that would be an intrusive change.

 Now that is where the problem starts -- we do not really have specific 
text-reference relocations for MIPS16 pointers and it makes things tricky.  
In principle any reference to a MIPS16 instruction which is not made for 
the purpose of making a function call should have its bit 0 set.  Imagine 
this piece of code:

#include <stdio.h>
#include <stdint.h>

void foo(void)
{
	return;
}

int main(void)
{
	printf("%04x\n", *(uint16_t *)foo);
	return 0;
}

One could reasonably expect (though probably not specified by the C 
standard) that code above would print the first instruction of foo() as a 
hexadecimal value.  As it is it does not.

> >> - The addresses in GDB's line tables and block ranges should not have
> >>   such extra bits set in them.
> >>     
> 
> Well, I suppose that you could clear bit 0, after stashing it in some
> other field of the line table or block range. But you'd then have to
> make sure that it continued to be passed around, along with the address
> -- or else at some point you'd have to fold it back into bit0 of the
> address anyway.

 It is carried around in the minimal symbol table (which is built from the 
ELF symbol table) and can be queried with pc_is_mips16().

> >> - The proper place to store arch-specific data on minimal symbols is
> >>   in the 'info' field of 'struct minimal_symbol'.  In cases like
> >>   MIPS16, the gdbarch_push_dummy_call method can consult that
> >>   information at call time to see which calling convention is
> >>   appropriate.
> >>     
> 
> 
> >
> >  I certainly agree here.
> >   
> 
> What about when there isn't an ELF symbol associated with an address?

 Then we have a problem anyway.  Functions marked as MIPS16 in the ELF 
symbol table define their respective ranges of addresses to be treated as 
MIPS16 code and everything else is standard MIPS -- this is no different 
now.  And with the ELF symbol table missing we cannot notice MIPS16 code 
now either except by setting the bit #0 where appropriate (e.g. "break 
*<address>") explicitly.

> >  OK, but the original patch does not clear the bit, but actually it sets 
> > it in the single case discovered so far where GAS gets it wrong (and 
> > linker does not fix up in any way). 
> 
> Maybe fix GAS then? Maciej, can you provide example assembler code that
> GAS handles incorrectly?

 I'll skip full code sent here previously and leave what's really 
relevant:

	.globl	print_ten_doubles
.LFB20:
	.loc 1 664 0
	# Stub function for print_ten_doubles (double, double)
	.set	nomips16
	.section	.mips16.fn.print_ten_doubles,"ax",@progbits
	.align	2
	.ent	__fn_stub_print_ten_doubles
__fn_stub_print_ten_doubles:

	# [...]

	.end	__fn_stub_print_ten_doubles
	.text
	.set	mips16
	.ent	print_ten_doubles
print_ten_doubles:

The presence of a section flip above makes relocations against ".LFB20" 
(which is really an alias to "print_ten_doubles") be converted to ones 
against .text which defeats the check to detect references to MIPS16 code 
in the linker.

 I think the bug in GAS here is inconsistency.  How the relocation is 
calculated should not depend on what syntactically lies between the 
instance of a label and the location it corresponds to.  This is 
independent from our abuse of DWARF records though.

  Maciej