From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 09 Nov 2010 19:35:00 -0000
From: Joel Brobecker
To: gdb-patches@sourceware.org
Subject: [DWARF/RFC] end-address overflow in location description entry
Message-ID: <20101109193527.GI2811@adacore.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="d6Gm4EdcadzBjdND"
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-patches-owner@sourceware.org
X-SW-Source: 2010-11/txt/msg00140.txt.bz2

--d6Gm4EdcadzBjdND
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-length: 2794

I was wondering if I could have your opinion on the following situation.

We have a global, non-static variable defined inside a C file. This C file was compiled with a non-GCC compiler. The symbol table places that variable at an address.
The debugging info contains a DW_TAG_variable DIE whose location attribute points to a location list. Why the compiler elected to use a location list rather than provide a plain address is unknown since, as far as we can tell, there is no reason to believe that the variable would have a non-global lifetime.

The fun starts when decoding the location list: it has one single entry, where the address is correct (identical to the one in the symbol table), but where the begin/end offsets are 0x0 and 0xffffffff (the address size is 4, and addresses are unsigned). The offsets are then added to the base address, which in this case is the CU low address:

    static const gdb_byte *
    find_location_expression (struct dwarf2_loclist_baton *baton,
                              size_t *locexpr_length, CORE_ADDR pc)
    [...]
          /* Otherwise, a location expression entry.  */
          low += base_address;
          high += base_address;

On 32-bit hosts, high overflows, and we end up doing the equivalent of "high = base_address - 1", resulting in a null range for that entry. In the end, what the user sees is:

    (gdb) print my_variable
    $1 =

The reason I'm mentioning this is that we do have an overflow, and I'm wondering whether we'd want to do something about it. For instance, we could try adding the following code, which sets high to the highest address for the address size if we overflow. This would make things a little better for the user, I think.

          if (high < base_address)
            high = base_mask;

However, we'd still have issues accessing the variable from units whose code is located before the unit where the variable is defined. So it's still not great. We could special-case the 0/-1 case even further by making it a universal range. Perhaps something like this:

          if (low == 0 && high == base_mask)
            {
              *locexpr_length = length;
              return loc_ptr;
            }

Would we be interested in a change of this nature? For now, I'm ruling this debugging information as incorrect.
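
To make the arithmetic above concrete, here is a minimal standalone sketch of the wraparound and of the two possible adjustments. It is not GDB code: the addr32 type, ADDR32_MAX (playing the role of base_mask for a 4-byte address size) and all four function names are hypothetical stand-ins, and it assumes that the 0/-1 check is made on the raw offsets, before the base address is added.

```c
#include <stdint.h>

/* Hypothetical 32-bit address type, standing in for CORE_ADDR on a
   32-bit host.  This is an assumption for illustration only.  */
typedef uint32_t addr32;

/* Highest address for a 4-byte address size; plays the role of
   base_mask in the snippets above (assumed name).  */
#define ADDR32_MAX UINT32_C (0xffffffff)

struct addr_range
{
  addr32 low;
  addr32 high;
};

/* Current behaviour: add the base to both offsets.  Unsigned 32-bit
   arithmetic wraps, so 0xffffffff + base yields base - 1.  */
static struct addr_range
resolve_naive (addr32 base, addr32 low_off, addr32 high_off)
{
  struct addr_range r;

  r.low = (addr32) (low_off + base);
  r.high = (addr32) (high_off + base);
  return r;
}

/* First idea: if the end address wrapped past the base, clamp it to
   the highest representable address.  */
static struct addr_range
resolve_clamped (addr32 base, addr32 low_off, addr32 high_off)
{
  struct addr_range r = resolve_naive (base, low_off, high_off);

  if (r.high < base)
    r.high = ADDR32_MAX;
  return r;
}

/* Half-open range test: LOW is included, HIGH is excluded.  */
static int
range_covers (struct addr_range r, addr32 pc)
{
  return pc >= r.low && pc < r.high;
}

/* Second idea: treat a raw 0/-1 entry as a universal range that
   matches every PC; otherwise fall back to the clamped range.  */
static int
entry_matches (addr32 base, addr32 low_off, addr32 high_off, addr32 pc)
{
  if (low_off == 0 && high_off == ADDR32_MAX)
    return 1;
  return range_covers (resolve_clamped (base, low_off, high_off), pc);
}
```

With a base of 0x1000 and offsets 0x0/0xffffffff, resolve_naive gives the range 0x1000..0xfff, which covers no PC at all. resolve_clamped gives 0x1000..0xffffffff, which covers a PC of 0x1000 but still not a PC of 0x800 in a unit located before this one. Only entry_matches accepts both.
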
But on the other hand, I cannot find an easy way for the user to work around the issue and get the address in an automated way. So perhaps, in the spirit of being lenient in what we accept, if a 0/-1 range is obviously wrong (is it?), then we can add that special case in GDB. I think it would be easier to discard this special case if there were a way to tell GDB to only consider minimal symbols in the expression...

Thoughts?

Thanks,
-- 
Joel

--d6Gm4EdcadzBjdND
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="dwarf2loc-overflow.diff"
Content-length: 711

diff --git a/gdb/dwarf2loc.c b/gdb/dwarf2loc.c
index b2aecf2..3d5606d 100644
--- a/gdb/dwarf2loc.c
+++ b/gdb/dwarf2loc.c
@@ -108,6 +108,9 @@ find_location_expression (struct dwarf2_loclist_baton *baton,
       low += base_address;
       high += base_address;
 
+      if (high < base_address)
+        high = base_mask;
+
       length = extract_unsigned_integer (loc_ptr, 2, byte_order);
       loc_ptr += 2;
 
@@ -2550,6 +2553,9 @@ loclist_describe_location (struct symbol *symbol, CORE_ADDR addr,
       low += base_address;
       high += base_address;
 
+      if (high < base_address)
+        high = base_mask;
+
       length = extract_unsigned_integer (loc_ptr, 2, byte_order);
       loc_ptr += 2;
 

--d6Gm4EdcadzBjdND--