From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-patches-return-112314-listarch-gdb-patches=sources.redhat.com@sourceware.org>
Received: (qmail 2366 invoked by alias); 13 May 2014 10:21:40 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Received: (qmail 2352 invoked by uid 89); 13 May 2014 10:21:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 13 May 2014 10:21:38 +0000
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s4DALa7t031742	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);	Tue, 13 May 2014 06:21:36 -0400
Received: from blade.nx (ovpn-116-68.ams2.redhat.com [10.36.116.68])	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s4DALZgB016622;	Tue, 13 May 2014 06:21:36 -0400
Received: by blade.nx (Postfix, from userid 1000)	id 0235826234A; Tue, 13 May 2014 11:21:34 +0100 (BST)
Date: Tue, 13 May 2014 10:21:00 -0000
From: Gary Benson <gbenson@redhat.com>
To: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 0/2] Demangler crash handler
Message-ID: <20140513102134.GB17805@blade.nx>
References: <20140509100656.GA4760@blade.nx> <201405091120.s49BKO1f010622@glazunov.sibelius.xs4all.nl> <20140509153305.GA13345@blade.nx> <201405112023.s4BKNL3v024248@glazunov.sibelius.xs4all.nl>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201405112023.s4BKNL3v024248@glazunov.sibelius.xs4all.nl>
X-IsSubscribed: yes
X-SW-Source: 2014-05/txt/msg00152.txt.bz2

Mark Kettenis wrote:
> There are entire subsystems in GDB that are a breeding ground for
> segfaults.  Should we therefore wrap evrything?

This is a straw man.  I'm not proposing we wrap all subsystems.
I'm proposing we wrap one specific subsystem that has caused
problems in the past and is likely to continue to do so into
the forseeable future.

I'd argue that the demangler warrants special consideration as
in one sense it's not a subsystem of GDB but rather a (somewhat
unloved?) side-project of GCC that we borrow.  The situation
we find ourselves in is this:

 * GDB is more likely to see demangler crashes than libstdc++

   GDB demangles all symbols in every file it loads, always.  In
   libstdc++ the demangler is only called in error situations,
   and then only to demangle the few symbols that appear in the
   backtrace.

   So: we get the bug reports, and have to triage them, even
   though the code is not "ours".   

 * Bugs have a more serious affect on GDB than on libstdc++

   Currently a demangler crash will cause GDB to segfault, usually at
   startup.  A demangler crash in libstdc++ is not such a big deal as
   the application was likely crashing anyway.

   So: bugs that are high-priority for us are low-priority for the
   "main" demangler authors, and we end up having to fix them.

 * Demangler patches often get waved through with minimal scrutiny

   The few people who really know the demangler are busy with other
   things, and the above two points mean demangler issues are low-
   priority for them.

   So: bugs are going to keep on coming our way.
   
This situation is why I feel the demangler merits special handling.

> It is obvious that the demangler is a breeding ground for
> segmentation faults.  It uses strcpy, strcat and sprintf.  So it's
> probably full of buffer overflows.  I bet that if those are fixed,
> the SIGSEGVs are gone.

That's not been the case for the crashes I recently fixed.  The
demangler parses the mangled symbol into a tree that it then walks to
generate the result.  All pretty standard so far, but in certain cases
sections of the tree can be (re)entered from other, non-local parts of
the tree.  Both crashes I fixed involved a stack overflow caused by a
loop in the tree caused by a reentry.

> Note that only some of those buffer overflows will generate a
> SIGSEGV.  Others will corrupt random memory.  And you can't patch
> those up with a signal handler.

Agreed.  I'm not pretending this will solve the underlying issues,
and I'm not making an excuse to not fix demangler crashes.  I was
merely trying to make a bad situation less bad by a) allowing the
possibility that the user could continue to use GDB to debug their
program, and b) helping the user make a more meaningful bug report.

I agree that there might be cases where this patch could turn a
demangler failure into some other difficult-to-debug crash that would
waste developer time.  But developer time is being wasted now: every
demangler filed requires the reporter or (more likely) one of us to
trail through the core file to identify that the issue is a demangler
bug and then pull out the symbol that caused the crash.  This patch
does that work.

In my response to Doug's mail [1] I updated the patch to use
internal_warning.  The user is interrupted (so they cannot miss the
warning) and the questions they are asked make it plain that by
continuing over the error they are now living on the edge.  Does
that minimise the possibility of us wasting time chasing memory
corruption errors enough for you?

> > It's true that you don't get core dumps with this patch, but what
> > you do get in return is a printed warning that includes the symbol
> > that caused the crash.  That's all you need in most cases.  The
> > five recent demangler crashes (14963, 16593, 16752, 16817 and
> > 16845) all required digging by either the reporter or a GDB
> > developer to uncover the failing symbol.  Printing the offending
> > symbol means this work is already done.
> > 
> > If the lack of core dumps is a showstopper for you then I can
> > update the patch to allow disabling the handler with "maint set
> > handle-demangler-crashes 0" or some similar thing.
> 
> Not acceptable.  Unless you make it the default...

Disabled by default wouldn't stop us having to triage the bugs that
are filed.

Would an enabled-by-default crash catcher using internal_warning as
above work for you?

Thanks,
Gary

-- 
[1] https://sourceware.org/ml/gdb-patches/2014-05/msg00151.html