From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 2366 invoked by alias); 13 May 2014 10:21:40 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 2352 invoked by uid 89); 13 May 2014 10:21:39 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 13 May 2014 10:21:38 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s4DALa7t031742 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 13 May 2014 06:21:36 -0400 Received: from blade.nx (ovpn-116-68.ams2.redhat.com [10.36.116.68]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s4DALZgB016622; Tue, 13 May 2014 06:21:36 -0400 Received: by blade.nx (Postfix, from userid 1000) id 0235826234A; Tue, 13 May 2014 11:21:34 +0100 (BST) Date: Tue, 13 May 2014 10:21:00 -0000 From: Gary Benson To: Mark Kettenis Cc: gdb-patches@sourceware.org Subject: Re: [PATCH 0/2] Demangler crash handler Message-ID: <20140513102134.GB17805@blade.nx> References: <20140509100656.GA4760@blade.nx> <201405091120.s49BKO1f010622@glazunov.sibelius.xs4all.nl> <20140509153305.GA13345@blade.nx> <201405112023.s4BKNL3v024248@glazunov.sibelius.xs4all.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201405112023.s4BKNL3v024248@glazunov.sibelius.xs4all.nl> X-IsSubscribed: yes X-SW-Source: 2014-05/txt/msg00152.txt.bz2 Mark Kettenis wrote: > There are entire subsystems in GDB that are a breeding ground for > segfaults. Should we therefore wrap evrything? This is a straw man. I'm not proposing we wrap all subsystems. I'm proposing we wrap one specific subsystem that has caused problems in the past and is likely to continue to do so into the forseeable future. I'd argue that the demangler warrants special consideration as in one sense it's not a subsystem of GDB but rather a (somewhat unloved?) side-project of GCC that we borrow. The situation we find ourselves in is this: * GDB is more likely to see demangler crashes than libstdc++ GDB demangles all symbols in every file it loads, always. In libstdc++ the demangler is only called in error situations, and then only to demangle the few symbols that appear in the backtrace. So: we get the bug reports, and have to triage them, even though the code is not "ours". * Bugs have a more serious affect on GDB than on libstdc++ Currently a demangler crash will cause GDB to segfault, usually at startup. A demangler crash in libstdc++ is not such a big deal as the application was likely crashing anyway. So: bugs that are high-priority for us are low-priority for the "main" demangler authors, and we end up having to fix them. * Demangler patches often get waved through with minimal scrutiny The few people who really know the demangler are busy with other things, and the above two points mean demangler issues are low- priority for them. So: bugs are going to keep on coming our way. This situation is why I feel the demangler merits special handling. > It is obvious that the demangler is a breeding ground for > segmentation faults. It uses strcpy, strcat and sprintf. So it's > probably full of buffer overflows. I bet that if those are fixed, > the SIGSEGVs are gone. That's not been the case for the crashes I recently fixed. The demangler parses the mangled symbol into a tree that it then walks to generate the result. All pretty standard so far, but in certain cases sections of the tree can be (re)entered from other, non-local parts of the tree. Both crashes I fixed involved a stack overflow caused by a loop in the tree caused by a reentry. > Note that only some of those buffer overflows will generate a > SIGSEGV. Others will corrupt random memory. And you can't patch > those up with a signal handler. Agreed. I'm not pretending this will solve the underlying issues, and I'm not making an excuse to not fix demangler crashes. I was merely trying to make a bad situation less bad by a) allowing the possibility that the user could continue to use GDB to debug their program, and b) helping the user make a more meaningful bug report. I agree that there might be cases where this patch could turn a demangler failure into some other difficult-to-debug crash that would waste developer time. But developer time is being wasted now: every demangler filed requires the reporter or (more likely) one of us to trail through the core file to identify that the issue is a demangler bug and then pull out the symbol that caused the crash. This patch does that work. In my response to Doug's mail [1] I updated the patch to use internal_warning. The user is interrupted (so they cannot miss the warning) and the questions they are asked make it plain that by continuing over the error they are now living on the edge. Does that minimise the possibility of us wasting time chasing memory corruption errors enough for you? > > It's true that you don't get core dumps with this patch, but what > > you do get in return is a printed warning that includes the symbol > > that caused the crash. That's all you need in most cases. The > > five recent demangler crashes (14963, 16593, 16752, 16817 and > > 16845) all required digging by either the reporter or a GDB > > developer to uncover the failing symbol. Printing the offending > > symbol means this work is already done. > > > > If the lack of core dumps is a showstopper for you then I can > > update the patch to allow disabling the handler with "maint set > > handle-demangler-crashes 0" or some similar thing. > > Not acceptable. Unless you make it the default... Disabled by default wouldn't stop us having to triage the bugs that are filed. Would an enabled-by-default crash catcher using internal_warning as above work for you? Thanks, Gary -- [1] https://sourceware.org/ml/gdb-patches/2014-05/msg00151.html