From: Thomas Caputi
Date: Mon, 13 May 2019 20:20:00 -0000
Subject: Linux kernel debugging and other features
To: gdb@sourceware.org

Hello gdb,

My name is Tom Caputi and I am a developer for ZFS on Linux. Recently, I had the opportunity to work with some members of Delphix (another major ZFS on Linux contributor) to build some debugging tools. When we started working on this project we were surprised to see how close gdb was to supporting this kind of kernel debugging natively.

For live systems, we were able to use the kernel's vmlinux file from the dbgsym package (after mucking around for a bit with KASLR offsets) along with /proc/kcore as a core file to inspect just about any non-local variable on the system.

For inspecting post-mortem kdumps we found that Jeff Mahoney had already been working on this (https://github.com/jeffmahoney/crash-python). Kdump files are compressed and use a different on-disk format from regular core files, but he was able to create a new "kdump" target type to support that. His work also included code that allowed us to load the symbols for kernel modules with their correct offsets. Jeff also has a Python script that was able to parse out Linux's list of task_struct structures (which represent all threads on the system) and hand them to gdb. This allowed us to switch threads and view stack traces with function arguments just as we could when using gdb to debug a userspace program.

On top of all of this, members of the Delphix team were able to put together some code that allows custom gdb sub-commands (written in Python) to be piped together, much like commands can be piped together in bash. By doing this we were able to put together a few relatively simple commands to get some really powerful debugging output.

Currently, all of this is still in the proof-of-concept stage, but I think both Datto (my company) and Delphix would like to look to the next steps to get these improvements integrated upstream and stabilized. We think these could be a huge improvement to the current situation of debugging any code in the Linux kernel.
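As an illustration of the KASLR step mentioned above: with KASLR enabled, the running kernel's text is shifted by a constant relative to the addresses recorded in vmlinux, so the shift can be estimated by comparing one symbol in both places. The sketch below is only that, a sketch. It assumes `_text` is a usable anchor symbol, that the link-time addresses come from `nm vmlinux` style output, and that /proc/kallsyms was read with enough privilege to show real addresses. The function names are hypothetical and are not taken from the tools described in this mail.

```python
def parse_symbol_table(text):
    """Parse lines of the form '<hex addr> <type> <name> [module]'.

    Both /proc/kallsyms and `nm vmlinux` use this layout. Lines with
    fewer than three fields (e.g. undefined symbols in nm output)
    are skipped.
    """
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        addr, name = fields[0], fields[2]
        # Keep the first occurrence if a name appears more than once.
        symbols.setdefault(name, int(addr, 16))
    return symbols


def kaslr_offset(kallsyms_text, vmlinux_nm_text, anchor="_text"):
    """Return runtime address minus link-time address of `anchor`."""
    runtime = parse_symbol_table(kallsyms_text)
    linked = parse_symbol_table(vmlinux_nm_text)
    if anchor not in runtime or anchor not in linked:
        raise KeyError(f"anchor symbol {anchor!r} missing from one input")
    if runtime[anchor] == 0:
        # Unprivileged reads of /proc/kallsyms report zeroed addresses.
        raise PermissionError("kallsyms addresses are hidden; read as root")
    return runtime[anchor] - linked[anchor]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_kallsyms = "ffffffff9a000000 T _text\nffffffff9a0001f0 T startup_64\n"
    sample_nm = "ffffffff81000000 T _text\nffffffff810001f0 T startup_64\n"
    print(hex(kaslr_offset(sample_kallsyms, sample_nm)))
```

The resulting offset is what would then have to be applied when loading the vmlinux symbols; how that is done depends on the gdb version in use.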
However, there are some sticky bits that we would like to discuss if the gdb community is interested in these changes:

1) The kdumpfile support currently requires a few custom patches added to gdb that allow a user to create a custom target in Python. The kdumpfile target is then implemented as a Python module that calls out to libkdumpfile (written in C). I'm not sure if this is the desired implementation of this feature. If it is not, could we get some pointers for how we could add this support to gdb?

2) The /proc/kcore file *looks* like a core file, but it is constantly changing underneath us as the live system changes. When debugging code we had issues where values that should be changing were cached and appeared to remain static. We were able to reduce the gdb cache size to 2 bytes (I think) by running 'set stack-cache off; set code-cache off; set dcache size 1; set dcache line-size 2', but this still results in (at least) the last variable you inspected being cached until you look at something else. Is there a way we can completely disable the dcache?

3) We aren't 100% sure where all of the new code belongs. The ZFS-specific debugging commands we can definitely keep in the ZFS repository, but the sub-command piping infrastructure could be useful to anyone using gdb. We're also not really sure where the scripts that parse out kernel structures (for things like threads and per-cpu variables) should end up.

Please let us know if you are interested in any of these changes and let us know what some good next steps would be.

Thanks,
Tom
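To make the sub-command piping idea concrete, here is a minimal standalone sketch of how such a pipeline could be wired up with Python generators. It is not the Delphix code and does not use the gdb Python API: the command names (`tasks`, `filter`, `member`, `count`), the registry, and the sample task data are all hypothetical, chosen only to show stages passing objects to one another the way bash passes text.

```python
COMMANDS = {}


def command(name):
    """Decorator that registers a pipeline stage under `name`."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register


# Made-up stand-in for objects a real stage would read from the target.
FAKE_TASKS = [
    {"pid": 1, "comm": "systemd", "state": "S"},
    {"pid": 214, "comm": "txg_sync", "state": "D"},
    {"pid": 215, "comm": "z_wr_iss", "state": "D"},
]


@command("tasks")
def cmd_tasks(upstream, args):
    # Source stage: ignores its input and produces objects.
    yield from FAKE_TASKS


@command("filter")
def cmd_filter(upstream, args):
    # Keep objects whose field equals the value, e.g. "state=D".
    key, _, wanted = args[0].partition("=")
    for obj in upstream:
        if str(obj[key]) == wanted:
            yield obj


@command("member")
def cmd_member(upstream, args):
    # Project each object down to a single field.
    for obj in upstream:
        yield obj[args[0]]


@command("count")
def cmd_count(upstream, args):
    # Sink-like stage: consume everything and emit one number.
    yield sum(1 for _ in upstream)


def parse_pipeline(line):
    """Split 'a x | b | c' into [('a', ['x']), ('b', []), ('c', [])]."""
    stages = []
    for part in line.split("|"):
        words = part.split()
        if not words:
            raise ValueError("empty pipeline stage")
        stages.append((words[0], words[1:]))
    return stages


def run_pipeline(line):
    """Chain the stages lazily, then collect the final stage's output."""
    stream = iter(())
    for name, args in parse_pipeline(line):
        if name not in COMMANDS:
            raise KeyError(f"unknown command: {name}")
        stream = COMMANDS[name](stream, args)
    return list(stream)


if __name__ == "__main__":
    print(run_pipeline("tasks | filter state=D | member comm"))
    print(run_pipeline("tasks | filter state=D | count"))
```

Because each stage is a generator, objects flow through one at a time and nothing is materialized until the last stage is consumed. In a real integration the entry point would presumably be a single gdb command that receives the whole pipeline string, but that part is left out here.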