From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-12859-listarch-gdb=sources.redhat.com@sources.redhat.com>
Received: (qmail 12192 invoked by alias); 2 Mar 2003 03:39:06 -0000
Mailing-List: contact gdb-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:gdb-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/gdb/>
List-Post: <mailto:gdb@sources.redhat.com>
List-Help: <mailto:gdb-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: gdb-owner@sources.redhat.com
Received: (qmail 12185 invoked from network); 2 Mar 2003 03:39:06 -0000
Received: from unknown (HELO crack.them.org) (65.125.64.184)
  by 172.16.49.205 with SMTP; 2 Mar 2003 03:39:06 -0000
Received: from nevyn.them.org ([66.93.61.169] ident=mail)
	by crack.them.org with asmtp (Exim 3.12 #1 (Debian))
	id 18pMCr-0000Zz-00; Sat, 01 Mar 2003 23:40:13 -0600
Received: from drow by nevyn.them.org with local (Exim 3.36 #1 (Debian))
	id 18pKJY-0003kW-00; Sat, 01 Mar 2003 22:39:00 -0500
Date: Sun, 02 Mar 2003 03:39:00 -0000
From: Daniel Jacobowitz <drow@mvista.com>
To: Andrew Cagney <ac131313@redhat.com>
Cc: gdb@sources.redhat.com
Subject: Re: expected behavior of GNU/Linux gcore and corefiles
Message-ID: <20030302033900.GA14335@nevyn.them.org>
Mail-Followup-To: Andrew Cagney <ac131313@redhat.com>,
	gdb@sources.redhat.com
References: <3E617797.5000704@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3E617797.5000704@redhat.com>
User-Agent: Mutt/1.5.1i
X-SW-Source: 2003-03/txt/msg00019.txt.bz2

On Sat, Mar 01, 2003 at 10:16:39PM -0500, Andrew Cagney wrote:
> (notes from a hallway conversation, I had a recollection that, at some 
> stage, the attempt to load libthread db over a core file on GNU/Linux 
> was disabled.)
> 
> When using GDB on a live threaded program that puts all threads into 
> tight infinite loops (while (1);), I'll do something like:
> 
>     $ ./a.out &
>     Pid 1234
>     $ gdb ./a.out 1234
>     (gdb) info threads
>     ....
>     (gdb) quit
> 
> As a user I'd also expect sequences such as:
> 
>     $ kill -QUIT 1234
>     $ gdb ./a.out core
>     (gdb) info threads
>     ....
>     (gdb) quit
> 
> and:
> 
>     $ gcore 1234
>     $ gdb ./a.out core
>     (gdb) info threads
>     ....
>     (gdb) quit

Hmm, what platform is that?  I've never seen a GNU/Linux
implementation, but I'm a youngun.

> 
> and:
> 
>     $ gdb ./a.out 1234
>     (gdb) gcore
>     (gdb) quit
>     $ gdb ./a.out core
>     (gdb) info threads
> 
> to all come back with effectively the same output.  Further, on both 
> live and corefile targets, I'd expect to be able to select/examine each 
> thread vis:
> 
>     (gdb) thread 5
>     11    i = i + 1;
>     (gdb) list
>     10    __thread__ i = 1;
>     11    i = i + 1;
>     (gdb) print i
>     $1 = 1
> 
> (which would involve thread local storage).
> 
> Two problems come to mind:
> 
> - Is the kernel including all the raw information needed to do this in 
> the core file?

Some platforms yes, some platforms no.  I've had Linux patches to do
this for some time.  I believe kernels as of 2.5.50 or so do it also.
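
As a rough way to check whether a given core file carries per-thread
state, one can count its NT_PRSTATUS notes. The sketch below is only an
illustration: it assumes the PT_NOTE segment has already been extracted
as raw bytes, that the note header words are little-endian, and that
names and descriptors are padded to 4 bytes. The helper names
(iter_notes, count_thread_notes, make_note) are made up for this
example and are not GDB or BFD functions.

```python
import struct

NT_PRSTATUS = 1  # note type holding one thread's status and general registers


def iter_notes(data):
    """Yield (name, type, desc) for each note in raw PT_NOTE bytes.

    Assumes little-endian 32-bit header words and 4-byte padding.
    """
    header = struct.Struct("<III")  # namesz, descsz, type
    offset = 0
    while offset + header.size <= len(data):
        namesz, descsz, ntype = header.unpack_from(data, offset)
        offset += header.size
        name = data[offset:offset + namesz].rstrip(b"\0")
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = data[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        yield name, ntype, desc


def count_thread_notes(data):
    """Count the per-thread NT_PRSTATUS notes written under the name CORE."""
    return sum(1 for name, ntype, _ in iter_notes(data)
               if name == b"CORE" and ntype == NT_PRSTATUS)


def make_note(name, ntype, desc):
    """Build one synthetic note; only used to create demo input."""
    raw_name = name + b"\0"

    def pad(blob):
        return blob + b"\0" * (-len(blob) % 4)

    return (struct.pack("<III", len(raw_name), len(desc), ntype)
            + pad(raw_name) + pad(desc))


if __name__ == "__main__":
    # Two fake thread notes around one note of another type (2).
    segment = (make_note(b"CORE", NT_PRSTATUS, b"\x01" * 8)
               + make_note(b"CORE", 2, b"\x02" * 5)
               + make_note(b"CORE", NT_PRSTATUS, b"\x03" * 12))
    print(count_thread_notes(segment))
```

A core with only one such note would suggest the kernel dumped just the
faulting thread.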

> - For GDB to completely implement the above, is it forced to use 
> libthread-db?

Not necessarily.

Here's the added value we get from libthread_db right now, on "normal"
GNU/Linux threading (either LinuxThreads or NPTL, but not including the
M-N stuff that NGPT does):
  - Thread IDs, instead of just LWP IDs.  Big deal.
  - Thread local storage offsets.

We could find TLS areas with a per-architecture hook; libthread_db
finds the TLS area by grubbing through the thread descriptor, but given
that we have each LWP's state we could just do it in the
platform-mandated way (i.e. examining the registers in question, etc.)

I don't know offhand whether core files contain the necessary TLS
information for Alpha, i386, etc. - i.e. platforms where it's more than
just a register.
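
The per-architecture hook idea could be sketched roughly as below. All
of it is hypothetical: the table, the function names, and the register
dictionary are invented for illustration and are not GDB interfaces.
The x86_64 and ia64 entries are examples of "just a register" targets;
the i386 and alpha entries follow the remark above that those platforms
need more than a register, so the hook returns None there.

```python
def tls_base_from_register(regname):
    """Make a hook that reads the thread pointer straight from one register."""
    def hook(regs):
        return regs.get(regname)
    return hook


def tls_base_needs_more(regs):
    """Hook for targets where the registers alone do not give the TLS base."""
    return None


# Hypothetical table: architecture name -> hook taking one LWP's registers.
TLS_HOOKS = {
    "x86_64": tls_base_from_register("fs_base"),
    "ia64": tls_base_from_register("r13"),
    "i386": tls_base_needs_more,   # %gs only selects a descriptor
    "alpha": tls_base_needs_more,  # listed above as more than a register
}


def find_tls_base(arch, regs):
    """Return the TLS base for one LWP, or None if it cannot be derived."""
    try:
        hook = TLS_HOOKS[arch]
    except KeyError:
        raise LookupError("no TLS hook for architecture %r" % arch)
    return hook(regs)


if __name__ == "__main__":
    print(find_tls_base("x86_64", {"fs_base": 0x7F0000001000}))
    print(find_tls_base("i386", {"gs": 0x33}))
```

The point of the table is that the core-file and live cases could share
the same hook, since both supply a per-LWP register set.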

> My instincts tell me that, to completely implement the above 
> functionality, GDB is always going to need libthread-db.  If GDB could 
> implement the above on a core file without using libthread-db, then GDB 
> could also implement the above on a live target also without using 
> libthread-db.  This is because a core file is always going to contain a 
> subset of the information made available via ptrace et al.

In my opinion, there are two things wrong with that instinct:
  - What we need for live debugging is a superset of what we need for
    core debugging.  For instance we use libthread_db to find new
    threads at the moment.
  - Current GNU/Linux kernels can implement an entirely thread_db-free
    debugging model for threads.  We can trap on each clone, acquire
    each thread as it is created, map them to our internal numbering,
    etc.  We can't easily figure out TIDs (1024, 2049, etc.) - but it's
    not clear how much that matters.

    Doing it this way is more like the old linux-thread code than the
    current lin-lwp/thread-db combination; but it's vastly more
    reliable than either.  In my opinion it's preferable.  I just
    haven't had the time to finish it; and it isn't clear how to
    integrate it with thread_db.
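
The bookkeeping for the clone-trapping model can be sketched as below.
This is a minimal illustration, not the lin-lwp code: it assumes the
kernel mechanism in question is ptrace's clone event reporting
(PTRACE_O_TRACECLONE), that the wait status has already been obtained
from waitpid(), and that the new LWP ID has been fetched separately
(e.g. with PTRACE_GETEVENTMSG). ThreadTable and its methods are
hypothetical names.

```python
import signal

PTRACE_EVENT_CLONE = 3  # value from <linux/ptrace.h>


def is_clone_event(status):
    """True if a waitpid() status is a stop caused by a traced clone."""
    stopped = (status & 0xFF) == 0x7F
    expected = int(signal.SIGTRAP) | (PTRACE_EVENT_CLONE << 8)
    return stopped and (status >> 8) == expected


class ThreadTable:
    """Map kernel LWP IDs to small debugger-internal thread numbers."""

    def __init__(self, main_lwp):
        self._by_lwp = {}
        self._next = 1
        self.add(main_lwp)

    def add(self, lwp):
        """Register an LWP if it is new; return its internal number."""
        if lwp not in self._by_lwp:
            self._by_lwp[lwp] = self._next
            self._next += 1
        return self._by_lwp[lwp]

    def number_of(self, lwp):
        return self._by_lwp[lwp]

    def on_stop(self, status, new_lwp):
        """Acquire new_lwp if this stop is a clone event; else return None."""
        if is_clone_event(status):
            return self.add(new_lwp)
        return None


if __name__ == "__main__":
    table = ThreadTable(1234)
    clone_status = (PTRACE_EVENT_CLONE << 16) | (int(signal.SIGTRAP) << 8) | 0x7F
    print(table.on_stop(clone_status, 1240))
```

Nothing here recovers the thread library's own TIDs; the table only
numbers LWPs in the order they are seen.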

Thread_db is a great interface for querying threads.  It's not so good
for debugging them; for a case in point see my messages about the FAIL
at the end of print-threads.exp.  Thread_db does not interact well with
debugging the thread manager itself, which is a very important feature
for me.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer