From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kevin Buettner <kevinb@cygnus.com>
To: "Glenn F. Maynard" <g_gdb@zewt.org>, gdb@sourceware.cygnus.com
Subject: Re: gdb: detection and/or fork+gethostbyname crash workaround?
Date: Tue, 24 Jul 2001 13:26:00 -0000
Message-id: <1010724202607.ZM20680@ocotillo.lan>
References: <20010723210115.B1359@zewt.org> <1010724155219.ZM20181@ocotillo.lan> <20010724142248.A2923@zewt.org> <g_gdb@zewt.org>
X-SW-Source: 2001-07/msg00346.html

On Jul 24,  2:22pm, Glenn F. Maynard wrote:

> On Tue, Jul 24, 2001 at 08:52:19AM -0700, Kevin Buettner wrote:
> > If it's one which uses ptrace(), the kernel usually prohibits two
> > processes from invoking ptrace() on the same inferior.  So one
> > strategy might be to have the program in question cause ptrace to be
> > invoked on itself.  I don't think the process will be able to do this
> > itself; I think it's likely that it would have to fork and let the
> > child attempt this.  The return status from wait() or waitpid() could
> > indicate whether the attempt to invoke ptrace() was successful or not.
> 
> That's what I tried (x86 Linux); a process can't ptrace itself.  I could
> fork a process to test this, I suppose; I'll try that.  (Wouldn't it
> be the return status from ptrace(), though?  I assume it'd return something
> like EPERM if a process is already being debugged.)

Right; you'd pass back the results of the attempted ptrace() (and it
looks to me like EPERM is the right thing to test for) through the
child's exit status.  That's why I said you'd use wait() or waitpid()
to determine it.
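
For illustration, here is a minimal sketch of that fork-and-probe idea,
assuming Linux and glibc's ptrace() wrapper.  The names
traced_by_debugger() and verdict_from_status() and the PROBE_* exit
codes are made up for this example; nothing here comes from gdb.  The
child tries to attach to its parent and reports the outcome through
its exit status, which the parent collects with waitpid().

```c
#include <errno.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit codes the probing child uses to report back. */
enum { PROBE_ATTACHED = 0, PROBE_EPERM = 1, PROBE_OTHER = 2 };

/* Decode the child's wait status: 1 = attach was refused with EPERM
   (a tracer may already be present), 0 = attach worked (no tracer),
   -1 = the probe told us nothing. */
static int verdict_from_status(int status)
{
    if (!WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case PROBE_ATTACHED: return 0;
    case PROBE_EPERM:    return 1;
    default:             return -1;
    }
}

static int traced_by_debugger(void)
{
    pid_t parent = getpid();
    pid_t child = fork();
    int status;

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: try to become the parent's tracer. */
        if (ptrace(PTRACE_ATTACH, parent, NULL, NULL) == 0) {
            waitpid(parent, NULL, 0);   /* wait for the attach stop */
            ptrace(PTRACE_DETACH, parent, NULL, NULL);
            _exit(PROBE_ATTACHED);
        }
        /* EPERM can have other causes too (e.g. a system policy that
           forbids the attach), so treat it as a hint, not proof. */
        _exit(errno == EPERM ? PROBE_EPERM : PROBE_OTHER);
    }

    /* Parent: the attach briefly stops us, so retry on EINTR. */
    while (waitpid(child, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return verdict_from_status(status);
}
```

A caller would treat a return value of 1 as "probably being debugged"
and anything else as "no evidence of a debugger".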

> > I searched gdb@sources.redhat.com w/ both of these search phrases and
> > only came up with the message to which I'm responding.  Can you provide
> > some URLs for the archived messages?
> 
> http://sources.redhat.com/ml/bug-glibc/2000-04/msg00018.html

Thanks.  I think I can explain what is happening now...

GDB is placing a breakpoint in the dummy function, _dl_debug_state(),
which is the function that the shared library machinery calls to
communicate the fact that the set of shared libraries has changed
(i.e., the application has either loaded or unloaded one of them).
When a child is forked, this breakpoint still exists in the child
process, but no debugger is attached to the child to catch the
resulting SIGTRAP.  Ideally, gdb would remove all
breakpoints prior to forking the child and then reinstall these
breakpoints in the parent again after the child has started.  Some
operating systems (such as those which use /proc for the debug
interface to the OS) give the debugger facilities to do this, but to
the best of my knowledge, the ptrace() interface provided by Linux
does not provide a mechanism by which this may be accomplished. [1]
(Basically, GDB needs a way to stop the parent's execution in the
return path from fork().  It would also be nice if gdb allowed the
user to choose to follow the child or follow the parent, but that's
another problem...)

Anyway, for your problem, there are at least two other approaches
that you may wish to explore:

    1) Set up a signal handler in the child for SIGTRAP.  Have the
       signal handler replace the trap instruction with a nop prior
       to returning.

    2) In the parent, call dlopen() to explicitly load the shared
       libraries needed by gethostbyname().  That way, when
       gethostbyname() is called, there won't be any unloaded
       functions.

    3) See footnote [1].
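
Approach 2 could look roughly like the sketch below.  The library
names are assumptions on my part (the NSS modules that
gethostbyname() actually pulls in depend on the system's
configuration), and preload_libraries() is a hypothetical helper
name.  The handles are deliberately never closed so that the
libraries stay mapped across the fork.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed names; check which libraries your system really loads
   during gethostbyname() before relying on this list. */
static const char *const resolver_libs[] = {
    "libnss_files.so.2",
    "libnss_dns.so.2",
    "libresolv.so.2",
};

/* Try to load each named library up front and return how many
   loaded.  Failures are reported but are not fatal. */
static size_t preload_libraries(const char *const *names, size_t count)
{
    size_t loaded = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* The handle is intentionally leaked: the library must stay
           loaded for the lifetime of the process. */
        if (dlopen(names[i], RTLD_NOW | RTLD_GLOBAL) != NULL)
            loaded++;
        else
            fprintf(stderr, "preload %s: %s\n", names[i], dlerror());
    }
    return loaded;
}
```

The parent would call preload_libraries(resolver_libs, 3) once,
before the fork().  Older glibc versions need -ldl on the link line.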

[1] Actually, maybe there is.  It may be possible to place a
breakpoint at libc's fork() entry point.  When this breakpoint is hit,
we call ptrace(PTRACE_SYSCALL,...) twice.  From my reading of the
documentation, the first PTRACE_SYSCALL call will stop at the fork
syscall's entry point, the second will stop when the syscall is about
to return.  We could remove all breakpoints on the first stop (fork
entry) and restore them on the second stop (fork exit).
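
To show the two-stop behaviour in isolation, here is a small
tracer/tracee sketch, assuming Linux.  It is not gdb code:
count_syscall_stops() is a hypothetical name, a self-sent SIGSTOP
stands in for the breakpoint at fork()'s entry point, and getppid()
stands in for the fork syscall itself.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 2 if both the syscall-entry and the syscall-exit stop were
   observed, or -1 if tracing did not work as expected. */
static int count_syscall_stops(void)
{
    int status;
    int stops = 0;
    int alive = 1;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(2);
        kill(getpid(), SIGSTOP);   /* stand-in for the breakpoint hit */
        (void)getppid();           /* stand-in for the fork syscall */
        _exit(0);
    }

    /* Wait for the child to stop on its SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    while (stops < 2) {
        /* Resume until the next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) != 0)
            break;
        if (waitpid(child, &status, 0) < 0)
            break;
        if (!WIFSTOPPED(status)) {
            alive = 0;             /* child is gone and already reaped */
            break;
        }
        /* stops == 0: syscall entry, where breakpoints would be removed.
           stops == 1: syscall exit, where they would be reinserted. */
        stops++;
    }

    if (alive) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
    }
    return stops == 2 ? 2 : -1;
}
```

A real implementation would also have to deal with signals arriving
between the two stops, which this sketch ignores.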

Kevin