Re: Checking function calls

From: Fredrik Tolf <fredrik@dolda2000.cjb.net>
To: gdb@sources.redhat.com
Subject: Re: Checking function calls
Date: Thu, 05 Dec 2002 15:14:00 -0000
Message-ID: <1039130097.2343.52.camel@pc7>
In-Reply-To: <200212050451.gB54ppB31800@duracef.shout.net>

On Thu, 2002-12-05 at 05:51, Michael Elizabeth Chastain wrote:
> Hi Fredrik,
> 
> I'm throwing out a bunch of ideas here, take whatever looks useful
> and discard the rest.
> 
> > Therefore, the failure has to be that a called
> > function doesn't restore EBX correctly, on rare occasions, right?
> 
> I have seen this happen in a mixed programming environment,
> with a Cygwin program that used a Windows DLL.  The Windows DLL
> had subtly different calling conventions where it did not preserve
> %ebx, %esi, and %edi across function calls.  Perhaps you have some
> kind of third party library in your program which has a similar
> compatibility issue?

The only libraries are libc, libpthread, libdl and libpam. In the
affected function, only libc and libpthread are used. Therefore I don't
think it's calling convention incompatibility, unless they in turn call
functions in third party libraries, which I find very unlikely.

> 
> > My question is thus: Is there any way of debugging this with GDB? Can I
> > make GDB check that EBX is the same before and after every function call
> > from that frame in this thread to isolate the failing function? The
> > frame never exits (until the program exits, that is), if that helps.
> 
> You could set a bunch of conditional breakpoints with "break if %ebx !=
> saved_ebx", where you add code to your program to initialize saved_ebx.
> Or you could say "break if %ebx < 0x1000" or some convenient constant.
> 

That would, of course, be a good thing. It's only that I'd have to do
that after every single function call... That would take some time.
Maybe I'll do it, anyway. I was actually thinking of doing something
like that, but with code instead, and making the thread SIGSTOP itself
when EBX is invalid.
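
A minimal sketch of that in-code check, assuming the value is copied to a
memory-backed variable before each call and compared afterwards. The names
value_preserved and CHECK_PRESERVED are hypothetical, not taken from the
program under discussion. Note that raise(SIGSTOP) stops the whole process,
not only the calling thread, which is convenient for attaching gdb
afterwards (for example with "gdb -p PID").

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the value survived the call, 0 if not. */
static int value_preserved(const void *saved, const void *current)
{
    return saved == current;
}

/*
 * Hypothetical macro: place it right after a suspect function call.
 * If the register-held value no longer matches the saved copy, report
 * the source location and stop the process so a debugger can attach.
 */
#define CHECK_PRESERVED(saved, current)                                  \
    do {                                                                 \
        if (!value_preserved((saved), (current))) {                      \
            fprintf(stderr, "%s:%d: value changed from %p to %p\n",      \
                    __FILE__, __LINE__,                                  \
                    (void *)(saved), (void *)(current));                 \
            raise(SIGSTOP);                                              \
        }                                                                \
    } while (0)
```

Inside the loop this would look roughly like "saved = next; some_call();
CHECK_PRESERVED(saved, next);", where saved is declared volatile so that the
compiler is likely to keep it in memory rather than in another register.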

> You could also try forcing your variable to be on the stack instead of a
> register.  Remove the "register" attribute from the declaration of "next"
> if you have one.  Then add a "do_nothing(&next)" call to your function,
> to force "next" to be on the stack instead of in a register.  If the
> symptoms go away then it's more likely to really be a register clobber.

That just doesn't feel like a very elegant solution, though. And, this
bug does actually surface in another function as well, only even more
rarely. There it also affects a variable stored in EBX, but it gets
set to 0 instead. So, I would prefer actually solving the problem, so
that it doesn't show up anywhere else. I have noticed no similarities
between the two places where the bug shows itself.
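
For reference, the do_nothing(&next) experiment suggested above could be
sketched as follows. This is only a diagnostic aid, not a fix. The struct
layout, the sink variable and count_nodes are assumptions made for
illustration. Storing the address into a volatile object is one way to make
it harder for the compiler to optimise the call away; putting do_nothing in
a separate source file would serve the same purpose.

```c
#include <stddef.h>

/* Hypothetical list element, for illustration only. */
struct node {
    struct node *next;
    int value;
};

/* The address escapes into a volatile object, so the compiler has to
 * assume the pointed-to variable lives in memory. */
static void *volatile sink;

void do_nothing(void *p)
{
    sink = p;
}

/* Walks the list; "next" has its address taken, so it should be kept
 * on the stack instead of in a callee-saved register such as EBX. */
size_t count_nodes(struct node *list)
{
    struct node *cur, *next;
    size_t n = 0;

    do_nothing(&next);
    for (cur = list; cur != NULL; cur = next) {
        next = cur->next;
        n++;
    }
    return n;
}
```

If the crash disappears with this change, a clobbered register becomes the
more likely explanation; if it persists, memory corruption is more likely.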

> > At first I was expecting that another thread somehow gets there and
> > modifies the storage memory of next.
> 
> I still suspect this.  It's more likely that memory gets clobbered rather
> than a register value.
> 

But next isn't stored in memory at any place, so it cannot be that.

> Perhaps you need a function that locks the whole list and walks it for
> a sanity check, without deleting anything?
> 

I always check the list with gdb when the program crashes, and it's
always correct. That's why I think that it's impossible that next is
loaded when the list is in an unstable state. The only times I actually
set the next element, I always set it to NULL or a pointer returned by
malloc(). If the list were to be made unstable by a buggy function
somewhere, it would have to be restored again by the same function (since
it's always consistent when I look at it), and I just don't see that
happening.

> Here is another wild lead: if, somehow, a block gets freed and then
> you read it, many implementations of malloc keep housekeeping information
> in the first word or two of a freed block.  That would explain why the
> value is always 0x10 to 0x30 (that could be block size, especially if it is
> rounded up to a multiple of 4 or 8) and why only 1-2 words are clobbered.
> If you manage your blocks with malloc/free, you could try turning on any
> malloc debugging facilities that you have.

I also suspected that something like that might happen, and therefore I
lock the elements one element ahead of the block I'm currently looking
at, so that the current block and the next are always locked. That's why
I have:

for(cur = list; cur != NULL; cur = next)
{
    if((next = cur->next) != NULL)
        pthread_mutex_lock(&next->mutex);
    ...
}

That is also a reason why the next variable has to be clobbered at some
later point, since pthread_mutex_lock succeeds on it. The program always
crashes on the line "if((next = cur->next) != NULL)", since it segfaults
when it looks up cur->next, i.e. at that point cur has been set to the
invalid next as directed by the loop. Therefore, when the program crashes,
next and cur are equal, and I cannot see what element it was at before.
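
One way to recover the previous element after a crash is to record it in a
memory-backed variable before advancing. The sketch below is a
self-contained version of the lock-ahead loop with that addition. The
struct layout, the last_visited variable, the sum_list function and the
locking of the head element inside the function are all assumptions made
for illustration, not the code from the project.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list element with a per-element mutex. */
struct elem {
    struct elem *next;
    pthread_mutex_t mutex;
    int value;
};

/* Last element that was fully processed. It is volatile so that it is
 * stored to memory and can be inspected from gdb after a segfault.
 * With several traversing threads it would have to be per-thread. */
static struct elem *volatile last_visited;

/* Walks the list holding the lock of the current and the next element. */
int sum_list(struct elem *list)
{
    struct elem *cur, *next;
    int sum = 0;

    if (list != NULL)
        pthread_mutex_lock(&list->mutex);
    for (cur = list; cur != NULL; cur = next) {
        /* Lock one element ahead before letting go of the current one. */
        if ((next = cur->next) != NULL)
            pthread_mutex_lock(&next->mutex);
        sum += cur->value;
        /* Record the element only after it was handled, so a crash on
         * the following cur->next lookup leaves the previous one here. */
        last_visited = cur;
        pthread_mutex_unlock(&cur->mutex);
    }
    return sum;
}
```

After a crash on the cur->next lookup, "print last_visited" in gdb would
then show the element the loop was at before cur became invalid.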

By the way, if you want to look at the code, it's available at
http://sourceforge.net/projects/dcprod/. I don't know if it's the latest
version, though.
