From: Pedro Alves <pedro@codesourcery.com>
To: gdb-patches@sourceware.org
Cc: John Spencer <maillist-gdbpatches@barfooze.de>
Subject: Re: wrong assumptions about pthread_t being numeric
Date: Sat, 17 Sep 2011 15:47:00 -0000 [thread overview]
Message-ID: <201109171629.09383.pedro@codesourcery.com> (raw)
In-Reply-To: <201109171558.33536.pedro@codesourcery.com>
On Saturday 17 September 2011 15:58:33, Pedro Alves wrote:
> On Saturday 17 September 2011 02:02:28, John Spencer wrote:
> > On 09/17/2011 02:30 AM, Pedro Alves wrote:
> > > On Saturday 17 September 2011 00:13:10, John Spencer wrote:
> > >> On 09/17/2011 01:00 AM, Pedro Alves wrote:
> > >>> These are only built natively on solaris and aix respectively, so
> > >>> let's just leave them alone.
> > >>>
> > >> I expected it to be desirable for a product in industrial use to be
> > >> standard-compliant and not invoking undefined behavior.
> > > Those files are tied to those platforms' thread_db/libc implementations.
> > > There's absolutely no need to handle some other hipotetical libc that
> > > defines pthread_t diferently there. If it appears, we'll handle it.
> >
> > > Chances are, some other changes would be necessary to make it really
> > > work, not just build.
> > >
> >
> > exactly.
>
> No, you're completely missing the point. The code in question is
> not _using_ posix threads, as a normal application would.
> It is interfacing with the system's libc to get at the thread support
> library's internals, and letting the debugger control and instead the state
> of threads. There's no portability requirement here. The major requirement
> is that the code works with the libc it is intended to be interfacing with.
>
> pthread_t on a given system is not _required_ to be an arithmetic type, but
> it certainly can. Sorry, I strongly object to changing the solaris or AIX
> thread support just because you've learned that on some other systems
> pthread_t may not be numeric.
>
> > for example in musl's case it is wrong to compare the
> > underlying type (which is a struct pointer) with 0.
>
> musl does not support solaris nor aix...
>
> > in the implementation specific part it would practically be valid to
> > compare it with
> > NULL, but that's still invalid regarding the definition in POSIX.
>
> The gdb code in question _is_ peeking at implementation specific
> parts! That's the point that's apparently gone missing.
>
> E.g., with such a program being debugged:
>
> pthread_t thread;
> res = pthread_create (&thread, NULL, entry_func, NULL);
> ...
>
> You can do:
>
> (gdb) p /x thread
> $2 = 0x7ffff7829700
>
> And correlate that with the thread in gdb's thread list:
>
> (gdb) info threads
> 3 Thread 0x7ffff7028700 (LWP 29643) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82
> 2 Thread 0x7ffff7829700 (LWP 29640) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82
> * 1 Thread 0x7ffff7fcd720 (LWP 29618) main () at threads.c:50
>
> See, it's thread 2. If on some other platform pthread_t is
>
> struct pthread_t { int foo; int bar; };
>
> Then we'll need to have specific code in gdb's libthread_db.so handling
> to print that. But on some other platform, it could be,
>
> struct pthread_t { long tid; int some_other_id_who_knows; };
>
> and again, we'd need some code to be able to print this.
>
> So, you see, as we find platforms that need adapting to, we'll
> do it, but it doesn't make much sense to do it before stumbling
> on them.
>
> > so the few places that use a thread id had to be changed everywhere, to
> > not have dozens of special case
> > sections for any libc/OS combination...
>
> Again, let's not invent work until we find such a combination...
> There's no way you can know what such a libc/thread_db's pthread_t
> would look like until it exists.
>
> > also there are a few spots where the thread_t is numerically compared to
> > some arbitrary int value
> > (i have not yet evaluated where those originate from).
>
> Let us know when you find them...
>
> Maybe you're thinking of this in gdbserver/thread-db.c, where we
> have:
>
> /* If the new thread ID is zero, a final thread ID will be available
> later. Do not enable thread debugging yet. */
> if (ti.ti_tid == 0)
> return 0;
>
> but again, this is an implementation detail of glibc that gdb needs
> to know about! There's no portable null pthread_t, and even if
> there was, there's no telling what other libc's do in this scenario.
> Again, there's _no_ way to make this portable, because we _are_ part
> of the implementation.
>
> Code doing the same in gdb, where the comments are a bit more clear:
>
> if (ti.ti_tid == 0)
> {
> /* A thread ID of zero means that this is the main thread, but
> glibc has not yet initialized thread-local storage and the
> pthread library. We do not know what the thread's TID will
> be yet. Just enable event reporting and otherwise ignore
> it. */
>
> > this also can't
> > work when thread_t is no number.
> > this happens for example in the if statement leading to the posted
> > printf format string errors.
>
>
> > i expect that the number of actual fixes would be small, but it might
> > require another way to get to a thread id.
> >
> > if the id is only used for textual representation, it would be
> > sufficient if gdb would assign a new unique number
> > on each successful pthread_create.
> > if it is also used to find a thread internally, it had to have a list of
> > tid numbers -> pthread_t.
>
> Again, when we have a libc/thread_db that requires that, we'll do it.
> Until then, we'll keep the code simple, thank you.
>
> >
> >
> > >>>> thread-db.c: In function 'find_one_thread':
> > >>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but
> > >>>> argument 3 has type 'thread_t'
> > >>>> thread-db.c: In function 'attach_thread':
> > >>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but
> > >>>> argument 3 has type 'thread_t'
> > >>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but
> > >>>> argument 2 has type 'thread_t'
> > >>> So just cast it to long, and you're done.
> > >>>
> > >> pthread_t could legally be a struct, which you can't just cast to a long.
> > > No need to complicate things for an hipotetical scenario. The set of
> > > libc's in existence is finite. If we were to handle a struct pthread_t,
> > > we'd need to be able to print it, and so we'd need some libc specific
> > > way to do it, something autoconf'ed. There's no need to invent work.
> > >
> > i disagree. adding a proper solution once is superior to creating dozens
> > of special case hacks.
>
> What's the "proper" solution then? The only possible way I can think
> of to handle some libc what uses a struct for pthread_t, is with some way at
> runtime to know which libc/thread_db gdb is talking to, failing that,
> #ifdefs and/or autoconfig'ury. Since a simple cast will
> work for glibc, android, musl, and whatever other libc's happen be
> be building gdbserver today, what's the point?
I'll even go a step further, and suggest you do expose pthread_t
as a typedef to long in public headers, and cast it to pointer internally
in musl, just like glibc and most other libc's do. Hiding the fact
that you implement things with a pointer could even be a good thing.
It's just easier for everybody else that way. There's really no point in
being different for the sake of it.
>
> > also it saves a lot of time in the long term.
>
> No, it wastes time right now coming up with a fix for an made up
> problem that does not exist in practice today.
>
>
--
Pedro Alves
next prev parent reply other threads:[~2011-09-17 15:29 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-16 23:01 John Spencer
2011-09-16 23:16 ` Pedro Alves
2011-09-17 0:31 ` John Spencer
2011-09-17 1:05 ` Pedro Alves
2011-09-17 1:23 ` John Spencer
2011-09-17 2:29 ` Matt Rice
2011-09-17 6:47 ` Joel Brobecker
2011-09-18 0:11 ` John Spencer
2011-09-29 2:31 ` John Spencer
2011-09-29 8:39 ` Kai Tietz
2011-10-01 1:08 ` John Spencer
2011-09-29 11:10 ` Pedro Alves
2011-09-17 15:29 ` Pedro Alves
2011-09-17 15:47 ` Pedro Alves [this message]
2011-10-01 1:02 ` John Spencer
2011-10-01 2:00 ` Pedro Alves
2011-10-01 9:00 ` Mark Kettenis
2011-10-01 9:14 ` Kevin Pouget
2011-10-05 8:02 ` John Spencer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201109171629.09383.pedro@codesourcery.com \
--to=pedro@codesourcery.com \
--cc=gdb-patches@sourceware.org \
--cc=maillist-gdbpatches@barfooze.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox