Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed
From: Pedro Alves <pedro@codesourcery.com>
To: gdb-patches@sourceware.org
Cc: John Spencer <maillist-gdbpatches@barfooze.de>
Subject: Re: wrong assumptions about pthread_t being numeric
Date: Sat, 17 Sep 2011 15:47:00 -0000	[thread overview]
Message-ID: <201109171629.09383.pedro@codesourcery.com> (raw)
In-Reply-To: <201109171558.33536.pedro@codesourcery.com>

On Saturday 17 September 2011 15:58:33, Pedro Alves wrote:
> On Saturday 17 September 2011 02:02:28, John Spencer wrote:
> > On 09/17/2011 02:30 AM, Pedro Alves wrote:
> > > On Saturday 17 September 2011 00:13:10, John Spencer wrote:
> > >> On 09/17/2011 01:00 AM, Pedro Alves wrote:
> > >>> These are only built natively on solaris and aix respectively, so
> > >>> let's just leave them alone.
> > >>>
> > >> I expected it to be desirable for a product in industrial use to be
> > >> standard-compliant and not invoking undefined behavior.
> > > Those files are tied to those platforms' thread_db/libc implementations.
> > > There's absolutely no need to handle some other hipotetical libc that
> > > defines pthread_t diferently there.  If it appears, we'll handle it.
> > 
> > > Chances are, some other changes would be necessary to make it really
> > > work, not just build.
> > >
> > 
> > exactly. 
> 
> No, you're completely missing the point.  The code in question is
> not _using_ posix threads, as a normal application would.
> It is interfacing with the system's libc to get at the thread support
> library's internals, and letting the debugger control and instead the state
> of threads.  There's no portability requirement here.  The major requirement
> is that the code works with the libc it is intended to be interfacing with.
> 
> pthread_t on a given system is not _required_ to be an arithmetic type, but
> it certainly can.  Sorry, I strongly object to changing the solaris or AIX
> thread support just because you've learned that on some other systems
> pthread_t may not be numeric.
> 
> > for example in musl's case it is wrong to compare the 
> > underlying type (which is a struct pointer) with 0.
> 
> musl does not support solaris nor aix...
> 
> > in the implementation specific part it would practically be valid to 
> > compare it with
> > NULL, but that's still invalid regarding the definition in POSIX.
> 
> The gdb code in question _is_ peeking at implementation specific
> parts!  That's the point that's apparently gone missing.
> 
> E.g., with such a program being debugged:
> 
>   pthread_t thread;
>   res = pthread_create (&thread, NULL, entry_func, NULL);  
>   ...
> 
> You can do:
> 
>   (gdb) p /x thread
>   $2 = 0x7ffff7829700
> 
> And correlate that with the thread in gdb's thread list:
> 
>  (gdb) info threads 
>    3 Thread 0x7ffff7028700 (LWP 29643)  0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82
>    2 Thread 0x7ffff7829700 (LWP 29640)  0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82
>  * 1 Thread 0x7ffff7fcd720 (LWP 29618)  main () at threads.c:50
> 
> See, it's thread 2.  If on some other platform pthread_t is
> 
> struct pthread_t { int foo; int bar; };
> 
> Then we'll need to have specific code in gdb's libthread_db.so handling
> to print that.  But on some other platform, it could be, 
> 
> struct pthread_t { long tid; int some_other_id_who_knows; };
> 
> and again, we'd need some code to be able to print this.
> 
> So, you see, as we find platforms that need adapting to, we'll
> do it, but it doesn't make much sense to do it before stumbling
> on them.
> 
> > so the few places that use a thread id had to be changed everywhere, to 
> > not have dozens of special case
> > sections for any libc/OS combination...
> 
> Again, let's not invent work until we find such a combination...
> There's no way you can know what such a libc/thread_db's pthread_t
> would look like until it exists.
> 
> > also there are a few spots where the thread_t is numerically compared to 
> > some arbitrary int value
> > (i have not yet evaluated where those originate from). 
> 
> Let us know when you find them...
> 
> Maybe you're thinking of this in gdbserver/thread-db.c, where we
> have:
> 
>   /* If the new thread ID is zero, a final thread ID will be available
>      later.  Do not enable thread debugging yet.  */
>   if (ti.ti_tid == 0)
>     return 0;
> 
> but again, this is an implementation detail of glibc that gdb needs
> to know about!  There's no portable null pthread_t, and even if
> there was, there's no telling what other libc's do in this scenario.
> Again, there's _no_ way to make this portable, because we _are_ part
> of the implementation.
> 
> Code doing the same in gdb, where the comments are a bit more clear:
> 
>   if (ti.ti_tid == 0)
>     {
>       /* A thread ID of zero means that this is the main thread, but
> 	 glibc has not yet initialized thread-local storage and the
> 	 pthread library.  We do not know what the thread's TID will
> 	 be yet.  Just enable event reporting and otherwise ignore
> 	 it.  */
> 
> > this also can't 
> > work when thread_t is no number.
> > this happens for example in the if statement leading to the posted 
> > printf format string errors.
> 
> 
> > i expect that the number of actual fixes would be small, but it might 
> > require another way to get to a thread id.
> > 
> > if the id is only used for textual representation, it would be 
> > sufficient if gdb would assign a new unique number
> > on each successful pthread_create.
> > if it is also used to find a thread internally, it had to have a list of 
> > tid numbers -> pthread_t.
> 
> Again, when we have a libc/thread_db that requires that, we'll do it.
> Until then, we'll keep the code simple, thank you.
> 
> > 
> > 
> > >>>> thread-db.c: In function 'find_one_thread':
> > >>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but
> > >>>> argument 3 has type 'thread_t'
> > >>>> thread-db.c: In function 'attach_thread':
> > >>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but
> > >>>> argument 3 has type 'thread_t'
> > >>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but
> > >>>> argument 2 has type 'thread_t'
> > >>> So just cast it to long, and you're done.
> > >>>
> > >> pthread_t could legally be a struct, which you can't just cast to a long.
> > > No need to complicate things for an hipotetical scenario.  The set of
> > > libc's in existence is finite.  If we were to handle a struct pthread_t,
> > > we'd need to be able to print it, and so we'd need some libc specific
> > > way to do it, something autoconf'ed.  There's no need to invent work.
> > >
> > i disagree. adding a proper solution once is superior to creating dozens 
> > of special case hacks.
> 
> What's the "proper" solution then?  The only possible way I can think
> of to handle some libc what uses a struct for pthread_t, is with some way at
> runtime to know which libc/thread_db gdb is talking to, failing that,
> #ifdefs and/or autoconfig'ury.  Since a simple cast will
> work for glibc, android, musl, and whatever other libc's happen be
> be building gdbserver today, what's the point?

I'll even go a step further, and suggest you do expose pthread_t
as a typedef to long in public headers, and cast it to pointer internally
in musl, just like glibc and most other libc's do.  Hiding the fact
that you implement things with a pointer could even be a good thing.

It's just easier for everybody else that way.  There's really no point in
being different for the sake of it.

> 
> > also it saves a lot of time in the long term.
> 
> No, it wastes time right now coming up with a fix for an made up
> problem that does not exist in practice today.
> 
> 

-- 
Pedro Alves


  reply	other threads:[~2011-09-17 15:29 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-16 23:01 John Spencer
2011-09-16 23:16 ` Pedro Alves
2011-09-17  0:31   ` John Spencer
2011-09-17  1:05     ` Pedro Alves
2011-09-17  1:23       ` John Spencer
2011-09-17  2:29         ` Matt Rice
2011-09-17  6:47         ` Joel Brobecker
2011-09-18  0:11           ` John Spencer
2011-09-29  2:31             ` John Spencer
2011-09-29  8:39               ` Kai Tietz
2011-10-01  1:08                 ` John Spencer
2011-09-29 11:10               ` Pedro Alves
2011-09-17 15:29         ` Pedro Alves
2011-09-17 15:47           ` Pedro Alves [this message]
2011-10-01  1:02             ` John Spencer
2011-10-01  2:00               ` Pedro Alves
2011-10-01  9:00                 ` Mark Kettenis
2011-10-01  9:14                   ` Kevin Pouget
2011-10-05  8:02                 ` John Spencer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201109171629.09383.pedro@codesourcery.com \
    --to=pedro@codesourcery.com \
    --cc=gdb-patches@sourceware.org \
    --cc=maillist-gdbpatches@barfooze.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox