* wrong assumptions about pthread_t being numeric @ 2011-09-16 23:01 John Spencer 2011-09-16 23:16 ` Pedro Alves 0 siblings, 1 reply; 19+ messages in thread From: John Spencer @ 2011-09-16 23:01 UTC (permalink / raw) To: gdb-patches there are a couple of spots in thread handling which assume that thread_t (typedef of pthread_t) is of a numeric type. according to POSIX http://pubs.opengroup.org/onlinepubs/007904875/basedefs/sys/types.h.html pthread_t is a non-arithmetic type. thus whenever a "thread id" of type pthread_t is used in a numeric context, it invokes undefined behaviour, since pthread_t could be a struct, a pointer, etc. for example, it is implemented as a pointer to a struct in musl libc for efficiency. i basically wanted to fix my compile error and send a patch, but i think this should be discussed first. thread-db.c: In function 'find_one_thread': thread-db.c:295:7: error: format '%ld' expects type 'long int', but argument 3 has type 'thread_t' thread-db.c: In function 'attach_thread': thread-db.c:335:7: error: format '%ld' expects type 'long int', but argument 3 has type 'thread_t' thread-db.c:341:9: error: format '%ld' expects type 'long int', but argument 2 has type 'thread_t' $ grep ti_tid `find gdb-7.3.1 -name '*.c'` gdb-7.3.1/gdb/sol-thread.c: return BUILD_THREAD (ti.ti_tid, PIDGET (lwp)); gdb-7.3.1/gdb/sol-thread.c: ptid = BUILD_THREAD (ti.ti_tid, PIDGET (inferior_ptid)); gdb-7.3.1/gdb/sol-thread.c: ti.ti_tid, ti.ti_lid); gdb-7.3.1/gdb/linux-thread-db.c: gdb_assert (ti_p->ti_tid != 0); gdb-7.3.1/gdb/linux-thread-db.c: private->tid = ti_p->ti_tid; gdb-7.3.1/gdb/linux-thread-db.c: if (ti.ti_tid == 0 && target_has_execution) gdb-7.3.1/gdb/gdbserver/thread-db.c: ti.ti_tid, ti.ti_lid); gdb-7.3.1/gdb/gdbserver/thread-db.c: if (ti.ti_tid == 0) gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); gdb-7.3.1/gdb/aix-thread.c: return thrinf.ti_tid; -- JS ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-16 23:01 wrong assumptions about pthread_t being numeric John Spencer @ 2011-09-16 23:16 ` Pedro Alves 2011-09-17 0:31 ` John Spencer 0 siblings, 1 reply; 19+ messages in thread From: Pedro Alves @ 2011-09-16 23:16 UTC (permalink / raw) To: gdb-patches; +Cc: John Spencer On Friday 16 September 2011 23:40:47, John Spencer wrote: > i basically wanted to fix my compile error and send a patch, but i think > this should be discussed first. > $ grep ti_tid `find gdb-7.3.1 -name '*.c'` > gdb-7.3.1/gdb/sol-thread.c: return BUILD_THREAD (ti.ti_tid, PIDGET (lwp)); > gdb-7.3.1/gdb/sol-thread.c: ptid = BUILD_THREAD (ti.ti_tid, PIDGET > (inferior_ptid)); > gdb-7.3.1/gdb/sol-thread.c: ti.ti_tid, ti.ti_lid); > gdb-7.3.1/gdb/aix-thread.c: return thrinf.ti_tid; These are only built natively on solaris and aix respectively, so let's just leave them alone. > thread-db.c: In function 'find_one_thread': > thread-db.c:295:7: error: format '%ld' expects type 'long int', but > argument 3 has type 'thread_t' > thread-db.c: In function 'attach_thread': > thread-db.c:335:7: error: format '%ld' expects type 'long int', but > argument 3 has type 'thread_t' > thread-db.c:341:9: error: format '%ld' expects type 'long int', but > argument 2 has type 'thread_t' So just cast it to long, and you're done. > gdb-7.3.1/gdb/linux-thread-db.c: gdb_assert (ti_p->ti_tid != 0); > gdb-7.3.1/gdb/linux-thread-db.c: private->tid = ti_p->ti_tid; > gdb-7.3.1/gdb/linux-thread-db.c: if (ti.ti_tid == 0 && > target_has_execution) > gdb-7.3.1/gdb/gdbserver/thread-db.c: ti.ti_tid, ti.ti_lid); > gdb-7.3.1/gdb/gdbserver/thread-db.c: if (ti.ti_tid == 0) > gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); > gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); Does your libc actually have a libthread_db.so? -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-16 23:16 ` Pedro Alves @ 2011-09-17 0:31 ` John Spencer 2011-09-17 1:05 ` Pedro Alves 0 siblings, 1 reply; 19+ messages in thread From: John Spencer @ 2011-09-17 0:31 UTC (permalink / raw) To: gdb-patches; +Cc: Pedro Alves On 09/17/2011 01:00 AM, Pedro Alves wrote: > > These are only built natively on solaris and aix respectively, so > let's just leave them alone. > I expected it to be desirable for a product in industrial use to be standard-compliant and not invoking undefined behavior. >> thread-db.c: In function 'find_one_thread': >> thread-db.c:295:7: error: format '%ld' expects type 'long int', but >> argument 3 has type 'thread_t' >> thread-db.c: In function 'attach_thread': >> thread-db.c:335:7: error: format '%ld' expects type 'long int', but >> argument 3 has type 'thread_t' >> thread-db.c:341:9: error: format '%ld' expects type 'long int', but >> argument 2 has type 'thread_t' > So just cast it to long, and you're done. > pthread_t could legally be a struct, which you can't just cast to a long. >> gdb-7.3.1/gdb/linux-thread-db.c: gdb_assert (ti_p->ti_tid != 0); >> gdb-7.3.1/gdb/linux-thread-db.c: private->tid = ti_p->ti_tid; >> gdb-7.3.1/gdb/linux-thread-db.c: if (ti.ti_tid == 0&& >> target_has_execution) >> gdb-7.3.1/gdb/gdbserver/thread-db.c: ti.ti_tid, ti.ti_lid); >> gdb-7.3.1/gdb/gdbserver/thread-db.c: if (ti.ti_tid == 0) >> gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); >> gdb-7.3.1/gdb/gdbserver/thread-db.c: ti_p->ti_tid, ti_p->ti_lid); > Does your libc actually have a libthread_db.so? > no, that's another problem. the code just assumes it is there. i had to use a couple of patches to make gdb compile. if you are interested, the cumulative patch is available here http://pastie.org/private/rx05yywro1utmvvniw6gjw -- JS ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 0:31 ` John Spencer @ 2011-09-17 1:05 ` Pedro Alves 2011-09-17 1:23 ` John Spencer 0 siblings, 1 reply; 19+ messages in thread From: Pedro Alves @ 2011-09-17 1:05 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches On Saturday 17 September 2011 00:13:10, John Spencer wrote: > On 09/17/2011 01:00 AM, Pedro Alves wrote: > > > > These are only built natively on solaris and aix respectively, so > > let's just leave them alone. > > > > I expected it to be desirable for a product in industrial use to be > standard-compliant and not invoking undefined behavior. Those files are tied to those platforms' thread_db/libc implementations. There's absolutely no need to handle some other hipotetical libc that defines pthread_t diferently there. If it appears, we'll handle it. Chances are, some other changes would be necessary to make it really work, not just build. > >> thread-db.c: In function 'find_one_thread': > >> thread-db.c:295:7: error: format '%ld' expects type 'long int', but > >> argument 3 has type 'thread_t' > >> thread-db.c: In function 'attach_thread': > >> thread-db.c:335:7: error: format '%ld' expects type 'long int', but > >> argument 3 has type 'thread_t' > >> thread-db.c:341:9: error: format '%ld' expects type 'long int', but > >> argument 2 has type 'thread_t' > > So just cast it to long, and you're done. > > > > pthread_t could legally be a struct, which you can't just cast to a long. No need to complicate things for an hipotetical scenario. The set of libc's in existence is finite. If we were to handle a struct pthread_t, we'd need to be able to print it, and so we'd need some libc specific way to do it, something autoconf'ed. There's no need to invent work. -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 1:05 ` Pedro Alves @ 2011-09-17 1:23 ` John Spencer 2011-09-17 2:29 ` Matt Rice ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: John Spencer @ 2011-09-17 1:23 UTC (permalink / raw) To: gdb-patches; +Cc: Pedro Alves On 09/17/2011 02:30 AM, Pedro Alves wrote: > On Saturday 17 September 2011 00:13:10, John Spencer wrote: >> On 09/17/2011 01:00 AM, Pedro Alves wrote: >>> These are only built natively on solaris and aix respectively, so >>> let's just leave them alone. >>> >> I expected it to be desirable for a product in industrial use to be >> standard-compliant and not invoking undefined behavior. > Those files are tied to those platforms' thread_db/libc implementations. > There's absolutely no need to handle some other hipotetical libc that > defines pthread_t diferently there. If it appears, we'll handle it. > Chances are, some other changes would be necessary to make it really > work, not just build. > exactly. for example in musl's case it is wrong to compare the underlying type (which is a struct pointer) with 0. in the implementation specific part it would practically be valid to compare it with NULL, but that's still invalid regarding the definition in POSIX. so the few places that use a thread id had to be changed everywhere, to not have dozens of special case sections for any libc/OS combination... also there are a few spots where the thread_t is numerically compared to some arbitrary int value (i have not yet evaluated where those originate from). this also can't work when thread_t is no number. this happens for example in the if statement leading to the posted printf format string errors. i expect that the number of actual fixes would be small, but it might require another way to get to a thread id. if the id is only used for textual representation, it would be sufficient if gdb would assign a new unique number on each successful pthread_create. if it is also used to find a thread internally, it had to have a list of tid numbers -> pthread_t. >>>> thread-db.c: In function 'find_one_thread': >>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but >>>> argument 3 has type 'thread_t' >>>> thread-db.c: In function 'attach_thread': >>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but >>>> argument 3 has type 'thread_t' >>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but >>>> argument 2 has type 'thread_t' >>> So just cast it to long, and you're done. >>> >> pthread_t could legally be a struct, which you can't just cast to a long. > No need to complicate things for an hipotetical scenario. The set of > libc's in existence is finite. If we were to handle a struct pthread_t, > we'd need to be able to print it, and so we'd need some libc specific > way to do it, something autoconf'ed. There's no need to invent work. > i disagree. adding a proper solution once is superior to creating dozens of special case hacks. also it saves a lot of time in the long term. -- JS ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 1:23 ` John Spencer @ 2011-09-17 2:29 ` Matt Rice 2011-09-17 6:47 ` Joel Brobecker 2011-09-17 15:29 ` Pedro Alves 2 siblings, 0 replies; 19+ messages in thread From: Matt Rice @ 2011-09-17 2:29 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches, Pedro Alves On Fri, Sep 16, 2011 at 6:02 PM, John Spencer <maillist-gdbpatches@barfooze.de> wrote: > On 09/17/2011 02:30 AM, Pedro Alves wrote: >> >> On Saturday 17 September 2011 00:13:10, John Spencer wrote: >>> >>> On 09/17/2011 01:00 AM, Pedro Alves wrote: >>>> >>>> These are only built natively on solaris and aix respectively, so >>>> let's just leave them alone. >>>> >>> I expected it to be desirable for a product in industrial use to be >>> standard-compliant and not invoking undefined behavior. >> >> Those files are tied to those platforms' thread_db/libc implementations. >> There's absolutely no need to handle some other hipotetical libc that >> defines pthread_t diferently there. If it appears, we'll handle it. > >> Chances are, some other changes would be necessary to make it really >> work, not just build. >> > > exactly. for example in musl's case it is wrong to compare the underlying > type (which is a struct pointer) with 0. FWIW, hurd used to do something similar (well, in its case 0 was a valid, pthread_t, the first one in fact), I believe that enough things broke gthr from gcc, libobjc, possibly gdb (don't remember it though..) that they eventually changed their pthread_t even though it was compliant... ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 1:23 ` John Spencer 2011-09-17 2:29 ` Matt Rice @ 2011-09-17 6:47 ` Joel Brobecker 2011-09-18 0:11 ` John Spencer 2011-09-17 15:29 ` Pedro Alves 2 siblings, 1 reply; 19+ messages in thread From: Joel Brobecker @ 2011-09-17 6:47 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches, Pedro Alves > i disagree. adding a proper solution once is superior to creating > dozens of special case hacks. also it saves a lot of time in the long > term. Based on my experience so far, I have to agree with Pedro. But if you do have a superior solution, we'll take a look. I still have to tell you that I won't like it if your solution makes the code harder to read. Sometimes, that's OK because it helps achieving portability. But when there is no such need for portability, it just gets in the way. Let's also be pragmatic, here and fix the problems that we know we have. -- Joel ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 6:47 ` Joel Brobecker @ 2011-09-18 0:11 ` John Spencer 2011-09-29 2:31 ` John Spencer 0 siblings, 1 reply; 19+ messages in thread From: John Spencer @ 2011-09-18 0:11 UTC (permalink / raw) To: gdb-patches; +Cc: Joel Brobecker [-- Attachment #1: Type: text/plain, Size: 1270 bytes --] On 09/17/2011 04:28 AM, Joel Brobecker wrote: >> i disagree. adding a proper solution once is superior to creating >> dozens of special case hacks. also it saves a lot of time in the long >> term. > Based on my experience so far, I have to agree with Pedro. > But if you do have a superior solution, we'll take a look. > I still have to tell you that I won't like it if your solution > makes the code harder to read. Sometimes, that's OK because it > helps achieving portability. But when there is no such need for > portability, it just gets in the way. Let's also be pragmatic, > here and fix the problems that we know we have. > it would be nice to have an approach which would just work anywhere and respect the spec, however i currently have not the resources to implement it. thus, as you suggested, i came up with a small patch fixing the issue at hand. the solution should work on any sys/libc combination where sizeof(long) == sizeof(void*) and pthread_t is either a pointer or an int with sizeof equal or less than sizeof long. this should cover all currently existing combinations. i may find more spots in the future which need the application of the macro, and will eventually be sending more patches applying the conversion macro to it. -- JS [-- Attachment #2: commit-1 --] [-- Type: text/plain, Size: 3348 bytes --] commit 5254245d1e3dbbe43c23bbec2c67e10525f0155a Author: John Spencer <maillist-gdbpatches@barfooze.de> Date: Sat Sep 17 17:21:58 2011 +0200 use an explicit macro for pthread_t to number conversion. diff --git a/gdb/gdbserver/thread-db.c b/gdb/gdbserver/thread-db.c index 529516e..200e9e7 100644 --- a/gdb/gdbserver/thread-db.c +++ b/gdb/gdbserver/thread-db.c @@ -29,6 +29,7 @@ static int thread_db_use_events; #include "gdb_proc_service.h" #include "../gdb_thread_db.h" +#include "../threadtol.h" #ifndef USE_LIBTHREAD_DB_DIRECTLY #include <dlfcn.h> @@ -290,7 +291,7 @@ find_one_thread (ptid_t ptid) if (debug_threads) fprintf (stderr, "Found thread %ld (LWP %d)\n", - ti.ti_tid, ti.ti_lid); + THREADTOL(ti.ti_tid), ti.ti_lid); if (lwpid != ti.ti_lid) { @@ -309,7 +310,7 @@ find_one_thread (ptid_t ptid) /* If the new thread ID is zero, a final thread ID will be available later. Do not enable thread debugging yet. */ - if (ti.ti_tid == 0) + if (THREADTOL(ti.ti_tid) == 0) return 0; lwp->thread_known = 1; @@ -327,13 +328,13 @@ attach_thread (const td_thrhandle_t *th_p, td_thrinfo_t *ti_p) if (debug_threads) fprintf (stderr, "Attaching to thread %ld (LWP %d)\n", - ti_p->ti_tid, ti_p->ti_lid); + THREADTOL(ti_p->ti_tid), ti_p->ti_lid); linux_attach_lwp (ti_p->ti_lid); lwp = find_lwp_pid (pid_to_ptid (ti_p->ti_lid)); if (lwp == NULL) { warning ("Could not attach to thread %ld (LWP %d)\n", - ti_p->ti_tid, ti_p->ti_lid); + THREADTOL(ti_p->ti_tid), ti_p->ti_lid); return 0; } diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c index 005a34a..b7d6671 100644 --- a/gdb/linux-thread-db.c +++ b/gdb/linux-thread-db.c @@ -24,6 +24,7 @@ #include <dlfcn.h> #include "gdb_proc_service.h" #include "gdb_thread_db.h" +#include "threadtol.h" #include "bfd.h" #include "command.h" @@ -1059,7 +1060,7 @@ attach_thread (ptid_t ptid, const td_thrhandle_t *th_p, if we change GDB to always have at least one thread in the thread list this will have to go somewhere else; maybe private == NULL until the thread_db target claims it. */ - gdb_assert (ti_p->ti_tid != 0); + gdb_assert (THREADTOL(ti_p->ti_tid) != 0); private->th = *th_p; private->tid = ti_p->ti_tid; @@ -1335,7 +1336,7 @@ find_new_threads_callback (const td_thrhandle_t *th_p, void *data) if (ti.ti_state == TD_THR_UNKNOWN || ti.ti_state == TD_THR_ZOMBIE) return 0; /* A zombie -- ignore. */ - if (ti.ti_tid == 0 && target_has_execution) + if (THREADTOL(ti.ti_tid) == 0 && target_has_execution) { /* A thread ID of zero means that this is the main thread, but glibc has not yet initialized thread-local storage and the diff --git a/gdb/threadtol.h b/gdb/threadtol.h new file mode 100644 index 0000000..667b747 --- /dev/null +++ b/gdb/threadtol.h @@ -0,0 +1,13 @@ +#ifndef THREADTOL_H +#define THREADTOL_H + +/* macro for conversion of a pthread_t to a long, for numerical representation. + * pthread_t is a opaque type and it is in fact illegal to use it in a numeric + * context according to POSIX, but all libc implementations in existence + * use either a pointer or a number, so its safe to cast it to long + * for the moment. */ + +#define THREADTOL(X) ((long) (X)) + +#endif + ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-18 0:11 ` John Spencer @ 2011-09-29 2:31 ` John Spencer 2011-09-29 8:39 ` Kai Tietz 2011-09-29 11:10 ` Pedro Alves 0 siblings, 2 replies; 19+ messages in thread From: John Spencer @ 2011-09-29 2:31 UTC (permalink / raw) To: gdb-patches On 09/17/2011 05:43 PM, John Spencer wrote: > On 09/17/2011 04:28 AM, Joel Brobecker wrote: >>> i disagree. adding a proper solution once is superior to creating >>> dozens of special case hacks. also it saves a lot of time in the long >>> term. >> Based on my experience so far, I have to agree with Pedro. >> But if you do have a superior solution, we'll take a look. >> I still have to tell you that I won't like it if your solution >> makes the code harder to read. Sometimes, that's OK because it >> helps achieving portability. But when there is no such need for >> portability, it just gets in the way. Let's also be pragmatic, >> here and fix the problems that we know we have. >> > it would be nice to have an approach which would just work anywhere > and respect the spec, > however i currently have not the resources to implement it. > > thus, as you suggested, i came up with a small patch fixing the issue > at hand. the solution should work on any sys/libc combination where > sizeof(long) == sizeof(void*) and pthread_t is either a pointer or an > int with sizeof equal or less than sizeof long. > this should cover all currently existing combinations. > > i may find more spots in the future which need the application of the > macro, and will eventually be sending more patches applying the > conversion macro to it. > > -- JS > > knock, knock. anybody here ? please tell me wether you accept my patch or not. -- JS ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-29 2:31 ` John Spencer @ 2011-09-29 8:39 ` Kai Tietz 2011-10-01 1:08 ` John Spencer 2011-09-29 11:10 ` Pedro Alves 1 sibling, 1 reply; 19+ messages in thread From: Kai Tietz @ 2011-09-29 8:39 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches 2011/9/29 John Spencer <maillist-gdbpatches@barfooze.de>: > On 09/17/2011 05:43 PM, John Spencer wrote: >> >> On 09/17/2011 04:28 AM, Joel Brobecker wrote: >>>> >>>> i disagree. adding a proper solution once is superior to creating >>>> dozens of special case hacks. also it saves a lot of time in the long >>>> term. >>> >>> Based on my experience so far, I have to agree with Pedro. >>> But if you do have a superior solution, we'll take a look. >>> I still have to tell you that I won't like it if your solution >>> makes the code harder to read. Sometimes, that's OK because it >>> helps achieving portability. But when there is no such need for >>> portability, it just gets in the way. Let's also be pragmatic, >>> here and fix the problems that we know we have. >>> >> it would be nice to have an approach which would just work anywhere and >> respect the spec, >> however i currently have not the resources to implement it. >> >> thus, as you suggested, i came up with a small patch fixing the issue at >> hand. the solution should work on any sys/libc combination where >> sizeof(long) == sizeof(void*) and pthread_t is either a pointer or an int >> with sizeof equal or less than sizeof long. >> this should cover all currently existing combinations. >> >> i may find more spots in the future which need the application of the >> macro, and will eventually be sending more patches applying the conversion >> macro to it. >> >> -- JS >> >> > > knock, knock. anybody here ? > please tell me wether you accept my patch or not. > > -- JS Well, I can't approve this and I won't. But I would like to give me 5 cents for this approach. This approach seems to me bogus, as you are dependent to sizeof (long) == sizeof (void *), which is a broken attempt. It might be a way to use here instead intptr_t instead of long type. Nevertheless I admit that the pthread standard doesn't disallow structure-typed pthread_t, so it might be still worth to support this for gthread posix. For windows there is a pthread implementation "winpthread" - hosted by mingw-w64 project in experimental tree but it will be soon put into active trunk. This one uses here for pthread_t an integer-scalar handle instead of a structure, so issues about current implementation in gthread are working fine. Regards, Kai ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-29 8:39 ` Kai Tietz @ 2011-10-01 1:08 ` John Spencer 0 siblings, 0 replies; 19+ messages in thread From: John Spencer @ 2011-10-01 1:08 UTC (permalink / raw) To: Kai Tietz; +Cc: gdb-patches On 09/29/2011 10:26 AM, Kai Tietz wrote: > > Well, I can't approve this and I won't. But I would like to give me 5 > cents for this approach. This approach seems to me bogus, as you are > dependent to sizeof (long) == sizeof (void *), which is a broken > attempt. It might be a way to use here instead intptr_t instead of > long type. gdb assumes that the type of thread_t is a long. that is the bogus part. my macro just explicitly casts it to long. on archictectures where sizeof(void*) != sizeof(long) it would possible truncate or zeropad the value, but still return a (hopefully unique) number. if it isnt guaranteed to return a unique number on this not-yet-existing platform, there had to be some ifdef'd code for that. for libc's that use a struct type for pthread_t, there needs to be a specific workarounds, as others already pointed out. > Nevertheless I admit that the pthread standard doesn't > disallow structure-typed pthread_t, so it might be still worth to > support this for gthread posix. > > For windows there is a pthread implementation "winpthread" - hosted by > mingw-w64 project in experimental tree but it will be soon put into > active trunk. This one uses here for pthread_t an integer-scalar > handle instead of a structure, so issues about current implementation > in gthread are working fine. > > Regards, > Kai > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-29 2:31 ` John Spencer 2011-09-29 8:39 ` Kai Tietz @ 2011-09-29 11:10 ` Pedro Alves 1 sibling, 0 replies; 19+ messages in thread From: Pedro Alves @ 2011-09-29 11:10 UTC (permalink / raw) To: gdb-patches; +Cc: John Spencer On Thursday 29 September 2011 03:16:43, John Spencer wrote: > knock, knock. anybody here ? > please tell me wether you accept my patch or not. Did you see my last two emails? I didn't hear back from you on those. -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 1:23 ` John Spencer 2011-09-17 2:29 ` Matt Rice 2011-09-17 6:47 ` Joel Brobecker @ 2011-09-17 15:29 ` Pedro Alves 2011-09-17 15:47 ` Pedro Alves 2 siblings, 1 reply; 19+ messages in thread From: Pedro Alves @ 2011-09-17 15:29 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches On Saturday 17 September 2011 02:02:28, John Spencer wrote: > On 09/17/2011 02:30 AM, Pedro Alves wrote: > > On Saturday 17 September 2011 00:13:10, John Spencer wrote: > >> On 09/17/2011 01:00 AM, Pedro Alves wrote: > >>> These are only built natively on solaris and aix respectively, so > >>> let's just leave them alone. > >>> > >> I expected it to be desirable for a product in industrial use to be > >> standard-compliant and not invoking undefined behavior. > > Those files are tied to those platforms' thread_db/libc implementations. > > There's absolutely no need to handle some other hipotetical libc that > > defines pthread_t diferently there. If it appears, we'll handle it. > > > Chances are, some other changes would be necessary to make it really > > work, not just build. > > > > exactly. No, you're completely missing the point. The code in question is not _using_ posix threads, as a normal application would. It is interfacing with the system's libc to get at the thread support library's internals, and letting the debugger control and instead the state of threads. There's no portability requirement here. The major requirement is that the code works with the libc it is intended to be interfacing with. pthread_t on a given system is not _required_ to be an arithmetic type, but it certainly can. Sorry, I strongly object to changing the solaris or AIX thread support just because you've learned that on some other systems pthread_t may not be numeric. > for example in musl's case it is wrong to compare the > underlying type (which is a struct pointer) with 0. musl does not support solaris nor aix... > in the implementation specific part it would practically be valid to > compare it with > NULL, but that's still invalid regarding the definition in POSIX. The gdb code in question _is_ peeking at implementation specific parts! That's the point that's apparently gone missing. E.g., with such a program being debugged: pthread_t thread; res = pthread_create (&thread, NULL, entry_func, NULL); ... You can do: (gdb) p /x thread $2 = 0x7ffff7829700 And correlate that with the thread in gdb's thread list: (gdb) info threads 3 Thread 0x7ffff7028700 (LWP 29643) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 2 Thread 0x7ffff7829700 (LWP 29640) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 * 1 Thread 0x7ffff7fcd720 (LWP 29618) main () at threads.c:50 See, it's thread 2. If on some other platform pthread_t is struct pthread_t { int foo; int bar; }; Then we'll need to have specific code in gdb's libthread_db.so handling to print that. But on some other platform, it could be, struct pthread_t { long tid; int some_other_id_who_knows; }; and again, we'd need some code to be able to print this. So, you see, as we find platforms that need adapting to, we'll do it, but it doesn't make much sense to do it before stumbling on them. > so the few places that use a thread id had to be changed everywhere, to > not have dozens of special case > sections for any libc/OS combination... Again, let's not invent work until we find such a combination... There's no way you can know what such a libc/thread_db's pthread_t would look like until it exists. > also there are a few spots where the thread_t is numerically compared to > some arbitrary int value > (i have not yet evaluated where those originate from). Let us know when you find them... Maybe you're thinking of this in gdbserver/thread-db.c, where we have: /* If the new thread ID is zero, a final thread ID will be available later. Do not enable thread debugging yet. */ if (ti.ti_tid == 0) return 0; but again, this is an implementation detail of glibc that gdb needs to know about! There's no portable null pthread_t, and even if there was, there's no telling what other libc's do in this scenario. Again, there's _no_ way to make this portable, because we _are_ part of the implementation. Code doing the same in gdb, where the comments are a bit more clear: if (ti.ti_tid == 0) { /* A thread ID of zero means that this is the main thread, but glibc has not yet initialized thread-local storage and the pthread library. We do not know what the thread's TID will be yet. Just enable event reporting and otherwise ignore it. */ > this also can't > work when thread_t is no number. > this happens for example in the if statement leading to the posted > printf format string errors. > i expect that the number of actual fixes would be small, but it might > require another way to get to a thread id. > > if the id is only used for textual representation, it would be > sufficient if gdb would assign a new unique number > on each successful pthread_create. > if it is also used to find a thread internally, it had to have a list of > tid numbers -> pthread_t. Again, when we have a libc/thread_db that requires that, we'll do it. Until then, we'll keep the code simple, thank you. > > > >>>> thread-db.c: In function 'find_one_thread': > >>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but > >>>> argument 3 has type 'thread_t' > >>>> thread-db.c: In function 'attach_thread': > >>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but > >>>> argument 3 has type 'thread_t' > >>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but > >>>> argument 2 has type 'thread_t' > >>> So just cast it to long, and you're done. > >>> > >> pthread_t could legally be a struct, which you can't just cast to a long. > > No need to complicate things for an hipotetical scenario. The set of > > libc's in existence is finite. If we were to handle a struct pthread_t, > > we'd need to be able to print it, and so we'd need some libc specific > > way to do it, something autoconf'ed. There's no need to invent work. > > > i disagree. adding a proper solution once is superior to creating dozens > of special case hacks. What's the "proper" solution then? The only possible way I can think of to handle some libc what uses a struct for pthread_t, is with some way at runtime to know which libc/thread_db gdb is talking to, failing that, #ifdefs and/or autoconfig'ury. Since a simple cast will work for glibc, android, musl, and whatever other libc's happen be be building gdbserver today, what's the point? > also it saves a lot of time in the long term. No, it wastes time right now coming up with a fix for an made up problem that does not exist in practice today. -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 15:29 ` Pedro Alves @ 2011-09-17 15:47 ` Pedro Alves 2011-10-01 1:02 ` John Spencer 0 siblings, 1 reply; 19+ messages in thread From: Pedro Alves @ 2011-09-17 15:47 UTC (permalink / raw) To: gdb-patches; +Cc: John Spencer On Saturday 17 September 2011 15:58:33, Pedro Alves wrote: > On Saturday 17 September 2011 02:02:28, John Spencer wrote: > > On 09/17/2011 02:30 AM, Pedro Alves wrote: > > > On Saturday 17 September 2011 00:13:10, John Spencer wrote: > > >> On 09/17/2011 01:00 AM, Pedro Alves wrote: > > >>> These are only built natively on solaris and aix respectively, so > > >>> let's just leave them alone. > > >>> > > >> I expected it to be desirable for a product in industrial use to be > > >> standard-compliant and not invoking undefined behavior. > > > Those files are tied to those platforms' thread_db/libc implementations. > > > There's absolutely no need to handle some other hipotetical libc that > > > defines pthread_t diferently there. If it appears, we'll handle it. > > > > > Chances are, some other changes would be necessary to make it really > > > work, not just build. > > > > > > > exactly. > > No, you're completely missing the point. The code in question is > not _using_ posix threads, as a normal application would. > It is interfacing with the system's libc to get at the thread support > library's internals, and letting the debugger control and instead the state > of threads. There's no portability requirement here. The major requirement > is that the code works with the libc it is intended to be interfacing with. > > pthread_t on a given system is not _required_ to be an arithmetic type, but > it certainly can. Sorry, I strongly object to changing the solaris or AIX > thread support just because you've learned that on some other systems > pthread_t may not be numeric. > > > for example in musl's case it is wrong to compare the > > underlying type (which is a struct pointer) with 0. > > musl does not support solaris nor aix... > > > in the implementation specific part it would practically be valid to > > compare it with > > NULL, but that's still invalid regarding the definition in POSIX. > > The gdb code in question _is_ peeking at implementation specific > parts! That's the point that's apparently gone missing. > > E.g., with such a program being debugged: > > pthread_t thread; > res = pthread_create (&thread, NULL, entry_func, NULL); > ... > > You can do: > > (gdb) p /x thread > $2 = 0x7ffff7829700 > > And correlate that with the thread in gdb's thread list: > > (gdb) info threads > 3 Thread 0x7ffff7028700 (LWP 29643) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 > 2 Thread 0x7ffff7829700 (LWP 29640) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 > * 1 Thread 0x7ffff7fcd720 (LWP 29618) main () at threads.c:50 > > See, it's thread 2. If on some other platform pthread_t is > > struct pthread_t { int foo; int bar; }; > > Then we'll need to have specific code in gdb's libthread_db.so handling > to print that. But on some other platform, it could be, > > struct pthread_t { long tid; int some_other_id_who_knows; }; > > and again, we'd need some code to be able to print this. > > So, you see, as we find platforms that need adapting to, we'll > do it, but it doesn't make much sense to do it before stumbling > on them. > > > so the few places that use a thread id had to be changed everywhere, to > > not have dozens of special case > > sections for any libc/OS combination... > > Again, let's not invent work until we find such a combination... > There's no way you can know what such a libc/thread_db's pthread_t > would look like until it exists. > > > also there are a few spots where the thread_t is numerically compared to > > some arbitrary int value > > (i have not yet evaluated where those originate from). > > Let us know when you find them... > > Maybe you're thinking of this in gdbserver/thread-db.c, where we > have: > > /* If the new thread ID is zero, a final thread ID will be available > later. Do not enable thread debugging yet. */ > if (ti.ti_tid == 0) > return 0; > > but again, this is an implementation detail of glibc that gdb needs > to know about! There's no portable null pthread_t, and even if > there was, there's no telling what other libc's do in this scenario. > Again, there's _no_ way to make this portable, because we _are_ part > of the implementation. > > Code doing the same in gdb, where the comments are a bit more clear: > > if (ti.ti_tid == 0) > { > /* A thread ID of zero means that this is the main thread, but > glibc has not yet initialized thread-local storage and the > pthread library. We do not know what the thread's TID will > be yet. Just enable event reporting and otherwise ignore > it. */ > > > this also can't > > work when thread_t is no number. > > this happens for example in the if statement leading to the posted > > printf format string errors. > > > > i expect that the number of actual fixes would be small, but it might > > require another way to get to a thread id. > > > > if the id is only used for textual representation, it would be > > sufficient if gdb would assign a new unique number > > on each successful pthread_create. > > if it is also used to find a thread internally, it had to have a list of > > tid numbers -> pthread_t. > > Again, when we have a libc/thread_db that requires that, we'll do it. > Until then, we'll keep the code simple, thank you. > > > > > > > >>>> thread-db.c: In function 'find_one_thread': > > >>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but > > >>>> argument 3 has type 'thread_t' > > >>>> thread-db.c: In function 'attach_thread': > > >>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but > > >>>> argument 3 has type 'thread_t' > > >>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but > > >>>> argument 2 has type 'thread_t' > > >>> So just cast it to long, and you're done. > > >>> > > >> pthread_t could legally be a struct, which you can't just cast to a long. > > > No need to complicate things for an hipotetical scenario. The set of > > > libc's in existence is finite. If we were to handle a struct pthread_t, > > > we'd need to be able to print it, and so we'd need some libc specific > > > way to do it, something autoconf'ed. There's no need to invent work. > > > > > i disagree. adding a proper solution once is superior to creating dozens > > of special case hacks. > > What's the "proper" solution then? The only possible way I can think > of to handle some libc what uses a struct for pthread_t, is with some way at > runtime to know which libc/thread_db gdb is talking to, failing that, > #ifdefs and/or autoconfig'ury. Since a simple cast will > work for glibc, android, musl, and whatever other libc's happen be > be building gdbserver today, what's the point? I'll even go a step further, and suggest you do expose pthread_t as a typedef to long in public headers, and cast it to pointer internally in musl, just like glibc and most other libc's do. Hiding the fact that you implement things with a pointer could even be a good thing. It's just easier for everybody else that way. There's really no point in being different for the sake of it. > > > also it saves a lot of time in the long term. > > No, it wastes time right now coming up with a fix for an made up > problem that does not exist in practice today. > > -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-09-17 15:47 ` Pedro Alves @ 2011-10-01 1:02 ` John Spencer 2011-10-01 2:00 ` Pedro Alves 0 siblings, 1 reply; 19+ messages in thread From: John Spencer @ 2011-10-01 1:02 UTC (permalink / raw) To: gdb-patches; +Cc: Pedro Alves On 09/17/2011 05:29 PM, Pedro Alves wrote: > On Saturday 17 September 2011 15:58:33, Pedro Alves wrote: >> On Saturday 17 September 2011 02:02:28, John Spencer wrote: >>> On 09/17/2011 02:30 AM, Pedro Alves wrote: >>>> On Saturday 17 September 2011 00:13:10, John Spencer wrote: >>>>> On 09/17/2011 01:00 AM, Pedro Alves wrote: >>>>>> These are only built natively on solaris and aix respectively, so >>>>>> let's just leave them alone. >>>>>> >>>>> I expected it to be desirable for a product in industrial use to be >>>>> standard-compliant and not invoking undefined behavior. >>>> Those files are tied to those platforms' thread_db/libc implementations. >>>> There's absolutely no need to handle some other hipotetical libc that >>>> defines pthread_t diferently there. If it appears, we'll handle it. >>>> Chances are, some other changes would be necessary to make it really >>>> work, not just build. >>>> >>> exactly. >> No, you're completely missing the point. The code in question is >> not _using_ posix threads, as a normal application would. >> It is interfacing with the system's libc to get at the thread support >> library's internals, and letting the debugger control and instead the state >> of threads. There's no portability requirement here. The major requirement >> is that the code works with the libc it is intended to be interfacing with. >> >> pthread_t on a given system is not _required_ to be an arithmetic type, but >> it certainly can. Sorry, I strongly object to changing the solaris or AIX >> thread support just because you've learned that on some other systems >> pthread_t may not be numeric. >> >>> for example in musl's case it is wrong to compare the >>> underlying type (which is a struct pointer) with 0. >> musl does not support solaris nor aix... >> >>> in the implementation specific part it would practically be valid to >>> compare it with >>> NULL, but that's still invalid regarding the definition in POSIX. >> The gdb code in question _is_ peeking at implementation specific >> parts! That's the point that's apparently gone missing. >> >> E.g., with such a program being debugged: >> >> pthread_t thread; >> res = pthread_create (&thread, NULL, entry_func, NULL); >> ... >> >> You can do: >> >> (gdb) p /x thread >> $2 = 0x7ffff7829700 >> >> And correlate that with the thread in gdb's thread list: >> >> (gdb) info threads >> 3 Thread 0x7ffff7028700 (LWP 29643) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 >> 2 Thread 0x7ffff7829700 (LWP 29640) 0x00007ffff78d75ad in nanosleep () at ../sysdeps/unix/syscall-template.S:82 >> * 1 Thread 0x7ffff7fcd720 (LWP 29618) main () at threads.c:50 >> >> See, it's thread 2. If on some other platform pthread_t is >> >> struct pthread_t { int foo; int bar; }; >> >> Then we'll need to have specific code in gdb's libthread_db.so handling >> to print that. But on some other platform, it could be, >> >> struct pthread_t { long tid; int some_other_id_who_knows; }; >> >> and again, we'd need some code to be able to print this. >> >> So, you see, as we find platforms that need adapting to, we'll >> do it, but it doesn't make much sense to do it before stumbling >> on them. >> >>> so the few places that use a thread id had to be changed everywhere, to >>> not have dozens of special case >>> sections for any libc/OS combination... >> Again, let's not invent work until we find such a combination... >> There's no way you can know what such a libc/thread_db's pthread_t >> would look like until it exists. >> >>> also there are a few spots where the thread_t is numerically compared to >>> some arbitrary int value >>> (i have not yet evaluated where those originate from). >> Let us know when you find them... >> >> Maybe you're thinking of this in gdbserver/thread-db.c, where we >> have: >> >> /* If the new thread ID is zero, a final thread ID will be available >> later. Do not enable thread debugging yet. */ >> if (ti.ti_tid == 0) >> return 0; >> >> but again, this is an implementation detail of glibc that gdb needs >> to know about! There's no portable null pthread_t, and even if >> there was, there's no telling what other libc's do in this scenario. >> Again, there's _no_ way to make this portable, because we _are_ part >> of the implementation. >> >> Code doing the same in gdb, where the comments are a bit more clear: >> >> if (ti.ti_tid == 0) >> { >> /* A thread ID of zero means that this is the main thread, but >> glibc has not yet initialized thread-local storage and the >> pthread library. We do not know what the thread's TID will >> be yet. Just enable event reporting and otherwise ignore >> it. */ >> >>> this also can't >>> work when thread_t is no number. >>> this happens for example in the if statement leading to the posted >>> printf format string errors. >> >>> i expect that the number of actual fixes would be small, but it might >>> require another way to get to a thread id. >>> >>> if the id is only used for textual representation, it would be >>> sufficient if gdb would assign a new unique number >>> on each successful pthread_create. >>> if it is also used to find a thread internally, it had to have a list of >>> tid numbers -> pthread_t. >> Again, when we have a libc/thread_db that requires that, we'll do it. >> Until then, we'll keep the code simple, thank you. >> >>> >>>>>>> thread-db.c: In function 'find_one_thread': >>>>>>> thread-db.c:295:7: error: format '%ld' expects type 'long int', but >>>>>>> argument 3 has type 'thread_t' >>>>>>> thread-db.c: In function 'attach_thread': >>>>>>> thread-db.c:335:7: error: format '%ld' expects type 'long int', but >>>>>>> argument 3 has type 'thread_t' >>>>>>> thread-db.c:341:9: error: format '%ld' expects type 'long int', but >>>>>>> argument 2 has type 'thread_t' >>>>>> So just cast it to long, and you're done. >>>>>> >>>>> pthread_t could legally be a struct, which you can't just cast to a long. >>>> No need to complicate things for an hipotetical scenario. The set of >>>> libc's in existence is finite. If we were to handle a struct pthread_t, >>>> we'd need to be able to print it, and so we'd need some libc specific >>>> way to do it, something autoconf'ed. There's no need to invent work. >>>> >>> i disagree. adding a proper solution once is superior to creating dozens >>> of special case hacks. >> What's the "proper" solution then? The only possible way I can think >> of to handle some libc what uses a struct for pthread_t, is with some way at >> runtime to know which libc/thread_db gdb is talking to, failing that, >> #ifdefs and/or autoconfig'ury. Since a simple cast will >> work for glibc, android, musl, and whatever other libc's happen be >> be building gdbserver today, what's the point? > I'll even go a step further, and suggest you do expose pthread_t > as a typedef to long in public headers, and cast it to pointer internally > in musl, just like glibc and most other libc's do. Hiding the fact > that you implement things with a pointer could even be a good thing. > > It's just easier for everybody else that way. There's really no point in > being different for the sake of it. > that's definitely going to far, you can't dictate how libcs should handle an opaque type internally just to keep broken code working. since it seems to much effort to change it to a proper solution, there should be at least a explicit function/macro which takes a thread_t and converts it to long, since it is assumed in a couple of spots that it is of this type. that is exactly what my patch does. and as you wished, it fixes the current issue with minimal effort. >>> also it saves a lot of time in the long term. >> No, it wastes time right now coming up with a fix for an made up >> problem that does not exist in practice today. >> >> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-10-01 1:02 ` John Spencer @ 2011-10-01 2:00 ` Pedro Alves 2011-10-01 9:00 ` Mark Kettenis 2011-10-05 8:02 ` John Spencer 0 siblings, 2 replies; 19+ messages in thread From: Pedro Alves @ 2011-10-01 2:00 UTC (permalink / raw) To: John Spencer; +Cc: gdb-patches On Saturday 01 October 2011 01:59:25, John Spencer wrote: > >>> i disagree. adding a proper solution once is superior to creating dozens > >>> of special case hacks. > >> What's the "proper" solution then? The only possible way I can think > >> of to handle some libc what uses a struct for pthread_t, is with some way at > >> runtime to know which libc/thread_db gdb is talking to, failing that, > >> #ifdefs and/or autoconfig'ury. Since a simple cast will > >> work for glibc, android, musl, and whatever other libc's happen be > >> be building gdbserver today, what's the point? > > I'll even go a step further, and suggest you do expose pthread_t > > as a typedef to long in public headers, and cast it to pointer internally > > in musl, just like glibc and most other libc's do. Hiding the fact > > that you implement things with a pointer could even be a good thing. > > > > It's just easier for everybody else that way. There's really no point in > > being different for the sake of it. > > > > that's definitely going to far, you can't dictate how libcs should > handle an opaque type internally just to keep broken code working. The fact that you're still calling gdb's libthread_db.so handling code broken because of this only shows that you still missed the point of the code you're talking about. Let me sum it up: The debugger is coordinating with a libc-provided shared library, that is used to peek at the libc's internal state. pthread_t's type is implementation defined. The debugger's code in question is interacting with a specific libc implementation (glibc). For that fact, the debugger's code in question can be seen as being part of "the implementation" (as in the pthread_t's type being implementation defined part). The fact that other libcs work with the same code is a bonus. Again: > that's definitely going to far, you can't dictate how libcs should > handle an opaque type internally just to keep broken code working. I can at least try to persuade one libc author into sanity. :-) Ever heard of not inflicting unnecessary pain on others for no good reason? You know people do sprinkle around printfs of pthread_t's for debug purposes, don't you? Things like printing printf ("%ld been here\n", pthread_self ()). I really don't see the point of not being source compatible with glibc (and all other linux libcs) when it's super easy to be so... Again: > that's definitely going to far, you can't dictate how libcs should > handle an opaque type internally just to keep broken code working. public header: typedef unsigned long pthread_t; musl implementation: struct pthread { whatever you have today; }; int pthread_foo_public_function (pthread_t th, ...) { struct pthread *t = (pthread_t) th; // business as usual. } There, how did that dictate how your libc handles its opaque type _internally_ again? It shouldn't take more than 15 minutes to change all of that in musl, and everyone lives happy ever after. > there > should be at least a explicit function/macro which takes a thread_t and > converts it to long, since it is assumed in a couple of spots that it is > of this type. > that is exactly what my patch does. > > and as you wished, it fixes the current issue with minimal effort. The patch has a number of problems (no biggie, just the usual for someone not used to GNU code). I'll take a look if I still failed to convince you to change musl instead. -- Pedro Alves ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-10-01 2:00 ` Pedro Alves @ 2011-10-01 9:00 ` Mark Kettenis 2011-10-01 9:14 ` Kevin Pouget 2011-10-05 8:02 ` John Spencer 1 sibling, 1 reply; 19+ messages in thread From: Mark Kettenis @ 2011-10-01 9:00 UTC (permalink / raw) To: pedro; +Cc: maillist-gdbpatches, gdb-patches > From: Pedro Alves <pedro@codesourcery.com> > Date: Sat, 1 Oct 2011 02:59:59 +0100 > > > there > > should be at least a explicit function/macro which takes a thread_t and > > converts it to long, since it is assumed in a couple of spots that it is > > of this type. > > that is exactly what my patch does. > > > > and as you wished, it fixes the current issue with minimal effort. > > The patch has a number of problems (no biggie, just the usual for > someone not used to GNU code). I'll take a look if I still failed > to convince you to change musl instead. No, I think you shouldn't. This whole madness with a zillion Linux libc's has to stop. We can't add support for each and every one of them. I think we should take the position that if people want thread support for their non-standard libc's in GDB they should provide a libthread_db.so that is ABI compatible with the one provided by glibc. Since pthread_t is part of that ABI, that means pthread_t has to be "unsigned long int". ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-10-01 9:00 ` Mark Kettenis @ 2011-10-01 9:14 ` Kevin Pouget 0 siblings, 0 replies; 19+ messages in thread From: Kevin Pouget @ 2011-10-01 9:14 UTC (permalink / raw) To: maillist-gdbpatches; +Cc: gdb-patches On Sat, Oct 1, 2011 at 11:00 AM, Mark Kettenis <mark.kettenis@xs4all.nl> wrote: >> From: Pedro Alves <pedro@codesourcery.com> >> Date: Sat, 1 Oct 2011 02:59:59 +0100 >> >> > there >> > should be at least a explicit function/macro which takes a thread_t and >> > converts it to long, since it is assumed in a couple of spots that it is >> > of this type. >> > that is exactly what my patch does. >> > >> > and as you wished, it fixes the current issue with minimal effort. >> >> The patch has a number of problems (no biggie, just the usual for >> someone not used to GNU code). I'll take a look if I still failed >> to convince you to change musl instead. > > No, I think you shouldn't. This whole madness with a zillion Linux > libc's has to stop. We can't add support for each and every one of > them. I think we should take the position that if people want thread > support for their non-standard libc's in GDB they should provide a > libthread_db.so that is ABI compatible with the one provided by glibc. > Since pthread_t is part of that ABI, that means pthread_t has to be > "unsigned long int". Just my 2 cents here, but a few years ago, I developped a library named "User-Level Thread_db" [1], which aims at reducing the burden of thread-library developpers when it comes to provide debuggers support. > they should provide a libthread_db.so that is ABI compatible with the one provided by glibc I took the opposite approach: I developped the `libthread_db.so' and the thread-target on GDB's side, and then specified a very light API to link `libthread_db.so' to the threading library. The appoach was demonstrated efficient on two different libraries, just a few days requied for the integration. That was on the 6.8 days, so some adaptations should be necessary for 7.*, but most of the work is already in place. Cordially, Kevin [1] http://mpc.sourceforge.net/files/PouPerCarJou10MTAAP.pdf ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: wrong assumptions about pthread_t being numeric 2011-10-01 2:00 ` Pedro Alves 2011-10-01 9:00 ` Mark Kettenis @ 2011-10-05 8:02 ` John Spencer 1 sibling, 0 replies; 19+ messages in thread From: John Spencer @ 2011-10-05 8:02 UTC (permalink / raw) To: gdb-patches; +Cc: Pedro Alves On 10/01/2011 03:59 AM, Pedro Alves wrote: > On Saturday 01 October 2011 01:59:25, John Spencer wrote: >>>>> i disagree. adding a proper solution once is superior to creating dozens >>>>> of special case hacks. >>>> What's the "proper" solution then? The only possible way I can think >>>> of to handle some libc what uses a struct for pthread_t, is with some way at >>>> runtime to know which libc/thread_db gdb is talking to, failing that, >>>> #ifdefs and/or autoconfig'ury. Since a simple cast will >>>> work for glibc, android, musl, and whatever other libc's happen be >>>> be building gdbserver today, what's the point? >>> I'll even go a step further, and suggest you do expose pthread_t >>> as a typedef to long in public headers, and cast it to pointer internally >>> in musl, just like glibc and most other libc's do. Hiding the fact >>> that you implement things with a pointer could even be a good thing. >>> >>> It's just easier for everybody else that way. There's really no point in >>> being different for the sake of it. >>> >> that's definitely going to far, you can't dictate how libcs should >> handle an opaque type internally just to keep broken code working. > The fact that you're still calling gdb's libthread_db.so handling > code broken because of this only shows that you still missed the point > of the code you're talking about. Let me sum it up: > The debugger is coordinating with a libc-provided shared library, that > is used to peek at the libc's internal state. that's the disturbing part. there are cleaner ways to get a unique number without poking at libc internals. however, the error has been made and it is much effort to correct it, so let's continue patching for the time being. > pthread_t's type is > implementation defined. The debugger's code in question is interacting > with a specific libc implementation (glibc). For that fact, the > debugger's code in question can be seen as being part of "the > implementation" (as in the pthread_t's type being implementation > defined part). The fact that other libcs work with the same code > is a bonus. > > Again: > >> that's definitely going to far, you can't dictate how libcs should >> handle an opaque type internally just to keep broken code working. > I can at least try to persuade one libc author into sanity. :-) i'm not musl's author, just helping to get some important packages built ;) i doubt that the author will change his implementation, since the solution you proposed will induce a performance penalty on some compilers, and result in less strict compile-time type checks. > Ever heard of not inflicting unnecessary pain on others for no > good reason? You know people do sprinkle around printfs of > pthread_t's for debug purposes, don't you? Things like printing > printf ("%ld been here\n", pthread_self ()). I really don't see > the point of not being source compatible with glibc (and all other > linux libcs) when it's super easy to be so... > > Again: > >> that's definitely going to far, you can't dictate how libcs should >> handle an opaque type internally just to keep broken code working. > public header: > > typedef unsigned long pthread_t; > > musl implementation: > > struct pthread > { > whatever you have today; > }; > > int pthread_foo_public_function (pthread_t th, ...) > { > struct pthread *t = (pthread_t) th; > > // business as usual. > } > > There, how did that dictate how your libc handles its > opaque type _internally_ again? It shouldn't take more than > 15 minutes to change all of that in musl, and everyone lives > happy ever after. > >> there >> should be at least a explicit function/macro which takes a thread_t and >> converts it to long, since it is assumed in a couple of spots that it is >> of this type. >> that is exactly what my patch does. >> >> and as you wished, it fixes the current issue with minimal effort. > The patch has a number of problems (no biggie, just the usual for > someone not used to GNU code). I'll take a look if I still failed > to convince you to change musl instead. > yes, please take a look at the patch. it imposes no performance costs and makes the code more correct. also if a new platform shows up, one can simply add a single ifdef to the macro and all code accessing the thread_t in a numeric context will magically work. -- JS ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2011-10-05 8:02 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-09-16 23:01 wrong assumptions about pthread_t being numeric John Spencer 2011-09-16 23:16 ` Pedro Alves 2011-09-17 0:31 ` John Spencer 2011-09-17 1:05 ` Pedro Alves 2011-09-17 1:23 ` John Spencer 2011-09-17 2:29 ` Matt Rice 2011-09-17 6:47 ` Joel Brobecker 2011-09-18 0:11 ` John Spencer 2011-09-29 2:31 ` John Spencer 2011-09-29 8:39 ` Kai Tietz 2011-10-01 1:08 ` John Spencer 2011-09-29 11:10 ` Pedro Alves 2011-09-17 15:29 ` Pedro Alves 2011-09-17 15:47 ` Pedro Alves 2011-10-01 1:02 ` John Spencer 2011-10-01 2:00 ` Pedro Alves 2011-10-01 9:00 ` Mark Kettenis 2011-10-01 9:14 ` Kevin Pouget 2011-10-05 8:02 ` John Spencer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox