VDR gets somehow stuck and consumes all CPU time

Message ID 20051203193614.GA6623@linuxtv.org
State New

Commit Message

Johannes Stezenbach Dec. 3, 2005, 7:36 p.m. UTC
  On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
> (AFAIK with NPTL all threads
> of a given program have the same pid, so you won't be able to
> distinguish them in 'top').

This is not entirely true, you can still see and distinguish
the threads in htop or "ps -T u -C vdr" etc. (top does not work).

The patch below might help, gettid() returns the PID of the thread. (And
since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
on 2.6 only, but the gettid man page says it's available in 2.4.20.
gettid() is Linux specific.)

Johannes
  

Comments

Klaus Schmidinger Dec. 4, 2005, 9:37 a.m. UTC | #1
Johannes Stezenbach wrote:
> On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
> 
>>(AFAIK with NPTL all threads
>>of a given program have the same pid, so you won't be able to
>>distinguish them in 'top').
> 
> 
> This is not entirely true, you can still see and distinguish
> the threads in htop or "ps -T u -C vdr" etc. (top does not work).
> 
> The patch below might help, gettid() returns the PID of the thread. (And
> since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
> on 2.6 only, but the gettid man page says it's available in 2.4.20.
> gettid() is Linux specific.)

Does this "gettid" call return a different tid than "pthread_self()"?

I'm just wondering because the introduction of "pthread_self()" was one
of the things we had to change to make VDR run with NPTL...

Klaus

> Johannes
> 
> --- vdr-1.3.37/thread.c.orig	2005-12-03 19:52:38.000000000 +0100
> +++ vdr-1.3.37/thread.c	2005-12-03 20:12:47.000000000 +0100
> @@ -17,6 +17,11 @@
>  #include <unistd.h>
>  #include "tools.h"
>  
> +static inline pid_t gettid(void)
> +{
> +  return (pid_t) syscall(224);
> +}
> +
>  static bool GetAbsTime(struct timespec *Abstime, int MillisecondsFromNow)
>  {
>    struct timeval now;
> @@ -231,10 +236,10 @@ void cThread::SetDescription(const char 
>  void *cThread::StartThread(cThread *Thread)
>  {
>    if (Thread->description)
> -     dsyslog("%s thread started (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
> +     dsyslog("%s thread started (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
>    Thread->Action();
>    if (Thread->description)
> -     dsyslog("%s thread ended (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
> +     dsyslog("%s thread ended (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
>    Thread->running = false;
>    Thread->active = false;
>    return NULL;
  
Gerald Raaf Dec. 4, 2005, 10:26 a.m. UTC | #2
On Sunday, 4 December 2005 10:37, Klaus Schmidinger wrote:
> Johannes Stezenbach wrote:
> > On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
> >>(AFAIK with NPTL all threads
> >>of a given program have the same pid, so you won't be able to
> >>distinguish them in 'top').
> >
> > This is not entirely true, you can still see and distinguish
> > the threads in htop or "ps -T u -C vdr" etc. (top does not work).
> >
> > The patch below might help, gettid() returns the PID of the thread. (And
> > since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
> > on 2.6 only, but the gettid man page says it's available in 2.4.20.
> > gettid() is Linux specific.)
>
> Does this "gettid" call return a different tid than "pthread_self()"?
>

pthread_self() sample output log
[2005/12/04 11:14:02] vdr vdr[27644]: tuner on device 3 thread started (pid=27644, tid=-1265644624)
sample output ps -T u -C vdr
root@vdr:~# ps -T u -C vdr
USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     27644 27644  3.2  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27742  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27743  0.0  2.5 129332 26624 ?        RNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27745  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27746  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27748  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27749  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27750  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27751  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27752  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     27644 27753  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -


gettid() sample output log with patch from Johannes Stezenbach
[2005/12/04 11:16:15] vdr vdr[29989]: tuner on device 3 thread started (pid=29989, tid=30086)
sample output ps -T u -C vdr
root@vdr:~# ps -T u -C vdr
USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     29989 29989  0.1  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30080  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30081  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30083  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30084  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30086  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30087  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30088  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30089  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30090  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
root     29989 30091  0.0  2.5 129380 26640 ?        Rl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -

My system is an NPTL-only system (Linux From Scratch) with vdr-1.3.37.

> I'm just wondering because the introduction of "pthread_self()" was one
> of the things we had to change to make VDR run with NPTL...
>
> Klaus
>
> > Johannes
> >
> > --- vdr-1.3.37/thread.c.orig	2005-12-03 19:52:38.000000000 +0100
> > +++ vdr-1.3.37/thread.c	2005-12-03 20:12:47.000000000 +0100
> > @@ -17,6 +17,11 @@
> >  #include <unistd.h>
> >  #include "tools.h"
> >
> > +static inline pid_t gettid(void)
> > +{
> > +  return (pid_t) syscall(224);
> > +}
> > +
> >  static bool GetAbsTime(struct timespec *Abstime, int MillisecondsFromNow)
> >  {
> >    struct timeval now;
> > @@ -231,10 +236,10 @@ void cThread::SetDescription(const char
> >  void *cThread::StartThread(cThread *Thread)
> >  {
> >    if (Thread->description)
> > -     dsyslog("%s thread started (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
> > +     dsyslog("%s thread started (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
> >    Thread->Action();
> >    if (Thread->description)
> > -     dsyslog("%s thread ended (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
> > +     dsyslog("%s thread ended (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
> >    Thread->running = false;
> >    Thread->active = false;
> >    return NULL;
  
Klaus Schmidinger Dec. 4, 2005, 11:12 a.m. UTC | #3
Gerald Raaf wrote:
> On Sunday, 4 December 2005 10:37, Klaus Schmidinger wrote:
> 
>>Johannes Stezenbach wrote:
>>
>>>On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
>>>
>>>>(AFAIK with NPTL all threads
>>>>of a given program have the same pid, so you won't be able to
>>>>distinguish them in 'top').
>>>
>>>This is not entirely true, you can still see and distinguish
>>>the threads in htop or "ps -T u -C vdr" etc. (top does not work).
>>>
>>>The patch below might help, gettid() returns the PID of the thread. (And
>>>since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
>>>on 2.6 only, but the gettid man page says it's available in 2.4.20.
>>>gettid() is Linux specific.)
>>
>>Does this "gettid" call return a different tid than "pthread_self()"?
>>
> 
> 
> pthread_self() sample output log
> [2005/12/04 11:14:02] vdr vdr[27644]: tuner on device 3 thread started (pid=27644, tid=-1265644624)
> sample output ps -T u -C vdr
> root@vdr:~# ps -T u -C vdr
> USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root     27644 27644  3.2  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27742  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27743  0.0  2.5 129332 26624 ?        RNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27745  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27746  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27748  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27749  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27750  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27751  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27752  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     27644 27753  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> 
> 
> gettid() sample output log with patch from Johannes Stezenbach
> [2005/12/04 11:16:15] vdr vdr[29989]: tuner on device 3 thread started (pid=29989, tid=30086)
> sample output ps -T u -C vdr
> root@vdr:~# ps -T u -C vdr
> USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root     29989 29989  0.1  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30080  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30081  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30083  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30084  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30086  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30087  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30088  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30089  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30090  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> root     29989 30091  0.0  2.5 129380 26640 ?        Rl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
> 
> My system is an NPTL-only system (Linux From Scratch) with vdr-1.3.37.

Well, this shows me that on such a system all VDR threads have the same PID,
but different SPIDs. I don't think that this has in any way changed through
Johannes' patch.

What would be interesting would be the log excerpts corresponding to these
test runs, where the VDR threads log their 'pid' and 'tid' values.
Maybe you could post these, too?

Klaus
  
Klaus Schmidinger Dec. 4, 2005, 11:14 a.m. UTC | #4
Klaus Schmidinger wrote:
> Gerald Raaf wrote:
> 
>> On Sunday, 4 December 2005 10:37, Klaus Schmidinger wrote:
>>
>>> Johannes Stezenbach wrote:
>>>
>>>> On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
>>>>
>>>>> (AFAIK with NPTL all threads
>>>>> of a given program have the same pid, so you won't be able to
>>>>> distinguish them in 'top').
>>>>
>>>>
>>>> This is not entirely true, you can still see and distinguish
>>>> the threads in htop or "ps -T u -C vdr" etc. (top does not work).
>>>>
>>>> The patch below might help, gettid() returns the PID of the thread. (And
>>>> since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
>>>> on 2.6 only, but the gettid man page says it's available in 2.4.20.
>>>> gettid() is Linux specific.)
>>>
>>>
>>> Does this "gettid" call return a different tid than "pthread_self()"?
>>>
>>
>>
>> pthread_self() sample output log
>> [2005/12/04 11:14:02] vdr vdr[27644]: tuner on device 3 thread started (pid=27644, tid=-1265644624)
>> sample output ps -T u -C vdr
>> root@vdr:~# ps -T u -C vdr
>> USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> root     27644 27644  3.2  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27742  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27743  0.0  2.5 129332 26624 ?        RNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27745  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27746  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27748  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27749  0.0  2.5 129332 26624 ?        SNl  11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27750  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27751  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27752  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     27644 27753  0.0  2.5 129332 26624 ?        Sl   11:14   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>>
>>
>> gettid() sample output log with patch from Johannes Stezenbach
>> [2005/12/04 11:16:15] vdr vdr[29989]: tuner on device 3 thread started (pid=29989, tid=30086)
>> sample output ps -T u -C vdr
>> root@vdr:~# ps -T u -C vdr
>> USER       PID  SPID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> root     29989 29989  0.1  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30080  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30081  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30083  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30084  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30086  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30087  0.0  2.5 129380 26640 ?        SNl  11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30088  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30089  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30090  0.0  2.5 129380 26640 ?        Sl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>> root     29989 30091  0.0  2.5 129380 26640 ?        Rl   11:16   0:00 ./vdr -w 60 -s /usr/local/bin/stopvdr -r /usr/local/bin/record.sh -
>>
>> My system is an NPTL-only system (Linux From Scratch) with vdr-1.3.37.
> 
> 
> Well, this shows me that on such a system all VDR threads have the same PID,
> but different SPIDs. I don't think that this has in any way changed through
> Johannes' patch.
> 
> What would be interesting would be the log excerpts corresponding to these
> test runs, where the VDR threads log their 'pid' and 'tid' values.
> Maybe you could post these, too?

Aaaahhhrrrgghhhh - right after sending this message I saw that you
had actually posted these. They must have slipped by my eyes, sorry.

Klaus
  
Johannes Stezenbach Dec. 4, 2005, 2:53 p.m. UTC | #5
On Sun, Dec 04, 2005 at 10:37:42AM +0100, Klaus Schmidinger wrote:
> Johannes Stezenbach wrote:
> >On Sat, Dec 03, 2005, Klaus Schmidinger wrote:
> >
> >>(AFAIK with NPTL all threads
> >>of a given program have the same pid, so you won't be able to
> >>distinguish them in 'top').
> >
> >
> >This is not entirely true, you can still see and distinguish
> >the threads in htop or "ps -T u -C vdr" etc. (top does not work).
> >
> >The patch below might help, gettid() returns the PID of the thread. (And
> >since it's a syscall it is independent of NPTL vs. linuxthreads. Tested
> >on 2.6 only, but the gettid man page says it's available in 2.4.20.
> >gettid() is Linux specific.)
> 
> Does this "gettid" call return a different tid than "pthread_self()"?
> 
> I'm just wondering because the introduction of "pthread_self()" was one
> of the things we had to change to make VDR run with NPTL...

pthread_self() has to be used within programs to identify
the threads. gettid() is more a debugging aid as the return value
of pthread_self() has no meaning outside the context of the running
program. (Funny that glibc doesn't have a syscall wrapper for
gettid(); dietlibc has.)

Johannes
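
To make the difference concrete, here is a minimal sketch (not part of the patch) that records the process ID and the kernel thread ID as seen from a newly created thread. The names KernelThreadId, ThreadIds, ReportIds and IdsFromNewThread are made up for this illustration, and the sketch uses the SYS_gettid constant from <sys/syscall.h> rather than a hardcoded number. The helper is deliberately not called gettid(), since newer C libraries may declare a function of that name themselves.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: asks the kernel for the calling thread's ID.
static pid_t KernelThreadId()
{
  return static_cast<pid_t>(syscall(SYS_gettid));
}

// Values observed from inside one thread.
struct ThreadIds {
  pid_t pid;  // result of getpid()
  pid_t tid;  // result of the gettid syscall
};

// Thread entry point: stores what this thread sees into *arg.
static void *ReportIds(void *arg)
{
  ThreadIds *ids = static_cast<ThreadIds *>(arg);
  ids->pid = getpid();
  ids->tid = KernelThreadId();
  return nullptr;
}

// Starts one thread, waits for it, and returns the IDs it reported.
// Both fields stay at -1 if the thread could not be created.
static ThreadIds IdsFromNewThread()
{
  ThreadIds ids = { -1, -1 };
  pthread_t thread;
  if (pthread_create(&thread, nullptr, ReportIds, &ids) == 0)
     pthread_join(thread, nullptr);
  return ids;
}
```

On an NPTL system the pid reported by the new thread should equal the main program's getpid(), while the tid differs from the main thread's and is the kind of value that appears in the SPID column of "ps -T".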
  
Paavo Hartikainen Dec. 5, 2005, 2:57 a.m. UTC | #6
Johannes Stezenbach <js@linuxtv.org> writes:

> The patch below might help, gettid() returns the PID of the
> thread. (And since it's a syscall it is independent of NPTL
> vs. linuxthreads. Tested on 2.6 only, but the gettid man page says
> it's available in 2.4.20.  gettid() is Linux specific.)

Before patch:
---
Dec  3 05:27:30 sarabi vdr[22722]: tuner on device 1 thread started (pid=22722, tid=822543584)
Dec  3 05:27:32 sarabi vdr[22722]: tuner on device 2 thread started (pid=22722, tid=814154976)
---

After patch:
---
Dec  5 04:31:46 sarabi vdr[27624]: tuner on device 1 thread started (pid=27624, tid=-1)
Dec  5 04:31:48 sarabi vdr[27624]: tuner on device 2 thread started (pid=27624, tid=-1)
---

After patch and "env LD_ASSUME_KERNEL=2.4.1":
---
Dec  5 04:35:01 sarabi vdr[27641]: tuner on device 1 thread started (pid=27641, tid=-1)
Dec  5 04:35:03 sarabi vdr[27644]: tuner on device 2 thread started (pid=27644, tid=-1)
---

I assumed that "make clean" is not required after patching, just
"make".

Now just waiting for threads to get wildly active.
  
Klaus Schmidinger Dec. 10, 2005, 11:05 a.m. UTC | #7
Johannes Stezenbach wrote:
> ...
> pthread_self() has to be used within programs to identify
> the threads. gettid() is more a debugging aid as the return value
> of pthread_self() has no meaning outside the context of the running
> program. (Funny that glibc doesn't have a syscall wrapper for
> gettid(); dietlibc has.)
> 
> Johannes

Thanks, this appears to work just fine.

I assume it's ok to use the SYS_gettid macro, as in



#include <sys/syscall.h>

static inline pid_t gettid(void)
{
   return syscall(SYS_gettid);
}



instead of the hardcoded 224.

Klaus
  
Klaus Schmidinger Dec. 10, 2005, 11:13 a.m. UTC | #8
Paavo Hartikainen wrote:
> Johannes Stezenbach <js@linuxtv.org> writes:
> 
> 
>>The patch below might help, gettid() returns the PID of the
>>thread. (And since it's a syscall it is independent of NPTL
>>vs. linuxthreads. Tested on 2.6 only, but the gettid man page says
>>it's available in 2.4.20.  gettid() is Linux specific.)
> 
> 
> Before patch:
> ---
> Dec  3 05:27:30 sarabi vdr[22722]: tuner on device 1 thread started (pid=22722, tid=822543584)
> Dec  3 05:27:32 sarabi vdr[22722]: tuner on device 2 thread started (pid=22722, tid=814154976)
> ---
> 
> After patch:
> ---
> Dec  5 04:31:46 sarabi vdr[27624]: tuner on device 1 thread started (pid=27624, tid=-1)
> Dec  5 04:31:48 sarabi vdr[27624]: tuner on device 2 thread started (pid=27624, tid=-1)
> ---
> 
> After patch and "env LD_ASSUME_KERNEL=2.4.1":
> ---
> Dec  5 04:35:01 sarabi vdr[27641]: tuner on device 1 thread started (pid=27641, tid=-1)
> Dec  5 04:35:03 sarabi vdr[27644]: tuner on device 2 thread started (pid=27644, tid=-1)
> ---

This would indicate that the gettid() call has failed (returned -1).

What system is this happening on?
I tested it here on a SUSE 8.2 (kernel 2.4.20) and SUSE 10.0 (kernel 2.6.13),
and it worked fine on both of them. On the kernel 2.4.20 system it
returned the same value as the getpid() call, and on the 2.6.13 system
it returned pid values that were counted upwards from the main thread's
pid.

Klaus
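
One way to narrow down a -1 result is to record errno together with the return value. The sketch below is only an illustration under stated assumptions: TidResult, QueryThreadId() and LogThreadId() are hypothetical names, not VDR code, and it writes to stderr instead of using dsyslog(). Note that syscall numbers are architecture specific; 224 is the gettid number on i386, so a hardcoded 224 can select a different syscall elsewhere, whereas SYS_gettid resolves to the right number at compile time.

```cpp
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical result type for this sketch.
struct TidResult {
  pid_t tid;  // kernel thread ID, or -1 if the syscall failed
  int error;  // errno captured right after the call, 0 on success
};

// Performs the gettid syscall and keeps errno if it fails.
static TidResult QueryThreadId()
{
  TidResult result;
  errno = 0;
  result.tid = static_cast<pid_t>(syscall(SYS_gettid));
  result.error = (result.tid == -1) ? errno : 0;
  return result;
}

// Prints either the IDs or the reason the syscall failed.
static void LogThreadId(const char *description)
{
  TidResult r = QueryThreadId();
  if (r.tid == -1)
     fprintf(stderr, "%s: gettid failed: %s\n", description, strerror(r.error));
  else
     fprintf(stderr, "%s: pid=%d, tid=%d\n", description, (int)getpid(), (int)r.tid);
}
```

A log line that reports the error text instead of tid=-1 would show directly whether the kernel rejected the call or whether a wrong syscall number was used.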
  
Johannes Stezenbach Dec. 10, 2005, 4:42 p.m. UTC | #9
Klaus Schmidinger wrote:
> Johannes Stezenbach wrote:
> >...
> >pthread_self() has to be used within programs to identify
> >the threads. gettid() is more a debugging aid as the return value
> >of pthread_self() has no meaning outside the context of the running
> >program. (Funny that glibc doesn't have a syscall wrapper for
> >gettid(); dietlibc has.)
> 
> Thanks, this appears to work just fine.
> 
> I assume it's ok to use the SYS_gettid macro, as in
> 
> 
> 
> #include <sys/syscall.h>
> 
> static inline pid_t gettid(void)
> {
>   return syscall(SYS_gettid);
> }
> 
> 
> 
> instead of the hardcoded 224.

The man page actually suggests:
http://homepages.cwi.nl/~aeb/linux/man2html/man2/gettid.2.html

#include <sys/types.h>
#include <linux/unistd.h>

_syscall0(pid_t,gettid)

(I just made a mistake and included <unistd.h> instead of <linux/unistd.h>
so it didn't work when I initially tested it.)

Johannes
  
Klaus Schmidinger Dec. 10, 2005, 5:05 p.m. UTC | #10
Johannes Stezenbach wrote:
> Klaus Schmidinger wrote:
> 
>>Johannes Stezenbach wrote:
>>
>>>...
>>>pthread_self() has to be used within programs to identify
>>>the threads. gettid() is more a debugging aid as the return value
>>>of pthread_self() has no meaning outside the context of the running
>>>program. (Funny that glibc doesn't have a syscall wrapper for
>>>gettid(); dietlibc has.)
>>
>>Thanks, this appears to work just fine.
>>
>>I assume it's ok to use the SYS_gettid macro, as in
>>
>>
>>
>>#include <sys/syscall.h>
>>
>>static inline pid_t gettid(void)
>>{
>>  return syscall(SYS_gettid);
>>}
>>
>>
>>
>>instead of the hardcoded 224.
> 
> 
> The man page actually suggests:
> http://homepages.cwi.nl/~aeb/linux/man2html/man2/gettid.2.html
> 
> #include <sys/types.h>
> #include <linux/unistd.h>
> 
> _syscall0(pid_t,gettid)
> 
> (I just made a mistake and included <unistd.h> instead of <linux/unistd.h>
> so it didn't work when I initially tested it.)
> 
> Johannes

Ok, I'll make it that way then.

Do you think we need a fallback solution, just in case the syscall fails?
Paavo Hartikainen's posting (12/05/05 03:57) would indicate that this
can happen.

Maybe we should use pthread_self() in case gettid() returns -1, and
use pthread_t to store such values, because it's large enough to hold
pid_t as well as pthread_t.

Or should we make this system dependent (with #ifdef)?
So that non-Linux systems can provide a different solution.

Klaus
  
Johannes Stezenbach Dec. 10, 2005, 5:50 p.m. UTC | #11
On Sat, Dec 10, 2005, Klaus Schmidinger wrote:
> Do you think we need a fallback solution, just in case the syscall fails?
> Paavo Hartikainen's posting (12/05/05 03:57) would indicate that this
> can happen.
> 
> Maybe we should use pthread_self() in case gettid() returns -1, and
> use pthread_t to store such values, because it's large enough to hold
> pid_t as well as pthread_t.
> 
> Or should we make this system dependent (with #ifdef)?
> So that non-Linux systems can provide a different solution.

I think the pthread_self() return value is pretty useless in debug output.

If gettid() fails then you probably have an old kernel and getpid()
already gives you what you want (a PID you can match with top output).

Johannes
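
Put together, the fallback discussed here could look roughly like the sketch below. This is an assumption about how the pieces might be combined, not the code that went into VDR: ThreadIdForLog() is a hypothetical name, and the __linux__ check stands in for whatever #ifdef VDR would actually use. If the syscall is unavailable or fails, the function returns getpid() instead.

```cpp
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/syscall.h>
#endif

// Hypothetical helper: returns an ID suitable for debug output.
// Prefers the kernel thread ID; falls back to the process ID when
// the gettid syscall is not available or reports an error.
static pid_t ThreadIdForLog()
{
#if defined(__linux__) && defined(SYS_gettid)
  long tid = syscall(SYS_gettid);
  if (tid != -1)
     return static_cast<pid_t>(tid);
#endif
  return getpid();
}
```

With such a helper the log call from the patch could pass ThreadIdForLog() in place of gettid() and keep the "%d" format, since the result is always a pid_t.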
  

Patch

--- vdr-1.3.37/thread.c.orig	2005-12-03 19:52:38.000000000 +0100
+++ vdr-1.3.37/thread.c	2005-12-03 20:12:47.000000000 +0100
@@ -17,6 +17,11 @@ 
 #include <unistd.h>
 #include "tools.h"
 
+static inline pid_t gettid(void)
+{
+  return (pid_t) syscall(224);
+}
+
 static bool GetAbsTime(struct timespec *Abstime, int MillisecondsFromNow)
 {
   struct timeval now;
@@ -231,10 +236,10 @@  void cThread::SetDescription(const char 
 void *cThread::StartThread(cThread *Thread)
 {
   if (Thread->description)
-     dsyslog("%s thread started (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
+     dsyslog("%s thread started (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
   Thread->Action();
   if (Thread->description)
-     dsyslog("%s thread ended (pid=%d, tid=%ld)", Thread->description, getpid(), pthread_self());
+     dsyslog("%s thread ended (pid=%d, tid=%d)", Thread->description, getpid(), gettid());
   Thread->running = false;
   Thread->active = false;
   return NULL;