Section handler thread eating too much cpu

Message ID 200505231939.40269.zzam@gmx.de
State New
Headers

Commit Message

Matthias Schwarzott May 23, 2005, 5:39 p.m. UTC
  Hi there,
over the last few days I have come across a strange phenomenon: one vdr thread eats up to
45% CPU on my Pentium 3 700MHz system when I do nothing with vdr except
live viewing on a FF card without transfer mode. Some digging resulted in
this:
- The thread is the section handler thread.
- The CPU load depends on the transponder currently being watched.

The Pro7/Sat1 transponder was the one with the highest load. The worst case is to
record Pro7 and watch it live at the same moment: the recording goes over the
second card, resulting in 2 section handler threads, one for each card, together
eating the whole CPU.


Attached is a patch that simply adds one sleep(1) into the loop before the poll.
This reduces the CPU load from 45% to 1.3%, and it does not seem
to lose any sections.

If you need more data to reproduce this:
I have a Gentoo system without NPTL, using the newest CVS dvb driver. The
system itself has an FF 1.3 card and a SkyStar 2.6B card.

Matthias
  

Comments

Wolfgang Rohdewald May 24, 2005, 1:10 a.m. UTC | #1
On Montag 23 Mai 2005 19:39, Matthias Schwarzott wrote:
> in the last days I come accross a strange phenomen: One vdr thread eats up to 
> 45% cpu on my Pentium3 700Mhz system when I did nothing with my vdr except 
> live viewing on a ff card without transfer mode. Some digging resulted in 
> this:
> - The thread is the section handler thread.
> - The cpu-load depends on the transponder currently watching on.

This might be the same issue I had; see my thread "trying to find causes for high cpu usage"
from March 23.

I have debian unstable, kernel 2.6.11, Hauppauge Nexus-S 2.1 and a Budget card.
  
Darren Salt May 24, 2005, 3:31 p.m. UTC | #2
I demand that Matthias Schwarzott may or may not have written...

> in the last days I come accross a strange phenomen: One vdr thread eats up
> to 45% cpu on my Pentium3 700Mhz system when I did nothing with my vdr
> except live viewing on a ff card without transfer mode. Some digging
> resulted in this:
> - The thread is the section handler thread.
> - The cpu-load depends on the transponder currently watching on.
[snip]
> Attached is a Patch to simply add one sleep(1) inte the loop before the
> poll. This results in reducing the cpu-load from 45% to 1.3%. And it does
> not seem to lose any sections.

If you replace that sleep(1) with sched_yield(), do you see the same effect
wrt CPU load?
  
Dr. Werner Fink May 24, 2005, 5:15 p.m. UTC | #3
On Tue, May 24, 2005 at 04:31:31PM +0100, Darren Salt wrote:
> I demand that Matthias Schwarzott may or may not have written...
> 
> > in the last days I come accross a strange phenomen: One vdr thread eats up
> > to 45% cpu on my Pentium3 700Mhz system when I did nothing with my vdr
> > except live viewing on a ff card without transfer mode. Some digging
> > resulted in this:
> > - The thread is the section handler thread.
> > - The cpu-load depends on the transponder currently watching on.
> [snip]
> > Attached is a Patch to simply add one sleep(1) inte the loop before the
> > poll. This results in reducing the cpu-load from 45% to 1.3%. And it does
> > not seem to lose any sections.
> 
> If you replace that sleep(1) with sched_yield(), do you see the same effect
> wrt CPU load?

Wouldn't pthread_yield() be the better solution?

          Werner
  
Matthias Schwarzott May 24, 2005, 9:01 p.m. UTC | #4
On Tuesday 24 May 2005 19:15, Dr. Werner Fink wrote:
> On Tue, May 24, 2005 at 04:31:31PM +0100, Darren Salt wrote:
> > I demand that Matthias Schwarzott may or may not have written...
> >
> > > in the last days I come accross a strange phenomen: One vdr thread eats
> > > up to 45% cpu on my Pentium3 700Mhz system when I did nothing with my
> > > vdr except live viewing on a ff card without transfer mode. Some
> > > digging resulted in this:
> > > - The thread is the section handler thread.
> > > - The cpu-load depends on the transponder currently watching on.
> >
> > [snip]
> >
> > > Attached is a Patch to simply add one sleep(1) inte the loop before the
> > > poll. This results in reducing the cpu-load from 45% to 1.3%. And it
> > > does not seem to lose any sections.
> >
> > If you replace that sleep(1) with sched_yield(), do you see the same
> > effect wrt CPU load?
>
> Wouldn't be pthread_yield() the better solution?
>

I tested sched_yield() and pthread_yield(), and neither changed anything. The 
CPU load is the same as with an unchanged vdr.

Matthias
  
Dr. Werner Fink May 25, 2005, 9:22 a.m. UTC | #5
On Tue, May 24, 2005 at 11:01:50PM +0200, Matthias Schwarzott wrote:
> On Tuesday 24 May 2005 19:15, Dr. Werner Fink wrote:
> > On Tue, May 24, 2005 at 04:31:31PM +0100, Darren Salt wrote:
> > > I demand that Matthias Schwarzott may or may not have written...
> > >
> > > > in the last days I come accross a strange phenomen: One vdr thread eats
> > > > up to 45% cpu on my Pentium3 700Mhz system when I did nothing with my
> > > > vdr except live viewing on a ff card without transfer mode. Some
> > > > digging resulted in this:
> > > > - The thread is the section handler thread.
> > > > - The cpu-load depends on the transponder currently watching on.
> > >
> > > [snip]
> > >
> > > > Attached is a Patch to simply add one sleep(1) inte the loop before the
> > > > poll. This results in reducing the cpu-load from 45% to 1.3%. And it
> > > > does not seem to lose any sections.
> > >
> > > If you replace that sleep(1) with sched_yield(), do you see the same
> > > effect wrt CPU load?
> >
> > Wouldn't be pthread_yield() the better solution?
> >
> 
> I tested sched_yield and pthread_yield and both did not change anything. The 
> cpu load is the same as with an unchanged vdr.

Then two questions arise: how often is this point in the code reached,
and how much of the total time is spent in the code after this point?
Maybe a solution with a condition variable, broadcast once a few
sections have been received, would help wake up this point.


        Werner
  
matthieu castet May 25, 2005, 9:43 a.m. UTC | #6
Hi,

> On Tue, May 24, 2005 at 11:01:50PM +0200, Matthias Schwarzott wrote:
> > On Tuesday 24 May 2005 19:15, Dr. Werner Fink wrote:
> > > On Tue, May 24, 2005 at 04:31:31PM +0100, Darren Salt wrote:
> > > > I demand that Matthias Schwarzott may or may not have written...
> > > >
> > > > > in the last days I come accross a strange phenomen: One vdr thread
> eats
> > > > > up to 45% cpu on my Pentium3 700Mhz system when I did nothing with my
> > > > > vdr except live viewing on a ff card without transfer mode. Some
> > > > > digging resulted in this:
> > > > > - The thread is the section handler thread.
> > > > > - The cpu-load depends on the transponder currently watching on.
> > > >
> > > > [snip]
> > > >
> > > > > Attached is a Patch to simply add one sleep(1) inte the loop before
> the
> > > > > poll. This results in reducing the cpu-load from 45% to 1.3%. And it
> > > > > does not seem to lose any sections.
> > > >
> > > > If you replace that sleep(1) with sched_yield(), do you see the same
> > > > effect wrt CPU load?
> > >
> > > Wouldn't be pthread_yield() the better solution?
> > >
> >
> > I tested sched_yield and pthread_yield and both did not change anything.
> The
> > cpu load is the same as with an unchanged vdr.
>

sched_yield() cannot work here: if vdr loops on it, the Linux scheduler
will think vdr is an interactive process and will increase its priority,
which makes things worse.

A solution could be to use nanosleep() instead of sched_yield().


Matthieu
  
Dr. Werner Fink May 25, 2005, 10:21 a.m. UTC | #7
On Wed, May 25, 2005 at 11:43:49AM +0200, castet.matthieu@free.fr wrote:
> 
> sched_yield() could not work, because if vdr loop on it, the linux scheduler
> will think that vdr is an interactive process and will increase its priority
> and it will be worse.
> 
> A solution could be to use nanosleep(1) instead of sched_yield().

nanosleep() within threaded programs is emulated with the
help of busy loops ... you do not want this  ;^)

        Werner
  
Wolfgang Rohdewald May 25, 2005, 12:03 p.m. UTC | #8
On Mittwoch 25 Mai 2005 11:22, Dr. Werner Fink wrote:
> Then two question rises: How often is this point of code reached

I posted data on 23.3.05: about 1000 times in 3 seconds per DVB card.

I cannot test anything right now, my FF card is dead.
  
Stefan Huelswitt May 25, 2005, 2:38 p.m. UTC | #9
On 25 May 2005 "Dr. Werner Fink" <werner@suse.de> wrote:
> On Tue, May 24, 2005 at 11:01:50PM +0200, Matthias Schwarzott wrote:
>> On Tuesday 24 May 2005 19:15, Dr. Werner Fink wrote:
>> > On Tue, May 24, 2005 at 04:31:31PM +0100, Darren Salt wrote:
>> > > I demand that Matthias Schwarzott may or may not have written...
>> > >
>> > > > in the last days I come accross a strange phenomen: One vdr thread eats
>> > > > up to 45% cpu on my Pentium3 700Mhz system when I did nothing with my
>> > > > vdr except live viewing on a ff card without transfer mode. Some
>> > > > digging resulted in this:
>> > > > - The thread is the section handler thread.
>> > > > - The cpu-load depends on the transponder currently watching on.
>> > >
>> > > [snip]
>> > >
>> > > > Attached is a Patch to simply add one sleep(1) inte the loop before the
>> > > > poll. This results in reducing the cpu-load from 45% to 1.3%. And it
>> > > > does not seem to lose any sections.
>> > >
>> > > If you replace that sleep(1) with sched_yield(), do you see the same
>> > > effect wrt CPU load?
>> >
>> > Wouldn't be pthread_yield() the better solution?
>> >
>> 
>> I tested sched_yield and pthread_yield and both did not change anything. The 
>> cpu load is the same as with an unchanged vdr.
> 
> Then two question rises: How often is this point of code reached
> and how often is the code after this point used all over the time?
> Maybe a solution with a condition variable to use a broadcast
> if a few sections are received for waking up the point would help.

As far as I can see, only one section is read from a handle at a time,
so the complete poll loop has to run multiple times even if
several sections are already waiting.
So maybe a solution would be to loop on a single handle
until all sections are read (this must work, as the handles are
non-blocking).

Regards.
  
Matthias Schwarzott May 25, 2005, 3:11 p.m. UTC | #10
On Wednesday 25 May 2005 12:21, Dr. Werner Fink wrote:
> On Wed, May 25, 2005 at 11:43:49AM +0200, castet.matthieu@free.fr wrote:
> > sched_yield() could not work, because if vdr loop on it, the linux
> > scheduler will think that vdr is an interactive process and will increase
> > its priority and it will be worse.
> >
> > A solution could be to use nanosleep(1) instead of sched_yield().
>
> nanosleep within threaded programs are emulated with the
> help of busy loops ... you do not want this  ;^)
>
It does not help to sleep inside, as sections get lost that way. I added a few 
debug statements to this thread. Here are my results without a sleep of any 
type:

channel     ~sections/s   CPU load in % on P3 700MHz
ARD         300           10
ZDF         100           3
RTL         40            2
Pro7/Sat1   330           46
VIVA        25            2
VIVA			25			2


Changing cFilter::match to reject every section changes the load to 0%.

Changing EIT::Process to do nothing with the sections changes the load from 
46% to 2%. This means 44 of the 46 percentage points are burned in 
cEITFilter::Process, in the switch-case 0x12.

Now the shocking result for me:
If I insert a return in cEIT::cEIT before the GetByChannelID call, the load 
goes to 2%.
If I add the return after the GetByChannelID call, the load stays at 46%.

This routine is O(n), and my channels.conf contains 2750 channels.
After cutting channels.conf down to contain only Pro7, the section handler 
thread's load is 3%.

Matthias
  
Luca Olivetti May 25, 2005, 5:23 p.m. UTC | #11
Matthias Schwarzott wrote:
> 
> Now the shoking result for me:
> If i insert a return in cEIT::cEIT before the GetByChannelID the load goes to 
> 2%.
> If I add the return in cEIT::cEIT after the GetByChannelID the load stays at 
> 46%.
> 
> This routine is O(n) and my channels.conf contains 2750 channels.
> Cutting down the channels.conf to contain only Pro7 load of section handler 
> thread is 3%.

Wow. Since I have ~5000 lines in channels.conf, even for channels that 
only transmit now & next the load goes from around 18% (with the 
GetByChannelID) to 2% (without).
Not that I'm a C++ expert (far from it), but maybe using some STL type 
(map?) for cChannels could help here?

Bye
  
Klaus Schmidinger May 25, 2005, 5:29 p.m. UTC | #12
Luca Olivetti wrote:
> Matthias Schwarzott wrote:
> 
>>
>> Now the shoking result for me:
>> If i insert a return in cEIT::cEIT before the GetByChannelID the load 
>> goes to 2%.
>> If I add the return in cEIT::cEIT after the GetByChannelID the load 
>> stays at 46%.
>>
>> This routine is O(n) and my channels.conf contains 2750 channels.
>> Cutting down the channels.conf to contain only Pro7 load of section 
>> handler thread is 3%.
> 
> 
> Wow. Since I have ~5000 lines in channels conf even for channels that 
> are only transmitting now & next the load goes from around 18% (with the 
> GetByChannelID) to 2% (without).
> Not that I'm a C++ expert (far from it), but maybe using some stl type 
> (map?) for cChannels could help here?

Something in that direction was sent to me by the Reelbox people,
and I'll look into it in the next few days.

Klaus
  
Georg Acher May 25, 2005, 5:30 p.m. UTC | #13
On Wed, May 25, 2005 at 05:11:05PM +0200, Matthias Schwarzott wrote:
 
> Now the shoking result for me:
> If i insert a return in cEIT::cEIT before the GetByChannelID the load goes to 
> 2%.
> If I add the return in cEIT::cEIT after the GetByChannelID the load stays at 
> 46%.
> 
> This routine is O(n) and my channels.conf contains 2750 channels.
> Cutting down the channels.conf to contain only Pro7 load of section handler 
> thread is 3%.

A patch for this (and other somewhat inefficient code, especially on a 300MHz
Geode ;-) ) was already sent to Klaus. It introduces a few hash lists in
parallel to the channel and event structures. With an empty event list, the
ARD transponder needs about 40% CPU (though the thread is also niced); after a
while the load goes down to about 1% (both numbers on the Geode system).

http://www.vdrportal.de/board/thread.php?postid=310366#post310366

But beware: it is only tested on 1.3.21 and contains some other experimental
patches...
  

Patch

diff -ur vdr-1.3.24-vanilla/sections.c vdr-1.3.24/sections.c
--- vdr-1.3.24-vanilla/sections.c	2005-05-23 19:19:47.000000000 +0200
+++ vdr-1.3.24/sections.c	2005-05-23 19:20:04.000000000 +0200
@@ -183,6 +183,7 @@ 
         int oldStatusCount = statusCount;
         Unlock();
 
+        sleep(1);
         if (poll(pfd, NumFilters, 1000) > 0) {
            bool DeviceHasLock = device->HasLock();
            if (!DeviceHasLock)