SVDRP ignores EOF (clientside close)

Message ID 20060720100309.M33662@schmirler.de
State New
Headers

Commit Message

Frank Schmirler July 20, 2006, 10:03 a.m. UTC
  On Wed, 19 Jul 2006 12:59:36 +0200, Frank Schmirler wrote
> I would like to raise an issue with SVDRP. When the client doesn't 
> send QUIT but simply closes the connection, VDR happily ignores the 
> EOF returned by read. The socket will remain in state CLOSE_WAIT 
> until it's finally closed due to the SVDRP timeout. In the meantime 
> the SVDRP port remains blocked for other clients.

An update on this issue: with the implementation used before vdr-1.2.2 you
*will* get a broken pipe message, as cSVDRP::Close() tries to send a goodby
message to the client. This is not a problem, but as a "clean" solution I
would suggest:

         isyslog("timeout on SVDRP connection");


Cheers,
Frank
  

Comments

Peter Dittmann July 24, 2006, 7:58 a.m. UTC | #1
vdr-bounces@linuxtv.org schrieb am 20.07.2006 12:03:09:

> On Wed, 19 Jul 2006 12:59:36 +0200, Frank Schmirler wrote
> > I would like to raise an issue with SVDRP. When the client doesn't 
> > send QUIT but simply closes the connection, VDR happily ignores the 
> > EOF returned by read. The socket will remain in state CLOSE_WAIT 
> > until it's finally closed due to the SVDRP timeout. In the meantime 
> > the SVDRP port remains blocked for other clients.
> 
> An update on this issue: with the implementation used before vdr-1.2.2 
you
> *will* get a broken pipe message, as cSVDRP::Close() tries to send a 
goodby
> message to the client. This is not a problem, but as a "clean" solution 
I
> would suggest:
> 
> --- svdrp.c.orig        2006-07-20 11:34:10.000000000 +0200
> +++ svdrp.c     2006-07-20 11:39:51.000000000 +0200
> @@ -1575,8 +1575,11 @@
>                isyslog("lost connection to SVDRP client");
>                Close();
>                }
> -           else
> -              break;
> +           else {
> +              isyslog("SVDRP client closed connection");
> +              file.Close();
> +              DELETENULL(PUTEhandler);
> +              }
>             }
>       if (Setup.SVDRPTimeout && time(NULL) - lastActivity > Setup.SVDRP
> Timeout) {
>          isyslog("timeout on SVDRP connection");
> 
> 

I will try this workarounds by backporting it to my productive system.

Had another kernel crash this morning with stuck SVDRP:

Jul 24 04:09:03 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:09:44 vdr vdr[1399]: connect from 127.0.0.1, port 32781 - 
accepted
Jul 24 04:10:01 vdr /USR/SBIN/CRON[7313]: (root) CMD (test -x 
/usr/sbin/anacron || run-parts --report /etc/cr
Jul 24 04:10:08 vdr vdr[1399]: max. latency time 23 seconds
Jul 24 04:10:28 vdr vdr[1399]: timer 45 (55 2050-2110 'kinder~Bernd~BRAVO 
BERND~der etwas andere Sandmann') a

>> this shurely is VDRAdmin adding timers
>> looks like it takes 54 second here =0

Jul 24 04:10:38 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:10:38 vdr vdr[1399]: connect from 127.0.0.1, port 32782 - 
accepted
Jul 24 04:10:38 vdr vdr[1399]: ERROR (svdrp.c,365): Broken pipe
Jul 24 04:11:03 vdr vdr[1399]: connect from 127.0.0.1, port 32784 - 
accepted
Jul 24 04:11:03 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1415 of 
1500 data blocks [ 94.33%]'
Jul 24 04:11:03 vdr vdr[1399]: info: Infosat:Received 1415 of 1500 data 
blocks [ 94.33%]
Jul 24 04:11:04 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:12:03 vdr vdr[1399]: connect from 127.0.0.1, port 32785 - 
accepted
Jul 24 04:12:03 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1426 of 
1500 data blocks [ 95.07%]'
Jul 24 04:12:03 vdr vdr[1399]: info: Infosat:Received 1426 of 1500 data 
blocks [ 95.07%]
Jul 24 04:12:03 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:13:04 vdr vdr[1399]: connect from 127.0.0.1, port 32786 - 
accepted
Jul 24 04:13:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1440 of 
1500 data blocks [ 96.00%]'
Jul 24 04:13:04 vdr vdr[1399]: info: Infosat:Received 1440 of 1500 data 
blocks [ 96.00%]
Jul 24 04:13:04 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:14:04 vdr vdr[1399]: connect from 127.0.0.1, port 32787 - 
accepted
Jul 24 04:14:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1449 of 
1500 data blocks [ 96.60%]'
Jul 24 04:14:04 vdr vdr[1399]: info: Infosat:Received 1449 of 1500 data 
blocks [ 96.60%]
Jul 24 04:14:04 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:14:47 vdr vdr[1410]: changing caids of channel 531 from 100 to 0
Jul 24 04:15:04 vdr vdr[1399]: connect from 127.0.0.1, port 32788 - 
accepted
Jul 24 04:15:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1464 of 
1500 data blocks [ 97.60%]'
Jul 24 04:15:05 vdr vdr[1399]: info: Infosat:Received 1464 of 1500 data 
blocks [ 97.60%]
Jul 24 04:15:05 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:16:04 vdr vdr[1399]: connect from 127.0.0.1, port 32789 - 
accepted
Jul 24 04:16:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1472 of 
1500 data blocks [ 98.13%]'
Jul 24 04:16:05 vdr vdr[1399]: info: Infosat:Received 1472 of 1500 data 
blocks [ 98.13%]
Jul 24 04:16:05 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:17:04 vdr vdr[1399]: connect from 127.0.0.1, port 32790 - 
accepted
Jul 24 04:17:04 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1476 of 
1500 data blocks [ 98.40%]'
Jul 24 04:17:04 vdr vdr[1399]: info: Infosat:Received 1476 of 1500 data 
blocks [ 98.40%]
Jul 24 04:17:04 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:18:05 vdr vdr[1399]: connect from 127.0.0.1, port 32791 - 
accepted
Jul 24 04:18:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1481 of 
1500 data blocks [ 98.73%]'
Jul 24 04:18:05 vdr vdr[1399]: info: Infosat:Received 1481 of 1500 data 
blocks [ 98.73%]
Jul 24 04:18:05 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 - 
accepted
Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1490 of 
1500 data blocks [ 99.33%]'
Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500 data 
blocks [ 99.33%]
Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection
Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 - 
accepted
Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 of 
1500 data blocks [ 99.80%]'
Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 data 
blocks [ 99.80%]
Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection

>> this seems to be VDRAdin as it's not the same 60sec raster as the 
messages from infosatepg

Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - 
accepted
Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired - exiting!
Jul 24 04:21:43 vdr lircd 0.7.1pre2[839]: removed client
Jul 24 04:21:43 vdr vdr[2415]: KBD remote control thread ended (pid=2415, 
tid=163851)
Jul 24 04:21:43 vdr vdr[2404]: graphlcd plugin: Display update thread 
ended (pid=2404)
Jul 24 04:21:43 vdr runvdrdevel: restarting VDR

Watchdog is 60sec and SVDRP timeout is 30 sec.
VDRAdmin is 0.97-am3.3-2, vdr is 1.3.24 ctvdr-multipatch 
(please no request to test vanilla, this is the productive system and the 
crash comes every other month, no way I'm living with vanilla vdr this 
long ;)) )

Infosat should not do anything other than messages at this time.

Crash comes after restarting with an killed interrupt handler and infosat 
being blaimed.
As a crash woraround I now happily kill infosatepg in my runvdr. That 
should keep me at least from crashing.


regards  Peter
  
Frank Schmirler July 24, 2006, 3:51 p.m. UTC | #2
On Mon, 24 Jul 2006 09:58:00 +0200, Peter Dittmann wrote
> Had another kernel crash this morning with stuck SVDRP:
(...)
> Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 - accepted
> Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1490 of 1500 
> data blocks [ 99.33%]'
> Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500 data
> blocks [ 99.33%]
> Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection 
> Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 - accepted
> Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 
> of 1500 data blocks [ 99.80%]'
> Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 
> data blocks [ 99.80%]
> Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection
> 
> >> this seems to be VDRAdin as it's not the same 60sec raster as the 
> messages from infosatepg
> 
> Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - 
> accepted
> 
> Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired - exiting!

I don't know infosatepg, but as far as I've read, it first downloads the data
via DVB-S and then tvmovie2vdr uploads it to SVDRP. The whole process is
controled by infosatepg.sh.

According to the log, infosatepg receives about 5-10 packets per minute, so it
is most likely finished at Jul 24 04:20:41. At that time the SVDRP upload
starts which is simply too fast.

Again: try to find out what's really happening when the watchdog is triggered.
Use a sniffer, add some debug lines to vdr - whatever.

Good luck,
Frank
  
Peter Dittmann July 25, 2006, 7:30 a.m. UTC | #3
------------------------------------------------------------------------------------
Peter Dittmann 
APM-DS
Philips Technologie GmbH Automotive Playback Modules
Wetzlar / GERMANY
Tel:      +49 (0) 6441 3722 759
Fax:     +49 (0) 6441 3722 566
E-mail:  : peter.dittmann@philips.com
------------ http:\\www.apm.philips.com 
---------------------------------------------------------------


vdr-bounces@linuxtv.org schrieb am 24.07.2006 17:51:55:

> On Mon, 24 Jul 2006 09:58:00 +0200, Peter Dittmann wrote
> > Had another kernel crash this morning with stuck SVDRP:
> (...)
> > Jul 24 04:19:05 vdr vdr[1399]: connect from 127.0.0.1, port 32792 - 
accepted
> > Jul 24 04:19:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 
> 1490 of 1500 
> > data blocks [ 99.33%]'
> > Jul 24 04:19:05 vdr vdr[1399]: info: Infosat:Received 1490 of 1500 
data
> > blocks [ 99.33%]
> > Jul 24 04:19:05 vdr vdr[1399]: closing SVDRP connection 
> > Jul 24 04:20:05 vdr vdr[1399]: connect from 127.0.0.1, port 32793 - 
accepted
> > Jul 24 04:20:05 vdr vdr[1399]: SVDRP message: 'Infosat:Received 1497 
> > of 1500 data blocks [ 99.80%]'
> > Jul 24 04:20:05 vdr vdr[1399]: info: Infosat:Received 1497 of 1500 
> > data blocks [ 99.80%]
> > Jul 24 04:20:05 vdr vdr[1399]: closing SVDRP connection
> > 
> > >> this seems to be VDRAdin as it's not the same 60sec raster as the 
> > messages from infosatepg
> > 
> > Jul 24 04:20:41 vdr vdr[1399]: connect from 127.0.0.1, port 32794 - 
> > accepted
> > 
> > Jul 24 04:21:42 vdr vdr[1399]: PANIC: watchdog timer expired - 
exiting!
> 
> I don't know infosatepg, but as far as I've read, it first downloads the 
data
> via DVB-S and then tvmovie2vdr uploads it to SVDRP. The whole process is
> controled by infosatepg.sh.
> 
> According to the log, infosatepg receives about 5-10 packets per minute, 
so it
> is most likely finished at Jul 24 04:20:41. At that time the SVDRP 
upload
> starts which is simply too fast.
> 

Nope, infosatepg is not filling the data linearly.
It takes around three runs here to accumulate all blocks as the FF has a 
high packet error rate.
I'm shure that infosatepg would have been happily busy untin 4:35.
This is supported by infosatepg still running and being the cause of the 
kernel crash once VDR has restarted.

> Again: try to find out what's really happening when the watchdog is 
triggered.
> Use a sniffer, add some debug lines to vdr - whatever.

It's a ctvdr productive system, not much possibilities here to not break 
all plugin packages.
However the problem seems to rise from VDRAdmin and the long timeout when 
adding timers.
I had removed VDRAdmin already from the init and run it only during the 
infosatepg update time.
I had strange crashes before. Especially when mp3ng plugin was running and 
a SVDRP message was comming in.
This crashed VDR almost guaranteed.
It looks like Morones mp3ng does not like messages completely.
But that's different issue.

I will try to backport the SVDRP changes to my productive system and check 
were this leads.

regards Peter
  

Patch

--- svdrp.c.orig        2006-07-20 11:34:10.000000000 +0200
+++ svdrp.c     2006-07-20 11:39:51.000000000 +0200
@@ -1575,8 +1575,11 @@ 
               isyslog("lost connection to SVDRP client");
               Close();
               }
-           else
-              break;
+           else {
+              isyslog("SVDRP client closed connection");
+              file.Close();
+              DELETENULL(PUTEhandler);
+              }
            }
      if (Setup.SVDRPTimeout && time(NULL) - lastActivity > Setup.SVDRP
Timeout) {