dynamically sized ringbuffers v1

Message ID 463FA500.4010200@o2.pl
State New
Headers

Commit Message

Artur Skawina May 7, 2007, 10:15 p.m. UTC
  vdr uses quite a lot of memory, RSS here used to be ~70M, so i decided to see if it could be
reduced a bit. First suspects became the ringbuffers, cause i was seeing a lot of log lines
such as "buffer stats: 96068 (1%) used". Turns out many buffers are way over sized, only a few
percent of the memory are typically used, yet because of how a ringbuffer works _all_ of it is
constantly accessed. I first tried to make the ringbuffer users smarter about picking the size
and eg after changing the transfer buffer to start out small and use bigger sizes only when the
current one isn't large enough it seemed a more reasonable size for that vdr instance was 384k
instead of 2M. Similarly the TS buffer could be just a few hundred K instead of 2M.

But that approach was not ideal as there isn't anything one could do after realizing the buffer
is too small, it could be grown next time, after channel switch etc, but that's it. Also some
of the huge buffers came from plugins, and I wasn't all that excited about fixing all of them
(one example w/ the one-digit % fill was streamdev-server).
It is hard to estimate what the proper size would be under load, with many simultaneous recordings.
So I did attack this issue differently -- by making the ringbuffers themselves smarter.

Patch below changes the interpretation of the 'Size' parm given when creating a buffer from a
fixed length to a max limit. The buffers start out much smaller (currently 128k, instead of many
megabytes) and grow automatically when they become almost full. This way all callers benefit,
and they don't have to be modified at all.

The buffers are actually always allocated using the max size, but only the necessary
buffer part is used, the rest of the memory is never accessed; this is done this way
so that no extra locking is necessary.

I've tested this on two vdrs for a few hours and it seems to work, but it needs a lot
more testing.
There is some extra logging in this patch so that you can see what's going on; every
time a ring buffer gets deleted it logs a line which tells you exactly how much memory
was requested and how much was really needed.

vdr: [19729] Deleting ring buffer "TS" size: 196608 / 2097152 (used 9.375% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 442368 / 5242880 (used 8.4375% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 131072 / 5242880 (used 2.5% of requested size)
vdr: [19733] Deleting ring buffer "TS" size: 131072 / 2097152 (used 6.25% of requested size)
vdr: [19718] Deleting ring buffer "Result" size: 262144 / 262144 (used 100% of requested size)
vdr: [19718] Deleting ring buffer "Recorder" size: 442368 / 5242880 (used 8.4375% of requested size)

So each of the several 'Recorder' buffers was an order of magnitude smaller than w/o
this patch and several megabytes less ram was used.

artur
  

Comments

Artur Skawina May 8, 2007, 6:21 p.m. UTC | #1
> Patch below changes the interpretation of the 'Size' parm given when creating a buffer from a
> fixed length to a max limit. The buffers start out much smaller (currently 128k, instead of many
> megabytes) and grow automatically when they become almost full. This way all callers benefit,
> and they don't have to be modified at all.

> I've tested this on two vdrs for a few hours and it seems to work, but it needs a lot
> more testing.

I did a few test recordings (total duration likely more than 24h) and the new code
seems mostly fine; just some tweaking of the sizes and grow heuristics left.

However one problem did appear after many hours of simultaneous recordings:

08:17:38 vdr: [24052] buffer usage: 70% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 80% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 90% (tid=24051)
08:17:39 vdr: [24052] buffer usage: 100% (tid=24051)
08:17:39 vdr: [24052] ERROR: 1 ring buffer overflow (61 bytes dropped)
08:17:45 vdr: [24052] ERROR: 9053 ring buffer overflows (1701964 bytes dropped)
08:17:47 vdr: [24052] Enlarging ring buffer "Recorder": 442368 bytes
08:17:47 vdr: [24052] buffer usage: 0% (tid=24051)

Apparently the 'Recorder' consumer stalled for several seconds at a point when
growing the ringbuffer wasn't possible - to keep Get/Put lockless resizing is
only possible when the ringbuffer data is continuous; and if the consumer stops
making progress at the wrong moment, the buffer will not be enlarged and overflow.
In this case the buffer was ~300k and ~2m would have been needed.
I can think of a few ways to make the ringbuffer growing more aggressive, but
they all make the buffer handling much more complex, so i'd like to avoid that.

Looking at the caller:

void cRecorder::Receive(uchar *Data, int Length)
{
  if (Running()) {
     int p = ringBuffer->Put(Data, Length);
     if (p != Length && Running())
        ringBuffer->ReportOverflow(Length - p);
     }
}

it simply drops any data that does not fit into the buffer, which would be fine
for live viewing, but isn't ideal when recording.

Can we try harder not to loose data here? IOW can this function sleep (and retry)?

(not that i think doing this would be enough; it could help for short stalls, but
for the longer ones a much larger buffer size is necessary, i'll have to find a
clean way to handle that anyway. It's kind of hard to test though as the overflow
only happened once during ~24h of recording...)

artur
  

Patch

diff --git a/ringbuffer.c b/ringbuffer.c
index 0633bd3..8dd1efe 100644
--- a/ringbuffer.c
+++ b/ringbuffer.c
@@ -151,13 +151,29 @@  void cRingBufferLinear::PrintDebugRBL(void)
   }
 #endif

+// cRingBufferLinear are dynamically sized, or at least we can pretend they are ;)
+// We treat 'Size' as the maximum size, but start with a small buffer, which can
+// grow later as it fills up. Memory is always allocated for the full buffer, but
+// the unused RAM portion remains untouched until (if at all) it is actually needed.
+// Note that we can not start with a larger than requested buffer because there are
+// ring buffer users that cause crashes then (eg softdevice mpegdecoder absolutely
+// needs 32/64k).
+
+// Startup size. 64k still causes overflows before buffer starts to grow, 128k doesn't.
+// The buffers grow to 200..500k anyway, so maybe increasing this a bit more would
+// make sense, but let's first see if it's really needed.
+#define DEFRBSIZE KILOBYTE(128)
+
 cRingBufferLinear::cRingBufferLinear(int Size, int Margin, bool Statistics, const char *Description)
-:cRingBuffer(Size, Statistics)
+:cRingBuffer(min(Size,DEFRBSIZE), Statistics)
 {
   description = Description ? strdup(Description) : NULL;
   tail = head = margin = Margin;
   gotten = 0;
   buffer = NULL;
+  maxsize = Size;
+  growbuf = 0;
+  dsyslog("New ring buffer \"%s\" size: %d margin: %d", description, Size, Margin );
   if (Size > 1) { // 'Size - 1' must not be 0!
      if (Margin <= Size / 2) {
         buffer = MALLOC(uchar, Size);
@@ -183,6 +199,8 @@  cRingBufferLinear::~cRingBufferLinear()
 #ifdef DEBUGRINGBUFFERS
   DelDebugRBL(this);
 #endif
+  dsyslog("Deleting ring buffer \"%s\" size: %d / %d (used %g%% of requested size)", description,
+                                               size, maxsize, size/(double)maxsize*100.0 );
   free(buffer);
   free(description);
 }
@@ -207,6 +225,13 @@  void cRingBufferLinear::Clear(void)

 int cRingBufferLinear::Read(int FileHandle, int Max)
 {
+  if (Available() >= size*2/4)
+     growbuf = 1;
+  if (growbuf && head>=tail && size<maxsize) {
+     size = min(size+size/2, maxsize);
+     dsyslog("Enlarging ring buffer \"%s\": %d bytes (Read())", description, size);
+     growbuf = 0;
+     }
   int Tail = tail;
   int diff = Tail - head;
   int free = (diff > 0) ? diff - 1 : Size() - head;
@@ -236,6 +261,8 @@  int cRingBufferLinear::Read(int FileHandle, int Max)
   lastHead = head;
   lastPut = Count;
 #endif
+  if (Available() >= size*3/4)
+     growbuf = 1;
   EnableGet();
   if (free == 0)
      WaitForPut();
@@ -245,6 +272,13 @@  int cRingBufferLinear::Read(int FileHandle, int Max)
 int cRingBufferLinear::Put(const uchar *Data, int Count)
 {
   if (Count > 0) {
+     if (Available()+Count >= size*2/4)
+        growbuf = 1;
+     if (growbuf && head>=tail && size<maxsize) {
+        size = min(size+size/2, maxsize);
+        dsyslog("Enlarging ring buffer \"%s\": %d bytes", description, size);
+        growbuf = 0;
+        }
      int Tail = tail;
      int rest = Size() - head;
      int diff = Tail - head;
diff --git a/ringbuffer.h b/ringbuffer.h
index dba0e61..f417d57 100644
--- a/ringbuffer.h
+++ b/ringbuffer.h
@@ -18,11 +18,11 @@  private:
   cCondWait readyForPut, readyForGet;
   int putTimeout;
   int getTimeout;
-  int size;
   time_t lastOverflowReport;
   int overflowCount;
   int overflowBytes;
 protected:
+  int size;
   tThreadId getThreadTid;
   int maxFill;//XXX
   int lastPercent;
@@ -59,6 +59,8 @@  private:
   int margin, head, tail;
   int gotten;
   uchar *buffer;
+  int growbuf;
+  int maxsize;
   char *description;
 public:
   cRingBufferLinear(int Size, int Margin = 0, bool Statistics = false, const char *Description = NULL);