cUnbufferedFile and NFS mounted video dir

Message ID 43DBF542.3080604@o2.pl
State New
Headers

Commit Message

Artur Skawina Jan. 28, 2006, 10:50 p.m. UTC
  Klaus Schmidinger wrote:
> Artur Skawina wrote:
>> However i consider the whole patch a bugfix :) Now that the fsyncs are 
>> disabled it probably isn't that critical though.
> 
> So does this mean that without the fdatasync() calls it works ok as
> it is in plain vanilla VDR (without any patch)?

As I haven't run vdr w/o this patch for a long time, i'm not sure. Apparently
things are not ok, as this thread shows, but other vdr-1.3.38+ users with a 2.6
kernel could answer the above question better...
For me it was the sync calls that made me look at this code; w/o them things
weren't as bad, IIRC. I ended up disabling the fdatasyncs completely even when
using fadvise.

> What is the exact meaning of the "WriteStrategy" option?
> Does it simply turn the fadvise stuff on/off, or is there more
> magic behind it? I would prefer a version that works without this
> option, because this is nothing a normal user would easily know
> how to set.

it started as a debugging knob, so i could test the impact of the sync calls,
compare cutting speed w/ fadvise on/off etc.
(Setup.WriteStrategy==1) meant no POSIX_FADV_DONTNEED calls while accessing a
file (the calls were always made when closing a file).

> So, if you can provide a patch against the latest VDR version that
> still uses "#ifdef USE_FADVISE" to completely turn the fadvise
> stuff off, and does _not_ introduce another setup option, I might
> consider including it for the final version 1.4.

i did the minimal fix to the existing vdr code -- note it's far from optimal, 
and i suspect there are cases where the fadvise-enabled version isn't 
necessarily an improvement. An option which basically turns USE_FADVISE into a 
runtime switch would be useful.
As i haven't needed the option myself lately, in fact didn't even remember what
exactly it did and had to check the code, i just killed it, patch attached.
It's vs 1.3.39 because the UI/channel-switch posts scared me away from upgrading
vdr, still remembering the broken menu key experience... ;)
cutter.c part is entirely optional (affects only cutting speed).

artur
  

Comments

Klaus Schmidinger Jan. 29, 2006, 9:47 a.m. UTC | #1
Artur Skawina wrote:
> Klaus Schmidinger wrote:
> 
>> Artur Skawina wrote:
>>
>>> However i consider the whole patch a bugfix :) Now that the fsyncs 
>>> are disabled it probably isn't that critical though.
>>
>>
>> So does this mean that without the fdatasync() calls it works ok as
>> it is in plain vanilla VDR (without any patch)?
> 
> 
> As I haven't run vdr w/o this patch for a long time, i'm not sure. 
> Apparently
> things are not ok, as this thread shows, but other vdr-1.3.38+ users 
> with a 2.6
> kernel could answer the above question better...
> For me it was the sync calls that made me look at this code; w/o them 
> things
> weren't as bad, IIRC. I ended up disabling the fdatasyncs completely 
> even when
> using fadvise.
> 
>> What is the exact meaning of the "WriteStrategy" option?
>> Does it simply turn the fadvise stuff on/off, or is there more
>> magic behind it? I would prefer a version that works without this
>> option, because this is nothing a normal user would easily know
>> how to set.
> 
> 
> it started as a debugging knob, so i could test the impact of the sync 
> calls,
> compare cutting speed w/ fadvise on/off etc.
> (Setup.WriteStrategy==1) meant no POSIX_FADV_DONTNEED calls while 
> accessing a
> file (the calls were always made when closing a file).
> 
>> So, if you can provide a patch against the latest VDR version that
>> still uses "#ifdef USE_FADVISE" to completely turn the fadvise
>> stuff off, and does _not_ introduce another setup option, I might
>> consider including it for the final version 1.4.
> 
> 
> i did the minimal fix to the existing vdr code -- note it's far from 
> optimal,

Well, this is pretty much the killer argument against including it for
version 1.4, I'm afraid.

I guess I'll leave everything as it is right now, until there is
a common consensus about the optimum solution (which will no require
any setup options - it might automatically adapt to different
situations, like local file, NFS etc.). All file access is
hidden inside cUnbufferedFile, so people can test whatever patches
they like. Basically plain simple write()/read() calls should be handled
gracefully by any file system, so the sole benefit from cUnbufferedFile
would be not to clutter the file system cache, but that's probably
no longer such a big problem since VDR caches the recordings data
and thus can bring up the Recordings menu immediately, even if
recordings are currently being made.

Klaus

> and i suspect there are cases where the fadvise-enabled version
> isn't necessarily an improvement. An option which basically turns 
> USE_FADVISE into a runtime switch would be useful.
> As i haven't needed the option myself lately, in fact didn't even 
> remember what
> exactly it did and had to check the code, i just killed it, patch attached.
> It's vs 1.3.39 because the UI/channel-switch posts scared me away from 
> upgrading
> vdr, still remembering the broken menu key experience... ;)
> cutter.c part is entirely optional (affects only cutting speed).
> 
> artur
  
Artur Skawina Jan. 29, 2006, 11:54 a.m. UTC | #2
Klaus Schmidinger wrote:
> Artur Skawina wrote:
>> i did the minimal fix to the existing vdr code -- note it's far from 
>> optimal,
> 
> Well, this is pretty much the killer argument against including it for
> version 1.4, I'm afraid.

I probably should have been more specific wrt the "far from optimal" part -- 
it's about the Read() calls only. The Write() part, which is what this thread 
was about, and what caused me to make the patch in the first place, is working 
perfectly AFAICT.
IOW simply omitting all of the the USE_FADVISE Read() code (making it basically 
a simple safe_read() again) would be a safe fix; see below however.

The problem with the read side is that the uncaching is much too aggressive, and 
while it works when there's no IO load, it causes large delays when replaying a 
file while some other disk activity is going on. This is what the comment block 
above cUnbufferedFile::Read was referring to. While testing yesterdays patch i 
accidentally found a way to reliably trigger the problem and verified that 
disabling fadvice in Read() makes the delays go away. Will replace the 
complicated read() accounting with much simpler version and post a patch later 
today.

> gracefully by any file system, so the sole benefit from cUnbufferedFile
> would be not to clutter the file system cache, but that's probably

much smoother writes is another benefit, and one, once gotten used to, even more 
  valuable.

> no longer such a big problem since VDR caches the recordings data
> and thus can bring up the Recordings menu immediately, even if
> recordings are currently being made.

in the NFS case rereading the dir tree takes quite some time (eg ~8s here) even 
when it's fully cached on both nfs client and server -- so caching it internally 
helps a lot.

artur
  

Patch

--- vdr-1.3.39.org/cutter.c	2005-10-31 12:26:44.000000000 +0000
+++ vdr-1.3.39/cutter.c	2006-01-28 21:33:39.000000000 +0000
@@ -66,6 +66,8 @@  void cCuttingThread::Action(void)
      toFile = toFileName->Open();
      if (!fromFile || !toFile)
         return;
+     fromFile->SetReadAhead(MEGABYTE(20));
+     toFile->SetReadAhead(MEGABYTE(20));
      int Index = Mark->position;
      Mark = fromMarks.Next(Mark);
      int FileSize = 0;
@@ -90,6 +92,7 @@  void cCuttingThread::Action(void)
            if (fromIndex->Get(Index++, &FileNumber, &FileOffset, &PictureType, &Length)) {
               if (FileNumber != CurrentFileNumber) {
                  fromFile = fromFileName->SetOffset(FileNumber, FileOffset);
+                 fromFile->SetReadAhead(MEGABYTE(20));
                  CurrentFileNumber = FileNumber;
                  }
               if (fromFile) {
@@ -118,10 +121,11 @@  void cCuttingThread::Action(void)
                  break;
               if (FileSize > MEGABYTE(Setup.MaxVideoFileSize)) {
                  toFile = toFileName->NextFile();
-                 if (toFile < 0) {
+                 if (!toFile) {
                     error = "toFile 1";
                     break;
                     }
+                 toFile->SetReadAhead(MEGABYTE(20));
                  FileSize = 0;
                  }
               LastIFrame = 0;
@@ -158,10 +162,11 @@  void cCuttingThread::Action(void)
                  cutIn = true;
                  if (Setup.SplitEditedFiles) {
                     toFile = toFileName->NextFile();
-                    if (toFile < 0) {
+                    if (!toFile) {
                        error = "toFile 2";
                        break;
                        }
+                    toFile->SetReadAhead(MEGABYTE(20));
                     FileSize = 0;
                     }
                  }
--- vdr-1.3.39.org/tools.c	2006-01-15 14:31:45.000000000 +0000
+++ vdr-1.3.39/tools.c	2006-01-28 21:48:22.000000000 +0000
@@ -1040,10 +1040,9 @@  bool cSafeFile::Close(void)
 
 // --- cUnbufferedFile -------------------------------------------------------
 
-//#define USE_FADVISE
+#define USE_FADVISE
 
-#define READ_AHEAD MEGABYTE(2)
-#define WRITE_BUFFER MEGABYTE(10)
+#define WRITE_BUFFER KILOBYTE(800)
 
 cUnbufferedFile::cUnbufferedFile(void)
 {
@@ -1059,8 +1058,17 @@  int cUnbufferedFile::Open(const char *Fi
 {
   Close();
   fd = open(FileName, Flags, Mode);
-  begin = end = ahead = -1;
+  curpos = 0;
+  begin = lastpos = ahead = 0;
+  readahead = 128*1024;
+  pendingreadahead = 0;
   written = 0;
+  totwritten = 0;
+  if (fd >= 0) {
+     // we could use POSIX_FADV_SEQUENTIAL, but we do our own readahead
+     // disabling the kernel one.
+     posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
+  }
   return fd;
 }
 
@@ -1068,15 +1076,10 @@  int cUnbufferedFile::Close(void)
 {
 #ifdef USE_FADVISE
   if (fd >= 0) {
-     if (ahead > end)
-        end = ahead;
-     if (begin >= 0 && end > begin) {
-        //dsyslog("close buffer: %d (flush: %d bytes, %ld-%ld)", fd, written, begin, end);
-        if (written)
-           fdatasync(fd);
-        posix_fadvise(fd, begin, end - begin, POSIX_FADV_DONTNEED);
-        }
-     begin = end = ahead = -1;
+     if (totwritten)    // if we wrote anything make sure the data has hit the disk before
+        fdatasync(fd);  // calling fadvise, as this is our last chance to un-cache it.
+     posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
+     begin = lastpos = ahead = 0;
      written = 0;
      }
 #endif
@@ -1085,45 +1088,95 @@  int cUnbufferedFile::Close(void)
   return close(OldFd);
 }
 
-off_t cUnbufferedFile::Seek(off_t Offset, int Whence)
-{
-  if (fd >= 0)
-     return lseek(fd, Offset, Whence);
-  return -1;
-}
-
+// When replaying and going eg FF->PLAY the position jumps back 2..8M
+// hence we might not want to drop that data at once. 
+// Ignoring this for now to avoid making this even more complex,
+// but we could at least try to handle the common cases.
+// (PLAY->FF->PLAY, small jumps, moving editing marks etc)
+
+#define KREADAHEAD MEGABYTE(4) // amount (guess) of kernel/fs prefetch that
+                               // could happen in addition to our own.
+#define FADVGRAN   4096        // AKA fadvise-chunk-size; PAGE_SIZE or
+                               // getpagesize(2) would also work.
+                               
 ssize_t cUnbufferedFile::Read(void *Data, size_t Size)
 {
   if (fd >= 0) {
 #ifdef USE_FADVISE
-     off_t pos = lseek(fd, 0, SEEK_CUR);
-     // jump forward - adjust end position
-     if (pos > end)
-        end = pos;
-     // after adjusting end - don't clear more than previously requested
-     if (end > ahead)
-        end = ahead;
-     // jump backward - drop read ahead of previous run
-     if (pos < begin)
-        end = ahead;
-     if (begin >= 0 && end > begin)
-        posix_fadvise(fd, begin - KILOBYTE(200), end - begin + KILOBYTE(200), POSIX_FADV_DONTNEED);//XXX macros/parameters???
-     begin = pos;
+     off_t jumped = curpos-lastpos; // nonzero means we're not at the last offset
+     if (jumped) {                  // ie some kind of jump happened.
+        pendingreadahead += ahead-lastpos+KREADAHEAD;
+        // jumped forward? - treat as if we did read all the way to current pos.
+        if (jumped >= 0) {
+           lastpos = curpos;
+           // but clamp at ahead so we don't clear more than previously requested.
+           // (would be mostly harmless anyway, unless we got more than one reader of this file)
+           if (lastpos > (ahead+KREADAHEAD))
+              lastpos = ahead+KREADAHEAD;
+        }
+        // jumped backward? - drop both last read _and_ read-ahead
+        else
+          if (curpos < begin)
+              lastpos = ahead+KREADAHEAD;
+           // jumped backward, but still inside prev read window? - pretend we read less.
+           else /* if (curpos >= begin) */
+              lastpos = curpos;
+        }
+        
 #endif
      ssize_t bytesRead = safe_read(fd, Data, Size);
 #ifdef USE_FADVISE
+     // Now drop data accessed during _previous_ Read(). (ie. begin...lastpos)
+     // fadvise() internally has page granularity; it drops only whole pages
+     // from the range we specify (since it cannot know if we will be accessing
+     // the rest). This is why we drop only the previously read data, and do it
+     // _after_ the current read() call, while rounding up the window to make
+     // sure that even not PAGE_SIZE-aligned data gets freed.
+     // We're also trying merge the fadvise calls a bit in order to reduce overhead
+     // (to avoid a fadvise() call here for every read() above).
+     if (begin >= 0 && lastpos > begin)
+        if (jumped || (size_t)(lastpos-begin) > readahead) {
+           //dsyslog("taildrop:    %ld..%ld size %ld", begin, lastpos, lastpos-begin);
+           posix_fadvise(fd, begin-(FADVGRAN-1), lastpos-begin+(FADVGRAN-1)*2, POSIX_FADV_DONTNEED);
+           begin = curpos;
+           }
+        
      if (bytesRead > 0) {
-        pos += bytesRead;
-        end = pos;
-        // this seems to trigger a non blocking read - this
-        // may or may not have been finished when we will be called next time.
-        // If it is not finished we can't release the not yet filled buffers.
-        // So this is commented out till we find a better solution.
-        //posix_fadvise(fd, pos, READ_AHEAD, POSIX_FADV_WILLNEED);
-        ahead = pos + READ_AHEAD;
+        curpos += bytesRead;
+        //dsyslog("jump: %06ld ra: %06ld size: %ld", jumped, (long)readahead, (long)Size);
+
+        // no jump? (allow small forward jump still inside readahead window).
+        if (jumped>=0 && jumped<=(off_t)readahead) {
+           // Trigger the readahead IO, but only if we've used at least
+           // 1/4 of the previously requested area. This avoids calling
+           // fadvise() after every read() call.
+           if ((ahead-curpos)<(off_t)(readahead-readahead/4)) {
+              posix_fadvise(fd, curpos, readahead, POSIX_FADV_WILLNEED);
+              ahead = curpos + readahead;
+              }
+           if ( readahead < Size*64 ) { // automagically tune readahead size.
+              readahead = Size*64;
+              //dsyslog("Readahead for fd: %d increased to %ld", fd, (long)readahead);
+              }
+           }
+        else {
+           // jumped - we really don't want any readahead now. otherwise
+           // eg fast-rewind gets in trouble.
+           ahead = curpos;
+
+           // Every now and then, flush all cached data from this file; mostly
+           // to get rid of nonflushed readahead coming from _previous_ jumps
+           // (if the readahead I/O hasn't finished by the time we called
+           // fadvice() to undo it, the data could still be cached).
+           // The accounting does not have to be 100% accurate, as long as
+           // this triggers after _some_ jumps we should be ok.
+           if (pendingreadahead>MEGABYTE(40)) {
+              pendingreadahead = 0;
+              posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
+              }
+           }
         }
-     else
-        end = pos;
+     lastpos = curpos;
 #endif
      return bytesRead;
      }
@@ -1132,29 +1185,37 @@  ssize_t cUnbufferedFile::Read(void *Data
 
 ssize_t cUnbufferedFile::Write(const void *Data, size_t Size)
 {
+  //dsyslog("Unbuffered:Write: fd: %d  %8ld..%8ld  size: %5ld", fd, curpos, curpos+Size, (long)Size);
   if (fd >=0) {
-#ifdef USE_FADVISE
-     off_t pos = lseek(fd, 0, SEEK_CUR);
-#endif
      ssize_t bytesWritten = safe_write(fd, Data, Size);
 #ifdef USE_FADVISE
-     if (bytesWritten >= 0) {
+     if (bytesWritten > 0) {
+        begin = min(begin, curpos);
+        curpos += bytesWritten;
         written += bytesWritten;
-        if (begin >= 0) {
-           if (pos < begin)
-              begin = pos;
-           }
-        else
-           begin = pos;
-        if (pos + bytesWritten > end)
-           end = pos + bytesWritten;
+        lastpos = max(lastpos, curpos);
         if (written > WRITE_BUFFER) {
-           //dsyslog("flush buffer: %d (%d bytes, %ld-%ld)", fd, written, begin, end);
-           fdatasync(fd);
-           if (begin >= 0 && end > begin)
-              posix_fadvise(fd, begin, end - begin, POSIX_FADV_DONTNEED);
-           begin = end = -1;
+           //dsyslog("flush buffer: %d (%d bytes, %ld-%ld)", fd, written, begin, lastpos);
+           if (lastpos>begin) {
+              off_t headdrop = min(begin, WRITE_BUFFER*2L);
+              posix_fadvise(fd, begin-headdrop, lastpos-begin+headdrop, POSIX_FADV_DONTNEED);
+              }
+           begin = lastpos = max(0L, curpos-4095);
+           totwritten += written;
            written = 0;
+           // The above fadvise() works when writing slowly (recording), but could
+           // leave cached data around when writing at a high rate (cutting).
+           // Also, it seems in some setups, the above does not trigger any I/O and
+           // the fdatasync() call below has to do all the work (reiserfs with some
+           // kind of write gathering enabled).
+           // We add 'readahead' to the threshold in an attempt to increase cutting
+           // speed; it's a tradeoff -- speed vs RAM-used.
+           if (totwritten>(MEGABYTE(32)+readahead)) {
+              //fdatasync(fd);
+              off_t headdrop = min(curpos-totwritten, totwritten*2L);
+              posix_fadvise(fd, curpos-totwritten-headdrop, totwritten+headdrop, POSIX_FADV_DONTNEED);
+              totwritten = 0;
+              }
            }
         }
 #endif
--- vdr-1.3.39.org/tools.h	2006-01-08 11:40:37.000000000 +0000
+++ vdr-1.3.39/tools.h	2006-01-28 21:32:26.000000000 +0000
@@ -237,22 +237,40 @@  public:
 /// cUnbufferedFile is used for large files that are mainly written or read
 /// in a streaming manner, and thus should not be cached.
 
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
 class cUnbufferedFile {
 private:
   int fd;
+  off_t curpos;
   off_t begin;
-  off_t end;
+  off_t lastpos;
   off_t ahead;
-  ssize_t written;
+  size_t pendingreadahead;
+  size_t readahead;
+  size_t written;
+  size_t totwritten;
 public:
   cUnbufferedFile(void);
   ~cUnbufferedFile();
   int Open(const char *FileName, int Flags, mode_t Mode = DEFFILEMODE);
   int Close(void);
-  off_t Seek(off_t Offset, int Whence);
+  off_t Seek(off_t Offset, int Whence)
+  {
+    //dsyslog("Seek: fd: %d offs: %ld whence: %d diff: %ld", fd, (long)Offset, Whence, Offset-curpos);
+    if (Whence == SEEK_SET && Offset == curpos)
+       return curpos;
+
+    curpos = lseek(fd, Offset, Whence);
+    return curpos;
+  }
+
   ssize_t Read(void *Data, size_t Size);
   ssize_t Write(const void *Data, size_t Size);
   static cUnbufferedFile *Create(const char *FileName, int Flags, mode_t Mode = DEFFILEMODE);
+  void SetReadAhead(size_t ra) { readahead = ra; };
   };
 
 class cLockFile {