[05/28] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT

Message ID 20200819065555.1802761-6-hch@lst.de (mailing list archive)
State Not Applicable, archived
Headers
Series [01/28] mm: turn alloc_pages into an inline function |

Commit Message

Christoph Hellwig Aug. 19, 2020, 6:55 a.m. UTC
  The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused, and causes
weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
unimplemented except on PARISC and some MIPS configs, and about to be
removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../userspace-api/media/v4l/buffer.rst        | 17 ---------
 .../media/v4l/vidioc-reqbufs.rst              |  1 -
 .../media/common/videobuf2/videobuf2-core.c   | 36 +------------------
 .../common/videobuf2/videobuf2-dma-contig.c   | 19 ----------
 .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
 .../media/common/videobuf2/videobuf2-v4l2.c   | 12 -------
 include/media/videobuf2-core.h                |  3 +-
 include/uapi/linux/videodev2.h                |  2 --
 8 files changed, 3 insertions(+), 90 deletions(-)
  

Comments

Tomasz Figa Aug. 19, 2020, 11:16 a.m. UTC | #1
Hi Christoph,

On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
>
> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,

Could you explain what makes you think it's unused? It's a feature of
the UAPI generally supported by the videobuf2 framework and relied on
by Chromium OS to get any kind of reasonable performance when
accessing V4L2 buffers in the userspace.

> and causes
> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
> unimplemented except on PARISC and some MIPS configs, and about to be
> removed.

It is implemented by the generic DMA mapping layer [1], which is used
by a number of architectures including ARM64 and supposed to be used
by new architectures going forward.

[1] https://elixir.bootlin.com/linux/v5.9-rc1/source/kernel/dma/mapping.c#L341

When removing features from generic kernel code, I'd suggest first
providing viable alternatives for its users, rather than killing the
users altogether.

Given the above, I'm afraid I have to NAK this.

Best regards,
Tomasz

>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  .../userspace-api/media/v4l/buffer.rst        | 17 ---------
>  .../media/v4l/vidioc-reqbufs.rst              |  1 -
>  .../media/common/videobuf2/videobuf2-core.c   | 36 +------------------
>  .../common/videobuf2/videobuf2-dma-contig.c   | 19 ----------
>  .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
>  .../media/common/videobuf2/videobuf2-v4l2.c   | 12 -------
>  include/media/videobuf2-core.h                |  3 +-
>  include/uapi/linux/videodev2.h                |  2 --
>  8 files changed, 3 insertions(+), 90 deletions(-)
>
> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
> index 57e752aaf414a7..2044ed13cd9d7d 100644
> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> @@ -701,23 +701,6 @@ Memory Consistency Flags
>      :stub-columns: 0
>      :widths:       3 1 4
>
> -    * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
> -
> -      - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
> -      - 0x00000001
> -      - A buffer is allocated either in consistent (it will be automatically
> -       coherent between the CPU and the bus) or non-consistent memory. The
> -       latter can provide performance gains, for instance the CPU cache
> -       sync/flush operations can be avoided if the buffer is accessed by the
> -       corresponding device only and the CPU does not read/write to/from that
> -       buffer. However, this requires extra care from the driver -- it must
> -       guarantee memory consistency by issuing a cache flush/sync when
> -       consistency is needed. If this flag is set V4L2 will attempt to
> -       allocate the buffer in non-consistent memory. The flag takes effect
> -       only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
> -       queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
> -       <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
> -
>  .. c:type:: v4l2_memory
>
>  enum v4l2_memory
> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> index 75d894d9c36c42..3180c111d368ee 100644
> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> @@ -169,7 +169,6 @@ aborting or finishing any DMA in progress, an implicit
>        - This capability is set by the driver to indicate that the queue supports
>          cache and memory management hints. However, it's only valid when the
>          queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
> -        :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
>          :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
>          :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index f544d3393e9d6b..66a41cef33c1b1 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q,
>  }
>  EXPORT_SYMBOL(vb2_verify_memory_type);
>
> -static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
> -{
> -       q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
> -
> -       if (!vb2_queue_allows_cache_hints(q))
> -               return;
> -       if (!consistent_mem)
> -               q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
> -}
> -
> -static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
> -{
> -       bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
> -
> -       if (consistent_mem != queue_is_consistent) {
> -               dprintk(q, 1, "memory consistency model mismatch\n");
> -               return false;
> -       }
> -       return true;
> -}
> -
>  int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>                      unsigned int flags, unsigned int *count)
>  {
>         unsigned int num_buffers, allocated_buffers, num_planes = 0;
>         unsigned plane_sizes[VB2_MAX_PLANES] = { };
> -       bool consistent_mem = true;
>         unsigned int i;
>         int ret;
>
> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> -               consistent_mem = false;
> -
>         if (q->streaming) {
>                 dprintk(q, 1, "streaming active\n");
>                 return -EBUSY;
> @@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>         }
>
>         if (*count == 0 || q->num_buffers != 0 ||
> -           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
> -           !verify_consistency_attr(q, consistent_mem)) {
> +           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
>                 /*
>                  * We already have buffers allocated, so first check if they
>                  * are not in use and can be freed.
> @@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>         num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
>         memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>         q->memory = memory;
> -       set_queue_consistency(q, consistent_mem);
>
>         /*
>          * Ask the driver how many buffers and planes per buffer it requires.
> @@ -894,12 +867,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>  {
>         unsigned int num_planes = 0, num_buffers, allocated_buffers;
>         unsigned plane_sizes[VB2_MAX_PLANES] = { };
> -       bool consistent_mem = true;
>         int ret;
>
> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> -               consistent_mem = false;
> -
>         if (q->num_buffers == VB2_MAX_FRAME) {
>                 dprintk(q, 1, "maximum number of buffers already allocated\n");
>                 return -ENOBUFS;
> @@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>                 }
>                 memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>                 q->memory = memory;
> -               set_queue_consistency(q, consistent_mem);
>                 q->waiting_for_buffers = !q->is_output;
>         } else {
>                 if (q->memory != memory) {
>                         dprintk(q, 1, "memory model mismatch\n");
>                         return -EINVAL;
>                 }
> -               if (!verify_consistency_attr(q, consistent_mem))
> -                       return -EINVAL;
>         }
>
>         num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index ec3446cc45b8da..7b1b86ec942d7d 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -42,11 +42,6 @@ struct vb2_dc_buf {
>         struct dma_buf_attachment       *db_attach;
>  };
>
> -static inline bool vb2_dc_buffer_consistent(unsigned long attr)
> -{
> -       return !(attr & DMA_ATTR_NON_CONSISTENT);
> -}
> -
>  /*********************************************/
>  /*        scatterlist table functions        */
>  /*********************************************/
> @@ -341,13 +336,6 @@ static int
>  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>                                    enum dma_data_direction direction)
>  {
> -       struct vb2_dc_buf *buf = dbuf->priv;
> -       struct sg_table *sgt = buf->dma_sgt;
> -
> -       if (vb2_dc_buffer_consistent(buf->attrs))
> -               return 0;
> -
> -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>         return 0;
>  }
>
> @@ -355,13 +343,6 @@ static int
>  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>                                  enum dma_data_direction direction)
>  {
> -       struct vb2_dc_buf *buf = dbuf->priv;
> -       struct sg_table *sgt = buf->dma_sgt;
> -
> -       if (vb2_dc_buffer_consistent(buf->attrs))
> -               return 0;
> -
> -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>         return 0;
>  }
>
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> @@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
>         /*
>          * NOTE: dma-sg allocates memory using the page allocator directly, so
>          * there is no memory consistency guarantee, hence dma-sg ignores DMA
> -        * attributes passed from the upper layer. That means that
> -        * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
> +        * attributes passed from the upper layer.
>          */
>         buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
>                                     GFP_KERNEL | __GFP_ZERO);
> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> index 30caad27281e1a..de83ad48783821 100644
> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> @@ -722,20 +722,11 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
>  #endif
>  }
>
> -static void clear_consistency_attr(struct vb2_queue *q,
> -                                  int memory,
> -                                  unsigned int *flags)
> -{
> -       if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
> -               *flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
> -}
> -
>  int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
>  {
>         int ret = vb2_verify_memory_type(q, req->memory, req->type);
>
>         fill_buf_caps(q, &req->capabilities);
> -       clear_consistency_attr(q, req->memory, &req->flags);
>         return ret ? ret : vb2_core_reqbufs(q, req->memory,
>                                             req->flags, &req->count);
>  }
> @@ -769,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
>         unsigned i;
>
>         fill_buf_caps(q, &create->capabilities);
> -       clear_consistency_attr(q, create->memory, &create->flags);
>         create->index = q->num_buffers;
>         if (create->count == 0)
>                 return ret != -EBUSY ? ret : 0;
> @@ -998,7 +988,6 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
>         int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
>
>         fill_buf_caps(vdev->queue, &p->capabilities);
> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
>         if (res)
>                 return res;
>         if (vb2_queue_is_busy(vdev, file))
> @@ -1021,7 +1010,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
>
>         p->index = vdev->queue->num_buffers;
>         fill_buf_caps(vdev->queue, &p->capabilities);
> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
>         /*
>          * If count == 0, then just check if memory and type are valid.
>          * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 52ef92049073e3..4c7f25b07e9375 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -744,8 +744,7 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
>   * vb2_core_reqbufs() - Initiate streaming.
>   * @q:         pointer to &struct vb2_queue with videobuf2 queue.
>   * @memory:    memory type, as defined by &enum vb2_memory.
> - * @flags:     auxiliary queue/buffer management flags. Currently, the only
> - *             used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
> + * @flags:     auxiliary queue/buffer management flags.
>   * @count:     requested buffer count.
>   *
>   * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index c7b70ff53bc1dd..5c00f63d9c1b58 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -191,8 +191,6 @@ enum v4l2_memory {
>         V4L2_MEMORY_DMABUF           = 4,
>  };
>
> -#define V4L2_FLAG_MEMORY_NON_CONSISTENT                (1 << 0)
> -
>  /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
>  enum v4l2_colorspace {
>         /*
> --
> 2.28.0
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
  
Robin Murphy Aug. 19, 2020, 11:51 a.m. UTC | #2
Hi Tomasz,

On 2020-08-19 12:16, Tomasz Figa wrote:
> Hi Christoph,
> 
> On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
>>
>> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> 
> Could you explain what makes you think it's unused? It's a feature of
> the UAPI generally supported by the videobuf2 framework and relied on
> by Chromium OS to get any kind of reasonable performance when
> accessing V4L2 buffers in the userspace.
> 
>> and causes
>> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
>> unimplemented except on PARISC and some MIPS configs, and about to be
>> removed.
> 
> It is implemented by the generic DMA mapping layer [1], which is used
> by a number of architectures including ARM64 and supposed to be used
> by new architectures going forward.

AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up 
controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.

Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at 
all on arm64?

Also, I posit that videobuf2 is not actually relying on 
DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:

"By using this API, you are guaranteeing to the platform
that you have all the correct and necessary sync points for this memory
in the driver should it choose to return non-consistent memory."

$ git grep dma_cache_sync drivers/media
$

Robin.

> [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/kernel/dma/mapping.c#L341
> 
> When removing features from generic kernel code, I'd suggest first
> providing viable alternatives for its users, rather than killing the
> users altogether.
> 
> Given the above, I'm afraid I have to NAK this.
> 
> Best regards,
> Tomasz
> 
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>   .../userspace-api/media/v4l/buffer.rst        | 17 ---------
>>   .../media/v4l/vidioc-reqbufs.rst              |  1 -
>>   .../media/common/videobuf2/videobuf2-core.c   | 36 +------------------
>>   .../common/videobuf2/videobuf2-dma-contig.c   | 19 ----------
>>   .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
>>   .../media/common/videobuf2/videobuf2-v4l2.c   | 12 -------
>>   include/media/videobuf2-core.h                |  3 +-
>>   include/uapi/linux/videodev2.h                |  2 --
>>   8 files changed, 3 insertions(+), 90 deletions(-)
>>
>> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
>> index 57e752aaf414a7..2044ed13cd9d7d 100644
>> --- a/Documentation/userspace-api/media/v4l/buffer.rst
>> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
>> @@ -701,23 +701,6 @@ Memory Consistency Flags
>>       :stub-columns: 0
>>       :widths:       3 1 4
>>
>> -    * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
>> -
>> -      - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
>> -      - 0x00000001
>> -      - A buffer is allocated either in consistent (it will be automatically
>> -       coherent between the CPU and the bus) or non-consistent memory. The
>> -       latter can provide performance gains, for instance the CPU cache
>> -       sync/flush operations can be avoided if the buffer is accessed by the
>> -       corresponding device only and the CPU does not read/write to/from that
>> -       buffer. However, this requires extra care from the driver -- it must
>> -       guarantee memory consistency by issuing a cache flush/sync when
>> -       consistency is needed. If this flag is set V4L2 will attempt to
>> -       allocate the buffer in non-consistent memory. The flag takes effect
>> -       only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
>> -       queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
>> -       <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
>> -
>>   .. c:type:: v4l2_memory
>>
>>   enum v4l2_memory
>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
>> index 75d894d9c36c42..3180c111d368ee 100644
>> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
>> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
>> @@ -169,7 +169,6 @@ aborting or finishing any DMA in progress, an implicit
>>         - This capability is set by the driver to indicate that the queue supports
>>           cache and memory management hints. However, it's only valid when the
>>           queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
>> -        :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
>>           :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
>>           :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>> index f544d3393e9d6b..66a41cef33c1b1 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>> @@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q,
>>   }
>>   EXPORT_SYMBOL(vb2_verify_memory_type);
>>
>> -static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
>> -{
>> -       q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
>> -
>> -       if (!vb2_queue_allows_cache_hints(q))
>> -               return;
>> -       if (!consistent_mem)
>> -               q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
>> -}
>> -
>> -static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
>> -{
>> -       bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
>> -
>> -       if (consistent_mem != queue_is_consistent) {
>> -               dprintk(q, 1, "memory consistency model mismatch\n");
>> -               return false;
>> -       }
>> -       return true;
>> -}
>> -
>>   int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>>                       unsigned int flags, unsigned int *count)
>>   {
>>          unsigned int num_buffers, allocated_buffers, num_planes = 0;
>>          unsigned plane_sizes[VB2_MAX_PLANES] = { };
>> -       bool consistent_mem = true;
>>          unsigned int i;
>>          int ret;
>>
>> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
>> -               consistent_mem = false;
>> -
>>          if (q->streaming) {
>>                  dprintk(q, 1, "streaming active\n");
>>                  return -EBUSY;
>> @@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>>          }
>>
>>          if (*count == 0 || q->num_buffers != 0 ||
>> -           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
>> -           !verify_consistency_attr(q, consistent_mem)) {
>> +           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
>>                  /*
>>                   * We already have buffers allocated, so first check if they
>>                   * are not in use and can be freed.
>> @@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>>          num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
>>          memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>>          q->memory = memory;
>> -       set_queue_consistency(q, consistent_mem);
>>
>>          /*
>>           * Ask the driver how many buffers and planes per buffer it requires.
>> @@ -894,12 +867,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>>   {
>>          unsigned int num_planes = 0, num_buffers, allocated_buffers;
>>          unsigned plane_sizes[VB2_MAX_PLANES] = { };
>> -       bool consistent_mem = true;
>>          int ret;
>>
>> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
>> -               consistent_mem = false;
>> -
>>          if (q->num_buffers == VB2_MAX_FRAME) {
>>                  dprintk(q, 1, "maximum number of buffers already allocated\n");
>>                  return -ENOBUFS;
>> @@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>>                  }
>>                  memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>>                  q->memory = memory;
>> -               set_queue_consistency(q, consistent_mem);
>>                  q->waiting_for_buffers = !q->is_output;
>>          } else {
>>                  if (q->memory != memory) {
>>                          dprintk(q, 1, "memory model mismatch\n");
>>                          return -EINVAL;
>>                  }
>> -               if (!verify_consistency_attr(q, consistent_mem))
>> -                       return -EINVAL;
>>          }
>>
>>          num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index ec3446cc45b8da..7b1b86ec942d7d 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -42,11 +42,6 @@ struct vb2_dc_buf {
>>          struct dma_buf_attachment       *db_attach;
>>   };
>>
>> -static inline bool vb2_dc_buffer_consistent(unsigned long attr)
>> -{
>> -       return !(attr & DMA_ATTR_NON_CONSISTENT);
>> -}
>> -
>>   /*********************************************/
>>   /*        scatterlist table functions        */
>>   /*********************************************/
>> @@ -341,13 +336,6 @@ static int
>>   vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>>                                     enum dma_data_direction direction)
>>   {
>> -       struct vb2_dc_buf *buf = dbuf->priv;
>> -       struct sg_table *sgt = buf->dma_sgt;
>> -
>> -       if (vb2_dc_buffer_consistent(buf->attrs))
>> -               return 0;
>> -
>> -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>>          return 0;
>>   }
>>
>> @@ -355,13 +343,6 @@ static int
>>   vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>>                                   enum dma_data_direction direction)
>>   {
>> -       struct vb2_dc_buf *buf = dbuf->priv;
>> -       struct sg_table *sgt = buf->dma_sgt;
>> -
>> -       if (vb2_dc_buffer_consistent(buf->attrs))
>> -               return 0;
>> -
>> -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>>          return 0;
>>   }
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
>> index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
>> @@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
>>          /*
>>           * NOTE: dma-sg allocates memory using the page allocator directly, so
>>           * there is no memory consistency guarantee, hence dma-sg ignores DMA
>> -        * attributes passed from the upper layer. That means that
>> -        * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
>> +        * attributes passed from the upper layer.
>>           */
>>          buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
>>                                      GFP_KERNEL | __GFP_ZERO);
>> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> index 30caad27281e1a..de83ad48783821 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> @@ -722,20 +722,11 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
>>   #endif
>>   }
>>
>> -static void clear_consistency_attr(struct vb2_queue *q,
>> -                                  int memory,
>> -                                  unsigned int *flags)
>> -{
>> -       if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
>> -               *flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
>> -}
>> -
>>   int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
>>   {
>>          int ret = vb2_verify_memory_type(q, req->memory, req->type);
>>
>>          fill_buf_caps(q, &req->capabilities);
>> -       clear_consistency_attr(q, req->memory, &req->flags);
>>          return ret ? ret : vb2_core_reqbufs(q, req->memory,
>>                                              req->flags, &req->count);
>>   }
>> @@ -769,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
>>          unsigned i;
>>
>>          fill_buf_caps(q, &create->capabilities);
>> -       clear_consistency_attr(q, create->memory, &create->flags);
>>          create->index = q->num_buffers;
>>          if (create->count == 0)
>>                  return ret != -EBUSY ? ret : 0;
>> @@ -998,7 +988,6 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
>>          int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
>>
>>          fill_buf_caps(vdev->queue, &p->capabilities);
>> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
>>          if (res)
>>                  return res;
>>          if (vb2_queue_is_busy(vdev, file))
>> @@ -1021,7 +1010,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
>>
>>          p->index = vdev->queue->num_buffers;
>>          fill_buf_caps(vdev->queue, &p->capabilities);
>> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
>>          /*
>>           * If count == 0, then just check if memory and type are valid.
>>           * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
>> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
>> index 52ef92049073e3..4c7f25b07e9375 100644
>> --- a/include/media/videobuf2-core.h
>> +++ b/include/media/videobuf2-core.h
>> @@ -744,8 +744,7 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
>>    * vb2_core_reqbufs() - Initiate streaming.
>>    * @q:         pointer to &struct vb2_queue with videobuf2 queue.
>>    * @memory:    memory type, as defined by &enum vb2_memory.
>> - * @flags:     auxiliary queue/buffer management flags. Currently, the only
>> - *             used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
>> + * @flags:     auxiliary queue/buffer management flags.
>>    * @count:     requested buffer count.
>>    *
>>    * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>> index c7b70ff53bc1dd..5c00f63d9c1b58 100644
>> --- a/include/uapi/linux/videodev2.h
>> +++ b/include/uapi/linux/videodev2.h
>> @@ -191,8 +191,6 @@ enum v4l2_memory {
>>          V4L2_MEMORY_DMABUF           = 4,
>>   };
>>
>> -#define V4L2_FLAG_MEMORY_NON_CONSISTENT                (1 << 0)
>> -
>>   /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
>>   enum v4l2_colorspace {
>>          /*
>> --
>> 2.28.0
>>
>> _______________________________________________
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
  
Tomasz Figa Aug. 19, 2020, 12:49 p.m. UTC | #3
On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> Hi Tomasz,
>
> On 2020-08-19 12:16, Tomasz Figa wrote:
> > Hi Christoph,
> >
> > On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
> >>
> >> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> >
> > Could you explain what makes you think it's unused? It's a feature of
> > the UAPI generally supported by the videobuf2 framework and relied on
> > by Chromium OS to get any kind of reasonable performance when
> > accessing V4L2 buffers in the userspace.
> >
> >> and causes
> >> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
> >> unimplemented except on PARISC and some MIPS configs, and about to be
> >> removed.
> >
> > It is implemented by the generic DMA mapping layer [1], which is used
> > by a number of architectures including ARM64 and supposed to be used
> > by new architectures going forward.
>
> AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up
> controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.
>
> Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at
> all on arm64?
>

With the default config it doesn't, but with
CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
the pgprot value as is, without enforcing coherence attributes.


> Also, I posit that videobuf2 is not actually relying on
> DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:
>
> "By using this API, you are guaranteeing to the platform
> that you have all the correct and necessary sync points for this memory
> in the driver should it choose to return non-consistent memory."
>
> $ git grep dma_cache_sync drivers/media
> $

AFAIK dma_cache_sync() isn't the only way to perform the cache
synchronization. The earlier patch series that I reviewed relied on
dma_get_sgtable() and then dma_sync_sg_*() (which existed in the
vb2-dc since forever [1]). However, it looks like with the final code
the sgtable isn't acquired and the synchronization isn't happening, so
you have a point.

FWIW, I asked back in time what the plan is for non-coherent
allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
dma_sync_*() was supposed to be the right thing to go with. [2] The
same thread also explains why dma_alloc_pages() isn't suitable for the
users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.

I think we could make a deal here. We could revert back the parts
using DMA_ATTR_NON_CONSISTENT, keeping the UAPI intact, but just
rendering it no-op, since it's just a hint after all. Then, you would
propose a proper, functionally equivalent and working for ARM64,
replacement for dma_alloc_attrs(..., DMA_ATTR_NON_CONSISTENT), which
we could then use to enable the functionality expected by this UAPI.
Does it sound like something that could work as a way forward here?

By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
series related to the subsystem-facing DMA API changes, since
videobuf2 is one of the biggest users of it.

[1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/media/common/videobuf2/videobuf2-dma-contig.c#L98
[2] https://patchwork.kernel.org/comment/23312203/

Best regards,
Tomasz


>
> Robin.
>
> > [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/kernel/dma/mapping.c#L341
> >
> > When removing features from generic kernel code, I'd suggest first
> > providing viable alternatives for its users, rather than killing the
> > users altogether.
> >
> > Given the above, I'm afraid I have to NAK this.
> >
> > Best regards,
> > Tomasz
> >
> >>
> >> Signed-off-by: Christoph Hellwig <hch@lst.de>
> >> ---
> >>   .../userspace-api/media/v4l/buffer.rst        | 17 ---------
> >>   .../media/v4l/vidioc-reqbufs.rst              |  1 -
> >>   .../media/common/videobuf2/videobuf2-core.c   | 36 +------------------
> >>   .../common/videobuf2/videobuf2-dma-contig.c   | 19 ----------
> >>   .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
> >>   .../media/common/videobuf2/videobuf2-v4l2.c   | 12 -------
> >>   include/media/videobuf2-core.h                |  3 +-
> >>   include/uapi/linux/videodev2.h                |  2 --
> >>   8 files changed, 3 insertions(+), 90 deletions(-)
> >>
> >> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
> >> index 57e752aaf414a7..2044ed13cd9d7d 100644
> >> --- a/Documentation/userspace-api/media/v4l/buffer.rst
> >> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> >> @@ -701,23 +701,6 @@ Memory Consistency Flags
> >>       :stub-columns: 0
> >>       :widths:       3 1 4
> >>
> >> -    * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
> >> -
> >> -      - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
> >> -      - 0x00000001
> >> -      - A buffer is allocated either in consistent (it will be automatically
> >> -       coherent between the CPU and the bus) or non-consistent memory. The
> >> -       latter can provide performance gains, for instance the CPU cache
> >> -       sync/flush operations can be avoided if the buffer is accessed by the
> >> -       corresponding device only and the CPU does not read/write to/from that
> >> -       buffer. However, this requires extra care from the driver -- it must
> >> -       guarantee memory consistency by issuing a cache flush/sync when
> >> -       consistency is needed. If this flag is set V4L2 will attempt to
> >> -       allocate the buffer in non-consistent memory. The flag takes effect
> >> -       only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
> >> -       queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
> >> -       <V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
> >> -
> >>   .. c:type:: v4l2_memory
> >>
> >>   enum v4l2_memory
> >> diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> index 75d894d9c36c42..3180c111d368ee 100644
> >> --- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> +++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
> >> @@ -169,7 +169,6 @@ aborting or finishing any DMA in progress, an implicit
> >>         - This capability is set by the driver to indicate that the queue supports
> >>           cache and memory management hints. However, it's only valid when the
> >>           queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
> >> -        :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
> >>           :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
> >>           :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
> >>
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> >> index f544d3393e9d6b..66a41cef33c1b1 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> >> @@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q,
> >>   }
> >>   EXPORT_SYMBOL(vb2_verify_memory_type);
> >>
> >> -static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
> >> -{
> >> -       q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
> >> -
> >> -       if (!vb2_queue_allows_cache_hints(q))
> >> -               return;
> >> -       if (!consistent_mem)
> >> -               q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
> >> -}
> >> -
> >> -static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
> >> -{
> >> -       bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
> >> -
> >> -       if (consistent_mem != queue_is_consistent) {
> >> -               dprintk(q, 1, "memory consistency model mismatch\n");
> >> -               return false;
> >> -       }
> >> -       return true;
> >> -}
> >> -
> >>   int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >>                       unsigned int flags, unsigned int *count)
> >>   {
> >>          unsigned int num_buffers, allocated_buffers, num_planes = 0;
> >>          unsigned plane_sizes[VB2_MAX_PLANES] = { };
> >> -       bool consistent_mem = true;
> >>          unsigned int i;
> >>          int ret;
> >>
> >> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> >> -               consistent_mem = false;
> >> -
> >>          if (q->streaming) {
> >>                  dprintk(q, 1, "streaming active\n");
> >>                  return -EBUSY;
> >> @@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >>          }
> >>
> >>          if (*count == 0 || q->num_buffers != 0 ||
> >> -           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
> >> -           !verify_consistency_attr(q, consistent_mem)) {
> >> +           (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
> >>                  /*
> >>                   * We already have buffers allocated, so first check if they
> >>                   * are not in use and can be freed.
> >> @@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> >>          num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
> >>          memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> >>          q->memory = memory;
> >> -       set_queue_consistency(q, consistent_mem);
> >>
> >>          /*
> >>           * Ask the driver how many buffers and planes per buffer it requires.
> >> @@ -894,12 +867,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> >>   {
> >>          unsigned int num_planes = 0, num_buffers, allocated_buffers;
> >>          unsigned plane_sizes[VB2_MAX_PLANES] = { };
> >> -       bool consistent_mem = true;
> >>          int ret;
> >>
> >> -       if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
> >> -               consistent_mem = false;
> >> -
> >>          if (q->num_buffers == VB2_MAX_FRAME) {
> >>                  dprintk(q, 1, "maximum number of buffers already allocated\n");
> >>                  return -ENOBUFS;
> >> @@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> >>                  }
> >>                  memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
> >>                  q->memory = memory;
> >> -               set_queue_consistency(q, consistent_mem);
> >>                  q->waiting_for_buffers = !q->is_output;
> >>          } else {
> >>                  if (q->memory != memory) {
> >>                          dprintk(q, 1, "memory model mismatch\n");
> >>                          return -EINVAL;
> >>                  }
> >> -               if (!verify_consistency_attr(q, consistent_mem))
> >> -                       return -EINVAL;
> >>          }
> >>
> >>          num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> index ec3446cc45b8da..7b1b86ec942d7d 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> >> @@ -42,11 +42,6 @@ struct vb2_dc_buf {
> >>          struct dma_buf_attachment       *db_attach;
> >>   };
> >>
> >> -static inline bool vb2_dc_buffer_consistent(unsigned long attr)
> >> -{
> >> -       return !(attr & DMA_ATTR_NON_CONSISTENT);
> >> -}
> >> -
> >>   /*********************************************/
> >>   /*        scatterlist table functions        */
> >>   /*********************************************/
> >> @@ -341,13 +336,6 @@ static int
> >>   vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
> >>                                     enum dma_data_direction direction)
> >>   {
> >> -       struct vb2_dc_buf *buf = dbuf->priv;
> >> -       struct sg_table *sgt = buf->dma_sgt;
> >> -
> >> -       if (vb2_dc_buffer_consistent(buf->attrs))
> >> -               return 0;
> >> -
> >> -       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
> >>          return 0;
> >>   }
> >>
> >> @@ -355,13 +343,6 @@ static int
> >>   vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
> >>                                   enum dma_data_direction direction)
> >>   {
> >> -       struct vb2_dc_buf *buf = dbuf->priv;
> >> -       struct sg_table *sgt = buf->dma_sgt;
> >> -
> >> -       if (vb2_dc_buffer_consistent(buf->attrs))
> >> -               return 0;
> >> -
> >> -       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
> >>          return 0;
> >>   }
> >>
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
> >> @@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
> >>          /*
> >>           * NOTE: dma-sg allocates memory using the page allocator directly, so
> >>           * there is no memory consistency guarantee, hence dma-sg ignores DMA
> >> -        * attributes passed from the upper layer. That means that
> >> -        * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
> >> +        * attributes passed from the upper layer.
> >>           */
> >>          buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
> >>                                      GFP_KERNEL | __GFP_ZERO);
> >> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> index 30caad27281e1a..de83ad48783821 100644
> >> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> >> @@ -722,20 +722,11 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
> >>   #endif
> >>   }
> >>
> >> -static void clear_consistency_attr(struct vb2_queue *q,
> >> -                                  int memory,
> >> -                                  unsigned int *flags)
> >> -{
> >> -       if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
> >> -               *flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
> >> -}
> >> -
> >>   int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
> >>   {
> >>          int ret = vb2_verify_memory_type(q, req->memory, req->type);
> >>
> >>          fill_buf_caps(q, &req->capabilities);
> >> -       clear_consistency_attr(q, req->memory, &req->flags);
> >>          return ret ? ret : vb2_core_reqbufs(q, req->memory,
> >>                                              req->flags, &req->count);
> >>   }
> >> @@ -769,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
> >>          unsigned i;
> >>
> >>          fill_buf_caps(q, &create->capabilities);
> >> -       clear_consistency_attr(q, create->memory, &create->flags);
> >>          create->index = q->num_buffers;
> >>          if (create->count == 0)
> >>                  return ret != -EBUSY ? ret : 0;
> >> @@ -998,7 +988,6 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
> >>          int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
> >>
> >>          fill_buf_caps(vdev->queue, &p->capabilities);
> >> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
> >>          if (res)
> >>                  return res;
> >>          if (vb2_queue_is_busy(vdev, file))
> >> @@ -1021,7 +1010,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
> >>
> >>          p->index = vdev->queue->num_buffers;
> >>          fill_buf_caps(vdev->queue, &p->capabilities);
> >> -       clear_consistency_attr(vdev->queue, p->memory, &p->flags);
> >>          /*
> >>           * If count == 0, then just check if memory and type are valid.
> >>           * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
> >> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> >> index 52ef92049073e3..4c7f25b07e9375 100644
> >> --- a/include/media/videobuf2-core.h
> >> +++ b/include/media/videobuf2-core.h
> >> @@ -744,8 +744,7 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
> >>    * vb2_core_reqbufs() - Initiate streaming.
> >>    * @q:         pointer to &struct vb2_queue with videobuf2 queue.
> >>    * @memory:    memory type, as defined by &enum vb2_memory.
> >> - * @flags:     auxiliary queue/buffer management flags. Currently, the only
> >> - *             used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
> >> + * @flags:     auxiliary queue/buffer management flags.
> >>    * @count:     requested buffer count.
> >>    *
> >>    * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
> >> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> >> index c7b70ff53bc1dd..5c00f63d9c1b58 100644
> >> --- a/include/uapi/linux/videodev2.h
> >> +++ b/include/uapi/linux/videodev2.h
> >> @@ -191,8 +191,6 @@ enum v4l2_memory {
> >>          V4L2_MEMORY_DMABUF           = 4,
> >>   };
> >>
> >> -#define V4L2_FLAG_MEMORY_NON_CONSISTENT                (1 << 0)
> >> -
> >>   /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
> >>   enum v4l2_colorspace {
> >>          /*
> >> --
> >> 2.28.0
> >>
> >> _______________________________________________
> >> iommu mailing list
> >> iommu@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
  
Christoph Hellwig Aug. 19, 2020, 1:54 p.m. UTC | #4
On Wed, Aug 19, 2020 at 01:16:51PM +0200, Tomasz Figa wrote:
> Hi Christoph,
> 
> On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> 
> Could you explain what makes you think it's unused? It's a feature of
> the UAPI generally supported by the videobuf2 framework and relied on
> by Chromium OS to get any kind of reasonable performance when
> accessing V4L2 buffers in the userspace.

Because it doesn't do anything except on PARISC and non-coherent MIPS,
so by definition it isn't used by any of these media drivers.
  
Christoph Hellwig Aug. 19, 2020, 1:57 p.m. UTC | #5
On Wed, Aug 19, 2020 at 02:49:01PM +0200, Tomasz Figa wrote:
> With the default config it doesn't, but with
> CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
> the pgprot value as is, without enforcing coherence attributes.

Which isn't selected on arm64, and that is for a good reason.

> AFAIK dma_cache_sync() isn't the only way to perform the cache
> synchronization.

Yes, it is the only documented way to do it.  And if you read the whole
series instead of screaming you'd see that it provides a proper way
to deal with non-coherent memory which will also work with arm64.
instead of screaming 

> By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
> series related to the subsystem-facing DMA API changes, since
> videobuf2 is one of the biggest users of it.

The cc list is too long - I cc lists and key maintainers.  As a reviewer
should should watch your subsystems lists closely.
  
Tomasz Figa Aug. 19, 2020, 1:57 p.m. UTC | #6
On Wed, Aug 19, 2020 at 3:55 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Aug 19, 2020 at 01:16:51PM +0200, Tomasz Figa wrote:
> > Hi Christoph,
> >
> > On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> >
> > Could you explain what makes you think it's unused? It's a feature of
> > the UAPI generally supported by the videobuf2 framework and relied on
> > by Chromium OS to get any kind of reasonable performance when
> > accessing V4L2 buffers in the userspace.
>
> Because it doesn't do anything except on PARISC and non-coherent MIPS,
> so by definition it isn't used by any of these media drivers.

It's still an UAPI feature, so we can't simply remove the flag, it
must stay there as a no-op, until the problem is resolved.

Also, it of course might be disputable as an out-of-tree usage, but
selecting CONFIG_DMA_NONCOHERENT_CACHE_SYNC makes the flag actually do
something on other platforms, including ARM64.

Best regards,
Tomasz
  
Robin Murphy Aug. 19, 2020, 2:07 p.m. UTC | #7
On 2020-08-19 13:49, Tomasz Figa wrote:
> On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> Hi Tomasz,
>>
>> On 2020-08-19 12:16, Tomasz Figa wrote:
>>> Hi Christoph,
>>>
>>> On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
>>>>
>>>> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
>>>
>>> Could you explain what makes you think it's unused? It's a feature of
>>> the UAPI generally supported by the videobuf2 framework and relied on
>>> by Chromium OS to get any kind of reasonable performance when
>>> accessing V4L2 buffers in the userspace.
>>>
>>>> and causes
>>>> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
>>>> unimplemented except on PARISC and some MIPS configs, and about to be
>>>> removed.
>>>
>>> It is implemented by the generic DMA mapping layer [1], which is used
>>> by a number of architectures including ARM64 and supposed to be used
>>> by new architectures going forward.
>>
>> AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up
>> controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.
>>
>> Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at
>> all on arm64?
>>
> 
> With the default config it doesn't, but with
> CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
> the pgprot value as is, without enforcing coherence attributes.

How active are the PA-RISC and MIPS ports of Chromium OS?

Hacking CONFIG_DMA_NONCOHERENT_CACHE_SYNC into an architecture that 
doesn't provide dma_cache_sync() is wrong, since at worst it may break 
other drivers. If downstream is wildly misusing an API then so be it, 
but it's hardly a strong basis for an upstream argument.

>> Also, I posit that videobuf2 is not actually relying on
>> DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:
>>
>> "By using this API, you are guaranteeing to the platform
>> that you have all the correct and necessary sync points for this memory
>> in the driver should it choose to return non-consistent memory."
>>
>> $ git grep dma_cache_sync drivers/media
>> $
> 
> AFAIK dma_cache_sync() isn't the only way to perform the cache
> synchronization. The earlier patch series that I reviewed relied on
> dma_get_sgtable() and then dma_sync_sg_*() (which existed in the
> vb2-dc since forever [1]). However, it looks like with the final code
> the sgtable isn't acquired and the synchronization isn't happening, so
> you have a point.

Using the streaming sync calls on coherent allocations has also always 
been wrong per the API, regardless of the bodies of code that have 
happened to get away with it for so long.

> FWIW, I asked back in time what the plan is for non-coherent
> allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
> dma_sync_*() was supposed to be the right thing to go with. [2] The
> same thread also explains why dma_alloc_pages() isn't suitable for the
> users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.

AFAICS even back then Christoph was implying getting rid of 
NON_CONSISTENT and *replacing* it with something streaming-API-based - 
i.e. this series - not encouraging mixing the existing APIs. It doesn't 
seem impossible to implement a remapping version of this new 
dma_alloc_pages() for IOMMU-backed ops if it's really warranted 
(although at that point it seems like "non-coherent" vb2-dc starts to 
have significant conceptual overlap with vb2-sg).

Robin.
  
Tomasz Figa Aug. 19, 2020, 2:11 p.m. UTC | #8
On Wed, Aug 19, 2020 at 3:57 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Aug 19, 2020 at 02:49:01PM +0200, Tomasz Figa wrote:
> > With the default config it doesn't, but with
> > CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
> > the pgprot value as is, without enforcing coherence attributes.
>
> Which isn't selected on arm64, and that is for a good reason.
>
> > AFAIK dma_cache_sync() isn't the only way to perform the cache
> > synchronization.
>
> Yes, it is the only documented way to do it.  And if you read the whole
> series instead of screaming you'd see that it provides a proper way
> to deal with non-coherent memory which will also work with arm64.
> instead of screaming
>

I'm sorry if I have offended you in any way, but would also appreciate
it if a less aggressive tone was directed towards me as well.

I have valid reasons to object to this patch, as stated in my previous
emails. The fact that the original feature has problems is of course
another story and, as I mentioned too, I'm willing to look into fixing
them.

I'm of course happy to review the rest of the series and even more
happy to help migrating this code to whatever is added there, as long
as the functionality is preserved.

> > By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
> > series related to the subsystem-facing DMA API changes, since
> > videobuf2 is one of the biggest users of it.
>
> The cc list is too long - I cc lists and key maintainers.  As a reviewer
> should should watch your subsystems lists closely.

Well, I guess we can disagree on this, because there is no clear
policy. I'm listed in the MAINTAINERS file for the subsystem and I
believe the purpose of the file is to list the people to CC on
relevant patches. We're all overloaded with work and having to look
through the huge volume of mailing lists like linux-media doesn't help
and thus I'd still appreciate being added on CC.

Best regards,
Tomasz
  
Tomasz Figa Aug. 19, 2020, 2:22 p.m. UTC | #9
On Wed, Aug 19, 2020 at 4:07 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-08-19 13:49, Tomasz Figa wrote:
> > On Wed, Aug 19, 2020 at 1:51 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >>
> >> Hi Tomasz,
> >>
> >> On 2020-08-19 12:16, Tomasz Figa wrote:
> >>> Hi Christoph,
> >>>
> >>> On Wed, Aug 19, 2020 at 8:56 AM Christoph Hellwig <hch@lst.de> wrote:
> >>>>
> >>>> The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused,
> >>>
> >>> Could you explain what makes you think it's unused? It's a feature of
> >>> the UAPI generally supported by the videobuf2 framework and relied on
> >>> by Chromium OS to get any kind of reasonable performance when
> >>> accessing V4L2 buffers in the userspace.
> >>>
> >>>> and causes
> >>>> weird gymanstics with the DMA_ATTR_NON_CONSISTENT flag, which is
> >>>> unimplemented except on PARISC and some MIPS configs, and about to be
> >>>> removed.
> >>>
> >>> It is implemented by the generic DMA mapping layer [1], which is used
> >>> by a number of architectures including ARM64 and supposed to be used
> >>> by new architectures going forward.
> >>
> >> AFAICS all that V4L2_FLAG_MEMORY_NON_CONSISTENT does is end up
> >> controling whether DMA_ATTR_NON_CONSISTENT is added to vb2_queue::dma_attrs.
> >>
> >> Please can you point to where DMA_ATTR_NON_CONSISTENT does anything at
> >> all on arm64?
> >>
> >
> > With the default config it doesn't, but with
> > CONFIG_DMA_NONCOHERENT_CACHE_SYNC enabled it makes dma_pgprot() keep
> > the pgprot value as is, without enforcing coherence attributes.
>
> How active are the PA-RISC and MIPS ports of Chromium OS?

Not active. We enable CONFIG_DMA_NONCOHERENT_CACHE_SYNC for ARM64,
given the directions received back in April when discussing the
noncoherent memory functionality on the mailing list in the thread I
pointed out in my previous message and no clarification on why it is
disabled for ARM64 in upstream, despite making several attempts to get
some.

>
> Hacking CONFIG_DMA_NONCOHERENT_CACHE_SYNC into an architecture that
> doesn't provide dma_cache_sync() is wrong, since at worst it may break
> other drivers. If downstream is wildly misusing an API then so be it,
> but it's hardly a strong basis for an upstream argument.

I guess it means that we're wildly misusing the API, but it still does
work. Could you explain how it could break other drivers?

>
> >> Also, I posit that videobuf2 is not actually relying on
> >> DMA_ATTR_NON_CONSISTENT anyway, since it's clearly not using it properly:
> >>
> >> "By using this API, you are guaranteeing to the platform
> >> that you have all the correct and necessary sync points for this memory
> >> in the driver should it choose to return non-consistent memory."
> >>
> >> $ git grep dma_cache_sync drivers/media
> >> $
> >
> > AFAIK dma_cache_sync() isn't the only way to perform the cache
> > synchronization. The earlier patch series that I reviewed relied on
> > dma_get_sgtable() and then dma_sync_sg_*() (which existed in the
> > vb2-dc since forever [1]). However, it looks like with the final code
> > the sgtable isn't acquired and the synchronization isn't happening, so
> > you have a point.
>
> Using the streaming sync calls on coherent allocations has also always
> been wrong per the API, regardless of the bodies of code that have
> happened to get away with it for so long.
>
> > FWIW, I asked back in time what the plan is for non-coherent
> > allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
> > dma_sync_*() was supposed to be the right thing to go with. [2] The
> > same thread also explains why dma_alloc_pages() isn't suitable for the
> > users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.
>
> AFAICS even back then Christoph was implying getting rid of
> NON_CONSISTENT and *replacing* it with something streaming-API-based -

That's not how I read his reply from the thread I pointed to, but that
might of course be my misunderstanding.

> i.e. this series - not encouraging mixing the existing APIs. It doesn't
> seem impossible to implement a remapping version of this new
> dma_alloc_pages() for IOMMU-backed ops if it's really warranted
> (although at that point it seems like "non-coherent" vb2-dc starts to
> have significant conceptual overlap with vb2-sg).

No, there is no overlap between vb2-dc and vb2-sg. They differ on
another level - the former is to be used by devices without
scatter-gather or internal mapping capabilities and gives the driver a
single DMA address for the whole buffer, regardless of whether it's
IOVA-contiguous (for devices behind an IOMMU) or physically contiguous
(for the others), while the latter gives the driver an sgtable, which
of course may be DMA-contiguous internally, but doesn't have to and
usually isn't. This model makes it possible to hide the SoC
implementation details from particular drivers, since those are very
often reused on many SoCs which differ in the availability of IOMMU,
DMA addressing restrictions and so on.

Best regards,
Tomasz
  
Christoph Hellwig Aug. 20, 2020, 4:43 a.m. UTC | #10
On Wed, Aug 19, 2020 at 03:57:53PM +0200, Tomasz Figa wrote:
> > > Could you explain what makes you think it's unused? It's a feature of
> > > the UAPI generally supported by the videobuf2 framework and relied on
> > > by Chromium OS to get any kind of reasonable performance when
> > > accessing V4L2 buffers in the userspace.
> >
> > Because it doesn't do anything except on PARISC and non-coherent MIPS,
> > so by definition it isn't used by any of these media drivers.
> 
> It's still an UAPI feature, so we can't simply remove the flag, it
> must stay there as a no-op, until the problem is resolved.

Ok, I'll switch to just ignoring it for the next version.

> Also, it of course might be disputable as an out-of-tree usage, but
> selecting CONFIG_DMA_NONCOHERENT_CACHE_SYNC makes the flag actually do
> something on other platforms, including ARM64.

It isn't just disputable, but by kernel policies simply is not relevant.
  
Christoph Hellwig Aug. 20, 2020, 4:45 a.m. UTC | #11
On Wed, Aug 19, 2020 at 04:11:52PM +0200, Tomasz Figa wrote:
> > > By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
> > > series related to the subsystem-facing DMA API changes, since
> > > videobuf2 is one of the biggest users of it.
> >
> > The cc list is too long - I cc lists and key maintainers.  As a reviewer
> > should should watch your subsystems lists closely.
> 
> Well, I guess we can disagree on this, because there is no clear
> policy. I'm listed in the MAINTAINERS file for the subsystem and I
> believe the purpose of the file is to list the people to CC on
> relevant patches. We're all overloaded with work and having to look
> through the huge volume of mailing lists like linux-media doesn't help
> and thus I'd still appreciate being added on CC.

I'm happy to Cc and active participant in the discussion.  I'm not
going to add all reviewers because even with the trimmed CC list
I'm already hitting the number of receipients limit on various lists.
  
Christoph Hellwig Aug. 20, 2020, 4:52 a.m. UTC | #12
On Wed, Aug 19, 2020 at 04:22:29PM +0200, Tomasz Figa wrote:
> > > FWIW, I asked back in time what the plan is for non-coherent
> > > allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
> > > dma_sync_*() was supposed to be the right thing to go with. [2] The
> > > same thread also explains why dma_alloc_pages() isn't suitable for the
> > > users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.
> >
> > AFAICS even back then Christoph was implying getting rid of
> > NON_CONSISTENT and *replacing* it with something streaming-API-based -
> 
> That's not how I read his reply from the thread I pointed to, but that
> might of course be my misunderstanding.

Yes.  Without changes like in this series just calling dma_sync_single_*
will break in various cases, e.g. because dma_alloc_attrs returns
memory remapped in the vmalloc space, and the dma_sync_single_*
implementation implementation can't cope with vmalloc addresses.
  
Christoph Hellwig Aug. 20, 2020, 5:02 a.m. UTC | #13
On Wed, Aug 19, 2020 at 03:07:04PM +0100, Robin Murphy wrote:
>> FWIW, I asked back in time what the plan is for non-coherent
>> allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
>> dma_sync_*() was supposed to be the right thing to go with. [2] The
>> same thread also explains why dma_alloc_pages() isn't suitable for the
>> users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.
>
> AFAICS even back then Christoph was implying getting rid of NON_CONSISTENT 
> and *replacing* it with something streaming-API-based - i.e. this series - 
> not encouraging mixing the existing APIs. It doesn't seem impossible to 
> implement a remapping version of this new dma_alloc_pages() for 
> IOMMU-backed ops if it's really warranted (although at that point it seems 
> like "non-coherent" vb2-dc starts to have significant conceptual overlap 
> with vb2-sg).

You can alway vmap the returned pages from dma_alloc_pages, but it will
make cache invalidation hell - you'll need to use
invalidate_kernel_vmap_range and flush_kernel_vmap_range to properly
handle virtually indexed caches.

Or with remapping you mean using the iommu do de-scatter/gather?

You can implement that trivially implement it yourself for the iommu
case:

{
	merge_boundary = dma_get_merge_boundary(dev);
	if (!merge_boundary || merge_boundary > chunk_size - 1) {
		/* can't coalesce */
		return -EINVAL;
	}

	
	nents = DIV_ROUND_UP(total_size, chunk_size);
	sg = sgl_alloc();
	for_each_sgl() {
		sg->page = __alloc_pages(get_order(chunk_size))
		sg->len = chunk_size;
	}
	dma_map_sg(sg, DMA_ATTR_SKIP_CPU_SYNC);
	// you are guaranteed to get a single dma_addr out
}

Of course this still uses the scatterlist structure with its annoying
mix of input and output parametes, so I'd rather not expose it as
an official API at the DMA layer.
  
Christoph Hellwig Aug. 20, 2020, 5:20 a.m. UTC | #14
On Thu, Aug 20, 2020 at 06:43:47AM +0200, Christoph Hellwig wrote:
> On Wed, Aug 19, 2020 at 03:57:53PM +0200, Tomasz Figa wrote:
> > > > Could you explain what makes you think it's unused? It's a feature of
> > > > the UAPI generally supported by the videobuf2 framework and relied on
> > > > by Chromium OS to get any kind of reasonable performance when
> > > > accessing V4L2 buffers in the userspace.
> > >
> > > Because it doesn't do anything except on PARISC and non-coherent MIPS,
> > > so by definition it isn't used by any of these media drivers.
> > 
> > It's still an UAPI feature, so we can't simply remove the flag, it
> > must stay there as a no-op, until the problem is resolved.
> 
> Ok, I'll switch to just ignoring it for the next version.

So I took a deeper look.  I don't really think it qualifies as a UAPI
in our traditional sense.  For one it only appeared in 5.9-rc1, so we
can trivially expedite the patch into 5.9-rc and not actually make it
show up in any released kernel version.  And even as of the current
Linus' tree the only user is a test driver.  So I really think the best
way to go ahead is to just revert it ASAP as the design wasn't thought
out at all.
  
Tomasz Figa Aug. 20, 2020, 10:05 a.m. UTC | #15
On Thu, Aug 20, 2020 at 7:20 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Aug 20, 2020 at 06:43:47AM +0200, Christoph Hellwig wrote:
> > On Wed, Aug 19, 2020 at 03:57:53PM +0200, Tomasz Figa wrote:
> > > > > Could you explain what makes you think it's unused? It's a feature of
> > > > > the UAPI generally supported by the videobuf2 framework and relied on
> > > > > by Chromium OS to get any kind of reasonable performance when
> > > > > accessing V4L2 buffers in the userspace.
> > > >
> > > > Because it doesn't do anything except on PARISC and non-coherent MIPS,
> > > > so by definition it isn't used by any of these media drivers.
> > >
> > > It's still an UAPI feature, so we can't simply remove the flag, it
> > > must stay there as a no-op, until the problem is resolved.
> >
> > Ok, I'll switch to just ignoring it for the next version.
>
> So I took a deeper look.  I don't really think it qualifies as a UAPI
> in our traditional sense.  For one it only appeared in 5.9-rc1, so we
> can trivially expedite the patch into 5.9-rc and not actually make it
> show up in any released kernel version.  And even as of the current
> Linus' tree the only user is a test driver.  So I really think the best
> way to go ahead is to just revert it ASAP as the design wasn't thought
> out at all.

The UAPI and V4L2/videobuf2 changes are in good shape and the only
wrong part is the use of DMA API, which was based on an earlier email
guidance anyway, and a change to the synchronization part . I find
conclusions like the above insulting for people who put many hours
into designing and implementing the related functionality, given the
complexity of the videobuf2 framework and how ill-defined the DMA API
was, and would feel better if such could be avoided in future
communication.

That said, we can revert it on the basis of the implementation issues,
but I feel like we wouldn't get anything by doing so, because as I
said, the design is sane and most of the implementation is fine as
well. Instead. I'd suggest simply removing the use of the attribute
being removed, so that the feature stays no-op until the DMA API
provides a way to implement it or we just migrate videobuf2 to stop
using the DMA API as much as possible, like many drivers in the DRM
subsystem did.

Best regards,
Tomasz
  
Tomasz Figa Aug. 20, 2020, 10:09 a.m. UTC | #16
On Thu, Aug 20, 2020 at 6:45 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Aug 19, 2020 at 04:11:52PM +0200, Tomasz Figa wrote:
> > > > By the way, as a videobuf2 reviewer, I'd appreciate being CC'd on any
> > > > series related to the subsystem-facing DMA API changes, since
> > > > videobuf2 is one of the biggest users of it.
> > >
> > > The cc list is too long - I cc lists and key maintainers.  As a reviewer
> > > should should watch your subsystems lists closely.
> >
> > Well, I guess we can disagree on this, because there is no clear
> > policy. I'm listed in the MAINTAINERS file for the subsystem and I
> > believe the purpose of the file is to list the people to CC on
> > relevant patches. We're all overloaded with work and having to look
> > through the huge volume of mailing lists like linux-media doesn't help
> > and thus I'd still appreciate being added on CC.
>
> I'm happy to Cc and active participant in the discussion.  I'm not
> going to add all reviewers because even with the trimmed CC list
> I'm already hitting the number of receipients limit on various lists.

Fair enough.

We'll make your job easier and just turn my MAINTAINERS entry into a
maintainer. :)

Best regards,
Tomasz
  
Tomasz Figa Aug. 20, 2020, 10:24 a.m. UTC | #17
On Thu, Aug 20, 2020 at 7:02 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Aug 19, 2020 at 03:07:04PM +0100, Robin Murphy wrote:
> >> FWIW, I asked back in time what the plan is for non-coherent
> >> allocations and it seemed like DMA_ATTR_NON_CONSISTENT and
> >> dma_sync_*() was supposed to be the right thing to go with. [2] The
> >> same thread also explains why dma_alloc_pages() isn't suitable for the
> >> users of dma_alloc_attrs() and DMA_ATTR_NON_CONSISTENT.
> >
> > AFAICS even back then Christoph was implying getting rid of NON_CONSISTENT
> > and *replacing* it with something streaming-API-based - i.e. this series -
> > not encouraging mixing the existing APIs. It doesn't seem impossible to
> > implement a remapping version of this new dma_alloc_pages() for
> > IOMMU-backed ops if it's really warranted (although at that point it seems
> > like "non-coherent" vb2-dc starts to have significant conceptual overlap
> > with vb2-sg).
>
> You can alway vmap the returned pages from dma_alloc_pages, but it will
> make cache invalidation hell - you'll need to use
> invalidate_kernel_vmap_range and flush_kernel_vmap_range to properly
> handle virtually indexed caches.
>
> Or with remapping you mean using the iommu do de-scatter/gather?

Ideally, both.

For remapping in the CPU sense, there are drivers which rely on a
contiguous kernel mapping of the vb2 buffers, which was provided by
dma_alloc_attrs(). I think they could be reworked to work on single
pages, but that would significantly complicate the code. At the same
time, such drivers would actually benefit from a cached mapping,
because they often have non-bursty, random access patterns.

Then, in the IOMMU sense, the whole idea of videobuf2-dma-contig is to
rely on the DMA API to always provide device-contiguous memory, as
required by the hardware which only has a single pointer and size.

>
> You can implement that trivially implement it yourself for the iommu
> case:
>
> {
>         merge_boundary = dma_get_merge_boundary(dev);
>         if (!merge_boundary || merge_boundary > chunk_size - 1) {
>                 /* can't coalesce */
>                 return -EINVAL;
>         }
>
>
>         nents = DIV_ROUND_UP(total_size, chunk_size);
>         sg = sgl_alloc();
>         for_each_sgl() {
>                 sg->page = __alloc_pages(get_order(chunk_size))
>                 sg->len = chunk_size;
>         }
>         dma_map_sg(sg, DMA_ATTR_SKIP_CPU_SYNC);
>         // you are guaranteed to get a single dma_addr out
> }
>
> Of course this still uses the scatterlist structure with its annoying
> mix of input and output parametes, so I'd rather not expose it as
> an official API at the DMA layer.

The problem with the above open coded approach is that it requires
explicit handling of the non-IOMMU and IOMMU cases and this is exactly
what we don't want to have in vb2 and what was actually the job of the
DMA API to hide. Is the plan to actually move the IOMMU handling out
of the DMA API?

Do you think we could instead turn it into a dma_alloc_noncoherent()
helper, which has similar semantics as dma_alloc_attrs() and handles
the various corner cases (e.g. invalidate_kernel_vmap_range and
flush_kernel_vmap_range) to achieve the desired functionality without
delegating the "hell", as you called it, to the users?

Best regards,
Tomasz
  
Christoph Hellwig Aug. 20, 2020, 4:51 p.m. UTC | #18
On Thu, Aug 20, 2020 at 12:09:34PM +0200, Tomasz Figa wrote:
> > I'm happy to Cc and active participant in the discussion.  I'm not
> > going to add all reviewers because even with the trimmed CC list
> > I'm already hitting the number of receipients limit on various lists.
> 
> Fair enough.
> 
> We'll make your job easier and just turn my MAINTAINERS entry into a
> maintainer. :)

Sounds like a plan.
  
Christoph Hellwig Aug. 20, 2020, 4:52 p.m. UTC | #19
On Thu, Aug 20, 2020 at 12:24:31PM +0200, Tomasz Figa wrote:
> > Of course this still uses the scatterlist structure with its annoying
> > mix of input and output parametes, so I'd rather not expose it as
> > an official API at the DMA layer.
> 
> The problem with the above open coded approach is that it requires
> explicit handling of the non-IOMMU and IOMMU cases and this is exactly
> what we don't want to have in vb2 and what was actually the job of the
> DMA API to hide. Is the plan to actually move the IOMMU handling out
> of the DMA API?
> 
> Do you think we could instead turn it into a dma_alloc_noncoherent()
> helper, which has similar semantics as dma_alloc_attrs() and handles
> the various corner cases (e.g. invalidate_kernel_vmap_range and
> flush_kernel_vmap_range) to achieve the desired functionality without
> delegating the "hell", as you called it, to the users?

Yes, I guess I could do something in that direction.  At least for
dma-iommu, which thanks to Robin should be all you'll need in the
foreseeable future.
  
Christoph Hellwig Aug. 20, 2020, 4:54 p.m. UTC | #20
On Thu, Aug 20, 2020 at 12:05:29PM +0200, Tomasz Figa wrote:
> The UAPI and V4L2/videobuf2 changes are in good shape and the only
> wrong part is the use of DMA API, which was based on an earlier email
> guidance anyway, and a change to the synchronization part . I find
> conclusions like the above insulting for people who put many hours
> into designing and implementing the related functionality, given the
> complexity of the videobuf2 framework and how ill-defined the DMA API
> was, and would feel better if such could be avoided in future
> communication.

It wasn't meant to be too insulting, but I found this out when trying
to figure out how to just disable it.  But it also ends up using
the actual dma attr flags for it's own consistency checks, so just
not setting the flag did not turn out to work that easily.

But in general it helps to add a few more people to the Cc list for
such things that do stranger things.  Especially if you think you did
it based on the advice of those people.
  
Tomasz Figa Aug. 20, 2020, 5:33 p.m. UTC | #21
On Thu, Aug 20, 2020 at 6:54 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Aug 20, 2020 at 12:05:29PM +0200, Tomasz Figa wrote:
> > The UAPI and V4L2/videobuf2 changes are in good shape and the only
> > wrong part is the use of DMA API, which was based on an earlier email
> > guidance anyway, and a change to the synchronization part . I find
> > conclusions like the above insulting for people who put many hours
> > into designing and implementing the related functionality, given the
> > complexity of the videobuf2 framework and how ill-defined the DMA API
> > was, and would feel better if such could be avoided in future
> > communication.
>
> It wasn't meant to be too insulting, but I found this out when trying
> to figure out how to just disable it.  But it also ends up using
> the actual dma attr flags for it's own consistency checks, so just
> not setting the flag did not turn out to work that easily.
>

Yes, sadly the videobuf2 ended up becoming quite counterintuitive
after growing for the long years and that is reflected in the design
of this feature as well. I think we need to do something about it.

> But in general it helps to add a few more people to the Cc list for
> such things that do stranger things.  Especially if you think you did
> it based on the advice of those people.

Indeed, we should have CCed you and other DMA folks. Sergey who worked
on this series is quite new to these areas of the kernel (although not
to the kernel itself) and it's my fault for not explicitly letting him
know to do that.

Best regards,
Tomasz
  
Tomasz Figa Aug. 20, 2020, 5:41 p.m. UTC | #22
On Thu, Aug 20, 2020 at 6:52 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Aug 20, 2020 at 12:24:31PM +0200, Tomasz Figa wrote:
> > > Of course this still uses the scatterlist structure with its annoying
> > > mix of input and output parametes, so I'd rather not expose it as
> > > an official API at the DMA layer.
> >
> > The problem with the above open coded approach is that it requires
> > explicit handling of the non-IOMMU and IOMMU cases and this is exactly
> > what we don't want to have in vb2 and what was actually the job of the
> > DMA API to hide. Is the plan to actually move the IOMMU handling out
> > of the DMA API?
> >
> > Do you think we could instead turn it into a dma_alloc_noncoherent()
> > helper, which has similar semantics as dma_alloc_attrs() and handles
> > the various corner cases (e.g. invalidate_kernel_vmap_range and
> > flush_kernel_vmap_range) to achieve the desired functionality without
> > delegating the "hell", as you called it, to the users?
>
> Yes, I guess I could do something in that direction.  At least for
> dma-iommu, which thanks to Robin should be all you'll need in the
> foreseeable future.

That would be really great. Let me know if we can help by testing with
V4L2/vb2 or in any other way.

Best regards,
Tomasz
  
Christoph Hellwig Sept. 1, 2020, 11:06 a.m. UTC | #23
On Thu, Aug 20, 2020 at 07:33:48PM +0200, Tomasz Figa wrote:
> > It wasn't meant to be too insulting, but I found this out when trying
> > to figure out how to just disable it.  But it also ends up using
> > the actual dma attr flags for it's own consistency checks, so just
> > not setting the flag did not turn out to work that easily.
> >
> 
> Yes, sadly the videobuf2 ended up becoming quite counterintuitive
> after growing for the long years and that is reflected in the design
> of this feature as well. I think we need to do something about it.

So I'm about to respin the series and wonder how we should proceed.
I've failed to come up with a clean patch to keep the flag and make
it a no-op.  Can you or your team give it a spin?

Also I wonder if the flag should be renamed from NON_CONSISTENT
to NON_COHERENT - the consistent thing is a weird wart from the times
the old PCI DMA API that is mostly gone now.
  
Tomasz Figa Sept. 1, 2020, 3:02 p.m. UTC | #24
On Tue, Sep 1, 2020 at 1:06 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Aug 20, 2020 at 07:33:48PM +0200, Tomasz Figa wrote:
> > > It wasn't meant to be too insulting, but I found this out when trying
> > > to figure out how to just disable it.  But it also ends up using
> > > the actual dma attr flags for it's own consistency checks, so just
> > > not setting the flag did not turn out to work that easily.
> > >
> >
> > Yes, sadly the videobuf2 ended up becoming quite counterintuitive
> > after growing for the long years and that is reflected in the design
> > of this feature as well. I think we need to do something about it.
>
> So I'm about to respin the series and wonder how we should proceed.
> I've failed to come up with a clean patch to keep the flag and make
> it a no-op.  Can you or your team give it a spin?
>

Okay, I'll take a look.

> Also I wonder if the flag should be renamed from NON_CONSISTENT
> to NON_COHERENT - the consistent thing is a weird wart from the times
> the old PCI DMA API that is mostly gone now.

It originated from the DMA_ATTR_NON_CONSISTENT flag, but agreed that
NON_COHERENT would be more consistent (pun not intended) with the rest
of the DMA API given the removal of that flag. Let me see if we can
still change it.

Best regards,
Tomasz
  
Tomasz Figa Sept. 8, 2020, 9:58 p.m. UTC | #25
Hi Hans, Mauro,

On Tue, Sep 1, 2020 at 5:02 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Tue, Sep 1, 2020 at 1:06 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Thu, Aug 20, 2020 at 07:33:48PM +0200, Tomasz Figa wrote:
> > > > It wasn't meant to be too insulting, but I found this out when trying
> > > > to figure out how to just disable it.  But it also ends up using
> > > > the actual dma attr flags for it's own consistency checks, so just
> > > > not setting the flag did not turn out to work that easily.
> > > >
> > >
> > > Yes, sadly the videobuf2 ended up becoming quite counterintuitive
> > > after growing for the long years and that is reflected in the design
> > > of this feature as well. I think we need to do something about it.
> >
> > So I'm about to respin the series and wonder how we should proceed.
> > I've failed to come up with a clean patch to keep the flag and make
> > it a no-op.  Can you or your team give it a spin?
> >
>
> Okay, I'll take a look.
>
> > Also I wonder if the flag should be renamed from NON_CONSISTENT
> > to NON_COHERENT - the consistent thing is a weird wart from the times
> > the old PCI DMA API that is mostly gone now.
>
> It originated from the DMA_ATTR_NON_CONSISTENT flag, but agreed that
> NON_COHERENT would be more consistent (pun not intended) with the rest
> of the DMA API given the removal of that flag. Let me see if we can
> still change it.

Given the above, we would like to make changes that affect the UAPI.
Would you still be able to revert this series?

Best regards,
Tomasz
  
Tomasz Figa Sept. 8, 2020, 10:09 p.m. UTC | #26
On Tue, Sep 8, 2020 at 11:58 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Hans, Mauro,
>
> On Tue, Sep 1, 2020 at 5:02 PM Tomasz Figa <tfiga@chromium.org> wrote:
> >
> > On Tue, Sep 1, 2020 at 1:06 PM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > On Thu, Aug 20, 2020 at 07:33:48PM +0200, Tomasz Figa wrote:
> > > > > It wasn't meant to be too insulting, but I found this out when trying
> > > > > to figure out how to just disable it.  But it also ends up using
> > > > > the actual dma attr flags for it's own consistency checks, so just
> > > > > not setting the flag did not turn out to work that easily.
> > > > >
> > > >
> > > > Yes, sadly the videobuf2 ended up becoming quite counterintuitive
> > > > after growing for the long years and that is reflected in the design
> > > > of this feature as well. I think we need to do something about it.
> > >
> > > So I'm about to respin the series and wonder how we should proceed.
> > > I've failed to come up with a clean patch to keep the flag and make
> > > it a no-op.  Can you or your team give it a spin?
> > >
> >
> > Okay, I'll take a look.
> >
> > > Also I wonder if the flag should be renamed from NON_CONSISTENT
> > > to NON_COHERENT - the consistent thing is a weird wart from the times
> > > the old PCI DMA API that is mostly gone now.
> >
> > It originated from the DMA_ATTR_NON_CONSISTENT flag, but agreed that
> > NON_COHERENT would be more consistent (pun not intended) with the rest
> > of the DMA API given the removal of that flag. Let me see if we can
> > still change it.
>
> Given the above, we would like to make changes that affect the UAPI.
> Would you still be able to revert this series?

Sorry, I just realized that this isn't the original series that
introduced the thing, but rather a patch that does a partial revert. I
think it could be also applied as an alternative for the revert, but
perhaps a full series revert is just safer at this point?

For reference, the series in question is:
https://patchwork.linuxtv.org/project/linux-media/cover/20200514160153.3646-1-sergey.senozhatsky@gmail.com/

Best regards,
tomasz
  
Sergey Senozhatsky Sept. 10, 2020, 9:49 a.m. UTC | #27
Hi,

On (20/09/08 23:58), Tomasz Figa wrote:
> 
> Given the above, we would like to make changes that affect the UAPI.
> Would you still be able to revert this series?
>

If we want to apply only "media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT"
patch and keep the rest of the buffer cache hints series in the kernel, then
I'd add one or two more patches. We don't need ->flags argument in
create_bufs() and reqbufs() functions, that argument was introduced in order
to pass in the requested coherency flag.


Now.

Then we have a question - do we need flags member in struct
v4l2_requestbuffers and v4l2_create_buffers or shall we just
return back those 4 bytes to reserved[]? We pass BUF_FLAG_NO_CACHE_INVALIDATE
and V4L2_BUF_FLAG_NO_CACHE_SYNC in struct v4l2_buffer.flags.

If we decide to remove v4l2_requestbuffers and v4l2_create_buffers ->flags,
then we also need to rollback documentation changes.

Then we need to change CLEAR_AFTER_FIELD(foo, capabilities) in
v4l2-ioctl to zero out reserved[] areas in v4l2_requestbuffers
and v4l2_create_buffers. I think v4l2_create_buffers is fine,
but requstbuffers has flags and reversed[1] in the union so for
requestbuffers we simply removed the CLEAR_AFTER_FIELD() and
hence dropped the corresponding check from v4l-compliance.

So, do we plan on using .flags in v4l2_requestbuffers and
v4l2_create_buffers?


- create_bufs()/reqbufs() flags argument removal patch
(no struct-s/documentation cleanup yet).

====8<====
Subject: [PATCH] remove redundant flags argument

---
 drivers/media/common/videobuf2/videobuf2-core.c | 10 +++++-----
 drivers/media/common/videobuf2/videobuf2-v4l2.c |  6 ++----
 include/media/videobuf2-core.h                  |  6 ++----
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 66a41cef33c1..4eab6d81cce1 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -722,7 +722,7 @@ int vb2_verify_memory_type(struct vb2_queue *q,
 EXPORT_SYMBOL(vb2_verify_memory_type);
 
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
-		     unsigned int flags, unsigned int *count)
+		     unsigned int *count)
 {
 	unsigned int num_buffers, allocated_buffers, num_planes = 0;
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
@@ -861,7 +861,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 EXPORT_SYMBOL_GPL(vb2_core_reqbufs);
 
 int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
-			 unsigned int flags, unsigned int *count,
+			 unsigned int *count,
 			 unsigned int requested_planes,
 			 const unsigned int requested_sizes[])
 {
@@ -2547,7 +2547,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 	fileio->memory = VB2_MEMORY_MMAP;
 	fileio->type = q->type;
 	q->fileio = fileio;
-	ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+	ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 	if (ret)
 		goto err_kfree;
 
@@ -2604,7 +2604,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 
 err_reqbufs:
 	fileio->count = 0;
-	vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+	vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 
 err_kfree:
 	q->fileio = NULL;
@@ -2624,7 +2624,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
 		vb2_core_streamoff(q, q->type);
 		q->fileio = NULL;
 		fileio->count = 0;
-		vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+		vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 		kfree(fileio);
 		dprintk(q, 3, "file io emulator closed\n");
 	}
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index d2879f53455b..96d3b2b2aa31 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -728,8 +728,7 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
 	int ret = vb2_verify_memory_type(q, req->memory, req->type);
 
 	fill_buf_caps(q, &req->capabilities);
-	return ret ? ret : vb2_core_reqbufs(q, req->memory,
-					    req->flags, &req->count);
+	return ret ? ret : vb2_core_reqbufs(q, req->memory, &req->count);
 }
 EXPORT_SYMBOL_GPL(vb2_reqbufs);
 
@@ -804,7 +803,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
 		if (requested_sizes[i] == 0)
 			return -EINVAL;
 	return ret ? ret : vb2_core_create_bufs(q, create->memory,
-						create->flags,
 						&create->count,
 						requested_planes,
 						requested_sizes);
@@ -993,7 +991,7 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
 		return res;
 	if (vb2_queue_is_busy(vdev, file))
 		return -EBUSY;
-	res = vb2_core_reqbufs(vdev->queue, p->memory, p->flags, &p->count);
+	res = vb2_core_reqbufs(vdev->queue, p->memory, &p->count);
 	/* If count == 0, then the owner has released all buffers and he
 	   is no longer owner of the queue. Otherwise we have a new owner. */
 	if (res == 0)
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 4c7f25b07e93..bbb3f26fbde9 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -744,7 +744,6 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
  * vb2_core_reqbufs() - Initiate streaming.
  * @q:		pointer to &struct vb2_queue with videobuf2 queue.
  * @memory:	memory type, as defined by &enum vb2_memory.
- * @flags:	auxiliary queue/buffer management flags.
  * @count:	requested buffer count.
  *
  * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
@@ -769,13 +768,12 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
  * Return: returns zero on success; an error code otherwise.
  */
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
-		    unsigned int flags, unsigned int *count);
+		    unsigned int *count);
 
 /**
  * vb2_core_create_bufs() - Allocate buffers and any required auxiliary structs
  * @q: pointer to &struct vb2_queue with videobuf2 queue.
  * @memory: memory type, as defined by &enum vb2_memory.
- * @flags: auxiliary queue/buffer management flags.
  * @count: requested buffer count.
  * @requested_planes: number of planes requested.
  * @requested_sizes: array with the size of the planes.
@@ -793,7 +791,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
  * Return: returns zero on success; an error code otherwise.
  */
 int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
-			 unsigned int flags, unsigned int *count,
+			 unsigned int *count,
 			 unsigned int requested_planes,
 			 const unsigned int requested_sizes[]);
  
Hans Verkuil Sept. 10, 2020, 9:57 a.m. UTC | #28
On 10/09/2020 11:49, Sergey Senozhatsky wrote:
> Hi,
> 
> On (20/09/08 23:58), Tomasz Figa wrote:
>>
>> Given the above, we would like to make changes that affect the UAPI.
>> Would you still be able to revert this series?
>>
> 
> If we want to apply only "media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT"
> patch and keep the rest of the buffer cache hints series in the kernel, then
> I'd add one or two more patches. We don't need ->flags argument in
> create_bufs() and reqbufs() functions, that argument was introduced in order
> to pass in the requested coherency flag.
> 
> 
> Now.
> 
> Then we have a question - do we need flags member in struct
> v4l2_requestbuffers and v4l2_create_buffers or shall we just
> return back those 4 bytes to reserved[]? We pass BUF_FLAG_NO_CACHE_INVALIDATE
> and V4L2_BUF_FLAG_NO_CACHE_SYNC in struct v4l2_buffer.flags.

Drop the now unused flags member in v4l2_requestbuffers and v4l2_create_buffers.

> 
> If we decide to remove v4l2_requestbuffers and v4l2_create_buffers ->flags,
> then we also need to rollback documentation changes.

Correct.

> 
> Then we need to change CLEAR_AFTER_FIELD(foo, capabilities) in
> v4l2-ioctl to zero out reserved[] areas in v4l2_requestbuffers
> and v4l2_create_buffers. I think v4l2_create_buffers is fine,
> but requstbuffers has flags and reversed[1] in the union so for
> requestbuffers we simply removed the CLEAR_AFTER_FIELD() and
> hence dropped the corresponding check from v4l-compliance.
> 
> So, do we plan on using .flags in v4l2_requestbuffers and
> v4l2_create_buffers?

Perhaps, but this patch is meant to revert *all* changes relating to
V4L2_FLAG_MEMORY_NON_CONSISTENT. We don't want to have unused fields
in the public API.

Regards,

	Hans

> 
> 
> - create_bufs()/reqbufs() flags argument removal patch
> (no struct-s/documentation cleanup yet).
> 
> ====8<====
> Subject: [PATCH] remove redundant flags argument
> 
> ---
>  drivers/media/common/videobuf2/videobuf2-core.c | 10 +++++-----
>  drivers/media/common/videobuf2/videobuf2-v4l2.c |  6 ++----
>  include/media/videobuf2-core.h                  |  6 ++----
>  3 files changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 66a41cef33c1..4eab6d81cce1 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -722,7 +722,7 @@ int vb2_verify_memory_type(struct vb2_queue *q,
>  EXPORT_SYMBOL(vb2_verify_memory_type);
>  
>  int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> -		     unsigned int flags, unsigned int *count)
> +		     unsigned int *count)
>  {
>  	unsigned int num_buffers, allocated_buffers, num_planes = 0;
>  	unsigned plane_sizes[VB2_MAX_PLANES] = { };
> @@ -861,7 +861,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>  EXPORT_SYMBOL_GPL(vb2_core_reqbufs);
>  
>  int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> -			 unsigned int flags, unsigned int *count,
> +			 unsigned int *count,
>  			 unsigned int requested_planes,
>  			 const unsigned int requested_sizes[])
>  {
> @@ -2547,7 +2547,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>  	fileio->memory = VB2_MEMORY_MMAP;
>  	fileio->type = q->type;
>  	q->fileio = fileio;
> -	ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
> +	ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
>  	if (ret)
>  		goto err_kfree;
>  
> @@ -2604,7 +2604,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>  
>  err_reqbufs:
>  	fileio->count = 0;
> -	vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
> +	vb2_core_reqbufs(q, fileio->memory, &fileio->count);
>  
>  err_kfree:
>  	q->fileio = NULL;
> @@ -2624,7 +2624,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
>  		vb2_core_streamoff(q, q->type);
>  		q->fileio = NULL;
>  		fileio->count = 0;
> -		vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
> +		vb2_core_reqbufs(q, fileio->memory, &fileio->count);
>  		kfree(fileio);
>  		dprintk(q, 3, "file io emulator closed\n");
>  	}
> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> index d2879f53455b..96d3b2b2aa31 100644
> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
> @@ -728,8 +728,7 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
>  	int ret = vb2_verify_memory_type(q, req->memory, req->type);
>  
>  	fill_buf_caps(q, &req->capabilities);
> -	return ret ? ret : vb2_core_reqbufs(q, req->memory,
> -					    req->flags, &req->count);
> +	return ret ? ret : vb2_core_reqbufs(q, req->memory, &req->count);
>  }
>  EXPORT_SYMBOL_GPL(vb2_reqbufs);
>  
> @@ -804,7 +803,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
>  		if (requested_sizes[i] == 0)
>  			return -EINVAL;
>  	return ret ? ret : vb2_core_create_bufs(q, create->memory,
> -						create->flags,
>  						&create->count,
>  						requested_planes,
>  						requested_sizes);
> @@ -993,7 +991,7 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
>  		return res;
>  	if (vb2_queue_is_busy(vdev, file))
>  		return -EBUSY;
> -	res = vb2_core_reqbufs(vdev->queue, p->memory, p->flags, &p->count);
> +	res = vb2_core_reqbufs(vdev->queue, p->memory, &p->count);
>  	/* If count == 0, then the owner has released all buffers and he
>  	   is no longer owner of the queue. Otherwise we have a new owner. */
>  	if (res == 0)
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index 4c7f25b07e93..bbb3f26fbde9 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -744,7 +744,6 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
>   * vb2_core_reqbufs() - Initiate streaming.
>   * @q:		pointer to &struct vb2_queue with videobuf2 queue.
>   * @memory:	memory type, as defined by &enum vb2_memory.
> - * @flags:	auxiliary queue/buffer management flags.
>   * @count:	requested buffer count.
>   *
>   * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
> @@ -769,13 +768,12 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
>   * Return: returns zero on success; an error code otherwise.
>   */
>  int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
> -		    unsigned int flags, unsigned int *count);
> +		    unsigned int *count);
>  
>  /**
>   * vb2_core_create_bufs() - Allocate buffers and any required auxiliary structs
>   * @q: pointer to &struct vb2_queue with videobuf2 queue.
>   * @memory: memory type, as defined by &enum vb2_memory.
> - * @flags: auxiliary queue/buffer management flags.
>   * @count: requested buffer count.
>   * @requested_planes: number of planes requested.
>   * @requested_sizes: array with the size of the planes.
> @@ -793,7 +791,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>   * Return: returns zero on success; an error code otherwise.
>   */
>  int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
> -			 unsigned int flags, unsigned int *count,
> +			 unsigned int *count,
>  			 unsigned int requested_planes,
>  			 const unsigned int requested_sizes[]);
>  
>
  
Sergey Senozhatsky Sept. 10, 2020, 10:14 a.m. UTC | #29
On (20/09/10 11:57), Hans Verkuil wrote:
> 
> Perhaps, but this patch is meant to revert *all* changes relating to
> V4L2_FLAG_MEMORY_NON_CONSISTENT. We don't want to have unused fields
> in the public API.

OK, would you prefer a squashed patch for all the kernel rollbacks and
cleanups or a patch series?

	-ss
  
Hans Verkuil Sept. 10, 2020, 10:23 a.m. UTC | #30
On 10/09/2020 12:14, Sergey Senozhatsky wrote:
> On (20/09/10 11:57), Hans Verkuil wrote:
>>
>> Perhaps, but this patch is meant to revert *all* changes relating to
>> V4L2_FLAG_MEMORY_NON_CONSISTENT. We don't want to have unused fields
>> in the public API.
> 
> OK, would you prefer a squashed patch for all the kernel rollbacks and
> cleanups or a patch series?

My preference is a single patch.

Regards,

	Hans
  
Sergey Senozhatsky Sept. 10, 2020, 2:48 p.m. UTC | #31
On (20/09/10 12:23), Hans Verkuil wrote:
> On 10/09/2020 12:14, Sergey Senozhatsky wrote:
> > On (20/09/10 11:57), Hans Verkuil wrote:
> >>
> >> Perhaps, but this patch is meant to revert *all* changes relating to
> >> V4L2_FLAG_MEMORY_NON_CONSISTENT. We don't want to have unused fields
> >> in the public API.
> > 
> > OK, would you prefer a squashed patch for all the kernel rollbacks and
> > cleanups or a patch series?
> 
> My preference is a single patch.

I've a kernel patch (I think I got all the pieces). This is not a
format submission yet, because I don't have v4l2-compliance patch
yet, so didn't really test it.

========
From 66980b8c25ae59f24e587d9b4d297f3117b2b668 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT flag

The patch partially reverts some of the UAPI bits of the buffer
cache management hints. Namely, the queue consistency (memory
coherency) user-space hint because, as it turned out, the kernel
implementation of this feature was misusing DMA_ATTR_NON_CONSISTENT.

The patch revers both kernel and user space parts: removes the
DMA consistency attr functions, rollbacks changes to v4l2_requestbuffers,
v4l2_create_buffers structures and corresponding UAPI functions
(plus compat32 layer) and cleanups the documentation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 .../userspace-api/media/v4l/buffer.rst        | 17 -------
 .../media/v4l/vidioc-create-bufs.rst          |  6 +--
 .../media/v4l/vidioc-reqbufs.rst              | 12 +----
 .../media/common/videobuf2/videobuf2-core.c   | 46 +++----------------
 .../common/videobuf2/videobuf2-dma-contig.c   | 19 --------
 .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
 .../media/common/videobuf2/videobuf2-v4l2.c   | 18 +-------
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 10 +---
 drivers/media/v4l2-core/v4l2-ioctl.c          |  5 +-
 include/media/videobuf2-core.h                |  7 +--
 include/uapi/linux/videodev2.h                | 13 +-----
 11 files changed, 22 insertions(+), 134 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
index 177ed8b6ce62..4f95496adc5b 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -694,23 +694,6 @@ Memory Consistency Flags
     :stub-columns: 0
     :widths:       3 1 4
 
-    * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
-
-      - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
-      - 0x00000001
-      - A buffer is allocated either in consistent (it will be automatically
-	coherent between the CPU and the bus) or non-consistent memory. The
-	latter can provide performance gains, for instance the CPU cache
-	sync/flush operations can be avoided if the buffer is accessed by the
-	corresponding device only and the CPU does not read/write to/from that
-	buffer. However, this requires extra care from the driver -- it must
-	guarantee memory consistency by issuing a cache flush/sync when
-	consistency is needed. If this flag is set V4L2 will attempt to
-	allocate the buffer in non-consistent memory. The flag takes effect
-	only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
-	queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
-	<V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
-
 .. c:type:: v4l2_memory
 
 enum v4l2_memory
diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
index c050dffa4237..d999028f47df 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
@@ -113,13 +113,9 @@ than the number requested.
 	If you want to just query the capabilities without making any
 	other changes, then set ``count`` to 0, ``memory`` to
 	``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type.
-    * - __u32
-      - ``flags``
-      - Specifies additional buffer management attributes.
-	See :ref:`memory-flags`.
 
     * - __u32
-      - ``reserved``\ [6]
+      - ``reserved``\ [7]
       - A place holder for future extensions. Drivers and applications
 	must set the array to zero.
 
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index 47e433161690..afc35cd7b614 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -105,17 +105,10 @@ aborting or finishing any DMA in progress, an implicit
 	``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. This will
 	free any previously allocated buffers, so this is typically something
 	that will be done at the start of the application.
-    * - union {
-      - (anonymous)
-    * - __u32
-      - ``flags``
-      - Specifies additional buffer management attributes.
-	See :ref:`memory-flags`.
     * - __u32
       - ``reserved``\ [1]
-      - Kept for backwards compatibility. Use ``flags`` instead.
-    * - }
-      -
+      - A place holder for future extensions. Drivers and applications
+	must set the array to zero.
 
 .. tabularcolumns:: |p{6.1cm}|p{2.2cm}|p{8.7cm}|
 
@@ -162,7 +155,6 @@ aborting or finishing any DMA in progress, an implicit
       - This capability is set by the driver to indicate that the queue supports
         cache and memory management hints. However, it's only valid when the
         queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
-        :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
         :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
         :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
 
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index f544d3393e9d..4eab6d81cce1 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -721,39 +721,14 @@ int vb2_verify_memory_type(struct vb2_queue *q,
 }
 EXPORT_SYMBOL(vb2_verify_memory_type);
 
-static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
-{
-	q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
-
-	if (!vb2_queue_allows_cache_hints(q))
-		return;
-	if (!consistent_mem)
-		q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
-}
-
-static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
-{
-	bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
-
-	if (consistent_mem != queue_is_consistent) {
-		dprintk(q, 1, "memory consistency model mismatch\n");
-		return false;
-	}
-	return true;
-}
-
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
-		     unsigned int flags, unsigned int *count)
+		     unsigned int *count)
 {
 	unsigned int num_buffers, allocated_buffers, num_planes = 0;
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
-	bool consistent_mem = true;
 	unsigned int i;
 	int ret;
 
-	if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
-		consistent_mem = false;
-
 	if (q->streaming) {
 		dprintk(q, 1, "streaming active\n");
 		return -EBUSY;
@@ -765,8 +740,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	}
 
 	if (*count == 0 || q->num_buffers != 0 ||
-	    (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
-	    !verify_consistency_attr(q, consistent_mem)) {
+	    (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
 		/*
 		 * We already have buffers allocated, so first check if they
 		 * are not in use and can be freed.
@@ -803,7 +777,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
 	memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
 	q->memory = memory;
-	set_queue_consistency(q, consistent_mem);
 
 	/*
 	 * Ask the driver how many buffers and planes per buffer it requires.
@@ -888,18 +861,14 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 EXPORT_SYMBOL_GPL(vb2_core_reqbufs);
 
 int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
-			 unsigned int flags, unsigned int *count,
+			 unsigned int *count,
 			 unsigned int requested_planes,
 			 const unsigned int requested_sizes[])
 {
 	unsigned int num_planes = 0, num_buffers, allocated_buffers;
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
-	bool consistent_mem = true;
 	int ret;
 
-	if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
-		consistent_mem = false;
-
 	if (q->num_buffers == VB2_MAX_FRAME) {
 		dprintk(q, 1, "maximum number of buffers already allocated\n");
 		return -ENOBUFS;
@@ -912,15 +881,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 		}
 		memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
 		q->memory = memory;
-		set_queue_consistency(q, consistent_mem);
 		q->waiting_for_buffers = !q->is_output;
 	} else {
 		if (q->memory != memory) {
 			dprintk(q, 1, "memory model mismatch\n");
 			return -EINVAL;
 		}
-		if (!verify_consistency_attr(q, consistent_mem))
-			return -EINVAL;
 	}
 
 	num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
@@ -2581,7 +2547,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 	fileio->memory = VB2_MEMORY_MMAP;
 	fileio->type = q->type;
 	q->fileio = fileio;
-	ret = vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+	ret = vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 	if (ret)
 		goto err_kfree;
 
@@ -2638,7 +2604,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
 
 err_reqbufs:
 	fileio->count = 0;
-	vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+	vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 
 err_kfree:
 	q->fileio = NULL;
@@ -2658,7 +2624,7 @@ static int __vb2_cleanup_fileio(struct vb2_queue *q)
 		vb2_core_streamoff(q, q->type);
 		q->fileio = NULL;
 		fileio->count = 0;
-		vb2_core_reqbufs(q, fileio->memory, 0, &fileio->count);
+		vb2_core_reqbufs(q, fileio->memory, &fileio->count);
 		kfree(fileio);
 		dprintk(q, 3, "file io emulator closed\n");
 	}
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index ec3446cc45b8..7b1b86ec942d 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -42,11 +42,6 @@ struct vb2_dc_buf {
 	struct dma_buf_attachment	*db_attach;
 };
 
-static inline bool vb2_dc_buffer_consistent(unsigned long attr)
-{
-	return !(attr & DMA_ATTR_NON_CONSISTENT);
-}
-
 /*********************************************/
 /*        scatterlist table functions        */
 /*********************************************/
@@ -341,13 +336,6 @@ static int
 vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
 				   enum dma_data_direction direction)
 {
-	struct vb2_dc_buf *buf = dbuf->priv;
-	struct sg_table *sgt = buf->dma_sgt;
-
-	if (vb2_dc_buffer_consistent(buf->attrs))
-		return 0;
-
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 
@@ -355,13 +343,6 @@ static int
 vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
 				 enum dma_data_direction direction)
 {
-	struct vb2_dc_buf *buf = dbuf->priv;
-	struct sg_table *sgt = buf->dma_sgt;
-
-	if (vb2_dc_buffer_consistent(buf->attrs))
-		return 0;
-
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 0a40e00f0d7e..a86fce5d8ea8 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -123,8 +123,7 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
 	/*
 	 * NOTE: dma-sg allocates memory using the page allocator directly, so
 	 * there is no memory consistency guarantee, hence dma-sg ignores DMA
-	 * attributes passed from the upper layer. That means that
-	 * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
+	 * attributes passed from the upper layer.
 	 */
 	buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
 				    GFP_KERNEL | __GFP_ZERO);
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index f4197c2e1f30..96d3b2b2aa31 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -723,22 +723,12 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
 #endif
 }
 
-static void clear_consistency_attr(struct vb2_queue *q,
-				   int memory,
-				   unsigned int *flags)
-{
-	if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
-		*flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
-}
-
 int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
 {
 	int ret = vb2_verify_memory_type(q, req->memory, req->type);
 
 	fill_buf_caps(q, &req->capabilities);
-	clear_consistency_attr(q, req->memory, &req->flags);
-	return ret ? ret : vb2_core_reqbufs(q, req->memory,
-					    req->flags, &req->count);
+	return ret ? ret : vb2_core_reqbufs(q, req->memory, &req->count);
 }
 EXPORT_SYMBOL_GPL(vb2_reqbufs);
 
@@ -770,7 +760,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
 	unsigned i;
 
 	fill_buf_caps(q, &create->capabilities);
-	clear_consistency_attr(q, create->memory, &create->flags);
 	create->index = q->num_buffers;
 	if (create->count == 0)
 		return ret != -EBUSY ? ret : 0;
@@ -814,7 +803,6 @@ int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
 		if (requested_sizes[i] == 0)
 			return -EINVAL;
 	return ret ? ret : vb2_core_create_bufs(q, create->memory,
-						create->flags,
 						&create->count,
 						requested_planes,
 						requested_sizes);
@@ -999,12 +987,11 @@ int vb2_ioctl_reqbufs(struct file *file, void *priv,
 	int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
 
 	fill_buf_caps(vdev->queue, &p->capabilities);
-	clear_consistency_attr(vdev->queue, p->memory, &p->flags);
 	if (res)
 		return res;
 	if (vb2_queue_is_busy(vdev, file))
 		return -EBUSY;
-	res = vb2_core_reqbufs(vdev->queue, p->memory, p->flags, &p->count);
+	res = vb2_core_reqbufs(vdev->queue, p->memory, &p->count);
 	/* If count == 0, then the owner has released all buffers and he
 	   is no longer owner of the queue. Otherwise we have a new owner. */
 	if (res == 0)
@@ -1022,7 +1009,6 @@ int vb2_ioctl_create_bufs(struct file *file, void *priv,
 
 	p->index = vdev->queue->num_buffers;
 	fill_buf_caps(vdev->queue, &p->capabilities);
-	clear_consistency_attr(vdev->queue, p->memory, &p->flags);
 	/*
 	 * If count == 0, then just check if memory and type are valid.
 	 * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
diff --git a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
index 593bcf6c3735..a99e82ec9ab6 100644
--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
+++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c
@@ -246,9 +246,6 @@ struct v4l2_format32 {
  * @memory:	buffer memory type
  * @format:	frame format, for which buffers are requested
  * @capabilities: capabilities of this buffer type.
- * @flags:	additional buffer management attributes (ignored unless the
- *		queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability and
- *		configured for MMAP streaming I/O).
  * @reserved:	future extensions
  */
 struct v4l2_create_buffers32 {
@@ -257,8 +254,7 @@ struct v4l2_create_buffers32 {
 	__u32			memory;	/* enum v4l2_memory */
 	struct v4l2_format32	format;
 	__u32			capabilities;
-	__u32			flags;
-	__u32			reserved[6];
+	__u32			reserved[7];
 };
 
 static int __bufsize_v4l2_format(struct v4l2_format32 __user *p32, u32 *size)
@@ -359,8 +355,7 @@ static int get_v4l2_create32(struct v4l2_create_buffers __user *p64,
 {
 	if (!access_ok(p32, sizeof(*p32)) ||
 	    copy_in_user(p64, p32,
-			 offsetof(struct v4l2_create_buffers32, format)) ||
-	    assign_in_user(&p64->flags, &p32->flags))
+			 offsetof(struct v4l2_create_buffers32, format)))
 		return -EFAULT;
 	return __get_v4l2_format32(&p64->format, &p32->format,
 				   aux_buf, aux_space);
@@ -422,7 +417,6 @@ static int put_v4l2_create32(struct v4l2_create_buffers __user *p64,
 	    copy_in_user(p32, p64,
 			 offsetof(struct v4l2_create_buffers32, format)) ||
 	    assign_in_user(&p32->capabilities, &p64->capabilities) ||
-	    assign_in_user(&p32->flags, &p64->flags) ||
 	    copy_in_user(p32->reserved, p64->reserved, sizeof(p64->reserved)))
 		return -EFAULT;
 	return __put_v4l2_format32(&p64->format, &p32->format);
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index f74b42280892..eeff398fbdcc 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -2042,6 +2042,9 @@ static int v4l_reqbufs(const struct v4l2_ioctl_ops *ops,
 
 	if (ret)
 		return ret;
+
+	CLEAR_AFTER_FIELD(p, capabilities);
+
 	return ops->vidioc_reqbufs(file, fh, p);
 }
 
@@ -2081,7 +2084,7 @@ static int v4l_create_bufs(const struct v4l2_ioctl_ops *ops,
 	if (ret)
 		return ret;
 
-	CLEAR_AFTER_FIELD(create, flags);
+	CLEAR_AFTER_FIELD(create, capabilities);
 
 	v4l_sanitize_format(&create->format);
 
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 52ef92049073..bbb3f26fbde9 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -744,8 +744,6 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
  * vb2_core_reqbufs() - Initiate streaming.
  * @q:		pointer to &struct vb2_queue with videobuf2 queue.
  * @memory:	memory type, as defined by &enum vb2_memory.
- * @flags:	auxiliary queue/buffer management flags. Currently, the only
- *		used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
  * @count:	requested buffer count.
  *
  * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
@@ -770,13 +768,12 @@ void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
  * Return: returns zero on success; an error code otherwise.
  */
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
-		    unsigned int flags, unsigned int *count);
+		    unsigned int *count);
 
 /**
  * vb2_core_create_bufs() - Allocate buffers and any required auxiliary structs
  * @q: pointer to &struct vb2_queue with videobuf2 queue.
  * @memory: memory type, as defined by &enum vb2_memory.
- * @flags: auxiliary queue/buffer management flags.
  * @count: requested buffer count.
  * @requested_planes: number of planes requested.
  * @requested_sizes: array with the size of the planes.
@@ -794,7 +791,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
  * Return: returns zero on success; an error code otherwise.
  */
 int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
-			 unsigned int flags, unsigned int *count,
+			 unsigned int *count,
 			 unsigned int requested_planes,
 			 const unsigned int requested_sizes[]);
 
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 4769628790da..f717826d5d7c 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -191,8 +191,6 @@ enum v4l2_memory {
 	V4L2_MEMORY_DMABUF           = 4,
 };
 
-#define V4L2_FLAG_MEMORY_NON_CONSISTENT		(1 << 0)
-
 /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
 enum v4l2_colorspace {
 	/*
@@ -948,10 +946,7 @@ struct v4l2_requestbuffers {
 	__u32			type;		/* enum v4l2_buf_type */
 	__u32			memory;		/* enum v4l2_memory */
 	__u32			capabilities;
-	union {
-		__u32		flags;
-		__u32		reserved[1];
-	};
+	__u32			reserved[1];
 };
 
 /* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
@@ -2455,9 +2450,6 @@ struct v4l2_dbg_chip_info {
  * @memory:	enum v4l2_memory; buffer memory type
  * @format:	frame format, for which buffers are requested
  * @capabilities: capabilities of this buffer type.
- * @flags:	additional buffer management attributes (ignored unless the
- *		queue has V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS capability
- *		and configured for MMAP streaming I/O).
  * @reserved:	future extensions
  */
 struct v4l2_create_buffers {
@@ -2466,8 +2458,7 @@ struct v4l2_create_buffers {
 	__u32			memory;
 	struct v4l2_format	format;
 	__u32			capabilities;
-	__u32			flags;
-	__u32			reserved[6];
+	__u32			reserved[7];
 };
 
 /*
  
Sergey Senozhatsky Sept. 10, 2020, 3:38 p.m. UTC | #32
On (20/09/10 23:48), Sergey Senozhatsky wrote:
> 
> I've a kernel patch (I think I got all the pieces). This is not a
> format submission yet, because I don't have v4l2-compliance patch
> yet, so didn't really test it.
> 

I guess,,, the v4l-utils patch should look something like this:
====

From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: [PATCH] v4l-compliance: remove NON_CONSISTENT hint test

Kernel support for V4L2_FLAG_MEMORY_NON_CONSISTENT has been
reverted, so we need to to remove the memory consistency
(coherency) tests from the test-buffers code.

Note, the buffer cache management hints support is still there
and should be tested.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 utils/common/cv4l-helpers.h                 |  8 ++---
 utils/common/v4l-helpers.h                  |  8 ++---
 utils/v4l2-compliance/v4l2-test-buffers.cpp | 40 ++-------------------
 3 files changed, 9 insertions(+), 47 deletions(-)

diff --git a/utils/common/cv4l-helpers.h b/utils/common/cv4l-helpers.h
index 3cee372b..712efde6 100644
--- a/utils/common/cv4l-helpers.h
+++ b/utils/common/cv4l-helpers.h
@@ -754,17 +754,17 @@ public:
 	int g_fd(unsigned index, unsigned plane) const { return v4l_queue_g_fd(this, index, plane); }
 	void s_fd(unsigned index, unsigned plane, int fd) { v4l_queue_s_fd(this, index, plane, fd); }
 
-	int reqbufs(cv4l_fd *fd, unsigned count = 0, unsigned int flags = 0)
+	int reqbufs(cv4l_fd *fd, unsigned count = 0)
 	{
-		return v4l_queue_reqbufs(fd->g_v4l_fd(), this, count, flags);
+		return v4l_queue_reqbufs(fd->g_v4l_fd(), this, count);
 	}
 	bool has_create_bufs(cv4l_fd *fd) const
 	{
 		return v4l_queue_has_create_bufs(fd->g_v4l_fd(), this);
 	}
-	int create_bufs(cv4l_fd *fd, unsigned count, const v4l2_format *fmt = NULL, unsigned int flags = 0)
+	int create_bufs(cv4l_fd *fd, unsigned count, const v4l2_format *fmt = NULL)
 	{
-		return v4l_queue_create_bufs(fd->g_v4l_fd(), this, count, fmt, flags);
+		return v4l_queue_create_bufs(fd->g_v4l_fd(), this, count, fmt);
 	}
 	int mmap_bufs(cv4l_fd *fd, unsigned from = 0)
 	{
diff --git a/utils/common/v4l-helpers.h b/utils/common/v4l-helpers.h
index c09cd987..f96b3c38 100644
--- a/utils/common/v4l-helpers.h
+++ b/utils/common/v4l-helpers.h
@@ -1515,7 +1515,7 @@ static inline int v4l_queue_querybufs(struct v4l_fd *f, struct v4l_queue *q, uns
 }
 
 static inline int v4l_queue_reqbufs(struct v4l_fd *f,
-		struct v4l_queue *q, unsigned count, unsigned int flags = 0)
+		struct v4l_queue *q, unsigned count)
 {
 	struct v4l2_requestbuffers reqbufs;
 	int ret;
@@ -1523,7 +1523,6 @@ static inline int v4l_queue_reqbufs(struct v4l_fd *f,
 	reqbufs.type = q->type;
 	reqbufs.memory = q->memory;
 	reqbufs.count = count;
-	reqbufs.flags = flags;
 	/*
 	 * Problem: if REQBUFS returns an error, did it free any old
 	 * buffers or not?
@@ -1548,7 +1547,7 @@ static inline bool v4l_queue_has_create_bufs(struct v4l_fd *f, const struct v4l_
 
 static inline int v4l_queue_create_bufs(struct v4l_fd *f,
 		struct v4l_queue *q, unsigned count,
-		const struct v4l2_format *fmt, unsigned int flags = 0)
+		const struct v4l2_format *fmt)
 {
 	struct v4l2_create_buffers createbufs;
 	int ret;
@@ -1556,7 +1555,6 @@ static inline int v4l_queue_create_bufs(struct v4l_fd *f,
 	createbufs.format.type = q->type;
 	createbufs.memory = q->memory;
 	createbufs.count = count;
-	createbufs.flags = flags;
 	if (fmt) {
 		createbufs.format = *fmt;
 	} else {
@@ -1735,7 +1733,7 @@ static inline void v4l_queue_free(struct v4l_fd *f, struct v4l_queue *q)
 	v4l_ioctl(f, VIDIOC_STREAMOFF, &q->type);
 	v4l_queue_release_bufs(f, q, 0);
 	v4l_queue_close_exported_fds(q);
-	v4l_queue_reqbufs(f, q, 0, 0);
+	v4l_queue_reqbufs(f, q, 0);
 }
 
 static inline void v4l_queue_buffer_update(const struct v4l_queue *q,
diff --git a/utils/v4l2-compliance/v4l2-test-buffers.cpp b/utils/v4l2-compliance/v4l2-test-buffers.cpp
index aca0eb68..1651e95c 100644
--- a/utils/v4l2-compliance/v4l2-test-buffers.cpp
+++ b/utils/v4l2-compliance/v4l2-test-buffers.cpp
@@ -653,10 +653,6 @@ int testReqBufs(struct node *node)
 		fail_on_test(q.reqbufs(node, 0));
 
 		for (m = V4L2_MEMORY_MMAP; m <= V4L2_MEMORY_DMABUF; m++) {
-			bool cache_hints_cap = false;
-			bool consistent;
-
-			cache_hints_cap = q.g_capabilities() & V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS;
 			if (!(node->valid_memorytype & (1 << m)))
 				continue;
 			cv4l_queue q2(i, m);
@@ -672,17 +668,8 @@ int testReqBufs(struct node *node)
 			reqbufs.count = 1;
 			reqbufs.type = i;
 			reqbufs.memory = m;
-			reqbufs.flags = V4L2_FLAG_MEMORY_NON_CONSISTENT;
 			fail_on_test(doioctl(node, VIDIOC_REQBUFS, &reqbufs));
-			consistent = reqbufs.flags & V4L2_FLAG_MEMORY_NON_CONSISTENT;
-			if (!cache_hints_cap) {
-				fail_on_test(consistent);
-			} else {
-				if (m == V4L2_MEMORY_MMAP)
-					fail_on_test(!consistent);
-				else
-					fail_on_test(consistent);
-			}
+			fail_on_test(check_0(reqbufs.reserved, sizeof(reqbufs.reserved)));
 			q.reqbufs(node);
 
 			ret = q.create_bufs(node, 0);
@@ -695,32 +682,9 @@ int testReqBufs(struct node *node)
 			node->g_fmt(crbufs.format, i);
 			crbufs.count = 1;
 			crbufs.memory = m;
-			crbufs.flags = V4L2_FLAG_MEMORY_NON_CONSISTENT;
 			fail_on_test(doioctl(node, VIDIOC_CREATE_BUFS, &crbufs));
 			fail_on_test(check_0(crbufs.reserved, sizeof(crbufs.reserved)));
 			fail_on_test(crbufs.index != q.g_buffers());
-
-			consistent = crbufs.flags & V4L2_FLAG_MEMORY_NON_CONSISTENT;
-			if (!cache_hints_cap) {
-				fail_on_test(consistent);
-			} else {
-				if (m == V4L2_MEMORY_MMAP)
-					fail_on_test(!consistent);
-				else
-					fail_on_test(consistent);
-			}
-
-			if (cache_hints_cap) {
-				/*
-				 * Different memory consistency model. Should fail for MMAP
-				 * queues which support cache hints.
-				 */
-				crbufs.flags = 0;
-				if (m == V4L2_MEMORY_MMAP)
-					fail_on_test(doioctl(node, VIDIOC_CREATE_BUFS, &crbufs) != EINVAL);
-				else
-					fail_on_test(doioctl(node, VIDIOC_CREATE_BUFS, &crbufs));
-			}
 			q.reqbufs(node);
 
 			fail_on_test(q.create_bufs(node, 1));
@@ -1352,7 +1316,7 @@ int testMmap(struct node *node, struct node *node_m2m_cap, unsigned frame_count,
 			have_createbufs = false;
 		if (have_createbufs) {
 			q.reqbufs(node);
-			q.create_bufs(node, 2, &cur_fmt, V4L2_FLAG_MEMORY_NON_CONSISTENT);
+			q.create_bufs(node, 2, &cur_fmt);
 			fail_on_test(setupMmap(node, q));
 			q.munmap_bufs(node);
 			q.reqbufs(node, 2);
  

Patch

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
index 57e752aaf414a7..2044ed13cd9d7d 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -701,23 +701,6 @@  Memory Consistency Flags
     :stub-columns: 0
     :widths:       3 1 4
 
-    * .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
-
-      - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
-      - 0x00000001
-      - A buffer is allocated either in consistent (it will be automatically
-	coherent between the CPU and the bus) or non-consistent memory. The
-	latter can provide performance gains, for instance the CPU cache
-	sync/flush operations can be avoided if the buffer is accessed by the
-	corresponding device only and the CPU does not read/write to/from that
-	buffer. However, this requires extra care from the driver -- it must
-	guarantee memory consistency by issuing a cache flush/sync when
-	consistency is needed. If this flag is set V4L2 will attempt to
-	allocate the buffer in non-consistent memory. The flag takes effect
-	only if the buffer is used for :ref:`memory mapping <mmap>` I/O and the
-	queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
-	<V4L2-BUF-CAP-SUPPORTS-MMAP-CACHE-HINTS>` capability.
-
 .. c:type:: v4l2_memory
 
 enum v4l2_memory
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index 75d894d9c36c42..3180c111d368ee 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -169,7 +169,6 @@  aborting or finishing any DMA in progress, an implicit
       - This capability is set by the driver to indicate that the queue supports
         cache and memory management hints. However, it's only valid when the
         queue is used for :ref:`memory mapping <mmap>` streaming I/O. See
-        :ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT <V4L2-FLAG-MEMORY-NON-CONSISTENT>`,
         :ref:`V4L2_BUF_FLAG_NO_CACHE_INVALIDATE <V4L2-BUF-FLAG-NO-CACHE-INVALIDATE>` and
         :ref:`V4L2_BUF_FLAG_NO_CACHE_CLEAN <V4L2-BUF-FLAG-NO-CACHE-CLEAN>`.
 
diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index f544d3393e9d6b..66a41cef33c1b1 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -721,39 +721,14 @@  int vb2_verify_memory_type(struct vb2_queue *q,
 }
 EXPORT_SYMBOL(vb2_verify_memory_type);
 
-static void set_queue_consistency(struct vb2_queue *q, bool consistent_mem)
-{
-	q->dma_attrs &= ~DMA_ATTR_NON_CONSISTENT;
-
-	if (!vb2_queue_allows_cache_hints(q))
-		return;
-	if (!consistent_mem)
-		q->dma_attrs |= DMA_ATTR_NON_CONSISTENT;
-}
-
-static bool verify_consistency_attr(struct vb2_queue *q, bool consistent_mem)
-{
-	bool queue_is_consistent = !(q->dma_attrs & DMA_ATTR_NON_CONSISTENT);
-
-	if (consistent_mem != queue_is_consistent) {
-		dprintk(q, 1, "memory consistency model mismatch\n");
-		return false;
-	}
-	return true;
-}
-
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 		     unsigned int flags, unsigned int *count)
 {
 	unsigned int num_buffers, allocated_buffers, num_planes = 0;
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
-	bool consistent_mem = true;
 	unsigned int i;
 	int ret;
 
-	if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
-		consistent_mem = false;
-
 	if (q->streaming) {
 		dprintk(q, 1, "streaming active\n");
 		return -EBUSY;
@@ -765,8 +740,7 @@  int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	}
 
 	if (*count == 0 || q->num_buffers != 0 ||
-	    (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory) ||
-	    !verify_consistency_attr(q, consistent_mem)) {
+	    (q->memory != VB2_MEMORY_UNKNOWN && q->memory != memory)) {
 		/*
 		 * We already have buffers allocated, so first check if they
 		 * are not in use and can be freed.
@@ -803,7 +777,6 @@  int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
 	memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
 	q->memory = memory;
-	set_queue_consistency(q, consistent_mem);
 
 	/*
 	 * Ask the driver how many buffers and planes per buffer it requires.
@@ -894,12 +867,8 @@  int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 {
 	unsigned int num_planes = 0, num_buffers, allocated_buffers;
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
-	bool consistent_mem = true;
 	int ret;
 
-	if (flags & V4L2_FLAG_MEMORY_NON_CONSISTENT)
-		consistent_mem = false;
-
 	if (q->num_buffers == VB2_MAX_FRAME) {
 		dprintk(q, 1, "maximum number of buffers already allocated\n");
 		return -ENOBUFS;
@@ -912,15 +881,12 @@  int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 		}
 		memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
 		q->memory = memory;
-		set_queue_consistency(q, consistent_mem);
 		q->waiting_for_buffers = !q->is_output;
 	} else {
 		if (q->memory != memory) {
 			dprintk(q, 1, "memory model mismatch\n");
 			return -EINVAL;
 		}
-		if (!verify_consistency_attr(q, consistent_mem))
-			return -EINVAL;
 	}
 
 	num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index ec3446cc45b8da..7b1b86ec942d7d 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -42,11 +42,6 @@  struct vb2_dc_buf {
 	struct dma_buf_attachment	*db_attach;
 };
 
-static inline bool vb2_dc_buffer_consistent(unsigned long attr)
-{
-	return !(attr & DMA_ATTR_NON_CONSISTENT);
-}
-
 /*********************************************/
 /*        scatterlist table functions        */
 /*********************************************/
@@ -341,13 +336,6 @@  static int
 vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
 				   enum dma_data_direction direction)
 {
-	struct vb2_dc_buf *buf = dbuf->priv;
-	struct sg_table *sgt = buf->dma_sgt;
-
-	if (vb2_dc_buffer_consistent(buf->attrs))
-		return 0;
-
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 
@@ -355,13 +343,6 @@  static int
 vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
 				 enum dma_data_direction direction)
 {
-	struct vb2_dc_buf *buf = dbuf->priv;
-	struct sg_table *sgt = buf->dma_sgt;
-
-	if (vb2_dc_buffer_consistent(buf->attrs))
-		return 0;
-
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 0a40e00f0d7e5c..a86fce5d8ea8bf 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -123,8 +123,7 @@  static void *vb2_dma_sg_alloc(struct device *dev, unsigned long dma_attrs,
 	/*
 	 * NOTE: dma-sg allocates memory using the page allocator directly, so
 	 * there is no memory consistency guarantee, hence dma-sg ignores DMA
-	 * attributes passed from the upper layer. That means that
-	 * V4L2_FLAG_MEMORY_NON_CONSISTENT has no effect on dma-sg buffers.
+	 * attributes passed from the upper layer.
 	 */
 	buf->pages = kvmalloc_array(buf->num_pages, sizeof(struct page *),
 				    GFP_KERNEL | __GFP_ZERO);
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 30caad27281e1a..de83ad48783821 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -722,20 +722,11 @@  static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
 #endif
 }
 
-static void clear_consistency_attr(struct vb2_queue *q,
-				   int memory,
-				   unsigned int *flags)
-{
-	if (!q->allow_cache_hints || memory != V4L2_MEMORY_MMAP)
-		*flags &= ~V4L2_FLAG_MEMORY_NON_CONSISTENT;
-}
-
 int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req)
 {
 	int ret = vb2_verify_memory_type(q, req->memory, req->type);
 
 	fill_buf_caps(q, &req->capabilities);
-	clear_consistency_attr(q, req->memory, &req->flags);
 	return ret ? ret : vb2_core_reqbufs(q, req->memory,
 					    req->flags, &req->count);
 }
@@ -769,7 +760,6 @@  int vb2_create_bufs(struct vb2_queue *q, struct v4l2_create_buffers *create)
 	unsigned i;
 
 	fill_buf_caps(q, &create->capabilities);
-	clear_consistency_attr(q, create->memory, &create->flags);
 	create->index = q->num_buffers;
 	if (create->count == 0)
 		return ret != -EBUSY ? ret : 0;
@@ -998,7 +988,6 @@  int vb2_ioctl_reqbufs(struct file *file, void *priv,
 	int res = vb2_verify_memory_type(vdev->queue, p->memory, p->type);
 
 	fill_buf_caps(vdev->queue, &p->capabilities);
-	clear_consistency_attr(vdev->queue, p->memory, &p->flags);
 	if (res)
 		return res;
 	if (vb2_queue_is_busy(vdev, file))
@@ -1021,7 +1010,6 @@  int vb2_ioctl_create_bufs(struct file *file, void *priv,
 
 	p->index = vdev->queue->num_buffers;
 	fill_buf_caps(vdev->queue, &p->capabilities);
-	clear_consistency_attr(vdev->queue, p->memory, &p->flags);
 	/*
 	 * If count == 0, then just check if memory and type are valid.
 	 * Any -EBUSY result from vb2_verify_memory_type can be mapped to 0.
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 52ef92049073e3..4c7f25b07e9375 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -744,8 +744,7 @@  void vb2_core_querybuf(struct vb2_queue *q, unsigned int index, void *pb);
  * vb2_core_reqbufs() - Initiate streaming.
  * @q:		pointer to &struct vb2_queue with videobuf2 queue.
  * @memory:	memory type, as defined by &enum vb2_memory.
- * @flags:	auxiliary queue/buffer management flags. Currently, the only
- *		used flag is %V4L2_FLAG_MEMORY_NON_CONSISTENT.
+ * @flags:	auxiliary queue/buffer management flags.
  * @count:	requested buffer count.
  *
  * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index c7b70ff53bc1dd..5c00f63d9c1b58 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -191,8 +191,6 @@  enum v4l2_memory {
 	V4L2_MEMORY_DMABUF           = 4,
 };
 
-#define V4L2_FLAG_MEMORY_NON_CONSISTENT		(1 << 0)
-
 /* see also http://vektor.theorem.ca/graphics/ycbcr/ */
 enum v4l2_colorspace {
 	/*