[1/4] dma-buf: Remove kmap kerneldoc vestiges

Message ID 20201211155843.3348718-1-daniel.vetter@ffwll.ch (mailing list archive)
State Not Applicable, archived
Series [1/4] dma-buf: Remove kmap kerneldoc vestiges

Commit Message

Daniel Vetter Dec. 11, 2020, 3:58 p.m. UTC
  Also try to clarify a bit when dma_buf_begin/end_cpu_access should
be called.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-buf.c | 20 ++++++++++++++------
 include/linux/dma-buf.h   | 25 +++++++++----------------
 2 files changed, 23 insertions(+), 22 deletions(-)
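
For readers who want to see the userspace side of the begin/end bracketing this patch documents, here is a minimal sketch. It assumes a dma-buf file descriptor obtained elsewhere and mmap'ed by the caller. The helper names (dmabuf_sync, dmabuf_begin_cpu_read, dmabuf_end_cpu_read) are hypothetical and not part of any library; only DMA_BUF_IOCTL_SYNC, struct dma_buf_sync and the DMA_BUF_SYNC_* flags come from the uapi header <linux/dma-buf.h>.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: issue one DMA_BUF_IOCTL_SYNC call on a dma-buf fd.
 * The ioctl is restarted when it is interrupted (EINTR) or asked to retry
 * (EAGAIN). Returns 0 on success or a negative errno value.
 */
static int dmabuf_sync(int fd, uint64_t flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret == -1 ? -errno : 0;
}

/* Call before reading through the mmap'ed pointer. */
int dmabuf_begin_cpu_read(int fd)
{
	return dmabuf_sync(fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ);
}

/* Call after the last read, using the same direction flag as the begin call. */
int dmabuf_end_cpu_read(int fd)
{
	return dmabuf_sync(fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

A caller would wrap every CPU read of the mapping between dmabuf_begin_cpu_read(fd) and dmabuf_end_cpu_read(fd). Write access would use DMA_BUF_SYNC_WRITE or DMA_BUF_SYNC_RW in the same way.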
  

Comments

Christian König Dec. 14, 2020, 10:33 a.m. UTC | #1
Am 11.12.20 um 16:58 schrieb Daniel Vetter:
> Also try to clarify a bit when dma_buf_begin/end_cpu_access should
> be called.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   drivers/dma-buf/dma-buf.c | 20 ++++++++++++++------
>   include/linux/dma-buf.h   | 25 +++++++++----------------
>   2 files changed, 23 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index e63684d4cd90..a12fdffa130f 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1001,15 +1001,15 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
>    *   vmalloc space might be limited and result in vmap calls failing.
>    *
>    *   Interfaces::
> + *
>    *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
>    *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
>    *
>    *   The vmap call can fail if there is no vmap support in the exporter, or if
> - *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
> - *   that the dma-buf layer keeps a reference count for all vmap access and
> - *   calls down into the exporter's vmap function only when no vmapping exists,
> - *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
> - *   provided by taking the dma_buf->lock mutex.
> + *   it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
> + *   count for all vmap access and calls down into the exporter's vmap function
> + *   only when no vmapping exists, and only unmaps it once. Protection against
> + *   concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex.

Who is taking the lock? The caller of the dma_buf_vmap/vunmap() 
functions, the functions themselves or the callback inside the exporter?

Christian.

>    *
>    * - For full compatibility on the importer side with existing userspace
>    *   interfaces, which might already support mmap'ing buffers. This is needed in
> @@ -1098,6 +1098,11 @@ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>    * dma_buf_end_cpu_access(). Only when cpu access is braketed by both calls is
>    * it guaranteed to be coherent with other DMA access.
>    *
> + * This function will also wait for any DMA transactions tracked through
> + * implicit synchronization in &dma_buf.resv. For DMA transactions with explicit
> + * synchronization this function will only ensure cache coherency, callers must
> + * ensure synchronization with such DMA transactions on their own.
> + *
>    * Can return negative error values, returns 0 on success.
>    */
>   int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> @@ -1199,7 +1204,10 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
>    * This call may fail due to lack of virtual mapping address space.
>    * These calls are optional in drivers. The intended use for them
>    * is for mapping objects linear in kernel space for high use objects.
> - * Please attempt to use kmap/kunmap before thinking about these interfaces.
> + *
> + * To ensure coherency users must call dma_buf_begin_cpu_access() and
> + * dma_buf_end_cpu_access() around any cpu access performed through this
> + * mapping.
>    *
>    * Returns 0 on success, or a negative errno code otherwise.
>    */
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index cf72699cb2bc..7eca37c8b10c 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -183,24 +183,19 @@ struct dma_buf_ops {
>   	 * @begin_cpu_access:
>   	 *
>   	 * This is called from dma_buf_begin_cpu_access() and allows the
> -	 * exporter to ensure that the memory is actually available for cpu
> -	 * access - the exporter might need to allocate or swap-in and pin the
> -	 * backing storage. The exporter also needs to ensure that cpu access is
> -	 * coherent for the access direction. The direction can be used by the
> -	 * exporter to optimize the cache flushing, i.e. access with a different
> +	 * exporter to ensure that the memory is actually coherent for cpu
> +	 * access. The exporter also needs to ensure that cpu access is coherent
> +	 * for the access direction. The direction can be used by the exporter
> +	 * to optimize the cache flushing, i.e. access with a different
>   	 * direction (read instead of write) might return stale or even bogus
>   	 * data (e.g. when the exporter needs to copy the data to temporary
>   	 * storage).
>   	 *
> -	 * This callback is optional.
> +	 * Note that this is both called through the DMA_BUF_IOCTL_SYNC IOCTL
> +	 * command for userspace mappings established through @mmap, and also
> +	 * for kernel mappings established with @vmap.
>   	 *
> -	 * FIXME: This is both called through the DMA_BUF_IOCTL_SYNC command
> -	 * from userspace (where storage shouldn't be pinned to avoid handing
> -	 * de-factor mlock rights to userspace) and for the kernel-internal
> -	 * users of the various kmap interfaces, where the backing storage must
> -	 * be pinned to guarantee that the atomic kmap calls can succeed. Since
> -	 * there's no in-kernel users of the kmap interfaces yet this isn't a
> -	 * real problem.
> +	 * This callback is optional.
>   	 *
>   	 * Returns:
>   	 *
> @@ -216,9 +211,7 @@ struct dma_buf_ops {
>   	 *
>   	 * This is called from dma_buf_end_cpu_access() when the importer is
>   	 * done accessing the CPU. The exporter can use this to flush caches and
> -	 * unpin any resources pinned in @begin_cpu_access.
> -	 * The result of any dma_buf kmap calls after end_cpu_access is
> -	 * undefined.
> +	 * undo anything else done in @begin_cpu_access.
>   	 *
>   	 * This callback is optional.
>   	 *
  
Daniel Vetter Dec. 14, 2020, 4:01 p.m. UTC | #2
On Mon, Dec 14, 2020 at 11:33:10AM +0100, Christian König wrote:
> Am 11.12.20 um 16:58 schrieb Daniel Vetter:
> > Also try to clarify a bit when dma_buf_begin/end_cpu_access should
> > be called.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >   drivers/dma-buf/dma-buf.c | 20 ++++++++++++++------
> >   include/linux/dma-buf.h   | 25 +++++++++----------------
> >   2 files changed, 23 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index e63684d4cd90..a12fdffa130f 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -1001,15 +1001,15 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
> >    *   vmalloc space might be limited and result in vmap calls failing.
> >    *
> >    *   Interfaces::
> > + *
> >    *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
> >    *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
> >    *
> >    *   The vmap call can fail if there is no vmap support in the exporter, or if
> > - *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
> > - *   that the dma-buf layer keeps a reference count for all vmap access and
> > - *   calls down into the exporter's vmap function only when no vmapping exists,
> > - *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
> > - *   provided by taking the dma_buf->lock mutex.
> > + *   it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
> > + *   count for all vmap access and calls down into the exporter's vmap function
> > + *   only when no vmapping exists, and only unmaps it once. Protection against
> > + *   concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex.
> 
> Who is taking the lock? The caller of the dma_buf_vmap/vunmap() functions,
> the functions themselves or the callback inside the exporter?

That's the part I didn't change at all here, just re-laid out the line
breaking. I only removed the outdated kmap section here.

Should I do another patch and remove this one sentence here (it's kinda
pointless and generally we don't muse about implementation details that
callers don't care about)?

I did try and do a cursory review of the dma-buf docs, but this is kinda
not meant as an all-out revamp. Just a few things I've noticed while
reviewing Thomas' vmap_local stuff.
-Daniel

  
Christian König Dec. 15, 2020, 2:18 p.m. UTC | #3
Am 14.12.20 um 17:01 schrieb Daniel Vetter:
> On Mon, Dec 14, 2020 at 11:33:10AM +0100, Christian König wrote:
>> Am 11.12.20 um 16:58 schrieb Daniel Vetter:
>>> Also try to clarify a bit when dma_buf_begin/end_cpu_access should
>>> be called.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: linux-media@vger.kernel.org
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> ---
>>>    drivers/dma-buf/dma-buf.c | 20 ++++++++++++++------
>>>    include/linux/dma-buf.h   | 25 +++++++++----------------
>>>    2 files changed, 23 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index e63684d4cd90..a12fdffa130f 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -1001,15 +1001,15 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
>>>     *   vmalloc space might be limited and result in vmap calls failing.
>>>     *
>>>     *   Interfaces::
>>> + *
>>>     *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
>>>     *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
>>>     *
>>>     *   The vmap call can fail if there is no vmap support in the exporter, or if
>>> - *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
>>> - *   that the dma-buf layer keeps a reference count for all vmap access and
>>> - *   calls down into the exporter's vmap function only when no vmapping exists,
>>> - *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
>>> - *   provided by taking the dma_buf->lock mutex.
>>> + *   it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
>>> + *   count for all vmap access and calls down into the exporter's vmap function
>>> + *   only when no vmapping exists, and only unmaps it once. Protection against
>>> + *   concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex.
>> Who is taking the lock? The caller of the dma_buf_vmap/vunmap() functions,
>> the functions themselves or the callback inside the exporter?
> That's the part I didn't change at all here, just re-laid out the line
> breaking. I only removed the outdated kmap section here.

I just wanted to point out that this still isn't described here very well.


> Should I do another patch and remove this one sentence here (it's kinda
> pointless and generally we don't muse about implementation details that
> callers don't care about)?

Na, works like this for me.

> I did try and do a cursory review of the dma-buf docs, but this is kinda
> not meant as an all-out revamp. Just a few things I've noticed while
> reviewing Thomas' vmap_local stuff.


Feel free to add an Acked-by: Christian König <christian.koenig@amd.com> 
to the series.

Christian.

  
Daniel Vetter Dec. 16, 2020, 10:29 a.m. UTC | #4
On Tue, Dec 15, 2020 at 03:18:49PM +0100, Christian König wrote:
> Am 14.12.20 um 17:01 schrieb Daniel Vetter:
> > On Mon, Dec 14, 2020 at 11:33:10AM +0100, Christian König wrote:
> > > Am 11.12.20 um 16:58 schrieb Daniel Vetter:
> > > > Also try to clarify a bit when dma_buf_begin/end_cpu_access should
> > > > be called.
> > > > 
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Cc: linux-media@vger.kernel.org
> > > > Cc: linaro-mm-sig@lists.linaro.org
> > > > ---
> > > >    drivers/dma-buf/dma-buf.c | 20 ++++++++++++++------
> > > >    include/linux/dma-buf.h   | 25 +++++++++----------------
> > > >    2 files changed, 23 insertions(+), 22 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > > index e63684d4cd90..a12fdffa130f 100644
> > > > --- a/drivers/dma-buf/dma-buf.c
> > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > @@ -1001,15 +1001,15 @@ EXPORT_SYMBOL_GPL(dma_buf_move_notify);
> > > >     *   vmalloc space might be limited and result in vmap calls failing.
> > > >     *
> > > >     *   Interfaces::
> > > > + *
> > > >     *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
> > > >     *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
> > > >     *
> > > >     *   The vmap call can fail if there is no vmap support in the exporter, or if
> > > > - *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
> > > > - *   that the dma-buf layer keeps a reference count for all vmap access and
> > > > - *   calls down into the exporter's vmap function only when no vmapping exists,
> > > > - *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
> > > > - *   provided by taking the dma_buf->lock mutex.
> > > > + *   it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
> > > > + *   count for all vmap access and calls down into the exporter's vmap function
> > > > + *   only when no vmapping exists, and only unmaps it once. Protection against
> > > > + *   concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex.
> > > Who is taking the lock? The caller of the dma_buf_vmap/vunmap() functions,
> > > the functions themselves or the callback inside the exporter?
> > That's the part I didn't change at all here, just re-laid out the line
> > breaking. I only removed the outdated kmap section here.
> 
> I just wanted to point out that this still isn't described here very well.
> 
> 
> > Should I do another patch and remove this one sentence here (it's kinda
> > pointless and generally we don't muse about implementation details that
> > callers don't care about)?
> 
> Na, works like this for me.
> 
> > I did try and do a cursory review of the dma-buf docs, but this is kinda
> > not meant as an all-out revamp. Just a few things I've noticed while
> > reviewing Thomas' vmap_local stuff.
> 
> 
> Feel free to add an Acked-by: Christian König <christian.koenig@amd.com> to
> the series.

Thanks for taking a look, and yeah I actually want to do a review of all
the dma-buf docs but just haven't found the quiet time for that yet.

Patches pushed to drm-misc-next.
-Daniel

  

Patch

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index e63684d4cd90..a12fdffa130f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1001,15 +1001,15 @@  EXPORT_SYMBOL_GPL(dma_buf_move_notify);
  *   vmalloc space might be limited and result in vmap calls failing.
  *
  *   Interfaces::
+ *
  *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
  *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
  *
  *   The vmap call can fail if there is no vmap support in the exporter, or if
- *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
- *   that the dma-buf layer keeps a reference count for all vmap access and
- *   calls down into the exporter's vmap function only when no vmapping exists,
- *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
- *   provided by taking the dma_buf->lock mutex.
+ *   it runs out of vmalloc space. Note that the dma-buf layer keeps a reference
+ *   count for all vmap access and calls down into the exporter's vmap function
+ *   only when no vmapping exists, and only unmaps it once. Protection against
+ *   concurrent vmap/vunmap calls is provided by taking the &dma_buf.lock mutex.
  *
  * - For full compatibility on the importer side with existing userspace
  *   interfaces, which might already support mmap'ing buffers. This is needed in
@@ -1098,6 +1098,11 @@  static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
  * dma_buf_end_cpu_access(). Only when cpu access is braketed by both calls is
  * it guaranteed to be coherent with other DMA access.
  *
+ * This function will also wait for any DMA transactions tracked through
+ * implicit synchronization in &dma_buf.resv. For DMA transactions with explicit
+ * synchronization this function will only ensure cache coherency, callers must
+ * ensure synchronization with such DMA transactions on their own.
+ *
  * Can return negative error values, returns 0 on success.
  */
 int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
@@ -1199,7 +1204,10 @@  EXPORT_SYMBOL_GPL(dma_buf_mmap);
  * This call may fail due to lack of virtual mapping address space.
  * These calls are optional in drivers. The intended use for them
  * is for mapping objects linear in kernel space for high use objects.
- * Please attempt to use kmap/kunmap before thinking about these interfaces.
+ *
+ * To ensure coherency users must call dma_buf_begin_cpu_access() and
+ * dma_buf_end_cpu_access() around any cpu access performed through this
+ * mapping.
  *
  * Returns 0 on success, or a negative errno code otherwise.
  */
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index cf72699cb2bc..7eca37c8b10c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -183,24 +183,19 @@  struct dma_buf_ops {
 	 * @begin_cpu_access:
 	 *
 	 * This is called from dma_buf_begin_cpu_access() and allows the
-	 * exporter to ensure that the memory is actually available for cpu
-	 * access - the exporter might need to allocate or swap-in and pin the
-	 * backing storage. The exporter also needs to ensure that cpu access is
-	 * coherent for the access direction. The direction can be used by the
-	 * exporter to optimize the cache flushing, i.e. access with a different
+	 * exporter to ensure that the memory is actually coherent for cpu
+	 * access. The exporter also needs to ensure that cpu access is coherent
+	 * for the access direction. The direction can be used by the exporter
+	 * to optimize the cache flushing, i.e. access with a different
 	 * direction (read instead of write) might return stale or even bogus
 	 * data (e.g. when the exporter needs to copy the data to temporary
 	 * storage).
 	 *
-	 * This callback is optional.
+	 * Note that this is both called through the DMA_BUF_IOCTL_SYNC IOCTL
+	 * command for userspace mappings established through @mmap, and also
+	 * for kernel mappings established with @vmap.
 	 *
-	 * FIXME: This is both called through the DMA_BUF_IOCTL_SYNC command
-	 * from userspace (where storage shouldn't be pinned to avoid handing
-	 * de-factor mlock rights to userspace) and for the kernel-internal
-	 * users of the various kmap interfaces, where the backing storage must
-	 * be pinned to guarantee that the atomic kmap calls can succeed. Since
-	 * there's no in-kernel users of the kmap interfaces yet this isn't a
-	 * real problem.
+	 * This callback is optional.
 	 *
 	 * Returns:
 	 *
@@ -216,9 +211,7 @@  struct dma_buf_ops {
 	 *
 	 * This is called from dma_buf_end_cpu_access() when the importer is
 	 * done accessing the CPU. The exporter can use this to flush caches and
-	 * unpin any resources pinned in @begin_cpu_access.
-	 * The result of any dma_buf kmap calls after end_cpu_access is
-	 * undefined.
+	 * undo anything else done in @begin_cpu_access.
 	 *
 	 * This callback is optional.
 	 *
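
The vmap kerneldoc touched above describes a reference-counted mapping: the exporter's vmap callback runs only when no mapping exists, the unmap runs only once, and a mutex guards against concurrent callers. The following is a userspace model of that pattern, not kernel code. Every name in it (struct model_buf, model_vmap, model_vunmap, exporter_vmap, exporter_vunmap) is hypothetical, a pthread mutex stands in for &dma_buf.lock, and calloc/free stand in for the exporter's mapping work. In this sketch the lock is taken inside the map/unmap functions themselves, which is an assumption of the model rather than something the patch states.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer that caches one kernel mapping. */
struct model_buf {
	pthread_mutex_t lock;               /* plays the role of &dma_buf.lock */
	unsigned int vmap_count;            /* outstanding vmap users */
	void *vmap_ptr;                     /* cached mapping, NULL if unmapped */
	size_t size;
	unsigned int exporter_vmap_calls;   /* how often the "exporter" mapped */
	unsigned int exporter_vunmap_calls; /* how often it unmapped */
};

void model_buf_init(struct model_buf *buf, size_t size)
{
	pthread_mutex_init(&buf->lock, NULL);
	buf->vmap_count = 0;
	buf->vmap_ptr = NULL;
	buf->size = size;
	buf->exporter_vmap_calls = 0;
	buf->exporter_vunmap_calls = 0;
}

/* Stand-in for the exporter's vmap callback; returns NULL on failure. */
static void *exporter_vmap(struct model_buf *buf)
{
	buf->exporter_vmap_calls++;
	return calloc(1, buf->size);
}

/* Stand-in for the exporter's vunmap callback. */
static void exporter_vunmap(struct model_buf *buf, void *vaddr)
{
	buf->exporter_vunmap_calls++;
	free(vaddr);
}

/* Map the buffer, calling into the exporter only for the first user. */
void *model_vmap(struct model_buf *buf)
{
	void *ptr;

	pthread_mutex_lock(&buf->lock);
	if (buf->vmap_count == 0)
		buf->vmap_ptr = exporter_vmap(buf);
	ptr = buf->vmap_ptr;
	if (ptr)
		buf->vmap_count++;	/* count only successful mappings */
	pthread_mutex_unlock(&buf->lock);

	return ptr;
}

/* Drop one user; the exporter unmaps only when the last user is gone. */
void model_vunmap(struct model_buf *buf, void *vaddr)
{
	pthread_mutex_lock(&buf->lock);
	assert(buf->vmap_count > 0 && vaddr == buf->vmap_ptr);
	if (--buf->vmap_count == 0) {
		exporter_vunmap(buf, buf->vmap_ptr);
		buf->vmap_ptr = NULL;
	}
	pthread_mutex_unlock(&buf->lock);
}
```

Two model_vmap() calls on the same buffer return the same pointer and trigger a single exporter_vmap() call. The matching exporter_vunmap() call happens only on the second model_vunmap().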