From patchwork Tue Dec  3 11:51:13 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Archit Taneja <archit@ti.com>
X-Patchwork-Id: 20872
From: Archit Taneja <archit@ti.com>
To: <linux-media@vger.kernel.org>, <k.debski@samsung.com>
CC: <linux-omap@vger.kernel.org>, <hverkuil@xs4all.nl>,
	Archit Taneja <archit@ti.com>
Subject: [PATCH 2/2] v4l: ti-vpe: make sure VPDMA line stride constraints
	are met
Date: Tue, 3 Dec 2013 17:21:13 +0530
Message-ID: <1386071473-10808-3-git-send-email-archit@ti.com>
X-Mailer: git-send-email 1.8.3.2
In-Reply-To: <1386071473-10808-1-git-send-email-archit@ti.com>
References: <1386071473-10808-1-git-send-email-archit@ti.com>
Sender: linux-media-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-media.vger.kernel.org>
X-Mailing-List: linux-media@vger.kernel.org

When VPDMA fetches from or writes to an image buffer, the line stride must be
a multiple of 16 bytes. If it isn't, the VPDMA hardware fetches or writes up
to the next 16-byte boundary. This causes VPE to work incorrectly for source
or destination widths that don't satisfy this alignment requirement.

To prevent this, make sure that the VPE source and destination image line
strides are 16-byte aligned when the pixel format is set for the input and
output buffers. Also size the de-interlacer's motion vector buffers so that
their line stride has the same alignment.

Signed-off-by: Archit Taneja <archit@ti.com>
---
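Note for reviewers: the arithmetic behind the change can be sketched in
plain C as below. This is only an illustration under stated assumptions, not
driver code. The names align_up(), padded_stride() and width_align_order()
are hypothetical stand-ins for the kernel's ALIGN() and order_base_2(), and
STRIDE_ALIGN mirrors VPDMA_STRIDE_ALIGN. depth_bytes is assumed to be 1, 2,
3 or 4.

```c
#include <assert.h>

#define STRIDE_ALIGN 16u	/* mirrors VPDMA_STRIDE_ALIGN from this patch */

/* Round x up to a multiple of a; a must be a power of two. */
unsigned int align_up(unsigned int x, unsigned int a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Line stride in bytes for a width in pixels and a depth in bits per pixel. */
unsigned int padded_stride(unsigned int width, unsigned int depth_bits)
{
	return align_up((width * depth_bits) >> 3, STRIDE_ALIGN);
}

/*
 * log2 of the pixel width alignment that guarantees a 16 byte aligned
 * stride for a given number of bytes per pixel (assumed to be 1..4).
 */
unsigned int width_align_order(unsigned int depth_bytes)
{
	unsigned int pixels, order = 0;

	assert(depth_bytes >= 1 && depth_bytes <= 4);

	/* 3 and 16 share no factor, so only 16 pixel multiples work */
	if (depth_bytes == 3)
		return 4;

	pixels = STRIDE_ALIGN / depth_bytes;
	while ((1u << order) < pixels)
		order++;

	return order;
}
```

For instance, a 1002 pixel wide line at 16 bits per pixel takes 2004 bytes
and is padded to 2016 bytes, while aligning the width to 8 pixels (order 3)
avoids the padding altogether for that depth.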
 drivers/media/platform/ti-vpe/vpdma.c |  4 +--
 drivers/media/platform/ti-vpe/vpdma.h |  5 +++-
 drivers/media/platform/ti-vpe/vpe.c   | 53 ++++++++++++++++++++++++++---------
 3 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/drivers/media/platform/ti-vpe/vpdma.c b/drivers/media/platform/ti-vpe/vpdma.c
index af0a5ff..f97253f 100644
--- a/drivers/media/platform/ti-vpe/vpdma.c
+++ b/drivers/media/platform/ti-vpe/vpdma.c
@@ -602,7 +602,7 @@ void vpdma_add_out_dtd(struct vpdma_desc_list *list, struct v4l2_rect *c_rect,
 	if (fmt->data_type == DATA_TYPE_C420)
 		depth = 8;
 
-	stride = (depth * c_rect->width) >> 3;
+	stride = ALIGN((depth * c_rect->width) >> 3, VPDMA_STRIDE_ALIGN);
 	dma_addr += (c_rect->left * depth) >> 3;
 
 	dtd = list->next;
@@ -655,7 +655,7 @@ void vpdma_add_in_dtd(struct vpdma_desc_list *list, int frame_width,
 		depth = 8;
 	}
 
-	stride = (depth * c_rect->width) >> 3;
+	stride = ALIGN((depth * c_rect->width) >> 3, VPDMA_STRIDE_ALIGN);
 	dma_addr += (c_rect->left * depth) >> 3;
 
 	dtd = list->next;
diff --git a/drivers/media/platform/ti-vpe/vpdma.h b/drivers/media/platform/ti-vpe/vpdma.h
index eaa2a71..62dd143 100644
--- a/drivers/media/platform/ti-vpe/vpdma.h
+++ b/drivers/media/platform/ti-vpe/vpdma.h
@@ -45,7 +45,10 @@ struct vpdma_data_format {
 };
 
 #define VPDMA_DESC_ALIGN		16	/* 16-byte descriptor alignment */
-
+#define VPDMA_STRIDE_ALIGN		16	/*
+						 * line stride of source and dest
+						 * buffers should be 16 byte aligned
+						 */
 #define VPDMA_DTD_DESC_SIZE		32	/* 8 words */
 #define VPDMA_CFD_CTD_DESC_SIZE		16	/* 4 words */
 
diff --git a/drivers/media/platform/ti-vpe/vpe.c b/drivers/media/platform/ti-vpe/vpe.c
index 4e58069..a5f7a35 100644
--- a/drivers/media/platform/ti-vpe/vpe.c
+++ b/drivers/media/platform/ti-vpe/vpe.c
@@ -30,6 +30,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/videodev2.h>
+#include <linux/log2.h>
 
 #include <media/v4l2-common.h>
 #include <media/v4l2-ctrls.h>
@@ -54,10 +55,6 @@
 /* required alignments */
 #define S_ALIGN		0	/* multiple of 1 */
 #define H_ALIGN		1	/* multiple of 2 */
-#define W_ALIGN		1	/* multiple of 2 */
-
-/* multiple of 128 bits, line stride, 16 bytes */
-#define L_ALIGN		4
 
 /* flags that indicate a format can be used for capture/output */
 #define VPE_FMT_TYPE_CAPTURE	(1 << 0)
@@ -780,12 +777,21 @@ static int set_srcdst_params(struct vpe_ctx *ctx)
 
 	if ((s_q_data->flags & Q_DATA_INTERLACED) &&
 			!(d_q_data->flags & Q_DATA_INTERLACED)) {
+		int bytes_per_line;
 		const struct vpdma_data_format *mv =
 			&vpdma_misc_fmts[VPDMA_DATA_FMT_MV];
 
 		ctx->deinterlacing = 1;
-		mv_buf_size =
-			(s_q_data->width * s_q_data->height * mv->depth) >> 3;
+		/*
+		 * we make sure that the source image has a 16 byte aligned
+		 * stride, we need to do the same for the motion vector buffer
+		 * by aligning its stride to the next 16 byte boundary. this
+		 * extra space will not be used by the de-interlacer, but will
+		 * ensure that vpdma operates correctly
+		 */
+		bytes_per_line = ALIGN((s_q_data->width * mv->depth) >> 3,
+					VPDMA_STRIDE_ALIGN);
+		mv_buf_size = bytes_per_line * s_q_data->height;
 	} else {
 		ctx->deinterlacing = 0;
 		mv_buf_size = 0;
@@ -1352,7 +1358,8 @@ static int __vpe_try_fmt(struct vpe_ctx *ctx, struct v4l2_format *f,
 {
 	struct v4l2_pix_format_mplane *pix = &f->fmt.pix_mp;
 	struct v4l2_plane_pix_format *plane_fmt;
-	int i;
+	unsigned int w_align;
+	int i, depth, depth_bytes;
 
 	if (!fmt || !(fmt->types & type)) {
 		vpe_err(ctx->dev, "Fourcc format (0x%08x) invalid.\n",
@@ -1363,7 +1370,31 @@ static int __vpe_try_fmt(struct vpe_ctx *ctx, struct v4l2_format *f,
 	if (pix->field != V4L2_FIELD_NONE && pix->field != V4L2_FIELD_ALTERNATE)
 		pix->field = V4L2_FIELD_NONE;
 
-	v4l_bound_align_image(&pix->width, MIN_W, MAX_W, W_ALIGN,
+	depth = fmt->vpdma_fmt[VPE_LUMA]->depth;
+
+	/*
+	 * the line stride should be 16 byte aligned for VPDMA to work, based on
+	 * the bytes per pixel, figure out how much the width should be aligned
+	 * to make sure line stride is 16 byte aligned
+	 */
+	depth_bytes = depth >> 3;
+
+	if (depth_bytes == 3)
+		/*
+		 * if bpp is 3 (as in some RGB formats), only a width that is a
+		 * multiple of 16 pixels gives a 16 byte aligned line stride
+		 */
+		w_align = 4;
+	else
+		/*
+		 * for the remainder bpp(4, 2 and 1), the pixel width alignment
+		 * can ensure a line stride alignment of 16 bytes. For example,
+		 * if bpp is 2, then the line stride is 16 byte aligned if
+		 * the width is a multiple of 8 pixels
+		 */
+		w_align = order_base_2(VPDMA_STRIDE_ALIGN / depth_bytes);
+
+	v4l_bound_align_image(&pix->width, MIN_W, MAX_W, w_align,
 			      &pix->height, MIN_H, MAX_H, H_ALIGN,
 			      S_ALIGN);
 
@@ -1383,15 +1414,11 @@ static int __vpe_try_fmt(struct vpe_ctx *ctx, struct v4l2_format *f,
 	}
 
 	for (i = 0; i < pix->num_planes; i++) {
-		int depth;
-
 		plane_fmt = &pix->plane_fmt[i];
 		depth = fmt->vpdma_fmt[i]->depth;
 
 		if (i == VPE_LUMA)
-			plane_fmt->bytesperline =
-					round_up((pix->width * depth) >> 3,
-						1 << L_ALIGN);
+			plane_fmt->bytesperline = (pix->width * depth) >> 3;
 		else
 			plane_fmt->bytesperline = pix->width;