From patchwork Tue Sep 3 20:25:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 103939 Received: from am.mirrors.kernel.org ([147.75.80.249]) by linuxtv.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1sla6n-0005jM-1g for patchwork@linuxtv.org; Tue, 03 Sep 2024 20:26:38 +0000 Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 594BA1F241B2 for ; Tue, 3 Sep 2024 20:26:35 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 69A631D5887; Tue, 3 Sep 2024 20:26:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b="imqEIJlC" X-Original-To: linux-media@vger.kernel.org Received: from sender4-pp-f112.zoho.com (sender4-pp-f112.zoho.com [136.143.188.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24E841D0177; Tue, 3 Sep 2024 20:26:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=136.143.188.112 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395177; cv=pass; b=RgDGzlb1WEaEDo0AL091MgL6P7hBTC3UpNfzjHmY7RO1B21SLfwh97DTIAnOrntBr4wL5huScQMDdhmRnEVU32mVe1A6qFsy9FFywihTrUeDyA+sZ5cLuiQ/65NiddYTK16gMoY9zCHfVMfE3wqoBPoFQGdtudIS4iwfjQpxduw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395177; c=relaxed/simple; bh=lcjqQtDdPov+JTnCe/hssWg2jgy0n8VeMDuBjDDvr/4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GGIiPmFNq9Vou7jWEKIN4X+F//in5gtBL1/R1s/pLS/FsYCB8sAUHF0Ysf3na8FhXvUGDPh/kwx7s6Ro+FeW2pCQD3f4/AfInYz4Zj/5zNWUWox4teqlLZIXNmysqKoZxcc7NL1711IyJzJARhpFMQRFsPSY7zc7tH7MU6EgxQs= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b=imqEIJlC; arc=pass smtp.client-ip=136.143.188.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Delivered-To: boris.brezillon@collabora.com ARC-Seal: i=1; a=rsa-sha256; t=1725395158; cv=none; d=zohomail.com; s=zohoarc; b=n77YTNeCT7jLDQpRgYXpY6+m13wc09MJQ+8xf1zySkX+a5FFoUA7caS2U20UycMUHqzOoN3Y6qZPMVH0r1vLhzsu7MSYVSkDJ/4EAL547RfOxbbyno6N/Qumtu+CKOtjvBIVk0GB0lXno/aU3ljPhoWeTy6L+m/5LcXUuzFVjvo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1725395158; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To; bh=/hw3BzwevORfe2F1Dsvxv8mp6XnBcwkj84xsyjzqFvE=; b=V7M4cUNn0HbjgBC0da0g2oWz6Bbv5Sfd+C8aqrRPQUlaanhND/Q+5e524BRNePRau2gepMDbBjcJqypScfkKF3d2UnV+P+HWteF1gTvmne6loPVEh0XX0L5uBgn93+iP8AYsZI5tPD1Le/tu367HvKwyvH+3yI8V3lu7O65ecYQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=collabora.com; spf=pass smtp.mailfrom=adrian.larumbe@collabora.com; dmarc=pass header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1725395158; s=zohomail; d=collabora.com; i=adrian.larumbe@collabora.com; h=From:From:To:To:Cc:Cc:Subject:Subject:Date:Date:Message-ID:In-Reply-To:References:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-Id:Reply-To; bh=/hw3BzwevORfe2F1Dsvxv8mp6XnBcwkj84xsyjzqFvE=; b=imqEIJlC5rLw1rzYsKcOxOVJRx3enikF8Vch4bCujc1mI9O2OvKTy4hBataEvk5S VOgatXvcnNRgWP2lru7zgyMV9N3wiewsKjd/cSVWo8sw5DsUl3rw5UpUe+phIbZYxyg AtXdqYXecmclSHrNIKjVvzuR8ebfDuunThKinO6M= Received: by mx.zohomail.com with SMTPS id 1725395156600567.3635895856995; Tue, 3 Sep 2024 13:25:56 -0700 (PDT) From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: Boris Brezillon , Steven Price , Liviu Dudau , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: kernel@collabora.com, =?utf-8?q?Adri=C3=A1n_Larumbe?= , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org Subject: [PATCH v5 1/4] drm/panthor: introduce job cycle and timestamp accounting Date: Tue, 3 Sep 2024 21:25:35 +0100 Message-ID: <20240903202541.430225-2-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240903202541.430225-1-adrian.larumbe@collabora.com> References: <20240903202541.430225-1-adrian.larumbe@collabora.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-LSpam-Score: -6.3 (------) X-LSpam-Report: No, score=-6.3 required=5.0 tests=ARC_SIGNED=0.001,ARC_VALID=-0.1,BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,DMARC_PASS=-0.001,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1,RCVD_IN_VALIDITY_CERTIFIED=-3,RCVD_IN_VALIDITY_RPBL=1.31,RCVD_IN_VALIDITY_SAFE=-2,SPF_HELO_NONE=0.001,SPF_PASS=-0.001 autolearn=ham autolearn_force=no Enable calculations of job submission times in clock cycles and wall time. This is done by expanding the boilerplate command stream when running a job to include instructions that compute said times right before an after a user CS. A separate kernel BO is created per queue to store those values. Jobs can access their sampled data through a slots buffer-specific index different from that of the queue's ringbuffer. The reason for this is saving memory on the profiling information kernel BO, since the amount of simultaneous profiled jobs we can write into the queue's ringbuffer might be much smaller than for regular jobs, as the former take more CSF instructions. This commit is done in preparation for enabling DRM fdinfo support in the Panthor driver, which depends on the numbers calculated herein. A profile mode mask has been added that will in a future commit allow UM to toggle performance metric sampling behaviour, which is disabled by default to save power. When a ringbuffer CS is constructed, timestamp and cycling sampling instructions are added depending on the enabled flags in the profiling mask. A helper was provided that calculates the number of instructions for a given set of enablement mask, and these are passed as the number of credits when initialising a DRM scheduler job. Signed-off-by: Adrián Larumbe Reviewed-by: Boris Brezillon Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.h | 22 ++ drivers/gpu/drm/panthor/panthor_sched.c | 327 ++++++++++++++++++++--- 2 files changed, 305 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h index e388c0472ba7..a48e30d0af30 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -66,6 +66,25 @@ struct panthor_irq { atomic_t suspended; }; +/** + * enum panthor_device_profiling_mode - Profiling state + */ +enum panthor_device_profiling_flags { + /** @PANTHOR_DEVICE_PROFILING_DISABLED: Profiling is disabled. */ + PANTHOR_DEVICE_PROFILING_DISABLED = 0, + + /** @PANTHOR_DEVICE_PROFILING_CYCLES: Sampling job cycles. */ + PANTHOR_DEVICE_PROFILING_CYCLES = BIT(0), + + /** @PANTHOR_DEVICE_PROFILING_TIMESTAMP: Sampling job timestamp. */ + PANTHOR_DEVICE_PROFILING_TIMESTAMP = BIT(1), + + /** @PANTHOR_DEVICE_PROFILING_ALL: Sampling everything. */ + PANTHOR_DEVICE_PROFILING_ALL = + PANTHOR_DEVICE_PROFILING_CYCLES | + PANTHOR_DEVICE_PROFILING_TIMESTAMP, +}; + /** * struct panthor_device - Panthor device */ @@ -162,6 +181,9 @@ struct panthor_device { */ struct page *dummy_latest_flush; } pm; + + /** @profile_mask: User-set profiling flags for job accounting. */ + u32 profile_mask; }; /** diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index c426a392b081..b087648bf59a 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -93,6 +93,9 @@ #define MIN_CSGS 3 #define MAX_CSG_PRIO 0xf +#define NUM_INSTRS_PER_CACHE_LINE (64 / sizeof(u64)) +#define MAX_INSTRS_PER_JOB 32 + struct panthor_group; /** @@ -476,6 +479,18 @@ struct panthor_queue { */ struct list_head in_flight_jobs; } fence_ctx; + + /** @profiling_info: Job profiling data slots and access information. */ + struct { + /** @slots: Kernel BO holding the slots. */ + struct panthor_kernel_bo *slots; + + /** @slot_count: Number of jobs ringbuffer can hold at once. */ + u32 slot_count; + + /** @profiling_seqno: Index of the next available profiling information slot. */ + u32 profiling_seqno; + } profiling_info; }; /** @@ -661,6 +676,18 @@ struct panthor_group { struct list_head wait_node; }; +struct panthor_job_profiling_data { + struct { + u64 before; + u64 after; + } cycles; + + struct { + u64 before; + u64 after; + } time; +}; + /** * group_queue_work() - Queue a group work * @group: Group to queue the work for. @@ -774,6 +801,12 @@ struct panthor_job { /** @done_fence: Fence signaled when the job is finished or cancelled. */ struct dma_fence *done_fence; + + /** @profile_mask: Current device job profiling enablement bitmask. */ + u32 profile_mask; + + /** @profile_slot: Job profiling information index in the profiling slots BO. */ + u32 profiling_slot; }; static void @@ -838,6 +871,7 @@ static void group_free_queue(struct panthor_group *group, struct panthor_queue * panthor_kernel_bo_destroy(queue->ringbuf); panthor_kernel_bo_destroy(queue->iface.mem); + panthor_kernel_bo_destroy(queue->profiling_info.slots); /* Release the last_fence we were holding, if any. */ dma_fence_put(queue->fence_ctx.last_fence); @@ -1982,8 +2016,6 @@ tick_ctx_init(struct panthor_scheduler *sched, } } -#define NUM_INSTRS_PER_SLOT 16 - static void group_term_post_processing(struct panthor_group *group) { @@ -2815,65 +2847,211 @@ static void group_sync_upd_work(struct work_struct *work) group_put(group); } -static struct dma_fence * -queue_run_job(struct drm_sched_job *sched_job) +struct panthor_job_ringbuf_instrs { + u64 buffer[MAX_INSTRS_PER_JOB]; + u32 count; +}; + +struct panthor_job_instr { + u32 profile_mask; + u64 instr; +}; + +#define JOB_INSTR(__prof, __instr) \ + { \ + .profile_mask = __prof, \ + .instr = __instr, \ + } + +static void +copy_instrs_to_ringbuf(struct panthor_queue *queue, + struct panthor_job *job, + struct panthor_job_ringbuf_instrs *instrs) +{ + ssize_t ringbuf_size = panthor_kernel_bo_size(queue->ringbuf); + u32 start = job->ringbuf.start & (ringbuf_size - 1); + ssize_t size, written; + + /* + * We need to write a whole slot, including any trailing zeroes + * that may come at the end of it. Also, because instrs.buffer had + * been zero-initialised, there's no need to pad it with 0's + */ + instrs->count = ALIGN(instrs->count, NUM_INSTRS_PER_CACHE_LINE); + size = instrs->count * sizeof(u64); + written = min(ringbuf_size - start, size); + + memcpy(queue->ringbuf->kmap + start, instrs->buffer, written); + + if (written < size) + memcpy(queue->ringbuf->kmap, + &instrs->buffer[(ringbuf_size - start)/sizeof(u64)], + size - written); +} + +struct panthor_job_cs_params { + u32 profile_mask; + u64 addr_reg; u64 val_reg; + u64 cycle_reg; u64 time_reg; + u64 sync_addr; u64 times_addr; + u64 cs_start; u64 cs_size; + u32 last_flush; u32 waitall_mask; +}; + +static void +get_job_cs_params(struct panthor_job *job, struct panthor_job_cs_params *params) { - struct panthor_job *job = container_of(sched_job, struct panthor_job, base); struct panthor_group *group = job->group; struct panthor_queue *queue = group->queues[job->queue_idx]; struct panthor_device *ptdev = group->ptdev; struct panthor_scheduler *sched = ptdev->scheduler; - u32 ringbuf_size = panthor_kernel_bo_size(queue->ringbuf); - u32 ringbuf_insert = queue->iface.input->insert & (ringbuf_size - 1); - u64 addr_reg = ptdev->csif_info.cs_reg_count - - ptdev->csif_info.unpreserved_cs_reg_count; - u64 val_reg = addr_reg + 2; - u64 sync_addr = panthor_kernel_bo_gpuva(group->syncobjs) + - job->queue_idx * sizeof(struct panthor_syncobj_64b); - u32 waitall_mask = GENMASK(sched->sb_slot_count - 1, 0); - struct dma_fence *done_fence; - int ret; - u64 call_instrs[NUM_INSTRS_PER_SLOT] = { + params->addr_reg = ptdev->csif_info.cs_reg_count - + ptdev->csif_info.unpreserved_cs_reg_count; + params->val_reg = params->addr_reg + 2; + params->cycle_reg = params->addr_reg; + params->time_reg = params->val_reg; + + params->sync_addr = panthor_kernel_bo_gpuva(group->syncobjs) + + job->queue_idx * sizeof(struct panthor_syncobj_64b); + params->times_addr = panthor_kernel_bo_gpuva(queue->profiling_info.slots) + + (job->profiling_slot * sizeof(struct panthor_job_profiling_data)); + params->waitall_mask = GENMASK(sched->sb_slot_count - 1, 0); + + params->cs_start = job->call_info.start; + params->cs_size = job->call_info.size; + params->last_flush = job->call_info.latest_flush; + + params->profile_mask = job->profile_mask; +} + +static void +prepare_job_instrs(const struct panthor_job_cs_params *params, + struct panthor_job_ringbuf_instrs *instrs) +{ + const struct panthor_job_instr instr_seq[] = { /* MOV32 rX+2, cs.latest_flush */ - (2ull << 56) | (val_reg << 48) | job->call_info.latest_flush, + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (2ull << 56) | (params->val_reg << 48) | params->last_flush), /* FLUSH_CACHE2.clean_inv_all.no_wait.signal(0) rX+2 */ - (36ull << 56) | (0ull << 48) | (val_reg << 40) | (0 << 16) | 0x233, + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (36ull << 56) | (0ull << 48) | (params->val_reg << 40) | (0 << 16) | 0x233), + + /* MOV48 rX:rX+1, cycles_offset */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_CYCLES, + (1ull << 56) | (params->cycle_reg << 48) | + (params->times_addr + + offsetof(struct panthor_job_profiling_data, cycles.before))), + /* STORE_STATE cycles */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_CYCLES, + (40ull << 56) | (params->cycle_reg << 40) | (1ll << 32)), + /* MOV48 rX:rX+1, time_offset */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_TIMESTAMP, + (1ull << 56) | (params->time_reg << 48) | + (params->times_addr + + offsetof(struct panthor_job_profiling_data, time.before))), + /* STORE_STATE timer */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_TIMESTAMP, + (40ull << 56) | (params->time_reg << 40) | (0ll << 32)), /* MOV48 rX:rX+1, cs.start */ - (1ull << 56) | (addr_reg << 48) | job->call_info.start, - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (1ull << 56) | (params->addr_reg << 48) | params->cs_start), /* MOV32 rX+2, cs.size */ - (2ull << 56) | (val_reg << 48) | job->call_info.size, - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (2ull << 56) | (params->val_reg << 48) | params->cs_size), /* WAIT(0) => waits for FLUSH_CACHE2 instruction */ - (3ull << 56) | (1 << 16), - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, (3ull << 56) | (1 << 16)), /* CALL rX:rX+1, rX+2 */ - (32ull << 56) | (addr_reg << 40) | (val_reg << 32), + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (32ull << 56) | (params->addr_reg << 40) | (params->val_reg << 32)), + + /* MOV48 rX:rX+1, cycles_offset */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_CYCLES, + (1ull << 56) | (params->cycle_reg << 48) | + (params->times_addr + + offsetof(struct panthor_job_profiling_data, cycles.after))), + /* STORE_STATE cycles */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_CYCLES, + (40ull << 56) | (params->cycle_reg << 40) | (1ll << 32)), + + /* MOV48 rX:rX+1, time_offset */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_TIMESTAMP, + (1ull << 56) | (params->time_reg << 48) | + (params->times_addr + + offsetof(struct panthor_job_profiling_data, time.after))), + /* STORE_STATE timer */ + JOB_INSTR(PANTHOR_DEVICE_PROFILING_TIMESTAMP, + (40ull << 56) | (params->time_reg << 40) | (0ll << 32)), /* MOV48 rX:rX+1, sync_addr */ - (1ull << 56) | (addr_reg << 48) | sync_addr, - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (1ull << 56) | (params->addr_reg << 48) | params->sync_addr), /* MOV48 rX+2, #1 */ - (1ull << 56) | (val_reg << 48) | 1, - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (1ull << 56) | (params->val_reg << 48) | 1), /* WAIT(all) */ - (3ull << 56) | (waitall_mask << 16), - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (3ull << 56) | (params->waitall_mask << 16)), /* SYNC_ADD64.system_scope.propage_err.nowait rX:rX+1, rX+2*/ - (51ull << 56) | (0ull << 48) | (addr_reg << 40) | (val_reg << 32) | (0 << 16) | 1, - + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, + (51ull << 56) | (0ull << 48) | (params->addr_reg << 40) | + (params->val_reg << 32) | (0 << 16) | 1), /* ERROR_BARRIER, so we can recover from faults at job * boundaries. */ - (47ull << 56), + JOB_INSTR(PANTHOR_DEVICE_PROFILING_DISABLED, (47ull << 56)), + }; + u32 pad; + + /* NEED to be cacheline aligned to please the prefetcher. */ + static_assert(sizeof(instrs->buffer) % 64 == 0, + "panthor_job_ringbuf_instrs::buffer is not aligned on a cacheline"); + + /* Make sure we have enough storage to store the whole sequence. */ + static_assert(ALIGN(ARRAY_SIZE(instr_seq), NUM_INSTRS_PER_CACHE_LINE) <= + ARRAY_SIZE(instrs->buffer), + "instr_seq vs panthor_job_ringbuf_instrs::buffer size mismatch"); + + for (u32 i = 0; i < ARRAY_SIZE(instr_seq); i++) { + /* If the profile mask of this instruction is not enabled, skip it. */ + if (instr_seq[i].profile_mask && + !(instr_seq[i].profile_mask & params->profile_mask)) + continue; + + instrs->buffer[instrs->count++] = instr_seq[i].instr; + } + + pad = ALIGN(instrs->count, NUM_INSTRS_PER_CACHE_LINE); + memset(&instrs->buffer[instrs->count], 0, + (pad - instrs->count) * sizeof(instrs->buffer[0])); + instrs->count = pad; +} + +static u32 calc_job_credits(u32 profile_mask) +{ + struct panthor_job_ringbuf_instrs instrs; + struct panthor_job_cs_params params = { + .profile_mask = profile_mask, }; - /* Need to be cacheline aligned to please the prefetcher. */ - static_assert(sizeof(call_instrs) % 64 == 0, - "call_instrs is not aligned on a cacheline"); + prepare_job_instrs(¶ms, &instrs); + return instrs.count; +} + +static struct dma_fence * +queue_run_job(struct drm_sched_job *sched_job) +{ + struct panthor_job *job = container_of(sched_job, struct panthor_job, base); + struct panthor_group *group = job->group; + struct panthor_queue *queue = group->queues[job->queue_idx]; + struct panthor_device *ptdev = group->ptdev; + struct panthor_scheduler *sched = ptdev->scheduler; + struct panthor_job_ringbuf_instrs instrs; + struct panthor_job_cs_params cs_params; + struct dma_fence *done_fence; + int ret; /* Stream size is zero, nothing to do except making sure all previously * submitted jobs are done before we signal the @@ -2900,17 +3078,23 @@ queue_run_job(struct drm_sched_job *sched_job) queue->fence_ctx.id, atomic64_inc_return(&queue->fence_ctx.seqno)); - memcpy(queue->ringbuf->kmap + ringbuf_insert, - call_instrs, sizeof(call_instrs)); + job->profiling_slot = queue->profiling_info.profiling_seqno++; + if (queue->profiling_info.profiling_seqno == queue->profiling_info.slot_count) + queue->profiling_info.profiling_seqno = 0; + + job->ringbuf.start = queue->iface.input->insert; + + get_job_cs_params(job, &cs_params); + prepare_job_instrs(&cs_params, &instrs); + copy_instrs_to_ringbuf(queue, job, &instrs); + + job->ringbuf.end = job->ringbuf.start + (instrs.count * sizeof(u64)); panthor_job_get(&job->base); spin_lock(&queue->fence_ctx.lock); list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs); spin_unlock(&queue->fence_ctx.lock); - job->ringbuf.start = queue->iface.input->insert; - job->ringbuf.end = job->ringbuf.start + sizeof(call_instrs); - /* Make sure the ring buffer is updated before the INSERT * register. */ @@ -3003,6 +3187,24 @@ static const struct drm_sched_backend_ops panthor_queue_sched_ops = { .free_job = queue_free_job, }; +static u32 calc_profiling_ringbuf_num_slots(struct panthor_device *ptdev, + u32 cs_ringbuf_size) +{ + u32 min_profiled_job_instrs = U32_MAX; + u32 last_flag = fls(PANTHOR_DEVICE_PROFILING_ALL); + + for (u32 i = 0; i < last_flag; i++) { + if (BIT(i) & PANTHOR_DEVICE_PROFILING_ALL) + min_profiled_job_instrs = + min(min_profiled_job_instrs, calc_job_credits(BIT(i))); + } + + drm_WARN_ON(&ptdev->base, + !IS_ALIGNED(min_profiled_job_instrs, NUM_INSTRS_PER_CACHE_LINE)); + + return DIV_ROUND_UP(cs_ringbuf_size, min_profiled_job_instrs * sizeof(u64)); +} + static struct panthor_queue * group_create_queue(struct panthor_group *group, const struct drm_panthor_queue_create *args) @@ -3056,9 +3258,38 @@ group_create_queue(struct panthor_group *group, goto err_free_queue; } + queue->profiling_info.slot_count = + calc_profiling_ringbuf_num_slots(group->ptdev, args->ringbuf_size); + + queue->profiling_info.slots = + panthor_kernel_bo_create(group->ptdev, group->vm, + queue->profiling_info.slot_count * + sizeof(struct panthor_job_profiling_data), + DRM_PANTHOR_BO_NO_MMAP, + DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC | + DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED, + PANTHOR_VM_KERNEL_AUTO_VA); + + if (IS_ERR(queue->profiling_info.slots)) { + ret = PTR_ERR(queue->profiling_info.slots); + goto err_free_queue; + } + + ret = panthor_kernel_bo_vmap(queue->profiling_info.slots); + if (ret) + goto err_free_queue; + + memset(queue->profiling_info.slots->kmap, 0, + queue->profiling_info.slot_count * sizeof(struct panthor_job_profiling_data)); + + /* + * Credit limit argument tells us the total number of instructions + * across all CS slots in the ringbuffer, with some jobs requiring + * twice as many as others, depending on their profiling status. + */ ret = drm_sched_init(&queue->scheduler, &panthor_queue_sched_ops, group->ptdev->scheduler->wq, 1, - args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)), + args->ringbuf_size / sizeof(u64), 0, msecs_to_jiffies(JOB_TIMEOUT_MS), group->ptdev->reset.wq, NULL, "panthor-queue", group->ptdev->base.dev); @@ -3354,6 +3585,7 @@ panthor_job_create(struct panthor_file *pfile, { struct panthor_group_pool *gpool = pfile->groups; struct panthor_job *job; + u32 credits; int ret; if (qsubmit->pad) @@ -3407,9 +3639,16 @@ panthor_job_create(struct panthor_file *pfile, } } + job->profile_mask = pfile->ptdev->profile_mask; + credits = calc_job_credits(job->profile_mask); + if (credits == 0) { + ret = -EINVAL; + goto err_put_job; + } + ret = drm_sched_job_init(&job->base, &job->group->queues[job->queue_idx]->entity, - 1, job->group); + credits, job->group); if (ret) goto err_put_job; From patchwork Tue Sep 3 20:25:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 103940 Received: from sy.mirrors.kernel.org ([147.75.48.161]) by linuxtv.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1sla70-0005mx-0y for patchwork@linuxtv.org; Tue, 03 Sep 2024 20:26:51 +0000 Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id CA34AB212B8 for ; Tue, 3 Sep 2024 20:26:47 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id EB2CA1C9DCE; Tue, 3 Sep 2024 20:26:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b="gyCFKi3X" X-Original-To: linux-media@vger.kernel.org Received: from sender4-pp-f112.zoho.com (sender4-pp-f112.zoho.com [136.143.188.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C17C81D5897; Tue, 3 Sep 2024 20:26:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=136.143.188.112 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395180; cv=pass; b=lGrOdzHdAtiR4F1VJ/N99WIFO10s0a++Rr2HXb5VEyNgPnc21C2pHgJF37yRYHbF5BgkbCjD80BgJSyqdi2D2LzsT3dgqGAzWURGqLeNhdXyzNY55acgvKIfXTiQh1fsX68A5zrPB3be6YqvELyY4n5L10oMLp8aFgQoYxYcvxg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395180; c=relaxed/simple; bh=Fm3yf2Fi9CYnTTkfNrN4BIanGwQZqGTpdIPeSOgzU+0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mH1BX6BK0kkvGjZyqWztu3xDBExEqQ6sVBeVSF6J1YWfu32EDhpYpVBIP3sBaqiv3FGYbupOK/gY7LbRToxoWHLUm6Rg4mrb7XAqU93MxmO7YVjMjBW78kuplEz981jTwfUqkmd/jmHMBu4GPtqUADE4u+Vty7f7bz8XILRrfiQ= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b=gyCFKi3X; arc=pass smtp.client-ip=136.143.188.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Delivered-To: boris.brezillon@collabora.com ARC-Seal: i=1; a=rsa-sha256; t=1725395160; cv=none; d=zohomail.com; s=zohoarc; b=M/s+g08UUl/LWh4sC4Z6WvnBdBMSSMwtIIFbLtPVCOt5KiMQIbd/06fzrjOEY6PlF864BRdLwcpX8nnUKz3JPKHHvUHB2RGZMAwKVXrWfNz+rMz8U9MwNGPvlGmlHRVbWVA6NZrybus0jPXLJB2UFgwNkcop3LaxUeivHtzygLc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1725395160; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To; bh=0hK5IstV5Rak8f/N8IrRoF6nKtzqAKTvRVsx38XrqAs=; b=UxKbhzU21vfUPmsgyAN1FJVaOnwqJoymesv8bQcwKUQls2hWF1+oZx7YjazzmpjuLsII5dj2VXb5LRU4AR+yUvoC5uUOxto6YMcUK+H8wyPOliCPvcCbsDASiP2any3JjY7v1Kmr+1ajhMxanv9BOZvjBzf90ngQGtLmqWU1OSA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=collabora.com; spf=pass smtp.mailfrom=adrian.larumbe@collabora.com; dmarc=pass header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1725395160; s=zohomail; d=collabora.com; i=adrian.larumbe@collabora.com; h=From:From:To:To:Cc:Cc:Subject:Subject:Date:Date:Message-ID:In-Reply-To:References:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-Id:Reply-To; bh=0hK5IstV5Rak8f/N8IrRoF6nKtzqAKTvRVsx38XrqAs=; b=gyCFKi3X5gft20rQ1tns+ufBt7iTdh/GJovBr13qOAK1mAQCNl8rj0CQednJiNn4 nDMoTWMXQQpGPSau0RXM9eD80II5AzILJmypBUJprp8taQiE0iQrcA0op9qIZGiU7pK fQuRHO+yu43zKBm5sE0Xgana+/KONI3vqtRnJ5xU= Received: by mx.zohomail.com with SMTPS id 1725395159718851.3382164517398; Tue, 3 Sep 2024 13:25:59 -0700 (PDT) From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: Boris Brezillon , Steven Price , Liviu Dudau , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: kernel@collabora.com, =?utf-8?q?Adri=C3=A1n_Larumbe?= , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org Subject: [PATCH v5 2/4] drm/panthor: add DRM fdinfo support Date: Tue, 3 Sep 2024 21:25:36 +0100 Message-ID: <20240903202541.430225-3-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240903202541.430225-1-adrian.larumbe@collabora.com> References: <20240903202541.430225-1-adrian.larumbe@collabora.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-LSpam-Score: -6.3 (------) X-LSpam-Report: No, score=-6.3 required=5.0 tests=ARC_SIGNED=0.001,ARC_VALID=-0.1,BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,DMARC_PASS=-0.001,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1,RCVD_IN_VALIDITY_CERTIFIED=-3,RCVD_IN_VALIDITY_RPBL=1.31,RCVD_IN_VALIDITY_SAFE=-2,SPF_HELO_NONE=0.001,SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no Drawing from the FW-calculated values in the previous commit, we can increase the numbers for an open file by collecting them from finished jobs when updating their group synchronisation objects. Display of fdinfo key-value pairs is governed by a flag that is by default disabled in the present commit, and supporting manual toggle of it will be the matter of a later commit. Signed-off-by: Adrián Larumbe --- drivers/gpu/drm/panthor/panthor_devfreq.c | 18 ++++++++- drivers/gpu/drm/panthor/panthor_device.h | 14 +++++++ drivers/gpu/drm/panthor/panthor_drv.c | 35 ++++++++++++++++++ drivers/gpu/drm/panthor/panthor_sched.c | 45 +++++++++++++++++++++++ 4 files changed, 111 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c index c6d3c327cc24..9d0f891b9b53 100644 --- a/drivers/gpu/drm/panthor/panthor_devfreq.c +++ b/drivers/gpu/drm/panthor/panthor_devfreq.c @@ -62,14 +62,20 @@ static void panthor_devfreq_update_utilization(struct panthor_devfreq *pdevfreq) static int panthor_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) { + struct panthor_device *ptdev = dev_get_drvdata(dev); struct dev_pm_opp *opp; + int err; opp = devfreq_recommended_opp(dev, freq, flags); if (IS_ERR(opp)) return PTR_ERR(opp); dev_pm_opp_put(opp); - return dev_pm_opp_set_rate(dev, *freq); + err = dev_pm_opp_set_rate(dev, *freq); + if (!err) + ptdev->current_frequency = *freq; + + return err; } static void panthor_devfreq_reset(struct panthor_devfreq *pdevfreq) @@ -130,6 +136,7 @@ int panthor_devfreq_init(struct panthor_device *ptdev) struct panthor_devfreq *pdevfreq; struct dev_pm_opp *opp; unsigned long cur_freq; + unsigned long freq = ULONG_MAX; int ret; pdevfreq = drmm_kzalloc(&ptdev->base, sizeof(*ptdev->devfreq), GFP_KERNEL); @@ -161,6 +168,7 @@ int panthor_devfreq_init(struct panthor_device *ptdev) return PTR_ERR(opp); panthor_devfreq_profile.initial_freq = cur_freq; + ptdev->current_frequency = cur_freq; /* Regulator coupling only takes care of synchronizing/balancing voltage * updates, but the coupled regulator needs to be enabled manually. @@ -204,6 +212,14 @@ int panthor_devfreq_init(struct panthor_device *ptdev) dev_pm_opp_put(opp); + /* Find the fastest defined rate */ + opp = dev_pm_opp_find_freq_floor(dev, &freq); + if (IS_ERR(opp)) + return PTR_ERR(opp); + ptdev->fast_rate = freq; + + dev_pm_opp_put(opp); + /* * Setup default thresholds for the simple_ondemand governor. * The values are chosen based on experiments. diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h index a48e30d0af30..0e68f5a70d20 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -184,6 +184,17 @@ struct panthor_device { /** @profile_mask: User-set profiling flags for job accounting. */ u32 profile_mask; + + /** @current_frequency: Device clock frequency at present. Set by DVFS*/ + unsigned long current_frequency; + + /** @fast_rate: Maximum device clock frequency. Set by DVFS */ + unsigned long fast_rate; +}; + +struct panthor_gpu_usage { + u64 time; + u64 cycles; }; /** @@ -198,6 +209,9 @@ struct panthor_file { /** @groups: Scheduling group pool attached to this file. */ struct panthor_group_pool *groups; + + /** @stats: cycle and timestamp measures for job execution. */ + struct panthor_gpu_usage stats; }; int panthor_device_init(struct panthor_device *ptdev); diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c index b5e7b919f241..e18838754963 100644 --- a/drivers/gpu/drm/panthor/panthor_drv.c +++ b/drivers/gpu/drm/panthor/panthor_drv.c @@ -3,12 +3,17 @@ /* Copyright 2019 Linaro, Ltd., Rob Herring */ /* Copyright 2019 Collabora ltd. */ +#ifdef CONFIG_ARM_ARCH_TIMER +#include +#endif + #include #include #include #include #include #include +#include #include #include @@ -1351,6 +1356,34 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma) return ret; } +static void panthor_gpu_show_fdinfo(struct panthor_device *ptdev, + struct panthor_file *pfile, + struct drm_printer *p) +{ + if (ptdev->profile_mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP) { +#ifdef CONFIG_ARM_ARCH_TIMER + drm_printf(p, "drm-engine-panthor:\t%llu ns\n", + DIV_ROUND_UP_ULL((pfile->stats.time * NSEC_PER_SEC), + arch_timer_get_cntfrq())); +#endif + } + if (ptdev->profile_mask & PANTHOR_DEVICE_PROFILING_CYCLES) + drm_printf(p, "drm-cycles-panthor:\t%llu\n", pfile->stats.cycles); + + drm_printf(p, "drm-maxfreq-panthor:\t%lu Hz\n", ptdev->fast_rate); + drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n", ptdev->current_frequency); +} + +static void panthor_show_fdinfo(struct drm_printer *p, struct drm_file *file) +{ + struct drm_device *dev = file->minor->dev; + struct panthor_device *ptdev = container_of(dev, struct panthor_device, base); + + panthor_gpu_show_fdinfo(ptdev, file->driver_priv, p); + + drm_show_memory_stats(p, file); +} + static const struct file_operations panthor_drm_driver_fops = { .open = drm_open, .release = drm_release, @@ -1360,6 +1393,7 @@ static const struct file_operations panthor_drm_driver_fops = { .read = drm_read, .llseek = noop_llseek, .mmap = panthor_mmap, + .show_fdinfo = drm_show_fdinfo, }; #ifdef CONFIG_DEBUG_FS @@ -1378,6 +1412,7 @@ static const struct drm_driver panthor_drm_driver = { DRIVER_SYNCOBJ_TIMELINE | DRIVER_GEM_GPUVA, .open = panthor_open, .postclose = panthor_postclose, + .show_fdinfo = panthor_show_fdinfo, .ioctls = panthor_drm_driver_ioctls, .num_ioctls = ARRAY_SIZE(panthor_drm_driver_ioctls), .fops = &panthor_drm_driver_fops, diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index b087648bf59a..e69ab5175ae8 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -619,6 +619,18 @@ struct panthor_group { */ struct panthor_kernel_bo *syncobjs; + /** @fdinfo: Per-file total cycle and timestamp values reference. */ + struct { + /** @data: Pointer to actual per-file sample data. */ + struct panthor_gpu_usage *data; + + /** + * @lock: Mutex to govern concurrent access from drm file's fdinfo callback + * and job post-completion processing function + */ + struct mutex lock; + } fdinfo; + /** @state: Group state. */ enum panthor_group_state state; @@ -886,6 +898,8 @@ static void group_release_work(struct work_struct *work) release_work); u32 i; + mutex_destroy(&group->fdinfo.lock); + for (i = 0; i < group->queue_count; i++) group_free_queue(group, group->queues[i]); @@ -2808,6 +2822,28 @@ void panthor_sched_post_reset(struct panthor_device *ptdev, bool reset_failed) } } +static void update_fdinfo_stats(struct panthor_job *job) +{ + struct panthor_group *group = job->group; + struct panthor_queue *queue = group->queues[job->queue_idx]; + struct panthor_gpu_usage *fdinfo; + struct panthor_job_profiling_data *times; + + times = (struct panthor_job_profiling_data *) + ((unsigned long) queue->profiling_info.slots->kmap + + (job->profiling_slot * sizeof(struct panthor_job_profiling_data))); + + mutex_lock(&group->fdinfo.lock); + if ((group->fdinfo.data)) { + fdinfo = group->fdinfo.data; + if (job->profile_mask & PANTHOR_DEVICE_PROFILING_CYCLES) + fdinfo->cycles += times->cycles.after - times->cycles.before; + if (job->profile_mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP) + fdinfo->time += times->time.after - times->time.before; + } + mutex_unlock(&group->fdinfo.lock); +} + static void group_sync_upd_work(struct work_struct *work) { struct panthor_group *group = @@ -2840,6 +2876,8 @@ static void group_sync_upd_work(struct work_struct *work) dma_fence_end_signalling(cookie); list_for_each_entry_safe(job, job_tmp, &done_jobs, node) { + if (job->profile_mask) + update_fdinfo_stats(job); list_del_init(&job->node); panthor_job_put(&job->base); } @@ -3430,6 +3468,9 @@ int panthor_group_create(struct panthor_file *pfile, } mutex_unlock(&sched->reset.lock); + group->fdinfo.data = &pfile->stats; + mutex_init(&group->fdinfo.lock); + return gid; err_put_group: @@ -3469,6 +3510,10 @@ int panthor_group_destroy(struct panthor_file *pfile, u32 group_handle) mutex_unlock(&sched->lock); mutex_unlock(&sched->reset.lock); + mutex_lock(&group->fdinfo.lock); + group->fdinfo.data = NULL; + mutex_unlock(&group->fdinfo.lock); + group_put(group); return 0; } From patchwork Tue Sep 3 20:25:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 103941 Received: from sv.mirrors.kernel.org ([139.178.88.99]) by linuxtv.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1sla7D-0005nM-1o for patchwork@linuxtv.org; Tue, 03 Sep 2024 20:27:04 +0000 Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id B1B5F2858CE for ; Tue, 3 Sep 2024 20:27:02 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 6840E1D5CC2; Tue, 3 Sep 2024 20:26:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b="Zqr/jTHb" X-Original-To: linux-media@vger.kernel.org Received: from sender4-pp-f112.zoho.com (sender4-pp-f112.zoho.com [136.143.188.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3A241D58A4; Tue, 3 Sep 2024 20:26:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=136.143.188.112 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395183; cv=pass; b=RKVtIg5xDVKAjhKVat/gUYXUHrAep8A9xP/u3UEDxflBeY8eKmyQbFRNexeA9lNaC9vlFEjmA3qwt5AbfiDEZAVfMXR6SWLZAmdNPOcoebAKH4uGhz/wY02r5mwmA5zzKS5vm8NuxP4+aQeol2CAJvpKScs99TfyAnem1CfwKCw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395183; c=relaxed/simple; bh=XGPb9LbdVs7JugQ3H1n47tAaUHx6JRllnNEKY/oQUiA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cMtHuN7Nv3yXT5jjPmPFz/iXa20l1rZE8dCrhMlVrQKqsooTwnB04Gwi9o2/don0sfm8BwdHXBXPQQd/5s9kakd9zilpzenGAIieirOIzIvP+DBwyV8CyN2AZd7asXFhkWxHCv1P+Zxb4K5nxHZdchIUgZuc5xfRtAWh/A6DgHs= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b=Zqr/jTHb; arc=pass smtp.client-ip=136.143.188.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Delivered-To: boris.brezillon@collabora.com ARC-Seal: i=1; a=rsa-sha256; t=1725395164; cv=none; d=zohomail.com; s=zohoarc; b=PHJigLn3btVtrMZHHeuZViEjw7MJrDS5XPk1DQVzOVE/r99LpyTge5UIkVjAORfo9AhFKq7ElqyPmAB8P7NhYxtmSBDw05xNCY9e1tjdIoDY9B3F40G26rXGk2PFSycNgY3jS8IClRE4y2f2ThKobiDXr0YIpFoAvNieO0nNwKw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1725395164; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To; bh=PrtZYtmwgxZHb8XgHEA0W+MTLGH+E4kIm30ubb9dtB4=; b=H9Ix7wUYZKgiQhxkB2nLq/Ljgr84mC100ku28HyJBtYBd6eBqIDxGptJTz2+JHrlgDNIOYHahcYO5v6D2w6IgK8ghOJjE2vVFZeKpWjr3upLnnZqEdsbC/EbCh9UT9y5F8wELGj0nP6f3GjO+n/lA7u/rUHSIs+6CnQtPTQ0mw4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=collabora.com; spf=pass smtp.mailfrom=adrian.larumbe@collabora.com; dmarc=pass header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1725395164; s=zohomail; d=collabora.com; i=adrian.larumbe@collabora.com; h=From:From:To:To:Cc:Cc:Subject:Subject:Date:Date:Message-ID:In-Reply-To:References:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-Id:Reply-To; bh=PrtZYtmwgxZHb8XgHEA0W+MTLGH+E4kIm30ubb9dtB4=; b=Zqr/jTHbOzr2DkmD1SvqxnEvWJeLmfy5Jiju+NeBACjDUJ4jOJBzXDB+iC9a5JhW MupztvuOlbYB+ToE2k4kzyDOwMSSuYWelVHIIMKXe1DyOgNenlkG6ONeZvJ5RwRyVEd haXe8eGiFGI9i0okdDXwCOL73B070STyPu38Hfw8= Received: by mx.zohomail.com with SMTPS id 1725395162731400.01079093347084; Tue, 3 Sep 2024 13:26:02 -0700 (PDT) From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: Boris Brezillon , Steven Price , Liviu Dudau , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: kernel@collabora.com, =?utf-8?q?Adri=C3=A1n_Larumbe?= , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org Subject: [PATCH v5 3/4] drm/panthor: enable fdinfo for memory stats Date: Tue, 3 Sep 2024 21:25:37 +0100 Message-ID: <20240903202541.430225-4-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240903202541.430225-1-adrian.larumbe@collabora.com> References: <20240903202541.430225-1-adrian.larumbe@collabora.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-LSpam-Score: -8.6 (--------) X-LSpam-Report: No, score=-8.6 required=5.0 tests=ARC_SIGNED=0.001,ARC_VALID=-0.1,BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,DMARC_PASS=-0.001,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1,RCVD_IN_DNSWL_MED=-2.3,RCVD_IN_VALIDITY_CERTIFIED=-3,RCVD_IN_VALIDITY_RPBL=1.31,RCVD_IN_VALIDITY_SAFE=-2,SPF_HELO_NONE=0.001,SPF_PASS=-0.001 autolearn=ham autolearn_force=no Implement drm object's status callback. Also, we consider a PRIME imported BO to be resident if its matching dma_buf has an open attachment, which means its backing storage had already been allocated. Signed-off-by: Adrián Larumbe Reviewed-by: Liviu Dudau Reviewed-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_gem.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c index 38f560864879..c60b599665d8 100644 --- a/drivers/gpu/drm/panthor/panthor_gem.c +++ b/drivers/gpu/drm/panthor/panthor_gem.c @@ -145,6 +145,17 @@ panthor_gem_prime_export(struct drm_gem_object *obj, int flags) return drm_gem_prime_export(obj, flags); } +static enum drm_gem_object_status panthor_gem_status(struct drm_gem_object *obj) +{ + struct panthor_gem_object *bo = to_panthor_bo(obj); + enum drm_gem_object_status res = 0; + + if (bo->base.base.import_attach || bo->base.pages) + res |= DRM_GEM_OBJECT_RESIDENT; + + return res; +} + static const struct drm_gem_object_funcs panthor_gem_funcs = { .free = panthor_gem_free_object, .print_info = drm_gem_shmem_object_print_info, @@ -154,6 +165,7 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = { .vmap = drm_gem_shmem_object_vmap, .vunmap = drm_gem_shmem_object_vunmap, .mmap = panthor_gem_mmap, + .status = panthor_gem_status, .export = panthor_gem_prime_export, .vm_ops = &drm_gem_shmem_vm_ops, }; From patchwork Tue Sep 3 20:25:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Adri=C3=A1n_Larumbe?= X-Patchwork-Id: 103942 Received: from ny.mirrors.kernel.org ([147.75.199.223]) by linuxtv.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1sla7U-0005no-2o for patchwork@linuxtv.org; Tue, 03 Sep 2024 20:27:22 +0000 Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id CC5D31C213AD for ; Tue, 3 Sep 2024 20:27:19 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2C0F11D5CF5; Tue, 3 Sep 2024 20:26:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b="ZCSHzAWj" X-Original-To: linux-media@vger.kernel.org Received: from sender4-pp-f112.zoho.com (sender4-pp-f112.zoho.com [136.143.188.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3220D1D58BF; Tue, 3 Sep 2024 20:26:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=pass smtp.client-ip=136.143.188.112 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395184; cv=pass; b=QcWXT3TeRjr3MB34EY0d+FP5ptagIX9QUv7WQ93+NCc20MMMjCMtvNyGHEKzu1si+I4Ppwy5sWpub17UG6R1g0fQUloMCPPzHg06q0evtrScfAP/0NAVv+PTJQzPr0ybXdwP5TJw30QiMqjjFrhuBBzCSkEJQr2ZQOiiMFxdbIM= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725395184; c=relaxed/simple; bh=YdZ2DdOrTJDFkr8c/oJaKgykzTFrzwcLpFaAYmSUZzU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AbQWpLzOr1ByqPf3HXz19IhCw9geR6EfMtTmpzLwpNxzQN8zLea/Sqt3lFdLJc4akiHbdmUNdOAg/KASYSLHursxD3GIq88FJM4X2uYCf5nITckKRcv8mPHopdoYyIEYVHWt0dkK7sbjTxtEUgQEqCAUjYcDITLRVCftKwu6ea8= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (1024-bit key) header.d=collabora.com header.i=adrian.larumbe@collabora.com header.b=ZCSHzAWj; arc=pass smtp.client-ip=136.143.188.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Delivered-To: boris.brezillon@collabora.com ARC-Seal: i=1; a=rsa-sha256; t=1725395166; cv=none; d=zohomail.com; s=zohoarc; b=m1K9UV8esZGKz0vKWVafBmGl2luqEoWELrJEGlqEudisxAfCH4vOvdtheEy58ruAmfhQ9t4qIv4LV+r4qV/3cjlXoz7ZHcaE8Gk/wlnu0sdALgjnyy+IpVSVNT3y7f792h36eUvL5tBJ1CfbCGb/3di4rByLqJ2IA8FEMyNqGBM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1725395166; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To; bh=S4H1TvHBmLqicNFvkbMgMBC/nubwCjLcKnTiqllabKQ=; b=cLTr9OeDdr6uhrQkIHj3hoyJ5JtuwJkQsJZGDekRPEGA5t3HTDGMtRcUdNLCuExCfMaDL58NC/03Ry7hA0aqsxm86O7wuWKdwos+Eh15hr6QN1wBYItnBOqLVc/sxSrlX3QlUQPbLcJeXWIZYznwEXNgSMRDHFJxvWNoVSwEHsk= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=collabora.com; spf=pass smtp.mailfrom=adrian.larumbe@collabora.com; dmarc=pass header.from= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1725395166; s=zohomail; d=collabora.com; i=adrian.larumbe@collabora.com; h=From:From:To:To:Cc:Cc:Subject:Subject:Date:Date:Message-ID:In-Reply-To:References:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-Id:Reply-To; bh=S4H1TvHBmLqicNFvkbMgMBC/nubwCjLcKnTiqllabKQ=; b=ZCSHzAWjzS1zMDIDhRxAxElB6sd+6LtjInAURASYoWEvzgyjqXmYm32dFq+47zy0 KJyORlmIxNCxzcHrW5ODBjmr1vmNrKy7TL2gtiep/GrY9aoycwadfDqitoLHXE/9+4E UdJStkweJy9MO6mJBXlcMpETdrl+GzecaBNoGCY8= Received: by mx.zohomail.com with SMTPS id 1725395165743108.55172235697557; Tue, 3 Sep 2024 13:26:05 -0700 (PDT) From: =?utf-8?q?Adri=C3=A1n_Larumbe?= To: Boris Brezillon , Steven Price , Liviu Dudau , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: kernel@collabora.com, =?utf-8?q?Adri=C3=A1n_Larumbe?= , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org Subject: [PATCH v5 4/4] drm/panthor: add sysfs knob for enabling job profiling Date: Tue, 3 Sep 2024 21:25:38 +0100 Message-ID: <20240903202541.430225-5-adrian.larumbe@collabora.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20240903202541.430225-1-adrian.larumbe@collabora.com> References: <20240903202541.430225-1-adrian.larumbe@collabora.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-LSpam-Score: -6.3 (------) X-LSpam-Report: No, score=-6.3 required=5.0 tests=ARC_SIGNED=0.001,ARC_VALID=-0.1,BAYES_00=-1.9,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,DMARC_PASS=-0.001,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1,RCVD_IN_VALIDITY_CERTIFIED=-3,RCVD_IN_VALIDITY_RPBL=1.31,RCVD_IN_VALIDITY_SAFE=-2,SPF_HELO_NONE=0.001,SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no This commit introduces a DRM device sysfs attribute that lets UM control the job accounting status in the device. The knob variable had been brought in as part of a previous commit, but now we're able to fix it manually. As sysfs files are part of a driver's uAPI, describe its legitimate input values and output format in a documentation file. Signed-off-by: Adrián Larumbe --- Documentation/gpu/panthor.rst | 46 +++++++++++++++++++++++++++ drivers/gpu/drm/panthor/panthor_drv.c | 39 +++++++++++++++++++++++ 2 files changed, 85 insertions(+) create mode 100644 Documentation/gpu/panthor.rst diff --git a/Documentation/gpu/panthor.rst b/Documentation/gpu/panthor.rst new file mode 100644 index 000000000000..cbf5c4429a2d --- /dev/null +++ b/Documentation/gpu/panthor.rst @@ -0,0 +1,46 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +========================= + drm/Panthor CSF driver +========================= + +.. _panfrost-usage-stats: + +Panthor DRM client usage stats implementation +============================================== + +The drm/Panthor driver implements the DRM client usage stats specification as +documented in :ref:`drm-client-usage-stats`. + +Example of the output showing the implemented key value pairs and entirety of +the currently possible format options: + +:: + pos: 0 + flags: 02400002 + mnt_id: 29 + ino: 491 + drm-driver: panthor + drm-client-id: 10 + drm-engine-panthor: 111110952750 ns + drm-cycles-panthor: 94439687187 + drm-maxfreq-panthor: 1000000000 Hz + drm-curfreq-panthor: 1000000000 Hz + drm-total-memory: 16480 KiB + drm-shared-memory: 0 + drm-active-memory: 16200 KiB + drm-resident-memory: 16480 KiB + drm-purgeable-memory: 0 + +Possible `drm-engine-` key names are: `panthor`. +`drm-curfreq-` values convey the current operating frequency for that engine. + +Users must bear in mind that engine and cycle sampling are disabled by default, +because of power saving concerns. `fdinfo` users and benchmark applications which +query the fdinfo file must make sure to toggle the job profiling status of the +driver by writing into the appropriate sysfs node:: + + echo > /sys/bus/platform/drivers/panthor/[a-f0-9]*.gpu/profiling + +Where `N` is a bit mask where cycle and timestamp sampling are respectively +enabled by the first and second bits. diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c index e18838754963..26475db96c41 100644 --- a/drivers/gpu/drm/panthor/panthor_drv.c +++ b/drivers/gpu/drm/panthor/panthor_drv.c @@ -1450,6 +1450,44 @@ static void panthor_remove(struct platform_device *pdev) panthor_device_unplug(ptdev); } +static ssize_t profiling_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct panthor_device *ptdev = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%d\n", ptdev->profile_mask); +} + +static ssize_t profiling_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct panthor_device *ptdev = dev_get_drvdata(dev); + u32 value; + int err; + + err = kstrtou32(buf, 0, &value); + if (err) + return err; + + if ((value & ~PANTHOR_DEVICE_PROFILING_ALL) != 0) + return -EINVAL; + + ptdev->profile_mask = value; + + return len; +} + +static DEVICE_ATTR_RW(profiling); + +static struct attribute *panthor_attrs[] = { + &dev_attr_profiling.attr, + NULL, +}; + +ATTRIBUTE_GROUPS(panthor); + static const struct of_device_id dt_match[] = { { .compatible = "rockchip,rk3588-mali" }, { .compatible = "arm,mali-valhall-csf" }, @@ -1469,6 +1507,7 @@ static struct platform_driver panthor_driver = { .name = "panthor", .pm = pm_ptr(&panthor_pm_ops), .of_match_table = dt_match, + .dev_groups = panthor_groups, }, };