Message ID | 20240113183334.1690740-1-aurelien@aurel32.net (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Hans Verkuil |
Headers |
Return-path: <linux-media+bounces-3669-mchehab=infradead.org@vger.kernel.org> Envelope-to: mchehab@infradead.org Delivery-date: Sat, 13 Jan 2024 18:41:58 +0000 Received: from bombadil.infradead.org [198.137.202.133] by coco.lan with IMAP (fetchmail-6.4.37) for <mchehab@localhost> (single-drop); Sat, 13 Jan 2024 19:42:33 +0100 (CET) Received: from ny.mirrors.kernel.org ([147.75.199.223]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rOixA-005Xll-1l for mchehab@infradead.org; Sat, 13 Jan 2024 18:41:58 +0000 Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 887951C211DE for <mchehab@infradead.org>; Sat, 13 Jan 2024 18:41:25 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 7892D6122; Sat, 13 Jan 2024 18:41:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=aurel32.net header.i=@aurel32.net header.b="s6GswYBU" X-Original-To: linux-media@vger.kernel.org Received: from hall.aurel32.net (hall.aurel32.net [195.154.113.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A23963A6; Sat, 13 Jan 2024 18:41:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=aurel32.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aurel32.net DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=aurel32.net ; s=202004.hall; h=Content-Transfer-Encoding:MIME-Version:Message-ID:Date: Subject:Cc:To:From:Content-Type:From:Reply-To:Subject:Content-ID: Content-Description:In-Reply-To:References:X-Debbugs-Cc; bh=RuJgwh+4GbW/+hxO9wfk+hS1gNlricVg+fWINBANyag=; b=s6GswYBUChRyFi0RUQt7eFa348 THJNVCSfF9dsOWDNXQ2FyYVugMB2FDC7WvDNfGy6Ub8SxZ1fbGUolaOirL8G0fDVKIyUmFY3VsvfV iSHyPD0pkTADxfCse4NF8BlBkXEN9Fqa9BpS3FFcKOJwHjdD6iMcl79QZl05Y91WfIX+QU+MrIfn2 bQ4uKuxrr63dBE8vqAydFK0b8Opj6Cx3T2RT/3o+xMuz1NPCLfyeR1JZIt6HmibnsU2OQzrbGj3c6 N7BkRNkTDIbAleC4mkFNs201XSU3vS48Ov1Ec4/yQEUjAAaiqS2RHtZSW+OrU+sJykf/ZuNWfoA0u UIt+sGkw==; Received: from ohm.aurel32.net ([2001:bc8:30d7:111::2] helo=ohm.rr44.fr) by hall.aurel32.net with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from <aurelien@aurel32.net>) id 1rOip8-00HYMM-1w; Sat, 13 Jan 2024 19:33:38 +0100 From: Aurelien Jarno <aurelien@aurel32.net> To: linux-kernel@vger.kernel.org, Bluecherry Maintainers <maintainers@bluecherrydvr.com>, Anton Sviridenko <anton@corp.bluecherry.net>, Andrey Utkin <andrey_utkin@fastmail.com>, Ismael Luceno <ismael@iodev.co.uk>, Mauro Carvalho Chehab <mchehab@kernel.org>, linux-media@vger.kernel.org (open list:SOFTLOGIC 6x10 MPEG CODEC) Cc: Linus Torvalds <torvalds@linux-foundation.org>, Andy Shevchenko' <andriy.shevchenko@linux.intel.com>, Andrew Morton' <akpm@linux-foundation.org>, Matthew Wilcox <willy@infradead.org>, Christoph Hellwig' <hch@infradead.org>, "Jason A . Donenfeld" <Jason@zx2c4.com>, Aurelien Jarno <aurelien@aurel32.net>, Jiri Slaby <jirislaby@gmail.com>, stable@vger.kernel.org, David Laight <David.Laight@ACULAB.COM> Subject: [PATCH] media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c) Date: Sat, 13 Jan 2024 19:33:31 +0100 Message-ID: <20240113183334.1690740-1-aurelien@aurel32.net> X-Mailer: git-send-email 2.42.0 Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: <linux-media.vger.kernel.org> List-Subscribe: <mailto:linux-media+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-media+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240113_104156_739081_0CFF6B87 X-CRM114-Status: GOOD ( 11.53 ) X-Spam-Score: -3.2 (---) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10 driver. This improves the readability and more importantly, for the solo6x10-p2m.c file, this reduces on my system (x86-64, gcc [...] Content analysis details: (-3.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at https://www.dnswl.org/, medium trust [147.75.199.223 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level mail domains are different -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -1.0 MAILING_LIST_MULTI Multiple indicators imply a widely-seen list manager |
Series |
media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)
|
|
Commit Message
Aurelien Jarno
Jan. 13, 2024, 6:33 p.m. UTC
This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver. This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
- the preprocessed size from 121 MiB to 4.5 MiB;
- the build CPU time from 46.8 s to 1.6 s;
- the build memory from 2786 MiB to 98MiB.
In fine, this allows this relatively simple C file to be built on a
32-bit system.
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/
Cc: stable@vger.kernel.org # v6.7+
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
Comments
From: Aurelien Jarno > Sent: 13 January 2024 18:34 > > This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10 > driver. This improves the readability and more importantly, for the > solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13): > - the preprocessed size from 121 MiB to 4.5 MiB; > - the build CPU time from 46.8 s to 1.6 s; > - the build memory from 2786 MiB to 98MiB. > > In fine, this allows this relatively simple C file to be built on a > 32-bit system. > > Reported-by: Jiri Slaby <jirislaby@gmail.com> > Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/ > Cc: stable@vger.kernel.org # v6.7+ > Suggested-by: David Laight <David.Laight@ACULAB.COM> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: David Laight <David.Laight@ACULAB.COM> I was about to send the same patch. Although I'm not sure why the cpu time is so large. It compiles pretty immediately on my system. I do have some patches to minmax.h that reduce the .i for the nested clamp() to around 200k. Mostly obtained be adding min/max_const() for the few places that need a constant and min/max_ptr() for pointer types. Supporting both causes the expansion to be a lot larger. David > --- > drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10- > offsets.h > index f414ee1316f2..fdbb817e6360 100644 > --- a/drivers/media/pci/solo6x10/solo6x10-offsets.h > +++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h > @@ -57,16 +57,16 @@ > #define SOLO_MP4E_EXT_ADDR(__solo) \ > (SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo)) > #define SOLO_MP4E_EXT_SIZE(__solo) \ > - max((__solo->nr_chans * 0x00080000), \ > - min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \ > - __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000)) > + clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \ > + __SOLO_JPEG_MIN_SIZE(__solo), \ > + __solo->nr_chans * 0x00080000, 0x00ff0000) > > #define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000) > #define SOLO_JPEG_EXT_ADDR(__solo) \ > (SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo)) > #define SOLO_JPEG_EXT_SIZE(__solo) \ > - max(__SOLO_JPEG_MIN_SIZE(__solo), \ > - min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000)) > + clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \ > + __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000) > > #define SOLO_SDRAM_END(__solo) \ > (SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo)) > -- > 2.42.0 - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Hi Aurelien, On 13/01/2024 19:33, Aurelien Jarno wrote: > This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10 > driver. This improves the readability and more importantly, for the > solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13): > - the preprocessed size from 121 MiB to 4.5 MiB; > - the build CPU time from 46.8 s to 1.6 s; > - the build memory from 2786 MiB to 98MiB. Thank you for this patch. Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> I'll pick this up as a fix for v6.8. Linus, if you prefer to pick this up directly, then that's fine as well. I was aware for some time that something had changed elsewhere that caused the solo driver compilation to take a lot of memory, but I didn't have the time yet to investigate it. I now see that this was discussed in the thread from the Closes tag. Note that it would have been nice if that was CCed to linux-media, since it concerned a media driver. The use of clamp() is definitely cleaner code, but it is rather insane that max(a, min(b, c)) takes 2786 MiB over 98 MiB for clamp(). The driver code looks perfectly innocuous, and you won't see the problem unless you compile on a system or a VM with a lot less memory. (I only noticed it myself when compiling in a VM. Perhaps sparse or smatch should be extended with a check for nested min/max and warn for that with a recommendation to use clamp instead? Although right now these tools skip the solo driver since it takes too much memory :-) I even found a min(max(max())) case in drivers/net/wireless/ath/ath11k/mac.c: nss = min(nss, max(max(ath11k_mac_max_ht_nss(ht_mcs_mask), ath11k_mac_max_vht_nss(vht_mcs_mask)), ath11k_mac_max_he_nss(he_mcs_mask))); I wonder how that expands in cpp. So while this patch is a nice cleanup, the underlying problem remains. Regards, Hans > > In fine, this allows this relatively simple C file to be built on a > 32-bit system. > > Reported-by: Jiri Slaby <jirislaby@gmail.com> > Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/ > Cc: stable@vger.kernel.org # v6.7+ > Suggested-by: David Laight <David.Laight@ACULAB.COM> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> > --- > drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10-offsets.h > index f414ee1316f2..fdbb817e6360 100644 > --- a/drivers/media/pci/solo6x10/solo6x10-offsets.h > +++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h > @@ -57,16 +57,16 @@ > #define SOLO_MP4E_EXT_ADDR(__solo) \ > (SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo)) > #define SOLO_MP4E_EXT_SIZE(__solo) \ > - max((__solo->nr_chans * 0x00080000), \ > - min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \ > - __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000)) > + clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \ > + __SOLO_JPEG_MIN_SIZE(__solo), \ > + __solo->nr_chans * 0x00080000, 0x00ff0000) > > #define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000) > #define SOLO_JPEG_EXT_ADDR(__solo) \ > (SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo)) > #define SOLO_JPEG_EXT_SIZE(__solo) \ > - max(__SOLO_JPEG_MIN_SIZE(__solo), \ > - min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000)) > + clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \ > + __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000) > > #define SOLO_SDRAM_END(__solo) \ > (SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo))
On Sun, 14 Jan 2024 at 03:04, Hans Verkuil <hverkuil@xs4all.nl> wrote: > > I'll pick this up as a fix for v6.8. > > Linus, if you prefer to pick this up directly, then that's fine as well. Bah, missed this email, and so a belated note that I picked the patch up as commit 31e97d7c9ae3. It even got your Reviewed-by thanks to b4 picking that up automatically. Linus
diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10-offsets.h index f414ee1316f2..fdbb817e6360 100644 --- a/drivers/media/pci/solo6x10/solo6x10-offsets.h +++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h @@ -57,16 +57,16 @@ #define SOLO_MP4E_EXT_ADDR(__solo) \ (SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo)) #define SOLO_MP4E_EXT_SIZE(__solo) \ - max((__solo->nr_chans * 0x00080000), \ - min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \ - __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000)) + clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \ + __SOLO_JPEG_MIN_SIZE(__solo), \ + __solo->nr_chans * 0x00080000, 0x00ff0000) #define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000) #define SOLO_JPEG_EXT_ADDR(__solo) \ (SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo)) #define SOLO_JPEG_EXT_SIZE(__solo) \ - max(__SOLO_JPEG_MIN_SIZE(__solo), \ - min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000)) + clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \ + __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000) #define SOLO_SDRAM_END(__solo) \ (SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo))