From patchwork Thu Sep 2 08:54:24 2010
X-Patchwork-Submitter: KAMEZAWA Hiroyuki
X-Patchwork-Id: 4247
Date: Thu, 2 Sep 2010 17:54:24 +0900
From: KAMEZAWA Hiroyuki
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, Michał Nazarewicz, Andrew Morton, Hans Verkuil,
 Daniel Walker, Russell King, Jonathan Corbet, Peter Zijlstra,
 Pawel Osciak, Konrad Rzeszutek Wilk, linux-kernel@vger.kernel.org,
 FUJITA Tomonori, linux-mm@kvack.org, Kyungmin Park, Zach Pfeffer,
 Mark Brown, Mel Gorman, linux-media@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, Marek Szyprowski
Subject: Re: [PATCH/RFCv4 0/6] The Contiguous Memory Allocator framework
Message-Id: <20100902175424.5849c197.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20100827171639.83c8642c.kamezawa.hiroyu@jp.fujitsu.com>
References: <1282310110.2605.976.camel@laptop>
 <20100825155814.25c783c7.akpm@linux-foundation.org>
 <20100826095857.5b821d7f.kamezawa.hiroyu@jp.fujitsu.com>
 <20100826115017.04f6f707.kamezawa.hiroyu@jp.fujitsu.com>
 <20100826124434.6089630d.kamezawa.hiroyu@jp.fujitsu.com>
 <20100826133028.39d731da.kamezawa.hiroyu@jp.fujitsu.com>
 <20100827171639.83c8642c.kamezawa.hiroyu@jp.fujitsu.com>
Organization: FUJITSU Co. LTD.
X-Mailer: Sylpheed 3.0.3 (GTK+ 2.10.14; i686-pc-mingw32)
X-Mailing-List: linux-media@vger.kernel.org
Sender: Mauro Carvalho Chehab

On Fri, 27 Aug 2010 17:16:39 +0900
KAMEZAWA Hiroyuki wrote:

> On Thu, 26 Aug 2010 18:36:24 +0900
> Minchan Kim wrote:
>
> > On Thu, Aug 26, 2010 at 1:30 PM, KAMEZAWA Hiroyuki wrote:
> > > On Thu, 26 Aug 2010 13:06:28 +0900
> > > Minchan Kim wrote:
> > >
> > >> On Thu, Aug 26, 2010 at 12:44 PM, KAMEZAWA Hiroyuki wrote:
> > >> > On Thu, 26 Aug 2010 11:50:17 +0900
> > >> > KAMEZAWA Hiroyuki wrote:
> > >> >
> > >> >> 128MB... too big? But that depends on the config.
> > >> >>
> > >> >> IBM's ppc guys used 16MB sections, and recently a new interface
> > >> >> to shrink the number of /sys files was added; maybe that is usable.
> > >> >>
> > >> >> Something good about this approach is that you can create "cma"
> > >> >> memory before installing the driver.
> > >> >>
> > >> >> But yes, it's complicated and needs some work.
> > >> >>
> > >> > Ah, I need to clarify what I want to say.
> > >> >
> > >> > Compaction is helpful, but you can't get contiguous memory larger
> > >> > than MAX_ORDER with it, I think. To get memory larger than
> > >> > MAX_ORDER on demand, the memory hot-plug code has almost
> > >> > everything necessary.
> > >>
> > >> True. Doesn't the idea of Christoph's patch help with this?
> > >> http://lwn.net/Articles/200699/
> > >>
> > >
> > > Yes, I think so. But, IIRC, the real purpose of Christoph's work is
> > > removing zones. Please be careful about what is really necessary.
> >
> > Ahh. Sorry for missing the point.
> > You're right. That patch can't help with our problem.
> >
> > How about the following change? The thing is, MAX_ORDER is static,
> > but we want to avoid making MAX_ORDER too big for all zones just to
> > support devices which require big allocation chunks.
> > So let's add a MAX_ORDER to each zone; then each zone can have a
> > different max order.
> > For example, while DMA[32], NORMAL and HIGHMEM keep the normal value
> > of 11, the MOVABLE zone could have 15.
> >
> > Does this approach have a big side effect?
> >
>
> Hm... we'd need to check the hard-coded MAX_ORDER usages... I don't
> think the side effect is big. But I think enlarging MAX_ORDER isn't
> the important thing. Code which strips contiguous chunks of pages from
> the buddy allocator is the necessary thing, as...
>
> What I can think of at first is...
> ==
> int steal_pages(unsigned long start_pfn, unsigned long end_pfn)
> {
>         /* Be careful about mutual exclusion with memory hotplug,
>            because we reuse its code */
>
>         split [start_pfn, end_pfn) into pageblock_order units
>
>         for each pageblock in the range {
>                 mark this block as MIGRATE_ISOLATE
>                 try to free the pages in the range, or
>                 migrate the pages in the range somewhere else.
>                 /* Here all pages in the range are on the buddy
>                    allocator's free lists and will never be handed
>                    out to anyone else. */
>         }
>
>         please see __rmqueue_fallback(). It selects the migration type
>         first. Then, if you can pass a start_migratetype of
>         MIGRATE_ISOLATE, you can automatically strip all
>         MIGRATE_ISOLATE pages from free_area[].
>
>         return the chunk of pages.
> }
> ==
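To make that flow concrete: below is a minimal sketch of steal_pages(),
assuming the existing page-isolation helpers start_isolate_page_range()
and undo_isolate_page_range() from <linux/page-isolation.h>. Note that
do_migrate_range() is static in mm/memory_hotplug.c today, so it would
have to be exported or re-implemented; treat this as an illustration of
the flow, not working code.
==
#include <linux/page-isolation.h>

/*
 * Sketch only: isolate [start_pfn, end_pfn) and empty it, so that all
 * pages in the range end up on the MIGRATE_ISOLATE free lists, where
 * the buddy allocator will not hand them out to anyone else.
 * Assumes do_migrate_range() (static in mm/memory_hotplug.c) has been
 * made callable from here.
 */
static int steal_pages(unsigned long start_pfn, unsigned long end_pfn)
{
        int ret;

        /* Mark every pageblock in the range MIGRATE_ISOLATE. */
        ret = start_isolate_page_range(start_pfn, end_pfn);
        if (ret)
                return ret;

        /* Free or migrate whatever is still in use inside the range. */
        ret = do_migrate_range(start_pfn, end_pfn);
        if (ret) {
                undo_isolate_page_range(start_pfn, end_pfn);
                return ret;
        }

        /*
         * All pages in the range are now free and isolated; the caller
         * can strip them from free_area[] as described above.
         */
        return 0;
}
==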
Here is some rough code for this. I'm sorry I don't have time to
produce better code. It may not even compile, but you can see what can
be done with the memory hotplug and compaction code. I'll brush this up
if someone is interested.
==
This is code for creating an isolated block of contiguous pages.

find_isolate_contig_block(unsigned long hint, unsigned long size)

will return [start, start+size) of isolated pages, where
 - start > hint,
 - there are no memory holes within the range,
 - the page allocator will never touch pages within the range.

Of course, this can fail. This code makes use of the memory-hotunplug
code. But yes, you could think about reusing the compaction code
instead. This is an example, not compiled at all... please don't look
at the details.

---
 mm/isolation.c |  236 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 236 insertions(+)

Index: kametest/mm/isolation.c
===================================================================
--- /dev/null
+++ kametest/mm/isolation.c
@@ -0,0 +1,233 @@
+struct page_range {
+	unsigned long base, end, pages;
+};
+
+/* walk_system_ram_range() callback: record the first RAM chunk that
+ * is large enough for the request. */
+int __get_contig_block(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+	struct page_range *blockinfo = arg;
+
+	if (nr_pages >= blockinfo->pages) {
+		blockinfo->base = pfn;
+		blockinfo->end = pfn + nr_pages;
+		return 1;
+	}
+	return 0;
+}
+
+unsigned long __find_contig_block(unsigned long base,
+		unsigned long end, unsigned long pages)
+{
+	unsigned long pfn, tmp, index;
+	struct page_range blockinfo;
+	struct page *page;
+	int ret;
+
+	/* Skip memory holes */
+retry:
+	blockinfo.base = base;
+	blockinfo.end = end;
+	blockinfo.pages = pages;
+	ret = walk_system_ram_range(base, end - base, &blockinfo,
+				    __get_contig_block);
+	if (!ret)
+		return 0;
+	/* Ok, we found a contiguous memory chunk of that size. Isolate it. */
+	pfn = ALIGN(blockinfo.base, pageblock_nr_pages);
+	while (pfn + pages <= blockinfo.end) {
+		for (index = 0; index < pages; index += pageblock_nr_pages) {
+			page = pfn_to_page(pfn + index);
+			if (set_migratetype_isolate(page))
+				break;
+		}
+		if (index >= pages)
+			return pfn;	/* [pfn...pfn+pages) are isolated */
+		/* rollback */
+		for (tmp = 0; tmp < index; tmp += pageblock_nr_pages) {
+			page = pfn_to_page(pfn + tmp);
+			unset_migratetype_isolate(page);
+		}
+		/* skip past the pageblock that failed to isolate */
+		pfn += index + pageblock_nr_pages;
+	}
+	/* failed ? */
+	if (blockinfo.end + pages < end) {
+		/* Move the base address and find the next block */
+		base = blockinfo.end;
+		goto retry;
+	}
+	return 0;
+}
+
+unsigned long
+find_isolate_contig_block(unsigned long hint, unsigned long size)
+{
+	unsigned long base, found = 0, end, pages, offlined_pages;
+	int nid, retry;
+
+	pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	pages = ALIGN(pages, pageblock_nr_pages);
+	base = hint;
+
+	for_each_node_state(nid, N_HIGH_MEMORY) {
+		pg_data_t *node = NODE_DATA(nid);
+
+		if (base < node->node_start_pfn)
+			base = node->node_start_pfn;
+		end = node->node_start_pfn + node->node_spanned_pages;
+		if (end - base < pages)
+			continue;
+		/* Maybe we can use this node */
+		found = __find_contig_block(base, end, pages);
+		if (found)	/* Found ? */
+			break;
+		base = end;	/* try the next node */
+	}
+	if (!found)
+		goto out;
+	/*
+	 * Ok, here we have contiguous pageblocks marked as "isolated";
+	 * try migration.
+	 */
+	retry = 5;
+	while (retry--) {
+		if (!do_migrate_range(found, found + pages))
+			break;
+		lru_add_drain_all();
+		flush_scheduled_work();
+		cond_resched();
+		drain_all_pages();
+	}
+	lru_add_drain_all();
+	flush_scheduled_work();
+	drain_all_pages();
+	/* reuse memory hotplug's check that the range is really free */
+	offlined_pages = check_pages_isolated(found, found + pages);
+	/* Ok, here, the [found...found+pages) memory is isolated */
+out:
+	return found;
+}
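As a usage illustration only (not part of the patch): a driver that
needs a large physically contiguous buffer could call the function
above roughly as follows, handing the range back with
undo_isolate_page_range() when done. The caller name and the 16MB size
are made up for the example.
==
/* Hypothetical caller, for illustration: reserve 16MB of contiguous
 * memory for a device, then give it back. */
static int example_grab_16mb(void)
{
        unsigned long size = 16 << 20;
        unsigned long pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
        unsigned long start;

        start = find_isolate_contig_block(0, size);
        if (!start)
                return -ENOMEM;

        /*
         * [start, start + pages) is now isolated: the pages are free,
         * sit on the MIGRATE_ISOLATE lists, and will not be allocated
         * to anyone else. A driver could pull them off free_area[] and
         * use the region as a DMA buffer.
         */

        /* Put the pageblocks back under normal buddy management. */
        undo_isolate_page_range(start, start + pages);
        return 0;
}
==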