From: Russell King - ARM Linux on 29 Jul 2010 17:20

On Thu, Jul 29, 2010 at 02:55:53PM -0500, Christoph Lameter wrote:
> On Thu, 29 Jul 2010, Russell King - ARM Linux wrote:
>
> > And no, setting the sparse section size to 512kB doesn't work - memory is
> > offset by 256MB already, so you need a sparsemem section array of 1024
> > entries just to cover that - with the full 256MB populated, that's 512
> > unused entries followed by 512 used entries. That too is going to waste
> > memory like nobody's business.
>
> SPARSEMEM_EXTREME does not handle that?
>
> Some ARMs seem to have MMUs. If so then use SPARSEMEM_VMEMMAP. You can map
> 4k pages for the memmap through a page table. Redirect unused 4k blocks to
> the NULL page.

We're going over old ground which has already been covered in this very
thread. I feel no compulsion to repeat the arguments.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Russell King - ARM Linux on 29 Jul 2010 18:20

On Thu, Jul 29, 2010 at 01:55:19PM -0700, Dave Hansen wrote:
> Could you give some full examples of how the memory is laid out on these
> systems? I'm having a bit of a hard time visualizing it.

In the example I quote, there are four banks of memory, which start at
0x10000000, 0x14000000, 0x18000000 and 0x1c000000 physical, which can
be populated or empty, each one in multiples of 512KB up to the maximum
64MB.

There are other systems where memory starts at 0xc0000000 and 0xc8000000
physical, and the memory size is either 32MB or 64MB.

We also have one class of systems where memory starts at 0xc0000000,
0xc1000000, 0xc2000000, etc - but I don't know what the minimum populated
memory size in any one region is.

Things that we've tried over the years:
1. flatmem, remapping memory into one contiguous chunk (which can cause
   problems when parts of the kernel assume that the underlying phys
   space is contiguous.)
2. flatmem with holes and a 1:1 v:p mapping (was told we shouldn't be
   doing this - and it becomes impossible with sparsely populated banks
   of memory split over a large range.)
3. discontigmem (was told this was too heavy, we're not NUMA, we shouldn't
   be using this, and it will be deprecated, use sparsemem instead)
4. sparsemem

What we need is something which allows us to handle memory scattered
in several regions of the physical memory map, each bank being a
variable size.

From what I've seen through this thread, there is no support for such
a setup. (People seem to have their opinions on this, and will tell
you what you should be using, only for someone else to tell you that
you shouldn't be using that! - *)

This isn't something new for ARM; we've had these kinds of issues for
the last 10 or more years.

What is new is that we're now seeing systems where the first bank of
memory to be populated is at a higher physical address than the second
bank, and therefore people are setting up v:p mappings which switch the
ordering of these - but this I think is unrelated to the discussion at
hand.

* - this is why I'm exasperated with this latest discussion on it.

While we're here, I'll repeat a point made earlier.

We don't map lowmem in using 4K pages. That would be utter madness
given the small TLB size ARM processors tend to have. Instead, we
map lowmem using 1MB section mappings (which occupy one entry in the
L1 page table.) Modifying these mappings requires all page tables
in the system to be updated - which given that we're SMP etc. now
is not practical.

So the idea that we can remap a section of memory for the mem_map
struct (as suggested several times in this thread) isn't possible
without having it allocated in something like vmalloc space.
Plus, of course, if you did such a remapping in the lowmem
mapping, the pages which were there would become unusable as they lose
their virtual mapping (thereby causing phys_to_virt/virt_to_phys
on their addresses to break.) Therefore, you only gain even more
problems by this method.
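The phys_to_virt/virt_to_phys point can be illustrated with a small user-space sketch. It assumes lowmem is one linear mapping, so translation is pure arithmetic with no table lookup. The `sketch_` names and the value chosen for `SKETCH_PAGE_OFFSET` are assumptions made for illustration, not taken from any particular ARM platform; `SKETCH_PHYS_OFFSET` reuses the 0x10000000 base from the first example layout.

```c
#include <stdint.h>

/* Assumed values for illustration only; real values are platform-specific. */
#define SKETCH_PAGE_OFFSET 0xc0000000u /* virtual base of lowmem (assumption) */
#define SKETCH_PHYS_OFFSET 0x10000000u /* physical base of the first bank */

/* Linear lowmem translation: a fixed offset, no page table walk. */
static inline uint32_t sketch_phys_to_virt(uint32_t phys)
{
    return phys - SKETCH_PHYS_OFFSET + SKETCH_PAGE_OFFSET;
}

static inline uint32_t sketch_virt_to_phys(uint32_t virt)
{
    return virt - SKETCH_PAGE_OFFSET + SKETCH_PHYS_OFFSET;
}
```

Because the result depends only on the fixed offset, pointing a lowmem virtual range at different physical memory would leave these helpers returning the old physical address, which is the breakage described above.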
From: Christoph Lameter on 29 Jul 2010 18:30

On Thu, 29 Jul 2010, Russell King - ARM Linux wrote:

> We don't map lowmem in using 4K pages. That would be utter madness
> given the small TLB size ARM processors tend to have. Instead, we
> map lowmem using 1MB section mappings (which occupy one entry in the
> L1 page table.) Modifying these mappings requires all page tables
> in the system to be updated - which given that we're SMP etc. now
> is not practical.
>
> So the idea that we can remap a section of memory for the mem_map
> struct (as suggested several times in this thread) isn't possible
> without having it allocated in something like vmalloc space.
> Plus, of course, that if you did such a remapping in the lowmem
> mapping, the pages which were there become unusable as they lose
> their virtual mapping (thereby causing phys_to_virt/virt_to_phys
> on their addresses to break.) Therefore, you only gain even more
> problems by this method.

A 1M page dedicated to vmemmap would only be used for memmap and only be
addressed using the virtual memory address. The pfn to page and vice versa
mapping that is the basic mechanism for virt_to_page and friends is then
straightforward. Nothing breaks.

memory-model.h:

#elif defined(CONFIG_SPARSEMEM_VMEMMAP)

/* memmap is virtually contiguous. */
#define __pfn_to_page(pfn)      (vmemmap + (pfn))
#define __page_to_pfn(page)     (unsigned long)((page) - vmemmap)

However, if you have such a sparse address space you would not want 1M
blocks for memmap but rather 4k pages. So yes you would need to use
vmalloc space (or reserve another virtual range for that purpose).
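The two macros quoted from memory-model.h are plain pointer arithmetic on a virtually contiguous array. A minimal user-space sketch of the same idea follows. `struct sketch_page`, `sketch_memmap` and the other `sketch_` names are hypothetical stand-ins, and a small static array plays the role of the virtually mapped memmap; in the kernel the backing pages would be mapped on demand rather than allocated up front.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; the real one is much larger. */
struct sketch_page {
    unsigned long flags;
};

#define SKETCH_NR_PAGES 1024

/* A static array stands in for the virtually contiguous memmap. */
static struct sketch_page sketch_memmap[SKETCH_NR_PAGES];
static struct sketch_page *const sketch_vmemmap = sketch_memmap;

/* Same arithmetic as __pfn_to_page(): index the array by pfn. */
static struct sketch_page *sketch_pfn_to_page(unsigned long pfn)
{
    return sketch_vmemmap + pfn;
}

/* Same arithmetic as __page_to_pfn(): pointer difference gives the pfn. */
static unsigned long sketch_page_to_pfn(const struct sketch_page *page)
{
    return (unsigned long)(page - sketch_vmemmap);
}
```

The conversions need no section lookup at all, which is the attraction; the cost is the virtual address range and page table entries needed to back the array.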
From: Dave Hansen on 29 Jul 2010 20:40

On Thu, 2010-07-29 at 23:14 +0100, Russell King - ARM Linux wrote:
> What we need is something which allows us to handle memory scattered
> in several regions of the physical memory map, each bank being a
> variable size.

Russell, it does sound like you have a pretty pathological case here. :)
It's not one that we've really attempted to address on any other
architectures.

Just to spell it out, if you have 4GB of physical address space, with
512k sections, you need 8192 sections, which means 8192*8 bytes, so it'd
eat 64k of memory. That's the normal SPARSEMEM case.

SPARSEMEM_EXTREME would be a bit different. It's a 2-level lookup.
You'd have 16 "section roots", each representing 256MB of address space.
Each time we put memory under one of those roots, we'd fill in a
512-section second-level table, which is designed to always fit into one
page. If you start at 256MB, you won't waste all those entries.

The disadvantage of SPARSEMEM_EXTREME is that it costs you the extra
level in the lookup. The space loss in arm's case would only be 16
pointers, which would more than be made up for by the other gains.

The other case where it really makes no sense is when you're populating
a single section (or a small number of sections) under each root, spread
evenly across the address space. For instance, let's say you have 16 512k
banks, evenly spaced at 256MB intervals:

    512k@0x00000000
    512k@0x10000000
    512k@0x20000000
    ...
    512k@0xF0000000

If you use SPARSEMEM_EXTREME on that it will degenerate to having the
same memory consumption as classic SPARSEMEM, along with the extra
lookup of EXTREME. But, I haven't heard you say that you have this kind
of configuration, yet. :)

SPARSEMEM_EXTREME is really easy to test. You just have to set it in
your .config. To get much use out of it, you'd also need to make the
SECTION_SIZE small, like the 512k we were talking about.
-- Dave
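The arithmetic in this message can be checked with a short user-space sketch. It takes the figures from the text as given: 512k sections, 8 bytes per section entry, and 512 entries per second-level table. The `sketch_` names are hypothetical, and the second-level estimate assumes each bank sits entirely under the root that contains its start address and that the address space has at most 32 roots.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTION_SHIFT     19  /* 512k sections */
#define SKETCH_ENTRY_BYTES       8   /* per-section entry size used in the text */
#define SKETCH_SECTIONS_PER_ROOT 512 /* one 4k page of 8-byte entries */

/* Classic SPARSEMEM: one entry per section across the whole address space. */
static uint64_t sketch_flat_table_bytes(uint64_t phys_space_bytes)
{
    return (phys_space_bytes >> SKETCH_SECTION_SHIFT) * SKETCH_ENTRY_BYTES;
}

/* SPARSEMEM_EXTREME, first level: which root covers this address? */
static unsigned int sketch_root_index(uint64_t phys_addr)
{
    return (unsigned int)((phys_addr >> SKETCH_SECTION_SHIFT) /
                          SKETCH_SECTIONS_PER_ROOT);
}

/* SPARSEMEM_EXTREME, second level: slot within that root's table. */
static unsigned int sketch_index_in_root(uint64_t phys_addr)
{
    return (unsigned int)((phys_addr >> SKETCH_SECTION_SHIFT) %
                          SKETCH_SECTIONS_PER_ROOT);
}

/* Bytes of second-level tables needed: one page per distinct root touched. */
static uint64_t sketch_second_level_bytes(const uint64_t *bank_starts,
                                          size_t nr_banks)
{
    uint32_t touched = 0; /* one bit per root; assumes at most 32 roots */
    uint64_t bytes = 0;

    for (size_t i = 0; i < nr_banks; i++) {
        unsigned int root = sketch_root_index(bank_starts[i]);

        if (root < 32 && !(touched & (1u << root))) {
            touched |= 1u << root;
            bytes += SKETCH_SECTIONS_PER_ROOT * SKETCH_ENTRY_BYTES;
        }
    }
    return bytes;
}
```

Under these assumptions, the four banks at 0x10000000 through 0x1c000000 all fall under root 1 and need a single 4k second-level table, while the 16 evenly spaced banks touch every root and need 16 such tables, matching the 64k of the flat case.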
From: Minchan Kim on 30 Jul 2010 05:40
On Fri, Jul 30, 2010 at 5:55 AM, Dave Hansen <dave(a)linux.vnet.ibm.com> wrote:
> On Thu, 2010-07-29 at 19:33 +0100, Russell King - ARM Linux wrote:
>> And no, setting the sparse section size to 512kB doesn't work - memory is
>> offset by 256MB already, so you need a sparsemem section array of 1024
>> entries just to cover that - with the full 256MB populated, that's 512
>> unused entries followed by 512 used entries. That too is going to waste
>> memory like nobody's business.
>
> Sparsemem could use some work in the case where memory doesn't start at
> 0x0. But, it doesn't seem like it would be _too_ oppressive to add.
> It's literally just adding an offset to all of the places where a
> physical address is stuck into the system. It'll make a few of the
> calculations longer, of course, but it should be manageable.
>
> Could you give some full examples of how the memory is laid out on these
> systems? I'm having a bit of a hard time visualizing it.
>
> As Christoph mentioned, SPARSEMEM_EXTREME might be viable here, too.
>
> If you free up parts of the mem_map[] array, how does the buddy
> allocator still work? I thought we required 'struct page's to be
> contiguous and present for at least 2^MAX_ORDER-1 pages in one go.

I think in that case the arch should define CONFIG_HOLES_IN_ZONE to
prevent a crash. But I am not sure that the ARM platforms with holes
have been using it properly. Kujkin's problem happens not in the buddy
allocator but while walking the whole pfn range when echoing to
min_free_kbytes.

>
> -- Dave
>
>

--
Kind regards,
Minchan Kim
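The kind of pfn walk mentioned here can be sketched in user space as follows. This is a minimal illustration of checking each pfn before touching its struct page, not the kernel's implementation: `struct sketch_bank`, `sketch_pfn_valid()` and `sketch_count_present()` are hypothetical names, and the example layout in the usage note assumes 4k pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one populated bank, in page frame numbers. */
struct sketch_bank {
    unsigned long start_pfn;
    unsigned long nr_pages;
};

/* True only if some bank actually backs this pfn. */
static bool sketch_pfn_valid(const struct sketch_bank *banks, size_t nr_banks,
                             unsigned long pfn)
{
    for (size_t i = 0; i < nr_banks; i++) {
        if (pfn >= banks[i].start_pfn &&
            pfn - banks[i].start_pfn < banks[i].nr_pages)
            return true;
    }
    return false;
}

/* Walk [start_pfn, end_pfn), skipping holes instead of dereferencing them. */
static unsigned long sketch_count_present(const struct sketch_bank *banks,
                                          size_t nr_banks,
                                          unsigned long start_pfn,
                                          unsigned long end_pfn)
{
    unsigned long present = 0;

    for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (!sketch_pfn_valid(banks, nr_banks, pfn))
            continue; /* hole: no struct page to look at */
        present++;
    }
    return present;
}
```

With two 512k banks at 0x10000000 and 0x14000000 (128 pages each, starting at pfns 0x10000 and 0x14000), a walk over the whole span visits 256 backed pfns and skips everything in between. A walker that omits the check would read memmap entries that may have been freed.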