Prev: [PATCH 0/3] HID: make raw output callback more flexible
Next: kprobes: Disable booster when CONFIG_PREEMPT=y
From: Catalin Marinas on 3 Mar 2010 05:30 On Tue, 2010-03-02 at 23:33 +0000, Benjamin Herrenschmidt wrote: > On Tue, 2010-03-02 at 17:47 +0000, Catalin Marinas wrote: > > > > Actually, option 2 still has an issue - does not easily work on SMP > > systems where cache maintenance operations aren't broadcast in hardware. > > In this case (ARM11MPCore), flush_dcache_page() is implemented > > non-lazily so that the flushing happens on the same processor that > > dirtied the cache. But since with some drivers there is no call to this > > function, it wouldn't make any difference. > > Also, option 1 would not solve the icache issue which has the same > problem related to IPIs. Correct. But that's true for both options. It would have been simpler if we had software TLBs. > You -really- need to spank some HW folks here :-) I think they got the message :). Cortex-A9 does it properly. > > A solution is to do something like read-for-ownership before flushing > > the D-cache in update_mmu_cache() (or set_pte_at()). > > You might also want to experiment with not clearing PG_arch_1 in > flush_dcache_page(). I'm not 100% convinced it is necessary and that may > reduce the amount of flushing needed. Could a file map page be swapped out (and the mapping removed), then the page cache page modified (i.e. NFS filesystem) and flush_dcache_page() called? > Another thing is, on powerpc, we only do the cleaning when we try to > execute from the pages. IE. We basically "filter out" exec permission > when pages are not clean. At least on processors that support per-page > exec permission. You may want to consider something like that as well. For non-aliasing VIPT, I think that's a fair optimisation. Thanks. -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Catalin Marinas on 3 Mar 2010 05:50 On Wed, 2010-03-03 at 03:47 +0000, FUJITA Tomonori wrote: > On Wed, 03 Mar 2010 10:29:54 +1100 > Benjamin Herrenschmidt <benh(a)kernel.crashing.org> wrote: > > > On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote: > > > > > The viable solutions so far: > > > > > > 1. Implement a PIO mapping API similar to the DMA API which takes > > > care of the D-cache flushing. This means that PIO drivers would > > > need to be modified to use an API like pio_kmap()/pio_kunmap() > > > before writing to a page cache page. > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This > > > means that by default newly allocated page cache pages are > > > considered dirty and even if there isn't a call to > > > flush_dcache_page(), update_mmu_cache() would flush the D-cache. > > > This is the PowerPC approach. > > > > I don't see the point of a "PIO" API. I would thus vote for 2 :-) Note > > Yeah, as powerpc and ia64 do, arm can flush D cache and invalidate I > cache when inserting a executable page to pte, IIUC. No need for the > new API for I/D consistency. I can see that IA-64 uses the PG_arch_1 bit to mark a clean page rather than dirty (as we did for ARM). The Documentation/cachetlb.txt needs updating. Thanks. -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Catalin Marinas on 3 Mar 2010 05:50 On Tue, 2010-03-02 at 23:29 +0000, Benjamin Herrenschmidt wrote: > On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote: > > > The viable solutions so far: > > > > 1. Implement a PIO mapping API similar to the DMA API which takes > > care of the D-cache flushing. This means that PIO drivers would > > need to be modified to use an API like pio_kmap()/pio_kunmap() > > before writing to a page cache page. > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This > > means that by default newly allocated page cache pages are > > considered dirty and even if there isn't a call to > > flush_dcache_page(), update_mmu_cache() would flush the D-cache. > > This is the PowerPC approach. [...] > > Option 2 above looks pretty appealing to me since it can be done in the > > ARM code exclusively. I've done some tests and it indeed solves the > > cache coherency with a rootfs on a USB stick. As Russell suggested, it > > can be optimised to mark a page as clean when the DMA API is involved to > > avoid duplicate flushing. > > That wouldn't solve the need for invalidating the I-cache... Unless we > use another bit. Indeed. We currently always invalidate the I-cache when the page is mapped. With PG_arch_2, we could optimise this but I'm not sure it is worth since I think we only get an update_mmu_cache() call for a page (unless it is unmapped and re-mapped again). -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Russell King - ARM Linux on 3 Mar 2010 14:50 On Wed, Mar 03, 2010 at 03:54:37PM +0530, James Bottomley wrote: > On Wed, 2010-03-03 at 09:36 +0000, Russell King - ARM Linux wrote: > > James, that's a pipedream. If you have a processor which doesn't support > > NX, then the kernel marks all regions executable, even if the app only > > asks for RW protection. > > I'm not talking about what the processor supports ... I'm talking about > what the user sets on the VMA. My point is that the kernel only has > responsibility in specific situations ... it's those paths we do the I/D > coherency on. You may not be talking about what the processor supports, but it is directly relevant. > > You end up with the protection masks always having VM_EXEC set in them, > > so there's no way to distinguish from the kernel POV which pages are > > going to be executed and those which aren't. > > I think you're talking about the pte page flags, I'm talking about the > VMA ones above. No, I'm talking about the VMA ones. if ((prot & PROT_READ) && (current->personality & READ_IMPLIES_EXEC)) if (!(file && (file->f_path.mnt->mnt_flags & MNT_NOEXEC))) prot |= PROT_EXEC; .... /* Do simple checking here so the lower-level routines won't have * to. we assume access permissions have been handled by the open * of the memory object, so we don't do any here. */ vm_flags = calc_vm_prot_bits(prot) | calc_vm_flag_bits(flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; calc_vm_prot_bits(unsigned long prot) { return _calc_vm_trans(prot, PROT_READ, VM_READ ) | _calc_vm_trans(prot, PROT_WRITE, VM_WRITE) | _calc_vm_trans(prot, PROT_EXEC, VM_EXEC) | arch_calc_vm_prot_bits(prot); } So, if you have a CPU which does not support NX, then READ_IMPLIES_EXEC is set in the personality. That forces PROT_EXEC for anything with PROT_READ, which in turn forces VM_EXEC. > I'm not saying the common path (faulting in text sections) is the > responsibility of user space. I'm saying the uncommon path, write > modification of binaries, is. So the kernel only needs to worry about > the ordinary text fault path. What I'm saying is that you can't always tell the difference between what's an executable page and what isn't in the kernel. On NX-incapable CPUs, the kernel treats *all* readable pages as executable, and there's no way to tell from the VMA or page protection flags that this isn't the case. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Jamie Lokier on 3 Mar 2010 15:30
Catalin Marinas wrote: > > I wonder if it's time to get a PG_arch_2 :-) > > As an optimisation, I think this would help (rather than always > invalidating the I-cache in update_mmu_cache or set_pte_at). If PG_arch_{1,2} are used in the same way on all architectures, when they are used at all, perhaps they should be renamed :-) -- Jamie -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |