From: Benjamin Herrenschmidt on 3 Mar 2010 00:20

On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> The ways to improve the approach (introducing PG_arch_2 or marking a
> page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> to architectures.

How does the above work ? IE, the dma unmap will flush the D side but
not the I side ... or is the ia64 flush primitive magic enough to do
both ?

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: James Bottomley on 3 Mar 2010 00:50

On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > The ways to improve the approach (introducing PG_arch_2 or marking a
> > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > to architectures.
>
> How does the above work ? IE, the dma unmap will flush the D side but
> not the I side ... or is the ia64 flush primitive magic enough to do
> both ?

The point is that in a well regulated system, the I cache shouldn't need
extra flushing in the kernel. We should only be faulting in R-X pages.
If we're operating on RWX pages (i.e. self modifying code), it's the job
of userspace to keep I/D coherency. So the only case the kernel needs to
worry about is the R-X fault case for executable text code.

James

From: FUJITA Tomonori on 3 Mar 2010 01:40

On Wed, 03 Mar 2010 16:10:32 +1100
Benjamin Herrenschmidt <benh(a)kernel.crashing.org> wrote:

> On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > The ways to improve the approach (introducing PG_arch_2 or marking a
> > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > to architectures.
>
> How does the above work ? IE, the dma unmap will flush the D side but
> not the I side ... or is the ia64 flush primitive magic enough to do
> both ?

On ia64 platform, I (and D) cache is coherent with the memory that you
did DMA to, I think. But better to ask an ia64 guru. :)

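For readers following the thread, the "mark the page clean on unmap" idea
discussed above can be modelled in a few lines of plain C. This is only a
userspace sketch under stated assumptions: struct toy_page, its
icache_clean field and the toy_* functions are hypothetical names invented
here, the flag merely plays the role of an architecture page flag such as
PG_arch_1, and the flush is represented by a counter rather than a real
cache operation. It also assumes, as suggested for ia64, that DMA writes
are coherent with the I-cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel definitions. */
enum toy_dma_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE };

struct toy_page {
    bool icache_clean;   /* plays the role of an arch page flag */
    int  icache_flushes; /* counts simulated I-cache flushes */
};

/* Called when a DMA mapping covering the whole page is torn down. */
void toy_dma_unmap_page(struct toy_page *page, enum toy_dma_dir dir)
{
    assert(page != NULL);
    /* Assumption: device writes are I-cache coherent on this platform,
     * so a page freshly filled by DMA needs no flush later. */
    if (dir == TOY_FROM_DEVICE)
        page->icache_clean = true;
}

/* Called when the CPU writes the page through the D-cache. */
void toy_cpu_write_page(struct toy_page *page)
{
    assert(page != NULL);
    page->icache_clean = false;
}

/* Called when the page is about to be mapped executable. */
void toy_map_executable(struct toy_page *page)
{
    assert(page != NULL);
    if (page->icache_clean)
        return;              /* lazy scheme: nothing to do */
    page->icache_flushes++;  /* stands in for the real I-cache flush */
    page->icache_clean = true;
}
```

In this model a page that arrives by DMA and is then mapped executable
costs no flush, while a page the CPU has written costs exactly one.
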
From: Russell King - ARM Linux on 3 Mar 2010 04:50

On Wed, Mar 03, 2010 at 11:10:09AM +0530, James Bottomley wrote:
> On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> > On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > > The ways to improve the approach (introducing PG_arch_2 or marking a
> > > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > > to architectures.
> >
> > How does the above work ? IE, the dma unmap will flush the D side but
> > not the I side ... or is the ia64 flush primitive magic enough to do
> > both ?
>
> The point is that in a well regulated system, the I cache shouldn't need
> extra flushing in the kernel. We should only be faulting in R-X pages.

James, that's a pipedream. If you have a processor which doesn't support
NX, then the kernel marks all regions executable, even if the app only
asks for RW protection.

You end up with the protection masks always having VM_EXEC set in them,
so there's no way to distinguish from the kernel POV which pages are
going to be executed and those which aren't.

And if you can't do that, you have to _always_ flush the I cache for
every page fault, because you don't know if the I cache is out of sync
with the page that you've just read in from disk - and therefore you
may end up executing bad code instead of the glibc text that was
intended.

So here's the question: in a system where the responsibility for I-cache
flushing is in userspace, how do you ensure that you can execute code
in userspace to do this I-cache flushing without first having flushed
the (speculatively prefetching) I-cache?

From: James Bottomley on 3 Mar 2010 05:30

On Wed, 2010-03-03 at 09:36 +0000, Russell King - ARM Linux wrote:
> On Wed, Mar 03, 2010 at 11:10:09AM +0530, James Bottomley wrote:
> > On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> > > On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > > > The ways to improve the approach (introducing PG_arch_2 or marking a
> > > > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > > > to architectures.
> > >
> > > How does the above work ? IE, the dma unmap will flush the D side but
> > > not the I side ... or is the ia64 flush primitive magic enough to do
> > > both ?
> >
> > The point is that in a well regulated system, the I cache shouldn't need
> > extra flushing in the kernel. We should only be faulting in R-X pages.
>
> James, that's a pipedream. If you have a processor which doesn't support
> NX, then the kernel marks all regions executable, even if the app only
> asks for RW protection.

I'm not talking about what the processor supports ... I'm talking about
what the user sets on the VMA. My point is that the kernel only has
responsibility in specific situations ... it's those paths we do the I/D
coherency on.

> You end up with the protection masks always having VM_EXEC set in them,
> so there's no way to distinguish from the kernel POV which pages are
> going to be executed and those which aren't.

I think you're talking about the pte page flags, I'm talking about the
VMA ones above.

> And if you can't do that, you have to _always_ flush the I cache for
> every page fault, because you don't know if the I cache is out of sync
> with the page that you've just read in from disk - and therefore you
> may end up executing bad code instead of the glibc text that was
> intended.

If you're doing a not present, fault in a VMA executable region, I agree
... since that's the start of the lifecycle where we have to begin with
I/D coherent.

> So here's the question: in a system where the responsibility for I-cache
> flushing is in userspace, how do you ensure that you can execute code
> in userspace to do this I-cache flushing without first having flushed
> the (speculatively prefetching) I-cache?

I'm not saying the common path (faulting in text sections) is the
responsibility of user space. I'm saying the uncommon path, write
modification of binaries, is. So the kernel only needs to worry about
the ordinary text fault path.

James
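
The disagreement between Russell and James comes down to which flags the
fault path consults. The following userspace sketch is not kernel code;
the TOY_VM_* values and both functions are hypothetical. It assumes
James's rule (sync only on a not-present fault in an executable mapping)
and models Russell's no-NX case as a force_exec switch that makes every
readable mapping executable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define TOY_VM_READ  0x1u
#define TOY_VM_WRITE 0x2u
#define TOY_VM_EXEC  0x4u
#define TOY_VM_ALL   (TOY_VM_READ | TOY_VM_WRITE | TOY_VM_EXEC)

/*
 * Flags recorded on the mapping. With force_exec set (the no-NX case as
 * modelled here), every readable mapping is also treated as executable.
 */
unsigned toy_vma_flags(unsigned requested, bool force_exec)
{
    assert((requested & ~TOY_VM_ALL) == 0); /* only known bits allowed */
    if (force_exec && (requested & TOY_VM_READ))
        requested |= TOY_VM_EXEC;
    return requested;
}

/*
 * The rule as modelled here: only a not-present fault in an executable
 * mapping requires the kernel to make the I and D sides coherent.
 */
bool toy_fault_needs_icache_sync(unsigned vma_flags, bool pte_present)
{
    return !pte_present && (vma_flags & TOY_VM_EXEC) != 0;
}
```

With force_exec off, a fault in an RW mapping never triggers a sync; with
it on, every not-present fault in a readable mapping does, which is the
cost Russell is pointing at.
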