From: James Bottomley on 4 Mar 2010 09:30

On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > On Wed, 2010-03-03 at 21:54 +0000, Pavel Machek wrote:
> > > > With some drivers (those doing PIO) or subsystems (SCSI mass storage
> > > > over USB HCD), there is no call to flush_dcache_page() for page cache
> > > > pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> > > > the D-cache (and only invalidating the I-cache doesn't help).
> > > >
> > > > The viable solutions so far:
> > > >
> > > > 1. Implement a PIO mapping API similar to the DMA API which takes
> > > >    care of the D-cache flushing. This means that PIO drivers would
> > > >    need to be modified to use an API like pio_kmap()/pio_kunmap()
> > > >    before writing to a page cache page.
> > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This
> > > >    means that by default newly allocated page cache pages are
> > > >    considered dirty and even if there isn't a call to
> > > >    flush_dcache_page(), update_mmu_cache() would flush the D-cache.
> > > >    This is the PowerPC approach.
> > >
> > > What about option
> > >
> > > 3. Forget about PG_arch_1 and always do the flush?
> > >
> > > How big is the performance impact? Note that current code does not
> > > even *work* so working, 10% slower code will be an improvement.
> >
> > The driver fix is as simple as calling a flush_dcache_page() and I've
> > been carrying such patches in my tree for some time now. The question is
> > whether we need to do it in the driver or not (would need to update
> > Documentation/cachetlb.txt as well).
> >
> > The reason I'm not in favour always doing the flush is that we penalise
> > DMA drivers where there is no need for extra D-cache flushing (already
> > handled by the DMA API; option 1 above is similar, just that it is meant
> > for PIO usage). An ARM patch I proposed for inverting the meaning of
> > PG_arch_1 also marks a page as clean in the dma_map_* functions.
>
> But you are not fixing driver bug, are you?

Technically, he is. In the old days, most VI architectures were high
end enough not to require PIO transfers. The only exception was an IDE
driver used by sparc, which led to the arch specific ide in/out string
instructions, in which sparc actually did all the necessary flushing.
So no other drivers than old IDE grew up with cache flushing in the PIO
case (and almost no high end VI hardware had an IDE interface, so they
rarely got implemented in the arch layer).

However, recently, with the transition from old IDE to libata and the
prevalence of ARM with more commodity hardware, the deficiency is
becoming exposed. Even the PA8000 workstations now come with an IDE CD,
which means we're starting to have problems with them as well.

> Seems like ARM has requirement other architectures do not, that is
> a) not documented anywhere
> b) causes problems
>
> You could argue that performance improvement (how big is it, anyway?)
> is worth it, but this should be agreed to by wider community...

Performance is always worth it provided we don't sacrifice correctness.
The thing which was discovered in this thread is basically that ARM is
handling deferred flushing (for D/I coherency) in a slightly different
way from everyone else ... once that's fixed, ARM will likely not have
the D/I problem, but we'll still have the libata (and other PIO systems)
D flushing issue.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Russell King - ARM Linux on 4 Mar 2010 09:40

On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > Seems like ARM has requirement other architectures do not, that is
> > a) not documented anywhere
> > b) causes problems
> >
> > You could argue that performance improvement (how big is it, anyway?)
> > is worth it, but this should be agreed to by wider community...
>
> Performance is always worth it provided we don't sacrifice correctness.
> The thing which was discovered in this thread is basically that ARM is
> handling deferred flushing (for D/I coherency) in a slightly different
> way from everyone else ... once that's fixed, ARM will likely not have
> the D/I problem, but we'll still have the libata (and other PIO systems)
> D flushing issue.

I think you've got that backwards.

Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
since we'll interpret '0' to mean "page is dirty, it needs flushing before
hitting userspace", whereas '1' means "page has been cleaned; there are no
aliases."

This does not address the I/D coherency issue, where the Icache needs
attention to get rid of speculatively loaded cache lines while old data
was present in the cache.
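
For readers following the PG_arch_1 argument, here is a rough user-space
sketch of the two flag conventions being compared. It is only a toy model
under stated assumptions: struct toy_page and every toy_* function are
hypothetical names invented for illustration, none of this is kernel code
or the ARM patch under discussion, and it models D-cache flushing only (it
says nothing about the I/D coherency issue raised above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel structure. */
struct toy_page {
    bool pg_arch_1;      /* models the PG_arch_1 flag; clear on allocation */
    int  dcache_flushes; /* number of simulated D-cache flushes */
};

/* Current convention in this model: PG_arch_1 set means "D-cache dirty". */
void toy_flush_dcache_page_current(struct toy_page *p)
{
    assert(p != NULL);
    p->pg_arch_1 = true;          /* defer the flush, remember it is needed */
}

void toy_update_mmu_cache_current(struct toy_page *p)
{
    assert(p != NULL);
    if (p->pg_arch_1) {           /* flush only if someone marked it dirty */
        p->dcache_flushes++;
        p->pg_arch_1 = false;
    }
}

/* Inverted convention: PG_arch_1 set means "clean", clear means "dirty". */
void toy_flush_dcache_page_inverted(struct toy_page *p)
{
    assert(p != NULL);
    p->pg_arch_1 = false;         /* mark dirty again */
}

void toy_update_mmu_cache_inverted(struct toy_page *p)
{
    assert(p != NULL);
    if (!p->pg_arch_1) {          /* new pages start clear, so they flush */
        p->dcache_flushes++;
        p->pg_arch_1 = true;      /* now known to be clean */
    }
}

/* Models the idea that a DMA mapping already leaves the page clean. */
void toy_dma_map_inverted(struct toy_page *p)
{
    assert(p != NULL);
    p->pg_arch_1 = true;
}
```

In this model a freshly allocated page that a PIO driver filled without
calling the flush hook is never flushed under the current convention, but
is flushed once at mapping time under the inverted one, while a page that
went through the DMA mapping step is not flushed a second time.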

From: Catalin Marinas on 4 Mar 2010 10:30

On Thu, 2010-03-04 at 14:27 +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> > On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > > Seems like ARM has requirement other architectures do not, that is
> > > a) not documented anywhere
> > > b) causes problems
> > >
> > > You could argue that performance improvement (how big is it, anyway?)
> > > is worth it, but this should be agreed to by wider community...
> >
> > Performance is always worth it provided we don't sacrifice correctness.
> > The thing which was discovered in this thread is basically that ARM is
> > handling deferred flushing (for D/I coherency) in a slightly different
> > way from everyone else ... once that's fixed, ARM will likely not have
> > the D/I problem, but we'll still have the libata (and other PIO systems)
> > D flushing issue.
>
> I think you've got that backwards.
>
> Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
> since we'll interpret '0' to mean "page is dirty, it needs flushing before
> hitting userspace", whereas '1' means "page has been cleaned; there are no
> aliases."
>
> This does not address the I/D coherency issue, where the Icache needs
> attention to get rid of speculatively loaded cache lines while old data
> was present in the cache.

The I-cache flushing is already handled in update_mmu_cache (or
set_pte_at in a future patch; I'm not talking about other issues on
ARM11MPCore here). We always invalidate the I-cache currently (since we
may have DMA transfers and the page's D-cache is clean). As an
optimisation, we could use PG_arch_2 for I-cache but I don't think there
is much performance benefit compared to always invalidating the I-cache.

My understanding from this long discussion is that we cannot get the
kernel modifying a page cache page which is already mapped in user space
(well, ptrace does this but we flush the cache there already).

--
Catalin

From: Catalin Marinas on 4 Mar 2010 10:40

On Thu, 2010-03-04 at 13:51 +0000, Pavel Machek wrote:
> > On Wed, 2010-03-03 at 21:54 +0000, Pavel Machek wrote:
> > > > With some drivers (those doing PIO) or subsystems (SCSI mass storage
> > > > over USB HCD), there is no call to flush_dcache_page() for page cache
> > > > pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> > > > the D-cache (and only invalidating the I-cache doesn't help).
> > > >
> > > > The viable solutions so far:
> > > >
> > > > 1. Implement a PIO mapping API similar to the DMA API which takes
> > > >    care of the D-cache flushing. This means that PIO drivers would
> > > >    need to be modified to use an API like pio_kmap()/pio_kunmap()
> > > >    before writing to a page cache page.
> > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This
> > > >    means that by default newly allocated page cache pages are
> > > >    considered dirty and even if there isn't a call to
> > > >    flush_dcache_page(), update_mmu_cache() would flush the D-cache.
> > > >    This is the PowerPC approach.
> > >
> > > What about option
> > >
> > > 3. Forget about PG_arch_1 and always do the flush?
> > >
> > > How big is the performance impact? Note that current code does not
> > > even *work* so working, 10% slower code will be an improvement.
> >
> > The driver fix is as simple as calling a flush_dcache_page() and I've
> > been carrying such patches in my tree for some time now. The question is
> > whether we need to do it in the driver or not (would need to update
> > Documentation/cachetlb.txt as well).
> >
> > The reason I'm not in favour always doing the flush is that we penalise
> > DMA drivers where there is no need for extra D-cache flushing (already
> > handled by the DMA API; option 1 above is similar, just that it is meant
> > for PIO usage). An ARM patch I proposed for inverting the meaning of
> > PG_arch_1 also marks a page as clean in the dma_map_* functions.
>
> But you are not fixing driver bug, are you?

Some drivers I fixed already: db8516f61b481e8, 2d68b7fe55d9e19.

> Seems like ARM has requirement other architectures do not, that is
> a) not documented anywhere
> b) causes problems

Well, ARM is pretty similar to other architectures in this respect. And
I'm sure other architectures have similar problems, only that they only
become visible in some circumstances they may not have encountered (i.e.
PIO drivers + filesystem that doesn't call flush_dcache_page like ext*).
Some other architectures may do heavier flushing.

Of course, a Documentation/arm/cachetlb.txt file would make sense.

--
Catalin
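
To make the "PIO driver without flush_dcache_page()" failure mode
concrete, here is a deliberately simplified user-space model. It is a
sketch under stated assumptions, not driver code: struct toy_cached_page,
TOY_PAGE_SIZE and all toy_* functions are hypothetical, the real
flush_dcache_page() takes a struct page and is not reimplemented here, and
the model pretends the D-cache is a second copy of the page that user
space cannot see until it is written back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Toy page: 'ram' is what a user-space alias would read in this model,
 * 'dcache' holds data written through the kernel mapping. */
struct toy_cached_page {
    unsigned char ram[TOY_PAGE_SIZE];
    unsigned char dcache[TOY_PAGE_SIZE];
    int dcache_dirty;
};

void toy_page_init(struct toy_cached_page *p)
{
    memset(p, 0, sizeof(*p));
}

/* PIO read: the CPU copies device data into the page, so the data lands
 * in the (modelled) D-cache rather than directly in memory. */
void toy_pio_read(struct toy_cached_page *p, const unsigned char *dev,
                  size_t len)
{
    assert(len <= TOY_PAGE_SIZE);
    memcpy(p->dcache, dev, len);
    p->dcache_dirty = 1;
}

/* Stand-in for the flush a driver would request after filling the page:
 * write the dirty cache contents back so other aliases can see them. */
void toy_flush_dcache_page(struct toy_cached_page *p)
{
    if (p->dcache_dirty) {
        memcpy(p->ram, p->dcache, TOY_PAGE_SIZE);
        p->dcache_dirty = 0;
    }
}

/* What user space observes through its own mapping in this model. */
unsigned char toy_user_read(const struct toy_cached_page *p, size_t off)
{
    assert(off < TOY_PAGE_SIZE);
    return p->ram[off];
}
```

In this model, calling toy_pio_read() and then toy_user_read() without the
flush in between returns stale zeroes; adding the single flush call after
the copy is the analogue of the one-line driver fix described above.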

From: Russell King - ARM Linux on 4 Mar 2010 10:40

On Thu, Mar 04, 2010 at 03:25:23PM +0000, Catalin Marinas wrote:
> On Thu, 2010-03-04 at 14:27 +0000, Russell King - ARM Linux wrote:
> > On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> > > On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > > > Seems like ARM has requirement other architectures do not, that is
> > > > a) not documented anywhere
> > > > b) causes problems
> > > >
> > > > You could argue that performance improvement (how big is it, anyway?)
> > > > is worth it, but this should be agreed to by wider community...
> > >
> > > Performance is always worth it provided we don't sacrifice correctness.
> > > The thing which was discovered in this thread is basically that ARM is
> > > handling deferred flushing (for D/I coherency) in a slightly different
> > > way from everyone else ... once that's fixed, ARM will likely not have
> > > the D/I problem, but we'll still have the libata (and other PIO systems)
> > > D flushing issue.
> >
> > I think you've got that backwards.
> >
> > Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
> > since we'll interpret '0' to mean "page is dirty, it needs flushing before
> > hitting userspace", whereas '1' means "page has been cleaned; there are no
> > aliases."
> >
> > This doesn not address the I/D coherency issue, where the Icache needs
> > attention to get rid of speculatively loaded cache lines while old data
> > was present in the cache.
>
> The I-cache flushing is already handled in update_mmu_cache (or
> set_pte_at in a future patch; I'm not talking about other issues on
> ARM11MPCore here).

You may not have been; my message was addressed to James to correct his
message, which seems to have the issues confused.