Prev: [PATCH 0/3] HID: make raw output callback more flexible
Next: kprobes: Disable booster when CONFIG_PREEMPT=y
From: Catalin Marinas on 2 Mar 2010 12:50 On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote: > On Tue, 2010-03-02 at 21:11 +0900, FUJITA Tomonori wrote: > > On Sun, 28 Feb 2010 10:31:03 +0530 > > James Bottomley <James.Bottomley(a)HansenPartnership.com> wrote: > > > But the point of all of this is that I cache invalidation doesn't appear > > > anywhere in the I/O path ... so if we're getting I/D incoherency, > > > there's some problem in the mm code (or there's a missing arch > > > assumption ... like I cache gets moved in more aggressively than we > > > expect). Parisc is very sensitive to I/D incoherency, so we'd notice if > > > there were a serious generic problem here. > > > > I'm not sure that there are some problems in the mm or common code. Is > > this ARM's implementation issue? (Of course, the usb stack and the > > driver's misuse of the DMA API needs to be fixed too). > > Just to summarise - on ARM (PIPT / non-aliasing VIPT) there is I-cache > invalidation for user pages in update_mmu_cache() (it could actually be > in set_pte_at on SMP to avoid a race but that's for another thread). The > D-cache is flushed by this function only if the PG_arch_1 bit is set. > This bit is set in the ARM case by flush_dcache_page(), following the > advice in Documentation/cachetlb.txt. > > With some drivers (those doing PIO) or subsystems (SCSI mass storage > over USB HCD), there is no call to flush_dcache_page() for page cache > pages, hence the ARM implementation of update_mmu_cache() doesn't flush > the D-cache (and only invalidating the I-cache doesn't help). > > The viable solutions so far: > > 1. Implement a PIO mapping API similar to the DMA API which takes > care of the D-cache flushing. This means that PIO drivers would > need to be modified to use an API like pio_kmap()/pio_kunmap() > before writing to a page cache page. > 2. Invert the meaning of PG_arch_1 to denote a clean page. This > means that by default newly allocated page cache pages are > considered dirty and even if there isn't a call to > flush_dcache_page(), update_mmu_cache() would flush the D-cache. > This is the PowerPC approach. > > Option 2 above looks pretty appealing to me since it can be done in the > ARM code exclusively. I've done some tests and it indeed solves the > cache coherency with a rootfs on a USB stick. As Russell suggested, it > can be optimised to mark a page as clean when the DMA API is involved to > avoid duplicate flushing. Actually, option 2 still has an issue - does not easily work on SMP systems where cache maintenance operations aren't broadcast in hardware. In this case (ARM11MPCore), flush_dcache_page() is implemented non-lazily so that the flushing happens on the same processor that dirtied the cache. But since with some drivers there is no call to this function, it wouldn't make any difference. A solution is to do something like read-for-ownership before flushing the D-cache in update_mmu_cache() (or set_pte_at()). -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Benjamin Herrenschmidt on 2 Mar 2010 18:40 On Tue, 2010-03-02 at 17:47 +0000, Catalin Marinas wrote: > > Actually, option 2 still has an issue - does not easily work on SMP > systems where cache maintenance operations aren't broadcast in hardware. > In this case (ARM11MPCore), flush_dcache_page() is implemented > non-lazily so that the flushing happens on the same processor that > dirtied the cache. But since with some drivers there is no call to this > function, it wouldn't make any difference. Also, option 1 would not solve the icache issue which has the same problem related to IPIs. You -really- need to spank some HW folks here :-) > A solution is to do something like read-for-ownership before flushing > the D-cache in update_mmu_cache() (or set_pte_at()). You might also want to experiment with not clearing PG_arch_1 in flush_dcache_page(). I'm not 100% convinced it is necessary and that may reduce the amount of flushing needed. Another thing is, on powerpc, we only do the cleaning when we try to execute from the pages. IE. We basically "filter out" exec permission when pages are not clean. At least on processors that support per-page exec permission. You may want to consider something like that as well. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Benjamin Herrenschmidt on 2 Mar 2010 18:40 On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote: > The viable solutions so far: > > 1. Implement a PIO mapping API similar to the DMA API which takes > care of the D-cache flushing. This means that PIO drivers would > need to be modified to use an API like pio_kmap()/pio_kunmap() > before writing to a page cache page. > 2. Invert the meaning of PG_arch_1 to denote a clean page. This > means that by default newly allocated page cache pages are > considered dirty and even if there isn't a call to > flush_dcache_page(), update_mmu_cache() would flush the D-cache. > This is the PowerPC approach. I don't see the point of a "PIO" API. I would thus vote for 2 :-) Note that flushing the D-cache isn't enough, you also need to invalidate the I-cache as we discussed earlier, though you mostly get away if you don't by luck. There's also a question as to whether clearing PG_arch_1 is flush_dcache_page() is really necessary or not. > Option 2 above looks pretty appealing to me since it can be done in the > ARM code exclusively. I've done some tests and it indeed solves the > cache coherency with a rootfs on a USB stick. As Russell suggested, it > can be optimised to mark a page as clean when the DMA API is involved to > avoid duplicate flushing. That wouldn't solve the need for invalidating the I-cache... Unless we use another bit. > It was also suggested to add a PG_arch_2 flag which would keep track of > the I-cache status as well. > > I can post a proposal to modify the cachetlb.txt document to reflect the > issues we currently have on ARM. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Benjamin Herrenschmidt on 2 Mar 2010 19:00 On Tue, 2010-03-02 at 21:11 +0900, FUJITA Tomonori wrote: > > > Sorry to be a bit late to the party (on holiday), but I/D coherency > is > > supposed to be taken care of using flush_cache_page in the memory > > mapping routines. > > powerpc does that? To be exact, powerpc doesn't need > flush_cache_page() and handles I/D coherency in the pte modification > code. powerpc uses PG_arch_1 to avoid unnecessarily handling I/D > coherency. Seems that IA64 does the same trick with PG_arch_1. Right. We set PG_arch_1 to avoid doing it again of a given physical page. We assume that it's always cleared when a page is recycled by the page cache and we also clear it in flush_dcache_page() though the need for that later thing is dubious... Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: FUJITA Tomonori on 2 Mar 2010 22:50
On Wed, 03 Mar 2010 10:29:54 +1100 Benjamin Herrenschmidt <benh(a)kernel.crashing.org> wrote: > On Tue, 2010-03-02 at 17:05 +0000, Catalin Marinas wrote: > > > The viable solutions so far: > > > > 1. Implement a PIO mapping API similar to the DMA API which takes > > care of the D-cache flushing. This means that PIO drivers would > > need to be modified to use an API like pio_kmap()/pio_kunmap() > > before writing to a page cache page. > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This > > means that by default newly allocated page cache pages are > > considered dirty and even if there isn't a call to > > flush_dcache_page(), update_mmu_cache() would flush the D-cache. > > This is the PowerPC approach. > > I don't see the point of a "PIO" API. I would thus vote for 2 :-) Note Yeah, as powerpc and ia64 do, arm can flush D cache and invalidate I cache when inserting a executable page to pte, IIUC. No need for the new API for I/D consistency. The ways to improve the approach (introducing PG_arch_2 or marking a page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up to architectures. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |