From: Pavel Machek on 28 Feb 2010 14:20

> There's two potential problems with the approach, and maybe more that I
> have missed though. One is the case of a networked filesystem where the
> executable pages are modified remotely. However, I would expect such a
> program to invalidate the PTE mappings before making the change visible,
> so we -do- get a chance to re-flush provided something clears PG_arch_1.
>
> Then, in the case of a multithreaded app, where one thread does
> the cache flush and another thread then executes, the earlier ARMs
> without broadcast ops have a potential problem. In fact, some
> variants of PowerPC 440 have the same problem and some people are
> (ab)using those for SMP setups I'm being told.
>
> For that case, I see two options. One is a big hammer but would make
> existing code work to "most" extent: Don't allow a page to be both
> writable and executable. Ping-pong the page permission lazily and flush
> when transitioning from write to exec.
>
> That means using a spare bit for Linux _PAGE_RW separate from your real
> RW bit I suppose, since you have HW loaded PTEs (on 440 it's easier
> since we SW load, we can do the fixup there, though it has a perf impact
> obviously).
>
> Another option would be to make some syscall mandatory to "sync" caches
> which could then do IPIs or whatever else is needed. But that would
> require changing existing userspace code.

Or you could do the first option by default, and add an mmap flag that
says that the application is responsible for cross-CPU cache flushes...?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
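The lazy write/exec ping-pong discussed in this message can be sketched as a small state machine. This is only a user-space model under stated assumptions, not kernel code: `struct page_model`, `map_page()`, `access_page()` and the `app_syncs` opt-out flag are hypothetical names invented for illustration, and the flush counter merely stands in for whatever cross-CPU cache maintenance the architecture would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one user page; none of these names are kernel APIs. */
struct page_model {
    bool want_write;   /* permission the mapping asked for (software RW bit) */
    bool want_exec;
    bool hw_write;     /* permission the hardware PTE currently grants */
    bool hw_exec;
    bool app_syncs;    /* hypothetical mmap flag: application flushes caches itself */
    unsigned flushes;  /* kernel-side cache flushes performed so far */
};

enum access_kind { ACCESS_WRITE, ACCESS_EXEC };

static void map_page(struct page_model *p, bool w, bool x, bool app_syncs)
{
    p->want_write = w;
    p->want_exec = x;
    p->app_syncs = app_syncs;
    p->flushes = 0;
    /* A W+X request starts with neither bit unless the app opted out. */
    bool lazy = w && x && !app_syncs;
    p->hw_write = lazy ? false : w;
    p->hw_exec = lazy ? false : x;
}

/* Returns true if the access proceeds (possibly after a fixup),
 * false if it is a genuine permission fault. */
static bool access_page(struct page_model *p, enum access_kind kind)
{
    if (kind == ACCESS_WRITE) {
        if (!p->want_write)
            return false;
        if (!p->hw_write) {
            p->hw_exec = false;   /* revoke exec before granting write */
            p->hw_write = true;
        }
    } else {
        if (!p->want_exec)
            return false;
        if (!p->hw_exec) {
            if (p->hw_write) {
                p->hw_write = false;
                p->flushes++;     /* stands in for a cross-CPU flush */
            }
            p->hw_exec = true;
        }
    }
    /* Invariant of the default mode: never writable and executable at once. */
    assert(p->app_syncs || !(p->hw_write && p->hw_exec));
    return true;
}
```

In this model a write followed by an exec costs exactly one flush, repeated execs cost nothing, and a mapping created with the opt-out flag keeps both permissions and never triggers a kernel-side flush.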
From: Catalin Marinas on 28 Feb 2010 18:20

On Fri, 2010-02-26 at 21:49 +0000, Benjamin Herrenschmidt wrote:
> > > > On ARM11MPCore we flush the caches in flush_dcache_page() because the
> > > > cache maintenance operations weren't visible to the other CPUs.
> > >
> > > I'm not even sure that's going to be 100% correct. Don't you also need
> > > to flush the remote icaches when you are dealing with instructions (such
> > > as swap) anyways ?
> >
> > I don't think we tried swap but for pages that have been mapped for the
> > first time, the I-cache would be clean.
> >
> > At mm switching, if a thread
> > migrates to a new CPU we invalidate the cache at that point.
>
> That sounds fragile. What about a multithread app with one thread on
> each core hitting the pages at the same time ? Sounds racy to me...

Interestingly, until commit 826cbdaff29 (< 2 years ago), we didn't have
any I-cache flushing in update_mmu_cache() and it was working fine. I
added it for correctness reasons rather than to fix something.

My theory is that it was working because a page cache page tends to keep
the same physical address, especially if we don't swap pages, and a 16KB
PIPT cache cannot hold enough lines to show any issues (lines are
replaced frequently). I suspect that's one of the reasons why only
invalidating the whole I-cache when switching the mm to a new CPU seems
to suffice. Once we enable some form of swapping, it may show the
problem.

--
Catalin
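The "invalidate the whole I-cache when the mm first runs on a new CPU" behaviour described above can be modelled roughly as follows. This is a simplified sketch, not the ARM implementation: `struct mm_model`, `struct cpu_model` and `switch_mm_model()` are made-up names, and the real code path is not shown in this thread.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4

/* Hypothetical per-mm state: bitmask of CPUs that have already run this mm. */
struct mm_model {
    unsigned seen_cpus;
};

/* Hypothetical per-CPU counter of whole-I-cache invalidations. */
struct cpu_model {
    unsigned icache_invalidations;
};

/* Invalidate the local I-cache only the first time this mm lands on `cpu`. */
static void switch_mm_model(struct mm_model *mm, struct cpu_model *cpus,
                            unsigned cpu)
{
    assert(cpu < NR_CPUS_MODEL);
    unsigned bit = 1u << cpu;

    if (!(mm->seen_cpus & bit)) {
        cpus[cpu].icache_invalidations++;
        mm->seen_cpus |= bit;
    }
}
```

The model also makes the objection in the quoted text visible: once both CPUs are in the mask, a thread bouncing back to CPU 0 triggers no further invalidation, so nothing in this scheme covers two threads of the same mm running on different cores at the same time.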
From: Catalin Marinas on 28 Feb 2010 18:30

On Fri, 2010-02-26 at 22:03 +0000, Russell King - ARM Linux wrote:
> On Sat, Feb 27, 2010 at 08:49:40AM +1100, Benjamin Herrenschmidt wrote:
> > It will deadlock if you use normal IRQs. I don't see a good way around
> > that other than using a higher-level type of IRQs. I thought ARM has
> > something like that (FIQs ?). Can you use those guys for IPIs ?
[...]
> The other problem we'd encounter using FIQs for IPIs is that some IPIs
> need to take locks - and in order to make that safe, we'd either need
> another class of locks which disable IRQs and FIQs together, or we'd
> need to disable FIQs everywhere we disable IRQs - at which point FIQs
> become utterly pointless.

You could use the FIQ only for the DMA cache maintenance operations and
not as a generic IPI mechanism. But the hardware needs to be modified.

--
Catalin
From: Catalin Marinas on 1 Mar 2010 05:40

On Sun, 2010-02-28 at 05:01 +0000, James Bottomley wrote:
> On Sun, 2010-02-28 at 11:14 +1100, Benjamin Herrenschmidt wrote:
> > On Fri, 2010-02-26 at 21:00 +0000, Russell King - ARM Linux wrote:
> > > On Fri, Feb 26, 2010 at 04:25:21PM +0000, Catalin Marinas wrote:
> > > > For mmap'ed pages (and present in the page cache), is it guaranteed that
> > > > the HCD driver won't write to it once it has been mapped into user
> > > > space? If that's the case, it may solve the problem by just reversing
> > > > the meaning of PG_arch_1 on ARM and assume that a newly allocated page
> > > > has dirty D-cache by default.
> > >
> > > I guess we could also set PG_arch_1 in the DMA API as well, to avoid the
> > > unnecessary D cache flushing when clean pages get mapped into userspace.
> >
> > That's an interesting thought for us too. When doing I$/D$ coherency, we
> > have to first flush the D$ and then invalidate the I$. If we could keep
> > track of D$ and I$ separately, we could avoid the first step in many
> > cases, including the DMA API trick you mentioned.
> >
> > I wonder if it's time to get a PG_arch_2 :-)
>
> Sorry to be a bit late to the party (on holiday), but I/D coherency is
> supposed to be taken care of using flush_cache_page in the memory
> mapping routines. On parisc, at least, we don't use any PG_arch flags
> to help. The way it's supposed to work is that I is invalidated on
> mapping or remapping, so the I/O code only needs to worry about flushing
> D. The guarantee we pass to userland is that any page we do I/O to has
> a clean D cache before it goes back to userspace. Thus if userspace
> executes the page, the I cache gets its first movein there. There is an
> underlying assumption to all of this: The CPU won't speculatively move
> in I cache until the page is executed, so we can rely on the
> flush_cache_page in the mapping to keep the I cache invalidated until
> we're ready to execute.

We cannot guarantee this assumption on ARM. As soon as the page is
accessible and executable, the CPU can fetch into the I-cache
speculatively. Even if the page hasn't been mapped into user-space yet,
we still have the kernel linear mapping via which we can get the same
I-cache lines fetched (PIPT cache). The only place we can safely
invalidate the I-cache is after the D-cache was flushed (after
flush_dcache_page). On ARM PIPT, flush_cache_page is a no-op.

> The other fundamental assumption is that if
> userspace needs to modify an executable region (say for dynamic linking)
> it has to take care of reinvalidating the I cache itself ... although it
> can do this by remapping the region to alter the flags (i.e. W no X then
> X no W).

The ARM dynamic linker remaps the page with no-exec, writes the data and
then remaps it back with exec. The COW code flushes the D-cache. Anyway,
recent dynamic linkers no longer touch code pages.

>
> But the point of all of this is that I cache invalidation doesn't appear
> anywhere in the I/O path ... so if we're getting I/D incoherency,
> there's some problem in the mm code (or there's a missing arch
> assumption ... like I cache gets moved in more aggressively than we
> expect). Parisc is very sensitive to I/D incoherency, so we'd notice if
> there were a serious generic problem here.

On ARM PIPT, it's probably because flush_cache_page isn't implemented.
But as I said above, given the speculative fetches I don't think it
would help much (it would work a bit better, but it would not be a
complete fix).

Thanks.

--
Catalin
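The "W no X, then X no W" remapping sequence mentioned in this message looks roughly like the following from user space. It is a minimal sketch using `mmap()`, `mprotect()` and the GCC/Clang builtin `__builtin___clear_cache()`, not the dynamic linker's actual code: `patch_code_page()` and `demo_patch()` are hypothetical helper names, the patched bytes are placeholders that are never executed, and a hardened system may refuse the executable mapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Patch `len` bytes at offset `off` of a page-aligned region:
 * drop exec, write, sync caches, then drop write and restore exec.
 * Returns 0 on success, -1 on failure. */
static int patch_code_page(void *page, size_t page_size, size_t off,
                           const void *src, size_t len)
{
    char *dst = (char *)page + off;

    if (off > page_size || len > page_size - off)
        return -1;
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;                       /* W, no X */
    memcpy(dst, src, len);
    __builtin___clear_cache(dst, dst + len);
    if (mprotect(page, page_size, PROT_READ | PROT_EXEC) != 0)
        return -1;                       /* X, no W */
    return 0;
}

/* Maps one anonymous executable page, patches it, and checks the result. */
static int demo_patch(void)
{
    static const unsigned char bytes[4] = { 0xde, 0xad, 0xbe, 0xef };
    long ps = sysconf(_SC_PAGESIZE);
    int rc;

    if (ps <= 0)
        return -1;
    void *page = mmap(NULL, (size_t)ps, PROT_READ | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    rc = patch_code_page(page, (size_t)ps, 0, bytes, sizeof bytes);
    if (rc == 0 && memcmp(page, bytes, sizeof bytes) != 0)
        rc = -1;
    munmap(page, (size_t)ps);
    return rc;
}
```

The page is never writable and executable at the same time, which is the property the quoted paragraph relies on; whether the kernel's own flushing on the permission change is sufficient is exactly what the thread is debating, so the explicit cache sync is kept here as a conservative choice.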
From: Catalin Marinas on 1 Mar 2010 05:50

On Sun, 2010-02-28 at 00:14 +0000, Benjamin Herrenschmidt wrote:
> On Fri, 2010-02-26 at 21:00 +0000, Russell King - ARM Linux wrote:
> > On Fri, Feb 26, 2010 at 04:25:21PM +0000, Catalin Marinas wrote:
> > > For mmap'ed pages (and present in the page cache), is it guaranteed that
> > > the HCD driver won't write to it once it has been mapped into user
> > > space? If that's the case, it may solve the problem by just reversing
> > > the meaning of PG_arch_1 on ARM and assume that a newly allocated page
> > > has dirty D-cache by default.
> >
> > I guess we could also set PG_arch_1 in the DMA API as well, to avoid the
> > unnecessary D cache flushing when clean pages get mapped into userspace.

That sounds good to me.

> That's an interesting thought for us too. When doing I$/D$ coherency, we
> have to first flush the D$ and then invalidate the I$. If we could keep
> track of D$ and I$ separately, we could avoid the first step in many
> cases, including the DMA API trick you mentioned.
>
> I wonder if it's time to get a PG_arch_2 :-)

As an optimisation, I think this would help (rather than always
invalidating the I-cache in update_mmu_cache or set_pte_at).

--
Catalin
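Tracking D-cache and I-cache state separately, as suggested above, could look something like this toy model. It assumes the reversed PG_arch_1 meaning proposed earlier in the thread (bit set means the D-cache is clean, so a new page starts out dirty) and a second, purely hypothetical PG_arch_2-style bit for I-cache validity. The struct and function names are invented for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-page state; a zero-initialised page is "dirty D, stale I". */
struct page_flags_model {
    bool dcache_clean;   /* PG_arch_1 with the reversed meaning: set = clean */
    bool icache_valid;   /* hypothetical PG_arch_2 */
};

struct flush_stats {
    unsigned dcache_flushes;
    unsigned icache_invalidations;
};

/* The CPU wrote the page through a kernel mapping: both caches are suspect. */
static void cpu_wrote_page(struct page_flags_model *pg)
{
    pg->dcache_clean = false;
    pg->icache_valid = false;
}

/* DMA filled the page: the DMA API already dealt with the D-cache,
 * but any I-cache lines for the old contents are stale. */
static void dma_wrote_page(struct page_flags_model *pg)
{
    pg->dcache_clean = true;
    pg->icache_valid = false;
}

/* Mapping the page executable: flush D first, then invalidate I,
 * skipping whichever step the flags say is unnecessary. */
static void map_executable(struct page_flags_model *pg, struct flush_stats *st)
{
    if (!pg->dcache_clean) {
        st->dcache_flushes++;
        pg->dcache_clean = true;
    }
    if (!pg->icache_valid) {
        st->icache_invalidations++;
        pg->icache_valid = true;
    }
    assert(pg->dcache_clean && pg->icache_valid);
}
```

In this model a page filled by DMA needs only the I-cache invalidation when it is mapped executable, and remapping an unchanged page costs nothing; those are the two savings the quoted text is after.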