From: Martin Schwidefsky on 28 Jul 2010 03:20 On Tue, 27 Jul 2010 19:06:41 -0700 John Stultz <johnstul(a)us.ibm.com> wrote: > To me, there isn't a clear reason why we're using stop_machine > when changing clocksources instead of just taking the xtime_lock. > > Additionally, using stop_machine limits us from being able to > register clocksources from timers (as needed for a following > patch). > > This patch simply removes the stop_machine usage and instead > directly calls change_clocksource, which now takes the xtime_lock. > > I could be totally missing something here that necessitates > stop_machine, but in my testing it seems to function fine. > > Any clarifications or corrections would be appreciated! Installing a new clocksource updates quite a lot of internal variables, we need to make sure that no code ever uses these variables without holding the xtime_lock as writer. And then there is ktime_get which uses a read_seqbegin/ read_seqretry loop on the xtime_lock to get the time from the clocksource. Consider the case where a ktime_get call already did read_seqbegin but did not yet call the read function of the clocksource. Another cpu registers a better clocksource which will cause the timekeeper.clock variable to get updated while the ktime_get call is using it. If I look at timekeeping_get_ns I don't see anything that prevents the compiler from generating code that reads timekeeper.clock multiple times. Which would mix the read function from one clocksource with the cycle_last / mask values from the new clock. Now if we add code that prevents the compiler from reading from timekeeper.clock multiple times we might get away with it. The reasoning for stop_machine is that the change of a clocksource is a major change which has subtle side effects so we want to make sure that nothing breaks. It is a very rare event, we can afford to spent a little bit of time there. Ergo stop_machine. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: john stultz on 28 Jul 2010 12:20 On Wed, 2010-07-28 at 09:17 +0200, Martin Schwidefsky wrote: > On Tue, 27 Jul 2010 19:06:41 -0700 > John Stultz <johnstul(a)us.ibm.com> wrote: > > > To me, there isn't a clear reason why we're using stop_machine > > when changing clocksources instead of just taking the xtime_lock. > > > > Additionally, using stop_machine limits us from being able to > > register clocksources from timers (as needed for a following > > patch). > > > > This patch simply removes the stop_machine usage and instead > > directly calls change_clocksource, which now takes the xtime_lock. > > > > I could be totally missing something here that necessitates > > stop_machine, but in my testing it seems to function fine. > > > > Any clarifications or corrections would be appreciated! > > Installing a new clocksource updates quite a lot of internal > variables, we need to make sure that no code ever uses these > variables without holding the xtime_lock as writer. Agreed. > And then there is ktime_get which uses a read_seqbegin/ > read_seqretry loop on the xtime_lock to get the time from the > clocksource. Consider the case where a ktime_get call already > did read_seqbegin but did not yet call the read function of > the clocksource. Another cpu registers a better clocksource > which will cause the timekeeper.clock variable to get updated > while the ktime_get call is using it. Although ktime_get will be forced to loop and try again, as any writes require holding a write on the xtime_lock. While the xtime_lock writelock is held, the function could possibly mix the read/cycle_last/mask/cyc2ns values, but the results from those invalid calculations will not be returned. > If I look at > timekeeping_get_ns I don't see anything that prevents the > compiler from generating code that reads timekeeper.clock > multiple times. Which would mix the read function from one > clocksource with the cycle_last / mask values from the new > clock. Now if we add code that prevents the compiler from > reading from timekeeper.clock multiple times we might get > away with it. Right, but this should be ok. timekeeping_get_ns is a helper that requires the xtime_lock to be held (such a comment is probably needed, but there is no usage of it when the xtime_lock isn't held). While the function may actually mix values from two clocksources in a calculation, the results of those calculations will be thrown out and re-done via the xtime_lock seqlock. > The reasoning for stop_machine is that the change of a > clocksource is a major change which has subtle side effects > so we want to make sure that nothing breaks. It is a very rare > event, we can afford to spent a little bit of time there. > Ergo stop_machine. I do agree that there can be subtle side effects when dealing with clocksources (part of why I'm being so cautious introducing this change), and when the stop_machine code was added it seemed reasonable. But given the limitations of stop_machine, the more I look at the clocksource_change code, the more I suspect stop_machine is overkill and we can safely just take the write lock on xtime_lock. If I'm still missing something, do let me know. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Martin Schwidefsky on 29 Jul 2010 03:20 On Wed, 28 Jul 2010 09:12:49 -0700 john stultz <johnstul(a)us.ibm.com> wrote: > On Wed, 2010-07-28 at 09:17 +0200, Martin Schwidefsky wrote: > > On Tue, 27 Jul 2010 19:06:41 -0700 > > John Stultz <johnstul(a)us.ibm.com> wrote: > > > > If I look at > > timekeeping_get_ns I don't see anything that prevents the > > compiler from generating code that reads timekeeper.clock > > multiple times. Which would mix the read function from one > > clocksource with the cycle_last / mask values from the new > > clock. Now if we add code that prevents the compiler from > > reading from timekeeper.clock multiple times we might get > > away with it. > > Right, but this should be ok. timekeeping_get_ns is a helper that > requires the xtime_lock to be held (such a comment is probably needed, > but there is no usage of it when the xtime_lock isn't held). While the > function may actually mix values from two clocksources in a calculation, > the results of those calculations will be thrown out and re-done via the > xtime_lock seqlock. Ok, all callers to timekeeping_get_ns use an xtime_lock loop to make sure that no inconsistent result gets returned. > > The reasoning for stop_machine is that the change of a > > clocksource is a major change which has subtle side effects > > so we want to make sure that nothing breaks. It is a very rare > > event, we can afford to spent a little bit of time there. > > Ergo stop_machine. > > I do agree that there can be subtle side effects when dealing with > clocksources (part of why I'm being so cautious introducing this > change), and when the stop_machine code was added it seemed reasonable. > But given the limitations of stop_machine, the more I look at the > clocksource_change code, the more I suspect stop_machine is overkill and > we can safely just take the write lock on xtime_lock. > > If I'm still missing something, do let me know. What about a clocksource_unregister while a cpu is in the middle of a read_seqbegin/timekeeping_get_ns/read_seqretry? The clocksource structure is "free" after the successful call to the unregister. At least in theory this could be a use after free. The race window is tiny but on virtual systems there can be an arbitrary delay in the ktime_get sequence. I agree that stop_machine is the big gun and restricts the code in the way how the clocksource functions may be call. But it is safe, no? -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: john stultz on 29 Jul 2010 16:50 On Thu, 2010-07-29 at 09:11 +0200, Martin Schwidefsky wrote: > On Wed, 28 Jul 2010 09:12:49 -0700 > john stultz <johnstul(a)us.ibm.com> wrote: > > > > I do agree that there can be subtle side effects when dealing with > > clocksources (part of why I'm being so cautious introducing this > > change), and when the stop_machine code was added it seemed reasonable. > > But given the limitations of stop_machine, the more I look at the > > clocksource_change code, the more I suspect stop_machine is overkill and > > we can safely just take the write lock on xtime_lock. > > > > If I'm still missing something, do let me know. > > What about a clocksource_unregister while a cpu is in the middle of a > read_seqbegin/timekeeping_get_ns/read_seqretry? The clocksource structure > is "free" after the successful call to the unregister. At least in theory > this could be a use after free. The race window is tiny but on virtual > systems there can be an arbitrary delay in the ktime_get sequence. Huh. At first I thought "but we don't yet implement clocksource_unregister!" but of course do now (I guess that TODO in the clocksource.c header can be yanked :). So yes, unregister has been contentious in the past for this very reason. Once registered, its really hard to find a safe point when it can be un-registered. Stop machine mostly solves this (although one should note: vsyscall enabled clocksources really can't be freed, as their vread() page needs to be statically mapped into userspace). So while stop_machine is a solution here, it would make more sense to me to use stop_machine (or maybe even a different method, as it sort of screams RCU to me) to make sure all the cpus are out of the xtime_lock critical section prior to returning from unregister_clocksource, rather then stopping everything for the clocksource change. > I agree that stop_machine is the big gun and restricts the code in the way > how the clocksource functions may be call. But it is safe, no? Actually, my reading of stop_machine makes me hesitate a bit, as I'm not sure if with kernel preemption, we're sure to avoid stopping a thread mid-syscall to gettimeofday. Anyone have a clue if that's avoided? Regardless, we need some other method then stop_machine to register clocksources, as stop_machine is just too limiting. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: john stultz on 29 Jul 2010 19:30 On Thu, 2010-07-29 at 16:08 -0700, john stultz wrote: > On Thu, 2010-07-29 at 13:49 -0700, john stultz wrote: > > On Thu, 2010-07-29 at 09:11 +0200, Martin Schwidefsky wrote: > > > What about a clocksource_unregister while a cpu is in the middle of a > > > read_seqbegin/timekeeping_get_ns/read_seqretry? The clocksource structure > > > is "free" after the successful call to the unregister. At least in theory > > > this could be a use after free. The race window is tiny but on virtual > > > systems there can be an arbitrary delay in the ktime_get sequence. > > > > So yes, unregister has been contentious in the past for this very > > reason. Once registered, its really hard to find a safe point when it > > can be un-registered. Stop machine mostly solves this (although one > > should note: vsyscall enabled clocksources really can't be freed, as > > their vread() page needs to be statically mapped into userspace). > > > > So while stop_machine is a solution here, it would make more sense to me > > to use stop_machine (or maybe even a different method, as it sort of > > screams RCU to me) to make sure all the cpus are out of the xtime_lock > > critical section prior to returning from unregister_clocksource, rather > > then stopping everything for the clocksource change. > > > Below is a rough patch to use stop_machine to get the same level of race > protection for clocksource_unregister as we have currently in Linus's > tree (which may possibly have holes in it?). > > Comments or suggestions for other ideas would be appreciated. > > I'm thinking RCU might be really close to what we actually want here, > but I'd like to be able to avoid any extra work on the read-side (ie: > even the preempt_disable()), and would even be more prone to disallowing > clocksource unregistration then impacting the xtime_lock read side. > > > Any other thoughts? Actually, the more I think about it.. The more I really just think we should kill clocksource_unregister and simply not allow it. Part of the reason is that we have other issues lurking under here, such as: "what do we do if someone unregisters the only HRT capable clocksource? As there's currently no way to fall back from HRT mode to non HRT mode." It just adds a ton of complexity and issues for really zero gain. The only reasonable use-case I can come up with is having a clocksource loaded via a module, and then wanting to unload it. So while loading clocksources as a module is a nice feature that could save folks in a pinch (think old distro kernels needing a clock fix on new hardware), unregister and removal really doesn't have much functional use. Its just only nice an symmetrical. So unless anyone else objects, I'm prone to kill off unregister (and change the single user's error-handling path to delay registration until the hardware is known to be good). Any counter points? thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Pages: 1 Prev: generic file status update Next: [PATCH 2/2] drivers:staging:ti-st: update ABI and TODO |