From: john stultz on 5 Nov 2009 06:40

On Wed, 2009-11-04 at 13:28 -0800, Dan Magenheimer wrote:
> > From: john stultz [mailto:johnstul(a)us.ibm.com]
> > On Thu, Oct 29, 2009 at 7:07 AM, Avi Kivity <avi(a)redhat.com> wrote:
> > >
> > > Out of interest, do you know (and can you relate) why those
> > > apps need 100k/sec monotonically increasing timestamps?
> >
> > This is sort of tangential, but depending on the need, this might be
> > of interest: Recently I've added a new clock_id,
> > CLOCK_MONOTONIC_COARSE (as well as CLOCK_REALTIME_COARSE), which
> > returns a HZ granular timestamp (same granularity as filesystem
> > timestamps). It's very fast to access, since there's no hardware to
> > touch, and is accessible via vsyscall.
> >
> > The idea being, if you're hitting clock_gettime 100k/sec but you really
> > don't have the need for nsec granular timestamps, it might provide a
> > really nice performance boost.
> >
> > Here's the commit:
>
> Hi John --
>
> Yes, possibly of interest. But does it work with CONFIG_NO_HZ?
> (I'm expecting that over time NO_HZ will become widespread
> for VM OS's, though interested in if you agree.)

It should work with CONFIG_NO_HZ: as soon as we come out of a long idle
(likely due to a timer tick), the timekeeping code will accumulate all
the skipped ticks.

If we ever get to non-idle NOHZ, we'll need some extra work here
(probably lazy accumulation done conditionally in the read path), but
that's also true for filesystem timestamps.

> Also very interested in your thoughts about a variation
> that returns something similar to a TSC_AUX to notify
> caller that the underlying reference clock has/may have
> changed.

I haven't been following that closely. Personally, experience makes me
skeptical of workarounds for unsynced TSCs. But I'm sure there's sharper
folks out there that might make it work. The kernel just requires that
it *really really* works, and not "mostly" works. :)

thanks
-john

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
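
[Editor's sketch] To make the coarse-clock trade-off described above concrete, here is a minimal C sketch, assuming a Linux system with glibc and a kernel that provides CLOCK_MONOTONIC_COARSE. The helper names (ts_to_ns, read_clock_ns, clock_res_ns) are illustrative and not taken from the thread or from any kernel code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Read the given clock; returns nanoseconds, or -1 if the read fails. */
static int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return ts_to_ns(&ts);
}

/* Report the advertised resolution of a clock in nanoseconds, or -1. */
static int64_t clock_res_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_getres(id, &ts) != 0)
        return -1;
    return ts_to_ns(&ts);
}
```

Calling clock_res_ns(CLOCK_MONOTONIC_COARSE) would typically report a tick-sized resolution (on the order of milliseconds, depending on HZ), while clock_res_ns(CLOCK_MONOTONIC) usually reports a much finer value on systems with high-resolution timers. An application that only needs tick granularity could swap read_clock_ns(CLOCK_MONOTONIC) for read_clock_ns(CLOCK_MONOTONIC_COARSE) in its hot path.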
From: Dan Magenheimer on 5 Nov 2009 06:41

> > Yes, possibly of interest. But does it work with CONFIG_NO_HZ?
> > (I'm expecting that over time NO_HZ will become widespread
> > for VM OS's, though interested in if you agree.)
>
> It should work with CONFIG_NO_HZ: as soon as we come out of
> a long idle (likely due to a timer tick), the timekeeping code
> will accumulate all the skipped ticks.
>
> If we ever get to non-idle NOHZ, we'll need some extra work here
> (probably lazy accumulation done conditionally in the read path), but
> that's also true for filesystem timestamps.

OK, sounds good.

> > Also very interested in your thoughts about a variation
> > that returns something similar to a TSC_AUX to notify
> > caller that the underlying reference clock has/may have
> > changed.
>
> I haven't been following that closely. Personally, experience makes me
> skeptical of workarounds for unsynced TSCs. But I'm sure there's sharper
> folks out there that might make it work. The kernel just requires that
> it *really really* works, and not "mostly" works. :)

This is less a workaround for unsynced TSCs than it is for VM migration
(and maybe also for times when a VM is out-of-context or moved to a
different pcpu), though it could probably be made to work on unsynced
TSC boxes also.

Basically, an application needing hi-res profiling info would do:

    nsec1 = clock_gettime2(MONOTONIC, &aux1);
    (time passes)
    nsec2 = clock_gettime2(MONOTONIC, &aux2);
    if (aux1 != aux2)
        discard_measurement();
    else
        use_measurement(nsec2 - nsec1);

and system software (hypervisor or kernel or both) is responsible for
ensuring that the aux value monotonically increases whenever a different
crystal is used.

Without something like this as a vsyscall, apps will just use rdtscp
(which must be emulated to work properly across a migration).
From: Keir Fraser on 5 Nov 2009 12:40
On 05/11/2009 14:52, "Dan Magenheimer" <dan.magenheimer(a)oracle.com> wrote:
> Well, all this discussion has convinced me that
> my original proposals do make sense

You surprise me, Dan. ;-)

 -- Keir