From: john stultz on 24 May 2010 18:50

On Mon, 2010-05-24 at 15:30 -0700, H. Peter Anvin wrote:
> On 05/24/2010 03:04 PM, Dan Magenheimer wrote:
> >>> Is that still the case? I thought newer versions of NTP could deal with
> >>> large values. Inaccuracies of way more than 500 ppm are everyday.
> >>
> >> That's scary.
> >>
> >> Yea, in the kernel the ntp freq correction tops out at 500ppm. Almost
> >> all the systems I see tend to fall in the +/- 200ppm range (if there's
> >> not something terribly wrong with the hardware).
> >>
> >> So maybe things aren't so bad out there? Or is that wishful thinking?
> >
> > Since Brian's concern is at boot-time at which point there is no
> > network or ntp, and assuming that it would be unwise to vary tsc_khz
> > dynamically on a clocksource==tsc machine (is it?), would optionally
> > lengthening the TSC<->PIT calibration beyond 25ms result in a more
> > consistent tsc_khz between boots? Or is the relative instability
> > an unavoidable result of skew between the PIT and the fixed constant
> > PIT_TICK_RATE combined with algorithmic/arithmetic error? Or is
> > the jitter of the (spread-spectrum) TSC too extreme? Or ???
> >
> > If better more consistent calibration is possible, offering
> > that as an optional kernel parameter seems better than specifying
> > a fixed tsc_khz (stamped or user-specified) which may or may
> > not be ignored due to "too different from measured tsc_khz".
> > Even an (*optional*) extra second or two of boot time might
> > be perfectly OK if it resulted in an additional five or six
> > bits of tsc_khz precision.
> >
> > Thoughts, Brian?
>
> Making the calibration time longer should give a more precise result,
> but of course at the expense of longer boot time.
>
> A longer sample would make sense if the goal is to freeze it into a
> kernel command line variable, but the real question is how many people
> would actually do that (and how many people would then suffer problems
> because they upgraded their CPU/mobo and got massive failures on post-boot.)

I'll admit it's a feature for a minority of users. That's probably why
it's not included.

And the upgraded system issue was something I tried to address by using
the calibrated value if the specified one was off by some unreasonable
amount. However, folks protested that, figuring that if the value is
explicitly stated, the kernel should not override it (i.e., for the use
case where the calibration is broken and folks want to force the value).

Also, you don't really need extra accuracy, you just need it to be the
same from boot to boot. NTP keeps the correction factor persistent from
boot to boot via the drift file. The boot argument is just trying to
save the time (possibly hours depending on ntp config) after a reboot
for NTP to correct for the new error introduced by calibration.

thanks
-john

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
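To put the ppm figures in this message into wall-clock terms, here is a minimal sketch in Python (not kernel code). The helper name `drift_seconds` is made up for illustration, and it assumes the frequency error stays constant over the interval.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_seconds(freq_error_ppm: float, elapsed_seconds: float) -> float:
    """Wall-clock error accumulated by a clock whose rate is off by
    freq_error_ppm parts per million (assumed constant, hypothetical helper)."""
    return freq_error_ppm * 1e-6 * elapsed_seconds


if __name__ == "__main__":
    # 200 ppm is the typical range and 500 ppm the kernel cap quoted above.
    for ppm in (200, 500):
        per_day = drift_seconds(ppm, SECONDS_PER_DAY)
        print(f"{ppm} ppm -> {per_day:.2f} s/day")
```

Under the same assumption, a tsc_khz that changes between boots behaves like a fresh frequency error that the value saved in the drift file no longer matches, which is the error NTP then has to re-learn.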
From: Dan Magenheimer on 24 May 2010 19:20

> From: john stultz [mailto:johnstul(a)us.ibm.com]
> Also, you don't really need extra accuracy, you just need it to be the
> same from boot to boot. NTP keeps the correction factor persistent from
> boot to boot via the drift file. The boot argument is just trying to
> save the time (possibly hours depending on ntp config) after a reboot
> for NTP to correct for the new error introduced by calibration.

I was assuming that extra accuracy would decrease the ntp
convergence time by about the same factor (5-6 bits of extra
accuracy would decrease ntp convergence time by 32-64x).
Is that an incorrect assumption?

> From: H. Peter Anvin [mailto:hpa(a)zytor.com]
> A longer sample would make sense if the goal is to freeze it into a
> kernel command line variable, but the real question is how many people
> would actually do that (and how many people would then suffer problems
> because they upgraded their CPU/mobo and got massive failures on
> post-boot.)

Not sure why upgraded mobo's would fail due to a longer sample?

As more and more systems become dependent on clocksource==tsc
and more and more people assume nanosecond-class measurements
are relatively accurate, I'd expect the accuracy of tsc_khz
to become more important. While desktop users might bristle
at an extra second of boot delay, I'll bet many server
farm administrators would gladly pay that upfront cost
if they know an option exists.
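As a rough check on the "five or six bits" figure, the sketch below (Python, not kernel code) models only one error source: a one-tick quantization error of the PIT over the calibration window. That model is an assumption made here for illustration; it ignores the PIT skew, arithmetic error, and spread-spectrum jitter raised earlier in the thread. The function names are hypothetical. PIT_TICK_RATE is the nominal 1193182 Hz PIT input frequency, and 25 ms is the window mentioned in the thread.

```python
import math

PIT_TICK_RATE = 1193182  # Hz, nominal PIT input clock


def quantization_ppm(window_seconds: float) -> float:
    """Relative size of a single PIT tick over the window, in ppm.
    Assumes (simplification) that one tick of uncertainty dominates."""
    ticks_in_window = PIT_TICK_RATE * window_seconds
    return 1e6 / ticks_in_window


def extra_bits(short_window: float, long_window: float) -> float:
    """Bits of precision gained by lengthening the window, under the
    same one-tick assumption (precision scales with window length)."""
    return math.log2(long_window / short_window)


if __name__ == "__main__":
    for window in (0.025, 1.0, 2.0):
        print(f"{window:>5} s window: {quantization_ppm(window):.3f} ppm, "
              f"{extra_bits(0.025, window):.2f} extra bits vs 25 ms")
```

Under this simplified model, going from 25 ms to one or two seconds is a 40x to 80x longer window, which is consistent with the five-to-six-bit estimate. It says nothing about how NTP convergence time scales.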
From: H. Peter Anvin on 24 May 2010 19:30

On 05/24/2010 04:16 PM, Dan Magenheimer wrote:
>> From: john stultz [mailto:johnstul(a)us.ibm.com]
>> Also, you don't really need extra accuracy, you just need it to be the
>> same from boot to boot. NTP keeps the correction factor persistent from
>> boot to boot via the drift file. The boot argument is just trying to
>> save the time (possibly hours depending on ntp config) after a reboot
>> for NTP to correct for the new error introduced by calibration.
>
> I was assuming that extra accuracy would decrease the ntp
> convergence time by about the same factor (5-6 bits of extra
> accuracy would decrease ntp convergence time by 32-64x).
> Is that an incorrect assumption?
>

Yes.

>> From: H. Peter Anvin [mailto:hpa(a)zytor.com]
>> A longer sample would make sense if the goal is to freeze it into a
>> kernel command line variable, but the real question is how many people
>> would actually do that (and how many people would then suffer problems
>> because they upgraded their CPU/mobo and got massive failures on
>> post-boot.)
>
> Not sure why upgraded mobo's would fail due to a longer sample?

Not due to a longer sample, but a frozen sample.

> As more and more systems become dependent on clocksource==tsc
> and more and more people assume nanosecond-class measurements
> are relatively accurate, I'd expect the accuracy of tsc_khz
> to become more important. While desktop users might bristle
> at an extra second of boot delay, I'll bet many server
> farm administrators would gladly pay that upfront cost
> if they know an option exists.

Not really. The delta measurements aren't the issue here, but rather
walltime convergence.

	-hpa
From: john stultz on 24 May 2010 19:40

On Mon, 2010-05-24 at 16:16 -0700, Dan Magenheimer wrote:
> > From: john stultz [mailto:johnstul(a)us.ibm.com]
> > Also, you don't really need extra accuracy, you just need it to be the
> > same from boot to boot. NTP keeps the correction factor persistent from
> > boot to boot via the drift file. The boot argument is just trying to
> > save the time (possibly hours depending on ntp config) after a reboot
> > for NTP to correct for the new error introduced by calibration.
>
> I was assuming that extra accuracy would decrease the ntp
> convergence time by about the same factor (5-6 bits of extra
> accuracy would decrease ntp convergence time by 32-64x).
> Is that an incorrect assumption?

Sorry, this is sort of mixing points.

I was saying you don't need more accuracy (as opposed to what H. Peter
mentioned below) when setting the tsc_khz= option I proposed, since the
value will be constant from boot to boot and thus will reduce the ntp
convergence time.

However, without such a boot option, more accuracy from an increased
calibration time would help. That said, the tradeoff of a longer boot
time is one not many will probably want.

> > From: H. Peter Anvin [mailto:hpa(a)zytor.com]
> > A longer sample would make sense if the goal is to freeze it into a
> > kernel command line variable, but the real question is how many people
> > would actually do that (and how many people would then suffer problems
> > because they upgraded their CPU/mobo and got massive failures on
> > post-boot.)
>
> Not sure why upgraded mobo's would fail due to a longer sample?

Again, this is mixing the discussion. The concern was that users of a
tsc_khz= boot option might have problems when they upgrade, as the
actual TSC freq might not match what was specified at boot.

> As more and more systems become dependent on clocksource==tsc
> and more and more people assume nanosecond-class measurements
> are relatively accurate, I'd expect the accuracy of tsc_khz
> to become more important. While desktop users might bristle
> at an extra second of boot delay, I'll bet many server
> farm administrators would gladly pay that upfront cost
> if they know an option exists.

Maybe something like a tsc_long_calibration=1 option would allow for
this?

However, I really do like the idea of pulling the stamped value from
the MSR and, if it's close to what we quickly calibrated, using that.

thanks
-john
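The "use the stamped value if it is close to the quick calibration" idea can be sketched as below. This is an illustrative Python sketch, not kernel code: the function name `pick_tsc_khz`, its parameters, and the 500 ppm default tolerance are all hypothetical, since the thread does not say what "close" should mean. It models only the stamped-value case; for an explicit tsc_khz= argument, the objection quoted earlier was that the kernel should not override the user at all.

```python
def pick_tsc_khz(measured_khz: int, stamped_khz: int,
                 tolerance_ppm: float = 500.0) -> int:
    """Prefer the stamped frequency when the quick calibration roughly
    agrees with it; otherwise fall back to the measured value.
    tolerance_ppm is a hypothetical threshold chosen for illustration."""
    if measured_khz <= 0:
        raise ValueError("measured_khz must be positive")
    deviation_ppm = abs(stamped_khz - measured_khz) / measured_khz * 1e6
    if deviation_ppm <= tolerance_ppm:
        return stamped_khz   # stable from boot to boot
    return measured_khz      # stamped value looks wrong for this hardware


if __name__ == "__main__":
    # Quick calibration lands about 50 ppm away: keep the stamped value.
    print(pick_tsc_khz(measured_khz=2_400_120, stamped_khz=2_400_000))
    # Stamped value is wildly off (e.g. different CPU): trust the measurement.
    print(pick_tsc_khz(measured_khz=2_400_000, stamped_khz=3_000_000))
```

The point of the design is that the returned value stays identical across boots as long as the quick calibration keeps landing inside the tolerance, so the frequency correction NTP saved in the drift file remains valid.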
From: Andi Kleen on 24 May 2010 19:50

> > As more and more systems become dependent on clocksource==tsc
> > and more and more people assume nanosecond-class measurements
> > are relatively accurate, I'd expect the accuracy of tsc_khz
> > to become more important. While desktop users might bristle
> > at an extra second of boot delay, I'll bet many server
> > farm administrators would gladly pay that upfront cost
> > if they know an option exists.
>
> Maybe something like a tsc_long_calibration=1 option would allow for
> this?

On a system with synchronized TSC and multiple cores you could also
simply do a longer calibration on another core in the background
after a quick "fast calibration".

-Andi