From: Andi Kleen on 15 May 2010 06:00 Venkatesh Pallipadi <venki(a)google.com> writes: > From: Dan Magenheimer <dan.magenheimer(a)oracle.com> > > Kernel information about calibrated value of tsc_khz and > tsc_stability (result of tsc warp test) are useful bits of information > for any app that wants to use TSC directly. Export this read_only > information in sysfs. Is this really a good idea? It will encourage the applications to use RDTSC directly, but there are all kinds of constraints on that. Even the kernel has a hard time with them, how likely is it that applications will get all that right? It would be better to fix them to use the vsyscalls instead. Or if they can't use the vsyscalls for some reason today fix them. This way if anything changes again in TSC the kernel could shield the applications. -Andi -- ak(a)linux.intel.com -- Speaking for myself only. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Jaswinder Singh Rajput on 15 May 2010 08:40 Hello, On Sat, May 15, 2010 at 7:10 AM, Venkatesh Pallipadi <venki(a)google.com> wrote: > From: Dan Magenheimer <dan.magenheimer(a)oracle.com> > > Kernel information about calibrated value of tsc_khz and > tsc_stability (result of tsc warp test) are useful bits of information > for any app that wants to use TSC directly. Export this read_only > information in sysfs. > > Signed-off-by: Venkatesh Pallipadi <venki(a)google.com> > Signed-off-by: Dan Magenheimer <dan.magenheimer(a)oracle.com> > --- > �arch/x86/kernel/tsc.c | � 76 +++++++++++++++++++++++++++++++++++++++++++++++++ > �1 files changed, 76 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c > index 9faf91a..24dd484 100644 > --- a/arch/x86/kernel/tsc.c > +++ b/arch/x86/kernel/tsc.c > @@ -10,6 +10,7 @@ > �#include <linux/clocksource.h> > �#include <linux/percpu.h> > �#include <linux/timex.h> > +#include <linux/sysdev.h> > > �#include <asm/hpet.h> > �#include <asm/timer.h> > @@ -857,6 +858,81 @@ static void __init init_tsc_clocksource(void) > � � � �clocksource_register(&clocksource_tsc); > �} > > +#ifdef CONFIG_SYSFS > +/* > + * Export TSC related info to user land. This reflects kernel usage of TSC > + * as hints to userspace users of TSC. The read_only info provided here: > + * - tsc_stable: 1 implies system has TSC that always counts at a constant > + * � rate, sync across CPUs and has passed the kernel warp test. > + * - tsc_khz: TSC frequency in khz. > + * - tsc_mult and tsc_shift: multiplier and shift to optimally convert > + * � TSC delta to ns; ns = ((u64) delta * mult) >> shift > + */ > + > +#define define_show_var_function(_name, _var) \ > +static ssize_t show_##_name( \ > + � � � struct sys_device *dev, struct sysdev_attribute *attr, char *buf) \ > +{ \ > + � � � return sprintf(buf, "%u\n", (unsigned int) _var);\ > +} > + > +define_show_var_function(tsc_stable, !tsc_unstable); > +define_show_var_function(tsc_khz, tsc_khz); > +define_show_var_function(tsc_mult, clocksource_tsc.mult); > +define_show_var_function(tsc_shift, clocksource_tsc.shift); > + > +static SYSDEV_ATTR(tsc_stable, 0444, show_tsc_stable, NULL); > +static SYSDEV_ATTR(tsc_khz, 0444, show_tsc_khz, NULL); > +static SYSDEV_ATTR(tsc_mult, 0444, show_tsc_mult, NULL); > +static SYSDEV_ATTR(tsc_shift, 0444, show_tsc_shift, NULL); > + > +static struct sysdev_attribute *tsc_attrs[] = { > + � � � &attr_tsc_stable, > + � � � &attr_tsc_khz, > + � � � &attr_tsc_mult, > + � � � &attr_tsc_shift, > +}; > + > +static struct sysdev_class tsc_sysclass = { > + � � � .name = "tsc", > +}; > + > +static struct sys_device device_tsc = { > + � � � .id = 0, > + � � � .cls = &tsc_sysclass, > +}; > + > +static int __init init_tsc_sysfs(void) > +{ > + � � � int err, i = 0; > + > + � � � err = sysdev_class_register(&tsc_sysclass); > + � � � if (err) > + � � � � � � � return err; > + > + � � � err = sysdev_register(&device_tsc); > + � � � if (err) > + � � � � � � � goto fail; fail will call sysdev_unregister(&device_tsc) which is not appropriate, please fix this goto. Thanks, -- Jaswinder Singh. > + > + � � � for (i = 0; i < ARRAY_SIZE(tsc_attrs); i++) { > + � � � � � � � err = sysdev_create_file(&device_tsc, tsc_attrs[i]); > + � � � � � � � if (err) > + � � � � � � � � � � � goto fail; > + � � � } > + > + � � � return 0; > + > +fail: > + � � � while (--i >= 0) > + � � � � � � � sysdev_remove_file(&device_tsc, tsc_attrs[i]); > + > + � � � sysdev_unregister(&device_tsc); > + � � � sysdev_class_unregister(&tsc_sysclass); > + � � � return err; > +} > +device_initcall(init_tsc_sysfs); > +#endif > + > �#ifdef CONFIG_X86_64 > �/* > �* calibrate_cpu is used on systems with fixed rate TSCs to determine > -- > 1.7.0.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo(a)vger.kernel.org > More majordomo info at �http://vger.kernel.org/majordomo-info.html > Please read the FAQ at �http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Dan Magenheimer on 15 May 2010 09:40 > From: Andi Kleen [mailto:andi(a)firstfloor.org] > > > Kernel information about calibrated value of tsc_khz and > > tsc_stability (result of tsc warp test) are useful bits of > information > > for any app that wants to use TSC directly. Export this read_only > > information in sysfs. > > Is this really a good idea? It will encourage the applications > to use RDTSC directly, but there are all kinds of constraints on Indeed, that is what it is intended to do. > that. Even the kernel has a hard time with them, how likely > is it that applications will get all that right? That's the point of exposing the tsc_reliable kernel data. If the processor has Invariant TSC and the system has successfully passed Ingo's warp test and, as a result the kernel is using TSC as a clocksource, why not enable userland apps that need to obtain timestamp data tens or hundreds of thousands of times per second to also use the TSC directly? > It would be better to fix them to use the vsyscalls instead. > Or if they can't use the vsyscalls for some reason today fix them. The problem is from an app point-of-view there is no vsyscall. There are two syscalls: gettimeofday and clock_gettime. Sometimes, if it gets lucky, they turn out to be very fast and sometimes it doesn't get lucky and they are VERY slow (resulting in a performance hit of 10% or more), depending on a number of factors completely out of the control of the app and even undetectable to the app. Note also that even vsyscall with TSC as the clocksource will still be significantly slower than rdtsc, especially in the common case where a timestamp is directly stored and the delta between two timestamps is later evaluated; in the vsyscall case, each timestamp is a function call and a convert to nsec but in the TSC case, each timestamp is a single instruction. > This way if anything changes again in TSC the kernel could > shield the applications. If tsc_reliable is 1, the system and the kernel are guaranteeing to the app that nothing will change in the TSC. In an Invariant TSC system that has passed Ingo's warp test (to eliminate the possibility of a fixed interprocessor TSC gap due to a broken BIOS in a multi-node NUMA system), if anything changes in the clock signal that drives the TSC, the system is badly broken and far worse things -- like inter-processor cache incoherency -- may happen. Is it finally possible to get past the horrible SMP TSC problems of the past and allow apps, under the right conditions, to be able to use rdtsc again? This patch argues "yes". -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Venkatesh Pallipadi on 15 May 2010 10:40 On Sat, May 15, 2010 at 5:35 AM, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote: > Hello, >> + >> +static int __init init_tsc_sysfs(void) >> +{ >> + � � � int err, i = 0; >> + >> + � � � err = sysdev_class_register(&tsc_sysclass); >> + � � � if (err) >> + � � � � � � � return err; >> + >> + � � � err = sysdev_register(&device_tsc); >> + � � � if (err) >> + � � � � � � � goto fail; > > fail will call sysdev_unregister(&device_tsc) which is not > appropriate, please fix this goto. > Thanks for catching this. Will fix this in the patch refresh. Thanks, Venki -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Venkatesh Pallipadi on 15 May 2010 12:50
On Sat, May 15, 2010 at 6:29 AM, Dan Magenheimer <dan.magenheimer(a)oracle.com> wrote: >> From: Andi Kleen [mailto:andi(a)firstfloor.org] >> >> > Kernel information about calibrated value of tsc_khz and >> > tsc_stability (result of tsc warp test) are useful bits of >> information >> > for any app that wants to use TSC directly. Export this read_only >> > information in sysfs. >> >> Is this really a good idea? �It will encourage the applications >> to use RDTSC directly, but there are all kinds of constraints on > > Indeed, that is what it is intended to do. > >> that. Even the kernel has a hard time with them, how likely >> is it that applications will get all that right? > > That's the point of exposing the tsc_reliable kernel data. > If the processor has Invariant TSC and the system has > successfully passed Ingo's warp test and, as a result > the kernel is using TSC as a clocksource, why not enable > userland apps that need to obtain timestamp data > tens or hundreds of thousands of times per second to > also use the TSC directly? > I am a little concerned about applications getting this right, with rdtsc and related barriers etc. May be that calls for a userspace library. Despite of that, it is useful information to human user to know whether tsc is stable and what TSC freq is. It will be very hard for userspace to do the TSC calibration. And to know whether TSC warp test passed or whether TSC is marked unstable by the kernel, there is no one place to get these answers today. Users have to look at clocksource, dmesg, etc. Thanks, Venki -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |