From: Dan Magenheimer on
> From: Peter Zijlstra [mailto:peterz(a)infradead.org]
> Subject: Re: [PATCH] x86: Export tsc related information in sysfs
>
> On Tue, 2010-05-18 at 13:25 +0200, Andi Kleen wrote:
> > On Tue, May 18, 2010 at 11:58:18AM +0200, Peter Zijlstra wrote:
> > > On Sun, 2010-05-16 at 22:06 -0700, Arjan van de Ven wrote:
> > > > look we're not disabling ring 3 tsc. We could, but we don't.
> > >
> > > Maybe we should.
> >
> > That would kill the vsyscall too. Remember it's running in ring 3.
> >
> > That is in theory you could disable it on systems where the vsyscall
> > doesn't use it, but then you would likely break huge amounts of
> > software, unless you emulate it.
>
> Well, software shouldn't use it, so breaking it sounds like a fine
> idea ;-)

Last fall, I discovered that EVERY program on RHEL5 (U2?) uses rdtsc
because RHEL5 ld.so uses rdtsc. The uses are few and harmless but
are there nonetheless. Clearly that can be fixed, but you might
be surprised how big "huge amounts of software" is.

> Also, a slow emulation is an incentive to actually do the right thing.

Emulation is not particularly slow, especially compared to accessing
the HPET. If the kernel deems TSC is unsafe, the ring 3 vsyscall
shouldn't be using rdtsc either so the additional trap overhead
might be in the noise.

As long as there is a sysfs file that can override the setting
and there is a counter (accessible via sysfs) that can count
the number of emulated rdtsc/rdtscp instructions (possibly
optionally by pid so the "offending" userland threads can
be tracked down), setting CR4.TSD whenever the kernel deems
TSC is unsafe and emulating rdtsc might be a reasonable solution.
Infrequent rdtsc users won't know or care, and frequent users
will at least be able to learn the frequency of their "sin".

And to help Thomas/Arjan/Ingo/Andi educate users, every read
or write to any of these sysfs files could also result in a
printk of "Use of rdtsc is deprecated... use vsyscalls instead.
See Documentation/friends_dont_let_friends_use_rdtsc."
(Half ;-)

And the sysfs file could have a "strict" setting which kills any
thread that uses rdtsc, so Thomas can tell future problem
reporters: "Set the rdtsc setting to strict and if you still
have problems, call me back." (Other half of ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 05/18/2010 04:58 AM, Peter Zijlstra wrote:
> On Tue, 2010-05-18 at 13:25 +0200, Andi Kleen wrote:
>> On Tue, May 18, 2010 at 11:58:18AM +0200, Peter Zijlstra wrote:
>>> On Sun, 2010-05-16 at 22:06 -0700, Arjan van de Ven wrote:
>>>> look we're not disabling ring 3 tsc. We could, but we don't.
>>>
>>> Maybe we should.
>>
>> That would kill the vsyscall too. Remember it's running in ring 3.
>>
>> That is in theory you could disable it on systems where the vsyscall
>> doesn't use it, but then you would likely break huge amounts of software,
>> unless you emulate it.
>
> Well, software shouldn't use it, so breaking it sounds like a fine
> idea ;-)
>
> Also, a slow emulation is an incentive to actually do the right thing.

The problem is that you throw the baby (vsyscall) out with the bathwater
(user rdtsc).

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

From: Peter Zijlstra on
On Tue, 2010-05-18 at 09:40 -0700, H. Peter Anvin wrote:
>
> The problem is that you throw the baby (vsyscall) out with the bathwater
> (user rdtsc).

Well, we could only flip the CR4 bit when we mark the TSC unsuitable for
gtod. That should be plenty good to tag all userspace trying to use it,
since more than half my machines don't use TSC for clocksource.
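
A minimal sketch of the "only flip the CR4 bit when the TSC is unsuitable for gtod" idea. CR4.TSD is bit 2 of CR4; when it is set, rdtsc executed in ring 3 faults with #GP instead of running. The helper name cr4_for_tsc_state() and its parameters are hypothetical, and it works on a plain integer rather than the real register, so it only illustrates the decision, not how the kernel would apply it.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.TSD (Time Stamp Disable) is bit 2 of the x86 CR4 register. */
#define CR4_TSD (UINT64_C(1) << 2)

/*
 * Hypothetical helper, not a kernel API: given a CR4 value and whether
 * the TSC is still trusted for gettimeofday, return the CR4 value the
 * policy in this thread would want. Pure computation on an integer;
 * the real register is never touched.
 */
static uint64_t cr4_for_tsc_state(uint64_t cr4, bool tsc_ok_for_gtod)
{
    if (tsc_ok_for_gtod)
        return cr4 & ~CR4_TSD;  /* ring 3 rdtsc executes natively */
    return cr4 | CR4_TSD;       /* ring 3 rdtsc faults and can be trapped */
}
```

All other CR4 bits pass through unchanged, so the helper can be applied to whatever value the CPU already has.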
From: H. Peter Anvin on
On 05/18/2010 09:52 AM, Peter Zijlstra wrote:
> On Tue, 2010-05-18 at 09:40 -0700, H. Peter Anvin wrote:
>>
>> The problem is that you throw the baby (vsyscall) out with the bathwater
>> (user rdtsc).
>
> Well, we could only flip the CR4 bit when we mark the TSC unsuitable for
> gtod. That should be plenty good to tag all userspace trying to use it,
> since more than half my machines don't use TSC for clocksource.

This might be an option, although it would have to be an *option*.
There are restricted uses of the TSC in userspace which are still useful
(mainly involving performance analysis and/or CPU-locked processes).

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

From: Dan Magenheimer on
> From: H. Peter Anvin [mailto:hpa(a)zytor.com]
> Sent: Tuesday, May 18, 2010 11:04 AM
> To: Peter Zijlstra
> Cc: Andi Kleen; Arjan van de Ven; Dan Magenheimer; Thomas Gleixner;
> Venkatesh Pallipadi; Ingo Molnar; Chris Mason;
> linux-kernel(a)vger.kernel.org
> Subject: Re: [PATCH] x86: Export tsc related information in sysfs
>
> On 05/18/2010 09:52 AM, Peter Zijlstra wrote:
> > On Tue, 2010-05-18 at 09:40 -0700, H. Peter Anvin wrote:
> >>
> >> The problem is that you throw the baby (vsyscall) out with the
> >> bathwater (user rdtsc).
> >
> > Well, we could only flip the CR4 bit when we mark the TSC unsuitable
> > for gtod. That should be plenty good to tag all userspace trying to
> > use it, since more than half my machines don't use TSC for clocksource.
>
> This might be an option, although it would have to be an *option*.
> There are restricted uses of the TSC in userspace which are still
> useful (mainly involving performance analysis and/or CPU-locked
> processes).

(Though I expect tglx/arjan/andi/mingo to disagree with this proposal
for similar reasons as the original one that started this thread...)

Proposal:

/sys/devices/system/tsc/native (writable by root):

0 = (default) Kernel dynamically controls TSC emulation.
    When the kernel deems TSC usable as a clocksource, rdtsc
    will be executed directly by the CPU. When the kernel deems
    TSC unsafe to use, rdtsc will be trapped and emulated.
1 = TSC emulation is never enabled. Programs using rdtsc
    directly are subject to the many known and sometimes rare
    and subtle vagaries of TSC.
2 = TSC emulation is always enabled (for debug only).
3 = Processes using TSC will be treated as if they executed
    an illegal instruction. [? Can the kernel recognize
    use of rdtsc in a vsyscall and emulate so that,
    even though vsyscall is slower, all other rdtsc
    in userspace are illegal?] [? Can/should this be
    enforced only on non-root processes?]
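
The four values of "native" listed above can be read as a small decision table. The following is only a sketch of that table, assuming hypothetical names (tsc_native_mode, rdtsc_action, rdtsc_action_for) that do not exist in the kernel. It leaves the two open questions under value 3 (vsyscall handling, root vs. non-root) unanswered, and the fallback for an out-of-range value is my assumption.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the proposed sysfs values 0..3. */
enum tsc_native_mode {
    TSC_MODE_AUTO = 0,            /* kernel decides dynamically (default) */
    TSC_MODE_NEVER_EMULATE = 1,   /* rdtsc always runs natively */
    TSC_MODE_ALWAYS_EMULATE = 2,  /* rdtsc always trapped (debug only) */
    TSC_MODE_STRICT = 3           /* rdtsc treated as illegal instruction */
};

enum rdtsc_action { RDTSC_NATIVE, RDTSC_EMULATE, RDTSC_ILLEGAL };

/* Map the configured mode and the kernel's view of the TSC to an action. */
static enum rdtsc_action rdtsc_action_for(enum tsc_native_mode mode,
                                          bool tsc_usable)
{
    switch (mode) {
    case TSC_MODE_AUTO:
        return tsc_usable ? RDTSC_NATIVE : RDTSC_EMULATE;
    case TSC_MODE_NEVER_EMULATE:
        return RDTSC_NATIVE;
    case TSC_MODE_ALWAYS_EMULATE:
        return RDTSC_EMULATE;
    case TSC_MODE_STRICT:
        return RDTSC_ILLEGAL;
    }
    /* Assumption: treat an unknown value like the debug mode. */
    return RDTSC_EMULATE;
}
```

Only value 0 depends on whether the kernel currently trusts the TSC; the other three settings override that judgment in one direction or the other.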

/sys/devices/system/tsc/system_count (writable by root):

Contains a count of all TSC emulations, system-wide.
Writable to allow reset to zero.

/sys/devices/system/tsc/pid_counters (writable by root):

0 = (default) TSC counts are system-wide only
1 = TSC counted per pid (at performance penalty);
    counters in /proc/PID/tsc_count

/proc/PID/tsc_count (readonly):

If /sys/devices/system/tsc/pid_counters is 1,
contains the count of rdtsc instructions emulated
for this PID.

(Note: rdtscp will be emulated faithfully as rdtscp,
but for all other purposes, including counting, it
will be treated as a rdtsc.)
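
The counting side of the proposal could be modelled roughly as follows. This is a user-space sketch under stated assumptions, not kernel code: struct tsc_stats, tsc_note_emulation(), tsc_pid_count(), tsc_reset_system_count() and the fixed-size table are all hypothetical. It shows the system-wide counter always advancing, the per-pid counters advancing only while pid_counters is enabled, and rdtsc and rdtscp sharing one counter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED_PIDS 64  /* arbitrary bound, for the sketch only */

struct pid_count {
    int pid;
    uint64_t count;
};

struct tsc_stats {
    uint64_t system_count;   /* models .../tsc/system_count */
    bool pid_counters;       /* models .../tsc/pid_counters (0 or 1) */
    struct pid_count per_pid[MAX_TRACKED_PIDS];
    size_t nr_pids;
};

/* Record one emulated rdtsc or rdtscp; both are counted the same way. */
static void tsc_note_emulation(struct tsc_stats *s, int pid)
{
    s->system_count++;
    if (!s->pid_counters)
        return;
    for (size_t i = 0; i < s->nr_pids; i++) {
        if (s->per_pid[i].pid == pid) {
            s->per_pid[i].count++;
            return;
        }
    }
    /* First emulation seen for this pid; dropped if the table is full. */
    if (s->nr_pids < MAX_TRACKED_PIDS) {
        s->per_pid[s->nr_pids].pid = pid;
        s->per_pid[s->nr_pids].count = 1;
        s->nr_pids++;
    }
}

/* Value a read of /proc/PID/tsc_count would report in this model. */
static uint64_t tsc_pid_count(const struct tsc_stats *s, int pid)
{
    for (size_t i = 0; i < s->nr_pids; i++) {
        if (s->per_pid[i].pid == pid)
            return s->per_pid[i].count;
    }
    return 0;
}

/* Models root writing to system_count to reset it to zero. */
static void tsc_reset_system_count(struct tsc_stats *s)
{
    s->system_count = 0;
}
```

In this model, resetting the system-wide count leaves the per-pid counts alone; the proposal does not say which behavior is intended, so that choice is an assumption.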