From: Peter Zijlstra on 19 Apr 2010 10:50

On Mon, 2010-04-19 at 17:33 +0300, Avi Kivity wrote:
> On 04/19/2010 05:21 PM, Glauber Costa wrote:
> >> Oh yes, just trying to avoid a patch with both atomic64_read() and
> >> ACCESS_ONCE().
> >
> > you're mixing the private version of the patch you saw with this one.
> > there aren't any atomic reads in here. I'll use a barrier then
>
> This patch writes last_value atomically, but reads it non-atomically. A
> barrier is insufficient.

What Avi says! :-)

On a 32-bit machine a 64-bit read is two 32-bit reads, so

    last = last_value;

becomes:

    last.high = last_value.high;
    last.low = last_value.low;

(or the reverse of course)

Now imagine a write getting interleaved with that ;-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
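[Editor's note] Peter's interleaving can be made concrete. Below is a user-space simulation (not kernel code) of the torn read: the reader picks up the high half before a writer's update lands and the low half after it, yielding a value that was never stored. The interleaving is done by hand so the result is deterministic.

```c
#include <stdint.h>

/* Simulate the interleaving Peter describes: on 32-bit, the 64-bit load
 * `last = last_value` is really two 32-bit loads, and a concurrent
 * write can land between them. */
static uint64_t torn_read(uint64_t old_val, uint64_t new_val)
{
    uint32_t high = (uint32_t)(old_val >> 32); /* reader loads high half */
    /* ... writer stores new_val here ... */
    uint32_t low = (uint32_t)new_val;          /* reader loads low half */
    return ((uint64_t)high << 32) | low;
}
```

With old_val = 0x00000001FFFFFFFF and new_val = 0x0000000200000000 (an increment by one), the torn result is 0x0000000100000000: neither the old nor the new value, and nearly 2^32 ns in the past, i.e. exactly the kind of backwards jump the thread is chasing.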
From: Avi Kivity on 20 Apr 2010 11:20

On 04/20/2010 03:59 PM, Glauber Costa wrote:
> > Might be due to NMIs or SMIs interrupting the rdtsc(); ktime_get()
> > operation which establishes the timeline. We could limit it by
> > having a loop doing rdtsc(); ktime_get(); rdtsc(); and checking for
> > some bound, but it isn't worthwhile (and will break nested
> > virtualization for sure). Better to have the option to calibrate
> > kvmclock just once on machines with
> > X86_FEATURE_NONSTOP_TRULY_RELIABLE_TSC_HONESTLY.
>
> For the record, we can only even do that on those machines. If we try to
> update time structures only once on machines with the
> X86_FEATURE_TSC_SAYS_IT_IS_OKAY_BUT_IN_REALITY_IS_NOT_OKAY feature flag,
> guests won't even boot.
>
> We can detect that, and besides doing the calculation only once, also
> export some bit indicating that to the guest. Hmm... I'm thinking now
> that, because of migration, we should check this bit every time, because
> we might have changed hosts. So instead of using an expensive cpuid
> check, we should probably use some bit in the vcpu_time_info structure,
> and use a cpuid bit just to say it is enabled.

Right, we need a bit in the structure itself that says you can trust the
time not to go backwards, and a bit in cpuid that says you can trust the
bit in the structure (unless we can be sure no implementations leak
garbage into the padding field).

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
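[Editor's note] The two-level scheme Avi describes could look roughly like the sketch below: a stability flag inside the shared time structure, checked on every read so it stays correct after migration to a different host, gated by a cpuid-advertised capability saying the flag field is meaningful at all. All names here are illustrative, not the eventual ABI.

```c
#include <stdint.h>

/* Hypothetical flag in the shared time structure: the host sets it when
 * the guest may trust readings not to go backwards. */
#define PVCLOCK_TSC_STABLE_BIT (1u << 0)

/* cpuid_says_flags_valid would come from a (hypothetical) cpuid bit,
 * probed once: without it, the flags byte may be garbage left in what
 * older hosts treated as padding, so it must be distrusted entirely. */
static int tsc_is_trustworthy(int cpuid_says_flags_valid, uint8_t flags)
{
    if (!cpuid_says_flags_valid)
        return 0;   /* flags may be uninitialized padding: distrust */
    return (flags & PVCLOCK_TSC_STABLE_BIT) != 0;
}
```

The point of the split is that the cpuid check is expensive and host-local, while the in-structure flag is cheap and travels with the per-vcpu data, so it remains valid after live migration.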
From: Jeremy Fitzhardinge on 19 Apr 2010 12:20

On 04/19/2010 07:26 AM, Glauber Costa wrote:
> > Is the problem that the tscs are starting out of sync, or that they're
> > drifting relative to each other over time? Do the problems become
> > worse the longer the uptime? How large are the offsets we're talking
> > about here?
>
> The offsets usually seem pretty small, under a microsecond. So I don't
> think it has anything to do with tscs starting out of sync. Especially
> because the delta-based calculation has the exact purpose of handling
> that case.

So you think they're drifting out of sync from an initially synced
state? If so, what would bound the drift?

    J
From: Jeremy Fitzhardinge on 19 Apr 2010 12:20

On 04/19/2010 07:33 AM, Avi Kivity wrote:
> On 04/19/2010 05:21 PM, Glauber Costa wrote:
> >> Oh yes, just trying to avoid a patch with both atomic64_read() and
> >> ACCESS_ONCE().
> >
> > you're mixing the private version of the patch you saw with this one.
> > there aren't any atomic reads in here. I'll use a barrier then
>
> This patch writes last_value atomically, but reads it non-atomically.
> A barrier is insufficient.

Well, on a 32-bit system, you can explicitly order the updates of low
and high, then do a high-low-checkhigh read. That would be much more
efficient than atomic64. If we really care about 32-bit.

    J
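[Editor's note] A close cousin of Jeremy's ordered-split-read idea, written out with an explicit sequence counter so the reader can detect an update of either half (a plain high-low-checkhigh read cannot distinguish "low wrapped" from "nothing changed"). This is a hedged user-space sketch with a single writer assumed and compiler barriers standing in for the kernel's smp_wmb()/smp_rmb(); it is not the patch under discussion.

```c
#include <stdint.h>

#define barrier() __asm__ __volatile__("" ::: "memory")

/* Split 64-bit value plus a version counter (seqcount style). */
struct split64 {
    volatile uint32_t seq;
    volatile uint32_t low;
    volatile uint32_t high;
};

static void split64_write(struct split64 *v, uint64_t val)
{
    v->seq++;                       /* odd: update in progress */
    barrier();
    v->low  = (uint32_t)val;
    v->high = (uint32_t)(val >> 32);
    barrier();
    v->seq++;                       /* even again: update complete */
}

static uint64_t split64_read(const struct split64 *v)
{
    uint32_t seq, low, high;
    do {
        do {
            seq = v->seq;
        } while (seq & 1);          /* writer mid-update: spin */
        barrier();
        low  = v->low;
        high = v->high;
        barrier();
    } while (v->seq != seq);        /* raced with a writer: retry */
    return ((uint64_t)high << 32) | low;
}
```

On the read side this costs two ordered 32-bit loads plus a counter check in the common case, versus a cmpxchg8b-based atomic64_read on 32-bit x86, which is presumably the efficiency Jeremy is after.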
From: Glauber Costa on 19 Apr 2010 14:30
On Mon, Apr 19, 2010 at 09:19:38AM -0700, Jeremy Fitzhardinge wrote:
> On 04/19/2010 07:26 AM, Glauber Costa wrote:
> >> Is the problem that the tscs are starting out of sync, or that
> >> they're drifting relative to each other over time? Do the problems
> >> become worse the longer the uptime? How large are the offsets we're
> >> talking about here?
> >
> > The offsets usually seem pretty small, under a microsecond. So I
> > don't think it has anything to do with tscs starting out of sync.
> > Especially because the delta-based calculation has the exact purpose
> > of handling that case.
>
> So you think they're drifting out of sync from an initially synced
> state? If so, what would bound the drift?

I think the delta calculation introduces errors. Marcelo can probably
confirm it, but he has a Nehalem with an apparently very good tsc
source. Even this machine warps.

It stops warping if we only write the pvclock data structure once and
forget it (which only updates tsc_timestamp once), according to him.
Obviously, we can't do that everywhere.
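[Editor's note] For context, the "delta-based calculation" under suspicion is the guest-side kvmclock read: system_time plus a scaled TSC delta. The sketch below is simplified from the shape of pvclock_vcpu_time_info (version/flags fields omitted, and the kernel uses a wider intermediate for the multiply). Each vcpu carries its own (tsc_timestamp, system_time) snapshot, so any error in taking that snapshot shows up as warp between vcpus, which is why writing the structure once makes the warp stop.

```c
#include <stdint.h>

/* Simplified per-vcpu pvclock data; modeled on pvclock_vcpu_time_info. */
struct vcpu_time_info {
    uint64_t tsc_timestamp;      /* guest TSC when the host updated this */
    uint64_t system_time;        /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point scale factor */
    int8_t   tsc_shift;          /* pre-scale shift */
};

static uint64_t pvclock_read(const struct vcpu_time_info *ti, uint64_t tsc)
{
    uint64_t delta = tsc - ti->tsc_timestamp;

    if (ti->tsc_shift >= 0)
        delta <<= ti->tsc_shift;
    else
        delta >>= -ti->tsc_shift;

    /* 32.32 fixed-point multiply: (delta * mul) >> 32.
     * (The kernel uses a 128-bit intermediate; elided here.) */
    return ti->system_time + ((delta * ti->tsc_to_system_mul) >> 32);
}
```

For example, with tsc_shift = 1 and tsc_to_system_mul = 0x80000000 (i.e. 0.5 in 32.32 fixed point), a raw delta of 1000 cycles scales to exactly 1000 ns; any host-side error in the snapshot pair offsets every subsequent reading on that vcpu.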