From: Pekka Enberg on 7 Apr 2010 13:00

Pekka Enberg wrote:
> Christoph Lameter wrote:
>> I wonder if this is not related to the kmem_cache_cpu structure straddling
>> cache line boundaries under some conditions. On 2.6.33 the kmem_cache_cpu
>> structure was larger and therefore tight packing resulted in different
>> alignment.
>>
>> Could you see how the following patch affects the results. It attempts to
>> increase the size of kmem_cache_cpu to a power of 2 bytes. There is also
>> the potential that other per cpu fetches to neighboring objects affect the
>> situation. We could cacheline align the whole thing.
>>
>> ---
>>  include/linux/slub_def.h |    5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> Index: linux-2.6/include/linux/slub_def.h
>> ===================================================================
>> --- linux-2.6.orig/include/linux/slub_def.h	2010-04-07 11:33:50.000000000 -0500
>> +++ linux-2.6/include/linux/slub_def.h	2010-04-07 11:35:18.000000000 -0500
>> @@ -38,6 +38,11 @@ struct kmem_cache_cpu {
>>  	void **freelist;	/* Pointer to first free per cpu object */
>>  	struct page *page;	/* The slab from which we are allocating */
>>  	int node;		/* The node of the page (or -1 for debug) */
>> +#ifndef CONFIG_64BIT
>> +	int dummy1;
>> +#endif
>> +	unsigned long dummy2;
>> +
>>  #ifdef CONFIG_SLUB_STATS
>>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>>  #endif
>
> Would __cacheline_aligned_in_smp do the trick here?

Oh, sorry, I think it's actually '____cacheline_aligned_in_smp' (with four
underscores) for per-cpu data. Confusing...

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Pekka Enberg on 7 Apr 2010 13:00

Christoph Lameter wrote:
> I wonder if this is not related to the kmem_cache_cpu structure straddling
> cache line boundaries under some conditions. On 2.6.33 the kmem_cache_cpu
> structure was larger and therefore tight packing resulted in different
> alignment.
>
> Could you see how the following patch affects the results. It attempts to
> increase the size of kmem_cache_cpu to a power of 2 bytes. There is also
> the potential that other per cpu fetches to neighboring objects affect the
> situation. We could cacheline align the whole thing.
>
> ---
>  include/linux/slub_def.h |    5 +++++
>  1 file changed, 5 insertions(+)
>
> Index: linux-2.6/include/linux/slub_def.h
> ===================================================================
> --- linux-2.6.orig/include/linux/slub_def.h	2010-04-07 11:33:50.000000000 -0500
> +++ linux-2.6/include/linux/slub_def.h	2010-04-07 11:35:18.000000000 -0500
> @@ -38,6 +38,11 @@ struct kmem_cache_cpu {
>  	void **freelist;	/* Pointer to first free per cpu object */
>  	struct page *page;	/* The slab from which we are allocating */
>  	int node;		/* The node of the page (or -1 for debug) */
> +#ifndef CONFIG_64BIT
> +	int dummy1;
> +#endif
> +	unsigned long dummy2;
> +
>  #ifdef CONFIG_SLUB_STATS
>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>  #endif

Would __cacheline_aligned_in_smp do the trick here?
From: Christoph Lameter on 7 Apr 2010 14:20

On Wed, 7 Apr 2010, Pekka Enberg wrote:

> Christoph Lameter wrote:
> > I wonder if this is not related to the kmem_cache_cpu structure straddling
> > cache line boundaries under some conditions. On 2.6.33 the kmem_cache_cpu
> > structure was larger and therefore tight packing resulted in different
> > alignment.
> >
> > Could you see how the following patch affects the results. It attempts to
> > increase the size of kmem_cache_cpu to a power of 2 bytes. There is also
> > the potential that other per cpu fetches to neighboring objects affect the
> > situation. We could cacheline align the whole thing.
> >
> > ---
> >  include/linux/slub_def.h |    5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > Index: linux-2.6/include/linux/slub_def.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/slub_def.h	2010-04-07 11:33:50.000000000 -0500
> > +++ linux-2.6/include/linux/slub_def.h	2010-04-07 11:35:18.000000000 -0500
> > @@ -38,6 +38,11 @@ struct kmem_cache_cpu {
> >  	void **freelist;	/* Pointer to first free per cpu object */
> >  	struct page *page;	/* The slab from which we are allocating */
> >  	int node;		/* The node of the page (or -1 for debug) */
> > +#ifndef CONFIG_64BIT
> > +	int dummy1;
> > +#endif
> > +	unsigned long dummy2;
> > +
> >  #ifdef CONFIG_SLUB_STATS
> >  	unsigned stat[NR_SLUB_STAT_ITEMS];
> >  #endif
>
> Would __cacheline_aligned_in_smp do the trick here?

This is allocated via the percpu allocator. We could specify cacheline
alignment there but that would reduce the density. You basically need 4
words for a kmem_cache_cpu structure. A number of those fit into one 64
byte cacheline.
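As a rough illustration of the density trade-off described above, the
following userspace C sketch compares three layouts: the original fields,
the fields padded to four words, and a cacheline-aligned variant. Everything
in it is an assumption made for illustration. The struct names, CACHE_LINE,
count_straddlers() and objects_per_line() are hypothetical and are not
kernel code, `void *` stands in for `struct page *`, only the 64-bit case of
the patch is modeled, and objects are assumed to be packed back to back from
a cacheline-aligned base, which the percpu allocator does not promise.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed line size, taken from the "64 byte cacheline" in the thread. */
#define CACHE_LINE 64

/* Userspace stand-in for the unpatched fields (hypothetical name).
 * On a typical LP64 target this is 8 + 8 + 4 bytes plus 4 bytes of
 * tail padding, i.e. 24 bytes. */
struct cpu_slab_orig {
	void **freelist;
	void *page;		/* struct page * in the kernel */
	int node;
};

/* Stand-in for the patched layout, 64-bit case only: dummy2 brings the
 * size to four words (32 bytes on a typical LP64 target). */
struct cpu_slab_padded {
	void **freelist;
	void *page;
	int node;
	unsigned long dummy2;
};

/* Stand-in for the "cacheline align the whole thing" option. C11
 * _Alignas on the first member raises the alignment of the whole
 * struct, so sizeof() is rounded up to a full cache line. */
struct cpu_slab_aligned {
	_Alignas(CACHE_LINE) void **freelist;
	void *page;
	int node;
};

/* Count how many of n objects of `size` bytes, packed back to back from
 * a line-aligned base, cross a cache line boundary. */
static size_t count_straddlers(size_t size, size_t n, size_t line)
{
	size_t i, crossing = 0;

	assert(size > 0 && line > 0);
	for (i = 0; i < n; i++) {
		size_t first = (i * size) / line;		/* line of first byte */
		size_t last = (i * size + size - 1) / line;	/* line of last byte */

		if (first != last)
			crossing++;
	}
	return crossing;
}

/* Whole objects that fit into one cache line (the "density"). */
static size_t objects_per_line(size_t size, size_t line)
{
	assert(size > 0);
	return line / size;
}
```

With 24-byte objects, count_straddlers(24, 8, CACHE_LINE) finds two of the
eight objects crossing a line. With 32-byte objects none do, and two still
fit per line. The aligned variant never crosses a line either, but it
occupies a full 64 bytes per object, which is the loss of density mentioned
above.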
From: Pekka Enberg on 7 Apr 2010 14:30

Christoph Lameter wrote:
> On Wed, 7 Apr 2010, Pekka Enberg wrote:
>
>> Oh, sorry, I think it's actually '____cacheline_aligned_in_smp' (with four
>> underscores) for per-cpu data. Confusing...
>
> This does not particularly help to clarify the situation since we are
> dealing with data that can either be allocated via the percpu allocator or
> be statically present (kmalloc bootstrap situation).

Yes, I am an idiot. :-)
From: Christoph Lameter on 7 Apr 2010 14:30

On Wed, 7 Apr 2010, Pekka Enberg wrote:

> Oh, sorry, I think it's actually '____cacheline_aligned_in_smp' (with four
> underscores) for per-cpu data. Confusing...

This does not particularly help to clarify the situation since we are
dealing with data that can either be allocated via the percpu allocator or
be statically present (kmalloc bootstrap situation).