From: Steffen Klassert on 2 Jul 2010 05:10

On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
> >
> > These statistic counters add a lot of atomic operations to the fast-path.
> > Wouldn't it be better to have these statistics in a percpu manner?
> > This would avoid the atomic operations and we would get some additional
> > information on the distribution of the queued objects.
> >
>
> If I understood you correctly the resulting sysfs hierarchy would look like
> this one:
> pcrypt/
> |- serial_cpumask
> |- parallel_cpumask
> |- w0/
>    +--- parallel_objects
>    +--- serial_objects
>    +--- reorder_objects
> |- w1/
>    ...
> |- wN/
>
> right? If so I think it won't be very convenient to monitor summary number
> of parallel, serial and reorder objects.

Yes, I thought about something like this. You can still take the sum
over the percpu objects when you output the statistics.

> Anyway I think these atomic operations take very small time in comparison
> with other operations in padata. So small that it can be ignored.

I have a patch in queue that simplifies the serialization mechanism and
reduces the accesses of foreign and global memory as much as possible
in the parallel codepath. Adding atomic operations to global memory
(just to collect statistics) to the parallel codepath would go in the
opposite direction.

Steffen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
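The percpu suggestion in the message above can be illustrated with a minimal userspace C sketch. This is not padata code: NR_WORKERS, struct worker_stats, stat_parallel_enqueue() and sum_parallel_objects() are hypothetical names chosen for illustration, and a plain array stands in for the kernel's percpu storage. It only shows the shape of the idea: the fast path touches a local slot, and the total is computed on demand when the statistics are output.

```c
#include <assert.h>

#define NR_WORKERS 4 /* hypothetical number of per-CPU slots (w0..wN) */

/* Hypothetical per-worker statistics, one slot per CPU. */
struct worker_stats {
	long parallel_objects;
	long serial_objects;
	long reorder_objects;
};

/* Plain array standing in for percpu storage (assumption). */
static struct worker_stats stats[NR_WORKERS];

/* Fast path: each worker only touches its own slot, no global counter. */
static void stat_parallel_enqueue(int cpu)
{
	assert(cpu >= 0 && cpu < NR_WORKERS);
	stats[cpu].parallel_objects++;
}

/* Slow path: sum over all slots when the statistics are output. */
static long sum_parallel_objects(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_WORKERS; cpu++)
		sum += stats[cpu].parallel_objects;
	return sum;
}
```

The per-slot values give the distribution of queued objects across workers, and the summary number is still available through the sum. The sketch leaves out any locking, so it says nothing about concurrent readers and writers.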
From: Dan Kruchinin on 2 Jul 2010 06:30

On Fri, Jul 2, 2010 at 1:08 PM, Steffen Klassert
<steffen.klassert(a)secunet.com> wrote:
> On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
> > >
> > > These statistic counters add a lot of atomic operations to the fast-path.
> > > Wouldn't it be better to have these statistics in a percpu manner?
> > > This would avoid the atomic operations and we would get some additional
> > > information on the distribution of the queued objects.
> > >
> >
> > If I understood you correctly the resulting sysfs hierarchy would look like
> > this one:
> > pcrypt/
> > |- serial_cpumask
> > |- parallel_cpumask
> > |- w0/
> >    +--- parallel_objects
> >    +--- serial_objects
> >    +--- reorder_objects
> > |- w1/
> >    ...
> > |- wN/
> >
> > right? If so I think it won't be very convenient to monitor summary number
> > of parallel, serial and reorder objects.
>
> Yes, I thought about something like this. You can still take the sum
> over the percpu objects when you output the statistics.

But the summation cannot be accurate without some kind of lock, because
while we're summing, another CPU can increase or decrease its percpu
statistic counters. So each percpu statistic counter must be modified
under a lock, right?

> > Anyway I think these atomic operations take very small time in comparison
> > with other operations in padata. So small that it can be ignored.
>
> I have a patch in queue that simplifies the serialization mechanism and
> reduces the accesses of foreign and global memory as much as possible
> in the parallel codepath. Adding atomic operations to global memory
> (just to collect statistics) to the parallel codepath would go in the
> opposite direction.
>
> Steffen
>

--
W.B.R.
Dan Kruchinin
From: Steffen Klassert on 2 Jul 2010 07:30

On Fri, Jul 02, 2010 at 02:20:15PM +0400, Dan Kruchinin wrote:
> >
> > Yes, I thought about something like this. You can still take the sum
> > over the percpu objects when you output the statistics.
>
> But the summation cannot be accurate without some kind of lock, because
> while we're summing, another CPU can increase or decrease its percpu
> statistic counters. So each percpu statistic counter must be modified
> under a lock, right?
>

Yes, the counters must be accessed under lock. In the fastpath functions
you hold the appropriate lock anyway. Modifying a local percpu value
should not be too painful there. The expensive thing is reading out the
percpu statistics, but this happens on demand and is probably a rare event.

Steffen
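The locking argument in the message above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the padata implementation: struct pqueue, NR_QUEUES and all function names are hypothetical, and a C11 atomic_flag spinlock stands in for whatever lock the fastpath function already holds. The counter update rides along inside a critical section that exists anyway, while the reader takes each queue's lock in turn.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_QUEUES 4 /* hypothetical number of per-CPU queues */

/* Hypothetical per-CPU queue: its own lock plus a local counter. */
struct pqueue {
	atomic_flag lock; /* stand-in for the lock the fast path already holds */
	long reorder_objects;
};

static struct pqueue queues[NR_QUEUES];

static void queues_init(void)
{
	for (int i = 0; i < NR_QUEUES; i++) {
		atomic_flag_clear(&queues[i].lock);
		queues[i].reorder_objects = 0;
	}
}

static void queue_lock(struct pqueue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void queue_unlock(struct pqueue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Fast path: the counter is updated inside the existing critical section. */
static void reorder_enqueue(int cpu)
{
	struct pqueue *q = &queues[cpu];

	queue_lock(q);
	/* ... the actual queue manipulation would happen here ... */
	q->reorder_objects++;
	queue_unlock(q);
}

static void reorder_dequeue(int cpu)
{
	struct pqueue *q = &queues[cpu];

	queue_lock(q);
	q->reorder_objects--;
	queue_unlock(q);
}

/* Slow path, on demand: take each queue's lock in turn and sum. */
static long reorder_objects_total(void)
{
	long sum = 0;

	for (int i = 0; i < NR_QUEUES; i++) {
		queue_lock(&queues[i]);
		sum += queues[i].reorder_objects;
		queue_unlock(&queues[i]);
	}
	return sum;
}
```

Because the locks are taken one at a time, each per-queue value is read consistently, but the total is not a snapshot of all queues at a single instant. Whether that is good enough for a statistics file is a design choice the thread does not settle.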
From: Steffen Klassert on 5 Jul 2010 07:20

On Fri, Jul 02, 2010 at 02:20:15PM +0400, Dan Kruchinin wrote:
> On Fri, Jul 2, 2010 at 1:08 PM, Steffen Klassert
> <steffen.klassert(a)secunet.com> wrote:
> > On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
> > > >
> > > > These statistic counters add a lot of atomic operations to the fast-path.
> > > > Wouldn't it be better to have these statistics in a percpu manner?
> > > > This would avoid the atomic operations and we would get some additional
> > > > information on the distribution of the queued objects.
> > > >
> > >
> > > If I understood you correctly the resulting sysfs hierarchy would look like
> > > this one:
> > > pcrypt/
> > > |- serial_cpumask
> > > |- parallel_cpumask
> > > |- w0/
> > >    +--- parallel_objects
> > >    +--- serial_objects
> > >    +--- reorder_objects
> > > |- w1/
> > >    ...
> > > |- wN/
> > >
> > > right? If so I think it won't be very convenient to monitor summary number
> > > of parallel, serial and reorder objects.
> >
> > Yes, I thought about something like this. You can still take the sum
> > over the percpu objects when you output the statistics.
>
> But the summation cannot be accurate without some kind of lock, because
> while we're summing, another CPU can increase or decrease its percpu
> statistic counters. So each percpu statistic counter must be modified
> under a lock, right?
>

Thinking a bit longer about these statistics, this work should be an
extra patch. We should focus on the cpumask separation now and think
about the statistics later.

Steffen