Prev: [PATCH] time/fs - file's time race with vgettimeofday
Next: [PATCH v2] KVM: IOAPIC: only access APIC registers one dword at a time
From: Peter Zijlstra on 2 Jul 2010 04:10 On Thu, 2010-07-01 at 16:12 -0700, Venkatesh Pallipadi wrote: > commit 871e35b moved update_group_power() call in update_sg_lb_stats(), > resulting in it being called for each group, even though it only updates > the power of local group. As a result we have frequent redundant > update_group_power() calls. > > Move it back under "if (local_group)" condition. > > This reduces the number of calls to update_group_power by a factor of 4 > on my 4 core in 4 NUMA nodes test system. Hrm,.. so Gautham removed that because for things like the NO_HZ balancer the initial balance_cpu == this_cpu constraint doesn't hold. Not I don't think the local_group constraint holds for that either, so the below would again break that.. Should we perhaps have a conditional on this_rq->nohz_balance_kick or so? > --- a/kernel/sched_fair.c > +++ b/kernel/sched_fair.c > @@ -2359,8 +2359,11 @@ static inline void update_sg_lb_stats(struct sched_domain *sd, > unsigned int balance_cpu = -1, first_idle_cpu = 0; > unsigned long avg_load_per_task = 0; > > - if (local_group) > + if (local_group) { > balance_cpu = group_first_cpu(group); > + update_group_power(sd, this_cpu); > + } > + > > /* Tally up the load of all CPUs in the group */ > max_cpu_load = 0; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Venkatesh Pallipadi on 2 Jul 2010 12:30 On Fri, Jul 2, 2010 at 1:05 AM, Peter Zijlstra <peterz(a)infradead.org> wrote: > On Thu, 2010-07-01 at 16:12 -0700, Venkatesh Pallipadi wrote: >> commit 871e35b moved update_group_power() call in update_sg_lb_stats(), >> resulting in it being called for each group, even though it only updates >> the power of local group. As a result we have frequent redundant >> update_group_power() calls. >> >> Move it back under "if (local_group)" condition. >> >> This reduces the number of calls to update_group_power by a factor of 4 >> on my 4 core in 4 NUMA nodes test system. > > Hrm,.. so Gautham removed that because for things like the NO_HZ > balancer the initial balance_cpu == this_cpu constraint doesn't hold. > > Not I don't think the local_group constraint holds for that either, so > the below would again break that.. > > Should we perhaps have a conditional on this_rq->nohz_balance_kick or > so? The thing is that update_group_power is only updating the power of local group (sd->groups). It is getting called multiple times however for each group as update_sd_lb_stats loops through groups->next calling update_sg_lb_stats. If we really want to update the power of non-local groups, update_cpu_power has to change to take a groups parameter and non this_cpu as arguments and may have to access non-local rq etc. Thanks, Venki > >> --- a/kernel/sched_fair.c >> +++ b/kernel/sched_fair.c >> @@ -2359,8 +2359,11 @@ static inline void update_sg_lb_stats(struct sched_domain *sd, >> � � � unsigned int balance_cpu = -1, first_idle_cpu = 0; >> � � � unsigned long avg_load_per_task = 0; >> >> - � � if (local_group) >> + � � if (local_group) { >> � � � � � � � balance_cpu = group_first_cpu(group); >> + � � � � � � update_group_power(sd, this_cpu); >> + � � } >> + >> >> � � � /* Tally up the load of all CPUs in the group */ >> � � � max_cpu_load = 0; > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on 2 Jul 2010 12:50 On Fri, 2010-07-02 at 09:20 -0700, Venkatesh Pallipadi wrote: > > Hrm,.. so Gautham removed that because for things like the NO_HZ > > balancer the initial balance_cpu == this_cpu constraint doesn't hold. > > > > Not I don't think the local_group constraint holds for that either, so > > the below would again break that.. > > > > Should we perhaps have a conditional on this_rq->nohz_balance_kick or > > so? > > The thing is that update_group_power is only updating the power of > local group (sd->groups). Not quite, see nohz_idle_balance(), that iterates idle_cpus_mask, and calls rebalance_domains(balance_cpu, CPU_IDLE), which then does for_each_domain(balance_cpu, sd) So sd need not be local at all, and sd->group will be the group of which balance_cpu is part. > It is getting called multiple times however for each group as > update_sd_lb_stats loops > through groups->next calling update_sg_lb_stats. Sure I see how that's happening and why you would want to avoid that, no argument there. > If we really want to update the power of non-local groups, > update_cpu_power has to change > to take a groups parameter and non this_cpu as arguments and may have > to access non-local > rq etc. No, see above. All we need is to somehow allow nohz_idle_balance() to update cpu_power as well. So I think we want something like: if (local_group) { balance_cpu = group_first_cpu(group); if (balance_cpu == this_cpu || nohz_balance) update_group_power(sd, this_cpu); } Or am I totally missing something here? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Venkatesh Pallipadi on 2 Jul 2010 13:00 My fault. I completely missed that code path, assuming this_cpu in load_balance means this_cpu :( How about the change below instead? Signed-off-by: Venkatesh Pallipadi <venki(a)google.com> --- kernel/sched_fair.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index a878b53..97f4534 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -2406,8 +2406,6 @@ static inline void update_sg_lb_stats(struct sched_domain *sd, return; } - update_group_power(sd, this_cpu); - /* Adjust by relative CPU power of the group */ sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; @@ -2456,6 +2454,8 @@ static inline void update_sd_lb_stats(struct sched_domain *sd, int this_cpu, init_sd_power_savings_stats(sd, sds, idle); load_idx = get_sd_load_idx(sd, idle); + update_group_power(sd, this_cpu); + do { int local_group; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on 2 Jul 2010 13:40
On Fri, 2010-07-02 at 09:56 -0700, Venkatesh Pallipadi wrote: > How about the change below instead? > I think I've now managed to confuse myself too.. will try and reset my brain and have a second look in a bit.. ;-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |