From: Vaidyanathan Srinivasan on 22 Mar 2010 12:40

Hi Peter,

This is a repost of the patch at http://lkml.org/lkml/2010/3/2/216

After applying Suresh's fixes from the discussion thread
http://lkml.org/lkml/2010/2/12/352, we still need the attached patch
to restore sched_smt_powersavings=1 functionality, where tasks prefer
sibling threads and keep more cores idle.

Please apply to sched-tip; the patch is rebased and tested on today's
sched-tip master.

With the attached patch, four while(1) loops run on two cores when
sched_smt_power_savings=1. Tested on a two-socket, quad-core,
hyper-threaded system. Additional testing was done on a POWER platform,
where sched_smt_powersavings was able to consolidate tasks on sibling
threads, leaving more cores idle.

Thanks,
Vaidy

---
sched: Fix group_capacity for sched_smt_powersavings=1

sched_smt_powersavings for threaded systems needs this fix for
consolidation to sibling threads to work. Since threads have
fractional capacity, group_capacity will always turn out to be one
and will not accommodate another task in the sibling thread.

This fix makes group_capacity a function of cpumask_weight, which
enables the power saving load balancer to pack tasks among sibling
threads and keep more cores idle.

Signed-off-by: Vaidyanathan Srinivasan <svaidy(a)linux.vnet.ibm.com>

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 5a5ea2c..7c0a29a 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -2538,6 +2538,21 @@ static inline void update_sd_lb_stats(struct sched_domain *sd, int this_cpu,
 		 */
 		if (prefer_sibling)
 			sgs.group_capacity = min(sgs.group_capacity, 1UL);
+		/*
+		 * If power savings balance is set at this domain, then
+		 * make capacity equal to number of hardware threads to
+		 * accommodate more tasks until capacity is reached.
+		 */
+		else if (sd->flags & SD_POWERSAVINGS_BALANCE)
+			sgs.group_capacity =
+				cpumask_weight(sched_group_cpus(group));
+
+		/*
+		 * The default group_capacity is rounded from sum of
+		 * fractional cpu_powers of sibling hardware threads
+		 * in order to enable fair use of available hardware
+		 * resources.
+		 */
 
 		if (local_group) {
 			sds->this_load = sgs.avg_load;
@@ -2863,7 +2878,8 @@ static int need_active_balance(struct sched_domain *sd, int sd_idle, int idle)
 		    !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
 			return 0;
 
-		if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
+		if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP &&
+		    sched_smt_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
 			return 0;
 	}
 
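
The capacity arithmetic described in the changelog can be illustrated with a small userspace sketch. It is not kernel code and not part of the patch: SCHED_LOAD_SCALE = 1024 and a per-thread cpu_power of 589 are assumed values chosen only to show a fractional thread capacity, and the helpers div_round_closest(), group_capacity() and cores_needed() are hypothetical names. The prefer_sibling branch is left out for brevity.

```c
#include <assert.h>

/* Assumed values for illustration only; they are not taken from the patch. */
#define SCHED_LOAD_SCALE 1024UL	/* nominal power of one full CPU */
#define SMT_THREAD_POWER 589UL	/* hypothetical cpu_power of one sibling thread */

/* Round-to-nearest division, standing in for the rounding of summed cpu_power. */
static unsigned long div_round_closest(unsigned long x, unsigned long d)
{
	return (x + d / 2) / d;
}

/*
 * Capacity of one group of nr_threads sibling threads.
 * powersave != 0 models the SD_POWERSAVINGS_BALANCE branch from the patch,
 * where capacity becomes the number of hardware threads in the group.
 */
static unsigned long group_capacity(unsigned long nr_threads, int powersave)
{
	if (powersave)
		return nr_threads;	/* stands in for cpumask_weight() */
	return div_round_closest(nr_threads * SMT_THREAD_POWER, SCHED_LOAD_SCALE);
}

/* Cores occupied when nr_tasks are packed up to the given per-core capacity. */
static unsigned long cores_needed(unsigned long nr_tasks, unsigned long capacity)
{
	assert(capacity > 0);
	return (nr_tasks + capacity - 1) / capacity;
}
```

Under these assumptions, a two-thread core has a summed power of 1178, which rounds to a capacity of one, so four busy tasks spread over four cores. With the power savings branch the capacity is two, and the same four tasks fit on two cores, which matches the test result reported above.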