Prev: [RFC PATCH] net: NETDEV WATCHDOG should print something every time
Next: Dynamic power-save and pm-qos support on 802.11
From: Andrew Morton on 22 Jan 2010 17:30 On Tue, 12 Jan 2010 17:15:50 -0800 Mike Chan <mike(a)android.com> wrote: > Setting ignore_idle to 1 ignores idle time from time_in_state accounting. > > Currently cpufreq stats accounts for idle time time_in_state for each > cpu speed. For cpu's that have a low power idle state this improperly > accounts for time spent at each speed. > > The most relevant case is when the system is idle yet cpu time is still > accounted for at the lowest speed. This results in heavily skewed statistics > (towards the lowest speed) which makes these statistics useless when tuning > cpufreq scaling with cpuidle. > Is there somewhere where this new userspace interface should have been documented? > +static ssize_t store_ignore_idle(struct cpufreq_policy *policy, char *buf) > +{ > + int input; > + if (sscanf(buf, "%d", &input) != 1) > + return -EINVAL; > + > + ignore_idle = input; > + return 1; > +} No bounds checking is needed? This function will accept input of the form "42foo", which is sloppy. Use of strict_strtoul() will fix that. > +static ssize_t show_ignore_idle(struct cpufreq_policy *policy, char *buf) > +{ > + return sprintf(buf, "%d\n", ignore_idle); > +} > + > +CPUFREQ_STATDEVICE_ATTR(total_trans, 0444, show_total_trans, NULL); > +CPUFREQ_STATDEVICE_ATTR(time_in_state, 0444, show_time_in_state, NULL); > +CPUFREQ_STATDEVICE_ATTR(ignore_idle, 0664, show_ignore_idle, store_ignore_idle); > > static struct attribute *default_attrs[] = { > &_attr_total_trans.attr, > @@ -304,6 +323,21 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb, > return 0; > } > > +void cpufreq_exit_idle(int cpu, unsigned long ticks) > +{ > + struct cpufreq_stats *stat; > + stat = per_cpu(cpufreq_stats_table, cpu); > + > + /* Wait until cpu stats is initalized */ > + if (!ignore_idle || !stat || !stat->time_in_state) > + return; > + > + spin_lock(&cpufreq_stats_lock); > + stat->time_in_state[stat->last_index] = > + cputime_sub(stat->time_in_state[stat->last_index], ticks); > + spin_unlock(&cpufreq_stats_lock); > +} cpufreq_stats_lock is not an irq-safe lock, so if cpufreq_exit_idle() gets called from an interrupt then this function is deadlockable. It doesn't _look_ like cpufreq_exit_idle() is presently called from a clock interrupt, but I didn't look too closely. > static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb, > unsigned long action, > void *hcpu) > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h > index 4de02b1..e3f1363 100644 > --- a/include/linux/cpufreq.h > +++ b/include/linux/cpufreq.h > @@ -256,6 +256,11 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver_data); > > void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state); > > +#ifdef CONFIG_CPU_FREQ_STAT > +extern void cpufreq_exit_idle(int cpu, unsigned long ticks); > +#else > +#define cpufreq_exit_idle(int cpu, unsigned long ticks) do {} while (0) This didn't need to be a macro, I think. A static inline provides typechecking and is hence preferred. > +#endif > > static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy, unsigned int min, unsigned int max) > { > diff --git a/kernel/sched.c b/kernel/sched.c > index c535cc4..feef94c 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -54,6 +54,7 @@ > #include <linux/rcupdate.h> > #include <linux/cpu.h> > #include <linux/cpuset.h> > +#include <linux/cpufreq.h> > #include <linux/percpu.h> > #include <linux/kthread.h> > #include <linux/proc_fs.h> > @@ -5187,6 +5188,7 @@ void account_steal_ticks(unsigned long ticks) > */ > void account_idle_ticks(unsigned long ticks) > { > + cpufreq_exit_idle(smp_processor_id(), ticks); > account_idle_time(jiffies_to_cputime(ticks)); > } This only gets called if CONFIG_VIRT_CPU_ACCOUNTING=y, which is mostly a Xen thing. But your changelog didn't describe this being a xen-specific fix so I wonder whether that was intended. Please cc the sched developers on patches of this nature. And the Xen developers if appropriate. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |