Subject: Re: [PATCH 1/2] ftrace: record command lines at more appropriate moment
From: Li Zefan
Date: 27 Jul 2010 23:00

Ian Munsie wrote:
> From: Ian Munsie <imunsie(a)au1.ibm.com>
>
> Previously, when tracing was activated through debugfs, regardless of
> which tracing plugin (if any) was activated, the probe_sched_switch and
> probe_sched_wakeup probes from the sched_switch plugin would be
> activated. This appears to have been a hack to use them to record the
> command lines of active processes as they were scheduled.
>
> That approach would suffer if many processes that were not generating
> events were being scheduled, as they would consume entries in the
> saved_cmdlines buffer that could otherwise have been used by processes
> that were actually generating events.
>
> It also had the problem that events could be mis-attributed: in the
> common situation of a process forking and then execing a new process,
> the change of the process command would not be noticed for some time
> after the exec, until the process was next scheduled.
>
> If the trace was read after the fact, this would generally go
> unnoticed, because at some point the process would be scheduled and the
> entry in the saved_cmdlines buffer would be updated, so that the new
> command would be reported when the trace was eventually read. However,
> if the events were being read live (e.g. through trace_pipe), the
> events just after the exec and before the process was next scheduled
> would show the incorrect command (though the PID would be correct).
>
> This patch removes the sched_switch hack altogether and instead records
> the commands at a more appropriate moment: at the same time the PID of
> the process is recorded (i.e. when an entry on the ring buffer is
> reserved). This means that the recorded command line is much more
> likely to be correct when the trace is read, either live or after the
> fact, so long as the command line still resides in the saved_cmdlines
> buffer.
>
> It is still not guaranteed to be correct in all situations. For
> instance, if the trace is read after the fact rather than live
> (consider events generated by a process before an exec: in the example
> below they would be attributed to sleep rather than stealpid, since the
> entry in saved_cmdlines would have changed before the event was read).
> But this is no different from the current situation, and the
> alternative would be to store the command line with each and every
> event.
> ....
>
> Signed-off-by: Ian Munsie <imunsie(a)au1.ibm.com>
> ---
>  kernel/trace/trace.c                 |    3 +--
>  kernel/trace/trace_events.c          |   11 -----------
>  kernel/trace/trace_functions.c       |    2 --
>  kernel/trace/trace_functions_graph.c |    2 --
>  kernel/trace/trace_sched_switch.c    |   10 ----------
>  5 files changed, 1 insertions(+), 27 deletions(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 4b1122d..f8458c3 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1023,8 +1023,6 @@ void tracing_stop(void)
>  	spin_unlock_irqrestore(&tracing_start_lock, flags);
>  }
>
> -void trace_stop_cmdline_recording(void);
> -
>  static void trace_save_cmdline(struct task_struct *tsk)
>  {
>  	unsigned pid, idx;
> @@ -1112,6 +1110,7 @@ tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
>  {
>  	struct task_struct *tsk = current;
>
> +	tracing_record_cmdline(tsk);

Now this function is called every time a tracepoint is triggered, so did
you run any benchmarks to see whether performance improved or got worse?
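(For a rough sense of the per-event cost in question, here is a minimal
userspace model of the saved_cmdlines bookkeeping that each call would
perform. This is a sketch only: the constants, the saved_pids array and
save_cmdline() are illustrative stand-ins modelled loosely on
kernel/trace/trace.c, and the arch_spin_trylock() the kernel takes
around this path is omitted.)

        #include <stdio.h>
        #include <string.h>

        #define TASK_COMM_LEN   16
        #define PID_MAX         32768
        #define SAVED_CMDLINES  128     /* fixed-size pool shared by all PIDs */
        #define NO_CMDLINE_MAP  -1

        static int  map_pid_to_cmdline[PID_MAX]; /* pid -> slot, or NO_CMDLINE_MAP */
        static int  saved_pids[SAVED_CMDLINES];  /* slot -> pid, for eviction */
        static char saved_cmdlines[SAVED_CMDLINES][TASK_COMM_LEN];
        static int  cmdline_idx;

        /* Work done per recorded event under the patch: one map lookup plus
         * a 16-byte copy, possibly stealing a slot from some other pid. */
        static void save_cmdline(int pid, const char *comm)
        {
                int idx = map_pid_to_cmdline[pid];

                if (idx == NO_CMDLINE_MAP) {
                        idx = cmdline_idx = (cmdline_idx + 1) % SAVED_CMDLINES;
                        if (saved_pids[idx])    /* evict the slot's old owner */
                                map_pid_to_cmdline[saved_pids[idx]] = NO_CMDLINE_MAP;
                        saved_pids[idx] = pid;
                        map_pid_to_cmdline[pid] = idx;
                }
                strncpy(saved_cmdlines[idx], comm, TASK_COMM_LEN - 1);
        }

        int main(void)
        {
                memset(map_pid_to_cmdline, -1, sizeof(map_pid_to_cmdline));

                save_cmdline(100, "make");
                for (int pid = 200; pid < 200 + SAVED_CMDLINES; pid++)
                        save_cmdline(pid, "cc1"); /* churn from other tasks */

                if (map_pid_to_cmdline[100] == NO_CMDLINE_MAP)
                        printf("pid 100 evicted by the churn\n");
                return 0;
        }

The same model also shows the eviction behaviour the changelog describes:
once enough other PIDs pass through the pool, an earlier entry is silently
recycled.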
Another problem with this patch: tracing_generic_entry_update() is also
called by perf, but cmdline recording is not needed in perf.

>  	entry->preempt_count	= pc & 0xff;
>  	entry->pid		= (tsk) ? tsk->pid : 0;
>  	entry->lock_depth	= (tsk) ? tsk->lock_depth : 0;
> ....
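To illustrate the perf point above: one possible shape would be a bare
helper that only fills in the entry, plus a wrapper for the ftrace path
that also records the cmdline, with perf calling only the former. This is
purely illustrative; the split and every name below except
tracing_generic_entry_update() are made up for the sketch, modelled in
userspace C:

        #include <stdio.h>

        struct trace_entry { int pid; int preempt_count; int lock_depth; };
        struct task { int pid; int lock_depth; char comm[16]; };

        /* Stand-in for tracing_record_cmdline(): cache pid -> comm. */
        static void record_cmdline(struct task *tsk)
        {
                printf("cache cmdline: pid %d -> %s\n", tsk->pid, tsk->comm);
        }

        /* Common part, usable by both ftrace and perf. */
        static void generic_entry_update(struct trace_entry *entry,
                                         struct task *tsk, int pc)
        {
                entry->preempt_count = pc & 0xff;
                entry->pid           = tsk ? tsk->pid : 0;
                entry->lock_depth    = tsk ? tsk->lock_depth : 0;
        }

        /* ftrace wrapper: the only caller that wants the cmdline cached. */
        static void ftrace_entry_update(struct trace_entry *entry,
                                        struct task *tsk, int pc)
        {
                record_cmdline(tsk);
                generic_entry_update(entry, tsk, pc);
        }

        int main(void)
        {
                struct task t = { .pid = 42, .lock_depth = -1, .comm = "bash" };
                struct trace_entry e;

                ftrace_entry_update(&e, &t, 0);   /* ftrace path: records comm */
                generic_entry_update(&e, &t, 0);  /* perf path: skips recording */
                return 0;
        }

A conditional inside tracing_generic_entry_update() itself would work
too; the point is only that perf's callers should neither pay for nor
pollute the cmdline cache.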