From: Koki Sanagi on
(2010/07/21 20:14), Neil Horman wrote:
> On Wed, Jul 21, 2010 at 03:57:05PM +0900, Koki Sanagi wrote:
>> (2010/07/20 20:04), Neil Horman wrote:
>>> On Tue, Jul 20, 2010 at 09:45:31AM +0900, Koki Sanagi wrote:
>>>> From: Lai Jiangshan <laijs(a)cn.fujitsu.com>
>>>>
>>>> Add a tracepoint for tracing when softirq action is raised.
>>>>
>>>> It and the existed tracepoints complete softirq's tracepoints:
>>>> softirq_raise, softirq_entry and softirq_exit.
>>>>
>>>> And when this tracepoint is used in combination with
>>>> the softirq_entry tracepoint we can determine
>>>> the softirq raise latency.
>>>>
>>>> Signed-off-by: Lai Jiangshan <laijs(a)cn.fujitsu.com>
>>>> Acked-by: Mathieu Desnoyers <mathieu.desnoyers(a)polymtl.ca>
>>>> Acked-by: Frederic Weisbecker <fweisbec(a)gmail.com>
>>>>
>>>> [ factorize softirq events with DECLARE_EVENT_CLASS ]
>>>> Signed-off-by: Koki Sanagi <sanagi.koki(a)jp.fujitsu.com>
>>>> ---
>>>> include/linux/interrupt.h | 8 +++++-
>>>> include/trace/events/irq.h | 57 ++++++++++++++++++++++++++-----------------
>>>> kernel/softirq.c | 4 +-
>>>> 3 files changed, 43 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
>>>> index c233113..1cb5726 100644
>>>> --- a/include/linux/interrupt.h
>>>> +++ b/include/linux/interrupt.h
>>>> @@ -18,6 +18,7 @@
>>>> #include <asm/atomic.h>
>>>> #include <asm/ptrace.h>
>>>> #include <asm/system.h>
>>>> +#include <trace/events/irq.h>
>>>>
>>>> /*
>>>> * These correspond to the IORESOURCE_IRQ_* defines in
>>>> @@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
>>>> asmlinkage void __do_softirq(void);
>>>> extern void open_softirq(int nr, void (*action)(struct softirq_action *));
>>>> extern void softirq_init(void);
>>>> -#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
>>>> +static inline void __raise_softirq_irqoff(unsigned int nr)
>>>> +{
>>>> + trace_softirq_raise(nr);
>>>> + or_softirq_pending(1UL << nr);
>>>> +}
>>>> +
>>> We already have tracepoints in irq_enter and irq_exit. If the goal here is to
>>> detect latency during packet processing, cant the delta in time between those
>>> two points be used to determine interrupt handling latency?
>>
>> Certainly, the time between irq_entry and irq_exit is not directly related to
>> latency during packet processing. But it's indirectly related it.
>> Because softirq_entry isn't passed until irq exits and softirq_entry time is
>> related to packet processing latency. So I show it as a reference.
>>
> Its not directly related no, but look at it, the amount of processing between
> irq_exit and softirq_entry is minimal. The information you are trying to
> extract by computing the delta from irq_entry to softirq_entry is almost exactly
> the same as that from irq_entry to irq_exit. For that matter, since you're
> trying to guage lantency for packet processing, I expect you could get the same
> delta by measuring irq_entry to napi_poll tracepoint time, and save the hassle
> of needing to filter on softirq processing that doesn't relate to packet
> processing.

Yeah, to determine interrput latency, we need either one irq_exit or
softirq_entry, not both.
And I think softirq_entry should be left because there is a possibility that
softirq isn't triggered immidiately after irq_exit.
softirq_exit isn't needed because it is not related to packet processing.
softirq_raise is needed because it connects irq_entry and softirq_entry but
there is no need to show it. Currently, my idea is like the following.

irq_entry(+0.000000msec,irq=77:eth3)
|
softirq_entry(+0.003562msec)
|
|---netif_receive_skb(+0.006279msec,len=100)
| |
| skb_copy_datagram_iovec(+0.038778msec, 2285:sshd)
|
napi_poll_exit(+0.017160msec, eth3)

>
>>>
>>>
>>>> extern void raise_softirq_irqoff(unsigned int nr);
>>>> extern void raise_softirq(unsigned int nr);
>>>> extern void wakeup_softirqd(void);
>>>> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
>>>> index 0e4cfb6..717744c 100644
>>>> --- a/include/trace/events/irq.h
>>>> +++ b/include/trace/events/irq.h
>>>> @@ -5,7 +5,9 @@
>>>> #define _TRACE_IRQ_H
>>>>
>>>> #include <linux/tracepoint.h>
>>>> -#include <linux/interrupt.h>
>>>> +
>>>> +struct irqaction;
>>>> +struct softirq_action;
>>>>
>>>> #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
>>>> #define show_softirq_name(val) \
>>>> @@ -84,56 +86,65 @@ TRACE_EVENT(irq_handler_exit,
>>>>
>>>> DECLARE_EVENT_CLASS(softirq,
>>>>
>>>> - TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
>>>> + TP_PROTO(unsigned int nr),
>>>>
>>>> - TP_ARGS(h, vec),
>>>> + TP_ARGS(nr),
>>>>
>>>> TP_STRUCT__entry(
>>>> - __field( int, vec )
>>>> + __field( unsigned int, vec )
>>>> ),
>>>>
>>>> TP_fast_assign(
>>>> - __entry->vec = (int)(h - vec);
>>>> + __entry->vec = nr;
>>>> ),
>>>>
>>>> TP_printk("vec=%d [action=%s]", __entry->vec,
>>>> - show_softirq_name(__entry->vec))
>>>> + show_softirq_name(__entry->vec))
>>>> +);
>>>> +
>>>> +/**
>>>> + * softirq_raise - called immediately when a softirq is raised
>>>> + * @nr: softirq vector number
>>>> + *
>>>> + * Tracepoint for tracing when softirq action is raised.
>>>> + * Also, when used in combination with the softirq_entry tracepoint
>>>> + * we can determine the softirq raise latency.
>>>> + */
>>>> +DEFINE_EVENT(softirq, softirq_raise,
>>>> +
>>>> + TP_PROTO(unsigned int nr),
>>>> +
>>>> + TP_ARGS(nr)
>>>> );
>>>>
>>>> /**
>>>> * softirq_entry - called immediately before the softirq handler
>>>> - * @h: pointer to struct softirq_action
>>>> - * @vec: pointer to first struct softirq_action in softirq_vec array
>>>> + * @nr: softirq vector number
>>>> *
>>>> - * The @h parameter, contains a pointer to the struct softirq_action
>>>> - * which has a pointer to the action handler that is called. By subtracting
>>>> - * the @vec pointer from the @h pointer, we can determine the softirq
>>>> - * number. Also, when used in combination with the softirq_exit tracepoint
>>>> + * Tracepoint for tracing when softirq action starts.
>>>> + * Also, when used in combination with the softirq_exit tracepoint
>>>> * we can determine the softirq latency.
>>>> */
>>>> DEFINE_EVENT(softirq, softirq_entry,
>>>>
>>>> - TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
>>>> + TP_PROTO(unsigned int nr),
>>>>
>>>> - TP_ARGS(h, vec)
>>>> + TP_ARGS(nr)
>>>> );
>>>>
>>>> /**
>>>> * softirq_exit - called immediately after the softirq handler returns
>>>> - * @h: pointer to struct softirq_action
>>>> - * @vec: pointer to first struct softirq_action in softirq_vec array
>>>> + * @nr: softirq vector number
>>>> *
>>>> - * The @h parameter contains a pointer to the struct softirq_action
>>>> - * that has handled the softirq. By subtracting the @vec pointer from
>>>> - * the @h pointer, we can determine the softirq number. Also, when used in
>>>> - * combination with the softirq_entry tracepoint we can determine the softirq
>>>> - * latency.
>>>> + * Tracepoint for tracing when softirq action ends.
>>>> + * Also, when used in combination with the softirq_entry tracepoint
>>>> + * we can determine the softirq latency.
>>>> */
>>>> DEFINE_EVENT(softirq, softirq_exit,
>>>>
>>>> - TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
>>>> + TP_PROTO(unsigned int nr),
>>>>
>>>> - TP_ARGS(h, vec)
>>>> + TP_ARGS(nr)
>>>> );
>>>>
>>>> #endif /* _TRACE_IRQ_H */
>>>> diff --git a/kernel/softirq.c b/kernel/softirq.c
>>>> index 825e112..6790599 100644
>>>> --- a/kernel/softirq.c
>>>> +++ b/kernel/softirq.c
>>>> @@ -215,9 +215,9 @@ restart:
>>>> int prev_count = preempt_count();
>>>> kstat_incr_softirqs_this_cpu(h - softirq_vec);
>>>>
>>>> - trace_softirq_entry(h, softirq_vec);
>>>> + trace_softirq_entry(h - softirq_vec);
>>>> h->action(h);
>>>> - trace_softirq_exit(h, softirq_vec);
>>>> + trace_softirq_exit(h - softirq_vec);
>>>
>>> You're loosing information here by reducing the numbers of parameters in this
>>> tracepoint. How many other tracepoint scripts rely on having both pointers
>>> handy? Why not just do the pointer math inside your tracehook instead?
>>
>> In __raise_softirq_irqoff macro there is no method to refer softirq_vec, so it
>> can't use softirq DECLARE_EVENT_CLASS as is.
>> Currently, there is no script using softirq_entry or softirq_exit.
>>
> That shouldn't matter, just pass in NULL for softirq_vec in
> __raise_softirq_irqoff as the second argument to the trace function. You may
> need to fix up the class definition so that the assignment or printk doesn't try
> to dereference that pointer when its NULL, but thats easy enough, and it avoids
> breaking any other perf scripts floating out there.
> Neil
>
>> Thanks,
>> Koki Sanagi.
>>
>>>
>>>> if (unlikely(prev_count != preempt_count())) {
>>>> printk(KERN_ERR "huh, entered softirq %td %s %p"
>>>> "with preempt_count %08x,"
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo(a)vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>>
>>
>>
>>
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KOSAKI Motohiro on
> On Wed, Jul 21, 2010 at 10:01:34PM +0900, KOSAKI Motohiro wrote:
> > > > >> #endif /* _TRACE_IRQ_H */
> > > > >> diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > >> index 825e112..6790599 100644
> > > > >> --- a/kernel/softirq.c
> > > > >> +++ b/kernel/softirq.c
> > > > >> @@ -215,9 +215,9 @@ restart:
> > > > >> int prev_count = preempt_count();
> > > > >> kstat_incr_softirqs_this_cpu(h - softirq_vec);
> > > > >>
> > > > >> - trace_softirq_entry(h, softirq_vec);
> > > > >> + trace_softirq_entry(h - softirq_vec);
> > > > >> h->action(h);
> > > > >> - trace_softirq_exit(h, softirq_vec);
> > > > >> + trace_softirq_exit(h - softirq_vec);
> > > > >
> > > > > You're loosing information here by reducing the numbers of parameters in this
> > > > > tracepoint. How many other tracepoint scripts rely on having both pointers
> > > > > handy? Why not just do the pointer math inside your tracehook instead?
> > > >
> > > > In __raise_softirq_irqoff macro there is no method to refer softirq_vec, so it
> > > > can't use softirq DECLARE_EVENT_CLASS as is.
> > > > Currently, there is no script using softirq_entry or softirq_exit.
> > > >
> > > That shouldn't matter, just pass in NULL for softirq_vec in
> > > __raise_softirq_irqoff as the second argument to the trace function. You may
> > > need to fix up the class definition so that the assignment or printk doesn't try
> > > to dereference that pointer when its NULL, but thats easy enough, and it avoids
> > > breaking any other perf scripts floating out there.
> >
> > please see 5 lines above. we already have 'h - softirq_vec' calculation in
> > this function. so, Sanagi-san's change don't makes any overhead.
> >
> > So, if the overhead is zero, I'd prefer simplest tracepoint definition :)
> >
> I never complained about performance here, I complained about information loss.
> You have a tracepoint that provides two arguments here, and you're eliminating
> one of them. That will potentially break other users of this tracepoint. I
> understand we don't normally care about that with tracepoints as much, but if we
> can avoid it, why don't we?

I see. I have no objection.

Thanks.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/