From: Hitoshi Mitake on 16 Jan 2010 08:10

On 2010-01-13 18:52, Peter Zijlstra wrote:
> On Thu, 2010-01-07 at 19:39 +0900, Hitoshi Mitake wrote:
>> There are a lot of lock instances with the same name (e.g. port_lock).
>> This patch series adds __FILE__ and __LINE__ to lockdep_map,
>> and these will be used when tracing lock events.
>>
>> Example output from perf lock map:
>>
>> | 0xffffea0004c992b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004b112b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004a3f2b8: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffffea0004cd5228: __pte_lockptr(page) (src: include/linux/mm.h, line: 952)
>> | 0xffff8800b91e2b28: &sb->s_type->i_lock_key (src: fs/inode.c, line: 166)
>> | 0xffff8800bb9d7ae0: key (src: kernel/wait.c, line: 16)
>> | 0xffff8800aa07dae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800b07fbae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800b07f3ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
>> | 0xffff8800bf15fae0: &sighand->siglock (src: kernel/fork.c, line: 1490)
>> | 0xffff8800b90f7ae0: &dentry->d_lock (src: fs/dcache.c, line: 944)
>> | ...
>>
>> (This output of perf lock map was produced by my local version;
>> I'll send it later.)
>>
>> And sadly, as Peter Zijlstra predicted, this produces a certain overhead.
>>
>> Before applying this series:
>> | % sudo ./perf lock rec perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 3.834 [sec]
>>
>> After:
>> | % sudo ./perf lock rec perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 5.415 [sec]
>> | [ perf record: Woken up 0 times to write data ]
>> | [ perf record: Captured and wrote 53.512 MB perf.data (~2337993 samples) ]
>>
>> For comparison, a raw run of perf bench sched messaging:
>> | % perf bench sched messaging
>> | # Running sched/messaging benchmark...
>> | # 20 sender and receiver processes per group
>> | # 10 groups == 400 processes run
>> |
>> | Total time: 0.498 [sec]
>>
>> Tracing lock events already produces a fair amount of overhead.
>> I think the overhead added by this series is not a fatal problem,
>> but radical optimization is required...
>
> Right, these patches look OK. For the tracing overhead, you could
> possibly hash the file:line into a u64 and reduce the tracepoint size;
> that should improve the situation, I think, because I seem to remember
> that the only thing that really matters for speed is the size of things.

Thanks for your opinion, Peter. I'll work on reducing the size of
events later. Hashing is a good idea; I think indexing is also a way
to reduce the size.

And I want lockdep_map to have another member: the type of the lock.
For example, a mutex and a spinlock have completely different
acquisition times and attributes, so I want to separate them. If
lockdep_map has a member expressing the type, that will be easy.
What do you think?
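
To make the series' idea concrete, here is a minimal standalone sketch
of capturing __FILE__/__LINE__ at lock initialization time. It is not
the actual patch: the stubbed struct, the __lockdep_init_map() helper
and its field names are illustrative only (the real lockdep_init_map()
also takes a subclass argument, omitted here).

	struct lock_class_key { };	/* stub for the sketch */

	struct lockdep_map {
		struct lock_class_key	*key;
		const char		*name;
		/* hypothetical new members in the spirit of this series: */
		const char		*file;	/* __FILE__ at the init site */
		unsigned int		line;	/* __LINE__ at the init site */
	};

	static inline void __lockdep_init_map(struct lockdep_map *map,
			const char *name, struct lock_class_key *key,
			const char *file, unsigned int line)
	{
		map->key  = key;
		map->name = name;
		map->file = file;
		map->line = line;
	}

	/* The wrapper captures the caller's location automatically,
	 * so every lock init site records where it lives: */
	#define lockdep_init_map(map, name, key) \
		__lockdep_init_map(map, name, key, __FILE__, __LINE__)

Because the macro expands at each initialization site, locks that share
a name (like the many &dentry->d_lock instances above) still get a
distinguishing file:line pair.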
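
Peter's size-reduction suggestion could look something like the
fragment below. FNV-1a is used purely as an example hash (nothing in
the thread fixes the choice), and the function name is made up; the
point is that the tracepoint then carries 8 bytes instead of a string
plus an int.

	#include <stdint.h>

	/* Fold file:line into a single 64-bit value for the tracepoint. */
	static uint64_t lock_src_hash(const char *file, unsigned int line)
	{
		uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */

		for (; *file; file++) {
			h ^= (unsigned char)*file;
			h *= 0x100000001b3ULL;		/* FNV-1a prime */
		}
		h ^= line;
		h *= 0x100000001b3ULL;
		return h;
	}

Since init sites are static, the hash could be computed once at
initialization rather than per event, and perf would keep a
hash -> file:line table on the userspace side to resolve values at
report time.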
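
The "type of lock" member proposed at the end might be as simple as an
enum tag on lockdep_map. No such field exists in lockdep today and the
enumerator names below are hypothetical; the sketch only shows how a
tag would let perf lock report mutex and spinlock statistics
separately instead of mixing their hold times.

	enum lockdep_lock_type {
		LOCKDEP_TYPE_SPINLOCK,
		LOCKDEP_TYPE_RWLOCK,
		LOCKDEP_TYPE_MUTEX,
		LOCKDEP_TYPE_RWSEM,
	};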