From: Stephane Eranian on 17 May 2010 10:30

On Tue, May 11, 2010 at 4:48 PM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> On Tue, 2010-05-11 at 16:04 +0200, Stephane Eranian wrote:
>> Hi,
>>
>>
>> I am confused by the inheritance cmd line option of perf record:
>>
>> $ perf record -h
>>  usage: perf record [<options>] [<command>]
>>     or: perf record [<options>] -- <command> [<options>]
>>
>>     -e, --event <event>   event selector. use 'perf list' to list
>>                           available events
>>         --filter <filter>
>>                           event filter
>>     -p, --pid <n>         record events on existing process id
>>     -t, --tid <n>         record events on existing thread id
>>     -r, --realtime <n>    collect data with this RT SCHED_FIFO priority
>>     -R, --raw-samples     collect raw sample records from all opened counters
>>     -a, --all-cpus        system-wide collection from all CPUs
>>     -A, --append          append to the output file to do incremental profiling
>>     -C, --profile_cpu <n>
>>                           CPU to profile on
>>     -f, --force           overwrite existing data file (deprecated)
>>     -c, --count           event period to sample
>>     -o, --output <file>   output file name
>>     -i, --inherit         child tasks inherit counters
>>
>> This leads to believe that by default inheritance in children is off.
>>
>> However, builtin-record.c says:
>>
>> static bool inherit = true;
>>
>> If that's the case, what's the point of the -i option?
>
> Right, I think we should invert that, does --no-inherit work?
>
>> Another side effect of inheritance is that in per-thread mode,
>> perf creates as many "sessions" as you have CPUs. So
>> on a 16-way processor, sampling on cycles, perf creates
>> 16 events and 16 x 2-page sampling buffers. That's a lot of
>> resources consumed if I am just interested in monitoring
>> a single-threaded workload.
>
> Right, but I think the default of inherit is right, and once you do that
> you basically have to do the per-task-per-cpu thing, otherwise your
> fancy 16-way will start spending most of its time in cacheline bounces.
>
In that case, don't you think you should also ensure that the buffer is
allocated on the NUMA node of the designated per-thread-per-cpu?
I don't think it is the case today.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
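
For a rough sense of the resource cost described above, here is a minimal sketch of the arithmetic in C. It assumes each per-CPU buffer carries one control page (the `user_page` in the kernel's mmap data) on top of its data pages, and that pages are 4 KiB; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical summary of what a per-thread, per-CPU session costs. */
struct perf_buf_cost {
    size_t events; /* one event opened per CPU */
    size_t pages;  /* control page + data pages, summed over all CPUs */
    size_t bytes;  /* pages * page_size */
};

/*
 * Assumption: every event gets its own buffer made of 1 control page
 * plus `data_pages` sampling pages. Nothing here talks to the kernel;
 * it only reproduces the multiplication from the message above.
 */
static struct perf_buf_cost
per_cpu_buffer_cost(size_t nr_cpus, size_t data_pages, size_t page_size)
{
    struct perf_buf_cost cost;

    cost.events = nr_cpus;
    cost.pages = nr_cpus * (1 + data_pages);
    cost.bytes = cost.pages * page_size;
    return cost;
}
```

Calling `per_cpu_buffer_cost(16, 2, 4096)` models the 16-way case: 16 events, and 48 pages once the assumed control page per buffer is counted alongside the 16 x 2 data pages.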
From: Peter Zijlstra on 17 May 2010 12:50

On Mon, 2010-05-17 at 16:25 +0200, Stephane Eranian wrote:
> > Right, but I think the default of inherit is right, and once you do that
> > you basically have to do the per-task-per-cpu thing, otherwise your
> > fancy 16-way will start spending most of its time in cacheline bounces.
> >
> In that case, don't you think you should also ensure that the buffer is
> allocated on the NUMA node of the designated per-thread-per-cpu?
> I don't think it is the case today.

Yeah, something like the below ought to do I guess..

Almost-Signed-off-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
---
 kernel/perf_event.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 9dbe8cd..85e2d32 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2288,6 +2288,19 @@ perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff)
 	return virt_to_page(data->data_pages[pgoff - 1]);
 }
 
+static void *perf_mmap_alloc_page(int cpu)
+{
+	struct page *page;
+	int node;
+
+	node = (cpu == -1) ? cpu : cpu_to_node(cpu);
+	page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static struct perf_mmap_data *
 perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
 {
@@ -2304,12 +2317,12 @@ perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
 	if (!data)
 		goto fail;
 
-	data->user_page = (void *)get_zeroed_page(GFP_KERNEL);
+	data->user_page = perf_mmap_alloc_page(event->cpu);
 	if (!data->user_page)
 		goto fail_user_page;
 
 	for (i = 0; i < nr_pages; i++) {
-		data->data_pages[i] = (void *)get_zeroed_page(GFP_KERNEL);
+		data->data_pages[i] = perf_mmap_alloc_page(event->cpu);
 		if (!data->data_pages[i])
 			goto fail_data_pages;
 	}
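
The node selection in `perf_mmap_alloc_page()` can be modelled in user space to make its two cases explicit. This is only a sketch: `model_cpu_to_node()` and `buffer_node_for_event()` are hypothetical stand-ins, and the CPU-to-node table is supplied by the caller rather than read from real topology. The idea is that an event not bound to a CPU (`cpu == -1`) passes -1 through as the node, which, as far as I can tell, the page allocator treats as "no preference", while a CPU-bound event gets the node that CPU belongs to.

```c
#include <stddef.h>

#define NO_NODE (-1) /* stand-in for "no node preference" */

/* Hypothetical user-space stand-in for the kernel's cpu_to_node(). */
static int model_cpu_to_node(const int *node_of_cpu, size_t nr_cpus, int cpu)
{
    if (cpu < 0 || (size_t)cpu >= nr_cpus)
        return NO_NODE; /* out-of-range CPU: no preference in this model */
    return node_of_cpu[cpu];
}

/*
 * Mirrors the ternary in the patch: an event with cpu == -1 keeps -1
 * as its node, any other event uses the node of the CPU it is bound to.
 */
static int buffer_node_for_event(const int *node_of_cpu, size_t nr_cpus,
                                 int event_cpu)
{
    return (event_cpu == -1)
        ? NO_NODE
        : model_cpu_to_node(node_of_cpu, nr_cpus, event_cpu);
}
```

With a made-up table such as `{0, 0, 1, 1}` (four CPUs on two nodes), an event bound to CPU 2 would have its buffer pages requested from node 1, and an unbound event would express no preference.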