Prev: perf_events: ctx_flexible_sched_in()
Next: Hello
From: Stephane Eranian on 1 Feb 2010 07:30

Hi,

I believe there is something wrong with ctx_flexible_sched_in(). The function does not allow maximizing PMU usage because of the way can_add_hw is managed. Basically, as soon as one group fails to be scheduled in, no other group can be. I believe this is not optimal. You need to skip the group that fails and keep scanning the list. There may be other groups which can be scheduled.

Here is an example to illustrate the issue:

$ task -ebaclears,div,instructions_retired,fp_assist noploop 5
noploop for 5 seconds
        908 baclears (scaled from 74.97% of time)
          0 div (scaled from 50.01% of time)
11328128990 instructions_retired (scaled from 74.99% of time)
          0 fp_assist (scaled from 50.00% of time)

Here div and fp_assist can only go on counter 1. There is no explicit grouping. On Intel Core, you have 2 generic and 3 fixed counters. instructions_retired can go on a fixed counter. Thus, I was expecting baclears and instructions_retired to always be scheduled, with div and fp_assist alternating at 50% each. While div and fp_assist do alternate as expected, baclears and instructions_retired are not fully utilized: each is counted only about 75% of the time.

Once I modify ctx_flexible_sched_in():

$ ./task -ebaclears,div,instructions_retired,fp_assist noploop 5
noploop for 5 seconds
        658 baclears
          0 div (scaled from 50.01% of time)
11726844342 instructions_retired
          0 fp_assist (scaled from 50.00% of time)

I get the right result. Thus, I think we need to drop can_add_hw from ctx_flexible_sched_in().

Am I missing something in the role of can_add_hw? If not, I will provide a patch to get the optimal behavior.
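To make the difference between the two scan policies concrete, here is a small user-space sketch. It is not the kernel's ctx_flexible_sched_in() code, only a toy model under stated assumptions: the names toy_event, toy_try_sched and toy_simulate are hypothetical, counters are modeled as a bitmask (two generic, one fixed), instructions_retired is pinned to the fixed counter for simplicity, and the flexible list is assumed to rotate by one entry per tick.

```c
#include <stdbool.h>

#define NUM_EVENTS 4

/* Toy counter model (assumption): bits 0 and 1 are the two generic
 * counters, bit 2 stands for the fixed counter that can host
 * instructions_retired. */
enum { GEN0 = 1u << 0, GEN1 = 1u << 1, FIXED0 = 1u << 2 };

struct toy_event {
    const char *name;
    unsigned allowed;   /* bitmask of counters this event may use */
};

/* Same order as on the command line in the example above. */
static const struct toy_event toy_events[NUM_EVENTS] = {
    { "baclears",             GEN0 | GEN1 },
    { "div",                  GEN1 },      /* counter 1 only */
    { "instructions_retired", FIXED0 },    /* simplification: fixed only */
    { "fp_assist",            GEN1 },      /* counter 1 only */
};

/* Place one event on the lowest free counter it allows.
 * Returns false when every allowed counter is already busy. */
bool toy_try_sched(const struct toy_event *ev, unsigned *busy)
{
    unsigned free_mask = ev->allowed & ~*busy;

    if (!free_mask)
        return false;
    *busy |= free_mask & (~free_mask + 1u);   /* isolate lowest set bit */
    return true;
}

/* Run `ticks` scheduling passes, starting one entry further down the
 * list on each tick (assumed round-robin rotation).
 * stop_on_failure = true  models the can_add_hw behavior: after the
 *                         first failure nothing else is scheduled.
 * stop_on_failure = false models the proposed fix: skip the event
 *                         that failed and keep scanning.
 * scheduled[i] receives the number of ticks event i was on a counter. */
void toy_simulate(bool stop_on_failure, int ticks, int scheduled[NUM_EVENTS])
{
    for (int i = 0; i < NUM_EVENTS; i++)
        scheduled[i] = 0;

    for (int t = 0; t < ticks; t++) {
        unsigned busy = 0;
        bool can_add_hw = true;

        for (int k = 0; k < NUM_EVENTS; k++) {
            int idx = (t + k) % NUM_EVENTS;

            if (!can_add_hw)
                break;
            if (toy_try_sched(&toy_events[idx], &busy))
                scheduled[idx]++;
            else if (stop_on_failure)
                can_add_hw = false;
        }
    }
}
```

Over one full rotation (4 ticks), this model schedules baclears, div, instructions_retired and fp_assist for 3, 2, 3 and 2 ticks with stop_on_failure set, and for 4, 2, 4 and 2 ticks without it. That is 75%/50%/75%/50% versus 100%/50%/100%/50%, which is consistent with the scaling percentages reported above, although the real scheduler is more involved than this sketch.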