From: Venkatesh Pallipadi on 25 May 2010 17:50

On Tue, May 25, 2010 at 2:13 AM, Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
> * Venkatesh Pallipadi <venki(a)google.com> [2010-05-24 17:11:19]:
>
>> Currently, kernel does not have accounting mechanism for softirq and hardirq
>> times at the task level. There is irq time info in kstat_cpu which is
>> accumulated at the cpu level.
>>
>> Without the task level information, the non irq run time of task(s) would
>> have to be guessed based on their exec time and CPU on which they were
>> running recently and assuming that the CPU irq time reported are spread
>> across all the tasks running there. And this guess can be widely off the mark.
>>
>> Sample case, considering just the softirq:
>>
>> If there are varied workloads running on a CPU, say a CPU bound task (loop)
>> and a network IO bound task (nc) along with the network softirq load,
>> there is no way for the administrator/user to know the non-irq runtime of each
>> of these tasks. Only information available is the total runtime for each of the
>> tasks and kstat_cpu softirq time for the CPU.
>>
>> In this example, considering a 10 second sample, both loop and nc would have
>> total run time of ~5s. And kstat_cpu softirq on this cpu increase was
>> 355 (~3.5s).
>>
>> So, all the information the user gets is that both the tasks are running for
>> roughly the same amount of time and softirq is around 35%. As a result user
>> may conclude that irq overhead for both tasks are equal (1.75s) and the
>> non-irq runtime of both the tasks are around ~3.25s. Yes. There is another
>> factor of system and user time reported for these tasks that I am ignoring
>> as that is tough to correlate with irq time, in cases where the tasks have
>> significant non-irq system time.
>>
>> This change adds tracking of softirq time on each task and task group.
>> This information is exported in /proc/<pid>/stat.
>>
>> So, the user can get info like below, looking at exec_time and si_time in
>> appropriate /proc/<pid>/stat.
>> (Taken for a 10s interval)
>>
>>   task:          (loop)                     (nc)
>>           exec_time  softirqtime     exec_time  softirqtime     (in USER_HZ)
>>              505          0             500         359
>>              502          1             501         363
>>              503          0             502         354
>>              504          0             499         359
>>              503          3             500         360
>>
>> with this, user can get the non-irq run time as 5s and ~1.45s for
>> loop and nc, respectively.
>
> Have you noticed any overheads after these changes? Otherwise, the
> changes look correct to me.
>

Haven't noticed any significant overhead yet. But, I am yet to run this on
wide array of systems. Will post the data once I have them.

Thanks,
Venki
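
For readers following the arithmetic in the quoted description, here is a minimal sketch (not part of the patch) of how the per-task non-irq run time falls out of the exec_time and softirq time columns. It assumes USER_HZ is 100, so one tick is 10 ms; the sample values are copied from the table in the message, and the function names are made up for illustration. It also shows the even-split guess that is the only option without per-task data.

```python
# Sketch only: derive non-irq run time from the sample figures in the message.
# Assumption: USER_HZ == 100 (one tick = 10 ms). All names here are hypothetical.
USER_HZ = 100

# (exec_time, softirq_time) pairs in USER_HZ ticks, one pair per 10 s interval.
SAMPLES = {
    "loop": [(505, 0), (502, 1), (503, 0), (504, 0), (503, 3)],
    "nc": [(500, 359), (501, 363), (502, 354), (499, 359), (500, 360)],
}


def non_irq_seconds(exec_ticks, softirq_ticks, hz=USER_HZ):
    """Run time of a task with its own softirq time subtracted, in seconds."""
    return (exec_ticks - softirq_ticks) / hz


def mean_non_irq_seconds(rows, hz=USER_HZ):
    """Average non-irq run time over several sampling intervals."""
    return sum(non_irq_seconds(e, s, hz) for e, s in rows) / len(rows)


def even_split_guess_seconds(exec_ticks, cpu_softirq_ticks, ntasks, hz=USER_HZ):
    """The guess available without per-task data: spread the CPU-wide
    softirq time evenly over all tasks that ran on that CPU."""
    return (exec_ticks - cpu_softirq_ticks / ntasks) / hz


if __name__ == "__main__":
    for name, rows in SAMPLES.items():
        print(f"{name}: {mean_non_irq_seconds(rows):.2f}s non-irq per interval")
    # CPU-wide figure from the message: 355 ticks of softirq, two tasks.
    print(f"even-split guess: {even_split_guess_seconds(500, 355, 2):.2f}s each")
```

With the table's numbers this gives roughly 5.0 s for loop and 1.4 s for nc per 10 s interval, against an even-split guess of a little over 3.2 s for each task, which is the gap the per-task accounting is meant to expose.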