From: Arjan on 9 Mar 2010 14:13

Until now I have monitored the performance of my application by measuring the real time spent by my program and subtracting the value at the previous iteration from the latest one. This gives me the number of seconds per iteration of my process. I have only 1 CPU, so the available time is distributed over all processes. My current application uses a lot of CPU and produces only a tiny bit of output, so I/O time is not the limiting factor. How can I measure the net CPU time spent by my program per iteration of my calculation, i.e. corrected for the fraction of CPU assigned to the process?

A.
From: Gordon Sande on 9 Mar 2010 14:32

On 2010-03-09 15:13:48 -0400, Arjan <arjan.van.dijk(a)rivm.nl> said:

> [...]
> How can I measure the net cpu-time spent by my program per iteration
> of my calculation, i.e. corrected for the fraction of CPU assigned
> to the process?
>
> A.

Isn't "cpu_time" intended to give you the CPU time for your process? This assumes that the system is capable of keeping track of the time used by each process. It is noted as new, so it must be F95.

Real time is usually understood to be wall-clock time, as given by "date_and_time". There is also "system_clock", which gives the processor clock in processor units but is still a real-time clock. Both are part of F90.
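
A minimal sketch of the cpu_time suggestion, timing each iteration with both cpu_time and system_clock so the two can be compared. The iteration count, the workload loop, and all variable names are placeholders invented for illustration, not taken from the original program. It also assumes a compiler that accepts 64-bit integer arguments to system_clock (Fortran 2003 or later).

```fortran
program iteration_timing
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none

  integer, parameter :: n_iter = 5            ! hypothetical iteration count
  integer(int64) :: count_start, count_end, count_rate
  real(real64)   :: cpu_start, cpu_end, cpu_sec, wall_sec, acc
  integer        :: iter, k

  ! Ticks per second of the system clock; zero means no clock is available.
  call system_clock(count_rate=count_rate)
  if (count_rate <= 0) stop 'no system clock available'

  acc = 0.0_real64
  do iter = 1, n_iter
     call cpu_time(cpu_start)
     call system_clock(count=count_start)

     ! Placeholder workload standing in for one iteration of the real calculation.
     do k = 1, 5000000
        acc = acc + sin(real(k, real64))
     end do

     call cpu_time(cpu_end)
     call system_clock(count=count_end)

     ! CPU seconds charged to this process versus elapsed wall-clock seconds.
     cpu_sec  = cpu_end - cpu_start
     wall_sec = real(count_end - count_start, real64) / real(count_rate, real64)

     write (*, '(a,i0,2(a,f9.4))') 'iteration ', iter, &
          '  cpu [s]: ', cpu_sec, '  wall [s]: ', wall_sec
  end do

  ! Print the accumulator so the compiler cannot discard the workload.
  write (*, '(a,es12.4)') 'checksum: ', acc
end program iteration_timing
```

On a machine shared with other processes the wall-clock figure should come out larger than the CPU figure, and the ratio of the two approximates the fraction of the CPU the process received. Note that cpu_time returns a negative value when the processor cannot supply a meaningful time, and that the sketch does not handle a system_clock wraparound between the two readings.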
From: glen herrmannsfeldt on 9 Mar 2010 14:51

Arjan <arjan.van.dijk(a)rivm.nl> wrote:

> [...]
> How can I measure the net cpu-time spent by my program per iteration
> of my calculation, i.e. corrected for the fraction of CPU assigned
> to the process?

For IA32, I usually use a routine that returns the value of the time stamp counter, as given by the RDTSC instruction. While one should expect to see time given to other tasks, in the times and places that I have used it I pretty much never do.

Read the RDTSC value before and after a computational task, and subtract the two values. That gives the number of CPU clock cycles used for that operation, at least as well as the CPU knows how to count such cycles.

Something like the following, as rdtsc.s, given to gcc:

        .file "rdtsc.s"
rdtsc:
        rdtsc
        ret

and declared in Fortran as a function of type

        INTEGER*8 rdtsc

Rumors are that this one works for X86-64, but I haven't tried it recently:

.globl _fun_
        .def _fun_; .scl 2; .type 32; .endef
_fun_:
        rdtsc
        shlq $32, %rdx
        orq %rdx, %rax
        ret

Some other processors have a similar register.

-- glen
From: JB on 9 Mar 2010 15:29

On 2010-03-09, Gordon Sande <g.sande(a)worldnet.att.net> wrote:

> [...]
> Real time is usually understood to be wall clock as given by "date_and_time".
> There is also "system_clock" which gives the processor clock in processor units
> but is still a real time clock. Both part of F90.

Nowadays many systems have a concept of a "monotonic" clock in addition to the real-time clock. This is a clock that increases monotonically; that is, it is not affected by the system administrator, NTP, or the like setting the real-time clock. On such systems one could argue that system_clock, which is typically used for measuring elapsed time rather than for getting the current best estimate of real-world time, should use the monotonic clock.

-- JB
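
A rough sketch of system_clock used as an interval timer, as described above. Whether the count is backed by a monotonic clock is up to the compiler's runtime and is not something the program can select, so this only shows the portable part: reading the count, the rate, and the maximum, and allowing for a single wraparound at count_max. The module name, the function elapsed_seconds, and the workload are hypothetical names introduced for illustration.

```fortran
module elapsed_timer
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  private
  public :: elapsed_seconds
contains
  ! Seconds between two system_clock readings c0 and c1, given the clock
  ! rate and maximum count. Allows for at most one wraparound of the count.
  pure function elapsed_seconds(c0, c1, rate, cmax) result(sec)
    integer(int64), intent(in) :: c0, c1, rate, cmax
    real(real64)   :: sec
    integer(int64) :: ticks
    if (c1 >= c0) then
       ticks = c1 - c0
    else
       ! The count ran up to cmax, reset to zero, then advanced to c1.
       ticks = (cmax - c0) + c1 + 1_int64
    end if
    sec = real(ticks, real64) / real(rate, real64)
  end function elapsed_seconds
end module elapsed_timer

program elapsed_demo
  use, intrinsic :: iso_fortran_env, only: int64, real64
  use elapsed_timer, only: elapsed_seconds
  implicit none
  integer(int64) :: c0, c1, rate, cmax
  real(real64)   :: acc
  integer        :: k

  call system_clock(c0, rate, cmax)
  if (rate <= 0) stop 'no system clock available'

  ! Placeholder workload to be timed.
  acc = 0.0_real64
  do k = 1, 5000000
     acc = acc + sin(real(k, real64))
  end do

  call system_clock(c1)

  write (*, '(a,f9.4)') 'elapsed [s]: ', elapsed_seconds(c0, c1, rate, cmax)
  write (*, '(a,es12.4)') 'checksum: ', acc
end program elapsed_demo
```

Using 64-bit integer arguments typically gives a finer clock resolution and a far longer period before wraparound than default integers, though the actual rate remains implementation dependent.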
From: JB on 9 Mar 2010 16:10

On 2010-03-09, glen herrmannsfeldt <gah(a)ugcs.caltech.edu> wrote:

> [...]
> For IA32, I usually use a routine that returns the value of
> the time stamp counter, as given by the RDTSC instruction.

Here, let me formulate a corollary to Godwin's law: "As an online programming discussion about timing grows longer, the probability of someone suggesting use of RDTSC approaches 1".

The Wikipedia page contains reasons why it should not be used except in very specific circumstances:

http://en.wikipedia.org/wiki/Rdtsc

Here in Fortran-happy-happy-land, the solution in the vast majority of cases is to use the standard timing intrinsics DATE_AND_TIME, SYSTEM_CLOCK, and CPU_TIME.

-- JB