From: Arjan on
Until now I have monitored the performance of my application by measuring
the real time spent by my program and subtracting the value at the previous
iteration from the latest one. This gives me the number of
seconds per iteration of my process. I have only 1 CPU, so the
available time is distributed over all processes. My current
application uses a lot of CPU and produces only a tiny bit of output,
so I/O time is not the limiting factor. How can I measure the net CPU time
spent by my program per iteration of my calculation, i.e. corrected
for the fraction of CPU assigned to the process?

A.
From: Gordon Sande on
On 2010-03-09 15:13:48 -0400, Arjan <arjan.van.dijk(a)rivm.nl> said:

> Until now I monitor the performance of my application by measuring the
> real time spent by my program and subtract the value from the former
> iteration from the latest estimate. This gives me the number of
> seconds per iteration of my process. I have only 1 CPU, so the
> available time is distributed over all processes. My current
> application uses a lot of CPU and produces only a tiny bit of output,
> so I/O-time is not restrictive. How can I measure the net cpu-time
> spent by my program per iteration of my calculation, i.e. corrected
> for the fraction of CPU assigned to the process?
>
> A.

Isn't "cpu_time" intended to give you the CPU time for your process?
This assumes that the system is capable of keeping track of the time used
by each process. It is noted as new, so it must be F95.

Real time is usually understood to be wall-clock time, as given by "date_and_time".
There is also "system_clock", which gives the processor clock in processor-dependent
counts but is still a real-time clock. Both are part of F90.
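
A minimal sketch of the cpu_time approach applied per iteration, as asked in the original question. The program name, the iteration count n_iter, and the do_work routine are hypothetical stand-ins for the real calculation; only the cpu_time intrinsic itself comes from the discussion. The resolution and exact meaning of the returned value are processor dependent, so treat the numbers as indicative.

```fortran
program cpu_per_iteration
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n_iter = 5        ! hypothetical number of iterations
  real(dp) :: t_prev, t_now, work
  integer :: iter

  ! cpu_time returns a negative value if the processor cannot
  ! provide a meaningful CPU time.
  call cpu_time(t_prev)
  if (t_prev < 0.0_dp) stop 'cpu_time is not supported by this processor'

  do iter = 1, n_iter
     call do_work(work)
     call cpu_time(t_now)
     ! CPU seconds used by this process since the previous iteration
     print '(a,i0,a,f10.6,a,es12.4)', 'iteration ', iter, ': ', &
          t_now - t_prev, ' s CPU, result ', work
     t_prev = t_now
  end do

contains

  ! Hypothetical CPU-bound workload standing in for the real calculation.
  subroutine do_work(s)
    real(dp), intent(out) :: s
    integer :: k
    s = 0.0_dp
    do k = 1, 5000000
       s = s + sin(real(k, dp))
    end do
  end subroutine do_work

end program cpu_per_iteration
```

Printing the workload result keeps an optimizing compiler from discarding the loop.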


From: glen herrmannsfeldt on
Arjan <arjan.van.dijk(a)rivm.nl> wrote:

> Until now I monitor the performance of my application by measuring the
> real time spent by my program and subtract the value from the former
> iteration from the latest estimate. This gives me the number of
> seconds per iteration of my process. I have only 1 CPU, so the
> available time is distributed over all processes. My current
> application uses a lot of CPU and produces only a tiny bit of output,
> so I/O-time is not restrictive. How can I measure the net cpu-time
> spent by my program per iteration of my calculation, i.e. corrected
> for the fraction of CPU assigned to the process?

For IA32, I usually use a routine that returns the value of
the time stamp counter, as given by the RDTSC instruction.

While one should expect to see time given to other tasks,
in the times and places that I have used it I pretty much never do.
Read the RDTSC value before and after a computational task,
and subtract the two values. That gives the number of CPU clock
cycles used for that operation, at least as well as the CPU knows
how to count such cycles.


Something like the following, as rdtsc.s, given to gcc:

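        .file   "rdtsc.s"
        .text
        .globl  rdtsc           # export the symbol; some Fortran compilers expect rdtsc_
rdtsc:
        rdtsc                   # counter is returned in EDX:EAX
        ret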

and declared in Fortran as a function of type

INTEGER*8 rdtsc

Rumors are that this one works for X86-64, but I haven't
tried it recently:

        .globl  _fun_
        .def    _fun_; .scl 2; .type 32; .endef
_fun_:
        rdtsc
        shlq    $32, %rdx
        orq     %rdx, %rax
        ret

Some other processors have a similar register.

-- glen
From: JB on
On 2010-03-09, Gordon Sande <g.sande(a)worldnet.att.net> wrote:
> On 2010-03-09 15:13:48 -0400, Arjan <arjan.van.dijk(a)rivm.nl> said:
>
>> Until now I monitor the performance of my application by measuring the
>> real time spent by my program and subtract the value from the former
>> iteration from the latest estimate. This gives me the number of
>> seconds per iteration of my process. I have only 1 CPU, so the
>> available time is distributed over all processes. My current
>> application uses a lot of CPU and produces only a tiny bit of output,
>> so I/O-time is not restrictive. How can I measure the net cpu-time
>> spent by my program per iteration of my calculation, i.e. corrected
>> for the fraction of CPU assigned to the process?
>>
>> A.
>
> Isn't "cpu_time" intended to give you the cpu time for your process?
> This assumes that the system is capable of keeping track of the time used
> by each process. Noted as new so must be F95.
>
> Real time is usually understood to be wall clock as given by "date_and_time".
> There is also "system_clock" which gives the processor clock in processor units
> but is still a real time clock. Both part of F90.

Nowadays many systems have a "monotonic" clock in
addition to the real-time clock. This is a clock that
only ever increases, so it is not affected when the system
administrator, NTP, or the like sets the real-time clock. On such
systems one could argue that system_clock, which is typically used for
measuring elapsed time rather than getting the current best estimate
of real-world time, should use the monotonic clock.
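
For reference, a small sketch of measuring an elapsed interval with system_clock. Which underlying clock the intrinsic reads (monotonic or not) is up to the compiler's runtime and cannot be selected from Fortran, so this only shows the usage pattern. The program name and the workload loop are hypothetical, and the kind i8 assumes the compiler accepts 64-bit integer arguments to system_clock.

```fortran
program elapsed_wall_time
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)  ! assumed 64-bit integer kind
  integer(i8) :: c_start, c_end, rate, cmax, ticks
  real(dp) :: s
  integer :: k

  ! count_rate is the number of counts per second; it is zero
  ! if the processor has no clock.
  call system_clock(count=c_start, count_rate=rate, count_max=cmax)
  if (rate <= 0) stop 'system_clock is not supported by this processor'

  ! Hypothetical CPU-bound workload.
  s = 0.0_dp
  do k = 1, 5000000
     s = s + sin(real(k, dp))
  end do

  call system_clock(count=c_end)

  ! The count resets to zero after reaching count_max; correct for one wrap.
  ticks = c_end - c_start
  if (ticks < 0) ticks = (ticks + cmax) + 1

  print '(a,f10.6,a,es12.4)', 'elapsed: ', &
       real(ticks, dp) / real(rate, dp), ' s, result ', s
end program elapsed_wall_time
```

With default-kind integer arguments the counter can wrap much sooner, which is why the sketch asks for a wider kind.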

--
JB
From: JB on
On 2010-03-09, glen herrmannsfeldt <gah(a)ugcs.caltech.edu> wrote:
> Arjan <arjan.van.dijk(a)rivm.nl> wrote:
>
>> Until now I monitor the performance of my application by measuring the
>> real time spent by my program and subtract the value from the former
>> iteration from the latest estimate. This gives me the number of
>> seconds per iteration of my process. I have only 1 CPU, so the
>> available time is distributed over all processes. My current
>> application uses a lot of CPU and produces only a tiny bit of output,
>> so I/O-time is not restrictive. How can I measure the net cpu-time
>> spent by my program per iteration of my calculation, i.e. corrected
>> for the fraction of CPU assigned to the process?
>
> For IA32, I usually use a routine that returns the value of
> the time stamp counter, as given by the RDTSC instruction.

Here, let me formulate a corollary to Godwin's law: "As an online
programming discussion about timing grows longer, the probability of
someone suggesting use of RDTSC approaches 1".

The Wikipedia page lists reasons why it should not be used except
in very specific circumstances:

http://en.wikipedia.org/wiki/Rdtsc

Here in Fortran-happy-happy-land, the solution in the vast majority of
cases is to use the standard timing intrinsics DATE_AND_TIME,
SYSTEM_CLOCK, and CPU_TIME.


--
JB