From: James Tursa on
Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <f0482013-df24-406d-af1b-0d5b24fef77e(a)y11g2000yqm.googlegroups.com>...
>
> You need to understand the way C(++) works, and the options
> available to you, to obtain speed. To be specific, the
> dereference call
>
> double array[1000];
> double x;
> int i;
>
> for (i = 0; i < 1000; i++)
>     x = array[i];
>
> is a bit more complex than you might think:
>
> - Fetch the value of i
> - Fetch the base address of array
> - Add i to the base address
> - Fetch the number from the location
> - Increment i
>
> In contrast, consider the alternative formulation
>
> double array[1000];
> double x;
> double *p;
> int i;
>
> p = array;
> for (i = 0; i < 1000; ++i)
>     x = *p++;
>
> In this case, the chain of events is something like
> Init:
> - Fetch the base address of the array, and store in p
>
> Inside the loop:
> - Fetch the number from the location p points to
> - Increment p
> - Fetch i
> - Increment i
>
> When the loop is this tiny, this altered, and shorter,
> chain of events might easily mean a difference of 30% or
> more in run-time. The more work is done inside the loop,
> the less the relative saving, but in that case you will
> need to work in a similar manner through the operations
> that take place inside the loop.

If the compiler translates the source into machine code more or less literally, then yes, you might see that kind of difference in the executable. But optimizing compilers these days are very smart, particularly when it comes to simple indexing in a loop like the one you show above. Both versions may well produce pretty much the same executable code, or the "indexed" loop may even come out faster than the "pointer" loop. Maybe one version just manages to use registers better than the other. And the answer may vary from compiler to compiler, from machine to machine, and from one optimization setting to another.
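
To make that concrete, here is a minimal sketch of the two loop styles written as functions that sum an array, so that the loop result is actually used and the compiler cannot simply throw the loop away. The names sum_indexed and sum_pointer are my own, chosen for illustration. With gcc or clang you can compile this with -O2 -S and compare the assembly generated for the two functions. I would not be surprised if they come out nearly identical, but that depends on your compiler and target.

```c
#include <stddef.h>

/* Indexed form: array[i] on every pass. */
double sum_indexed(const double *array, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += array[i];
    return sum;
}

/* Pointer form: walk a pointer from the start of the array to its end. */
double sum_pointer(const double *array, size_t n)
{
    const double *p = array;
    const double *end = array + n;
    double sum = 0.0;

    while (p < end)
        sum += *p++;
    return sum;
}
```

Both functions return the same value for the same input, so any difference you find is purely in the generated code.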

Sometimes you just have to try it both ways to see what your particular compiler does with it, rather than assuming too much about which way will be faster. I know that compiler writers are a lot smarter than I am about how to optimize code for a particular machine architecture, so I would advise writing something reasonable and readable first, then tweaking it later if you really need more speed.
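
If you do want to try it both ways, something along these lines is one rough way to go about it. This is only a sketch: the helper names (time_sum, compare_loops), the array size N, and the trick of changing one element per pass are my own assumptions, and clock() measures CPU time at fairly coarse resolution, so treat the numbers as a hint rather than a verdict.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define N 1000   /* same array size as the quoted example */

typedef double (*sum_fn)(const double *, size_t);

static double sum_indexed(const double *array, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += array[i];
    return sum;
}

static double sum_pointer(const double *array, size_t n)
{
    const double *p = array;
    const double *end = array + n;
    double sum = 0.0;

    while (p < end)
        sum += *p++;
    return sum;
}

/* Calls f on a[0..n-1] reps times (n must be at least 1).
   Returns the CPU seconds used and stores the accumulated sums in *total. */
double time_sum(sum_fn f, double *a, size_t n, long reps, double *total)
{
    double acc = 0.0;
    clock_t t0, t1;
    long r;

    t0 = clock();
    for (r = 0; r < reps; r++) {
        a[0] = (double)r;   /* change the input each pass so the call is
                               harder to hoist out of the loop */
        acc += f(a, n);
    }
    t1 = clock();

    *total = acc;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Times both loop styles on the same data and prints the results. */
void compare_loops(long reps)
{
    static double a[N];
    double total_i, total_p, sec_i, sec_p;
    size_t i;

    for (i = 0; i < N; i++)
        a[i] = (double)i;

    sec_i = time_sum(sum_indexed, a, N, reps, &total_i);
    sec_p = time_sum(sum_pointer, a, N, reps, &total_p);

    printf("indexed: %.3f s (total %g)\n", sec_i, total_i);
    printf("pointer: %.3f s (total %g)\n", sec_p, total_p);
}
```

Call compare_loops(1000000L) or similar from your own main, build with the optimization level you actually intend to use, and run it several times. The two totals should match, which is a quick check that both loops did the same work.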

James Tursa