From: Tobias Burnus on 14 Jun 2010 16:05

Lynn McGuire wrote:
> Does anyone have experience with moving a large code set from
> a f77 compiler (open watcom f77) to a modern day fortran 2003
> compiler such as IVF ? I am interested in speeding up the code
> execution. This is double precision floating point bound code
> (from our testing in the past).
>
> I have tried to port the code to IVF in the past but have run
> into trouble with the zero initialization code in IVF.

Doesn't "-zero" work? I do not know the code, but some code also needs
-save.

(In gfortran, the options are called -finit-local-zero and [only if SAVE
is required!] -fno-automatic.)

* * *

Regarding automatic parallelization: My feeling is that it often does
not work and can even make your program slower. However, what usually
helps with speeding up is vectorization - which is also a kind of
parallelization (though on processor/SSE level) - but that's still
using only one core.

Tobias
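[Moderator's note: the legacy pattern those flags paper over looks roughly like this. This is a hypothetical sketch, not code from Lynn's program; the routine and variable names are invented for illustration.]

```fortran
c     Sketch of the F66-era idiom that needs -zero and -save.
      subroutine tally(x, total, n)
      double precision x, total
      integer n, ncalls
c     ncalls is assumed to start at zero (zero initialization)
c     and to keep its value between calls (implicit SAVE).
c     Neither is guaranteed by the standard; without the
c     compiler options, ncalls may hold garbage here.
      ncalls = ncalls + 1
      total = total + x
      n = ncalls
      end
```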
From: Ron Shepard on 15 Jun 2010 11:11

In article <4C168BA3.3060508(a)net-b.de>,
 Tobias Burnus <burnus(a)net-b.de> wrote:

> Lynn McGuire wrote:
> > Does anyone have experience with moving a large code set from
> > a f77 compiler (open watcom f77) to a modern day fortran 2003
> > compiler such as IVF ? I am interested in speeding up the code
> > execution. This is double precision floating point bound code
> > (from our testing in the past).
> >
> > I have tried to port the code to IVF in the past but have run
> > into trouble with the zero initialization code in IVF.
>
> Doesn't "-zero" work? I do not know the code, but some code also needs
> -save.
>
> (In gfortran, the options are called -finit-local-zero and [only if SAVE
> is required!] -fno-automatic.)

I would suggest that the OP correct this known bug in the program as
quickly as possible. Of course, a known bug is easier to work with than
unknown bugs, but still, a bug is a bug, and there is no guarantee that
future compilers will work as wanted on code that has these kinds of
bugs.

If you have a compiler that produces correct results with one set of
standard-violating compiler options but incorrect results with
standard-conforming options, then this is actually a good situation.
You can compile various parts of your code with and without the options
and zero in on where the error in the code is located. And once
located, it can be fixed (in this case, by saving the necessary
variables and setting variables to correct initial values).

> Regarding automatic parallelization: My feeling is that it often does
> not work and can even make your program slower. However, what usually
> helps with speeding up is vectorization - which is also a kind of
> parallelization (though on processor/SSE level) - but that's still using
> only one core.
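[Moderator's note: applied to a sketch like the one above, Ron's fix amounts to making the lifetime and the initial value explicit, so that no non-standard compiler options are needed. Again a hypothetical example, not Lynn's actual code.]

```fortran
c     Standard-conforming version: an explicit SAVE plus a DATA
c     statement replace the -save and -zero compiler options.
      subroutine tally(x, total, n)
      double precision x, total
      integer n, ncalls
      save ncalls
      data ncalls /0/
c     ncalls now provably starts at 0 and survives between
c     calls on any conforming compiler, at any optimization.
      ncalls = ncalls + 1
      total = total + x
      n = ncalls
      end
```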
When I read the original post, the first thing I thought was that a
modern compiler is more likely to use SSE instructions than an older
compiler, especially the newer SSE3 and SSE4 instructions. If you
have tight well-written do loops, then this is likely to result in
significant performance improvements, even without changing any of
the code, but perhaps some tweaking of the code might help this
process even more. There are also memory-related optimizations that
can be done on modern compilers to account for cache effects on
newer hardware.

$.02 -Ron Shepard
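[Moderator's note: the "tight well-written do loops" Ron means are stride-1 kernels like the daxpy-style sketch below. The loop is illustrative; the flag spellings are from ifort and gfortran of roughly that era and should be checked against your compiler's documentation.]

```fortran
c     A tight, unit-stride double precision loop is a prime
c     candidate for SSE2/SSE3 auto-vectorization, e.g. with
c     ifort /O2 /QxSSE3 or gfortran -O2 -ftree-vectorize.
      subroutine axpy(n, a, x, y)
      integer n, i
      double precision a, x(n), y(n)
      do i = 1, n
         y(i) = y(i) + a * x(i)
      end do
      end
```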
From: Lynn McGuire on 15 Jun 2010 17:53

>> I have tried to port the code to IVF in the past but have run
>> into trouble with the zero initialization code in IVF.
>
> Doesn't "-zero" work? I do not know the code, but some code also needs
> -save.

It does, but only mostly. I cannot remember the details at this time,
but some variables (maybe arrays ?) were not initialized to zero and
that killed my app.

> Regarding automatic parallelization: My feeling is that it often does
> not work and can even make your program slower. However, what usually
> helps with speeding up is vectorization - which is also a kind of
> parallelization (though on processor/SSE level) - but that's still using
> only one core.

I suspected as much. Yes, I think that automatic vectorization would
help, the question is just how much.

Thanks,
Lynn
From: Lynn McGuire on 15 Jun 2010 17:57

> Dependence on default SAVE syntax has been an obstacle to optimization
> for over 2 decades. This is why it was never part of the Fortran
> standard, and (from f77 on) better alternatives are offered. Default
> SAVE was a response to the widespread (but not universal) use of such
> an extension under f66. ifort does a reasonable job of emulating the
> default SAVE behavior of its predecessor, when the associated options
> are set, but that will make further optimizations difficult, as those
> options are incompatible with parallelization. At least on the C side,
> improved options are available for catching such dependencies.

I suspect that we need to remove the zero init and default save
requirements from the code in order to move forward on this problem.

Thanks,
Lynn
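[Moderator's note: one trap worth knowing while removing the default-SAVE requirement: in Fortran 90 and later, initializing a local variable in its declaration implies SAVE, so the two forms below are not equivalent. Illustrative sketch only.]

```fortran
! Initialization in a declaration implies SAVE: "count" is set
! to zero once, then keeps its value across calls.
subroutine f()
  integer :: count = 0   ! implicitly SAVEd
  count = count + 1
end subroutine f

! An ordinary assignment runs on every call: "count" is
! re-zeroed each time the routine is entered.
subroutine g()
  integer :: count
  count = 0
  count = count + 1
end subroutine g
```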
From: Lynn McGuire on 15 Jun 2010 18:04
> When I read the original post, the first thing I thought was that a
> modern compiler is more likely to use SSE instructions than an older
> compiler, especially the newer SSE3 and SSE4 instructions. If you
> have tight well-written do loops, then this is likely to result in
> significant performance improvements, even without changing any of
> the code, but perhaps some tweaking of the code might help this
> process even more. There are also memory-related optimizations that
> can be done on modern compilers to account for cache effects on
> newer hardware.

We have sloppy code: thousands of lines with multitudes of subroutine
calls and do loops all over the place. All of our floating point is in
double precision. I would like to see the effects of the SSE
instructions but I need to get a fully working version of the code in
IVF first. It goes without saying that Open Watcom F77 does not have a
clue about the SSE instructions.

Thanks,
Lynn