From: Tobias Burnus on 14 Jun 2010 16:05

Lynn McGuire wrote:
> Does anyone have experience with moving a large code set from
> a f77 compiler (open watcom f77) to a modern day fortran 2003
> compiler such as IVF ? I am interested in speeding up the code
> execution. This is double precision floating point bound code
> (from our testing in the past).
>
> I have tried to port the code to IVF in the past but have run
> into trouble with the zero initialization code in IVF.

Doesn't "-zero" work? I do not know the code, but some code also needs
-save.

(In gfortran, the options are called -finit-local-zero and [only if SAVE
is required!] -fno-automatic.)

* * *

Regarding automatic parallelization: My feeling is that it often does
not work and can even make your program slower. However, what usually
helps with speeding up is vectorization - which is also a kind of
parallelization (though on processor/SSE level) - but that's still
using only one core.

Tobias
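[Moderator's note: the legacy pattern those flags paper over looks roughly like this. This is a hypothetical sketch, not code from Lynn's program; the routine and variable names are invented for illustration.]

```fortran
c     Sketch of the F66-era idiom that needs -zero and -save.
      subroutine tally(x, total, n)
      double precision x, total
      integer n, ncalls
c     ncalls is assumed to start at zero (zero initialization)
c     and to keep its value between calls (implicit SAVE).
c     Neither is guaranteed by the standard; without the
c     compiler options, ncalls may hold garbage here.
      ncalls = ncalls + 1
      total = total + x
      n = ncalls
      end
```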
From: Ron Shepard on 15 Jun 2010 11:11

In article <4C168BA3.3060508(a)net-b.de>,
 Tobias Burnus <burnus(a)net-b.de> wrote:

> Lynn McGuire wrote:
> > Does anyone have experience with moving a large code set from
> > a f77 compiler (open watcom f77) to a modern day fortran 2003
> > compiler such as IVF ? I am interested in speeding up the code
> > execution. This is double precision floating point bound code
> > (from our testing in the past).
> >
> > I have tried to port the code to IVF in the past but have run
> > into trouble with the zero initialization code in IVF.
>
> Doesn't "-zero" work? I do not know the code, but some code also needs
> -save.
>
> (In gfortran, the options are called -finit-local-zero and [only if SAVE
> is required!] -fno-automatic.)

I would suggest that the OP correct this known bug in the program as
quickly as possible. Of course, a known bug is easier to work with than
unknown bugs, but still, a bug is a bug, and there is no guarantee that
future compilers will work as wanted on code that has these kinds of
bugs.

If you have a compiler that produces correct results with one set of
standard-violating compiler options but incorrect results with
standard-conforming options, then this is actually a good situation.
You can compile various parts of your code with and without the options
and zero in on where the error in the code is located. And once
located, it can be fixed (in this case, by saving the necessary
variables and setting variables to correct initial values).

> Regarding automatic parallelization: My feeling is that it often does
> not work and can even make your program slower. However, what usually
> helps with speeding up is vectorization - which is also a kind of
> parallelization (though on processor/SSE level) - but that's still using
> only one core.
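[Moderator's note: applied to a sketch like the one above, Ron's fix amounts to making the lifetime and the initial value explicit, so that no non-standard compiler options are needed. Again a hypothetical example, not Lynn's actual code.]

```fortran
c     Standard-conforming version: an explicit SAVE plus a DATA
c     statement replace the -save and -zero compiler options.
      subroutine tally(x, total, n)
      double precision x, total
      integer n, ncalls
      save ncalls
      data ncalls /0/
c     ncalls now provably starts at 0 and survives between
c     calls on any conforming compiler, at any optimization.
      ncalls = ncalls + 1
      total = total + x
      n = ncalls
      end
```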
When I read the original post, the first thing I thought was that a
modern compiler is more likely to use SSE instructions than an older
compiler, especially the newer SSE3 and SSE4 instructions. If you
have tight well-written do loops, then this is likely to result in
significant performance improvements, even without changing any of
the code, but perhaps some tweaking of the code might help this
process even more. There are also memory-related optimizations that
can be done on modern compilers to account for cache effects on
newer hardware.

$.02 -Ron Shepard
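[Moderator's note: the "tight well-written do loops" Ron means are stride-1 kernels like the daxpy-style sketch below. The loop is illustrative; the flag spellings are from ifort and gfortran of roughly that era and should be checked against your compiler's documentation.]

```fortran
c     A tight, unit-stride double precision loop is a prime
c     candidate for SSE2/SSE3 auto-vectorization, e.g. with
c     ifort /O2 /QxSSE3 or gfortran -O2 -ftree-vectorize.
      subroutine axpy(n, a, x, y)
      integer n, i
      double precision a, x(n), y(n)
      do i = 1, n
         y(i) = y(i) + a * x(i)
      end do
      end
```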
From: Lynn McGuire on 15 Jun 2010 17:53

>> I have tried to port the code to IVF in the past but have run
>> into trouble with the zero initialization code in IVF.
>
> Doesn't "-zero" work? I do not know the code, but some code also needs
> -save.

It does, but only mostly. I cannot remember the details at this time,
but some variables (maybe arrays ?) were not initialized to zero and
that killed my app.

> Regarding automatic parallelization: My feeling is that it often does
> not work and can even make your program slower. However, what usually
> helps with speeding up is vectorization - which is also a kind of
> parallelization (though on processor/SSE level) - but that's still using
> only one core.

I suspected as much. Yes, I think that automatic vectorization would
help, the question is just how much.

Thanks,
Lynn
From: Lynn McGuire on 15 Jun 2010 17:57

> Dependence on default SAVE syntax has been an obstacle to optimization
> for over 2 decades. This is why it was never part of the Fortran
> standard, and (from f77 on) better alternatives are offered. Default
> SAVE was a response to the widespread (but not universal) use of such
> an extension under f66. ifort does a reasonable job of emulating the
> default SAVE behavior of its predecessor, when the associated options
> are set, but that will make further optimizations difficult, as those
> options are incompatible with parallelization. At least on the C side,
> improved options are available for catching such dependencies.

I suspect that we need to remove the zero init and default save
requirements from the code in order to move forward on this problem.

Thanks,
Lynn
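[Moderator's note: one trap worth knowing while removing the default-SAVE requirement: in Fortran 90 and later, initializing a local variable in its declaration implies SAVE, so the two forms below are not equivalent. Illustrative sketch only.]

```fortran
! Initialization in a declaration implies SAVE: "count" is set
! to zero once, then keeps its value across calls.
subroutine f()
  integer :: count = 0   ! implicitly SAVEd
  count = count + 1
end subroutine f

! An ordinary assignment runs on every call: "count" is
! re-zeroed each time the routine is entered.
subroutine g()
  integer :: count
  count = 0
  count = count + 1
end subroutine g
```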
From: Lynn McGuire on 15 Jun 2010 18:04
> When I read the original post, the first thing I thought was that a
> modern compiler is more likely to use SSE instructions than an older
> compiler, especially the newer SSE3 and SSE4 instructions. If you
> have tight well-written do loops, then this is likely to result in
> significant performance improvements, even without changing any of
> the code, but perhaps some tweaking of the code might help this
> process even more. There are also memory-related optimizations that
> can be done on modern compilers to account for cache effects on
> newer hardware.

We have sloppy code: thousands of lines with multitudes of subroutine
calls and do loops all over the place. All of our floating point is in
double precision. I would like to see the effects of the SSE
instructions but I need to get a fully working version of the code in
IVF first. It goes without saying that Open Watcom F77 does not have a
clue about the SSE instructions.

Thanks,
Lynn