lost of efficiency of operator overloading [Fortran]

Prev: passing a type of different KIND to a C function from Fortran ?binding
Next: ALLOCATABLE array as MPI buffer

From: Arjen Markus on 28 May 2010 07:21

On 28 mei, 12:06, Kay Diederichs <kay.diederi...(a)uni-konstanz.de>
wrote:
> mecej4 schrieb:
>
>
>
> > Hifi-Comp wrote:
>
> >> I am writing a code using operator overloading feature of F90/95. The
> >> basic math is to replace all real(8) with two real(8) contained in a
> >> type named DUAL_NUM, and overloading the corresponding calculations
> >> according to newly defined data. Based on the coding, the computing
> >> should not more than three times more than computing for real(8).
> >> However, my simple test shows that computing for DUAL_NUM is almost
> >> nine times more expensive. I hope some of your knowledgable Fortran
> >> experts can help me figure out the loss of efficiency and how can I
> >> make the code more efficient. Thanks alot!
>
> >> TYPE,PUBLIC:: DUAL_NUM
> >> REAL(8)::x_ad_
> >> REAL(8)::xp_ad_
> >> END TYPE DUAL_NUM
>
> >> PUBLIC OPERATOR (+)
> >> INTERFACE OPERATOR (+)
> >> MODULE PROCEDURE ADD_DD ! dual+ dual, ELEMENTAL
> >> END INTERFACE
>
> >> PUBLIC OPERATOR (*)
> >> INTERFACE OPERATOR (*)
> >> MODULE PROCEDURE MULT_DD ! dual*dual, ELEMENTAL
> >> END INTERFACE
>
> >> ELEMENTAL FUNCTION ADD_DD(u,v) RESULT(res)
> >> TYPE (DUAL_NUM), INTENT(IN)::u,v
> >> TYPE (DUAL_NUM)::res
> >> res%x_ad_ = u%x_ad_+v%x_ad_
> >> res%xp_ad_ = u%xp_ad_+v%xp_ad_
> >> END FUNCTION ADD_DD
>
> >> ELEMENTAL FUNCTION MULT_DD(u,v) RESULT(res)
> >> TYPE (DUAL_NUM), INTENT(IN)::u,v
> >> TYPE (DUAL_NUM)::res
> >> res%x_ad_ = u%x_ad_*v%x_ad_
> >> res%xp_ad_= u%xp_ad_*v%x_ad_ + u%x_ad_*v%xp_ad_
> >> END FUNCTION MULT_DD
>
> >> The segment of the original code:
> >> REAL(8):: x, y, z,f
> >> x=1.0d0;y=2.0d0;z=0.3d0
>
> >> !**********************************
> >> DO i=1,50000000
> >> f=x-y*z
> >> ENDDO
> >> !**********************************
>
> >> The do loop runs for 0.516 seconds.
>
> >> The corresponding overloaded code:
> >> TYPE(DUAL_NUM):: x,y,z,f
>
> >> x=DUAL_NUM(1.0d0,1.0D0);
> >> y=DUAL_NUM(2.0d0,1.0D0);
> >> z=DUAL_NUM(0.3d0,0.0D0)
>
> >> !**********************************
> >> DO i=1,50000000
> >> f=X-y*z
> >> ENDDO
> >> !*********************************
> >> The do loop runs for 4.513 seconds.
>
> >> Supposedly, for DUAL_NUM, the operations needed for minus are twice as
> >> those needed for REAL, and the operations needed for times are thrice
> >> as those needed for REAL. That is the time needed for computation
> >> should not be more than three times of computation for real. However,
> >> the overall time is almost nine times more. What else takes more time?
>
> > You have no provision for carries and overflows in your multiplication. And,
> > you have not yet reached the fun part: division. Once you implement
> > division, you will appreciate why doing multiple-precision floating point
> > arithmetic in software is undertaken only if unavoidable.
>
> > -- mecej4
>
> I understand your comment as meaning that you have identified the code
> as doing a part of interval arithmetics (at least that's what I think
> it's headed to), and that furthermore you have looked into that more deeply.
> I am quite interested in learning about existing software (e.g. Fortran
> MODULE) that allows to (as simply as possible) convert an existing
> program from normal arithmetics to interval arithmetics, e.g. to
> pinpoint parts of code that benefit from higher precision calculations.
>
> Another "fun part" of that, once one has the + - * / is, I guess, to
> provide overloaded functions of min max abs sqrt exp log sin cos tan and
> so on. But it would be extremely useful, I'd say.
>
> Do you have any pointers?
>
> thanks,
> Kay

There is a simple module for interval arithmetic in my Flibs project
(http://flibs.sf.net). It includes support for most elementary
functions
(not for min and max, though).

There is no documentation yet, but it should not be too hard to
figure
out how to use it.

Regards,

Arjen

From: Hifi-Comp on 28 May 2010 07:24

In fact I have already had all the operations overloaded including all
intrinsic functions, math operations (+-*/ **) and relational
operators. One thing I am not satisfied is the speed. With /, the time
for analysis remains the same, yet for the overloaded code it runs for
6.25 secs (CVF 6.6) and 7.25 secs (gfortran 4.3), For those who are
interested to test the speed, I am pasting the source codes here:

PROGRAM Test
USE DNAD
USE CPUTime

IMPLICIT NONE
REAL(8):: x_,y_,z_,f_,ftot_
TYPE(DUAL_NUM):: x,y,z,f,ftot
INTEGER:: I

x_=1.0d0;y_=2.0d0;z_=0.3d0
ftot_=0.0d0

CALL TIC

DO i=1,50000000
f_=x_-y_*z_/x_
ftot_ = ftot_ - f_
ENDDO
WRITE(*,*)'Analysis Runs for ', TOC(),' Seconds.'

write(*,*)ftot_

x=DUAL_NUM(1.0d0,0.1D0);y=DUAL_NUM(2.0d0,0.2D0);z=DUAL_NUM(0.3d0,0.3D0)
ftot=DUAL_NUM(0.0d0,0.0D0)

CALL TIC

DO i=1,50000000
f=X-y*z/x
ftot = ftot - f
ENDDO
WRITE(*,*)'DNAD Runs for ', TOC(),' Seconds.'

write(*,*)ftot_

END PROGRAM Test

MODULE CPUTime
IMPLICIT NONE
PRIVATE
PUBLIC TIC, TOC
INTEGER::start, rate, finish
CONTAINS

SUBROUTINE TIC
CALL SYSTEM_CLOCK(start,rate)
END SUBROUTINE TIC

FUNCTION TOC() RESULT(sec)

REAL::sec

CALL SYSTEM_CLOCK(finish)
IF(finish>start) THEN
sec=REAL(finish-start)/REAL(rate)
ELSE
sec=0.0
ENDIF

END FUNCTION TOC
END MODULE CPUTime

MODULE DNAD
IMPLICIT NONE
PRIVATE
INTEGER, PARAMETER:: DBL_AD=SELECTED_REAL_KIND(15)
REAL(DBL_AD) ::negative_one=-1.0d0

TYPE,PUBLIC:: DUAL_NUM
REAL(DBL_AD)::x_ad_
REAL(DBL_AD)::xp_ad_
END TYPE DUAL_NUM

PUBLIC OPERATOR (-)
INTERFACE OPERATOR (-)
MODULE PROCEDURE MINUS_DD
END INTERFACE

PUBLIC OPERATOR (*)
INTERFACE OPERATOR (*)
MODULE PROCEDURE MULT_DD
END INTERFACE

PUBLIC OPERATOR (/)
INTERFACE OPERATOR (/)
MODULE PROCEDURE DIV_DD
END INTERFACE

CONTAINS

ELEMENTAL FUNCTION MINUS_DD(u,v) RESULT(res)
TYPE (DUAL_NUM), INTENT(IN)::u,v
TYPE (DUAL_NUM)::res

res%x_ad_ = u%x_ad_-v%x_ad_
res%xp_ad_= u%xp_ad_-v%xp_ad_

END FUNCTION MINUS_DD

ELEMENTAL FUNCTION MULT_DD(u,v) RESULT(res)
TYPE (DUAL_NUM), INTENT(IN)::u,v
TYPE (DUAL_NUM)::res

res%x_ad_ = u%x_ad_*v%x_ad_
res%xp_ad_= u%xp_ad_*v%x_ad_ + u%x_ad_*v%xp_ad_

END FUNCTION MULT_DD

ELEMENTAL FUNCTION DIV_DD(u,v) RESULT(res)
TYPE (DUAL_NUM), INTENT(IN)::u,v
REAL(DBL_AD)::tmp
TYPE (DUAL_NUM)::res
INTEGER:: i

tmp=1.D0/v%x_ad_
res%x_ad_ = u%x_ad_*tmp
res%xp_ad_ =(u%xp_ad_- res%x_ad_*v%xp_ad_)*tmp

END FUNCTION DIV_DD

END MODULE DNAD

From: Hifi-Comp on 28 May 2010 07:49

On May 28, 7:21 am, Arjen Markus <arjen.markus...(a)gmail.com> wrote:

> There is a simple module for interval arithmetic in my Flibs project
> (http://flibs.sf.net). It includes support for most elementary
> functions
> (not for min and max, though).
>
> There is no documentation yet, but it should not be too hard to
> figure
> out how to use it.
>
> Regards,
>
> Arjen- Hide quoted text -
>
> - Show quoted text -

I just wend to the site and find out "automatic differentiation" is
exactly what I have done with the extension to differentiate real-
world general-purpose engineering analysis codes written in Fortran
for gradient-based optimization and design. There are some other
packages available which can be found from
http://www.autodiff.org/?module=Tools&language=Fortran95.

From: m_b_metcalf on 28 May 2010 08:24

On May 28, 1:24 pm, Hifi-Comp <wenbinyu.hea...(a)gmail.com> wrote:
> In fact I have already had all the operations overloaded including all
> intrinsic functions, math operations (+-*/ **) and relational
> operators. One thing I am not satisfied is the speed. With /, the time
> for analysis remains the same, yet for the overloaded code it runs for
> 6.25 secs (CVF 6.6) and 7.25 secs (gfortran 4.3), For those who are
> interested to test the speed, I am pasting the source codes here:
>
> PROGRAM Test
> USE DNAD
> USE CPUTime
>
> IMPLICIT NONE
> REAL(8):: x_,y_,z_,f_,ftot_
> TYPE(DUAL_NUM):: x,y,z,f,ftot
> INTEGER:: I
>
> x_=1.0d0;y_=2.0d0;z_=0.3d0
> ftot_=0.0d0
>
> CALL TIC
>
> DO i=1,50000000
> f_=x_-y_*z_/x_
> ftot_ = ftot_ - f_
> ENDDO
> WRITE(*,*)'Analysis Runs for ', TOC(),' Seconds.'
>
> write(*,*)ftot_
>
> x=DUAL_NUM(1.0d0,0.1D0);y=DUAL_NUM(2.0d0,0.2D0);z=DUAL_NUM(0.3d0,0.3D0)
> ftot=DUAL_NUM(0.0d0,0.0D0)
>
> CALL TIC
>
> DO i=1,50000000
> f=X-y*z/x
> ftot = ftot - f
> ENDDO
> WRITE(*,*)'DNAD Runs for ', TOC(),' Seconds.'
>
> write(*,*)ftot_
>
> END PROGRAM Test
>
> MODULE CPUTime
> IMPLICIT NONE
> PRIVATE
> PUBLIC TIC, TOC
> INTEGER::start, rate, finish
> CONTAINS
>
> SUBROUTINE TIC
> CALL SYSTEM_CLOCK(start,rate)
> END SUBROUTINE TIC
>
> FUNCTION TOC() RESULT(sec)
>
> REAL::sec
>
> CALL SYSTEM_CLOCK(finish)
> IF(finish>start) THEN
> sec=REAL(finish-start)/REAL(rate)
> ELSE
> sec=0.0
> ENDIF
>
> END FUNCTION TOC
> END MODULE CPUTime
>
> MODULE DNAD
> IMPLICIT NONE
> PRIVATE
> INTEGER, PARAMETER:: DBL_AD=SELECTED_REAL_KIND(15)
> REAL(DBL_AD) ::negative_one=-1.0d0
>
> TYPE,PUBLIC:: DUAL_NUM
> REAL(DBL_AD)::x_ad_
> REAL(DBL_AD)::xp_ad_
> END TYPE DUAL_NUM
>
> PUBLIC OPERATOR (-)
> INTERFACE OPERATOR (-)
> MODULE PROCEDURE MINUS_DD
> END INTERFACE
>
> PUBLIC OPERATOR (*)
> INTERFACE OPERATOR (*)
> MODULE PROCEDURE MULT_DD
> END INTERFACE
>
> PUBLIC OPERATOR (/)
> INTERFACE OPERATOR (/)
> MODULE PROCEDURE DIV_DD
> END INTERFACE
>
> CONTAINS
>
> ELEMENTAL FUNCTION MINUS_DD(u,v) RESULT(res)
> TYPE (DUAL_NUM), INTENT(IN)::u,v
> TYPE (DUAL_NUM)::res
>
> res%x_ad_ = u%x_ad_-v%x_ad_
> res%xp_ad_= u%xp_ad_-v%xp_ad_
>
> END FUNCTION MINUS_DD
>
> ELEMENTAL FUNCTION MULT_DD(u,v) RESULT(res)
> TYPE (DUAL_NUM), INTENT(IN)::u,v
> TYPE (DUAL_NUM)::res
>
> res%x_ad_ = u%x_ad_*v%x_ad_
> res%xp_ad_= u%xp_ad_*v%x_ad_ + u%x_ad_*v%xp_ad_
>
> END FUNCTION MULT_DD
>
> ELEMENTAL FUNCTION DIV_DD(u,v) RESULT(res)
> TYPE (DUAL_NUM), INTENT(IN)::u,v
> REAL(DBL_AD)::tmp
> TYPE (DUAL_NUM)::res
> INTEGER:: i
>
> tmp=1.D0/v%x_ad_
> res%x_ad_ = u%x_ad_*tmp
> res%xp_ad_ =(u%xp_ad_- res%x_ad_*v%xp_ad_)*tmp
>
> END FUNCTION DIV_DD
>
> END MODULE DNAD
This is what I get with Intel's compiler on a 2GHz PC. Msybe you need
to upgrade?

Analysis Runs for 0.1050000 Seconds.
-19999999.9990236
DNAD Runs for 0.1040000 Seconds.
-19999999.9990236
Press any key to continue . . .

Regards,

Mike Metcalf

From: m_b_metcalf on 28 May 2010 08:32

On May 28, 2:24 pm, m_b_metcalf <michaelmetc...(a)compuserve.com> wrote:
> On May 28, 1:24 pm, Hifi-Comp <wenbinyu.hea...(a)gmail.com> wrote:
>
>
>
> > In fact I have already had all the operations overloaded including all
> > intrinsic functions, math operations (+-*/ **) and relational
> > operators. One thing I am not satisfied is the speed. With /, the time
> > for analysis remains the same, yet for the overloaded code it runs for
> > 6.25 secs (CVF 6.6) and 7.25 secs (gfortran 4.3), For those who are
> > interested to test the speed, I am pasting the source codes here:
>
> > PROGRAM Test
> > USE DNAD
> > USE CPUTime
>
> > IMPLICIT NONE
> > REAL(8):: x_,y_,z_,f_,ftot_
> > TYPE(DUAL_NUM):: x,y,z,f,ftot
> > INTEGER:: I
>
> > x_=1.0d0;y_=2.0d0;z_=0.3d0
> > ftot_=0.0d0
>
> > CALL TIC
>
> > DO i=1,50000000
> > f_=x_-y_*z_/x_
> > ftot_ = ftot_ - f_
> > ENDDO
> > WRITE(*,*)'Analysis Runs for ', TOC(),' Seconds.'
>
> > write(*,*)ftot_
>
> > x=DUAL_NUM(1.0d0,0.1D0);y=DUAL_NUM(2.0d0,0.2D0);z=DUAL_NUM(0.3d0,0.3D0)
> > ftot=DUAL_NUM(0.0d0,0.0D0)
>
> > CALL TIC
>
> > DO i=1,50000000
> > f=X-y*z/x
> > ftot = ftot - f
> > ENDDO
> > WRITE(*,*)'DNAD Runs for ', TOC(),' Seconds.'
>
> > write(*,*)ftot_
>
> > END PROGRAM Test
>
> > MODULE CPUTime
> > IMPLICIT NONE
> > PRIVATE
> > PUBLIC TIC, TOC
> > INTEGER::start, rate, finish
> > CONTAINS
>
> > SUBROUTINE TIC
> > CALL SYSTEM_CLOCK(start,rate)
> > END SUBROUTINE TIC
>
> > FUNCTION TOC() RESULT(sec)
>
> > REAL::sec
>
> > CALL SYSTEM_CLOCK(finish)
> > IF(finish>start) THEN
> > sec=REAL(finish-start)/REAL(rate)
> > ELSE
> > sec=0.0
> > ENDIF
>
> > END FUNCTION TOC
> > END MODULE CPUTime
>
> > MODULE DNAD
> > IMPLICIT NONE
> > PRIVATE
> > INTEGER, PARAMETER:: DBL_AD=SELECTED_REAL_KIND(15)
> > REAL(DBL_AD) ::negative_one=-1.0d0
>
> > TYPE,PUBLIC:: DUAL_NUM
> > REAL(DBL_AD)::x_ad_
> > REAL(DBL_AD)::xp_ad_
> > END TYPE DUAL_NUM
>
> > PUBLIC OPERATOR (-)
> > INTERFACE OPERATOR (-)
> > MODULE PROCEDURE MINUS_DD
> > END INTERFACE
>
> > PUBLIC OPERATOR (*)
> > INTERFACE OPERATOR (*)
> > MODULE PROCEDURE MULT_DD
> > END INTERFACE
>
> > PUBLIC OPERATOR (/)
> > INTERFACE OPERATOR (/)
> > MODULE PROCEDURE DIV_DD
> > END INTERFACE
>
> > CONTAINS
>
> > ELEMENTAL FUNCTION MINUS_DD(u,v) RESULT(res)
> > TYPE (DUAL_NUM), INTENT(IN)::u,v
> > TYPE (DUAL_NUM)::res
>
> > res%x_ad_ = u%x_ad_-v%x_ad_
> > res%xp_ad_= u%xp_ad_-v%xp_ad_
>
> > END FUNCTION MINUS_DD
>
> > ELEMENTAL FUNCTION MULT_DD(u,v) RESULT(res)
> > TYPE (DUAL_NUM), INTENT(IN)::u,v
> > TYPE (DUAL_NUM)::res
>
> > res%x_ad_ = u%x_ad_*v%x_ad_
> > res%xp_ad_= u%xp_ad_*v%x_ad_ + u%x_ad_*v%xp_ad_
>
> > END FUNCTION MULT_DD
>
> > ELEMENTAL FUNCTION DIV_DD(u,v) RESULT(res)
> > TYPE (DUAL_NUM), INTENT(IN)::u,v
> > REAL(DBL_AD)::tmp
> > TYPE (DUAL_NUM)::res
> > INTEGER:: i
>
> > tmp=1.D0/v%x_ad_
> > res%x_ad_ = u%x_ad_*tmp
> > res%xp_ad_ =(u%xp_ad_- res%x_ad_*v%xp_ad_)*tmp
>
> > END FUNCTION DIV_DD
>
> > END MODULE DNAD
>
> This is what I get with Intel's compiler on a 2GHz PC. Msybe you need
> to upgrade?
>
> Analysis Runs for 0.1050000 Seconds.
> -19999999.9990236
> DNAD Runs for 0.1040000 Seconds.
> -19999999.9990236
> Press any key to continue . . .
>
> Regards,
>
> Mike Metcalf- Hide quoted text -
>
> - Show quoted text -

Correcting the code (superfluous _ on ftot) gives:
Analysis Runs for 0.1060000 Seconds.
-19999999.9990236
DNAD Runs for 0.1050000 Seconds.
-19999999.9990236 25000000.0000000
Press any key to continue . . .

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: passing a type of different KIND to a C function from Fortran ?binding
Next: ALLOCATABLE array as MPI buffer