From: Stéphane on
Hello,
I am currently working on my own implementation in C++ of Strassen's fast matrix multiplication algorithm. It works actually pretty well.
What I do now is unit testing (cppunit), where I use Matlab as a golden standard. Everything works fine, except for big matrices full of big float numbers. Neither my new algorithm nor the former (O(n^3)) can stand the comparison with the matlab's result, the delta is just way too big for every and each element. I made some research, and it occured to me that I probably pull a rounding error all over my computation (rounding error amplification), and I made the assumption that somehow, Matlab's algorithm compensates this rounding error amplification.
I would be very happy if somebody here could provide some information about that.

Thank you,

Stéphane
From: Bruno Luong on
"Stéphane " <jalisastef(a)yahoo.ca> wrote in message <hpnmqc$gu7$1(a)fred.mathworks.com>...
> Hello,
> Matlab's algorithm compensates this rounding error amplification.

Not that I'm aware of (in the strict sense). However Matlab uses BLAS for full matrix product, and there is one thing you should pay attention is the so called "floating point control word" (FPCW) that control the precision on the internal computation inside the (Intel) processor. There is an compilation option on MSVC to allow to switch the FPCW between two states, otherwise you can control finer within the program. I believe Matlab by default uses 53-bit internal calculation. You should at least use as much.

Bruno
From: James Tursa on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hpnorh$idd$1(a)fred.mathworks.com>...
> "Stéphane " <jalisastef(a)yahoo.ca> wrote in message <hpnmqc$gu7$1(a)fred.mathworks.com>...
> > Hello,
> > Matlab's algorithm compensates this rounding error amplification.
>
> Not that I'm aware of (in the strict sense). However Matlab uses BLAS for full matrix product, and there is one thing you should pay attention is the so called "floating point control word" (FPCW) that control the precision on the internal computation inside the (Intel) processor. There is an compilation option on MSVC to allow to switch the FPCW between two states, otherwise you can control finer within the program. I believe Matlab by default uses 53-bit internal calculation. You should at least use as much.
>
> Bruno

One additional note for OP. The BLAS that MATLAB uses was not written by The Mathworks but is actually a 3rd party library. You *might* be able to find out some info about it on the web somewhere but I am guessing that most of the details are proprietary.

James Tursa
From: Stéphane on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hpnorh$idd$1(a)fred.mathworks.com>...
> "Stéphane " <jalisastef(a)yahoo.ca> wrote in message <hpnmqc$gu7$1(a)fred.mathworks.com>...
> > Hello,
> > Matlab's algorithm compensates this rounding error amplification.
>
> Not that I'm aware of (in the strict sense). However Matlab uses BLAS for full matrix product, and there is one thing you should pay attention is the so called "floating point control word" (FPCW) that control the precision on the internal computation inside the (Intel) processor. There is an compilation option on MSVC to allow to switch the FPCW between two states, otherwise you can control finer within the program. I believe Matlab by default uses 53-bit internal calculation. You should at least use as much.
>
> Bruno

Hello Bruno,

you're right about the 53 bits... it is the same for MSVC however, the double format has a 53 bits mantissa.
Thank for the clue about MSVC, I changed the precision from precise to strict in the floating point optimization option. I will let you know about it, tests are under process right now.

Stéphane
From: James Tursa on
"Stéphane " <jalisastef(a)yahoo.ca> wrote in message <hpnmqc$gu7$1(a)fred.mathworks.com>...
>
> I am currently working on my own implementation in C++ of Strassen's fast matrix multiplication algorithm.
> Everything works fine, except for big matrices full of big float numbers. Neither my new algorithm nor the former (O(n^3)) can stand the comparison with the matlab's result, the delta is just way too big for every and each element.

Can you provide a short code sample that generates the matrices that are giving you problems?

James Tursa