From: Bruno Luong on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <hte3oq$t98$1(a)fred.mathworks.com>...
>
>
> I disagree. Sorting normally distributed values decreases the accuracy of a sum: the first half of the sum accumulates the negative values. The temporary result is a large negative number. Then some small positive numbers are added - without influencing the sum, due to the round-off!
>
> Sorting normally distributed values according to their absolute value is more accurate:

Ah, I see many people who take care of summing the noise (e.g., normally distributed values) accurately. LOL

Bruno
From: Walter Roberson on
Jan Simon wrote:

> Sorting normally distributed values according to their absolute value is
> more accurate:

That's the approach I normally think of, but then I get into thinking,
"Ah, but suppose I have a negative number and a positive number whose
magnitudes are nearly equal; would I not increase precision by
'canceling' those two numbers pairwise rather than just sorting by
magnitude?" At about this point I get lost in figuring out what the
best approach is.
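
As a minimal sketch of the ordering effect Jan and Walter describe (this is not code from the thread; the Kahan compensated loop is used only as a reference value, and the explicit loops force the summation order, which the built-in sum does not guarantee):

  function sumOrderDemo
  % Compare three summation orders against a compensated reference.
  x = randn(1e6, 1);
  ref = kahanSum(x);
  e1 = abs(orderedSum(x) - ref);             % original order
  e2 = abs(orderedSum(sort(x)) - ref);       % sorted ascending by value
  [dummy, idx] = sort(abs(x));               % sort index by magnitude
  e3 = abs(orderedSum(x(idx)) - ref);        % sorted ascending by magnitude
  fprintf('original: %.3g   sorted: %.3g   sorted by |x|: %.3g\n', e1, e2, e3);
  end

  function s = orderedSum(v)
  % Plain left-to-right summation, so the order is exactly as given.
  s = 0;
  for k = 1:numel(v)
      s = s + v(k);
  end
  end

  function s = kahanSum(v)
  % Kahan compensated summation, used here only as a reference value.
  s = 0;
  c = 0;
  for k = 1:numel(v)
      y = v(k) - c;
      t = s + y;
      c = (t - s) - y;
      s = t;
  end
  end
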
From: Bruno Luong on
Walter Roberson <roberson(a)hushmail.com> wrote in message <lJDKn.12285$7d5.1065(a)newsfe17.iad>...
> Jan Simon wrote:
>
> > Sorting normally distributed values according to their absolute value is
> > more accurate:
>
> That's the approach I normally think of, but then I get into thinking,
> "Ah, but suppose I have a negative number and a positive number whose
> magnitudes are nearly equal; would I not increase precision by
> 'canceling' those two numbers pairwise rather than just sorting by
> magnitude?" At about this point I get lost in figuring out what the
> best approach is.

Yes Walter, sorting is the first reflex people think of. But no one uses it: it has limitations when applied to *real* signals (not noise), not to mention the costly sorting step. Some of the serious approaches are Knuth's, Rump's, and Li et al.'s (the XBLAS implementation).

Note that more recent versions of Matlab do not necessarily sum in the order of the input array. The only way to control the order is to write a MEX file, or to use an existing library (like XSum by Jan).

Bruno
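
For reference, a sketch in the spirit of the Knuth/Rump algorithms Bruno names (this is not the XBLAS or XSum code, only the basic error-free transformation such methods build on):

  function s = sum2(x)
  % Compensated summation: accumulate the exact rounding error of every
  % addition (Knuth's TwoSum) and fold the total error back in at the end.
  s = 0;   % running floating-point sum
  e = 0;   % accumulated rounding errors
  for k = 1:numel(x)
      [s, err] = twoSum(s, x(k));
      e = e + err;
  end
  s = s + e;
  end

  function [s, e] = twoSum(a, b)
  % Knuth's branch-free TwoSum: s = fl(a + b) and a + b == s + e exactly.
  s = a + b;
  z = s - a;
  e = (a - z) + (b - (s - z));
  end
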
From: James Tursa on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hteu4g$kn4$1(a)fred.mathworks.com>...
> "Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <hte3oq$t98$1(a)fred.mathworks.com>...
>
> Ah, I see many people who take care of summing the noise (e.g., normally distributed values) accurately. LOL
>
> Bruno

Yes, I agree. This discussion so far seems to be missing one important thing: the accuracy of the original data. Take this simple list:

-1 1e-20 1

The obvious "best" sum is 1e-20, right? You can see that with your own eyes. You can get it by judicious sorting of the data first to put it in this order and then doing the sum in that order:

-1 1 1e-20

But that leaves out the entire discussion of the accuracy of the original data, which you can easily get at with eps of the largest magnitude number:

eps(1) = 2.2204e-016

So one can go through a lot of effort to use judicious reordering/sorting schemes to get the "best" answer for the sum based on the inputs, but it may not be worth it. E.g., the above "best" answer for the sum based on the inputs is 1e-20, but it is only accurate to about 4e-16, so have we really gained anything as far as the accuracy of the result by reordering this particular list? Is the answer 1e-20 really better than the answer of 0 one would get by summing the list in the original order? Well, that depends. If the original data was generated by some process that produced exact results in the floating point data, then maybe the sum of 1e-20 really *is* better and has more meaning. But if the original data was generated by some process that did not do this (which is usually the case), then you would have to say that the sum answer of 1e-20 really isn't any better than the sum answer of 0 that one would get with a simple sum of the data in the original order.
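
The two sums James describes can be checked directly at the prompt (a sketch with explicit left-to-right additions, so the order is under control):

  x = [-1 1e-20 1];
  s1 = (x(1) + x(2)) + x(3)     % original order:  (-1 + 1e-20) + 1  gives 0
  s2 = (x(1) + x(3)) + x(2)     % reordered:       (-1 + 1) + 1e-20  gives 1e-20
  eps(1)                        % 2.2204e-016, the resolution near the largest term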

Bottom line: one needs to understand the accuracy of the original data. Go ahead and use the "best" scheme if your original data warrants it, but first understand the trade-offs you are making in extra code complexity and execution time, and what you are really buying as far as the "accuracy" of the result is concerned.

James Tursa
From: Rune Allnor on
On 25 May, 17:25, "James Tursa"
<aclassyguy_with_a_k_not_...(a)hotmail.com> wrote:

> Bottom line: one needs to understand the accuracy of the original data. Go ahead and use the "best" scheme if your original data warrants it, but first understand the trade-offs you are making in extra code complexity and execution time, and what you are really buying as far as the "accuracy" of the result is concerned.

You are right, assuming that the data are 'measurements' of
some kind. However, these accuracy issues have a habit of turning
up in the midst of numerical computations.

Assume your analytic expression ends up somewhere along the
lines of

y = exp(x + 1) - exp(x - 1)

where x is 'large'. The end result of this computation need
not be very large, but the intermediate results are such that
several significant digits are lost, which in turn tends to
seriously mess up subsequent computations that rely on y
above.
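
A small toy in the same spirit (not the exp expression above, just the simplest case where reformulating the expression recovers the lost digits):

  x  = 1e8;
  y1 = sqrt(x^2 + 1) - x          % huge intermediates cancel: returns 0
  y2 = 1 / (sqrt(x^2 + 1) + x)    % algebraically identical: returns about 5e-9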

This kind of thing happened in what is known as Thomson and
Haskell's method for modelling acoustic media, which was
developed some time in the 1950s. Because of the large
magnitudes of the intermediate results, what looked like a
simple and convenient modelling method gained a reputation
as a totally unpredictable numerical beast.

Several attempts have been made to come up with numerically
stable methods to do the computations, and all of the
successful ones (notably by Henrik Schmidt in the '80s and
Sven Ivansson in the '90s) relied on carefully and cautiously
re-formulating and re-ordering the individual terms that needed
to be computed.

Rune