From: Andrew on 24 Jun 2010 09:49

I always thought the answer was 15, assuming IEEE (seems fairly safe to
assume that these days for all but some esoteric embedded environments).
Then I came across a C++ program that requires 16, and with Visual Studio
it seems to work. Then I recompiled it with GCC and spotted several
differences. All the program did was read in a file with several numbers
and output those numbers again in a different way, but the program held
the numbers as double rather than string. Here are some examples of the
differences I found:

VS                   GCC
-937566.2364699869   -937566.2364699868
-939498.8118815000   -939498.8118814999
928148.9855319375    928148.9855319374
543195.7159558449    543195.7159558448

Checking against the original input file, VS is the one that gets it
right. Can anyone comment on why the difference with GCC, please?

Regards,

Andrew Marlow

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
From: George Neuner on 24 Jun 2010 17:41

On Thu, 24 Jun 2010 18:49:36 CST, Andrew <marlow.andrew(a)googlemail.com>
wrote:

>I always thought the answer was 15, assuming IEEE (seems fairly safe
>to assume that these days for all but some esoteric embedded
>environments). Then I came across a C++ program that requires 16. And
>with Visual Studio it seems to work. Then I recompiled it with GCC and
>spotted several differences. All the program did was to read in a file
>with several numbers and output those numbers again in a different
>way. But the program held the numbers as double rather than string.
>Here are some examples of the differences I found:
>
>VS                   GCC
>-937566.2364699869   -937566.2364699868
>-939498.8118815000   -939498.8118814999
>928148.9855319375    928148.9855319374
>543195.7159558449    543195.7159558448
>
>Checking against the original input file, VS is the one that gets it
>right. Can anyone comment on why the difference with GCC please?

IEEE double precision isn't 16 significant figures ... it's actually
about 15.9 on average (53 mantissa bits * log10(2) ~= 15.95). Some code
that requires 16 digits works and some doesn't, so it's best never to
push the limit. Also keep in mind that floating point is an
*approximation* of a real number, and many base-10 numbers don't have a
finite representation in base 2.

You say VC++ is doing the right thing, but are you certain? The VC++
stream library has a known precision issue when writing and then reading
back floating point values as text using different representations. The
fallback answer in VC++ is to read FP data using scanf if you don't know
how it was written (scanf always works). The G++ stream library isn't
known to have such problems. If you're changing data representations,
it's more likely that G++'s results are correct.

Most likely the difference is in the stream parsers in the respective
compiler libraries, but it would help if you post both the code and the
test data.

George
From: Helge Kruse on 24 Jun 2010 17:54

"Andrew" <marlow.andrew(a)googlemail.com> wrote in message
news:bc1605e1-5c21-4c40-8e72-434fd22746d3(a)j8g2000yqd.googlegroups.com...
> I always thought the answer was 15,

I heard the answer would be 42. But I may be wrong. ;-) Could you
elaborate on the question a bit more?

> Here are some examples of the differences I found:
>
> VS                   GCC
> -937566.2364699869   -937566.2364699868
> -939498.8118815000   -939498.8118814999
> 928148.9855319375    928148.9855319374
> 543195.7159558449    543195.7159558448

What did you compare?

Sorry for my lack of wisdom,
Helge
From: Andrew on 24 Jun 2010 22:11

On 25 June, 08:41, George Neuner <gneun...(a)comcast.net> wrote:
> On Thu, 24 Jun 2010 18:49:36 CST, Andrew
>
> >VS                   GCC
> >-937566.2364699869   -937566.2364699868
> >-939498.8118815000   -939498.8118814999
> >928148.9855319375    928148.9855319374
> >543195.7159558449    543195.7159558448
>
> >Checking against the original input file, VS is the one that gets it
> >right. Can anyone comment on why the difference with GCC please?
>
> You say VC++ is doing the right thing, but are you certain?

Yes. I looked up the values in the original input file. They tally with
the output file.

> The VC++ stream library has a known precision issue when writing and
> reading back floating point values as text using different
> representations.

What issues are these? Can you give a reference, please?

> The fallback answer in VC++ is to read FP data
> using scanf if you don't know how it was written (scanf always works).

Actually, I think the way to ensure that the numbers keep their exact
string representation is to hold them as strings when the file is read
in. This was not done for memory reasons: holding each number as a
double takes only 8 bytes, but a string takes around twice that. The
files being processed are hundreds of megabytes, and the program is
close to blowing up due to lack of memory.

> The G++ stream library isn't known to have such problems. If you're
> changing data representations, it's more likely that G++'s results are
> correct.

I understand, and normally I trust GCC and don't trust VS, but this time
VS has the correct behaviour. I diff'd the output built with VS against
the output built with GCC and selected a few lines that were different.
I looked up the VS strings in the original input file and found them. I
did not find the different strings produced by the GCC run.
> Most likely the difference is in the stream parsers in the respective
> compiler libraries, but it would help if you post both the code and
> the test data.
>
> George

I can prove there is a problem. Here is a little program:

#include <algorithm>
#include <cmath>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

double convertStringToValue(const std::string& input)
{
    double value;
    std::stringstream str(input);
    str >> value;
    return value;
}

std::string formatValue(double value)
{
    std::stringstream str;
    int prec = std::min<int>(15 - (int)log10(fabs(value)), 15);
    str << std::fixed << std::setprecision(prec) << value;
    return str.str();
}

int main()
{
    std::string input = "-937566.2364699869";
    double value = convertStringToValue(input);
    std::string converted = formatValue(value);
    if (input == converted)
        std::cout << input << " converted ok." << std::endl;
    else
    {
        std::cout << "Conversion failed:" << std::endl
                  << "Input:     " << input << std::endl
                  << "Converted: " << converted << std::endl;
    }
    return 0;
}

Here's what I get when I run it (built using GCC 4.4.2):

Conversion failed:
Input:     -937566.2364699869
Converted: -937566.2364699868

-Andrew Marlow
From: SG on 24 Jun 2010 22:11
On 25 Jun., 02:49, Andrew wrote:
> [Subject: floating point, how many significant figures?]
> I always thought the answer was 15, assuming IEEE (seems fairly safe
> to assume that these days for all but some esoteric embedded
> environments).

It may be safe to assume. Just for the record: C and C++ only guarantee
float to have a precision of at least about 6 significant decimal
digits, and double and long double at least about 10.

> Then I came across a C++ program that requires 16. And
> with Visual Studio it seems to work. Then I recompiled it with GCC and
> spotted several differences. All the program did was to read in a file
> with several numbers and output those numbers again in a different
> way. But the program held the numbers as double rather than string.
> Here are some examples of the differences I found:
>
> VS                   GCC
> -937566.2364699869   -937566.2364699868
> -939498.8118815000   -939498.8118814999
> 928148.9855319375    928148.9855319374
> 543195.7159558449    543195.7159558448
>
> Checking against the original input file, VS is the one that gets it
> right. Can anyone comment on why the difference with GCC please?

937566.2364699869 =
11100100111001011110.00111100100010010100110000001100001110...

The closest number representable as an IEEE-754 64-bit float is

11100100111001011110.001111001000100101001100000011000
= 937566.2364699868485...

and the closest representable 16-digit decimal number is

937566.2364699868

So, the program you compiled with GCC did a good job. The four double
values closest to your decimal string are

937566.2364699867321...
937566.2364699868485...
937566.2364699869649...
937566.2364699870813...

It seems the program you used to create your input file did a bad job,
because none of these numbers should be approximated with the string
"937566.2364699869". If you're interested in a lossless
double->string->double round trip, you should use 17 decimal digits and
high-quality conversions.

Cheers!
SG