From: Holger Sebert on 24 Nov 2005 03:05

Hi,

thank you all for your answers.

The data I have to write are huge blocks of floating point data (of type float or double) coming out of numerical applications. So writing them in text format is out of the question.

These data blocks are generated on some big computer (whose architecture may be mysterious) and then post-processed on ordinary PCs for, e.g., visualisation. Up to now everything has worked fine except concerning endianness, but I see the danger that things could get messed up in the future.

A general purpose serialization library might be overkill, or not specialized enough (furthermore, I am obliged to keep the library dependencies as small as possible).

Does anyone know what, all in all, I have to consider when dealing portably with binary floating point data, and could you give a link or something? Or is it perhaps sufficient just to use some typedefs and hope there won't be 80-bit floats that have to be read on a 64-bit-float machine ... ?

Regards,
Holger

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
From: kanze on 24 Nov 2005 12:19

Holger Sebert wrote:

> The data I have to write are huge blocks of floating point
> data (of type float or double) coming out of numerical
> applications. So writing them in text format is out of the
> question.

Be careful here. A good binary representation of floating point is a lot trickier than for the other types. I'd still consider text, although it's true that conversion of floating point to text (and vice versa) is also a lot more costly than for other formats.

> These data blocks are generated on some big computer (whose
> architecture may be mysterious) and then post-processed on
> ordinary PCs for, e.g., visualisation. Up to now everything
> has worked fine except concerning endianness, but I see the
> danger that things could get messed up in the future.

And how. You don't give any information concerning the "some big computer", but be aware that some big computers use a floating point format that is completely incompatible with that on a PC.

> A general purpose serialization library might be overkill, or
> not specialized enough (furthermore, I am obliged to keep the
> library dependencies as small as possible).
>
> Does anyone know what, all in all, I have to consider when
> dealing portably with binary floating point data, and could
> you give a link or something?

I'm familiar with the BER format, but it might be overkill; it can also be very expensive to decode. About the only other portable floating point format I know of is text.

I would give careful consideration to the set of machines over which the code must work. Most new machines support IEEE; your only real risk is legacy architectures, like the IBM 390, and even these are moving toward IEEE. (Java requires it.) If you can assume that all of the machines use IEEE, and that NaNs are never transmitted, then I think you can make do with viewing the double as if it were an unsigned long long, and transmitting that.
> Or is it perhaps sufficient just to use some typedefs and hope
> there won't be 80-bit floats that have to be read on a
> 64-bit-float machine ... ?

The only 80-bit floats that I know of are IEEE extended precision, which would be mapped to long double, if they are supported at all. In the past, one of the most important big machines for numeric work was the CDC, which used a 60-bit float (and a 120-bit double), but if you don't currently have to support these, you almost certainly won't in the future.

But the number of bits isn't the only problem. IBM 390s natively use a base-16 format, rather than base 2. Some years back, IBM introduced IEEE support on this hardware as an option; at least on the early machines supporting it, IEEE was significantly slower than the native format, and even if that is no longer a problem, there is still the issue of files written in the native format, which could force use of it rather than IEEE.

Trying to be portable to every legal format is probably a waste of time. The IEEE format (used on PCs, Sparcs, HP's PA and IBM's PowerPC architectures) has become more or less a standard; unless you have concrete reasons to assume that you will have to support something else, I'd limit my support to that until more became a concrete necessity. (Of course, I would encapsulate the "conversion" routines, so that if more became necessary, I'd know where the changes have to be made, and they wouldn't ripple through the entire program.)

--
James Kanze                                GABI Software
Conseils en informatique orientée objet/
                 Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
From: Christopher Yeleighton on 25 Nov 2005 13:22

"Holger Sebert" <holger.sebert(a)ruhr-uni-bochum.de> wrote in message news:3ukituF124586U1(a)news.dfncis.de...

> Does anyone know what, all in all, I have to consider when
> dealing portably with binary floating point data, and could
> you give a link or something?

http://webstore.ansi.org/ansidocstore/product.asp?sku=INCITS%2FISO%2FIEC+8825%2D1%2D1998

Chris
From: Carl Barron on 26 Nov 2005 09:32

kanze <kanze(a)gabi-soft.fr> wrote:

> basic_ios also does error handling. What there is of it,
> anyway. Use the streambuf if you don't need that.
>
> The streambuf does character code translation. Don't use
> streambuf if you don't want that.

Stream buffer classes can do char code translation, but I don't see any requirement that they always do so. In fact, stringbuf usually does not :) This is a legal stream buffer class:

    struct membuf : public std::streambuf
    {
        // a simple sequential read of the memory block [a, a+n)
        membuf(char* a, int n) { setg(a, a, a + n); }
    };

Definitely do not use filebuf for binary data without a specific do-nothing codecvt ...
From: James Kanze on 26 Nov 2005 23:08
Carl Barron wrote:

> kanze <kanze(a)gabi-soft.fr> wrote:
>> basic_ios also does error handling. What there is of it,
>> anyway. Use the streambuf if you don't need that.
>> The streambuf does character code translation. Don't use
>> streambuf if you don't want that.
>
> Stream buffer classes can do char code translation, but I
> don't see any requirement that they always do so. In fact,
> stringbuf usually does not :)

There is a requirement that filebuf do code translation. The potential is there.

> Definitely do not use filebuf for binary data without a
> specific do-nothing codecvt ...

That's the work-around. It's a fragile solution, but it is the only one available to us. It would be nicer if there were a class with an abstraction which didn't include code translation.

--
James Kanze                       mailto: james.kanze(a)free.fr
Conseils en informatique orientée objet/
        Beratung in objektorientierter Datenverarbeitung
9 pl. Pierre Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34