Saving C structures to file -- best practices [Embedded]

Prev: 50 ohm PCB antenna track impedance
Next: BT earpieces

From: Hans-Bernhard Bröker on 18 Apr 2010 08:31

ChrisQ wrote:

> And yes, where do you get the count for your memcopy ?.

The same way your approach does: by the sizeof operator which exists for
exactly that purpose.

> The usual hack is to sizeof (struct),

So you basically agree that your own offering is a hack. Interesting
you should get so wound up about defending a hack.

> If this is the best you can do, it's no surprise that software has so
> many bugs :-)...

How exactly did you reach the conclusion that I considered this "the
best" in any way, from me saying this about it:

> It's about the worst possible way to express worst practice.

In case you didn't get it: that "worst practice" I was talking about
(and the OP, too) is direct binary copying of entire C data structures
to the outside world. And the "worst possible to express" it is the
union hack.

The reason the Union hack is even worse than direct memcpy or fwrite is
that it directly invokes undefined behaviour. That's about the worst
kind of mistake a C program can contain while still being compilable.

From: ChrisQ on 9 May 2010 14:11

John Devereux wrote:
> ChrisQ <meru(a)devnull.com> writes:
>
>> Hans-Bernhard Bröker wrote:
>>> ChrisQ wrote:
>>>> May be missing something here, but how about this ?:
>>>>
>>>> typedef struct {
>>>> ...
>>>> ...
>>>> ...
>>>> } DATA;
>>>>
>>>> typedef union {
>>>> struct DATA sData;
>>>> U8 u8Data [sizeof (struct DATA)];
>>>> } uData;
>>> This achieves nothing useful at all. There's nothing to be gained
>>> by having that union, compared to a plain and simple equivalent of a
>>> memcpy() or fwrite()/fread() function, operating directly on a
>>> pointer to the struct (cast to unsigned char*).
>>>
>>> In a nutshell this is the opposite of best practice. It's about the
>>> worst possible way to express worst practice.
>> That's very good. Two paragraphs having nothing constructive to say at
>> all :-).
>
> But true.
>
>> We can agree to differ, but aliasing a structure to an array is a good
>> technique to ensure that the correct number of bytes get written to
>> the file, irrespective of the cpu architecture, compiler and future
>> changes to the structure.
>
> So what happens when the compiler changes the struct layout so the
> length changes? Exactly the same thing as if you had simply used the
> plain struct in the first place!
>

Late again :-), but not much time for usenet recently. Because the
association between the array and struct is defined, you can add or
remove elements, change compiler and it still works. Discuss ?.

>> If you don't understand the importance of data structures in
>> programming, then I guess there's no point in continuing...
>
> It's not that simply writing a bit copy of the struct to a file is a
> great idea, But your wrapper does not appear to improve on it at all. It
> just obfuscates what is going on and introduces a gratuitous extra layer
> when accessing it.
>

I find it usefull, as it formalises the association between the struct
and the data to save. The compiler guarantees that the array is at least
as large as the structure. If you do something like:

fwrite(&struct, sizeof(struct), 1, fp);

what you are actually saying is that a struct is an array, which is
sloppy and makes the association loose at best. Formalising the
association into a union makes the intent clear to the compiler and to
anyone you has to maintain the code. It's a small point, but I use such
techniques all the time because I know that it will always work and yes,
a bit more overhead, but trivial in practice...

Regards,

Chris

From: John Devereux on 9 May 2010 14:59

ChrisQ <meru(a)devnull.com> writes:

> John Devereux wrote:
>> ChrisQ <meru(a)devnull.com> writes:
>>
>>> Hans-Bernhard Bröker wrote:
>>>> ChrisQ wrote:
>>>>> May be missing something here, but how about this ?:
>>>>>
>>>>> typedef struct {
>>>>> ...
>>>>> ...
>>>>> ...
>>>>> } DATA;
>>>>>
>>>>> typedef union {
>>>>> struct DATA sData;
>>>>> U8 u8Data [sizeof (struct DATA)];
>>>>> } uData;
>>>> This achieves nothing useful at all. There's nothing to be gained
>>>> by having that union, compared to a plain and simple equivalent of a
>>>> memcpy() or fwrite()/fread() function, operating directly on a
>>>> pointer to the struct (cast to unsigned char*).
>>>>
>>>> In a nutshell this is the opposite of best practice. It's about the
>>>> worst possible way to express worst practice.
>>> That's very good. Two paragraphs having nothing constructive to say at
>>> all :-).
>>
>> But true.
>>
>>> We can agree to differ, but aliasing a structure to an array is a good
>>> technique to ensure that the correct number of bytes get written to
>>> the file, irrespective of the cpu architecture, compiler and future
>>> changes to the structure.
>>
>> So what happens when the compiler changes the struct layout so the
>> length changes? Exactly the same thing as if you had simply used the
>> plain struct in the first place!
>>
>
> Late again :-), but not much time for usenet recently. Because the
> association between the array and struct is defined, you can add or
> remove elements, change compiler and it still works. Discuss ?.

No, both techniques (your one and the "plain struct" approach) fail in
an identical way when changing the number of elements or when a new
compiler decides to re-arrange them. The elements will be in different
locations. If you read an old structure with the new program (after a
firmware update say) it will be corrupted. Wrapping the struct in a
union does absolutely nothing to correct this!

>>> If you don't understand the importance of data structures in
>>> programming, then I guess there's no point in continuing...
>>
>> It's not that simply writing a bit copy of the struct to a file is a
>> great idea, But your wrapper does not appear to improve on it at all. It
>> just obfuscates what is going on and introduces a gratuitous extra layer
>> when accessing it.
>>
>
> I find it usefull, as it formalises the association between the struct
> and the data to save. The compiler guarantees that the array is at
> least as large as the structure. If you do something like:
>
> fwrite(&struct, sizeof(struct), 1, fp);
>
> what you are actually saying is that a struct is an array, which is
> sloppy and makes the association loose at best.

Not sure what you mean here. How would your approach be any different?
The fwrite function appears to assume an array of elements (optionally
with just one member as above). If you mean that the call above says
that each element is *itself* an array, I disagree. The function
signature is

size_t fwrite(const void *ptr, size_t size, size_t nmemb,FILE *stream)

So the "elements" are generic objects of length <size>, not specifically
arrays. Of course everything is an "array of bytes" in the end but note
that the first parameter is *not* "unsigned char*" (nor "U8*" either).

> Formalising the association into a union makes the intent clear to the
> compiler and to anyone you has to maintain the code. It's a small
> point, but I use such techniques all the time because I know that it
> will always work and yes, a bit more overhead, but trivial in
> practice...

I agree with the principle of making the intent clear - but I disagree
that you are doing that here! :)

--

John Devereux

From: Hans-Bernhard Bröker on 9 May 2010 15:25

ChrisQ wrote:

> Late again :-), but not much time for usenet recently. Because the
> association between the array and struct is defined, you can add or
> remove elements, change compiler and it still works. Discuss ?.

It still works not a bit better than without the union. In particular,
you still have to spell out the sizeof(variable) or sizeof(type)
whenever you hand this data to a generic "save this data" function.

And no, it does most decidedly _not_ just "still work" when you change
compilers. The data format created by this is every bit as
platform-dependent as that of a plain fwrite(&struct,...). Because it's
the same.

> I find it usefull, as it formalises the association between the struct
> and the data to save.

It formalizes an association between two things one of which serves no
actual purpose at all. I doesn't gain you anything on the one
association that actually needs to be established in this context: that
between the struct itself and sizeof(struct). You'll end up writing

fwrite(&u.array, 1, sizeof(u.array), fp);

when you could have written

fwrite(&s, 1, sizeof(s), fp);

instead. And you really think _that_ is an improvement?

From: ChrisQ on 10 May 2010 09:58

John Devereux wrote:
>
> No, both techniques (your one and the "plain struct" approach) fail
> in an identical way when changing the number of elements or when a
> new compiler decides to re-arrange them. The elements will be in
> different locations. If you read an old structure with the new
> program (after a firmware update say) it will be corrupted. Wrapping
> the struct in a union does absolutely nothing to correct this!
>

If you need to exchange data between systems, or if file format must be
constant, then it is a problem. The obvious solution is to
encapsulate all format conversion ops into a separate module and limit
access to the data structure(s) through function calls in that module.
That simplifies the management of risk for future maintenance. That's how
it would typically be done here and would also do it the long way round.
ie: Do the conversion from struct element(s) to an array element by element,
line by line. Then add plenty of commentary in the module and docs for
future maintainers.

A good system design should always separate internal and external data
formats via conversion modules, but how often is that done in practice
?. Typically, the internal struct is declared global, then peeked /
poked from everywhere and we wonder why the resulting system is a
nightmare to maintain and full of sleeping bugs.

Many embedded systems don't have the luxury of a clib style file system
(ie: only flash or cmos ram) and the internal "file system" is designed
/ optimised to suit the app. Typically, the format needs no correlation
between versions, so changes to the structure have no effect at all.
Using a union to alias types involves some overhead, but
it does make the implied conversion clear. I would use such techniques
for any slightly dodgy conversion as it makes the intent clear to the
reader and compiler. After good design, perhaps clarity should be the
next most important thing. Good code should flow down the page, with no
pauses to do mental gymnastics re the programmers intent :-).

> So the "elements" are generic objects of length <size>, not
> specifically arrays. Of course everything is an "array of bytes" in
> the end but note that the first parameter is *not* "unsigned char*"
> (nor "U8*" either).
>

One of the reasons to avoid the standard clib is that all the
types are wrong and casts are nearly always required. Far better
to develop your own libs over projects / years. I know half the world
runs embedded linux now, but can't shake off the perhaps neanderthal
view that unix is a better fit and unsurpassed on the desktop. Going
against the grain I know, but...

Regards,

Chris

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: 50 ohm PCB antenna track impedance
Next: BT earpieces