From: Ray Mitchell on 25 Feb 2010 16:16

"Igor Tandetnik" wrote:

> Ray Mitchell <RayMitchell(a)discussions.microsoft.com> wrote:
> > "Igor Tandetnik" wrote:
> >> 6.7.2.1p14 The size of a union is sufficient to contain the largest
> >> of its members. The value of at most one of the members can be
> >> stored in a union object at any time.
> >
> > I don't agree that this makes reading a member that was not most
> > recently written undefined as long as the member being read shares
> > all of its bytes with the recently written member. Concerning the
> > "older" members of a union,
> > 6.2.6.1p7 of the standard says, "When a value is stored in a member
> > of an object of union type, the bytes of the object representation
> > that do not correspond to that member but do correspond to other
> > members take unspecified values, but the value of the union object
> > shall not thereby become a trap representation."
>
> This just says that the union shouldn't turn into something that the
> CPU would throw a hardware exception on (some architectures have bit
> patterns that cause the CPU to do so - known as "trap representations").
>
> > When the compiler
> > generates code to access the various union members, that code merely
> > accesses the appropriate bytes in the common object and interprets
> > them in the way appropriate to that member's data type. The code to
> > do this is "permanent" and does not change just because another
> > member was recently written.
>
> Of course not. But the program that necessitates running this code
> exhibits undefined behavior. Consider:
>
>     int* p = malloc(sizeof(int));
>     *p = 1;
>     if (rand() % 2) {
>         free(p);
>     }
>     *p = 42;
>
> Code that assigns 42 to *p doesn't change just because memory is freed.
> Nevertheless, if it was indeed freed, that line exhibits undefined
> behavior - it accesses an object whose lifetime has ended.
>
> > Instead, the access is made without any
> > memory of what might have happened to the object previously.
>
> That doesn't make such access any more valid.
>
> > As a
> > result the values of the bytes being read are exactly the values that
> > were written.
>
> Not necessarily. The compiler can legally optimize away the assignment
> to one member of the union, seeing that the member is never read
> afterwards. See also
>
> http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Optimize-Options.html#index-fstrict_002daliasing-542
>
> (note that GCC doesn't perform this optimization in the simple case -
> only because there's too much invalid code in existence that would be
> broken by it). If the compiler does that, then no value is written, and
> the value read is random garbage.
>
> >> 6.2.4p2 The lifetime of an object is the portion of program
> >> execution during which storage is guaranteed to be reserved for it.
> >> An object exists, has a constant address, and retains its
> >> last-stored value throughout its lifetime. If an object is referred
> >> to outside of its lifetime, the behavior is undefined.
> >
> > I agree totally, and the object in this case is the underlying memory
> > common to all members.
>
> Not quite. The union as a whole is an object, and each union member is
> itself an object:
>
> 6.2.5p20 A union type describes an overlapping nonempty set of member
> objects, each of which has an optionally specified name and possibly
> distinct type.
>
> Remember also 6.7.2.1p14: "The value of at most one of the members can
> be stored in a union object at any time." Thus, one member object
> cannot possibly "retain its last-stored value" when another member is
> assigned to - the union can only hold one value at a time.
>
> C++ standard states this more explicitly:
>
> 3.8p1 ...The lifetime of an object of type T ends when: ... the storage
> which the object occupies is reused...
>
> > But this is unrelated to the issue we're
> > discussing since the lifetime of the object does not end between the
> > write and the read.
>
> Lifetime of the union doesn't, but lifetime of the member object whose
> storage has been hijacked does.
>
> >> However, this doesn't give you much, in view of aforementioned
> >> 6.2.6.1p1 - in general, you have no idea what to expect when looking
> >> at individual bytes of an object.
> >
> > But my original example was not the general case. It merely set the
> > value
> > of an integral type to a value of 1, and I believe that guarantees
> > that only the least significant bit will be a 1.
>
> What is the basis for this belief? It is my turn now to demand chapter
> and verse.
> --
> With best wishes,
> Igor Tandetnik
>
> With sufficient thrust, pigs fly just fine. However, this is not
> necessarily a good idea. It is hard to be sure where they are going to
> land, and it could be dangerous sitting under them as they fly
> overhead. -- RFC 1925

Thanks Igor,

I think I'm finally beginning to see the light on some of the things you have been saying. I did not consider the various optimizations and intermediate operations that might be performed that would render "old" members invalid.

> > of an integral type to a value of 1, and I believe that guarantees
> > that only the least significant bit will be a 1.
>
> What is the basis for this belief? It is my turn now to demand chapter
> and verse.

I was basing my assertion on the fact that positive integral values must use a pure binary representation for their values (6.2.6.2p1). Then, by definition, doesn't the least significant bit have to be a 1 to represent a value of 1? And if there happen to be padding bits, then I suppose that such a bit could occupy the "farthest right" bit position. But that bit would then not be called the least significant bit, would it? Or I suppose that in some screwed up implementation the value order of the value bits would not necessarily be the physical order of the bits in the object, but isn't the least significant bit still going to be a 1 no matter what physical position it occupies? Is this what you are questioning?

The concept of padding bits does bring up another question that I thought I understood, however: If an unsigned integral object is set to a value of 1, and the value of the object is then repeatedly shifted left by 1 until its value becomes 0, I always assumed that this was a portable way to determine the number of value bits in the data type of that object. Now I'm beginning to wonder if the padding bits might also get included in the count. If this is the case, however, it seems to do away with the ability to do efficient multiplications/divisions by powers of 2 by merely shifting instead.

Thanks for your detailed explanations,
Ray

From: Igor Tandetnik on 25 Feb 2010 17:36

Ray Mitchell <RayMitchell(a)discussions.microsoft.com> wrote:
> "Igor Tandetnik" wrote:
>
>> Ray Mitchell <RayMitchell(a)discussions.microsoft.com> wrote:
>>> of an integral type to a value of 1, and I believe that guarantees
>>> that only the least significant bit will be a 1.
>>
>> What is the basis for this belief? It is my turn now to demand
>> chapter and verse.
>
> I was basing my assertion on the fact that positive integral values
> must use a pure binary representation for their values (6.2.6.2p1).

Only value bits participate in pure binary representation. There may be padding bits sprinkled around arbitrarily.

> Then, by definition, doesn't the least significant bit have to be a 1
> to represent a value of 1? And if there happens to be padding bits,
> then I suppose that such a bit could occupy the "farthest right" bit
> position. But that bit would then not be called the least
> significant bit would it?

Ok, so you define "least significant bit" as "the value bit that would be set to 1 in the representation of the integer whose value is 1". Then you state that in an integer whose value is 1, the least significant bit is necessarily set to 1. Yes, this statement is trivially true, but I don't quite see how this circular definition helps you in your goal of inferring properties of the architecture by inspecting binary representation of select integers. The bit you appear to be interested in is

    ((unsigned char*)&number)[0] & 1 // [1]

whatever you want to call it.

> Or I suppose that in some screwed up
> implementation the value order of the value bits would not
> necessarily be the physical order of the bits in the object

I don't think that is allowed.

6.2.6.1p3 Footnote 40: A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position.

Though I guess it's arguable what "successive bits" means, as it is never formally defined.

> but
> isn't the least significant bit still going to be a 1 no matter what
> physical position it occupies?

Under your circular definition, yes.

> Is this what you are questioning?

I was assuming that by "least significant bit" you mean the bit in physical position zero (as defined by [1] above), because that's the definition that is actually relevant to your original question.

> The concept of padding bits does bring up another question that I
> thought I understood, however: If an unsigned integral object is set
> to a value of 1, then the value of the object is repeatedly shifted
> left by 1 until its value becomes 0, I always assumed that this was a
> portable way to determine the number of value bits in the data type
> of that object.

That's correct.

> Now I'm beginning to wonder if the padding bits
> might also get included in the count.

No. Shifts are defined arithmetically, not in terms of physical representation:

6.5.7p4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1*2^E2, reduced modulo one more than the maximum value representable in the result type.

Bitwise operations (&, |, ^) are another matter.
--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

From: Barry Schwarz on 25 Feb 2010 21:39

On Wed, 24 Feb 2010 22:57:01 -0800, Ray <Ray(a)discussions.microsoft.com> wrote:

>"Igor Tandetnik" wrote:

snip

>> Assigning to one member of the union and then reading another exhibits undefined behavior.
>
>Where did you get this information? Could you please refer me to the
>appropriate section of the C standard that states this is the case? I
>searched through the C99 standard and could find nothing that either directly
>stated nor implied this undefined behavior.

Logically, to me at least, since C89 called it implementation defined behavior.

C99 "fixed" it with footnote 82 to paragraph 6.5.2.3 (in n1256): "If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation."

However, it apparently went through several iterations. Paragraph 6.5.2.2 of n869 says "With one exception, if the value of a member of a union object is used when the most recent store to the object was to a different member, the behavior is implementation-defined." The exception is for structure members of the union. This is almost identical to the C89 wording.

n1124 doesn't seem to address it at all.

--
Remove del for email

From: Ulrich Eckhardt on 26 Feb 2010 07:45

David Lowndes wrote:
>>Let's assume a fictional long double type that is 12 bytes large. However,
>>it doesn't need a 12-byte alignment but actually a 16-byte alignment
>>because the FPU says so.
>
> I could argue that in essence that makes the type really 16 bytes
> though.

True. 12 significant bytes plus 4 bytes padding.

> I'm still not convinced by fictional things. Does anyone have a real
> example that would truly illustrate this?

    struct X { int i; char c; };

Typical layout would be four bytes for the int, one for the character and then one or three bytes of padding.

Uli

--
C++ FAQ: http://parashift.com/c++-faq-lite

Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
