From: kanze on 23 Nov 2005 08:45

Le Chaud Lapin wrote:
> kanze wrote:
> > And how does the endian-ness make a difference? I
> > implemented MD5 on a Sparc (big-endian) -- I've since
> > compiled the code on a Linux based PC (little-endian), and
> > it works perfectly well. Is there something fundamentally
> > different between MD5 and SHA-1 which means that SHA-1 is
> > endian dependent? They look very similar in principles to
> > me.

> It definitely makes a difference, but first, let me rant:

You're wasting your time. The arguments are ancient -- when I
started programming, the most widespread machines were the
PDP-11 (little endian) and the IBM 360 (big endian). When 16 bit
microprocessors started appearing, the leaders were Intel
(little endian) and Motorola (big endian).

Today, the only consensus that I know of is for transmission
protocols, and it is contradictory:

 -- Bit order in a byte is *always* little endian. I have never
    heard of an exception.

 -- Byte order in a larger integral value is normally big
    endian, at least for fixed length integers. (BER uses
    little endian for variable length integers -- 7 bits per
    byte, with the top bit 0 if there are still bytes to come,
    and 1 if this is the last byte.)

(Like you, I prefer little endian. But my preferences aren't
going to change anything, and there are a lot more important
issues to be concerned with.)

    [...]
> Getting off my rant box, if the input data is 0x01234567,
> that's 4 bytes.

It depends. Is it four bytes (characters or whatever), or is it
a 32 bit value? If it's four bytes, it's four separate bytes
(and should probably be written 01 23 45 67). If it's a 32 bit
value, it's a 32 bit value.

> In a buffer, beginning at offset 0, either the 0x01 will be
> there or the 0x67 will be there.

In a buffer, it's in the format I decide that it's in. If it's
off the line, it's in the format of the protocol, automatically.
If it's in memory, to be written, it's in the format I wrote it
in.

> When the programmer goes to form a word from this buffer to be
> used in the plethora of Boolean operations of the hash
> algorithm, s/he will have to choose how to extract the bytes
> from this stream to form the word.

If it is a character buffer, and I need words for the SHA or MD5
algorithm, I will, naturally, extract the necessary number of
bytes from the buffer, and construct a word from them according
to the specifications of the protocol. Something like (from my
implementation of MD5):

    GB_uint32_t
    getWord(
        GB_uint8_t const* source )
    {
        return source[ 0 ]
            | (source[ 1 ] << 8)
            | (source[ 2 ] << 16)
            | (source[ 3 ] << 24) ;
    }

I have to do it this way anyway, of course. There's no guarantee
that my client's character buffer is correctly aligned for me to
view it as a word.

> Note that implicit in all of this is that when the stream of
> bytes moves from one machine to another, the bit order is
> preserved for individual bytes, and the byte order is
> preserved for a stream. So if 0x01 is the byte at offset 0
> (big-endian), it will be the byte at offset 0 when it is sent
> as a sequence of bytes. But if a little-endian receiving
> machine gets these 4 bytes, and puts them in increasing
> memory locations starting at 0, and then extracts the 4 bytes
> as a word, the word will become 0x67452301.

If any machine accesses a stream of bytes as if it were in fact
a stream of words, it's looking for trouble. Data has a format.
I never suppose that the format from an external source just
happens to coincide with the internal format of the machine.
I've encountered at least three different byte orders for 32 bit
values, two of them with different compilers on the same
machine. And on the machines I use today (Sun Sparc), trying to
access a word at a misaligned address will cause a core dump.

> So there are two subtleties here. The first is that bit-order
> is guaranteed no matter what from machine to machine.

There is no such thing as bit order within the machine, unless
your machine supports some form of bit addressing. (The PDP-11,
and I think the VAX, did.) There is a consensus that bits will
be sent little endian on the transmission line. The hardware
takes care of assembling them into bytes.

There is also a very real sense in which there is no such thing
as byte order in a machine. You can't see it without doing all
sorts of dirty casts. (I know that there are cases where you
cannot avoid such things -- I've implemented a standard C
library, and I did do some dirty casting in functions like modf,
in order to piddle with the bits of the exponent and the
mantissa directly.)

If you want the low order byte of a 32 bit word, you know how to
get it: word & 0xFF (or just assigning to a uint8_t). If you
want the high order byte, you shift first: word >> 24. (Note
that this works even on a machine with 9 bit chars. Providing,
of course, that the hardware sends the 8 low order bits when
serializing, which is what I would expect.)

> We take that for granted, because it is. The second is that
> no one ever said that we were allowed to treat the input
> stream as a sequence of words. The most that we are allowed
> is bytes. So during both extraction and implantation of
> words, we have to be careful about byte-sequences.

We have to deal with byte sequences, and we have to convert them
to and from words. Both have a specific format, and can be
accessed within C++ without worrying about that format. There's
never any reason here to write platform dependent code.

--
James Kanze                             GABI Software
Conseils en informatique orientée objet/
                  Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
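For the reverse direction -- writing a 32 bit value into a byte
stream in the protocol's format, using only the word & 0xFF
style shifts and masks described above -- a minimal sketch along
the same lines. The name putWord and the <cstdint> types are
illustrative, not from the posted code (which used its own GB_
typedefs):

    #include <cstdint>

    // Hypothetical companion to the getWord shown above: writes
    // a 32 bit value into a byte buffer in little endian order
    // (the MD5 convention), extracting each byte from the value
    // with shifts and masks.  It depends only on values, never
    // on the host's byte order or alignment.
    void
    putWord( std::uint32_t word, std::uint8_t* dest )
    {
        dest[ 0 ] = word         & 0xFF ;   // low order byte first
        dest[ 1 ] = (word >>  8) & 0xFF ;
        dest[ 2 ] = (word >> 16) & 0xFF ;
        dest[ 3 ] = (word >> 24) & 0xFF ;   // high order byte last
    }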
From: kanze on 23 Nov 2005 09:00

gkoreman(a)gmail.com wrote:
> > Is it working?

> No, not working. But it is running.

I understood what you meant. That was just a little attempt at
humor.

Still, it's important to keep in mind that the goal isn't just
to get the code through the compiler, or to have it run without
core dumping. The goal is a correct answer for all legal input,
and an error indication for all illegal input. (In the case of
SHA, I think all input can be considered legal.)

> The link at the top looks very useful. Thank you.

> The 64bit size issue: An early implementation decision to
> limit complexity. This limits the message size, but this was
> not a bug.

That wasn't the point. The protocol says that the size is sent
on 64 bits. That means that you *must* put 64 bits of length in
the buffer. The protocol is big endian, so your last 32 bits of
padding are being interpreted as the upper part of the length.
If they are all zero (which will usually be the case), the error
won't be apparent. If the length of the user message is such
that the 0x80 trailer falls in those bits, the digest will be
wrong.

It's perfectly acceptable to restrict the input, and say that
you cannot handle more than 2^32 bytes, or even 2^27 (so that
the multiplication by 32 cannot overflow). But in that case, you
still must insert 64 bits into the stream, even if the top 32
are guaranteed zero.

Generally, it's fairly trivial to accept 2^32 bytes. Just assign
the length to a uint64_t (part of the C standard, not yet the
C++ standard, but often present anyway -- #include <stdint.h>)
before doing anything with it. And insert 'length >> 32' (high
word) and 'length & 0xFFFFFFFF' (low word) into your buffer. I
personally wouldn't bother with more, although in my
implementation of MD5, doing so would be trivial, since I stream
the input, processing one block at a time, rather than requiring
the user to pass me the entire message in one go.

--
James Kanze                             GABI Software
Conseils en informatique orientée objet/
                  Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
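A minimal sketch of that suggestion, assuming the big endian,
64 bit length field the protocol requires (the names
appendLength and lengthInBytes are illustrative, not from the
thread):

    #include <cstdint>

    // Hypothetical sketch: append the 64 bit, big endian length
    // (in bits) at the end of the final padded block.  Widening
    // to uint64_t first means the multiplication by 8 cannot
    // overflow, even for messages near 2^32 bytes.
    void
    appendLength( std::uint32_t lengthInBytes, std::uint8_t* dest )
    {
        std::uint64_t lengthInBits = std::uint64_t( lengthInBytes ) * 8 ;
        std::uint32_t high = std::uint32_t( lengthInBits >> 32 ) ;
        std::uint32_t low  = std::uint32_t( lengthInBits & 0xFFFFFFFF ) ;
        for ( int i = 0 ; i < 4 ; ++ i ) {
            dest[ i ]     = (high >> (24 - 8 * i)) & 0xFF ;  // high word first
            dest[ i + 4 ] = (low  >> (24 - 8 * i)) & 0xFF ;  // then low word
        }
    }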
From: Greg Herlihy on 24 Nov 2005 03:05

kanze wrote:
> Le Chaud Lapin wrote:
> > kanze wrote:
> > > And how does the endian-ness make a difference? I
> > > implemented MD5 on a Sparc (big-endian) -- I've since
> > > compiled the code on a Linux based PC (little-endian), and
> > > it works perfectly well. Is there something fundamentally
> > > different between MD5 and SHA-1 which means that SHA-1 is
> > > endian dependent? They look very similar in principles to
> > > me.
>
> > It definitely makes a difference, but first, let me rant:
>
> You're wasting your time. The arguments are ancient -- when I
> started programming, the most widespread machines were the
> PDP-11 (little endian) and the IBM 360 (big endian). When 16
> bit microprocessors started appearing, the leaders were Intel
> (little endian) and Motorola (big endian).
>
> Today, the only consensus that I know of is for transmission
> protocols, and it is contradictory:
>
> -- Bit order in a byte is *always* little endian. I have never
>    heard of an exception.

If big endian has the most significant byte of a multiple byte
value ordered first, wouldn't a big endian byte have its most
significant bit ordered first? If so, wouldn't bit order in a
byte always be big endian, and never little endian?

> [...]
> > Getting off my rant box, if the input data is 0x01234567,
> > that's 4 bytes.
>
> It depends. Is it four bytes (characters or whatever), or is
> it a 32 bit value? If it's four bytes, it's four separate
> bytes (and should probably be written 01 23 45 67). If it's a
> 32 bit value, it's a 32 bit value.

MD5 treats the input as 32-bit values in 4-byte little-endian
format. SHA1 treats the input as 32-bit values in 4-byte
big-endian format (the actual order of the bytes from the point
of view of its author is not relevant for the purpose of
creating a digest for them). If a machine calculating a digest
does not interpret the four input bytes with the endianness that
the algorithm requires, then it must rearrange those bytes so
that it does. Otherwise, the subsequent additions and rotations
applied to the input value on a big endian machine would produce
a completely different result from the one calculated on a
little-endian machine - because the starting value used in those
calculations was not the same.

> > In a buffer, beginning at offset 0, either the 0x01 will be
> > there or the 0x67 will be there.
>
> In a buffer, it's in the format I decide that it's in. If it's
> off the line, it's in the format of the protocol,
> automatically. If it's in memory, to be written, it's in the
> format I wrote it in.
>
> > When the programmer goes to form a word from this buffer to
> > be used in the plethora of Boolean operations of the hash
> > algorithm, s/he will have to choose how to extract the bytes
> > from this stream to form the word.
>
> If it is a character buffer, and I need words for the SHA or
> MD5 algorithm, I will, naturally, extract the necessary number
> of bytes from the buffer, and construct a word from them
> according to the specifications of the protocol. Something
> like (from my implementation of MD5):
>
>     GB_uint32_t
>     getWord(
>         GB_uint8_t const* source )
>     {
>         return source[ 0 ]
>             | (source[ 1 ] << 8)
>             | (source[ 2 ] << 16)
>             | (source[ 3 ] << 24) ;
>     }

The routine above is in fact the byte-swapping routine necessary
for big-endian machines calculating an MD5 digest to interpret
the four bytes of input as a 32-bit little-endian word.
Since this function is "endian-neutral", it can also be used on
little-endian machines without ill effect, but to do so is
inefficient. A little-endian machine calculating an MD5 digest
would not waste cycles on these operations - it would simply
load all four bytes into a register at once.

Considering that big-endian machines can spend as much as 1/3 of
their time computing an MD5 digest in swapping bytes, the effect
of this optimization for little-endian machines is likely to be
significant. By the same token, the performance advantage that a
big-endian machine would have over a little-endian machine when
calculating SHA-1 (due to its big-endian biased input) would
likely be equally significant.

Greg
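For comparison, the big endian counterpart that a SHA-1
implementation needs when assembling bytes into words -- a
sketch in the same style as the getWord above (the name
getWordBE and the <cstdint> types are illustrative, not from
either poster's code):

    #include <cstdint>

    // Hypothetical big endian counterpart to getWord: SHA-1
    // streams the first byte into the high order byte of the 32
    // bit word, where MD5 streams it into the low order byte.
    // Like getWord, it works on values only, so it is
    // independent of the host's byte order and alignment.
    std::uint32_t
    getWordBE( std::uint8_t const* source )
    {
        return (std::uint32_t( source[ 0 ] ) << 24)
            |  (std::uint32_t( source[ 1 ] ) << 16)
            |  (std::uint32_t( source[ 2 ] ) <<  8)
            |   std::uint32_t( source[ 3 ] ) ;
    }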
From: kanze on 24 Nov 2005 12:12
Greg Herlihy wrote:
> kanze wrote:
> > Le Chaud Lapin wrote:
> > > kanze wrote:
> > > > And how does the endian-ness make a difference? I
> > > > implemented MD5 on a Sparc (big-endian) -- I've since
> > > > compiled the code on a Linux based PC (little-endian),
> > > > and it works perfectly well. Is there something
> > > > fundamentally different between MD5 and SHA-1 which
> > > > means that SHA-1 is endian dependent? They look very
> > > > similar in principles to me.

> > > It definitely makes a difference, but first, let me rant:

> > You're wasting your time. The arguments are ancient -- when
> > I started programming, the most widespread machines were the
> > PDP-11 (little endian) and the IBM 360 (big endian). When
> > 16 bit microprocessors started appearing, the leaders were
> > Intel (little endian) and Motorola (big endian).

> > Today, the only consensus that I know of is for transmission
> > protocols, and it is contradictory:

> > -- Bit order in a byte is *always* little endian. I have
> >    never heard of an exception.

> If big endian has the most significant byte of a multiple byte
> value ordered first, wouldn't a big endian byte have its most
> significant bit ordered first?

That would be logical.

> If so, wouldn't bit order in a byte always be big endian, and
> never little endian?

No. Within a machine, bytes are accessed all eight bits in
parallel; there isn't any endianness. (The reason why endianness
doesn't make a difference in my own code is that I always access
words as words, i.e. all bits in parallel, where there isn't any
endianness.) On a serial medium, like a transmission line or a
track on a disk, of course, it's one bit after the other, and
you *can* send them in any order you want.

In practice, there is a consensus with regards to transmission
line protocols, which is to send the bits in order, with the LSB
first, and with any parity trailing, if byte parity is being
used. (I once ran into hardware, some 25 years ago, which
inserted a parity bit in the middle of the byte, between bits 3
and 4. And PCM requires block parity, and different protocols
insert the bit at various places within the block, although none
that I know of put it in the middle of a byte, at least not when
the entire byte is in a single block -- the D channel of the
lowest level ISDN connection is sent 2 bits per block, so
obviously, there are block parity bits, as well as bytes for
other channels, in the middle of it.)

In practice, this is almost always fully handled by the
hardware. You don't see the bit order; the link between the
serial controller and your CPU is parallel.

> > [...]
> > > Getting off my rant box, if the input data is 0x01234567,
> > > that's 4 bytes.

> > It depends. Is it four bytes (characters or whatever), or
> > is it a 32 bit value? If it's four bytes, it's four
> > separate bytes (and should probably be written 01 23 45 67).
> > If it's a 32 bit value, it's a 32 bit value.

> MD5 treats the input as 32-bit values in 4-byte little-endian
> format.

Formally, MD5 treats the input as a bit stream. That bit stream
is then processed in 512 bit blocks; within each block, the bits
are processed as sixteen 32 bit words. In practice, of course,
the input will be a byte stream, with four bytes per 32 bit
word. The only place where bit order in the bytes would be
relevant is if you had to assemble a byte from several different
blocks of bits.
This might occur, for example, if your interface allowed the
user to specify an arbitrary number of bits, and you used only
the lower n bits of the last byte. I don't support this, and I
don't know, offhand, what the RFC says about it; my impression
is that it should be big endian (based on the fact that to
insert a single 1 bit, followed by zeros, I use 0x80).

The only place byte order is relevant is when you assemble bytes
into words. Here, the reference implementation (which is part of
the RFC) does use little endian format. Which, to be frank,
seems a little strange in context -- the obvious intent in
processing the bits in the byte is to consider that the high
order bit is inserted into the bit stream first. But it doesn't
really matter; in all cases I know of, the input stream has in
fact been a stream of bytes, and the choice between little
endian and big endian is purely arbitrary here.

> SHA1 treats the input as 32-bit values in 4-byte big-endian
> format (the actual order of the bytes from the point of view
> of its author is not relevant for the purpose of creating a
> digest for them).

It's important that both sides agree. Again, SHA1 is formally
defined over a stream of bits. For the rest, my comments
concerning MD5 above apply, with the exception that bytes are
streamed into 32 bit words in big endian order: the first byte
to arrive is the high order byte of the 32 bit word, whereas in
MD5, the first byte to arrive is the low order byte.

> If a machine calculating a digest does not interpret the four
> input bytes with the endianness that the algorithm requires,
> then it must rearrange those bytes so that it does.
> Otherwise, the subsequent additions and rotations applied to
> the input value on a big endian machine would produce a
> completely different result from the one calculated on a
> little-endian machine - because the starting value used in
> those calculations was not the same.

Formally, as I said, the algorithm processes a stream of bits,
which are broken up into blocks, and within each block, into 32
bit words. By convention, when given a byte, the algorithm
inserts the high order bit first, with the rest of the bits in
order. If I read SHA1 correctly, when it breaks the stream up
into words, it does so by simply taking 32 successive bits out
of the stream, with the first bit as the high order bit. MD5
inserts the first bit in bit 7 of the word, the second in bit 6,
then goes on to put the ninth in bit 15... In a practical
implementation, of course, both will receive a stream of bytes,
with MD5 using the first byte as the low order byte of its
words, and SHA1 using it as the high order byte.

> > > In a buffer, beginning at offset 0, either the 0x01 will
> > > be there or the 0x67 will be there.

> > In a buffer, it's in the format I decide that it's in. If
> > it's off the line, it's in the format of the protocol,
> > automatically. If it's in memory, to be written, it's in
> > the format I wrote it in.

> > > When the programmer goes to form a word from this buffer
> > > to be used in the plethora of Boolean operations of the
> > > hash algorithm, s/he will have to choose how to extract
> > > the bytes from this stream to form the word.

> > If it is a character buffer, and I need words for the SHA or
> > MD5 algorithm, I will, naturally, extract the necessary
> > number of bytes from the buffer, and construct a word from
> > them according to the specifications of the protocol.
> > Something like (from my implementation of MD5):

> >     GB_uint32_t
> >     getWord(
> >         GB_uint8_t const* source )
> >     {
> >         return source[ 0 ]
> >             | (source[ 1 ] << 8)
> >             | (source[ 2 ] << 16)
> >             | (source[ 3 ] << 24) ;
> >     }

> The routine above is in fact the byte-swapping routine
> necessary for big-endian machines calculating an MD5 digest to
> interpret the four bytes of input as a 32-bit little-endian
> word.

The above routine works on the value of a 32 bit word. It
assembles the bytes from the stream into a 32 bit value, which
it writes as a single word, in parallel. There is NO byte order
in the word, because the word is parallel.

It works with big endian byte addressed machines. It works with
little endian byte addressed machines. It works with mixed
endian (3412 -- really seen) byte addressed machines. And it
works with word addressed machines. Because it ignores byte
order, and works directly on the 32 bit value.

The fact that the results may look the same as byte swapping on
certain machines is a coincidence. The code doesn't depend on it
in any way.

> Since this function is "endian-neutral", it can also be used
> on little-endian machines without ill effect, but to do so is
> inefficient.

Is it? Depending on the hardware, it may be more efficient than
writing four bytes. At any rate, any differences in efficiency
will hardly be measurable compared to the time actually needed
to calculate the SHA1 or the MD5 digest.

> A little-endian machine calculating an MD5 digest would not
> waste cycles on these operations - it would simply load all
> four bytes into a register at once.

And get a segment violation, because they might be misaligned.
Not in any code I'd write. When processing a byte stream into 32
bit words, you have to do something like the above at some point
anyway. You cannot simply convert a char* into a uint32_t*, and
expect to be able to dereference it. (Doing so actually works on
an Intel architecture. But at a considerable run-time cost.)

> Considering that big-endian machines can spend as much as 1/3
> of their time computing an MD5 digest in swapping bytes, the
> effect of this optimization for little-endian machines is
> likely to be significant.

Have you actually profiled this? My MD5 implementation was
originally developed on a Sparc, a big endian machine. The
application was time critical, and was extensively profiled. And
the time spent in the getWord routine, above, was never
measurable.

> By the same token, the performance advantage that a big-endian
> machine would have over a little-endian machine when
> calculating SHA-1 (due to its big-endian biased input) would
> likely be equally significant.

Probably. The time spent in the getWord function, above, is
certainly insignificant for MD5 on at least one big-endian
machine (a Sun Sparc -- I'm no longer sure which exact model). I
expect that it would also be insignificant for SHA1 on a
little-endian machine.

--
James Kanze                             GABI Software
Conseils en informatique orientée objet/
                  Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
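For what it's worth, the direct load Greg describes can be made
alignment-safe with memcpy -- though it remains byte order
dependent, which is exactly the platform dependence kanze argues
against. A sketch, not from the thread (the name loadNative is
illustrative):

    #include <cstdint>
    #include <cstring>

    // Not from the thread: the register load Greg describes,
    // with memcpy used to avoid the misalignment trap kanze
    // raises.  On a little endian host this yields the word MD5
    // wants in a single (compiler-generated) load; on any other
    // byte order it yields the wrong value, trading portability
    // for speed.
    std::uint32_t
    loadNative( std::uint8_t const* source )
    {
        std::uint32_t word ;
        std::memcpy( &word, source, sizeof word ) ;  // no alignment trap
        return word ;   // correct for MD5 only on little endian hosts
    }

In practice, current compilers often recognize the shift-and-or
form of getWord and generate that same single load on little
endian targets anyway, which is consistent with kanze's
observation that the difference was never measurable.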