Prev: Apple II Debugger
Next: TMA Assembler?
From: jukka on 24 Aug 2006 16:54 Skybuck wrote: > Do you always read code from bottom-up ? > I-DON'T-THINK-SO ! I said that "maybe it helps" -- YOU! > The code however is not. If you say the code isn't, then I believe that in your opinion it isn't. Whether it is in my opinion or not, I'm not interested in discussing. Maybe on a nice lunch. > You actually had to read it bottom-up. Actually, I didn't. I merely pointed out that maybe you might find it easier to wrap your head around the idea starting with 16 bits, 8 bits, 4 bits and finally 2 bits to realize that it is a simple divide-and-conquer method. It was originally written in reverse order and I didn't see it a point of interest to swap things into different order because you took my completely by surprise of not "getting it" right away. Yikes! > Structured programming languages are read from top to bottom. This isn't really top-to-bottom issue at all anyway. The guy who wrote the code thought lowest bits are more logical to handle before the highest bits or something. I can't say because I'm not him, but that would be good enough for me. It really doesn't make any diddly-dooey difference in which order you do it. I just pointed it out to YOU: "Look, if you do first this.. then... this happens.. SEE?" > Switching the lines would have made it a bit more understandable. I don't think it makes any difference to me but feel free to refactor. > Each single line is still unclear. Thank you for bringing this to my attention. :) > How can the variable 'v' be modified two times before assigning it to > itself ? What makes you think that it is as you describe? How about this: x = x + 1; > In theory this is impossible. There is only one variable V in the code. In theory AND in practise this is not only possible but also fully conforming and quite standard pattern in C and C++ programs. > 1. Temporarely variables are used. Aren't they always? What is a machine specific register but temporary storage, even though "pretty fast". > 2. The variable is modified in the first part of the line and then > modified again in the second part of the line. Nope. > Or when looking at the assembler a third possibility could exist: > 3. The code is invalid and something else is generated to solve it. Can you be more specific what brings you to this conclusion? > Lol, do you hang upside down from the ceiling or something lol ? Not really. I had this intuitive feeling that switching the problem other way around might help you, am I wrong to assume that it helped you atleast? > I shall now try to re-implement this idea to see if it really works and > to see how crystal clear code could look like, and to see if it's still > as fast ! (my first guess would be probably not ;)) I'm interested.
From: Skybuck on 24 Aug 2006 16:57 > > v = (v >> 16) | (v << 16); > > > > It swaps the lowest and highest 16 bits. Then what does the statement > > BEFORE that do? Yes, it swaps the 8 bit quantities within each 16 bit > > quantity. And so on, until single bits are changing position inside 2 > > bit groups. > > Ok, this explanation is clear. <snip> > I shall now try to re-implement this idea to see if it really works and > to see how crystal clear code could look like, and to see if it's still > as fast ! (my first guess would be probably not ;)) Ok, I have implemented the method in my own code. Something strange with the Delphi compiler is going on though, read my comments at the end of this post for more info ;) Sections in this post: // Delphi code section // Assembler code section // Analysis ;) function JukkaAndSteveAndSkybuckSwapBits( const ParaByte : byte ) : byte; begin result := ParaByte; // optimized away by compiler // reverse bit order // original implementation by Steve Baker // modified by Jukka (?) and Skybuck (!) // the code below will be optimized by the compiler to other kind of instructions ! ;) result := ((result shr 1) and 85) or ((result shl 1) and 170); result := ((result shr 2) and 51) or ((result shl 2) and 204); result := ((result shr 4) and 15) or ((result shl 4) and 240); end; function SkybuckSwapBits( const ParaByte : byte ) : byte; var vLeftBits : byte; vRightBits : byte; begin // step 1, swap left four bits with right four bits. // before: 87654321 // after: 43218765 // and-in the left bits. // 240 11110000 // mask to and-in the left four bits. vLeftBits := ParaByte and 240; // and-in the right bits. // 15 00001111 // mask to and-in the right four bits. vRightBits := ParaByte and 15; // shift the bits. // shift left bits 4 positions to the right. vLeftBits := vLeftBits shr 4; // shift right bits 4 positions to the left. vRightBits := vRightBits shl 4; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // step 2, swap first left 2 bits with first right 2 bits, // swap last left 2 bits with last right 2 bits. // before: 43218765 // after: 21436587 // and-in the first 2 left and last 2 left bits. // 204 11001100 // mask to and-in first left two bits. vLeftBits := result and 204; // and-in the first 2 right and last 2 right bits. // 51 00110011 // mask to and-in first right two bits. vRightBits := result and 51; // shift the bits. // shift left bits 2 positions to the right. vLeftBits := vLeftBits shr 2; // shift right bits 2 positions to the left. vRightBits := vRightBits shl 2; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // step 3, swap the left (even) and right (uneven (odd)) bit positions // before: 21436587 // after: 12345678 // and-in the left (even) bit positions // 170 10101010 // mask to and-in the left even bit positions vLeftBits := result and 170; // and-in the right (uneven (odd)) bit positions // 85 01010101 // mask to and-in the right uneven bit positions vRightBits := result and 85; // shift the bits. // shift left bits 1 position to the right. vLeftBits := vLeftBits shr 1; // shift right bits 1 position to the left. vRightBits := vRightBits shl 1; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // now we are done... and voila... bits swapped =D end; // let's re-order to try and make it faster, maybe the instructions will get paired ? function SkybuckSwapBits2( const ParaByte : byte ) : byte; var vLeftBits : byte; vRightBits : byte; begin // step 1, swap left four bits with right four bits. // before: 87654321 // after: 43218765 // and-in the left bits. // 240 11110000 // mask to and-in the left four bits. vLeftBits := ParaByte and 240; // shift left bits 4 positions to the right. vLeftBits := vLeftBits shr 4; // and-in the right bits. // 15 00001111 // mask to and-in the right four bits. vRightBits := ParaByte and 15; // shift right bits 4 positions to the left. vRightBits := vRightBits shl 4; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // step 2, swap first left 2 bits with first right 2 bits, // swap last left 2 bits with last right 2 bits. // before: 43218765 // after: 21436587 // and-in the first 2 left and last 2 left bits. // 204 11001100 // mask to and-in first left two bits. vLeftBits := result and 204; // shift left bits 2 positions to the right. vLeftBits := vLeftBits shr 2; // and-in the first 2 right and last 2 right bits. // 51 00110011 // mask to and-in first right two bits. vRightBits := result and 51; // shift right bits 2 positions to the left. vRightBits := vRightBits shl 2; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // step 3, swap the left (even) and right (uneven (odd)) bit positions // before: 21436587 // after: 12345678 // and-in the left (even) bit positions // 170 10101010 // mask to and-in the left even bit positions vLeftBits := result and 170; // shift left bits 1 position to the right. vLeftBits := vLeftBits shr 1; // and-in the right (uneven (odd)) bit positions // 85 01010101 // mask to and-in the right uneven bit positions vRightBits := result and 85; // shift right bits 1 position to the left. vRightBits := vRightBits shl 1; // combine both boths into one need byte, the output byte hehe. result := vLeftBits or vRightBits; // now we are done... and voila... bits swapped =D end; Delphi's generated assembler for Jukka's code (Delphi to C translation) // JukkaAndSteveAndSkybuckSwapBits TestReversingBitOrder.dpr.726: result := ((result shr 1) and 85) or ((result shl 1) and 170); 00408EDC 0FB6D0 movzx edx,al 00408EDF D1EA shr edx,1 00408EE1 80E255 and dl,$55 00408EE4 03C0 add eax,eax 00408EE6 24AA and al,$aa 00408EE8 0AD0 or dl,al 00408EEA 8BC2 mov eax,edx TestReversingBitOrder.dpr.727: result := ((result shr 2) and 51) or ((result shl 2) and 204); 00408E
From: Skybuck on 24 Aug 2006 18:03 Well, Here is one of my assembler versions which I like to share with you. It's so far the fastest method I have benchmarked, except for the table look-up method. It's getting very close though: // optimize it function SkybuckSwapBits4AsmOptimized( const ParaByte : byte ) : byte; asm { mov edx,eax and dl,$f0 movzx edx,dl shr edx,$04 and al,$0f shl al,$04 or dl,al } // replaced with skybuck's first step :) // swap left 4 bits with right 4 bits. // bbbbaaa // bbbbbbbb aaaaaaaa // shifted right 4: // 0000bbbb bbbbaaaa :) mov edx, eax mov dh, dl shr edx, 4 // swaps left 2 bits with right 2 bits, in left 4 bits, and again in the right 4 bits. bbaabbaa mov eax,edx // mov edx,eax and dl,$cc movzx edx,dl shr edx,$02 and al,$33 shl al,$02 or dl,al // swaps left bit with right bit. babababa mov eax,edx // mov edx,eax and dl,$aa movzx edx,dl shr edx,1 and al,$55 add eax,eax or dl,al mov eax,edx end; HerbertKleebauerTheIdiotLol's implementation, calls : 116.541.402 SkybuckSwapBits4's (re-ordered asm) implementation, calls: 101.736.412 SkybuckSwapBits3's (asm) implementation, calls: : 99.127.525 JukkaAndSteveAndSkybuck's implementation, calls : 96.738.509 SkybuckSwapBits2's (re-ordered) implementation, calls : 94.486.438 SkybuckSwapBits's implementation, calls : 94.474.619 Well this is afterall cross-posted to asm :) so I have the right to make my own asm implementation as well lol. I simply took delphi's generated assembler and optimized it somewhat. I am a little bit cheating by using a 16 bit register (actually it's 32 bit)... but that's ok ;) Maybe the code would be faster by using just 16 bit registers instead of 32 bit registers ? Bye, Skybuck.
From: Skybuck on 24 Aug 2006 18:32 I have optimized it further to: function SkybuckSwapBits5AsmOptimized( const ParaByte : byte ) : byte; asm { mov edx,eax and dl,$f0 movzx edx,dl shr edx,$04 and al,$0f shl al,$04 or dl,al } // replaced with skybuck's first step :) // swap left 4 bits with right 4 bits. // bbbbaaa // bbbbbbbb aaaaaaaa // shifted right 4: // 0000bbbb bbbbaaaa :) // for some instructions... using an 32 bit register is somehow faster ;) // number of instructions further reduced. mov ah, al shr eax, 4 mov edx, eax // swaps left 2 bits with right 2 bits, in left 4 bits, and again in the right 4 bits. bbaabbaa and edx,$cc and eax,$33 shr edx,$02 shl eax,$02 or eax,edx // swaps left bit with right bit. babababa mov edx, eax and dl,$aa and al,$55 shr edx,1 // shl al, 1 add eax, eax or al,dl end; Ok, I have reduced the instructions even further. Here are the new results: SkybuckSwapBits5's implementation verified ok. HerbertKleebauerTheIdiotLol's implementation, calls : 113002862 JukkaAndSteveAndSkybuck's implementation, calls : 92246130 SkybuckSwapBits's implementation, calls : 94367612 SkybuckSwapBits2's (re-ordered) implementation, calls : 90080914 SkybuckSwapBits3's (asm) implementation, calls: : 96697324 SkybuckSwapBits4's (re-ordered asm) implementation, calls: 96709743 SkybuckSwapBits5's (re-ordered asm) implementation, calls: 101170133 Bye, Skybuck.
From: Skybuck on 24 Aug 2006 19:26
Hello, Here is another method to flip bits ;) I have finally realised a small little dream of mine... I always wanted to test bits and set bits with if statements and bit positions/indexes/offsets. I have now finally fulfilled this dream by using the BT and BTS (and optionally the BTR) and JNC instruction ;) Pretty cool stuff. This also shows how important the bit order is ! :P* =D I knew it was possible to code in assembler like this, though I never tried. But now I have yeaaaaaaaaahhhhhh :) Like the show brainaic says: "I can do assembler me ! =D" Jippppeee :) // and another way to set and swap bits :) // I always wanted to implement a method like this and now I finally have LOL // I can't wait to find out how it performs LOL ;) :) me ccccuuuuurriiioouuuuss =D function SkybuckSwapBits6AsmBitIfs( const ParaByte : byte ) : byte; asm // let's assume the end result contains zero's already // this is true since edx is xor-ed to zero // this means we only have to set a bit if it needs to be set // otherwise we skip. xor edx, edx // edx is used as result variable BT eax, 7 // test bit 7 jnc @@skip7 // if bit 7 is not set then jump to skip7, otherwise set bit 0 BTS EDX, 0 // set bit 0 @@skip7: BT eax, 6 // test bit 6 jnc @@skip6 // if bit 6 is not set then jump to skip7, otherwise set bit 1 BTS EDX, 1 // set bit 1 @@skip6: BT eax, 5 // test bit 5 jnc @@skip5 // if bit 5 is not set then jump to skip7, otherwise set bit 2 BTS EDX, 2 // set bit 2 @@skip5: BT eax, 4 // test bit 4 jnc @@skip4 // if bit 4 is not set then jump to skip7, otherwise set bit 3 BTS EDX, 3 // set bit 3 @@skip4: BT eax, 3 // test bit 3 jnc @@skip3 // if bit 3 is not set then jump to skip7, otherwise set bit 4 BTS EDX, 4 // set bit 4 @@skip3: BT eax, 2 // test bit 2 jnc @@skip2 // if bit 2 is not set then jump to skip7, otherwise set bit 5 BTS EDX, 5 // set bit 5 @@skip2: BT eax, 1 // test bit 1 jnc @@skip1 // if bit 1 is not set then jump to skip7, otherwise set bit 6 BTS EDX, 6 // set bit 6 @@skip1: BT eax, 0 // test bit 0 jnc @@skip0 // if bit 0 is not set then jump to skip7, otherwise set bit 7 BTS EDX, 7 // set bit 7 @@skip0: mov eax, edx // output edx to eax the result end; It's slow though... but I don't mind... it's the dream that counts lol. You know something, it still beats the pants out of the first few delphi implementations =D Kkkkweeellll ;) I knew it would be faster... delphi just sucks with manipulating bits like that :( I call this asm technique "Bit Ifs" for bit if statements ;) Skybuck Flying's first implementation, calls : 29276104 Alan Lloyd's pure delphi implementation, calls : 17681421 Alan Lloyd's second (same) implementation, calls : 18233802 Alan Lloyd's basm (wrapped) implementation, calls : 21761263 HerbertKleebauerTheIdiotLol's implementation, calls : 116673698 FrankKotlerSwapBits's implementation, calls : 38965639 Skybuck Flying's second implementation, calls : 88217836 Skybuck Flying's third implementation, calls : 88156093 JukkaAndSteve's implementation, calls : 79356505 JukkaAndSteveAndSkybuck's implementation, calls : 96783645 SkybuckSwapBitsUsingXLAT's implementation, calls : 89787820 SkybuckSwapBits's implementation, calls : 94457529 SkybuckSwapBits2's (re-ordered) implementation, calls : 94455797 SkybuckSwapBits3's (asm) implementation, calls: : 99275058 SkybuckSwapBits4's (re-ordered asm) implementation, calls: 101723234 SkybuckSwapBits5's (re-ordered asm) implementation, calls: 104381539 SkybuckSwapBits6's (Bit Ifs) implementation, calls : 33955624 Bye, Skybuck. |