Prev: Apple II Debugger
Next: TMA Assembler?
From: alanglloyd on 19 Aug 2006 09:45 Skybuck wrote: <snip> > > Reversing bit order in delphi ? > <snip> > So what would be the fastest way in pure delphi code to reverse the > bits ? > <snip> Two solution - Delphi (checked) & Delphi assembler (un-tested) ... type T32BitInt = 0..31; T32BitIntSet = set of T32BitInt; function SwapBits(InInteger : integer) : integer; var I : T32BitInt; InBits, OutBits : T32BitIntSet; const RevArray : array[T32BitInt] of T32BitInt = (31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0); begin InBits := T32BitIntSet(InInteger); OutBits := []; for I := 0 to 31 do if I in InBits then Include(OutBits, RevArray[I]); Result := integer(OutBits); end; function Reversed(source: longint): longint; {it left-shifts each bit from EDX into the carry flag and then right-shifts the carry flag into EAX a total of 32 times (controlled by ECX)} asm mov EDX, EAX mov ECX, 32 @@AGAIN: rcl EDX,1 rcr EAX,1 loop @@AGAIN end; It would be interesting to know the relative speeds. This is not the same as changing endians. Alan Lloyd
From: [jongware] on 19 Aug 2006 10:00 "Skybuck" <skybuck2000(a)hotmail.com> wrote in message news:1155977986.135633.244450(a)h48g2000cwc.googlegroups.com... > I think intel uses most significant bit first ? No (but that's not your question). > Reversing the bits in delphi code might be too slow, if that's the case > is there a special (intel/amd) assembler instruction which might do the > trick ? xlat. At the cost of a 256-byte table, it is easiest and fastest to just look the byte up. It even used to be a well-known trick (you can mirror bitmap images this way). As a side note I might add that though 'xlat' *is* a special instruction for table lookup, it's usually faster (!) to use the regular memory read functions. (That's from hear-say!) [Jongware]
From: Herbert Kleebauer on 19 Aug 2006 10:05 Frank Kotler wrote: > ; assumes byte to reverse in al > ; trashes ah, ecx, flags > > mov ah, al > mov ecx, 8 > top: > rcr ah, 1 > rcl al, 1 > sub ecx, 1 > jnz top > > That wouldn't be particularly fast, but I think it'll do what you want. If it real is just a byte, why not use a 256 byte lookup table (xlat)? And if there are more bytes, swap the bytes (if necessary) and use for each byte the lookup table.
From: Skybuck on 19 Aug 2006 12:54 Herbert Kleebauer schreef: > Frank Kotler wrote: > > ; assumes byte to reverse in al > > ; trashes ah, ecx, flags > > > > mov ah, al > > mov ecx, 8 > > top: > > rcr ah, 1 > > rcl al, 1 > > sub ecx, 1 > > jnz top > > > > That wouldn't be particularly fast, but I think it'll do what you want. > > If it real is just a byte, why not use a 256 byte lookup table (xlat)? > And if there are more bytes, swap the bytes (if necessary) and use > for each byte the lookup table. Wow such a simple and great idea ! Good thinking dude ! Easy to implement too ! However I must ask some question in the light of speed. A lookup table requires a memory access and some pointer/memory calculations.. that's about it probably. Suppose 2000 "data" bytes needs to be reversed... what will happen with the lookup table and what will happen with the "data" bytes. I am not sure how CPU's work... but it would be good if the data bytes and lookup table end up in the cpu's cache... otherwise the memory access might slow things down. Alternative is too not use a lookup table and only instructions and data... Gje I wonder what would be faster, I have no idea. I shall benchmark both ideas. Great ! Bye, Skybuck.
From: Skybuck on 19 Aug 2006 15:28
Hello Folks, I wrote a nice little test, verification and benchmark program to test all your implementations, here are the results so far: Conclusion so far: 1st place, 101.985.874 calls, Herbert's implementation/idea wins (but disqualified for cheating =D) 2st place, 79.738.396 Skybuck's second and third implementation. (shared place) 3st place, 39.073.907 Frank Kotler's implementation. 4st place, 27.833.642 Skybuck's first implementation. 5st place, 21.180.826 Alan Lloyd's basm (wrapped) implementation. (It could do better had to use a wrapper to fix it) 6st place, 18.314.808 Alan Lloyd's pure delphi implementation. (Bummer dude, you end up last, good try though ;) ) I like to thank all contestants for entering their ideas and submissions. I also like to thank Herbert for his winning idea. I hope you don't mind me calling you an idiot LOL (for cheating) =D ;) I also look forward to any other implementations / ideas, don't hestitate to supply more workable/implementable idea's/algo's ! ;) =D Competition is fun ! =D *** Begin of code *** program TestReversingBitOrder; {$APPTYPE CONSOLE} { Reversing Bit Order Test program version 0.01 created on 19 august 2006 by Skybuck Flying. First I will create my own version, let's see how would I have done it... probably with some and's and shifts or so ;) Or maybe a bitset or some of that code ;) First I should build in some checking code to verify that the method is correct/sound ;) Writing this code was a lot of fun. Benchmark method is implemented Verification method is implemented. Different implementations are implemented. The fastest algorithm/idea is from Herbert. But he cheating lol. He did not supply a method/algorithm to generate the look up table, nor did he supply a method to calculate the reversed bit order value etc. The most amazing thing is that my third implementation is the fastest... well not that many contenders anyway. My second implementation is the runner up. Only Alan Lloyd and Frank Kotler dared to enter this speed competition, maybe we'll see more contenders ? :) Here are the result/output of the program: 0 00000000 00000000 0 1 00000001 10000000 128 2 00000010 01000000 64 3 00000011 11000000 192 4 00000100 00100000 32 5 00000101 10100000 160 6 00000110 01100000 96 7 00000111 11100000 224 8 00001000 00010000 16 9 00001001 10010000 144 10 00001010 01010000 80 11 00001011 11010000 208 12 00001100 00110000 48 13 00001101 10110000 176 14 00001110 01110000 112 15 00001111 11110000 240 16 00010000 00001000 8 17 00010001 10001000 136 18 00010010 01001000 72 19 00010011 11001000 200 20 00010100 00101000 40 21 00010101 10101000 168 22 00010110 01101000 104 23 00010111 11101000 232 24 00011000 00011000 24 25 00011001 10011000 152 26 00011010 01011000 88 27 00011011 11011000 216 28 00011100 00111000 56 29 00011101 10111000 184 30 00011110 01111000 120 31 00011111 11111000 248 32 00100000 00000100 4 33 00100001 10000100 132 34 00100010 01000100 68 35 00100011 11000100 196 36 00100100 00100100 36 37 00100101 10100100 164 38 00100110 01100100 100 39 00100111 11100100 228 40 00101000 00010100 20 41 00101001 10010100 148 42 00101010 01010100 84 43 00101011 11010100 212 44 00101100 00110100 52 45 00101101 10110100 180 46 00101110 01110100 116 47 00101111 11110100 244 48 00110000 00001100 12 49 00110001 10001100 140 50 00110010 01001100 76 51 00110011 11001100 204 52 00110100 00101100 44 53 00110101 10101100 172 54 00110110 01101100 108 55 00110111 11101100 236 56 00111000 00011100 28 57 00111001 10011100 156 58 00111010 01011100 92 59 00111011 11011100 220 60 00111100 00111100 60 61 00111101 10111100 188 62 00111110 01111100 124 63 00111111 11111100 252 64 01000000 00000010 2 65 01000001 10000010 130 66 01000010 01000010 66 67 01000011 11000010 194 68 01000100 00100010 34 69 01000101 10100010 162 70 01000110 01100010 98 71 01000111 11100010 226 72 01001000 00010010 18 73 01001001 10010010 146 74 01001010 01010010 82 75 01001011 11010010 210 76 01001100 00110010 50 77 01001101 10110010 178 78 01001110 01110010 114 79 01001111 11110010 242 80 01010000 00001010 10 81 01010001 10001010 138 82 01010010 01001010 74 83 01010011 11001010 202 84 01010100 00101010 42 85 01010101 10101010 170 86 01010110 01101010 106 87 01010111 11101010 234 88 01011000 00011010 26 89 01011001 10011010 154 90 01011010 01011010 90 91 01011011 11011010 218 92 01011100 00111010 58 93 01011101 10111010 186 94 01011110 01111010 122 95 01011111 11111010 250 96 01100000 00000110 6 97 01100001 10000110 134 98 01100010 01000110 70 99 01100011 11000110 198 100 01100100 00100110 38 101 01100101 10100110 166 102 01100110 01100110 102 103 01100111 11100110 230 104 01101000 00010110 22 105 01101001 10010110 150 106 01101010 01010110 86 107 01101011 11010110 214 108 01101100 00110110 54 109 01101101 10110110 182 110 01101110 01110110 118 111 01101111 11110110 246 112 01110000 00001110 14 113 01110001 10001110 142 114 01110010 01001110 78 115 01110011 11001110 206 116 01110100 00101110 46 117 01110101 10101110 174 118 01110110 01101110 110 119 01110111 11101110 238 120 01111000 00011110 30 121 01111001 10011110 158 122 01111010 01011110 94 123 01111011 11011110 222 124 01111100 00111110 62 125 01111101 10111110 190 126 01111110 01111110 126 127 01111111 11111110 254 128 10000000 00000001 1 129 10000001 10000001 129 130 10000010 01000001 65 131 10000011 11000001 193 132 10000100 00100001 33 133 10000101 10100001 161 134 10000110 01100001 97 135 10000111 11100001 225 136 10001000 00010001 17 137 10001001 10010001 145 138 10001010 01010001 81 139 10001011 11010001 209 140 10001100 00110001 49 141 10001101 10110001 177 142 10001110 01110001 113 143 10001111 11110001 241 144 10010000 00001001 9 145 10010001 10001001 137 146 10010010 01001001 73 147 10010011 11001001 201 148 10010100 00101001 41 149 10010101 10101001 169 150 10010110 01101001 105 151 10010111 11101001 233 152 10011000 00011001 25 153 10011001 10011001 153 154 10011010 01011001 89 155 10011011 11011001 217 156 10011100 00111001 57 157 10011101 10111001 185 158 10011110 01111001 121 159 10011111 11111001 249 160 10100000 00000101 5 161 10100001 |