From: alanglloyd on

Skybuck wrote:
<snip>
>
> Reversing bit order in delphi ?
>
<snip>
> So what would be the fastest way in pure delphi code to reverse the
> bits ?
>
<snip>

Two solution - Delphi (checked) & Delphi assembler (un-tested) ...

type
T32BitInt = 0..31;
T32BitIntSet = set of T32BitInt;

function SwapBits(InInteger : integer) : integer;
var
I : T32BitInt;
InBits, OutBits : T32BitIntSet;
const
RevArray : array[T32BitInt] of T32BitInt
= (31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
begin
InBits := T32BitIntSet(InInteger);
OutBits := [];
for I := 0 to 31 do
if I in InBits then
Include(OutBits, RevArray[I]);
Result := integer(OutBits);
end;

function Reversed(source: longint): longint;
{it left-shifts each bit from EDX into the carry flag and then
right-shifts
the carry flag into EAX a total of 32 times (controlled by ECX)}
asm
mov EDX, EAX
mov ECX, 32
@@AGAIN:
rcl EDX,1
rcr EAX,1
loop @@AGAIN
end;

It would be interesting to know the relative speeds.

This is not the same as changing endians.

Alan Lloyd

From: [jongware] on
"Skybuck" <skybuck2000(a)hotmail.com> wrote in message
news:1155977986.135633.244450(a)h48g2000cwc.googlegroups.com...
> I think intel uses most significant bit first ?

No (but that's not your question).

> Reversing the bits in delphi code might be too slow, if that's the case
> is there a special (intel/amd) assembler instruction which might do the
> trick ?

xlat.
At the cost of a 256-byte table, it is easiest and fastest to just look the
byte up. It even used to be a well-known trick (you can mirror bitmap images
this way).
As a side note I might add that though 'xlat' *is* a special instruction for
table lookup, it's usually faster (!) to use the regular memory read
functions. (That's from hear-say!)

[Jongware]


From: Herbert Kleebauer on
Frank Kotler wrote:
> ; assumes byte to reverse in al
> ; trashes ah, ecx, flags
>
> mov ah, al
> mov ecx, 8
> top:
> rcr ah, 1
> rcl al, 1
> sub ecx, 1
> jnz top
>
> That wouldn't be particularly fast, but I think it'll do what you want.

If it real is just a byte, why not use a 256 byte lookup table (xlat)?
And if there are more bytes, swap the bytes (if necessary) and use
for each byte the lookup table.
From: Skybuck on

Herbert Kleebauer schreef:

> Frank Kotler wrote:
> > ; assumes byte to reverse in al
> > ; trashes ah, ecx, flags
> >
> > mov ah, al
> > mov ecx, 8
> > top:
> > rcr ah, 1
> > rcl al, 1
> > sub ecx, 1
> > jnz top
> >
> > That wouldn't be particularly fast, but I think it'll do what you want.
>
> If it real is just a byte, why not use a 256 byte lookup table (xlat)?
> And if there are more bytes, swap the bytes (if necessary) and use
> for each byte the lookup table.

Wow such a simple and great idea ! Good thinking dude ! Easy to
implement too !

However I must ask some question in the light of speed.

A lookup table requires a memory access and some pointer/memory
calculations.. that's about it probably.

Suppose 2000 "data" bytes needs to be reversed... what will happen with
the lookup table and what will happen with the "data" bytes.

I am not sure how CPU's work... but it would be good if the data bytes
and lookup table end up in the cpu's cache... otherwise the memory
access might slow things down.

Alternative is too not use a lookup table and only instructions and
data...

Gje I wonder what would be faster, I have no idea.

I shall benchmark both ideas.

Great !

Bye,
Skybuck.

From: Skybuck on
Hello Folks,

I wrote a nice little test, verification and benchmark program to test
all your implementations, here are the results so far:

Conclusion so far:

1st place, 101.985.874 calls, Herbert's implementation/idea wins (but
disqualified for cheating =D)
2st place, 79.738.396 Skybuck's second and third implementation.
(shared place)
3st place, 39.073.907 Frank Kotler's implementation.
4st place, 27.833.642 Skybuck's first implementation.
5st place, 21.180.826 Alan Lloyd's basm (wrapped) implementation. (It
could do better had to use a wrapper to fix it)
6st place, 18.314.808 Alan Lloyd's pure delphi implementation. (Bummer
dude, you end up last, good try though ;) )

I like to thank all contestants for entering their ideas and
submissions.

I also like to thank Herbert for his winning idea. I hope you don't
mind me calling you an idiot LOL (for cheating) =D ;)

I also look forward to any other implementations / ideas, don't
hestitate to supply more workable/implementable idea's/algo's ! ;) =D

Competition is fun ! =D

*** Begin of code ***

program TestReversingBitOrder;

{$APPTYPE CONSOLE}

{

Reversing Bit Order

Test program

version 0.01 created on 19 august 2006 by Skybuck Flying.

First I will create my own version, let's see how would I have done
it...

probably with some and's and shifts or so ;) Or maybe a bitset or some
of that code ;)

First I should build in some checking code to verify that the method
is correct/sound ;)

Writing this code was a lot of fun.

Benchmark method is implemented
Verification method is implemented.

Different implementations are implemented.

The fastest algorithm/idea is from Herbert.

But he cheating lol. He did not supply a method/algorithm to generate
the look up table,
nor did he supply a method to calculate the reversed bit order value
etc.

The most amazing thing is that my third implementation is the
fastest... well not that many contenders anyway.
My second implementation is the runner up.

Only Alan Lloyd and Frank Kotler dared to enter this speed competition,
maybe we'll see more contenders ? :)

Here are the result/output of the program:

0 00000000 00000000 0
1 00000001 10000000 128
2 00000010 01000000 64
3 00000011 11000000 192
4 00000100 00100000 32
5 00000101 10100000 160
6 00000110 01100000 96
7 00000111 11100000 224
8 00001000 00010000 16
9 00001001 10010000 144
10 00001010 01010000 80
11 00001011 11010000 208
12 00001100 00110000 48
13 00001101 10110000 176
14 00001110 01110000 112
15 00001111 11110000 240
16 00010000 00001000 8
17 00010001 10001000 136
18 00010010 01001000 72
19 00010011 11001000 200
20 00010100 00101000 40
21 00010101 10101000 168
22 00010110 01101000 104
23 00010111 11101000 232
24 00011000 00011000 24
25 00011001 10011000 152
26 00011010 01011000 88
27 00011011 11011000 216
28 00011100 00111000 56
29 00011101 10111000 184
30 00011110 01111000 120
31 00011111 11111000 248
32 00100000 00000100 4
33 00100001 10000100 132
34 00100010 01000100 68
35 00100011 11000100 196
36 00100100 00100100 36
37 00100101 10100100 164
38 00100110 01100100 100
39 00100111 11100100 228
40 00101000 00010100 20
41 00101001 10010100 148
42 00101010 01010100 84
43 00101011 11010100 212
44 00101100 00110100 52
45 00101101 10110100 180
46 00101110 01110100 116
47 00101111 11110100 244
48 00110000 00001100 12
49 00110001 10001100 140
50 00110010 01001100 76
51 00110011 11001100 204
52 00110100 00101100 44
53 00110101 10101100 172
54 00110110 01101100 108
55 00110111 11101100 236
56 00111000 00011100 28
57 00111001 10011100 156
58 00111010 01011100 92
59 00111011 11011100 220
60 00111100 00111100 60
61 00111101 10111100 188
62 00111110 01111100 124
63 00111111 11111100 252
64 01000000 00000010 2
65 01000001 10000010 130
66 01000010 01000010 66
67 01000011 11000010 194
68 01000100 00100010 34
69 01000101 10100010 162
70 01000110 01100010 98
71 01000111 11100010 226
72 01001000 00010010 18
73 01001001 10010010 146
74 01001010 01010010 82
75 01001011 11010010 210
76 01001100 00110010 50
77 01001101 10110010 178
78 01001110 01110010 114
79 01001111 11110010 242
80 01010000 00001010 10
81 01010001 10001010 138
82 01010010 01001010 74
83 01010011 11001010 202
84 01010100 00101010 42
85 01010101 10101010 170
86 01010110 01101010 106
87 01010111 11101010 234
88 01011000 00011010 26
89 01011001 10011010 154
90 01011010 01011010 90
91 01011011 11011010 218
92 01011100 00111010 58
93 01011101 10111010 186
94 01011110 01111010 122
95 01011111 11111010 250
96 01100000 00000110 6
97 01100001 10000110 134
98 01100010 01000110 70
99 01100011 11000110 198
100 01100100 00100110 38
101 01100101 10100110 166
102 01100110 01100110 102
103 01100111 11100110 230
104 01101000 00010110 22
105 01101001 10010110 150
106 01101010 01010110 86
107 01101011 11010110 214
108 01101100 00110110 54
109 01101101 10110110 182
110 01101110 01110110 118
111 01101111 11110110 246
112 01110000 00001110 14
113 01110001 10001110 142
114 01110010 01001110 78
115 01110011 11001110 206
116 01110100 00101110 46
117 01110101 10101110 174
118 01110110 01101110 110
119 01110111 11101110 238
120 01111000 00011110 30
121 01111001 10011110 158
122 01111010 01011110 94
123 01111011 11011110 222
124 01111100 00111110 62
125 01111101 10111110 190
126 01111110 01111110 126
127 01111111 11111110 254
128 10000000 00000001 1
129 10000001 10000001 129
130 10000010 01000001 65
131 10000011 11000001 193
132 10000100 00100001 33
133 10000101 10100001 161
134 10000110 01100001 97
135 10000111 11100001 225
136 10001000 00010001 17
137 10001001 10010001 145
138 10001010 01010001 81
139 10001011 11010001 209
140 10001100 00110001 49
141 10001101 10110001 177
142 10001110 01110001 113
143 10001111 11110001 241
144 10010000 00001001 9
145 10010001 10001001 137
146 10010010 01001001 73
147 10010011 11001001 201
148 10010100 00101001 41
149 10010101 10101001 169
150 10010110 01101001 105
151 10010111 11101001 233
152 10011000 00011001 25
153 10011001 10011001 153
154 10011010 01011001 89
155 10011011 11011001 217
156 10011100 00111001 57
157 10011101 10111001 185
158 10011110 01111001 121
159 10011111 11111001 249
160 10100000 00000101 5
161 10100001
First  |  Prev  |  Next  |  Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12
Prev: Apple II Debugger
Next: TMA Assembler?