Fast integer distance formula [DSP]

Prev: Timing Recovery for Blind Demodulation
Next: Using FFTW 3 DLL w/ CLR

From: Steve Underwood on 5 Oct 2006 03:10

Phil Frisbie, Jr. wrote:
> Robert Scott wrote:
>> I have an approximation for the Pythagorean distance formula
>> (magnitude of
>> vector [x,y]) using integer arithmetic that I would like to improve.
>> Currently
>> I am using this:
> <snip>
>>
>> This is for an ARM processor in an application where I cannot afford
>> the time
>> for any floating point operations. The integer values of x and y come
>> from an
>> array of signed 16-bit numbers, but x and y themselves are 32-bit
>> numbers in the
>> above formula. The final result (z) will be shifted right one bit
>> before being
>> stored back into an array of 16-bit integers, since the distance
>> formula can
>> potentially extend the range of x and y by one bit. In case you are
>> wondering,
>> this is part of calculating the power spectrum at the end of an FFT.
>>
>> The improvement I am looking for is in accuracy. I would like to try
>> for a
>> 4-fold improvement in accuracy (.5%) without substantially increasing the
>> running time of what I have now. Does anyone know of a better
>> approximation
>> that is almost as fast?
>
> I like this algorithm. It is snipped from my OpenLPC fixed point codec.
> It uses only simple operations. It is likely more accurate than you
> need, but I think you can simply truncate the algorithm sooner for less
> precision. At the end PRECISION is the fractional bits you are using in
> your fixed point code.
>
> static fixed32 fixsqrt32(fixed32 x)
> {
>
> unsigned long r = 0, s, v = (unsigned long)x;
>
> #define STEP(k) s = r + (1 << k * 2); r >>= 1; \
> if (s <= v) { v -= s; r |= (1 << k * 2); }
>
> STEP(15);
> STEP(14);
> STEP(13);
> STEP(12);
> STEP(11);
> STEP(10);
> STEP(9);
> STEP(8);
> STEP(7);
> STEP(6);
> STEP(5);
> STEP(4);
> STEP(3);
> STEP(2);
> STEP(1);
> STEP(0);
>
> return (fixed32)(r << (PRECISION / 2));
> }
>

That only seems to calculate a part of the answer, leaving lots of zeros
at the end. Why not use something that calculates as many bits as it
can, like:

int32_t isqrt32(int32_t h)
{
int32_t x;
int32_t y;
int i;

/* The answer is calculated as a 32 bit value, where the last
16 bits are fractional. */
/* Calling with negative numbers is not a good idea :-) */
x =
y = 0;
for (i = 0; i < 32; i++)
{
x = (x << 1) | 1;
if (y < x)
x -= 2;
else
y -= x;
x++;
y <<= 1;
if ((h & 0x80000000))
y |= 1;
h <<= 1;
y <<= 1;
if ((h & 0x80000000))
y |= 1;
h <<= 1;
}
return x;
}

Of course, both these routines seem like overkill for what the original
poster needs. :-)

Steve

From: Mark Borgerding on 5 Oct 2006 08:07

Robert Scott wrote:
> I have an approximation for the Pythagorean distance formula (magnitude of
> vector [x,y]) using integer arithmetic that I would like to improve. Currently

Look for a paper entitled

Filip, A. E. , "A baker's dozen magnitude approximations and their
detection statistics" IEEE Trans on Aerospace and Elcectronic Systems,
Jan 1976

It is a classic work with many tricks similar to the one you are using.

With a little trickery, several of them could be implemented with little
or no multiplication.

--
Mark Borgerding
3dB Labs, Inc
Innovate. Develop. Deliver.

From: Everett M. Greene on 4 Oct 2006 22:39

"Phil Frisbie, Jr." <phil(a)hawksoft.com> writes:
[snip]
> I like this algorithm. It is snipped from my OpenLPC fixed point codec.
> It uses only simple operations. It is likely more accurate than you
> need, but I think you can simply truncate the algorithm sooner for less
> precision. At the end PRECISION is the fractional bits you are using in
> your fixed point code.
[snip]

Isn't computing the exact square root of an integer
faster than the algorithm you show?

From: Phil Frisbie, Jr. on 5 Oct 2006 21:26

Everett M. Greene wrote:

> "Phil Frisbie, Jr." <phil(a)hawksoft.com> writes:
> [snip]
>
>>I like this algorithm. It is snipped from my OpenLPC fixed point codec.
>>It uses only simple operations. It is likely more accurate than you
>>need, but I think you can simply truncate the algorithm sooner for less
>>precision. At the end PRECISION is the fractional bits you are using in
>>your fixed point code.
>
> [snip]
>
> Isn't computing the exact square root of an integer
> faster than the algorithm you show?

That depends on your target CPU/DSP. I am now targeting ARM most of the
time so I will make a note to profile that code and see.

--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

From: Everett M. Greene on 6 Oct 2006 00:34

"Phil Frisbie, Jr." <phil(a)hawksoft.com> writes:
> Everett M. Greene wrote:
>
> > "Phil Frisbie, Jr." <phil(a)hawksoft.com> writes:
> > [snip]
> >
> >>I like this algorithm. It is snipped from my OpenLPC fixed point codec.
> >>It uses only simple operations. It is likely more accurate than you
> >>need, but I think you can simply truncate the algorithm sooner for less
> >>precision. At the end PRECISION is the fractional bits you are using in
> >>your fixed point code.
> >
> > [snip]
> >
> > Isn't computing the exact square root of an integer
> > faster than the algorithm you show?
>
> That depends on your target CPU/DSP. I am now targeting ARM most
> of the time so I will make a note to profile that code and see.

It would be interesting to know the results of the comparison.

First | Prev | Next | Last
Pages: 1 2 3
Prev: Timing Recovery for Blind Demodulation
Next: Using FFTW 3 DLL w/ CLR