From: Datesfat Chicks on
Hi,

I have the need in a microcontroller application to calculate floor(sqrt(x))
where x is approximately in the range of 2^32 - 2^64.

What are the algorithms I should look at?

These small microcontrollers are characterized by having a 8-bit times 8-bit
to give 16-bit result integer unsigned multiply instruction, and a similar
16/8 division instruction to give a quotient and a remainder.
Mulitplications of larger integers have to be done by multiplying the bytes
and adding.

I'm aware of these algorithms:

a)Adding consecutive odd integers and counting how many you have to add,
i.e. 1 + 3 = 4, 4 + 5 = 9, 9 + 7 = 16, 16 + 9 = 25, etc.

b)Trial squaring, testing setting each bit to "1", i.e.

http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic/modxx/atu8sqrtfrxx/src/atu8sqrtfrxx.c?revision=1.3&view=markup

c)Another method that seems to work (don't know the name of it), here is
source code:

http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic/modxx/atu16sqrtx10frxx/src/atu16sqrtx10frxx.c?revision=1.3&view=markup

What other options should I investigate?

Thanks, Datesfat

From: Dann Corbit on
In article <yY6dnfRKeZATg_bWnZ2dnUVZ_sydnZ2d(a)giganews.com>,
datesfat.chicks(a)gmail.com says...
>
> Hi,
>
> I have the need in a microcontroller application to calculate floor(sqrt(x))
> where x is approximately in the range of 2^32 - 2^64.
>
> What are the algorithms I should look at?
>
> These small microcontrollers are characterized by having a 8-bit times 8-bit
> to give 16-bit result integer unsigned multiply instruction, and a similar
> 16/8 division instruction to give a quotient and a remainder.
> Mulitplications of larger integers have to be done by multiplying the bytes
> and adding.
>
> I'm aware of these algorithms:
>
> a)Adding consecutive odd integers and counting how many you have to add,
> i.e. 1 + 3 = 4, 4 + 5 = 9, 9 + 7 = 16, 16 + 9 = 25, etc.
>
> b)Trial squaring, testing setting each bit to "1", i.e.
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic/modxx/atu8sqrtfrxx/src/atu8sqrtfrxx.c?revision=1.3&view=markup
>
> c)Another method that seems to work (don't know the name of it), here is
> source code:
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic/modxx/atu16sqrtx10frxx/src/atu16sqrtx10frxx.c?revision=1.3&view=markup
>
> What other options should I investigate?

Here is some C code for integer square roots using many different
algorithms:
http://cap.connx.com/chess-engines/new-approach/jsqrt0.ZIP

Worth a look:
http://www.azillionmonkeys.com/qed/sqroot.html

Consider also:
http://www.google.com/#hl=en&source=hp&q=fast+integer+square+root&rlz=
1W1GGLD_en&aq=0&aqi=g2&oq=fast+integer+sq&fp=a048890d3c90c6fc

From: Didi on
On Feb 4, 9:22 pm, "Datesfat Chicks" <datesfat.chi...(a)gmail.com>
wrote:
> Hi,
>
> I have the need in a microcontroller application to calculate floor(sqrt(x))
> where x is approximately in the range of 2^32 - 2^64.
>
> What are the algorithms I should look at?
>
> These small microcontrollers are characterized by having a 8-bit times 8-bit
> to give 16-bit result integer unsigned multiply instruction, and a similar
> 16/8 division instruction to give a quotient and a remainder.
> Mulitplications of larger integers have to be done by multiplying the bytes
> and adding.
>
> I'm aware of these algorithms:
>
> a)Adding consecutive odd integers and counting how many you have to add,
> i.e. 1 + 3 = 4, 4 + 5 = 9, 9 + 7 = 16, 16 + 9 = 25, etc.
>
> b)Trial squaring, testing setting each bit to "1", i.e.
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic...
>
> c)Another method that seems to work (don't know the name of it), here is
> source code:
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic...
>
> What other options should I investigate?
>
> Thanks, Datesfat

The one I have been using last 15-20 years is probably your B, at
least sounds
like what I made up back then - successive approximation for each bit
of the result.
If multiplication is fast on the core you will be using it it is hard
to beat.

Dimiter
From: Datesfat Chicks on
"Didi" <dp(a)tgi-sci.com> wrote in message
news:8b5a6ce8-5e18-49e8-815a-5733e4fe5fe3(a)g1g2000yqi.googlegroups.com...
On Feb 4, 9:22 pm, "Datesfat Chicks" <datesfat.chi...(a)gmail.com>
wrote:
> Hi,
>
> I have the need in a microcontroller application to calculate
> floor(sqrt(x))
> where x is approximately in the range of 2^32 - 2^64.
>
> What are the algorithms I should look at?
>
> These small microcontrollers are characterized by having a 8-bit times
> 8-bit
> to give 16-bit result integer unsigned multiply instruction, and a similar
> 16/8 division instruction to give a quotient and a remainder.
> Mulitplications of larger integers have to be done by multiplying the
> bytes
> and adding.
>
> I'm aware of these algorithms:
>
> a)Adding consecutive odd integers and counting how many you have to add,
> i.e. 1 + 3 = 4, 4 + 5 = 9, 9 + 7 = 16, 16 + 9 = 25, etc.
>
> b)Trial squaring, testing setting each bit to "1", i.e.
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic...
>
> c)Another method that seems to work (don't know the name of it), here is
> source code:
>
> http://www.uculib.com/vvcuculib01/viewvc.cgi/uculib01/src/stm8/cosmic...
>
> What other options should I investigate?
>
> Thanks, Datesfat
>
>The one I have been using last 15-20 years is probably your B, at
>least sounds
>like what I made up back then - successive approximation for each bit
>of the result.
>If multiplication is fast on the core you will be using it it is hard
>to beat.

Thanks for the insight.

For small operands, I agree with your analysis.

However, for larger operands, the Babylonian method:

http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method

looks very promising.

The issue is that for the cost of one division (the /2 and addition is
virtually free) you get far more than one bit of additional information
about the square root.

For larger operands, I don't believe that (b) will be the right answer any
longer.

But I'm not sure yet ...

Thanks for all, Datesfat

From: Nobody on
On Thu, 04 Feb 2010 14:22:54 -0500, Datesfat Chicks wrote:

> I have the need in a microcontroller application to calculate floor(sqrt(x))
> where x is approximately in the range of 2^32 - 2^64.
>
> What are the algorithms I should look at?

Newton-Raphson or bisection, depending upon how slow division is. N-R is
asymptotically faster (the number of correct digits doubles at each step),
but each step requires a division.

Bisection is essentially your option b), but it can be computed much more
efficiently than the algorithm given. You don't need arbitrary multiplies,
only shifts, as:

(a+2^k)^2 = a^2 + 2.a.2^k + (2^k)^2
= a^2 + a.2^(k+1) + 2^2k

IOW:

uint32_t isqrt(uint64_t x)
{
uint64_t a = 0;
uint64_t a2 = 0;
int k;

for (k = 31; k >= 0; k--) {

uint64_t t = a2 + (a << (k+1)) + ((uint64_t)1 << (k + k));
if (x > t) {
a += 1 << k;
a2 = t;
}
}

return a;
}

Depending upon the CPU, it may be faster to calculate the a.2^(k+1) and
2^2k terms incrementally.