Fast big integer division [ASM]

Prev: Computer nostalgia song
Next: If you don't immediately see your CLAX post...

From: Terje Mathisen "terje.mathisen at on 13 Apr 2010 09:34

Robert Redelmeier wrote:
> In alt.lang.asm H. Peter Anvin<hpa(a)zytor.com> wrote in part:
>>> There is always the method of long division -- shift, compare
>>> & subtract. About n^1.7
>>>
>>
>> Where do you get 1.7 from? It really seems O(n^2) to me,
>> or perhaps more accurately O(nm) with n and m being the
>> lengths of the dividend and divisor, respectively.
>
> On ~one-half the compares, you bail early (~half-way).
> Maybe that should be n^1.5.

If you want to implement this, you should at least try something like
SRT, i.e. an algorithm that gets you more than a single bit/iteration!

The logical starting point is simply to get ~32 bits or ~53
bits/iteration by doing a reciprocal multiplication of the top part of
the remainder, getting an estimate, then back-multiply and subtract.

OTOH, any kind of long division will still be O(n^2), just with slightly
different constant terms.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: Wolfgang Ehrhardt on 13 Apr 2010 12:26

On Tue, 13 Apr 2010 15:34:17 +0200, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> wrote:

>Robert Redelmeier wrote:
>> In alt.lang.asm H. Peter Anvin<hpa(a)zytor.com> wrote in part:
>>>> There is always the method of long division -- shift, compare
>>>> & subtract. About n^1.7
>>>>
>>>
>>> Where do you get 1.7 from? It really seems O(n^2) to me,
>>> or perhaps more accurately O(nm) with n and m being the
>>> lengths of the dividend and divisor, respectively.
>>
>> On ~one-half the compares, you bail early (~half-way).
>> Maybe that should be n^1.5.
>
>If you want to implement this, you should at least try something like
>SRT, i.e. an algorithm that gets you more than a single bit/iteration!
>
>The logical starting point is simply to get ~32 bits or ~53
>bits/iteration by doing a reciprocal multiplication of the top part of
>the remainder, getting an estimate, then back-multiply and subtract.
>
>OTOH, any kind of long division will still be O(n^2), just with slightly
>different constant terms.

This is not quite correct, if you mean with "long division" unbalanced
2m by m division; See e.g. C. Burnikel, J. Ziegler: Fast Recursive
Division. MPI Informatik, Forschungsbericht MPI-I-98-1-022 (1998) or
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.565>

But this is normally not done with assembler, at least not the high
level recursive algorithm.

Wolfgang

From: Terje Mathisen "terje.mathisen at on 13 Apr 2010 15:14

Wolfgang Ehrhardt wrote:
> On Tue, 13 Apr 2010 15:34:17 +0200, Terje Mathisen<"terje.mathisen at
>> OTOH, any kind of long division will still be O(n^2), just with slightly
>> different constant terms.
>
> This is not quite correct, if you mean with "long division" unbalanced
> 2m by m division; See e.g. C. Burnikel, J. Ziegler: Fast Recursive

Of course not!

When I wrote "long div" I meant any algorithm which is similar to what
we all learned to use with pen & paper.

Most recursive (Karatsuba?) algorithms get close to O(n*log(n)), while
the fastest for really big numbers will all use FFT-based
multiplications. (Also O(n*log(n))

> Division. MPI Informatik, Forschungsbericht MPI-I-98-1-022 (1998) or
> <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.565>
>
> But this is normally not done with assembler, at least not the high
> level recursive algorithm.

:-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: io_x on 14 Apr 2010 14:48

"Branimir Maksimovic" <bmaxa(a)nospicedham.hotmail.com> ha scritto nel messaggio
news:20100412220303.02038de7(a)maxa...
>
> Starting point is:
> http://www.x86asm.net/articles/working-with-big-numbers-using-x86-instructions/
>
> So case when one of numbers fits in register is trivial.
> Multiplications (karatzuba variant)
> y1 = x1*2^32 + n1
> y2 = x2*2^32 + n2
>
> y1*y2 = (x1*2^32 + n1) * (x2*2^32+n2)
> = x1*2^32*x2*2^32 + x1*n2*2^32 + x2*n1*2^32 + n1*n2
>
> = x1*x2 * 2^32 * 2^32 + (x1*n2+x2*n1)*2^32 + n1*n2
>
> x1*n2 (trivial) x2*n1 (trivial) n1*n2 (trivial)
>
> So this is easy. x1, x2 are big n1,n2 < 2^32
> recurse until x1 or x2 <2^32
> exponentiation is easy as it is shifting.

pheraps it is better i shut up because my rutines are not so fast

i use the paper mult mod 0FFFFFFFFh
[so the base of the number is 0FFFFFFFFh]
with the "paper mult" means the way someone multiply 2 decimal
numbers in the paper with the pencil.

not understand karasuba above

> Problem is how to do integer division.
> I have found newton-rafson method of interpolation,
> so you can aproximate division of x1/x2 to x1*1/x2
> but that requires floating / fixed point math.

> Do someone knows method for fast big integer division
> which does not requires floating / fixed point math?

yes i use paper div mod 0FFFFFFFFh
[so the base of the number is 0FFFFFFFFh]
how did i write these? i don't know, it seems all going well
for miracle, but i know can be errors
in these (mult and div) i wrote

From: Branimir Maksimovic on 14 Apr 2010 15:52

On Wed, 14 Apr 2010 20:48:46 +0200
"io_x" <a(a)b.c.invalid> wrote:

> "Branimir Maksimovic" <bmaxa(a)nospicedham.hotmail.com> ha scritto nel
> messaggio news:20100412220303.02038de7(a)maxa...
> >
> > Starting point is:
> > http://www.x86asm.net/articles/working-with-big-numbers-using-x86-instructions/
> >
> > So case when one of numbers fits in register is trivial.
> > Multiplications (karatzuba variant)
> > y1 = x1*2^32 + n1
> > y2 = x2*2^32 + n2
> >
> > y1*y2 = (x1*2^32 + n1) * (x2*2^32+n2)
> > = x1*2^32*x2*2^32 + x1*n2*2^32 + x2*n1*2^32 + n1*n2
> >
> > = x1*x2 * 2^32 * 2^32 + (x1*n2+x2*n1)*2^32 + n1*n2
> >
> > x1*n2 (trivial) x2*n1 (trivial) n1*n2 (trivial)
> >
> > So this is easy. x1, x2 are big n1,n2 < 2^32
> > recurse until x1 or x2 <2^32
> > exponentiation is easy as it is shifting.
>
> pheraps it is better i shut up because my rutines are not so fast
>
> i use the paper mult mod 0FFFFFFFFh
> [so the base of the number is 0FFFFFFFFh]
> with the "paper mult" means the way someone multiply 2 decimal
> numbers in the paper with the pencil.

Same thing is karatsuba metod.
y is represented as x time some base exponentiated to some degree
in this case 2^32.
This is only needed if you multiply two large numbers.
If one fits in register no problem.

>
> not understand karasuba above

Simple just decompose number with dividend * 2^32 + remainder
(you divide by 2^32 in order to find number which is just shifting)
Recursion is there to simplify number until is small enough
as it is x1*x2 * 2^32 * 2^32 part. So you decompose multiplication
by recursion as x1 or x2 will eventually fit in single register.

This is for multiplication. There is enough posts here so I know
now how to do division.

Thanks to all which replied. Things are clear now.

Thank you.

--
http://maxa.homedns.org/

Sometimes online sometimes not

First | Prev |
Pages: 1 2 3
Prev: Computer nostalgia song
Next: If you don't immediately see your CLAX post...