From: Chip Eastham on 15 Apr 2010 14:58

On Apr 15, 12:14 pm, Pubkeybreaker <pubkeybrea...(a)aol.com> wrote:
> On Apr 15, 11:43 am, Michael Hennebry <henne...(a)web.cs.ndsu.nodak.edu> wrote:
> > On Apr 15, 10:20 am, recoder <kurtulmeh...(a)gmail.com> wrote:
> > > On Apr 15, 5:36 pm, Pubkeybreaker <pubkeybrea...(a)aol.com> wrote:
> > > > On Apr 15, 10:19 am, recoder <kurtulmeh...(a)gmail.com> wrote:
> > > > > Dear All,
> > > > > What is the state of the art method to detect if a binary number is a perfect square?
> > > > > Thanks in advance....
> > > > Please answer the following:
> > > > (1) Why do you believe that its representation (i.e. binary) matters?
> > > On a computer, all numbers are stored in binary.
> > > > (2) How big is the number? [this matters!]
> > > > (3) When you say "state of the art" do you mean "best known algorithm", or do you mean "best known code on a particular computer"? If the latter, you need to specify the computer.
> > > > (4) Must the answer be deterministic, or will you settle for knowing the answer with high probability?
> > > > Your statement of the problem is too vague to have a definitive answer.
> > > The binary number is in the range of 100 k - 100 million digits.
> > > For me, state of the art means speed.
> > > Deterministic might take a long time; high probability is speedier, I guess.
> > For probabilistic, evaluate a lot of Jacobi symbols.
> Yep.
> > No means no. Yes means maybe.
> > For deterministic, I think you will have to bite the bullet and find the square root.
> > This can be done in a time proportional to the time required for multiplication.
> Use Newton-Raphson along with FFT based multiplication.
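[The deterministic route mentioned at the end of the quoted exchange — Newton-Raphson for the integer square root, then squaring to check — can be sketched as below. This is a minimal illustration, not the thread's code; the function name is invented here, and for 100-million-digit inputs one would want GMP-backed multiplication rather than CPython's native big integers.]

```python
def is_perfect_square(n: int) -> bool:
    """Deterministic test: compute floor(sqrt(n)) by Newton's iteration,
    then check whether squaring it recovers n exactly."""
    if n < 0:
        return False
    if n == 0:
        return True
    # Initial guess 2^ceil(bits/2) is guaranteed to be > sqrt(n),
    # so the Newton sequence decreases monotonically to floor(sqrt(n)).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2  # Newton step for f(x) = x^2 - n
        if y >= x:
            break
        x = y
    return x * x == n
```

Each Newton step costs one big division, so the total work is proportional to multiplication time when the underlying arithmetic is FFT-based, as Pubkeybreaker notes.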
If the distribution of inputs contains a significant fraction of non-squares, your average time benefits from doing some "probabilistic" tests (giving a chance to quickly obtain a negative result) before resorting to the actual square root computation.

- A nonzero square in binary representation will have least significant bits "...001" followed by an even number of zero bits (possibly none), because odd squares are congruent to 1 mod 8. [Depending on the machine architecture it may be practical to strip off the trailing zero bit pairs (factor out powers of 4) so we know we are dealing with an odd number, but unless the distribution of inputs contains a high frequency of such powers, it may not be a useful reduction.]

- 2^32 - 1 = 3*5*17*257*65537 has 5 distinct prime factors, so if I understood the "100 million digits" aright, your input may consist of 10 million or so 32-bit words. Summing these into a 64-bit accumulator in one pass, and then summing those two 32-bit words into a single 32-bit word, would allow you to test the input for quadratic residues mod those five prime values. This will knock out about 95% of the non-squares, assuming the input distribution isn't malicious.

regards, chip
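[The word-summing test above works because 2^32 ≡ 1 (mod 2^32 − 1), so adding the 32-bit words of the input yields its residue mod 2^32 − 1, which can then be checked against precomputed quadratic-residue sets for each of the five prime factors. A minimal sketch, with invented function names; a production version would sum the words in one pass with a 64-bit accumulator as described, rather than fold repeatedly:]

```python
PRIMES = (3, 5, 17, 257, 65537)  # the distinct prime factors of 2^32 - 1

# Precompute the quadratic residues (including 0) modulo each prime.
QR = {p: {(i * i) % p for i in range(p)} for p in PRIMES}

M32 = (1 << 32) - 1

def residue_mod_m32(n: int) -> int:
    """Fold n into its residue mod 2^32 - 1 by adding 32-bit chunks,
    valid because 2^32 is congruent to 1 mod 2^32 - 1."""
    while n > M32:
        n = (n & M32) + (n >> 32)
    return n % M32  # maps the edge case n == 2^32 - 1 to 0

def passes_qr_filter(n: int) -> bool:
    """False: definitely not a square.  True: inconclusive."""
    r = residue_mod_m32(n)
    return all(r % p in QR[p] for p in PRIMES)
```

The combined pass rate for random non-squares is the product over the five primes of (#residues / p), roughly 6%, consistent with the "knock out about 95%" estimate.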
From: Michael Hennebry on 15 Apr 2010 19:06

On Apr 15, 1:58 pm, Chip Eastham <hardm...(a)gmail.com> wrote:
> - 2^32 - 1 = 3*5*17*257*65537 has 5 distinct prime factors, so if I
> understood the "100 million digits" aright, your input may consist of
> 10 million or so 32-bit words. Summing these into a 64-bit accumulator
> in one pass, and then summing those two 32-bit words into a single
> 32-bit word, would allow you to test the input for quadratic residues
> mod those five prime values. This will knock out about 95% of the
> non-squares, assuming the input distribution isn't malicious.

With Horner's rule and up to two 32x32->64-bit multiplications per step, one can also do any other modulus less than 2**32.
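[The Horner-style reduction might look like this; on real hardware each step is the one or two 32×32→64-bit multiplies described above, while here Python's big integers just illustrate the recurrence. Helper names are invented:]

```python
def to_words_msb_first(n: int) -> list[int]:
    """Split a nonnegative integer into 32-bit words, most significant first."""
    words = []
    while n:
        words.append(n & 0xFFFFFFFF)
        n >>= 32
    return words[::-1] or [0]

def horner_mod(words: list[int], m: int) -> int:
    """Reduce the number represented by `words` modulo m (with m < 2^32)
    via Horner's rule: r = (r * 2^32 + w) mod m, one word at a time."""
    r = 0
    for w in words:
        r = ((r << 32) | w) % m
    return r
```

This gives the residue for any modulus up to 2^32, at the cost of a multiply-and-reduce per word instead of a single add.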
From: Chip Eastham on 15 Apr 2010 21:21

On Apr 15, 7:06 pm, Michael Hennebry <henne...(a)web.cs.ndsu.nodak.edu> wrote:
> On Apr 15, 1:58 pm, Chip Eastham <hardm...(a)gmail.com> wrote:
> > - 2^32 - 1 = 3*5*17*257*65537 has 5 distinct prime factors, so if I
> > understood the "100 million digits" aright, your input may consist of
> > 10 million or so 32-bit words. Summing these into a 64-bit accumulator
> > in one pass, and then summing those two 32-bit words into a single
> > 32-bit word, would allow you to test the input for quadratic residues
> > mod those five prime values. This will knock out about 95% of the
> > non-squares, assuming the input distribution isn't malicious.
> With Horner's rule and up to two 32x32->64-bit multiplications per
> step, one can also do any other modulus less than 2**32.

It might be worthwhile. Additional tests to exclude non-squares should benefit average case timing. After all, there are relatively few squares when you get out to 100 million digits, so proportionally more effort to eliminate non-squares will improve average speed up to a point. Perhaps the composite modulus:

2^48 - 1 = (3^2)*5*7*13*17*97*241*257*673

will do better if, as on Intel-like platforms, we have some flexibility in operand size. The ideal would be numerous small coprime moduli, small enough that quadratic residues can be found by lookup.

regards, chip
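[The "quadratic residues by lookup" idea can be sketched with bitmask tables. The particular moduli 64, 63, 65, 11 below are a common choice (pairwise coprime, and small enough that each table fits in a machine word or two), though the thread doesn't prescribe them; roughly 99% of random non-squares fail at least one of the four lookups:]

```python
def qr_mask(m: int) -> int:
    """Bitmask whose bit r is set iff r is a square modulo m."""
    mask = 0
    for i in range(m):
        mask |= 1 << ((i * i) % m)
    return mask

# Pairwise coprime moduli small enough for table lookup.
MODULI = (64, 63, 65, 11)
MASKS = {m: qr_mask(m) for m in MODULI}

def small_modulus_filter(n: int) -> bool:
    """False: definitely not a square.  True: inconclusive."""
    return all((MASKS[m] >> (n % m)) & 1 for m in MODULI)
```

Mod 64 there are 12 residues, mod 63 there are 16, mod 65 there are 21, and mod 11 there are 6, so a uniformly random non-square passes all four tests with probability about 0.8%.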
From: Phil Carmody on 16 Apr 2010 03:04

Chip Eastham <hardmath(a)gmail.com> writes:
> On Apr 15, 7:06 pm, Michael Hennebry <henne...(a)web.cs.ndsu.nodak.edu> wrote:
>> On Apr 15, 1:58 pm, Chip Eastham <hardm...(a)gmail.com> wrote:
>>> - 2^32 - 1 = 3*5*17*257*65537 has 5 distinct prime factors, so if I
>>> understood the "100 million digits" aright, your input may consist of
>>> 10 million or so 32-bit words. Summing these into a 64-bit accumulator
>>> in one pass, and then summing those two 32-bit words into a single
>>> 32-bit word, would allow you to test the input for quadratic residues
>>> mod those five prime values. This will knock out about 95% of the
>>> non-squares, assuming the input distribution isn't malicious.
>> With Horner's rule and up to two 32x32->64-bit multiplications per
>> step, one can also do any other modulus less than 2**32.
> It might be worthwhile. Additional tests to exclude non-squares should
> benefit average case timing. After all, there are relatively few squares
> when you get out to 100 million digits, so proportionally more effort to
> eliminate non-squares will improve average speed up to a point. Perhaps
> the composite modulus:
>
> 2^48 - 1 = (3^2)*5*7*13*17*97*241*257*673
>
> will do better if, as on Intel-like platforms, we have some flexibility
> in operand size. The ideal would be numerous small coprime moduli, small
> enough that quadratic residues can be found by lookup.

If you're doing 2^48-1, then it's probably quicker to just do 2^96-1 in 3 32-bit registers. Or even 2^192-1 in 3 64-bit registers. That can be done as quickly as an add.

And don't forget even numbers. I have a bit twiddle which rejects all numbers that have an odd number of trailing 0 bits in their lowest word, and those for which the lowest set bit is preceded by a 1, in only 5 elementary operations. It's basically like the old mod 8 trick (0 or 1 only), but also addresses n>>2 if n%4==0, n>>4 if n%16==0, etc.
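[Phil doesn't show his 5-operation sequence, but the two conditions he states can be reconstructed along these lines; this is a reconstruction of the idea, not necessarily his exact instruction count. `w` is the lowest 32-bit word of the number and is assumed nonzero, so that all the number's trailing zeros lie within it:]

```python
EVEN_BITS = 0x55555555  # bits 0, 2, 4, ... of a 32-bit word

def low_word_filter(w: int) -> bool:
    """Screen the nonzero low 32-bit word of a candidate square.
    A square must have an even number of trailing zero bits, and the
    bit just above its lowest set bit must be clear (its odd part is
    congruent to 1 mod 4; squares in fact satisfy 1 mod 8).
    False: definitely not a square.  True: inconclusive."""
    b = w & -w                 # isolate the lowest set bit
    if b & EVEN_BITS == 0:     # lowest set bit at an odd position:
        return False           #   odd number of trailing zeros
    return w & (b << 1) == 0   # reject if a 1 sits just above it
```

This is exactly the "mod 8 trick applied to n>>2, n>>4, ..." behaviour: stripping any even run of trailing zeros leaves an odd part that must end in binary ...01 to survive.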
Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
From: Nicolas Bonneel on 16 Apr 2010 19:23

recoder wrote:
> The binary number is in the range of 100 k - 100 million digits.
> For me, state of the art means speed.
> Deterministic might take a long time; high probability is speedier, I guess.

With such a range, and if probabilistic is enough, I guess I can do better than chance, and with quite high probability at zero cost, by always returning false...