From: Richard Cooper on 11 Oct 2005 06:48

On Tue, 11 Oct 2005 03:04:28 -0400, ?a\/b <al(a)f.g> wrote:

> this should be a/b =(a.sn.sn<<(32*precision))/b.sn; ?

Yes.  It's because when you store a number as a fixed point number, all
that you are really doing is storing the number multiplied by a constant
value.  We usually make that constant 256 or 65536 or something of that
sort, but it could just as well be 7, 69, 111, anything.

Now, that fourth grade math I was talking about...  When I was in fourth
grade, we were taught how to do long division with decimal numbers.  We
were doing problems like this:

      _____
1.2 ) 37

The way we were taught to do this was to convert 1.2 to an integer, by
multiplying it by 10 enough times to do so.  So for this particular
problem, we would multiply it by 10 once to get 12.  So that we came up
with the correct answer, we also multiplied 37 by 10, so that the problem
looked like this:

     ______
12 ) 370

Then it was just an ordinary integer division problem.

This is similar to what we're doing with fixed point numbers.  We're
multiplying them so that they become integers, and then performing
operations on integers instead.

Now in 4th grade, we multiplied both numbers in the problem by the same
amount, because we wanted to get the correct answer when we were done.
Doing the division would remove the constant that we multiplied each
number by, so that the answer we got was the correct one.  For a division
problem, "n / d", we basically turned it into "(10 * n) / (10 * d)", and
of course the "10 / 10" part of that doesn't affect the answer, since
10 / 10 is 1, and so the answer came out multiplied by 1.

Now in fixed point math we actually want the answer to also be multiplied
by 10 (or whatever).  So we've got to do the equivalent of
"(10 * 10 * n) / (10 * d)" so that the 10 in the denominator eliminates
one of the 10s in the numerator, leaving us with the other 10 that we
want the answer to be multiplied by.

Now before you had written:

a/b =(a.sn.sn<<(2*32*precision))/b.sn;

Which is basically:

a/b =(a.sn.sn<<(32*precision)<<(32*precision))/b.sn;

A single "<<(32*precision)" is just like a "* 10".  The problem with this
is that a.sn.sn (what's with the two 'sn' there anyway?) already had one
"<<(32*precision)" applied to it when it was converted to the fixed point
format.  So even though we want the numerator to be "n * 10 * 10", its
representation in the fixed point format is already "* 10", so it only
needs one more "* 10" to make it "* 10 * 10".  The same is true of the
denominator, which also needs to be "* 10" for it all to work out, but
it's already "* 10" because it's in fixed point format, so we don't do
anything to it.

>> i think it is more easy than what you say
>> for input something like this:
>> if Lstring is the string of numbers and "a" is the number
>> result
>> fnum a, b;
>> unum c;
>> char *p1, *p2;
>> b.sn=read(Lstring, p1);
>
> a=b<<(32*precision);
>
>> if(*p1==".") ++p1;
>> else goto label;
>> c=read(p1, p2);
>
>> if(p1==p2) {label:;
>
>> return a;
>> }
>
>> if(precision<c.len)
>> {k=c.len-precision;
>> c>>= 32*k;
>> }
>> a+=c;
>> return a;
>>
>> for output, something like above, should be possible

That certainly doesn't look any easier.  It won't work either.  (Not
that I claim to understand it, but what of it that I do understand, it
won't work.)

The problem is that it's just not that simple a conversion.  For
example, 0.69 in binary is 0.1011000010100011, whereas 69 in binary is
1000101, and as you see, there's just no similarity in the bit patterns.
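[To see the "(10 * 10 * n) / (10 * d)" bookkeeping as code: below is a
minimal C sketch of fixed point divide, plus its mirror image, multiply.
It collapses the thread's multi-word "32*precision" representation into
one 32-bit value with 8 fractional bits, so the type and the names are
my simplifications, not the fnum type being discussed.]

    #include <stdint.h>

    #define FRAC_BITS 8     /* plays the role of "32*precision" */

    typedef int32_t fixed;  /* stores (real value) * 2^FRAC_BITS */

    /* n and d each already carry one factor of K = 2^FRAC_BITS.  For
       the quotient to still carry one K we need (n*K*K) / (d*K), so
       give the numerator its second K, widened to 64 bits first. */
    fixed fixed_div(fixed n, fixed d)
    {
        return (fixed)(((int64_t)n * (1 << FRAC_BITS)) / d);
    }

    /* Multiplication is the opposite: (n*K) * (d*K) = (n*d) * K * K,
       one K too many, so shift the 64-bit product right by FRAC_BITS.
       (The signed right shift assumes the usual arithmetic-shift
       behavior of mainstream compilers.) */
    fixed fixed_mul(fixed n, fixed d)
    {
        return (fixed)(((int64_t)n * d) >> FRAC_BITS);
    }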
Now if you took the binary form of 69 and divided it by 100 (decimal),
then you'd have the correct bit pattern.  However, that's basically just
doing what I said, except that it breaks the number into two halves and
does each separately.  I fail to see how going through all of that
trouble is any easier than the method I outlined.

>> if(precision<c.len)
>> {k=c.len-precision;
>> c>>= 32*k;
>> }

You particularly want to account for the cases with lots of extra
decimal places.  For example, if I were doing 32-bit numbers with 8
fractional bits, I'd likely be reading the number into a 64-bit integer.
The integer part is 24 bits, so that means I have 40 bits left over that
I can toss extra decimal places into.

So say I get this number as input:
3.1415926535897932384626433832795028841971693993751

Those 40 bits can hold log_10(2^40) or 12.04119983 decimal digits.  (Or
if you wanted to do it the easy way, you'd just divide 40 by 3.321928095
and get the same number.)  So what I'd do in my number reading routine
is make it just always read 12 digits after the decimal point.  So if I
gave it the above number, it would treat it as 3.141592653589, and if I
gave it 3, it would treat it as 3.000000000000.

So it works like this:

1. Let x = 0 and let c = 12.
2. Read next character.
3. If character is a digit, multiply x by 10 and then add that digit.
4. If character is a decimal point, go to step 10.
5. If character is anything else, go to step 20.
6. Go to step 2.

10. Read next character.
11. If character is a digit, multiply x by 10 and then add that digit.
12. If character is anything else, go to step 20.
13. Subtract 1 from c.
14. If c == 0, go to step 30.
15. Go to step 10.

20. Multiply x by 10.
21. Subtract 1 from c.
22. If c != 0, go to step 20.

30. Shift x left by 8 bits (or whatever your "32*precision" is).
31. Add 500000000000 (half of 10^12) to x.
32. Divide x by 1000000000000 (which is 10^12).
33. Rejoice that you now have the number in fixed point format.

So steps 1 through 22 convert 3.141592653589 to 0x2DB75839F15.  You may
notice that the next digit in pi is a 7, but that I didn't provide for
rounding up the 9.  This is because so many of those digits aren't even
going to make it into the final number that the error caused by not
doing so is very very very small.  Like 0.00000001% small.  So it's just
not worth the trouble.

Step 30 adds in the normal "<< 8" format of the numbers, which turns it
into 0x2DB75839F1500.  Then step 31 takes care of rounding.  Then step
32 moves the decimal point 12 places left, turning the number into
0x324.  That makes it 11.00100100 in binary, which in decimal is
3.140625, which is as close to pi as you're going to get with only 8
fractional bits.

So for the most part, most of those 12 digits we looked at were
unnecessary; however, on extremely rare occasions they actually do make
a difference.  For example, 1.005859374 would become 0x101, but
1.005859376 would become 0x102.  Not really worth making a fuss over,
but as long as you've got those extra 40 bits and can input 12 digits,
you might as well.
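[Steps 1 through 33 map almost line for line onto C.  Here is a minimal
sketch of that reading routine -- my translation, not Richard's actual
code -- assuming a NUL-terminated string, a non-negative value, and the
"<< 8" format from the post.]

    #include <stdint.h>

    #define FRAC_BITS  8                  /* the "<< 8" of step 30          */
    #define DEC_DIGITS 12                 /* always read 12 fraction digits */
    #define DEC_SCALE  1000000000000ULL   /* 10^12, the divisor of step 32  */

    uint32_t read_fixed(const char *s)
    {
        uint64_t x = 0;
        int c = DEC_DIGITS;

        /* steps 1-6: accumulate the integer part */
        while (*s >= '0' && *s <= '9')
            x = x * 10 + (uint64_t)(*s++ - '0');

        /* steps 10-15: up to 12 digits after the decimal point */
        if (*s == '.') {
            s++;
            while (c > 0 && *s >= '0' && *s <= '9') {
                x = x * 10 + (uint64_t)(*s++ - '0');
                c--;
            }
        }

        /* steps 20-22: pad with zeros if fewer than 12 digits appeared */
        while (c-- > 0)
            x *= 10;

        /* steps 30-33: scale to fixed point, round, remove the 10^12 */
        x <<= FRAC_BITS;                  /* beware: can overflow 64 bits   */
        x += DEC_SCALE / 2;               /* for integer parts near 24 bits */
        return (uint32_t)(x / DEC_SCALE);
    }

[Fed "3.1415926535897932384626433832795028841971693993751", it
accumulates 3141592653589 (0x2DB75839F15), shifts and rounds to
804747719318784, and divides down to 0x324, exactly as the walkthrough
above.]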
From: ?a/b on 13 Oct 2005 12:45

On Tue, 11 Oct 2005 10:48:35 GMT, "Richard Cooper"
<spamandviruses(a)rr.com> wrote:

>On Tue, 11 Oct 2005 03:04:28 -0400, ?a\/b <al(a)f.g> wrote:
>>> i think it is more easy than what you say
>>> for input something like this:
>>> if Lstring is the string of numbers and "a" is the number
>>> result
>>> fnum a, b;
>>> unum c;
>>> char *p1, *p2;
>>> b.sn=read(Lstring, p1);
>>
>> a=b<<(32*precision);
>>
>>> if(*p1==".") ++p1;
>>> else goto label;
>>> c=read(p1, p2);
>>
>>> if(p1==p2) {label:;
>>
>>> return a;
>>> }
>>
>>> if(precision<c.len)
>>> {k=c.len-precision;
>>> c>>= 32*k;
>>> }
>>> a+=c;
>>> return a;
>>>
>>> for output, something like above, should be possible
>
>That certainly doesn't look any easier.  It won't work either.  (Not
>that I claim to understand it, but what of it that I do understand, it
>won't work.)

yes you are right

>The problem is that it's just not that simple a conversion.  For
>example, 0.69 in binary is 0.1011000010100011, whereas 69 in binary is
>1000101, and as you see, there's just no similarity in the bit patterns.
>
>Now if you took the binary form of 69 and divided it by 100 (decimal),
>then you'd have the correct bit pattern.  However, that's basically
>just doing what I said, except that it breaks the number into two
>halves and does each separately.  I fail to see how going through all
>of that trouble is any easier than the method I outlined.
>
>>> if(precision<c.len)
>>> {k=c.len-precision;
>>> c>>= 32*k;
>>> }
>
>You particularly want to account for the cases with lots of extra
>decimal places.  For example, if I were doing 32-bit numbers with 8
>fractional bits, I'd likely be reading the number into a 64-bit
>integer.  The integer part is 24 bits, so that means I have 40 bits
>left over that I can toss extra decimal places into.
>
>So say I get this number as input:
>3.1415926535897932384626433832795028841971693993751
>
>Those 40 bits can hold log_10(2^40) or 12.04119983 decimal digits.  (Or

this is false: 40 bits hold [log_10(2^40)+1]=13 decimal digits
(log_10(128) = log_10(2^7) = 2.1 != 3)

>if you wanted to do it the easy way, you'd just divide 40 by
>3.321928095 and get the same number.)  So what I'd do in my number
>reading routine is make it just always read 12 digits after the
>decimal point.

the problem is not so complex, and I have found an easy "formula" that
quickly finds the array of a big fixed point integer.  For the inverse
problem I use the same formula.  But there is a problem in the output of
numbers that have many '0's, like

5.00000000012 -> 5.00000000000

and this is not a good thing for 64 bits of fraction data
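[One plausible origin of that disappearing-zeros problem -- a guess on
my part, since the thread never shows the output code -- is printing the
fraction as a bare integer, which throws away its leading zeros.  Below
is a sketch of digit-at-a-time output that keeps them, assuming a 32.32
layout and invented names.]

    #include <stdio.h>
    #include <stdint.h>

    /* x holds (value * 2^32): high 32 bits integer, low 32 bits
       fraction.  Multiplying the fraction word by 10 pushes the next
       decimal digit into the high bits, so zeros are emitted
       explicitly, in order, instead of being dropped. */
    void print_fixed(uint64_t x, int digits)
    {
        uint32_t frac = (uint32_t)x;

        printf("%u.", (unsigned)(x >> 32));
        while (digits-- > 0) {
            uint64_t t = (uint64_t)frac * 10;
            printf("%u", (unsigned)(t >> 32));  /* next digit */
            frac = (uint32_t)t;                 /* remainder  */
        }
        printf("\n");
    }

[For example, print_fixed(1ULL << 31, 3) prints 0.500; the two trailing
zeros are produced by the loop rather than lost.]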
From: ?a/b on 14 Oct 2005 01:42

On Thu, 13 Oct 2005 18:45:43 +0200, "?a\\/b" <al(a)f.g> wrote:

>On Tue, 11 Oct 2005 10:48:35 GMT, "Richard Cooper"
><spamandviruses(a)rr.com> wrote:
....
>but there is a problem in the output of numbers that have many '0's,
>like
>
>5.00000000012 -> 5.00000000000
>
>and this is not a good thing for 64 bits of fraction data

Found the origin of the problem, and found the cure for it.  So there
are + - * / < <= > >= == != >>(stream, fnum) <<(stream, fnum) for the
fixed point number.  Then there is the need of something like double to
fnum and fnum to double ... and the operators +, -, etc. between double
and fnum seem "a walk in the park"
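[For the double <-> fnum conversions mentioned above, the usual trick is
scaling by 2^precision with rounding, and dividing the same power of two
back out.  A sketch under the same single-word simplification used
earlier; fnum's real layout never appears in the thread, so these names
are assumptions.]

    #include <stdint.h>
    #include <math.h>

    #define FRAC_BITS 8

    typedef int32_t fixed;

    fixed double_to_fixed(double d)
    {
        /* ldexp(d, FRAC_BITS) is d * 2^FRAC_BITS; round to nearest */
        return (fixed)lround(ldexp(d, FRAC_BITS));
    }

    double fixed_to_double(fixed f)
    {
        /* exact: a 32-bit integer divided by a power of two always
           fits in a double without rounding */
        return ldexp((double)f, -FRAC_BITS);
    }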
From: Richard Cooper on 15 Oct 2005 04:57
On Sat, 15 Oct 2005 02:19:23 -0400, ?a\/b <al(a)f.g> wrote:

>>> this is false: 40 bits hold [log_10(2^40)+1]=13 decimal digits
>>
>> Nope, I'm right.
>
> no you are wrong.  [log_10(x)+1] where []="integer part" is how many
> digits the number x has.  For x=2^40 the digits are 13 =[log_10(2^40)+1]

But "how many digits is the number 2^x" isn't the same as "how many
digits will x bits hold"...  Look again at what I said:

>> 2^40 = 1099511627776
>> 10^12 = 1000000000000
>> 10^13 = 10000000000000
>>
>> So, 13 decimal digits are capable of holding more possible values
>> than 40 bits can hold; however, 40 bits can hold slightly more than
>> what 12 decimal places can hold.  So I think my answer of 12.04119983
>> is a little more correct, as 40 bits can hold as many values as 12
>> decimal digits, plus just a little bit more.

Maybe it's time for another example:

0.0000000000001 (1e-13) in binary is
0.00000000000000000000000000000000000000000001110000100110...
  00000000011111111112222222222333333333344444444445555555
  12345678901234567890123456789012345678901234567890123456

0.0000000000002 (2e-13) in binary is
0.00000000000000000000000000000000000000000011100001001100...
  00000000011111111112222222222333333333344444444445555555
  12345678901234567890123456789012345678901234567890123456

Fewer than 43 bits isn't going to differentiate those two numbers.
That's to be expected, since log_2(10^13) is 43.18506523, indicating
that 44 bits are required for 13-digit decimal numbers.  (Of course,
just as 43 bits is enough for these two numbers, 43 bits will
differentiate some other 13-digit numbers too, 12% of them actually;
the other 88% will share the same binary representation with a
neighboring number, just like 1e-13 and 2e-13.  You need 44 bits for
100% of the 13-digit numbers to each have a distinct binary
representation.)

> if i have the decimal number in string format
>
> "-1234568788887877.0000839393948450154454454"
>
> then integer_number=-1234568788887877
>
> i have the fractional part of the number e.g.
> s ="0.0000839393948450154454454"
> if r = 839393948450154454454; like big integer
> the "formula" *seems* to be this:
>
> precision = precision_in_bits_of_fractional_part
>
>                       2^(precision) * r
> Fraction_number = [ --------------------- ]
>                      10^(strlen(s)-2)

That's what I said to do, except that you've made twice as much work
out of it.  What you're doing is, by ignoring the "0." at the beginning
of s, multiplying the fractional part by 10^(strlen(s)-2).  Then, when
you multiply by 2^precision, you shift it left by precision.  Then you
divide it by 10^(strlen(s)-2), which is the same number that you
effectively multiplied it by earlier.

So we've got what you're doing:

1. Separate the number into integer and fraction parts.
2. Convert the integer part to binary.
3. Shift the integer part left by precision.
4. Convert the fraction part to binary as if it were an integer.
5. Shift the fraction part left by precision.
6. Divide the fraction part by what step 4 effectively multiplied it by.
7. Add or subtract together the integer and fraction parts.

Then there's what I said to do:

1. Convert the entire number to binary as if it were an integer.
2. Shift the entire number left by precision.
3. Divide the number by what step 1 effectively multiplied it by.

Steps 2 and 3 of what you're doing are the same process as steps 4 and
5, so if you just skip step 1 then you get to do them both at the same
time, and as an added bonus, you no longer need step 7 to combine the
two numbers back together again.
So you effectively cut it down to just steps 4, 5, and 6, which makes it
just like my steps 1, 2, and 3.

You may, however, just want to stick with what you've got, as it may
have optimization benefits depending on the number of bits in the
numbers.  I'm _still_ not sure what you're doing, so I can't say which
way is faster in your particular case, but it depends on whether the
divide in step 6 of your method is significantly easier than the divide
in step 3 of my method.
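[Richard's three-step version, written out as C: a sketch with invented
names, whose 64-bit intermediate limits it to inputs short enough not
to overflow, roughly 19 significant digits.]

    #include <stdint.h>

    #define FRAC_BITS 8

    uint32_t string_to_fixed(const char *s)
    {
        uint64_t x = 0, div = 1;
        int seen_point = 0;

        /* step 1: convert all digits as one integer, noting how many
           fall after the decimal point */
        for (; *s; s++) {
            if (*s == '.') {
                seen_point = 1;
            } else if (*s >= '0' && *s <= '9') {
                x = x * 10 + (uint64_t)(*s - '0');
                if (seen_point)
                    div *= 10;   /* one more implied factor of 10 */
            } else {
                break;
            }
        }

        /* steps 2 and 3: shift left by the precision, then divide out,
           with rounding, what ignoring the point multiplied in */
        return (uint32_t)(((x << FRAC_BITS) + div / 2) / div);
    }

[string_to_fixed("3.14") gives 0x324, the same value the 12-digit
reading routine earlier in the thread reaches for pi with 8 fractional
bits.]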