From: Richard Cooper on 11 Oct 2005 06:48

On Tue, 11 Oct 2005 03:04:28 -0400, ?a\/b <al(a)f.g> wrote:

> this should be a/b =(a.sn.sn<<(32*precision))/b.sn; ?

Yes.  It's because when you store a number as a fixed point number, all
that you are really doing is storing the number multiplied by a constant
value.  We usually make that constant 256 or 65536 or something of that
sort, but it could just as well be 7, 69, 111, anything.

Now, that fourth grade math I was talking about...  When I was in fourth
grade, we were taught how to do long division with decimal numbers.  We
were doing problems like this:

      _____
1.2 ) 37

The way we were taught to do this was to convert 1.2 to an integer, by
multiplying it by 10 enough times to do so.  So for this particular
problem, we would multiply it by 10 once to get 12.  So that we came up
with the correct answer, we also multiplied 37 by 10, so that the problem
looked like this:

     ______
12 ) 370

Then it was just an ordinary integer division problem.

This is similar to what we're doing with fixed point numbers.  We're
multiplying them so that they become integers, and then performing
operations on integers instead.

Now in 4th grade, we multiplied both numbers in the problem by the same
amount, because we wanted to get the correct answer when we were done.
Doing the division would remove the constant that we multiplied each
number by, so that the answer we got was the correct one.  For a division
problem, "n / d", we basically turned it into "(10 * n) / (10 * d)", and
of course the "10 / 10" part of that doesn't affect the answer, since
10 / 10 is 1, and so the answer came out multiplied by 1.

Now in fixed point math we actually want the answer to also be multiplied
by 10 (or whatever).  So we've got to do the equivalent of
"(10 * 10 * n) / (10 * d)" so that the 10 in the denominator eliminates
one of the 10s in the numerator, leaving us with the other 10 that we
want the answer to be multiplied by.

Now before you had written:

a/b =(a.sn.sn<<(2*32*precision))/b.sn;

Which is basically:

a/b =(a.sn.sn<<(32*precision)<<(32*precision))/b.sn;

A single "<<(32*precision)" is just like a "* 10".  The problem with this
is that a.sn.sn (what's with the two 'sn' there anyway?) already had one
"<<(32*precision)" applied to it when it was converted to the fixed point
format.  So even though we want the numerator to be "n * 10 * 10", its
representation in the fixed point format is already "* 10", so it only
needs one more "* 10" to make it "* 10 * 10".  The same is true of the
denominator, which also needs to be "* 10" for it all to work out, but
it's already "* 10" because it's in fixed point format, so we don't do
anything to it.

>> i think it is more easy than what you say
>> for input something like this:
>> if Lstring is the string of numbers and "a" is the number
>> result
>> fnum a, b;
>> unum c;
>> char *p1, *p2;
>> b.sn=read(Lstring, p1);
>
> a=b<<(32*precision);
>
>> if(*p1==".") ++p1;
>> else goto label;
>> c=read(p1, p2);
>
>> if(p1==p2) {label:;
>
>> return a;
>> }
>
>> if(precision<c.len)
>> {k=c.len-precision;
>> c>>= 32*k;
>> }
>> a+=c;
>> return a;
>>
>> for output, something like above, should be possible

That certainly doesn't look any easier.  It won't work either.  (Not
that I claim to understand it, but what of it that I do understand, it
won't work.)

The problem is that it's just not that simple a conversion.  For
example, 0.69 in binary is 0.1011000010100011, whereas 69 in binary is
1000101, and as you see, there's just no similarity in the bit patterns.
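[To see the "(10 * 10 * n) / (10 * d)" bookkeeping as code: below is a
minimal C sketch of fixed point divide, plus its mirror image, multiply.
It collapses the thread's multi-word "32*precision" representation into
one 32-bit value with 8 fractional bits, so the type and the names are
my simplifications, not the fnum type being discussed.]

    #include <stdint.h>

    #define FRAC_BITS 8     /* plays the role of "32*precision" */

    typedef int32_t fixed;  /* stores (real value) * 2^FRAC_BITS */

    /* n and d each already carry one factor of K = 2^FRAC_BITS.  For
       the quotient to still carry one K we need (n*K*K) / (d*K), so
       give the numerator its second K, widened to 64 bits first. */
    fixed fixed_div(fixed n, fixed d)
    {
        return (fixed)(((int64_t)n * (1 << FRAC_BITS)) / d);
    }

    /* Multiplication is the opposite: (n*K) * (d*K) = (n*d) * K * K,
       one K too many, so shift the 64-bit product right by FRAC_BITS.
       (The signed right shift assumes the usual arithmetic-shift
       behavior of mainstream compilers.) */
    fixed fixed_mul(fixed n, fixed d)
    {
        return (fixed)(((int64_t)n * d) >> FRAC_BITS);
    }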
Now if you took the binary form of 69 and divided it by 100 (decimal),
then you'd have the correct bit pattern.  However, that's basically just
doing what I said, except that it breaks the number into two halves and
does each separately.  I fail to see how going through all of that
trouble is any easier than the method I outlined.

>> if(precision<c.len)
>> {k=c.len-precision;
>> c>>= 32*k;
>> }

You particularly want to account for the cases with lots of extra
decimal places.  For example, if I were doing 32-bit numbers with 8
fractional bits, I'd likely be reading the number into a 64-bit integer.
The integer part is 24 bits, so that means I have 40 bits left over that
I can toss extra decimal places into.

So say I get this number as input:
3.1415926535897932384626433832795028841971693993751

Those 40 bits can hold log_10(2^40) or 12.04119983 decimal digits.  (Or
if you wanted to do it the easy way, you'd just divide 40 by 3.321928095
and get the same number.)  So what I'd do in my number reading routine
is make it just always read 12 digits after the decimal point.  So if I
gave it the above number, it would treat it as 3.141592653589, and if I
gave it 3, it would treat it as 3.000000000000.

So it works like this:

1. Let x = 0 and let c = 12.
2. Read next character.
3. If character is a digit, multiply x by 10 and then add that digit.
4. If character is a decimal point, go to step 10.
5. If character is anything else, go to step 20.
6. Go to step 2.

10. Read next character.
11. If character is a digit, multiply x by 10 and then add that digit.
12. If character is anything else, go to step 20.
13. Subtract 1 from c.
14. If c == 0, go to step 30.
15. Go to step 10.

20. Multiply x by 10.
21. Subtract 1 from c.
22. If c != 0, go to step 20.

30. Shift x left by 8 bits (or whatever your "32*precision" is).
31. Add 500000000000 (half of 10^12) to x.
32. Divide x by 1000000000000 (which is 10^12).
33. Rejoice that you now have the number in fixed point format.

So steps 1 through 22 convert 3.141592653589 to 0x2DB75839F15.  You may
notice that the next digit in pi is a 7, but that I didn't provide for
rounding up the 9.  This is because so many of those digits aren't even
going to make it into the final number that the error caused by not
doing so is very very very small.  Like 0.00000001% small.  So it's just
not worth the trouble.

Step 30 adds in the normal "<< 8" format of the numbers, which turns it
into 0x2DB75839F1500.  Then step 31 takes care of rounding.  Then step
32 moves the decimal point 12 places left, turning the number into
0x324.  That makes it 11.00100100 in binary, which in decimal is
3.140625, which is as close to pi as you're going to get with only 8
fractional bits.

So for the most part, most of those 12 digits we looked at were
unnecessary; however, on extremely rare occasions they actually do make
a difference.  For example, 1.005859374 would become 0x101, but
1.005859376 would become 0x102.  Not really worth making a fuss over,
but as long as you've got those extra 40 bits and can input 12 digits,
you might as well.
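[Steps 1 through 33 map almost line for line onto C.  Here is a minimal
sketch of that reading routine -- my translation, not Richard's actual
code -- assuming a NUL-terminated string, a non-negative value, and the
"<< 8" format from the post.]

    #include <stdint.h>

    #define FRAC_BITS  8                  /* the "<< 8" of step 30          */
    #define DEC_DIGITS 12                 /* always read 12 fraction digits */
    #define DEC_SCALE  1000000000000ULL   /* 10^12, the divisor of step 32  */

    uint32_t read_fixed(const char *s)
    {
        uint64_t x = 0;
        int c = DEC_DIGITS;

        /* steps 1-6: accumulate the integer part */
        while (*s >= '0' && *s <= '9')
            x = x * 10 + (uint64_t)(*s++ - '0');

        /* steps 10-15: up to 12 digits after the decimal point */
        if (*s == '.') {
            s++;
            while (c > 0 && *s >= '0' && *s <= '9') {
                x = x * 10 + (uint64_t)(*s++ - '0');
                c--;
            }
        }

        /* steps 20-22: pad with zeros if fewer than 12 digits appeared */
        while (c-- > 0)
            x *= 10;

        /* steps 30-33: scale to fixed point, round, remove the 10^12 */
        x <<= FRAC_BITS;                  /* beware: can overflow 64 bits   */
        x += DEC_SCALE / 2;               /* for integer parts near 24 bits */
        return (uint32_t)(x / DEC_SCALE);
    }

[Fed "3.1415926535897932384626433832795028841971693993751", it
accumulates 3141592653589 (0x2DB75839F15), shifts and rounds to
804747719318784, and divides down to 0x324, exactly as the walkthrough
above.]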
From: ?a/b on 13 Oct 2005 12:45

On Tue, 11 Oct 2005 10:48:35 GMT, "Richard Cooper"
<spamandviruses(a)rr.com> wrote:

>On Tue, 11 Oct 2005 03:04:28 -0400, ?a\/b <al(a)f.g> wrote:
>>> i think it is more easy than what you say
>>> for input something like this:
>>> if Lstring is the string of numbers and "a" is the number
>>> result
>>> fnum a, b;
>>> unum c;
>>> char *p1, *p2;
>>> b.sn=read(Lstring, p1);
>>
>> a=b<<(32*precision);
>>
>>> if(*p1==".") ++p1;
>>> else goto label;
>>> c=read(p1, p2);
>>
>>> if(p1==p2) {label:;
>>
>>> return a;
>>> }
>>
>>> if(precision<c.len)
>>> {k=c.len-precision;
>>> c>>= 32*k;
>>> }
>>> a+=c;
>>> return a;
>>>
>>> for output, something like above, should be possible
>
>That certainly doesn't look any easier.  It won't work either.  (Not
>that I claim to understand it, but what of it that I do understand, it
>won't work.)

yes you are right

>The problem is that it's just not that simple a conversion.  For
>example, 0.69 in binary is 0.1011000010100011, whereas 69 in binary is
>1000101, and as you see, there's just no similarity in the bit patterns.
>
>Now if you took the binary form of 69 and divided it by 100 (decimal),
>then you'd have the correct bit pattern.  However, that's basically
>just doing what I said, except that it breaks the number into two
>halves and does each separately.  I fail to see how going through all
>of that trouble is any easier than the method I outlined.
>
>>> if(precision<c.len)
>>> {k=c.len-precision;
>>> c>>= 32*k;
>>> }
>
>You particularly want to account for the cases with lots of extra
>decimal places.  For example, if I were doing 32-bit numbers with 8
>fractional bits, I'd likely be reading the number into a 64-bit
>integer.  The integer part is 24 bits, so that means I have 40 bits
>left over that I can toss extra decimal places into.
>
>So say I get this number as input:
>3.1415926535897932384626433832795028841971693993751
>
>Those 40 bits can hold log_10(2^40) or 12.04119983 decimal digits.  (Or

this is false: 40 bits hold [log_10(2^40)+1]=13 decimal digits
(log_10(128) = log_10(2^7) = 2.1 != 3)

>if you wanted to do it the easy way, you'd just divide 40 by
>3.321928095 and get the same number.)  So what I'd do in my number
>reading routine is make it just always read 12 digits after the
>decimal point.

the problem is not so complex, and I have found an easy "formula" that
quickly finds the array of a big fixed point integer.  For the inverse
problem I use the same formula.  But there is a problem in the output of
numbers that have many '0's, like

5.00000000012 -> 5.00000000000

and this is not a good thing for 64 bits of fraction data
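[One plausible origin of that disappearing-zeros problem -- a guess on
my part, since the thread never shows the output code -- is printing the
fraction as a bare integer, which throws away its leading zeros.  Below
is a sketch of digit-at-a-time output that keeps them, assuming a 32.32
layout and invented names.]

    #include <stdio.h>
    #include <stdint.h>

    /* x holds (value * 2^32): high 32 bits integer, low 32 bits
       fraction.  Multiplying the fraction word by 10 pushes the next
       decimal digit into the high bits, so zeros are emitted
       explicitly, in order, instead of being dropped. */
    void print_fixed(uint64_t x, int digits)
    {
        uint32_t frac = (uint32_t)x;

        printf("%u.", (unsigned)(x >> 32));
        while (digits-- > 0) {
            uint64_t t = (uint64_t)frac * 10;
            printf("%u", (unsigned)(t >> 32));  /* next digit */
            frac = (uint32_t)t;                 /* remainder  */
        }
        printf("\n");
    }

[For example, print_fixed(1ULL << 31, 3) prints 0.500; the two trailing
zeros are produced by the loop rather than lost.]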
From: ?a/b on 14 Oct 2005 01:42

On Thu, 13 Oct 2005 18:45:43 +0200, "?a\\/b" <al(a)f.g> wrote:

>On Tue, 11 Oct 2005 10:48:35 GMT, "Richard Cooper"
><spamandviruses(a)rr.com> wrote:
....
>but there is a problem in the output of numbers that have many '0's,
>like
>
>5.00000000012 -> 5.00000000000
>
>and this is not a good thing for 64 bits of fraction data

Found the origin of the problem, and found the cure for it.  So there
are + - * / < <= > >= == != >>(stream, fnum) <<(stream, fnum) for the
fixed point number.  Then there is the need of something like double to
fnum and fnum to double ... and the operators +, -, etc. between double
and fnum seem "a walk in the park"
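[For the double <-> fnum conversions mentioned above, the usual trick is
scaling by 2^precision with rounding, and dividing the same power of two
back out.  A sketch under the same single-word simplification used
earlier; fnum's real layout never appears in the thread, so these names
are assumptions.]

    #include <stdint.h>
    #include <math.h>

    #define FRAC_BITS 8

    typedef int32_t fixed;

    fixed double_to_fixed(double d)
    {
        /* ldexp(d, FRAC_BITS) is d * 2^FRAC_BITS; round to nearest */
        return (fixed)lround(ldexp(d, FRAC_BITS));
    }

    double fixed_to_double(fixed f)
    {
        /* exact: a 32-bit integer divided by a power of two always
           fits in a double without rounding */
        return ldexp((double)f, -FRAC_BITS);
    }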
From: Richard Cooper on 15 Oct 2005 04:57
On Sat, 15 Oct 2005 02:19:23 -0400, ?a\/b <al(a)f.g> wrote:

>>> this is false: 40 bits hold [log_10(2^40)+1]=13 decimal digits
>>
>> Nope, I'm right.
>
> no you are wrong.  [log_10(x)+1] where []="integer part" is how many
> digits the number x has.  For x=2^40 the digits are 13 =[log_10(2^40)+1]

But "how many digits is the number 2^x" isn't the same as "how many
digits will x bits hold"...  Look again at what I said:

>> 2^40 = 1099511627776
>> 10^12 = 1000000000000
>> 10^13 = 10000000000000
>>
>> So, 13 decimal digits are capable of holding more possible values
>> than 40 bits can hold; however, 40 bits can hold slightly more than
>> what 12 decimal places can hold.  So I think my answer of 12.04119983
>> is a little more correct, as 40 bits can hold as many values as 12
>> decimal digits, plus just a little bit more.

Maybe it's time for another example:

0.0000000000001 (1e-13) in binary is
0.00000000000000000000000000000000000000000001110000100110...
  00000000011111111112222222222333333333344444444445555555
  12345678901234567890123456789012345678901234567890123456

0.0000000000002 (2e-13) in binary is
0.00000000000000000000000000000000000000000011100001001100...
  00000000011111111112222222222333333333344444444445555555
  12345678901234567890123456789012345678901234567890123456

Fewer than 43 bits isn't going to differentiate those two numbers.
That's to be expected, since log_2(10^13) is 43.18506523, indicating
that 44 bits are required for 13-digit decimal numbers.  (Of course,
just as 43 bits is enough for these two numbers, 43 bits will
differentiate some other 13-digit numbers too, 12% of them actually;
the other 88% will share the same binary representation with a
neighboring number, just like 1e-13 and 2e-13.  You need 44 bits for
100% of the 13-digit numbers to each have a distinct binary
representation.)

> if i have the decimal number in string format
>
> "-1234568788887877.0000839393948450154454454"
>
> then integer_number=-1234568788887877
>
> i have the fractional part of the number e.g.
> s ="0.0000839393948450154454454"
> if r = 839393948450154454454; like big integer
> the "formula" *seems* to be this:
>
> precision = precision_in_bits_of_fractional_part
>
>                       2^(precision) * r
> Fraction_number = [ --------------------- ]
>                      10^(strlen(s)-2)

That's what I said to do, except that you've made twice as much work
out of it.  What you're doing is, by ignoring the "0." at the beginning
of s, multiplying the fractional part by 10^(strlen(s)-2).  Then, when
you multiply by 2^precision, you shift it left by precision.  Then you
divide it by 10^(strlen(s)-2), which is the same number that you
effectively multiplied it by earlier.

So we've got what you're doing:

1. Separate the number into integer and fraction parts.
2. Convert the integer part to binary.
3. Shift the integer part left by precision.
4. Convert the fraction part to binary as if it were an integer.
5. Shift the fraction part left by precision.
6. Divide the fraction part by what step 4 effectively multiplied it by.
7. Add or subtract together the integer and fraction parts.

Then there's what I said to do:

1. Convert the entire number to binary as if it were an integer.
2. Shift the entire number left by precision.
3. Divide the number by what step 1 effectively multiplied it by.

Steps 2 and 3 of what you're doing are the same process as steps 4 and
5, so if you just skip step 1 then you get to do them both at the same
time, and as an added bonus, you no longer need step 7 to combine the
two numbers back together again.
So you effectively cut it down to just steps 4, 5, and 6, which makes it
just like my steps 1, 2, and 3.

You may, however, just want to stick with what you've got, as it may
have optimization benefits depending on the number of bits in the
numbers.  I'm _still_ not sure what you're doing, so I can't say which
way is faster in your particular case, but it depends on whether the
divide in step 6 of your method is significantly easier than the divide
in step 3 of my method.
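[Richard's three-step version, written out as C: a sketch with invented
names, whose 64-bit intermediate limits it to inputs short enough not
to overflow, roughly 19 significant digits.]

    #include <stdint.h>

    #define FRAC_BITS 8

    uint32_t string_to_fixed(const char *s)
    {
        uint64_t x = 0, div = 1;
        int seen_point = 0;

        /* step 1: convert all digits as one integer, noting how many
           fall after the decimal point */
        for (; *s; s++) {
            if (*s == '.') {
                seen_point = 1;
            } else if (*s >= '0' && *s <= '9') {
                x = x * 10 + (uint64_t)(*s - '0');
                if (seen_point)
                    div *= 10;   /* one more implied factor of 10 */
            } else {
                break;
            }
        }

        /* steps 2 and 3: shift left by the precision, then divide out,
           with rounding, what ignoring the point multiplied in */
        return (uint32_t)(((x << FRAC_BITS) + div / 2) / div);
    }

[string_to_fixed("3.14") gives 0x324, the same value the 12-digit
reading routine earlier in the thread reaches for pi with 8 fractional
bits.]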