How many "Fast Math" routines for XL/XE are there?

thorfdbg · June 24, 2013

Not necessarily, because it does help testing most of your assumptions or implementation observations (above), from one system to another... Cases like sin( ) are particularly important or difficult when hovering around a "zero" argument either from "above" (slightly higher than zero) or from "below" (slightly lower than Pi).

Well, I would rather say that any *sane* implementation of sin() should do exactly as I described. That Atari BASIC does something insane probably proves the point.

I intuitively think that something is wrong with Atari's ROM FP rounding code, besides other embedded numerical code / approximations. AHL's benchmark is yielding abysmal performance for an 8-10 digit BCD system. Edit: Even a cyclic 1/(1/7) or 1/(1/9) will come back to "7" or "9" exact, which seems really odd in a limited-precision BCD system.

That 1/(1/7) returns 7 precisely is just a side-result of how the division works, and of improper (or lucky) rounding. 1/9 does not return a precise result (try 9*(1/9)), but 1/(1/9) returns something a bit larger than 9, and the extra digit is just chopped off.

thorfdbg · June 24, 2013

So here is a really simple (yet revealing) BASIC program that basically performs C=1/(1/n) for n=1...100, and calculates the Error/Delta as (C-n). C should, in reality, be a VERY close number to n, for any n (for the most part, as some cases will yield an exact inverse).

I will later port the above code to the 48GX and run the test using LongFloat library set to an arbitrary precision of nine (9) digits which should be pretty close to Atari's FP math, and come back with such results.

All in all, I am seeing PRETTY LARGE errors on the above list, though, assuming ~9-10 digits BCD math.
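The test described above (C=1/(1/n) for n=1...100, delta = C-n) can be cross-checked on any modern machine. The following is only a sketch using Python's decimal module, not the original BASIC listing; the function name `roundtrip_errors` is made up for illustration, and it assumes every operation is kept to ten significant decimal digits, which is not how the Atari ROM works internally.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN, ROUND_DOWN

def roundtrip_errors(digits=10, rounding=ROUND_HALF_EVEN, upto=100):
    """Return {n: C - n} where C = 1/(1/n), each step kept to `digits` significant digits."""
    ctx = Context(prec=digits, rounding=rounding)
    errors = {}
    for n in range(1, upto + 1):
        inv = ctx.divide(Decimal(1), Decimal(n))   # 1/n
        c = ctx.divide(Decimal(1), inv)            # 1/(1/n)
        errors[n] = ctx.subtract(c, Decimal(n))    # delta = C - n
    return errors

rounded = roundtrip_errors(10, ROUND_HALF_EVEN)  # round to nearest
chopped = roundtrip_errors(10, ROUND_DOWN)       # chop extra digits off

print("n=38 rounded:", rounded[38], " chopped:", chopped[38])
print("largest |delta|, rounded:", max(abs(e) for e in rounded.values()))
print("largest |delta|, chopped:", max(abs(e) for e in chopped.values()))
```

Switching the `rounding` argument makes it easy to see how much of a given delta comes from the rounding policy alone.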

There is nothing wrong with the results above. A ten-digit precise computation (as for example performed by the Linux 'bc' program) returns the same result:

38-1/(1/38)

-.0000001064

so that's actually ok. Division and multiplication in floating point are not the problematic operations. Subtraction is, because it can result in cancellation, and may return a completely useless result (i.e. no relevant digits left). Without any guard digits, which the original FP ROM did not have, one can get into unstable regions quite quickly.
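To illustrate the cancellation effect, here is a small sketch with Python's decimal module set to ten significant digits. The value of `d` is arbitrary and chosen only for the demonstration; this is generic ten-digit decimal arithmetic, not a model of the ROM.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

ctx = Context(prec=10, rounding=ROUND_HALF_EVEN)

d = Decimal("1.23456789E-7")        # nine significant digits going in
s = ctx.add(Decimal(1), d)          # the sum only has room for ten digits in total
diff = ctx.subtract(s, Decimal(1))  # subtracting 1 again cancels the leading digits

digits_left = len(diff.as_tuple().digits)
print(s, diff, digits_left)
```

The addition already pushed most digits of `d` off the end, and the subtraction then exposes the loss: only three of the nine digits of `d` survive.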

Faicuai · June 24, 2013

There is nothing wrong with the results above. A ten-digit precise computation (as for example performed by the Linux 'bc' program) returns the same result:

38-1/(1/38)

-.0000001064

so that's actually ok.

Well, just to give you an idea of how far off the Atari list (and your Linux "bc" output) may be:

1. On the 48GX, I set LongFloat library's global precision to 10 digits.

2. Ran the full integer-inverse test, from 1 to 100, with a bottom-of-the-barrel 1-100 For-Next and invoking LongFloat FINV, FSUB commands.

3. Here's what I got:

The LARGEST delta I was able to extract from any given Integer-Inv(Inv(Integer)) calculation was 3x10^-8 (0.00000003). In contrast, the largest delta of such computation found on the Atari list goes up to 8x10^-7, which is a 26.67x larger error than the above (worse).

38-Inv(Inv(38)) equals 0.00000001 which, compared to your Linux 10-digit BCD computation, is 10.64x smaller (better).

The TOTAL cumulative error sum (all 100 numbers tested) adds up to -0.000000178 which, compared to Atari's 0.000011471, represents a 64.44x SMALLER (better) cumulative error (again, we are talking about what is supposed to be a 10-digit BCD system).

So after seeing how LARGE the above errors can be, I can only infer a few things:

1. My reference (48GX) is actually using more digits than those set in LongFloat's global "Digits" variable (which I set to 10).

2. Or, the Atari's BCD implementation is definitely NOT a 10-digit BCD (but something closer to 8-9 digits, max).

3. Or, the Atari's BCD implementation is a 10-digit BCD with relatively poor management of rounding or basic Add/Sub/Mul/Div operations.

4. Or a combination of some of the above.

At this point, I would like to see actual evidence or mathematical reasoning of why the Atari's FP package is outputting such large differences, in such basic operations.

thorfdbg · June 25, 2013

Well, just to give you an idea of how far off the Atari list (and your Linux "bc" output) may be:

1. On the 48GX, I set LongFloat library's global precision to 10 digits.

2. Ran the full integer-inverse test, from 1 to 100, with a bottom-of-the-barrel 1-100 For-Next and invoking LongFloat FINV, FSUB commands.

3. Here's what I got:

The LARGEST delta I was able to extract from any given Integer-Inv(Inv(Integer)) calculation was 3x10^-8 (0.00000003). In contrast, the largest delta of such computation found on the Atari list goes up to 8x10^-7, which is a 26.67x larger error than the above (worse).

38-Inv(Inv(38)) equals 0.00000001 which, compared to your Linux 10-digit BCD computation, is 10.64x smaller (better).

'bc' is actually an arbitrary precision calculator, I just gave it 10 significant digits to work with.

38-1/(1/38)

-.0000001064

should actually be correct. The question is *when* you round. I'm not quite clear how the bc algorithm works precisely, though. That is, whether the division is carried out in ten digits, or whether the result is rounded to ten digits. If you round to ten digits, then your result is correct, as Maple also says:

> evalf(1/evalf(1/38,10),10);

38.00000001
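The two figures can be reproduced side by side. This is a sketch under one assumption: that bc was run with `scale=10`, which keeps ten digits after the decimal point (chopped) rather than ten significant digits. Under that assumption the first division loses a significant digit to the leading zero of 0.0263..., which would explain the larger delta. The helper name `div_fixed` is made up for illustration.

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_EVEN

WIDE = Context(prec=40)          # wide intermediate so the chop is the only error source
TEN_PLACES = Decimal("1E-10")

def div_fixed(a, b):
    """Divide, then chop to ten digits after the decimal point (assumed bc scale=10)."""
    return WIDE.divide(a, b).quantize(TEN_PLACES, rounding=ROUND_DOWN)

one, n = Decimal(1), Decimal(38)

# Fixed ten decimal places: 1/38 keeps only nine significant digits.
fixed = n - div_fixed(one, div_fixed(one, n))

# Ten significant digits with round-to-nearest, as in the Maple call above.
SIG10 = Context(prec=10, rounding=ROUND_HALF_EVEN)
sig = SIG10.divide(one, SIG10.divide(one, n))

print(fixed)  # equals -0.0000001064
print(sig)    # equals 38.00000001
```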

The TOTAL cumulative error sum (all 100 numbers tested) adds up to -0.000000178 which, compared to Atari's 0.000011471, represents a 64.44x SMALLER (better) cumulative error (again, we are talking about what is supposed to be a 10-digit BCD system).

So after seeing how LARGE the above errors can be, I can only infer a few things:

1. My reference (48GX) is actually using more digits than those set in LongFloat's global "Digits" variable (which I set to 10).

It's using a different division algorithm, one which ensures an *output* precision of ten digits. That is different from what the BCD math in the math pack does: it only has an intermediate precision of at most ten digits.

2. Or, the Atari's BCD implementation is definitely NOT a 10-digit BCD (but something closer to 8-9 digits, max).

It's neither. It is 5-digit math in base 100. That's not the same as 10-digit base-10 math. Round-off in Mathpack BCD destroys two digits, not one.
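A rough model of what that means for precision, sketched in Python. This only imitates chopping a value to five base-100 digits; it is not the ROM's division routine, and `chop_base100` is a made-up name.

```python
from decimal import Decimal, getcontext, ROUND_DOWN

getcontext().prec = 30  # wide working precision; the chop below is the only loss

def chop_base100(x, digits=5):
    """Chop a nonzero value to `digits` base-100 digits (pairs of decimal digits)."""
    if x == 0:
        return x
    e100 = x.adjusted() // 2              # exponent as a power of 100
    m = x.scaleb(-2 * e100)               # mantissa with 1 <= |m| < 100
    places = Decimal(1).scaleb(-2 * (digits - 1))
    return m.quantize(places, rounding=ROUND_DOWN).scaleb(2 * e100)

a = chop_base100(Decimal(1) / Decimal(38))  # leading pair is "02": one digit wasted
b = chop_base100(Decimal(1) / Decimal(3))   # leading pair is "33": all ten digits used

print(a, len(a.as_tuple().digits))
print(b, len(b.as_tuple().digits))
```

When the leading base-100 digit is below 10, as for 1/38, only nine significant decimal digits are left, so the effective precision moves between nine and ten digits depending on the value.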

3. Or, the Atari's BCD implementation is a 10-digit BCD with relatively poor management of rounding or basic Add/Sub/Mul/Div operations.

4. Or a combination of some of the above.

That, too. (-: It uses (unless you use the OS++ implementation) a chop-off rounding policy, which is close to the stupidest thing one can do.

At this point, I would like to see actual evidence or mathematical reasoning of why the Atari's FP package is outputting such large differences, in such basic operations.

Would the source be evidence enough? Division is actually the pen-and-paper algorithm, no guard digits, nothing.

Faicuai · June 27, 2013

It's neither. It is 5-digit math in base 100. That's not the same as 10-digit base-10 math. Round-off in Mathpack BCD destroys two digits, not one.

That, too. (-: It uses (unless you use the OS++ implementation) a chop-off rounding policy, which is close to the stupidest thing one can do.

Would the source be evidence enough? Division is actually the pen-and-paper algorithm, no guard digits, nothing.

Thanks, APPRECIATE the feedback.

So a 5-digit, base-100 BCD system... WTF (!?) Is this because of trying to directly leverage how the 6502 works around math, or... because someone went "high" on weed when implementing this section of ROM?

All-in-all, this is an area of the system that (in my humble opinion) requires TRUE re-work (consistency and precision, before speed, though). Not sure if having things crammed in 2Kbytes of code is the actual driver / culprit of the final product (I did not see better results from Newell code, although I need to test a bit better).

Time to re-write history, again, if OS++ has not done so, yet.

thorfdbg · June 27, 2013

Thanks, APPRECIATE the feedback.

So a 5-digit, base-100 BCD system... WTF (!?) Is this because of trying to directly leverage how the 6502 works around math, or... because someone went "high" on weed when implementing this section of ROM?

Well, the point is that a base 100 system is so simple when using BCD math. Shift out a byte, adjust the exponent by one. A base-ten system would have required mantissa adjustments that shift by nibbles, which would have made the whole system more complicated (and slower).
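As a sketch of why this is cheap, here is the byte-shift normalization in Python. The mantissa is modelled as a list of base-100 digits (one packed BCD byte each, most significant first); the representation and the name `normalize` are assumptions for illustration, not the ROM's actual code.

```python
def normalize(mantissa, exponent):
    """Shift leading zero bytes out of a base-100 mantissa.

    mantissa: list of base-100 digits (0..99), most significant first.
    exponent: power of 100 that scales the mantissa.
    Each shifted byte costs one exponent decrement; no nibble shifting is needed.
    """
    mantissa = list(mantissa)
    if not any(mantissa):          # zero has no leading digit to find
        return mantissa, 0
    while mantissa[0] == 0:
        mantissa = mantissa[1:] + [0]
        exponent -= 1
    return mantissa, exponent

print(normalize([0, 2, 63, 15, 78], 0))
```

Note that the result starts with the byte 02: normalization stops at byte granularity, so a leading zero nibble stays in the mantissa and costs one decimal digit of precision.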

All-in-all, this is an area of the system that (in my humble opinion) requires TRUE re-work (consistency and precision, before speed, though). Not sure if having things crammed in 2Kbytes of code is the actual driver / culprit of the final product (I did not see better results from Newell code, although I need to test a bit better).

The trouble is backwards compatibility. If it were done properly, a proper base-2 system would not only potentially use more ROM space (I haven't tried, though). It would also mean that any BASIC program with floating point constants in its body (so, every single one, essentially) would no longer work when LOADed from disk, because the number format changed. And converting from and to BCD is not a realistic option either. It would also impact precision, and speed a lot.

Yes, you can write better mathpacks, even on a 6502, but the numerically most feasible options do not satisfy the constraints of portability.

Time to re-write history, again, if OS++ has not done so, yet.

Well, I don't think it's the last word in this area. It also had to work with constraints, namely, I did not have the luxury of using other parts of the ROM for the mathpack. If you can loosen that, you could make it much better (and faster). That said, it probably avoided the most disastrous mistakes of the original ROM, such as using an outright stupid rounding mode and inadequate polynomial approximations for LOG and EXP. But given that the number format itself is just a very lousy choice, there's very little one could do in principle.

luckybuck · September 9, 2013

I never knew about this either - one more great piece of software by Carol Shaw. Is it safe to assume then, that this is not using the FP in ROM? I suppose I should just download it and set some breakpoints in the emu.

Yes, you are right, Calculator does not use the FP routines from the Atari OS. I am in direct contact with Carol.

Faicuai · September 10, 2013

Yes, you are right, Calculator does not use the FP routines from the Atari OS. I am in direct contact with Carol.

Exactly what I thought from the get-go.

Time for some vis-à-vis ROM-vs-Calculator testing (hopefully by weekend). I will post here.

Kyle22 · November 26, 2013

Does anyone know where I can find OSNXL? I want to install it into my Incognito.

Thanks

-K

BillC · November 26, 2013

@Kyle: Try the ZIP file linked in this post (it's called Omnimon XL instead of OSNXL):

http://atariage.com/forums/topic/168769-omnimonxl-and-omniview-xe-roms/?do=findComment&comment=2237495

Kyle22 · November 26, 2013

Thanks for the file, however, I didn't see one named OSNXL. I suspect it is the same as the newest version of OmniMon XL, but I'm not sure. It's been too long...

I am trying to re-learn everything I have forgotten over the years, and it is difficult

-K

BillC · November 26, 2013

Thanks for the file, however, I didn't see one named OSNXL. I suspect it is the same as the newest version of OmniMon XL, but I'm not sure. It's been too long...

I am trying to re-learn everything I have forgotten over the years, and it is difficult

-K

I believe most users/sites just called it OmnimonXL, I know that's what I thought it was called until recently. OmnimonXL and FastchipXL are components of the OSNXL OS.

Kyle22 · November 26, 2013

So, all the varieties of OmniMon XL contain Fastchip? That's what I am looking for. I guess I could type in a simple test program to find out...

EDIT: I did, and it seems a little faster. A 1 to 100 loop with multiply and SQR ran 3 seconds faster.

10 FOR X=1 TO 100:A=X*X:B=SQR(A):? X,A,B:NEXT X

Edited November 26, 2013 by Kyle22
