
How many "Fast Math" routines for XL/XE are there?


Larry


Not necessarily, because it does help in testing most of your assumptions or implementation observations (above) from one system to another... Cases like sin( ) are particularly important or difficult when hovering around a "zero" argument, either from "above" (slightly higher than zero) or from "below" (slightly lower than Pi).

Well, I would rather say that any *sane* implementation of sin() should do exactly as I described. That Atari BASIC does something insane probably proves the point.

 

I intuitively think that something is wrong with Atari's ROM FP rounding code, besides other embedded numerical code / approximations. AHL's benchmark yields abysmal results for an 8-10 BCD digit system. Edit: Even a cyclic 1/(1/7) or 1/(1/9) will come back to exactly "7" or "9", which seems really odd in a limited-precision BCD system.

That 1/(1/7) returns 7 precisely is just a side-result of how the division works, and of improper (or lucky) rounding. 1/9 does not return a precise result (try 9*(1/9)), but 1/(1/9) returns something a bit larger than 9, and the extra digit is just chopped off.


So here is a really simple (yet revealing) BASIC program that performs C=1/(1/n) for n=1...100 and calculates the Error/Delta as (C-n). C should, in reality, be VERY close to n for any n (for the most part, since some cases will yield an exact inverse).
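A minimal Python sketch of the same round-trip test, for comparison only. It assumes 10 significant decimal digits with round-half-even via the standard `decimal` module, which is not how the Atari ROM computes; the function name `inverse_roundtrip_deltas` is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

def inverse_roundtrip_deltas(limit=100, digits=10):
    """Return {n: C - n} with C = 1/(1/n), every operation rounded to `digits`."""
    deltas = {}
    with localcontext() as ctx:
        ctx.prec = digits                 # assumption: 10 significant digits
        ctx.rounding = ROUND_HALF_EVEN    # assumption: round to nearest
        for n in range(1, limit + 1):
            c = Decimal(1) / (Decimal(1) / Decimal(n))
            deltas[n] = c - Decimal(n)
    return deltas

if __name__ == "__main__":
    deltas = inverse_roundtrip_deltas()
    worst = max(deltas, key=lambda n: abs(deltas[n]))
    print("largest |delta| at n =", worst, ":", deltas[worst])
    print("sum of deltas:", sum(deltas.values()))
```

Changing `digits` or the rounding mode shows how sensitive the deltas are to the arithmetic being modelled.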

 

I will later port the above code to the 48GX and run the test using LongFloat library set to an arbitrary precision of nine (9) digits which should be pretty close to Atari's FP math, and come back with such results.

 

All in all, I am seeing PRETTY LARGE errors on the above list, though, assuming ~9-10 digits BCD math.

 

There is nothing wrong with the results above. A ten-digit precise computation (as for example performed by the Linux 'bc' program) returns the same result:

 

38-1/(1/38)

-.0000001064

 

so that's actually ok. Division and multiplication in floating point are not the problematic operations. Subtraction is, because it can result in cancellation, and may return a completely useless result (i.e. no relevant digits left). Without any guard digits, which the original FP ROM did not have, one can get into unstable regions quite quickly.
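To illustrate the cancellation point, here is a small Python sketch (using the standard `decimal` module at an assumed 10-digit precision; `significant_digits` is a hypothetical helper). Subtracting two ten-digit values that agree in their leading digits leaves a single digit, and that digit consists entirely of the rounding error of the earlier divisions.

```python
from decimal import Decimal, localcontext

def significant_digits(x):
    """Number of digits stored in the coefficient of a Decimal."""
    return len(x.as_tuple().digits)

with localcontext() as ctx:
    ctx.prec = 10                      # assumption: ten significant digits
    a = Decimal("38.00000001")         # a ten-digit result of 1/(1/38)
    b = Decimal(38)
    diff = b - a                       # leading digits cancel

print(diff, "has", significant_digits(diff), "significant digit(s)")
```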


There is nothing wrong with the results above. A ten-digit precise computation (as for example performed by the Linux 'bc' program) returns the same result:

 

38-1/(1/38)

-.0000001064

 

so that's actually ok.

 

 

 

Well, just to give you an idea of how far off the Atari list (and your Linux "bc" output) may be:

 

1. On the 48GX, I set LongFloat library's global precision to 10 digits.

2. Ran the full integer-inverse test, from 1 to 100, with a bottom-of-the-barrel 1-100 For-Next and invoking LongFloat FINV, FSUB commands.

3. Here's what I got:

  • The LARGEST delta I was able to extract from any given Integer-Inv(Inv(Integer)) calculation was 3x10^-8 (0.00000003). In contrast, the largest delta of such computation found on the Atari list goes up to 8x10^-7, which is a 26.67x larger (worse) error.
  • 38-Inv(Inv(38)) equals 0.00000001, which, compared to your Linux 10-digit BCD computation, is 10.64x smaller (better).
  • The TOTAL cumulative error sum (all 100 numbers tested) adds up to -0.000000178 which, compared to Atari's 0.000011471, represents a 64.44x SMALLER (better) cumulative error (again, we are talking about what is supposed to be a 10-digit BCD system).

So after seeing how LARGE the above errors can be, I can only infer a few things:

 

1. My reference (48GX) is actually using more digits than those set in LongFloat's global "Digits" variable (which I set to 10).

2. Or, the Atari's BCD implementation is definitely NOT a 10-digit BCD (but something closer to 8-9 digits, max).

3. Or, the Atari's BCD implementation is a 10-digit BCD with relatively poor management of rounding or basic Add/Sub/Mul/Div operations.

4. Or a combination of some of the above.

 

At this point, I would like to see actual evidence or mathematical reasoning of why the Atari's FP package is outputting such large differences, in such basic operations.


Well, just to give you an idea of how far off the Atari list (and your Linux "bc" output) may be:

 

1. On the 48GX, I set LongFloat library's global precision to 10 digits.

2. Ran the full integer-inverse test, from 1 to 100, with a bottom-of-the-barrel 1-100 For-Next and invoking LongFloat FINV, FSUB commands.

3. Here's what I got:

  • The LARGEST delta I was able to extract from any given Integer-Inv(Inv(Integer)) calculation was 3x10^-8 (0.00000003). In contrast, the largest delta of such computation found on the Atari list goes up to 8x10^-7, which is a 26.67x larger (worse) error.
  • 38-Inv(Inv(38)) equals 0.00000001, which, compared to your Linux 10-digit BCD computation, is 10.64x smaller (better).

'bc' is actually an arbitrary precision calculator, I just gave it 10 significant digits to work with.

 

38-1/(1/38)

-.0000001064

 

should actually be correct. The question is *when* you round. I'm not quite clear how the bc algorithm works precisely, though. That is, whether the division is carried out in ten digits, or whether the result is rounded to ten digits. If you round to ten digits, then your result is correct, as Maple also says:

 

> evalf(1/evalf(1/38,10),10);

38.00000001
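One possible reading of the two outputs, sketched in Python. This assumes the bc run used `scale=10`, which in bc fixes the number of digits after the decimal point and truncates quotients, whereas the Maple call keeps 10 significant digits. The function names are made up for this illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, localcontext

def inv_inv_fixed_places(n, places=10):
    """1/(1/n), truncating each quotient to `places` digits after the point."""
    quantum = Decimal(1).scaleb(-places)
    with localcontext() as ctx:
        ctx.prec = 50  # wide enough that only the explicit truncation matters
        x = (Decimal(1) / Decimal(n)).quantize(quantum, rounding=ROUND_DOWN)
        return (Decimal(1) / x).quantize(quantum, rounding=ROUND_DOWN)

def inv_inv_significant(n, digits=10):
    """1/(1/n), rounding each quotient to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = ROUND_HALF_EVEN
        return Decimal(1) / (Decimal(1) / Decimal(n))

print(38 - inv_inv_fixed_places(38))   # fixed decimal places, truncated
print(inv_inv_significant(38))         # ten significant digits, rounded
```

Under these assumptions 1/38 keeps only nine significant digits in the fixed-places version (the leading zero after the decimal point uses up one place), which would account for the larger delta.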

 

  • The TOTAL cumulative error sum (all 100 numbers tested) adds up to -0.000000178 which, compared to Atari's 0.000011471, represents a 64.44x SMALLER (better) cumulative error (again, we are talking about what is supposed to be a 10-digit BCD system).

So after seeing how LARGE the above errors can be, I can only infer a few things:

 

1. My reference (48GX) is actually using more digits than those set in LongFloat's global "Digits" variable (which I set to 10).

 

It's using a different division algorithm which ensures you an *output* precision of ten digits. That is different from what the BCD math in the math pack does: it only has an intermediate precision of at most ten digits.

 

2. Or, the Atari's BCD implementation is definitely NOT a 10-digit BCD (but something closer to 8-9 digits, max).

 

It's neither. It is 5-digit math in base 100. That's not the same as 10-digit math in base 10. Round-off in Mathpack BCD destroys two digits, not one.

 

3. Or, the Atari's BCD implementation is a 10-digit BCD with relatively poor management of rounding or basic Add/Sub/Mul/Div operations.

4. Or a combination of some of the above.

That, too. (-: It uses (unless you use the Os++ implementation) a chop-off rounding policy, which is close to the stupidest thing one can do.

 

 

At this point, I would like to see actual evidence or mathematical reasoning of why the Atari's FP package is outputting such large differences, in such basic operations.

Would the source be evidence enough? Division is actually the pen-and-paper algorithm, no guard digits, nothing.
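The base-100 format and chop-off rounding described above can be modelled roughly in Python. This is a sketch under stated assumptions, not the ROM's algorithm: each quotient is computed to high precision and then truncated to five base-100 mantissa digits, while the real math pack divides digit by digit. All function names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, localcontext

def chop_base100(x, digits=5):
    """Truncate x to `digits` base-100 mantissa digits (no rounding)."""
    if x == 0:
        return Decimal(0)
    # Base-100 exponent of the leading mantissa byte (floor of half the
    # decimal exponent of the leading decimal digit).
    e100 = x.adjusted() // 2
    # Weight of the last kept base-100 digit, as a power of ten.
    quantum = Decimal(1).scaleb(2 * (e100 - digits + 1))
    with localcontext() as ctx:
        ctx.prec = 50
        return x.quantize(quantum, rounding=ROUND_DOWN)

def inv_base100(x):
    """1/x, truncated to five base-100 digits."""
    with localcontext() as ctx:
        ctx.prec = 50
        exact = Decimal(1) / x
    return chop_base100(exact)

def inv_inv_base100(n):
    return inv_base100(inv_base100(Decimal(n)))

def inv_inv_base10(n, digits=10):
    """Same round trip, truncated to ten base-10 significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = ROUND_DOWN
        return Decimal(1) / (Decimal(1) / Decimal(n))

for n in (7, 9, 38):
    print(n, inv_inv_base100(n), inv_inv_base10(n))
```

In this model, 1/(1/7) and 1/(1/9) come back as exactly 7 and 9 because the small excess is chopped off, and 1/(1/38) comes back as 38.00000010. Whenever the leading base-100 digit is below 10, only nine decimal digits are effectively kept. Ten-digit base-10 truncation gives 7.000000002 for 1/(1/7) instead.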


It's neither. It is 5-digit math in base 100. That's not the same as 10-digit math in base 10. Round-off in Mathpack BCD destroys two digits, not one.

 

 

That, too. (-: It uses (unless you use the Os++ implementation) a chop-off rounding policy, which is close to the stupidest thing one can do.

 

Would the source be evidence enough? Division is actually the pen-and-paper algorithm, no guard digits, nothing.

 

Thanks, APPRECIATE the feedback.

 

So a 5-digit base-100 BCD system... WTF (!?) Is this because of trying to directly leverage how the 6502 handles math, or... because someone was "high" on weed when implementing this section of ROM?

 

All-in-all, this is an area of the system that (in my humble opinion) requires TRUE re-work (consistency and precision, before speed, though). Not sure if having things crammed in 2Kbytes of code is the actual driver / culprit of the final product (I did not see better results from Newell code, although I need to test a bit better).

 

Time to re-write history, again, if OS++ has not done so, yet.


Thanks, APPRECIATE the feedback.

 

So a 5-digit base-100 BCD system... WTF (!?) Is this because of trying to directly leverage how the 6502 handles math, or... because someone was "high" on weed when implementing this section of ROM?

Well, the point is that a base 100 system is so simple when using BCD math. Shift out a byte, adjust the exponent by one. A base-ten system would have required mantissa adjustments that shift by nibbles, which would have made the whole system more complicated (and slower).
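A rough Python sketch of why base-100 normalization is cheap: the mantissa is modelled as five values in 0..99 (one per BCD byte), and normalizing only moves whole bytes and decrements the exponent. The exponent is kept unbiased here for simplicity, and the helper names are made up; the ROM's actual byte layout is not reproduced.

```python
from fractions import Fraction

def normalize(exponent, mantissa):
    """Left-justify a base-100 mantissa; one byte shift costs one exponent step."""
    digits = list(mantissa)
    if not any(digits):
        return 0, digits                  # zero: nothing to normalize
    while digits[0] == 0:
        digits = digits[1:] + [0]         # shift out a whole byte
        exponent -= 1                     # adjust the exponent by one
    return exponent, digits

def value(exponent, mantissa):
    """Exact value: mantissa bytes are base-100 digits, point after the first."""
    m = sum(Fraction(d, 100 ** i) for i, d in enumerate(mantissa))
    return m * Fraction(100) ** exponent

e, m = normalize(0, [0, 0, 12, 34, 56])
print(e, m, float(value(e, m)))
```

A base-10 format would instead have to shift the mantissa by single nibbles across byte boundaries, which is the extra work the post refers to.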

 

All-in-all, this is an area of the system that (in my humble opinion) requires TRUE re-work (consistency and precision, before speed, though). Not sure if having things crammed in 2Kbytes of code is the actual driver / culprit of the final product (I did not see better results from Newell code, although I need to test a bit better).

The trouble is backwards compatibility. If it were done properly, a proper base-2 system would not only potentially use more ROM space (I haven't tried, though), it would also mean that any BASIC program with floating point constants in its body (so, every single one, essentially) would no longer work when LOADed from disk, because the number format changed. And converting from and to BCD is not a realistic option either. It would also impact precision and speed a lot.

 

Yes, you can write better mathpacks, even on a 6502, but the numerically most feasible options do not satisfy the constraints of portability.

 

Time to re-write history, again, if OS++ has not done so, yet.

Well, I don't think it's the last word in this area. It also had to work with constraints, namely I did not take the luxury of using other parts of the ROM for the mathpack. If you can loosen that, you could make it much better (and faster). That said, it probably avoided the most disastrous mistakes of the original ROM, namely an outright stupid rounding mode and inadequate polynomial approximations for LOG and EXP. But given that the number format itself is just a very lousy choice, there is very little one can do in principle.


  • 2 months later...

I never knew about this either - one more great piece of software by Carol Shaw. Is it safe to assume then, that this is not using the FP in ROM? I suppose I should just download it and set some breakpoints in the emu.

Yes, you are right, Calculator does not use the FP routines from the Atari OS. I am in direct contact with Carol.


  • 2 months later...

Thanks for the file, however, I didn't see one named OSNXL. I suspect it is the same as the newest version of OmniMon XL, but I'm not sure. It's been too long...

 

I am trying to re-learn everything I have forgotten over the years, and it is difficult :)

 

-K

I believe most users/sites just called it OmnimonXL, I know that's what I thought it was called until recently. OmnimonXL and FastchipXL are components of the OSNXL OS.


So, all the varieties of OmniMon XL contain Fastchip? That's what I am looking for. I guess I could type in a simple test program to find out...

 

EDIT: I did, and it seems a little faster. A 1 to 100 loop with multiply and SQR ran 3 seconds faster.

 

10 for x=1 to 100:a=x*x:b=sqr(a):?x,a,b:next x

Edited by Kyle22
