+Lee Stewart Posted June 24, 2023 Author

I am finally getting back to working on fbForth 2.1. In particular, I am working through the Floating Point Library (FPL) to make use of the console FP routines, where possible, because they operate on the 16-bit bus and are therefore faster than the routines currently running from cartridge ROM (8-bit bus).

I was going to reference the routines directly, but thought that, perhaps, I should go the route the Editor/Assembler Manual uses for XMLLNK and rely only on knowing the location of XMLTAB, its pointers (their names, not their addresses), the pointed-to tables (names) and their pointers (names). This means that I would be using only the address of XMLTAB (>0CFA) to get at the addresses of all of the relevant routines, which, for the FP routines of interest, are in the tables pointed to by addresses XMLTAB and XMLTAB+2, viz., FLTTAB and XTAB.

The only difficulty with the above scenario is that it will complicate entering a routine at a location that is not in those tables. Of course, I could add the offset of my entry point’s address from an address in the relevant table, which is not difficult, but will make the code a bit convoluted, I fear.

Any thoughts on the path I should take?

...lee
+jedimatt42 Posted June 24, 2023

Skipping the XMLLNK-like address lookups would save instructions and reduce calling overhead. If you document which console ROM addresses you use as entry points, then any alternative console ROMs (a practical myth) could still support it.
+Lee Stewart Posted June 30, 2023 Author

Okay...I’m going with the direct-reference approach. I am working through the routines, one at a time. So far, I have

- OCOMP (compare routine called by Forth F> , F= , F< ; now calls FCOMP in console ROM)
- CSN (convert string to FP number or 16-bit integer routine; now calls CSN+4 in console ROM, which is after the assignment to R3 of the get-character routine. If entered at CSN, the routine would be expecting to read from VRAM)
- Removed CSINT (convert string to 16-bit integer routine, except for error return, because it was only ever called by CSN; same as console routine)

These XMLLNK-called routines are relatively easy to manage. Once done with those, the real charmers will be working through the code of the transcendental functions to call the relevant console routines with proper setups as with the above routines.

I may be a while.

...lee
+Lee Stewart Posted July 2, 2023 Author

When I previously ported the MDOS L10 floating point library (FPL) to fbForth 2.0 (and TurboForth), I used FP copy routines to save 6 bytes each time they were called (via BL). This saved 324 bytes in bank #3. Without those space-saving routines, there would not have been enough room in bank #3 for the FPL: only 86 bytes are free in bank #3 at the moment.

Now, by using console ROM routines as much as possible, 900+ bytes are freed up. With that much free space, I can afford losing those 324 bytes to increase speed with inline code. This still leaves nearly 700 bytes (including previous free space) in bank #3 plus the increased speed of inline code. This is likely quite significant for the transcendental functions, which are constantly moving FP numbers around while evaluating polynomials.

As I have just begun sifting through the transcendental routines, I think I will, indeed, take advantage of the increased speed of the inline code. Please, don’t bother me while I mumble to myself...

...lee
+TheBF Posted July 3, 2023

This sounds like it will fly. I am looking forward to seeing your work. I would have to re-compile my kernel with user variables in expansion RAM to use the code, but that's a possibility I think.
+Lee Stewart Posted July 5, 2023 Author (edited)

I am done with the initial pass through the FPL code and, unless I deleted something I shouldn’t have, I have way more than 900 bytes of free space in bank #3, and that is before I inline the 324 bytes of FP copy code! Free space in bank #3 is currently at 1638 bytes! Adding those 324 bytes of inline code will make it 1314 bytes of free space.

That brings me to another possible speed-up, viz., the code for jumping into the middle of console ROM FCOMP to compare two FP numbers, one of which is not in ARG. Those familiar with XMLLNKing to FCOMP know that the comparison, going in through the front door of FCOMP, is of the FP number in FAC with the one in ARG. There are five instances in the transcendental functions that jump into the middle of FCOMP. Going in through the front door, with the relevant FP numbers in FAC and ARG and using the GPL workspace (I am using GPLWS for all of the transcendental functions), would require only

           BL   @FCOMP

Jumping in right after FCOMP sets the pointer, R7, to ARG misses the setting up of R10 (return address) and R3 (another branch address, STEX01 = >0FAA), so they must be set up by my calling code. At the moment, after R7 is assigned to the location of the second FP number (the first is already in FAC), I call my space-saving routine, FCOMP7, with

           BL   @FCOMP7

...and here is FCOMP7

    *++ call into middle of console ROM FCOMP
    FCOMP7 MOV  R11,R10       ;set up return to calling routine from console ROM
           LI   R3,STEX01     ;set up missing status exit from FCOMP
           B    @FCOMP+26     ;branch 26 bytes into console FCOMP..returns to caller

This is 10 bytes, while the calling code is 4 bytes: 10 + 5 x 4 = 30 bytes.

Here is the inline code that will replace the above:

    *++ call into middle of console ROM FCOMP
           LI   R10,$+12      ;set up return to calling routine from console ROM
           LI   R3,STEX01     ;set up missing status exit from FCOMP
           B    @FCOMP+26     ;branch 26 bytes into console FCOMP..returns here

This is 12 bytes. With 5 uses, it will be 5 x 12 = 60 bytes, just 30 more. Now we are at 1314 - 30 = 1284 bytes of free space in bank #3.

Of course, now I must see what bugs I need to squish. At least, it compiles without errors.

---- EDIT below ----

And there were bugs aplenty! I was relying on the CPU status register, which had the wrong status because FCOMP intercepts it at STEX01 (see above) to assign it to the GPL status byte (>837C), which I could check by adding more code, but I don’t need to go that route. All I need to do is assign my return location to R3 and FCOMP will return with the result of the comparison in the CPU status register, and I just saved four more bytes:

    *++ Call into console ROM FCOMP..return result in CPU status register
           LI   R3,$+8        ;set up return here with status from console ROM
           B    @FCOMP+26     ;branch 26 bytes into console FCOMP..returns here

---- EDIT above ----

...lee

Edited July 6, 2023 by Lee Stewart: CORRECTION and ENHANCEMENT
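The byte counts in the post above follow from TMS9900 instruction sizes: a register-to-register MOV is one word (2 bytes), while LI, B @addr and BL @addr each take two words (4 bytes). The short Python sketch below only re-derives the three totals discussed (shared FCOMP7 routine, first inline version, corrected inline version). It is not part of fbForth, and the variable names are made up for illustration.

```python
# Instruction sizes in bytes (TMS9900): one word for a register-to-register
# MOV, two words for LI and for B/BL with a symbolic (@addr) operand.
MOV_RR = 2
LI = 4
B_SYM = 4
BL_SYM = 4

USES = 5  # call sites in the transcendental functions, per the post

# Shared routine: FCOMP7 body once, plus one BL @FCOMP7 per call site.
fcomp7_body = MOV_RR + LI + B_SYM
shared_total = fcomp7_body + USES * BL_SYM

# First inline version: LI R10 / LI R3 / B @FCOMP+26 at every call site.
inline_v1_total = USES * (LI + LI + B_SYM)

# Corrected inline version: LI R3 / B @FCOMP+26 at every call site.
inline_v2_total = USES * (LI + B_SYM)

print(shared_total, inline_v1_total, inline_v2_total)
```

Under these sizes the corrected inline version costs only 10 bytes more than the shared routine across all five call sites, while avoiding the extra BL and MOV on every comparison.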
+TheBF Posted July 5, 2023

So where will ARG come from in your new version? The Forth DATA stack perhaps?
+Lee Stewart Posted July 5, 2023 Author

3 hours ago, TheBF said:
"So where will ARG come from in your new version? The Forth DATA stack perhaps?"

ARG is not used in the comparison. FCOMP sets R7 to point to ARG and R5 to point to FAC, but my routine sets R7 to point to wherever the compared FP number resides in RAM and jumps into FCOMP after FCOMP’s R7 assignment, so FCOMP is none the wiser, happily comparing FAC to the number at my location.

...lee
+TheBF Posted July 5, 2023

So... does that mean you could put all eight bytes of the float onto the data stack and point R7 to the appropriate end of the data?

If you could, then the FAC would become the cached top of the floating point stack, i.e., the way I do math in Camel Forth. The data stack or another stack of your creation could be the "next on stack" for FP routines.

Things that I wonder about...
+Lee Stewart Posted July 5, 2023 Author

4 hours ago, TheBF said:
"So... does that mean you could put all eight bytes of the float onto the data stack and point R7 to the appropriate end of the data? If you could, then the FAC would become the top of the floating point stack, cached, ie: the way I do math in Camel Forth. The data stack or another stack of your creation could be the "next on stack" for FP routines. Things that I wonder about..."

Indeed, yes! In fact, you could set up both pointers R5 and R7 and enter FCOMP two bytes later at FCOMP+28 for the comparison.

I must correct my dissertation 2 posts back. It caused a serious bug in the several routines that call into FCOMP at FCOMP+26. As it turns out, I rely on the GPL status byte when I call FCOMP normally, but the FPL routines that call into it rely on the CPU status register upon return, which means I need to add more code to decode the GPL status byte (>837C), unless I can get it to return immediately after the CPU status register is set, which, as huck would lave it, I can! It even saves four more bytes, making my inline code only 8 bytes!:

    *++ Call into console ROM FCOMP..return result in CPU status register
           LI   R3,$+8        ;set up return here with status from console ROM
           B    @FCOMP+26     ;branch 26 bytes into console FCOMP..returns here

...lee
+Lee Stewart Posted July 7, 2023 Author

I am basically done with changing my Floating Point Math Library to use 16-bit console ROM arithmetic routines instead of my 8-bit ROM routines. I thought this was going to take a lot more time. I probably need to spend a little time trying to shake out any lingering bugs, but first, I think I will work on the few remaining changes I want to make for fbForth 2.1.

=======================================

Shifting gears...

I was certainly intrigued by @TheBF’s multiple WHILE words within a single BEGIN … WHILE … REPEAT construct, but I was, as he put it, not happy with the idea, especially with the addition of a THEN ( ENDIF ) after REPEAT for each additional WHILE within the loop. For fbForth to do that, I would need to change the compiler check code for WHILE . This is a really simple change, but kind of defeats the purpose of the check codes. Instead, I decided to rewrite REPEAT to do all the work, not requiring the user to remember any additional code.

This is the original code for REPEAT :

    \ Classic figForth REPEAT, allowing only one WHILE
    : REPEAT   \ Compile Time: ( addr1 1 addr2 4 --- )
               \ Runtime:      ( --- )
       ?COMP                \ ensure we are compiling
       >R >R                \ WHILE info to RS
       [COMPILE] AGAIN      \ resolve REPEAT with AGAIN
       R> R> 2-             \ resolve WHILE with ENDIF (after first..
       [COMPILE] ENDIF      \ ..forcing compiler check code to that of IF)
    ; IMMEDIATE

And here is the multi- WHILE REPEAT :

    \ REPEAT that allows multiple WHILEs
    : REPEAT   \ Compile Time: ( addr1 1 [addr2 4,...] --- )
               \ Runtime:      ( --- )
       ?COMP                   \ ensure we are compiling
       0 >R                    \ indicate no more WHILEs
       \ Save WHILE info to RS
       [COMPILE] BEGIN
          DUP 4 =              \ WHILE compiler check code?
       [COMPILE] WHILE
          >R >R 1 >R           \ WHILE info plus process indicator to RS
       MYSELF                  \ compile cfa of REPEAT
       \ Resolve REPEAT
       [COMPILE] AGAIN         \ resolve REPEAT with AGAIN
       \ Process waiting WHILEs
       [COMPILE] BEGIN
          R>                   \ get process indicator
       [COMPILE] WHILE
          R> R> 2-             \ resolve next WHILE with ENDIF (after first..
          [COMPILE] ENDIF      \ ..forcing compiler check code to that of IF)
       MYSELF                  \ compile cfa of REPEAT
    ; IMMEDIATE

I am not sure this second definition will work, as it stands, because it is referencing itself and I may not have that reference set up correctly. I plan to implement it in ALC anyway, so that is a moot point. When I have it working, the following Forth code will be possible:

    : MYWORD
       BEGIN
          \ <do stuff>
       WHILE
          \ <do more stuff>
       WHILE
          \ <do yet more stuff>
       WHILE
          \ <do yet even more stuff>
       REPEAT
       \ <continue doing more stuff>
    ;

As with the classic definition, there must be at least one WHILE , but there may be as many as the programmer needs after that, all without the distracting, brain-hurting (for me!) additional THEN or ENDIF (synonyms) for each additional WHILE .

Comments?

...lee
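To make the compile-time bookkeeping in the post above easier to follow, here is a small Python model of what a multi-WHILE REPEAT has to do with the ( addr check-code ) pairs left by BEGIN and WHILE. It is only a sketch under stated assumptions: it is not fbForth code, the function name compile_loop is hypothetical, loop bodies are stand-in strings, and branch targets are stored as absolute list indices rather than the relative offsets a fig-Forth compiler writes. The check codes (1 for BEGIN, 4 for WHILE, 2 for IF) are the ones given in the post.

```python
def compile_loop(n_whiles):
    """Model compiling BEGIN <body> (WHILE <body>)*n REPEAT.

    Returns the compiled cells as a list. Branch targets are absolute
    list indices, a simplification of fig-Forth's relative offsets.
    """
    code = []    # stands in for the dictionary being compiled into
    stack = []   # compile-time parameter stack, top of stack at the end

    # BEGIN: leave the loop-start address and check code 1.
    stack += [len(code), 1]
    code.append("body0")

    for i in range(n_whiles):
        # WHILE: compile a conditional branch with an unresolved target,
        # leaving the placeholder's address and check code 4.
        code.append("0BRANCH")
        stack += [len(code), 4]
        code.append(None)
        code.append(f"body{i + 1}")

    # REPEAT, step 1: set aside every WHILE entry sitting above BEGIN's.
    saved = []
    while stack and stack[-1] == 4:
        check = stack.pop()
        addr = stack.pop()
        saved.append((addr, check))

    # REPEAT, step 2 (AGAIN): require check code 1, branch back to BEGIN.
    check = stack.pop()
    start = stack.pop()
    if check != 1:
        raise ValueError("REPEAT without matching BEGIN")
    code += ["BRANCH", start]

    # REPEAT, step 3 (ENDIF per WHILE): 4 - 2 must equal IF's check code 2,
    # then the placeholder is pointed at the first cell after the loop.
    for addr, check in saved:
        if check - 2 != 2:
            raise ValueError("conditionals not paired")
        code[addr] = len(code)

    return code


two_whiles = compile_loop(2)
no_whiles = compile_loop(0)
print(two_whiles)
print(no_whiles)
```

In this model every WHILE ends up branching to the same cell, the one just past the backward branch, which is the behavior described for the new REPEAT. With zero WHILEs the model simply degenerates to BEGIN ... AGAIN; whether a real implementation should allow that is a separate design decision.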
+TheBF Posted July 7, 2023

I like the idea of making it look cleaner. I am not fond of altering a standard word too much. Since fbForth has many words specific to TI-99 already, an alternative would be to keep REPEAT and add a word that does this thing specifically. REPEATS maybe??

As I write this I think you could get away with:

    : REPEATS ?COMP [COMPILE] REPEAT [COMPILE] ENDIF ; IMMEDIATE
+Lee Stewart Posted July 8, 2023 Author

4 hours ago, Lee Stewart said:
"I plan to implement it in ALC anyway, so that is a moot point. When I have it working, the following Forth code will be possible."

I basically have the new REPEAT working now. I will check it in greater detail tomorrow. It is taking up 12 fewer bytes in bank #0 and taking up 68 bytes for its ALC in bank #3. I will post the code later for the hosts of the curious!

...lee
+Lee Stewart Posted July 8, 2023 Author

3 hours ago, Lee Stewart said:
"I basically have the new REPEAT working now. I will check it in greater detail tomorrow. It is taking up 12 fewer bytes in bank #0 and taking up 68 bytes for its ALC in bank #3. I will post the code later for the hosts of the curious! ....lee"

Here is the code for REPEAT :

    ;[*** REPEAT ***      Compile Time: ( addr1 1 [addr2 4, ...] --- )   [ IMMEDIATE word ]
    *                     Runtime: ( --- )
    *++ This compile-time word can handle 1 or more WHILE words within a
    *++ BEGIN ... REPEAT construct. Any WHILE that fails exits the loop.
    *
    *        DATA AGN__N
    * RPT__N .name_field_immediate 6, 'REPEAT '
    * _RPT   DATA $+2
    *        BL   @BLF2A
    *        DATA __RPT->6000+BANK3

    __RPT  BL   @BLA2F        ;execute Forth..
           DATA QCOMP         ; ..word ?COMP
           DECT R             ;reserve space on return stack (RS)
           CLR  R0            ;zero "more WHILE" indicator
           MOV  R0,*R         ;push FALSE for last WHILE..last item popped from RS
           INC  R0            ;"more WHILE" indicator is now TRUE
    *++ Process each WHILE
    RPT01  MOV  *SP,R1        ;read TOS
           CI   R1,4          ;check-code for WHILE ?
           JNE  RPT02         ;exit loop if not
           DECT R             ;yes..reserve space on RS
           MOV  *SP+,*R       ;push WHILE check-code to RS
           DECT R             ;reserve space on RS
           MOV  *SP+,*R       ;push WHILE address to RS
           DECT R             ;reserve space on RS
           MOV  R0,*R         ;push TRUE "more WHILE" indicator to RS
           JMP  RPT01         ;check for another WHILE
    *++ Resolve REPEAT
    RPT02  BL   @BLA2F        ;execute Forth..
           DATA _AGAIN        ; ..word AGAIN
    *++ Resolve each WHILE
    RPT03  MOV  *R+,R0        ;pop "more WHILE" indicator from RS
           JEQ  RPTXIT        ;if FALSE, we're outta here
           DECT SP            ;reserve space on parameter stack (PS)
           MOV  *R+,*SP       ;pop WHILE address from RS to PS
           DECT SP            ;reserve space on PS
           MOV  *R+,*SP       ;pop WHILE check-code from RS to PS
           DECT *SP           ;adjust WHILE check-code to that for IF
           BL   @BLA2F        ;execute Forth..
           DATA _ENDIF        ; ..word ENDIF
           JMP  RPT03         ;check for another WHILE
    RPTXIT B    @RTNEXT       ;back to fbForth interpreter
    ;]*

...lee
+TheBF Posted July 8, 2023

As always you write some very fine ALC.

I think it could also be done in Forth using an idea from ENDCASE, which compiles the requisite number of THEN (ENDIF) tokens to match the OF tokens. So in theory we could do:

    : REPEAT [COMPILE] AGAIN BEGIN ?DUP WHILE [COMPILE] ENDIF REPEAT ; IMMEDIATE

Which would be even smaller by maybe 1/2 or so. Not sure if that would jibe with your existing AGAIN and ENDIF definitions. If they have compile-time match detection it might not work this simply, but however you implement ENDCASE should give the necessary recipe, I think.

Just a thought.
+TheBF Posted July 8, 2023

17 hours ago, Lee Stewart said:
"I am basically done with changing my Floating Point Math Library to use 16-bit console ROM arithmetic routines instead of my 8-bit ROM routines. I thought this was going to take a lot more time. I probably need to spend a little time trying to shake out any lingering bugs, but first, I think I will work on the few remaining changes I want to make for fbForth 2.1."

BTW, did you get a chance to compare the speed of the new code versus the old version?
+Lee Stewart Posted July 8, 2023 Author

3 hours ago, TheBF said:
"BTW, did you get a chance to compare the speed of the new code versus the old version?"

Not yet. I got sidetracked, dontcha know! I do plan to do that soon. I promise.

...lee
+Lee Stewart Posted July 8, 2023 Author (edited)

17 hours ago, TheBF said:
"As always you write some very fine ALC. I think it could also be done in Forth using an idea from ENDCASE, which compiles the requisite number of THEN (ENDIF) tokens to match the OF tokens. So in theory we could do: : REPEAT [COMPILE] AGAIN BEGIN ?DUP WHILE [COMPILE] ENDIF REPEAT ; IMMEDIATE Which would be even smaller by maybe 1/2 or so. Not sure if that would jibe with your existing AGAIN and ENDIF definitions. If they have compile time match detection it might not work this simply but however you implement ENDCASE should give the necessary recipe, I think. Just a thought."

It is complicated by three things: the compiler check codes; the WHILE info being above the REPEAT info on the stack; and REPEAT not working in the above definition without significant tweaks. I did write a compiled high-level definition (untested),

Spoiler:

    * High-level (headerless) definition of REPEAT
    *
    _RPT   DATA DOCOL
           DATA QCOMP,ZERO,TOR
    REPT01 DATA DUP,LIT,4,EQUAL,ZBRAN,REPT02-$
           DATA TOR,TOR,ONE,TOR,BRANCH,REPT01-$
    REPT02 DATA _AGAIN
    REPT03 DATA FROMR,ZBRAN,REPTEX-$
           DATA FROMR,FROMR,TWOM,_ENDIF,BRANCH,REPT03-$
    REPTEX DATA SEMIS

but I cannot afford that kind of space (54 bytes) in Bank #0. Doing it like CASE , ENDCASE would require changing other definitions in addition to a definition for REPEAT as long as the above, I think.

...lee

Edited July 9, 2023 by Lee Stewart: CODE CORRECTION
+TheBF Posted July 8, 2023

I thought that might be the case. Reducing all the control flow stuff down to bare bones seems to be the only way to get that simplicity to flow through to the rest of the pyramid. The only alternative I can think of is implementing a separate control flow stack and that's just another huge change.

Your ALC will do the job just fine.
+Lee Stewart Posted July 8, 2023 Author

18 hours ago, Lee Stewart said:
"Here is the code for REPEAT :"

Just for grins, here is a slightly quicker version, which is 4 bytes longer:

Spoiler:

    ;[*** REPEAT ***      Compile Time: ( addr1 1 [addr2 4, ...] --- )   [ IMMEDIATE word ]
    *                     Runtime: ( --- )
    *++ This compile-time word can handle 1 or more WHILE words within a
    *++ BEGIN ... REPEAT construct. Any WHILE that fails exits the loop.
    *
    *        DATA AGN__N
    * RPT__N .name_field_immediate 6, 'REPEAT '
    * _RPT   DATA $+2
    *        BL   @BLF2A
    *        DATA __RPT->6000+BANK3

    __RPT  BL   @BLA2F        ;execute Forth..
           DATA QCOMP         ; ..word ?COMP
           DECT R             ;reserve space on return stack (RS)
           CLR  R0            ;zero "more WHILE" indicator
           MOV  R0,*R         ;push FALSE for last WHILE..last item popped from RS
           INC  R0            ;"more WHILE" indicator is now TRUE
    *++ Process each WHILE
    RPT01  MOV  *SP,R1        ;read TOS
           CI   R1,4          ;check-code for WHILE ?
           JNE  RPT02         ;exit loop if not
           AI   R,-6          ;yes..reserve space on RS for 3 items
           MOV  *SP+,@4(R)    ;pop WHILE check-code to RS
           MOV  *SP+,@2(R)    ;pop WHILE address to RS
           MOV  R0,*R         ;push TRUE "more WHILE" indicator to RS
           JMP  RPT01         ;check for another WHILE
    *++ Resolve REPEAT
    RPT02  BL   @BLA2F        ;execute Forth..
           DATA _AGAIN        ; ..word AGAIN
    *++ Resolve each WHILE
    RPT03  MOV  *R+,R0        ;pop "more WHILE" indicator from RS
           JEQ  RPTXIT        ;if FALSE, we're outta here
           AI   SP,-4         ;reserve space on parameter stack (PS) for 2 items
           MOV  *R+,@2(SP)    ;pop WHILE address from RS to PS
           MOV  *R+,*SP       ;pop WHILE check-code from RS to PS
           DECT *SP           ;adjust WHILE check-code to that for IF
           BL   @BLA2F        ;execute Forth..
           DATA _ENDIF        ; ..word ENDIF
           JMP  RPT03         ;check for another WHILE
    RPTXIT B    @RTNEXT       ;back to fbForth interpreter
    ;]*

The difference is in how stack space is reserved and how values are pushed to the stacks.

...lee
+TheBF Posted July 8, 2023

The AI instruction is always there to tempt us.
+Lee Stewart Posted July 10, 2023 Author

I just realized that the above code for REPEAT does not presume at least one WHILE , which would allow the following code, which should throw an error:

    : MYWORD
       BEGIN
          \ <do stuff>
       REPEAT
    ;

That previous code is certainly not catastrophic because it would just become another way to do

    : MYWORD
       BEGIN
          \ <do stuff>
       AGAIN
    ;

But that is not how figForth, TI Forth, and, consequently, previous versions of fbForth work. So... here (in the spoiler) is my current code for REPEAT , which forces at least one WHILE :

Spoiler:

    ;[*** REPEAT ***      Compile Time: ( addr1 1 [addr2 4, ...] --- )   [ IMMEDIATE word ]
    *                     Runtime: ( --- )
    *++ This compile-time word can handle 1 or more WHILE words within a
    *++ BEGIN ... REPEAT construct. Any WHILE that fails exits the loop.
    *
    *        DATA AGN__N
    * RPT__N .name_field_immediate 6, 'REPEAT '
    * _RPT   DATA $+2
    *        BL   @BLF2A
    *        DATA __RPT->6000+BANK3

    __RPT  BL   @BLA2F        ;execute Forth..
           DATA QCOMP         ; ..word ?COMP
    *++ Process each WHILE, assuming at least one
           SETO R1            ;make more-while flag non-zero
    RPT01  DECT R             ;reserve space on RS for 1 item
           MOV  R1,*R         ;push flag to RS
           DECT R             ;reserve space on RS for 1 item
           MOV  *SP+,*R       ;pop WHILE check-code to RS
           DECT R             ;reserve space on RS for 1 item
           MOV  *SP+,*R       ;pop WHILE address to RS
           MOV  *SP,R1        ;check next check-code..TOS to R1
           AI   R1,-4         ;check-code for WHILE?
           JEQ  RPT01         ;get another WHILE if flag = 0
    *++ Resolve REPEAT
           BL   @BLA2F        ;execute Forth word..
           DATA _AGAIN        ; ..AGAIN to resolve REPEAT
    *++ Resolve each WHILE
    RPT02  DECT SP            ;reserve space on PS for 1 item
           MOV  *R+,*SP       ;pop WHILE address from RS to PS
           DECT SP            ;reserve space on PS for 1 item
           MOV  *R+,*SP       ;pop WHILE check-code from RS to PS
           DECT *SP           ;adjust WHILE check-code to that for IF
           BL   @BLA2F        ;execute Forth word..
           DATA _ENDIF        ; ..ENDIF to resolve WHILE
           MOV  *R+,R1        ;pop flag from RS to R1..= 0?
           JEQ  RPT02         ;if so, get another WHILE
           B    @RTNEXT       ;back to fbForth interpreter
    ;]*

...unless, of course, you think that is a feature I should embrace.

...lee
+TheBF Posted July 10, 2023

Yes, it's important that REPEAT behave as expected even though it has secret abilities under the hood. By Jove, I think you've got it.
+Lee Stewart Posted July 11, 2023 Author

I ran several comparison tests for the two incarnations (fbForth 2.0 vs. fbForth 2.1) of the Floating Point Math Library (FPL). The test runs were by no means exhaustive for the more complex functions such as ^ , which can take variable amounts of time depending on the values supplied, but the conditions were the same for the two fbForth versions. As expected, fbForth 2.1, using console ROM routines where possible, was faster, though not by as much as I expected. Here is a table comparing the two versions (times in seconds):

    Function  Loops  fbForth 2.0  fbForth 2.1
    --------  -----  -----------  -----------
    F/        10000      133          111
    F*        10000       76           66
    F+        10000       16           15
    F-        10000       16           15
    F>        10000        7.6          7.5
    SIN        1000       72           61
    COS        1000       73           62
    TAN        1000      156          129
    SQR        1000       75           61
    ^          1000      209          174
    >F        10000       54           53

The times for the more complex functions will improve a bit when I unroll the 47 instances of “BL @R1$2” (188 more bytes gone from Bank #3). Four bytes were saved (at the expense of a little time) each time that routine was called:

    * Space-saving (4 bytes) routine for copying a floating-point number.
    *
    R1$2   MOV  *R1+,*R2+
           MOV  *R1+,*R2+
           MOV  *R1+,*R2+
           MOV  *R1,*R2
           RT

There are four more instances in Bank #3 of a similar routine I might unroll as well, making a total of 204 bytes I will be taking from Bank #3. This will bring the free-byte total in Bank #3 to 1322, still a respectable amount of space for more functions. I would even have room to put FBFONT (?) back in ROM (1024 bytes) and UDSQRT (128 bytes), as well. The possibilities are not limitless, but many. H-m-m-m.....

...lee
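For readers who want the relative gains rather than raw seconds, the timing table above can be turned into percentage reductions with a few lines of Python. This is only a convenience sketch: the timings are copied from the table, while the dictionary and helper names are made up for illustration. The last lines re-check the unrolling byte counts quoted in the post, assuming (as the 204-byte total implies) that each of the four extra call sites also costs 4 bytes.

```python
# (fbForth 2.0 seconds, fbForth 2.1 seconds), copied from the table above.
timings = {
    "F/": (133, 111),
    "F*": (76, 66),
    "F+": (16, 15),
    "F-": (16, 15),
    "F>": (7.6, 7.5),
    "SIN": (72, 61),
    "COS": (73, 62),
    "TAN": (156, 129),
    "SQR": (75, 61),
    "^": (209, 174),
    ">F": (54, 53),
}


def reduction_pct(old, new):
    """Percentage of run time saved going from old to new."""
    return 100.0 * (old - new) / old


reductions = {name: reduction_pct(old, new) for name, (old, new) in timings.items()}

for name, pct in reductions.items():
    print(f"{name:4s} {pct:5.1f}% less time")

# Byte cost of unrolling: 4 extra bytes per call site.
r1s2_bytes = 47 * 4               # the 47 instances of BL @R1$2
total_bytes = r1s2_bytes + 4 * 4  # plus four instances of the similar routine
print(r1s2_bytes, total_bytes)
```

The transcendental functions and F/ come out at roughly 15 to 19 percent less time, while F> and >F barely move, which matches the observation in the post that the gain was smaller than expected for some words.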
+TheBF Posted July 11, 2023

Looks like you can book about 15% better on the transcendentals. That's damned good.

How did you end up handling the FAC ARG thing? Is ARG taken from the data stack?