Jump to content
IGNORED

fbForth—TI Forth with File-based Block I/O [Post #1 UPDATED: 06/05/2024]


Lee Stewart

Recommended Posts

13 hours ago, Lee Stewart said:

 

I wasted (I think) 8 bytes of bank#2 (where resides the code for COLD ) to also re-initialize R13 and R14 of SNDWS to 0, just in case. The reason I say, “wasted”, is that every time the sound queue is finished, both indices are the same. They may not be 0, but as long as they both point to the same place in the circular queue, I should only need to insure that the queue is all zeros. It is definitely safer, but I don’t think necessary. So far, I can afford the space, so I guess I should keep it—for now, at least.

 

...lee

You are correct that it is not necessary to initialize the queue. I have built RS232 communication queues using this method and they seem to run forever with no reset.

I remember reading in some textbook somewhere that it is tricky to tell how many items are in the queue with just the head/tail pointers but you can always tell if the queue is empty.

It's when the tail=head. 

So you should be able  do compare on R13,R14  on the ISR read side, to determine if there is something waiting. That gives you a quick ISR response when things are empty.

That's what I have done in the past. I will try and find that old text book see if there are any more tricks I have forgotten.

  • Like 1
Link to comment
Share on other sites

I found the text.  "Data Structures and Algorithms, Aho, Hopcroft, Ullman, 1983 Bell Telephone Labs. Inc"

 

The test for Empty queue in Pascal looks like this in the book.

function EMPTY (Q:QUEUE) : boolean 
	begin 
	  if (Q.front=Q.rear then 
 	     return(true)
	  else
	     return(false)
	end. { EMPTY }

 

It talks about the confusion of telling the difference between a full queue and an empty queue. One solution given is to never let the queue fill up.

The other is to provide a "bit" to indicate when the queue is full. I have done that in the past with a counter rather than a bit. 

And as you have done you can just ignore it and let the data overwrite. I have done that in most of my real world uses. 

 

It's fun to translate that function EMPTY to Forth, :) 

 

: EMPTY?    ( -- ?) Q.front @  Q.rear @ = ; 

 

If the entire program takes that much more source code which one should be easier to maintain? 🤔

  • Like 2
Link to comment
Share on other sites

  • 1 month later...

Lately, I have been tightening up the low-level support code in the fbForth kernel. A carryover from TI Forth are the Forth words called “system synonyms”, which perform tasks unique to the TI-99/4A and have names identical with their ALC counterparts. These include GPLLNK, DSRLNK, and XMLLNK. The TI programmers called these particular functions by setting up the BLWP code in registers and then BLing to that code. This is certainly clever, but creates an unnecessary level of indirection. It may have stemmed from using the DX10 minicomputer environment for the development, which forced (I think) separate code and data segments. Here is the current XMLLNK:

*== XML link utility.                     CODE =  12  =================
*
* XML routine to be called must be on the stack.
XML    LI   R0,>0420        Construct the BLWP instruction
       LI   R1,XMLLNK         to the XMLLNK utility
       MOV  *SP+,R2           with this datum identifying the routine
       LI   R3,>045B        Construct the B *LINK instruction
       MOV  LINK,R4         Save LINK (R11) address
*++ Not sure why the TI programmers didn't just use "BL R0", which would
*++ have saved 2 bytes and been clearer IMHO!
       BL   @2*R0+MAINWS    Execute the above instructions
       MOV  R4,LINK           and reconstruct LINK (R11)
       JMP  BKLINK          Return to caller

 

and how I changed it to avoid the BL indirection, which also saved 12 bytes:

*== XML link utility.                     CODE =  12  =================
*
XML    MOV  *SP+,@XMLPRG   XML prog name from stack follows XMLLNK call below
       BLWP @XMLLNK        call XMLLNK with datum copied from stack
XMLPRG DATA >0000          XML program name copied here before calling XMLLNK
       JMP  BKLINK         return to caller

 

While working on this, it occurred to me that there is a lot more that could be done to tighten up the entire system-call code, which includes one or more additional indirections. I will detail this in a subsequent post, but right now the beach is calling my name. Later...

 

...lee

  • Like 4
  • Thanks 1
Link to comment
Share on other sites

  • 2 weeks later...

I have this bad habit of “carefully” changing lots of code before testing any of it! It almost bit me this time. As you know, I have been busy modifying the fbForth SYSTEM calls. I was testing as I went along until I got to the five or so keyboard-related routines, all of which I modified before testing again—definitely not a good idea! Because of where fbForth went into the weeds, I was pretty sure the low-level KY routine (waits for user to press a key) called by the high-level Forth KEY, was the culprit, but I was dreading the debugging process. Probably only @TheBF and @Willsy will care (and might not have made my mistake), but here it is:

 

When the KY routine does not detect a key-press, it does not exit normally by RTing back to its caller. Rather, it causes the high-level KEY to be re-executed by backing up the Interpretive Pointer (IP) and branching to NEXT in the inner interpreter. This works because KEY is a high-level Forth colon definition. I had changed it to an ALC-defined word, which cannot be re-executed in the same manner as a colon-defined word. I had to change the backup code from

* Set up to re-enter via fbForth's KEY       
KEY7   DECT IP                re-execute fbForth's KEY
       B    *NEXT

to

* Set up to re-enter via fbForth's KEY       
KEY7   DECT W                 re-execute fbForth's KEY
       B    @DOEXEC           not a colon definition anymore

which manipulates the current Word pointer (W) rather than the IP. All fixed! :cool:

 

Here is the inner interpreter (located in high-speed, scratchpad RAM), in case you were wondering:

Spoiler
DODOES DECT SP
       MOV  W,*SP
       MOV  LINK,W
DOCOL  DECT R
       MOV  IP,*R
       MOV  W,IP
$NEXT  MOV  *IP+,W
DOEXEC MOV  *W+,R1
       B    *R1
$SEMIS MOV  *R+,IP
       MOV  *IP+,W
       MOV  *W+,R1
       B    *R1

 

 

I will sleep well tonight!

 

...lee

  • Like 3
  • Thanks 1
Link to comment
Share on other sites

Good catch! BTW you are not alone in changing things without testing. :)

The shame of it is Forth makes testing so simple most of the time. 

 

Is it worth asking why you changed KEY from a colon definition to ALC?

  • Like 1
Link to comment
Share on other sites

17 minutes ago, TheBF said:

Good catch! BTW you are not alone in changing things without testing. :)

The shame of it is Forth makes testing so simple most of the time. 

 

Is it worth asking why you changed KEY from a colon definition to ALC?

 

Well...high-level KEY and KEY8 each called (via CFA) a code-defined third word that BLed to the keyboard ALC KY routine (38 bytes). Now, KEY and KEY8 each BL to KY themselves, making the third word unnecessary (20 bytes).

 

...lee

  • Like 1
Link to comment
Share on other sites

24 minutes ago, Lee Stewart said:

 

Well...high-level KEY and KEY8 each called (via CFA) a code-defined third word that BLed to the keyboard ALC KY routine (38 bytes). Now, KEY and KEY8 each BL to KY themselves, making the third word unnecessary (20 bytes).

 

...lee

And I assume it's a little bit more faster since the savings too. I mean like responsive

 

Edited by GDMike
Link to comment
Share on other sites

  • 2 weeks later...

Still working on the TODO list for fbForth 3.0, but here is the current list with not very many items left! Of the items left to do, I want to work on a couple before committing to them:  fbForth300_TODO.txt

  1. The new DSRLNK currently being discussed in its own AA thread,
  2. FONTED full-font display screen toggles,
  3. Including UDSQRT in the resident dictionary—plenty of room for it and could use it for actual distances instead of their squares in coincidence calculations. Only hesitation is that the result is truncated instead of rounded, which likely does not matter.

The recent tasks accomplished are

  1. Replacement of all SYSTEM calls with direct B or BL calls, with concomitant removal of the SYSTEM word and the its call table.
  2. Removal of many instances of “LIMI 0” due to (1) and addition of same to individual routines.
  3. Changing the user variable SCRFNT from a bi-state into a tri-state flag:
    • (=0) default font from bank #3 of cartridge,
    • (<0) user font file,
    • (>0) console font with small caps for lowercase.

...lee

  • Like 1
  • Thanks 2
Link to comment
Share on other sites

1 hour ago, TheBF said:

Those are pretty nice upgrades Lee. 

 

I suspect you will save a few cycles in the overall operation without that SYSTEM table. 

How big was that SYSTEM table?

 

Thanks. There were 24 entries in the SYSTEM table.

 

...lee

  • Thanks 1
Link to comment
Share on other sites

  • 2 weeks later...

 

 

I am testing inclusion into the resident dictionary of fbForth 3.0 several convenient words that are currently defined in the FBLOCKS blocks file for optional use: CELLS 2DUP 2DROP 2SWAP NIP TUCK -ROT PICK ROLL WITHIN <> $.

 

Most of them do not take much code. ROLL is an exception. If I use the high-level recursive Forth code (written by Marshall Linker) I have in FBLOCKS, it takes 22 bytes in bank #0, which has code only for code fields and parameter fields:

 

: ROLL  ( [n]..[0] +n --- [n-1]..[0][n] )
   -DUP IF 
      1- SWAP >R MYSELF R> SWAP 
   THEN   ;

 

Here is my ALC for ROLL , which takes 34 bytes. It does one thing that the above code does not and that is to disallow a negative n, so adding ABS before the -DUP to the above code makes this ALC just 10 bytes more. Perhaps I should check for stack underflow, which would take another 6 bytes—though none (or very few) of the other stack manipulation words do this, which puts the onus on the programmer:

ROLL   DATA $+2
       MOV  *SP+,R2     ;pop position (count)
       JEQ  ROLLEX      ;cannot be 0
       JLT  ROLLEX      ;cannot be negative
       MOV  R2,R3       ;save position
       SLA  R3,1        ;correct to cell position
       A    SP,R3       ;point to nth cell below TOS
       MOV  *R3,R0      ;save cell to move to TOS
       MOV  R3,R4       ;construct pointer to
       DECT R4          ;     (n-1)th cell below TOS
ROLLUP MOV  *R4,*R3     ;copy down one cell
       DECT R3          ;next src cell
       DECT R4          ;next dst cell
       DEC  R2          ;done?
       JNE  ROLLUP      ;nope..do another
       MOV  R0,*SP      ;yup..we're outta here
ROLLEX B    *NEXT

 

If you see any way to tighten up this code, let me know.

 

...lee

  • Like 2
Link to comment
Share on other sites

What would it look like if you strung the Forth primitives end-to-end, without NEXT,  use Assembler IF ENDIF and a JMP to do the recursion?

 

: ROLL  ( [n]..[0] +n --- [n-1]..[0][n] )
             \ #instructions
   -DUP      \  3
   IF        \  1
      1-     \  1 
      SWAP   \  3
      >R     \  2
      MYSELF \  1 ??
      R>     \  2
      SWAP   \  3
   THEN ; \ =  15  

 

Potentially the same number of instructions. :( 

Maybe some optimizations could be found with SWAP because you are using registers in the code. So maybe you just SWAP the argument order?

Might worth a look.

 

  • Like 1
Link to comment
Share on other sites

16 minutes ago, TheBF said:

What would it look like if you strung the Forth primitives end-to-end, without NEXT,  use Assembler IF ENDIF and a JMP to do the recursion?

 

: ROLL  ( [n]..[0] +n --- [n-1]..[0][n] )
             \ #instructions
   -DUP      \  3
   IF        \  1
      1-     \  1 
      SWAP   \  3
      >R     \  2
      MYSELF \  1 ??
      R>     \  2
      SWAP   \  3
   THEN ; \ =  15  

 

Potentially the same number of instructions. :( 

Maybe some optimizations could be found with SWAP because you are using registers in the code. So maybe you just SWAP the argument order?

Might worth a look.

 

Well, I think the reason the recursion works in the high-level code is that ROLL gets pushed to the return stack by DOCOL every time it is executed within itself and unrolls when the IF fails at n=0. I am unsure how I would manage that in the ALC.

 

...lee

Link to comment
Share on other sites

I think recursion here is  a "call" in machine code. So for 9900 we would need to make the code a nestable sub-routine that pushes R11 on entry and pops R11 on exit.

Essentially you are making one Forth word a "subroutine threaded" word. 

 

I will play around a bit too and see if I am blowing smoke on this. :) 

Come to think of it I don't think I got around to making RECURSE in my machine code project so maybe I can try implementing and testing on that platform.

I predict a few crashes... ;)

 

 

  • Like 1
Link to comment
Share on other sites

12 hours ago, Lee Stewart said:

I am testing inclusion into the resident dictionary of fbForth 3.0 several convenient words that are currently defined in the FBLOCKS blocks file for optional use: CELLS 2DUP 2DROP 2SWAP NIP TUCK -ROT PICK ROLL WITHIN <> $.

 

Most of them do not take much code. ROLL is an exception. If I use the high-level recursive Forth code (written by Marshall Linker) I have in FBLOCKS, it takes 22 bytes in bank #0, which has code only for code fields and parameter fields:

 

Continuing with this, PICK starts out just like ROLL , in that a similar ALC solution needs to get the address of the nth cell down from TOS, which allows for this code-sharing opportunity:

 

;[*** ROLL ***        ( [n]..[0] +n --- [n-1]..[0][n] )
*        DATA MROT_N
* ROLL_N .NAME_FIELD 4, 'ROLL '
ROLL   DATA $+2
       BL   @PIKROL     ;get counter and nth cell pointer
       MOV  R2,R2       ;counter=0?
       JEQ  ROLLEX      ;do nothing if flag=0
       MOV  R3,R4       ;save pointer
       DECT R4          ;point to (n-1)th cell
       MOV  *R3,R0      ;save cell to move to TOS
ROLLUP MOV  *R4,*R3     ;copy down one cell
       DECT R3          ;next src cell
       DECT R4          ;next dst cell
       DEC  R2          ;done?
       JNE  ROLLUP      ;nope..do another
       MOV  R0,*SP      ;yup..we're outta here
ROLLEX B    *NEXT

;[*++ PIKROL ++*  called by PICK and ROLL
PIKROL MOV  *SP+,R2     ;position (count)
       ABS  R2          ;force positive
       MOV  R2,R3       ;save position
       SLA  R3,1        ;correct to cell position
       A    SP,R3       ;point to nth cell
PKRLEX RT               ;return with flag
;]*
;[*** PICK ***        ( [n]..[0] +n --- [n-1]..[0][n] )
*        DATA ROLL_N
* PICK_N .NAME_FIELD 4, 'PICK '
PICK   DATA $+2
       BL   @PIKROL     ;get counter and nth cell pointer
PICK01 DECT SP          ;get room on stack
       MOV  *R3,*SP     ;
PICKEX B    *NEXT

 

The BL..RT overhead adds 6 bytes to each word, but saves 10, for a net savings of 4 bytes for each word or 8 bytes total. Unfortunately, this forces an additional instruction to the code for ROLL , which results in saving only 2 bytes, and a net savings of only 6 bytes—perhaps not worth the somewhat convoluted code—though bank #0 space is getting very tight. I don’t know.... :ponder: 🤔

 

...lee

Link to comment
Share on other sites

After I saw your Forth version I took the approach that since "thou shalt not roll",  the recursive Forth version was good enough. :)

 

My Pick looks like this.  I never considered putting ABS in there. 

CODE PICK   ( n -- n)   \ GForth ITC takes 8 intel instructions for PICK
              TOS  1 SLA,       \  (" n CELLS")
              SP TOS ADD,     
             *TOS TOS MOV,  
              NEXT,         
ENDCODE

 

 

But in mission critical uses I load 3RD and 4TH which are the same size as OVER in 9900 code. 

HEX
CODE 3RD  ( a b c d --  a b c d b)  \ ANS: 2 PICK
          0646 , C584 ,  \ TOS PUSH,      
          C126 , 0004 ,  \ 4 (SP) TOS MOV, 
          NEXT,  
          ENDCODE

CODE 4TH  ( a b c d e--  a b c d e a) \ ANS: 3 PICK
          0646 , C584 ,  \ TOS PUSH, 
          C126 , 0006 ,  \ 6 (SP) TOS MOV,
          NEXT,
          ENDCODE

 

  • Like 1
Link to comment
Share on other sites

By the way I counted wrong in that comment above.

GForth takes 10 instructions in Intel land.

Code pick
( $40339C )  mov     dword ptr 41F2D8 , ebp  \ $89 $2D $D8 $F2 $41 $0
( $4033A2 )  mov     eax , dword ptr [edi]  \ $8B $7
( $4033A4 )  add     edi , # 4  \ $83 $C7 $4
( $4033A7 )  mov     eax , dword ptr [edi] [eax*4]  \ $8B $4 $87
( $4033AA )  sub     edi , # 4  \ $83 $EF $4
( $4033AD )  mov     dword ptr [edi] , eax  \ $89 $7
( $4033AF )  mov     eax , dword ptr 0 [ebp]  \ $8B $45 $0
( $4033B2 )  add     ebp , # 4  \ $83 $C5 $4
( $4033B5 )  mov     esi , eax  \ $89 $C6
( $4033B7 )  mov     ebx , dword ptr [eax]  \ $8B $18
( $4033B9 )  jmp     40152A  \ $E9 $6C $E1 $FF $FF
end-code
 ok

 

But I just checked out VFX Forth's version!

see pick 
PICK 
( 004323D0    488B5CDD00 )            MOV     RBX, [+RBX*8]
( 004323D5    C3 )                    RET/NEXT
( 6 bytes, 2 instructions )
 ok

 

  • Like 1
Link to comment
Share on other sites

14 hours ago, TheBF said:

What would it look like if you strung the Forth primitives end-to-end, without NEXT,  use Assembler IF ENDIF and a JMP to do the recursion?

 

: ROLL  ( [n]..[0] +n --- [n-1]..[0][n] )
             \ #instructions
   -DUP      \  3
   IF        \  1
      1-     \  1 
      SWAP   \  3
      >R     \  2
      MYSELF \  1 ??
      R>     \  2
      SWAP   \  3
   THEN ; \ =  15  

 

Potentially the same number of instructions. :( 

Maybe some optimizations could be found with SWAP because you are using registers in the code. So maybe you just SWAP the argument order?

Might worth a look.

 

4 hours ago, TheBF said:

I think recursion here is  a "call" in machine code. So for 9900 we would need to make the code a nestable sub-routine that pushes R11 on entry and pops R11 on exit.

Essentially you are making one Forth word a "subroutine threaded" word. 

 

I will play around a bit too and see if I am blowing smoke on this. :) 

Come to think of it I don't think I got around to making RECURSE in my machine code project so maybe I can try implementing and testing on that platform.

I predict a few crashes... ;)

 

OK...Here is an ALC recursion that works! It is 40 bytes, so 6 bytes longer than the simple loop, but it certainly is satisfying to accomplish it!

 

ROLL   DATA $+2
       LI   R7,ROLLIT   ;load entry point for recursive call
       MOV  *SP+,R2     ;pop position (count)
       BL   *R7
ROLLEX B    *NEXT

ROLLIT 
*++ Push return address to return stack
       DECT R
       MOV  R11,*R
*++ 1-
       DEC  R2
*++ IF
       JLT  ROLITX      ;we're done if negative
*++ >R       
       DECT R
       MOV  *SP+,*R
*++ MYSELF (recurse)
       BL   *R7
*++ SWAP R>  (pop return stack to stack and SWAP)
       DECT SP
       MOV  @2(SP),*SP  ;over
       MOV  *R+,@2(SP)  ;under
*++ Pop return address and return
ROLITX MOV  *R+,R6
       B    *R6

 

Can it be refactored to tighter code?

 

...lee

  • Like 1
Link to comment
Share on other sites

On 11/15/2023 at 2:48 AM, TheBF said:

I found the text.  "Data Structures and Algorithms, Aho, Hopcroft, Ullman, 1983 Bell Telephone Labs. Inc"

 

The test for Empty queue in Pascal looks like this in the book.

function EMPTY (Q:QUEUE) : boolean 
	begin 
	  if (Q.front=Q.rear then 
 	     return(true)
	  else
	     return(false)
	end. { EMPTY }

 

It talks about the confusion of telling the difference between a full queue and an empty queue. One solution given is to never let the queue fill up.

The other is to provide a "bit" to indicate when the queue is full. I have done that in the past with a counter rather than a bit. 

And as you have done you can just ignore it and let the data overwrite. I have done that in most of my real world uses. 

 

It's fun to translate that function EMPTY to Forth, :) 

 

: EMPTY?    ( -- ?) Q.front @  Q.rear @ = ; 

 

If the entire program takes that much more source code which one should be easier to maintain? 🤔

I don't know which language they intended that to be, but it's not Pascal. In Pascal, it would be like one of these two:

function empty(q:queue): boolean;
begin
  if q.front=q.rear then
    empty := true
  else
    empty := false
end;

function empty(q:queue): boolean;
begin
  empty := q.front=q.rear
end;

 

Just because you don't use Forth you don't have to overcomplicate things.

 

  • Haha 1
Link to comment
Share on other sites

21 minutes ago, apersson850 said:

I don't know which language they intended that to be, but it's not Pascal. In Pascal, it would be like one of these two:

function empty(q:queue): boolean;
begin
  if q.front=q.rear then
    empty := true
  else
    empty := false
end;

function empty(q:queue): boolean;
begin
  empty := q.front=q.rear
end;

 

Just because you don't use Forth you don't have to overcomplicate things.

 

Very nice.  It shows how people who write text books don't always have a clear view on the real world engineering demands. 

 

If you do complicate things now Herr Wirth will probably come to haunt you. :)

 

I notice that I typed that in wrong with that leading '('.    mea culpa. 

 

Link to comment
Share on other sites

46 minutes ago, Lee Stewart said:

 

 

OK...Here is an ALC recursion that works! It is 40 bytes, so 6 bytes longer than the simple loop, but it certainly is satisfying to accomplish it!

 

ROLL   DATA $+2
       LI   R7,ROLLIT   ;load entry point for recursive call
       MOV  *SP+,R2     ;pop position (count)
       BL   *R7
ROLLEX B    *NEXT

ROLLIT 
*++ Push return address to return stack
       DECT R
       MOV  R11,*R
*++ 1-
       DEC  R2
*++ IF
       JLT  ROLITX      ;we're done if negative
*++ >R       
       DECT R
       MOV  *SP+,*R
*++ MYSELF (recurse)
       BL   *R7
*++ SWAP R>  (pop return stack to stack and SWAP)
       DECT SP
       MOV  @2(SP),*SP  ;over
       MOV  *R+,@2(SP)  ;under
*++ Pop return address and return
ROLITX MOV  *R+,R6
       B    *R6

 

Can it be refactored to tighter code?

 

...lee

 That is totally awesome.  I got called away and didn't get to play with this.

Felicitations!

 

  • Thanks 1
Link to comment
Share on other sites

30 minutes ago, apersson850 said:

I don't know which language they intended that to be, but it's not Pascal. In Pascal, it would be like one of these two:

 

I just peeked in the text and they mention they are using what they call "Super Pascal" to express the alogrithms.

So this is "psuedo-pascal" in reality. 

Link to comment
Share on other sites

52 minutes ago, Lee Stewart said:

 

 

OK...Here is an ALC recursion that works! It is 40 bytes, so 6 bytes longer than the simple loop, but it certainly is satisfying to accomplish it!

 

ROLL   DATA $+2
       LI   R7,ROLLIT   ;load entry point for recursive call
       MOV  *SP+,R2     ;pop position (count)
       BL   *R7
ROLLEX B    *NEXT

ROLLIT 
*++ Push return address to return stack
       DECT R
       MOV  R11,*R
*++ 1-
       DEC  R2
*++ IF
       JLT  ROLITX      ;we're done if negative
*++ >R       
       DECT R
       MOV  *SP+,*R
*++ MYSELF (recurse)
       BL   *R7
*++ SWAP R>  (pop return stack to stack and SWAP)
       DECT SP
       MOV  @2(SP),*SP  ;over
       MOV  *R+,@2(SP)  ;under
*++ Pop return address and return
ROLITX MOV  *R+,R6
       B    *R6

 

Can it be refactored to tighter code?

 

...lee

Is there a way to not load R7 and rather replace the MYSELF line with 

         BL @ROLLIT

 

or some such computed address to the entry to the code

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...