
fbForth—TI Forth with File-based Block I/O [Post #1 UPDATED: 06/05/2024]


Lee Stewart


The current beta (fbForth 3.0:Q) fixes a bug discovered by Bob Carmany (@atrax27407) using MAME emulating an EVPC card, which has a 9938 VDP on board. The bug shows up only in 80-column text mode and only when the VDP is a 9938; the F18A ignored it. I needed to change VR13 from >10 to >01. That VDP register has to do with blinking text on the 9938. Today’s Zoom meeting discussion ran down the fix, with Erik Olson (@FarmerPotato) providing the key information leading to it. Thanks also to @matthew180, @Gary from OPA, and @mizapf (PM) for their input.

 

Here are the current betas:  

 

...lee


While considering changing BRANCH and 0BRANCH to using absolute addresses instead of offsets, I revisited the code for (+LOOP) . I looked at how @Willsy did it for TurboForth, to see whether I could implement it for fbForth 3.0 because it is faster. I may try it, but I do not want the bizarre behavior from Forth79 onward of going one loop too far with negative steps:

Return execution to the corresponding DO until the new index is equal to or greater than the limit (n>0),  or until the new index is less  than the limit (n<0).

 

It appears to me that the ANS folks have drunk the Kool-Aid with this comment from the Forth79 Standard,

(Comment:  It is a historical precedent that the limit for n<0 is irregular. Further consideration of the characteristic is unlikely.)

 

and the reframing of the limit test to

If the loop index did not cross the boundary between the loop limit minus one and the loop limit, continue execution at the beginning of the loop.

 

It seems so strange that they would use the test of crossing between limit-1 and limit for both loop directions when limit-1 is outside the loop in the negative direction!

 To me, it is a thinly veiled effort to make that behavior appear normal, when it is not. I’ll stop talking now.
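The difference between the two termination rules is easy to see in a small model. The Python sketch below is only an illustration, not code from fbForth or TurboForth. It ignores 16-bit wraparound, the function names are made up, and the figFORTH-style rule is an assumption: keep looping while index < limit for a non-negative step, or while index > limit for a negative step.

```python
def fig_plus_loop(start, limit, step):
    """Assumed figFORTH-style +LOOP: the index never reaches or passes the limit."""
    visited = []
    index = start
    while True:
        visited.append(index)          # loop body sees this index
        index += step
        if step >= 0 and index >= limit:
            break
        if step < 0 and index <= limit:
            break
    return visited


def ans_plus_loop(start, limit, step):
    """ANS-style +LOOP: stop when the index crosses between limit-1 and limit."""
    visited = []
    index = start
    while True:
        visited.append(index)
        new_index = index + step
        # The boundary sits between limit-1 and limit, so "index < limit"
        # changes truth value exactly when the boundary is crossed.
        crossed = (index < limit) != (new_index < limit)
        index = new_index
        if crossed:
            break
    return visited
```

With a start of 10, a limit of 0 and a step of -1, the first function visits 10 down to 1, while the second also runs the body with the index equal to the limit (10 down to 0). For positive steps the two agree.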

 

...lee

 


I think the extra loop test on the limit is there to make sure the loop ends if, for some reason, the code being executed inside the loop is messing around with the limit itself, for example changing it to skip an iteration based on some action taken inside the loop.


1 hour ago, Lee Stewart said:

 To me, it is a thinly veiled effort to make that behavior appear normal, when it is not. I’ll stop talking now.

 

...lee

 

"I'll stop talking now" :) 

Made me chuckle. (Please don't stop talking)

 

In the comments in Camel Forth Brad says this about the Forth83/ANSI/ISO Forth DO LOOP.

 

\ ; '83 and ANSI standard loops terminate when the boundary of
\ ; limit-1 and limit is crossed, in either direction.  This can
\ ; be conveniently implemented by making the limit 8000h, so that
\ ; arithmetic overflow logic can detect crossing.  I learned this
\ ; trick from Laxen & Perry F83.

 

From what I know this was just a clever way to allow DO LOOP to count the full range of a native integer.

I would have NEVER figured out that kind of math.
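For anyone else who finds that math opaque, here is a hedged Python model of the trick described in the comment. It is a sketch of the idea only, not Camel Forth or F83 code, and the names (add16, biased_loop, fudge) are mine. The limit is mapped to 8000h by adding a "fudge" value to both limit and index, so signed 16-bit overflow on the add is exactly the limit-1/limit crossing.

```python
MASK = 0xFFFF   # one 16-bit cell
SIGN = 0x8000   # sign bit, and the value the limit is mapped to


def add16(a, b):
    """Add two 16-bit cells; return (result, signed_overflow_flag)."""
    r = (a + b) & MASK
    # Signed overflow: both operands differ in sign from the result.
    overflow = ((a ^ r) & (b ^ r) & SIGN) != 0
    return r, overflow


def to_signed(x):
    """Interpret a 16-bit cell as a two's-complement integer."""
    return x - 0x10000 if x & SIGN else x


def biased_loop(limit, start, step):
    """Return the indices the loop body would see for: limit start DO ... step +LOOP."""
    fudge = (SIGN - limit) & MASK        # limit + fudge == 8000h
    counter = (start + fudge) & MASK     # biased loop counter
    seen = []
    while True:
        seen.append(to_signed((counter - fudge) & MASK))   # what I would return
        counter, overflow = add16(counter, step & MASK)
        if overflow:                     # crossed 7FFFh <-> 8000h, i.e. limit-1 <-> limit
            return seen
```

biased_loop(5, 0, 1) gives 0 through 4, and biased_loop(0, 10, -1) gives 10 down through 0, so the same overflow test also produces the extra pass at the limit for negative steps.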

 

I have not surveyed how a lot of other systems implement LEAVE, but Mark's method of pushing an exit address onto the R stack is unique I think. 

(at least I think it's an exit address that he pushes)

 

Maybe you know all this already, but it's a good exercise for me to try to understand this again. It took me a long time to grok it. 

 

Camel Forth and F83 use a tiny "leave" stack at compile time (I reserved 4 cells) to keep track of each "DO" by pushing 0 onto the leave stack. 

The word LEAVE compiles UNLOOP to pop the values off the R stack, but it also compiles a BRANCH instruction to a placeholder (an unresolved forward branch, now called "AHEAD" in modern systems) and records the placeholder's address on the leave stack.

 

The word LOOP or +LOOP performs the branch back using the clever computation, but also executes the word RAKE, as it is called in Forth83. 

RAKE is for all the LEAVEs. :) 

 

RAKE  simply takes each value from the leave stack and:

1. If it is a zero it does nothing.

2. If it is non-zero, that is an address, so it resolves the BRANCH (i.e., IF) that was compiled by LEAVE by compiling the word THEN. 

 

And then by magic it works.
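The "magic" can be modeled in a few lines. This is a simplified Python sketch of the compile-time bookkeeping described above, not the CAMEL99 source: the token names, the flat code list and the use of absolute branch targets are assumptions for illustration, and here RAKE stops at the first 0 marker so that only the LEAVEs of the innermost loop are resolved.

```python
def compile_tokens(tokens):
    """Compile DO ... LEAVE ... LOOP into a flat list with absolute branch targets."""
    code = []
    do_stack = []      # start-of-body addresses for the backward branch
    leave_stack = []   # 0 marks a DO; anything else is a placeholder address

    for tok in tokens:
        if tok == "DO":
            code.append("<DO>")
            do_stack.append(len(code))
            leave_stack.append(0)            # marker for this loop
        elif tok == "LEAVE":
            code.append("UNLOOP")
            code.append("BRANCH")
            leave_stack.append(len(code))    # where the target gets patched in
            code.append(None)                # unresolved forward branch
        elif tok == "LOOP":
            code.append("<LOOP>")
            code.append(do_stack.pop())      # branch back to the loop body
            # RAKE: resolve every LEAVE recorded since the matching DO.
            while True:
                addr = leave_stack.pop()
                if addr == 0:
                    break
                code[addr] = len(code)       # first cell after the loop
        else:
            code.append(tok)
    return code
```

For example, compile_tokens(["DO", "A", "LEAVE", "B", "LOOP", "C"]) patches the placeholder at cell 4 with 8, which is the cell holding "C".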

 

https://github.com/bfox9900/CAMEL99-ITC/blob/master/cc9900/SRC.ITC/ISOLOOPS.HSF

 

Something for you to ponder. 

 

 

 

 

 

 


1 hour ago, TheBF said:

From what I know this was just a clever way to allow DO LOOP to count the full range of a native integer.

I would have NEVER figured out that kind of math.

 

I rather think it happened the other way round, i.e., someone discovered the clever, faster overflow method via the sign-bit toggle and that just so happened to screw up the termination in the negative direction, but no matter, we'll just pretend that is the way it ought to be! No, thank you!—I much prefer the (unfortunately, slower) figFORTH way of keeping the index from crossing outside the limit in either direction.

 

...lee


21 minutes ago, Lee Stewart said:

 

I much prefer the (unfortunately, slower) figFORTH way of keeping the index from crossing outside the limit in either direction.

And since you have made a FIG Forth system, it's expected to behave that way, or it breaks a lot of code. 

 

Reminds me of the old song "Be True to Your School" :)

 


On 5/17/2024 at 7:59 AM, TheBF said:

In the comments in Camel Forth Brad says this about the Forth83/ANSI/ISO Forth DO LOOP.

 

\ ; '83 and ANSI standard loops terminate when the boundary of
\ ; limit-1 and limit is crossed, in either direction.  This can
\ ; be conveniently implemented by making the limit 8000h,

I've been trying to understand this since yesterday.  I see the advantage from adjusting the loop limit, but the same adjustment must be done to the initial counter value, yes? 
 

IF you DO that then I has to do extra work, yeah?

 

In TIForth the I word was just R. 

 


9 minutes ago, FarmerPotato said:

IF you DO that then I has to do extra work, yeah?

 

Here is the code. Indeed, I has to do the same computation, so R@ is not the same as I or J.

 

\ conventional do loops use 2 cells on the RSTACK
[CC] cr .( Rstack based DO/LOOP ) [TC]

CODE <?DO>  ( limit ndx -- )
            *SP TOS CMP,        \ compare 2 #s
            @@1 JNE,            \ if they are not the same jump to regular 'do.' (BELOW)
            TOS POP,            \ remove limit
            TOS POP,            \ refill TOS
            IP RPOP,
            NEXT,

+CODE <DO>  ( limit indx -- )
@@1:        R0  8000 LI,        \ load "fudge factor" to LIMIT
            *SP+ R0  SUB,       \ Pop limit, compute 8000h-limit "fudge factor"
            R0  TOS ADD,        \ loop ctr = index+fudge
            R0  RPUSH,
            TOS RPUSH,
            TOS POP,            \ refill TOS
            NEXT,
ENDCODE

CODE <+LOOP>
            TOS *RP ADD,        \ save space by jumping into <loop>
            TOS POP,            \ refill TOS, (does not change overflow flag)
            @@2 JMP,
+CODE <LOOP>
            *RP INC,            \ increment loop
@@2:        @@1 JNO,            \ if no overflow then loop again
            IP INCT,            \ move past (LOOP)'s in-line parameter
            @@3 JMP,            \ jump to UNLOOP
@@1:        *IP IP ADD,         \ jump back
            NEXT,

+CODE UNLOOP
@@3:        RP  4 ADDI,         \ collapse rstack frame
            NEXT,
ENDCODE

CODE I      ( -- n)
            TOS PUSH,          \ save cached TOS on the data stack
            *RP    TOS MOV,    \ fetch the biased loop counter
            2 (RP) TOS SUB,    \ index = loopindex - fudge
            NEXT,
            ENDCODE

CODE J      ( -- n)
            TOS PUSH,
            4 (RP) TOS MOV,   \ outer loop index is on the rstack
            6 (RP) TOS SUB,   \ index = loopindex - fudge
            NEXT,
            ENDCODE
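To follow the return-stack layout without reading the assembler, here is a rough Python model of the same frame. It is my own simplification of the listing above, not a translation: a Python list stands in for the R stack (last element = *RP), only LOOP with a step of 1 is modeled, and the function names are hypothetical.

```python
MASK = 0xFFFF


def do(rs, limit, index):
    """<DO>: push the fudge factor, then the biased counter on top of it."""
    fudge = (0x8000 - limit) & MASK
    rs.append(fudge)                     # ends up at 2(RP)
    rs.append((index + fudge) & MASK)    # ends up at *RP


def i(rs):
    """I: index = biased counter - fudge."""
    return (rs[-1] - rs[-2]) & MASK


def j(rs):
    """J: the same computation on the outer loop's frame, 4(RP) and 6(RP)."""
    return (rs[-3] - rs[-4]) & MASK


def loop(rs):
    """<LOOP>: increment; on overflow drop the frame (UNLOOP) and return False."""
    old = rs[-1]
    rs[-1] = (old + 1) & MASK
    if old == 0x7FFF:                    # INC from 7FFFh to 8000h sets overflow
        del rs[-2:]                      # collapse the two-cell frame
        return False
    return True                          # branch back


def nested_pairs():
    """3 1 DO  2 0 DO  J I  LOOP LOOP, collecting (J, I) pairs."""
    rs, pairs = [], []
    do(rs, 3, 1)
    while True:
        do(rs, 2, 0)
        while True:
            pairs.append((j(rs), i(rs)))
            if not loop(rs):
                break
        if not loop(rs):
            break
    return pairs, rs
```

The point of the model is that the raw top of the return stack is never the index itself, which is why R@ and I differ in this scheme.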

 


@TheBF   I see it now.  The inline assembly took a while to follow.

 

I was thinking that the carry bit would work too.  Crossing from -1 to 0. 

 

That got me wondering if there were any other status bit tricks. 

 

Here's one of Charles Moore's thoughts (paraphrased)

 

Suppose a system or resident dictionary uses absolute addresses, but a user dictionary is relocatable and has addresses relative to a base register.  The same kind of entry must be treated differently depending on which dictionary it is in. (Charles Moore 1970 p. 148)
 

I thought, the LSbit of a word pointer is not used (on a 16-bit machine).  You could have two kinds of pointers, by testing the extra bit.  

 

Using the MSBit, you add SLA W,1 after each MOV *IP+,W fetch. If it carries, you have an absolute resident pointer, range of 32K words.  If it doesn't carry, you have a relocatable pointer to user dictionary.  Plenty of room for refinements on top of that. 

 

I considered also the LSBit, since MOV doesn't care about that bit (on a 16-bit bus.)   I thought, why not use the parity status bit?  But you don't get the parity for free on MOV, only on MOVB and the other 8 bit ALU instructions.  Prolly really tricky for the compiler, anyway. 
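A hedged sketch of the MSBit variant, in Python rather than 9900 assembly: the shift left plays the role of SLA W,1 and the bit shifted out plays the role of the carry. The encoding (absolute pointers stored as address/2 with the top bit set, relocatable ones as offset/2 with it clear) and all function names are my assumptions about how such a scheme could look, not something specified above.

```python
MASK = 0xFFFF


def encode_absolute(addr):
    """Resident pointer: even address stored as a word index with bit 15 set."""
    if addr % 2 or not 0 <= addr <= 0xFFFE:
        raise ValueError("need an even 16-bit address")
    return 0x8000 | (addr >> 1)


def encode_relocatable(offset):
    """User-dictionary pointer: even offset stored as a word index, bit 15 clear."""
    if offset % 2 or not 0 <= offset <= 0xFFFE:
        raise ValueError("need an even 16-bit offset")
    return offset >> 1


def decode(token, base):
    """Model of: MOV *IP+,W / SLA W,1 / test carry."""
    carry = (token & 0x8000) != 0        # bit shifted out by SLA W,1
    w = (token << 1) & MASK              # W after the shift: an even byte address
    if carry:
        return "absolute", w
    return "relocatable", (w + base) & MASK   # add the base register
```

Fifteen bits of word index cover 32K words, which matches the range mentioned for the absolute case.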

 

 

 


I understand what you are saying but I have not fully processed how that would work in real life. 

It would not have occurred to me that there is extra storage capacity hidden in an address. :)

Pretty clever thinking by you. 

 

I know MPE Forth compilers did relocatable overlays by loading them in two places and somehow comparing the differences to find the addresses that need adjusting.

I have never used them or built my own to fully grok that method.

 


2 hours ago, TheBF said:

 

I understand what you are saying but I have not fully processed how that would work in real life. 

 

I haven't figured it out either :(  :)  

 

It's a new concept of what a pointer is. That complicates any kind of pointer math, which code always does. Perhaps if weird pointer use is limited to execution tokens, which are supposed to be opaque, right?

 

Taking one step in: there are two inner interpreters, one for resident and one for relocatable. Call them A$NEXT and R$NEXT.  Their main difference is that, once inside, each keeps NEXT pointing back to itself.
 

The cost to switch modes is an extra LI NEXT,xxx and a JMP.   Resident words return back to A$NEXT and relocatable words return back to  R$NEXT.  A$NEXT keeps NEXT pointing back into itself. 

 

NEXT is altered only when the interpreter executes the other kind of token.  The cost to cross over is an extra LI.  Both interpreters bloat up by the pointer test on W after MOV *IP+,W. 

When A$NEXT encounters an A xt, it immediately branches through its  CFA. If it encounters a R xt, it jumps out to a LI  NEXT,R$NEXT and a JMP into the other interpreter.
 

The resident code is optimized and is compiled with absolute addresses. There are compiler STATE 1 (resident) and 2 (relocatable). 

 

The R$NEXT interpreter has to do extra work. So user or R words run a bit slower.
 

Mainly, the R interpreter must examine the content of  each CFA, because it may be an A or R pointer (there's that special bit again.)  If R pointer, it must add a base register. It doesn't have to change NEXT in that case. 
 


So there's some of the complexity. 
 

Another wacky idea I had is that all xt are just small integers 1,2,3..  indexes into tables of CFA,PFA,NFA. That has advantages in that programs can occupy  much much more than 64K.  The upper 8 bits might even be a block number!

 

But I'm not doing any of these things. I'm using the basic interpreter I found in the Geneve Forth. 
 

 


2 minutes ago, FarmerPotato said:


Another wacky idea I had is that all xt are just small integers 1,2,3..  indexes into tables of CFA,PFA,NFA. That has advantages in that programs can occupy  much much more than 64K.  The upper 8 bits might even be a block number!

Sounds like you are going to be busy doing experiments. :) 

 

The cool thing about threaded code is that you just need to define an "entry" routine and an "exit" routine for this new word type.

Then the compiler simply makes the new words with the entry compiled before the address list and the exit compiled after the address list, and it just works. 

 

 

Your idea of using small integers is called a byte code interpreter. 1..255 are the legal op codes.  GPL is one of those. I suppose TI BASIC is as well. 

I have yet to make one but it would be interesting to see how it performs.  The indexed addressing mode on 9900 would make it pretty quick. 

 

Next would be something like this, I think, which is only marginally slower than current ITC. 

 

l: _next               
            *IP+ W  MOVB,     \ move CFA into Working register & incr IP
            *W+  R5 MOVB,     \ move contents of CFA to R5 & INCR W
            OPTABLE(R5) B,    \ branch to the address in R5
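For comparison, the same dispatch idea in a toy Python model. This is only a sketch of a generic byte-code interpreter, with a made-up three-opcode instruction set; opcode 0 as "halt" and the handler names are my assumptions and have nothing to do with GPL or TI BASIC internals.

```python
def run_bytecode(program):
    """Fetch one byte at a time and dispatch through an opcode table."""
    stack = []
    ip = 0

    def push_next():
        nonlocal ip
        stack.append(program[ip])   # in-line operand follows the opcode
        ip += 1

    def add():
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)

    def dup():
        stack.append(stack[-1])

    optable = {1: push_next, 2: add, 3: dup}   # plays the role of OPTABLE

    while ip < len(program):
        op = program[ip]            # like MOVB *IP+,W
        ip += 1
        if op == 0:                 # assumed halt opcode
            break
        optable[op]()               # like the indexed branch through the table
    return stack
```

run_bytecode(bytes([1, 5, 3, 2, 0])) pushes 5, duplicates it, adds, and leaves [10] on the stack.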

 


On 5/7/2024 at 2:36 PM, Lee Stewart said:
  • fbForth ISR disabled at bootup (enable by storing contents of INTLNK at >83C4, the console ISR hook)
  • PLAY , STREAM , SAY will display “ISR?” if fbForth ISR is disabled

 

I am changing how the fbForth 3.0 ISR operates. The first item above will still obtain, but, implementing an idea I had during the last Saturday’s Zoom meeting and taking a page from @FarmerPotato’s book, 

 

23 hours ago, FarmerPotato said:

I thought, the LSbit of a word pointer is not used (on a 16-bit machine).  You could have two kinds of pointers, by testing the extra bit.

 

I am using the LSb of the ISR hook to indicate whether to service speech/sound (LSb=1) or skip directly to servicing any user ISR (LSb=0). That way, the user can store the contents of INTLNK (LSb=0) at >83C4 and only service a user ISR.

 

I am also no longer having PLAY , STREAM , SAY check for the presence of the fbForth ISR. Rather, those words will now unconditionally load the fbForth ISR at >83C4 and set its LSb to 1 to ensure speech/sound processing. It will be on the user to disable the fbForth ISR by CLeaRing >83C4 when it is no longer desired or setting the LSb to 0 if only a user ISR needs servicing.
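The three states of the hook can be summarized in a small Python model. This is a sketch of the scheme as described above, not fbForth code: the function names and the sample INTLNK value in the usage note are hypothetical. As far as I know, the 9900 ignores the LSb of a word address, which is what would let an odd value in the hook still reach the same ISR entry point.

```python
def hook_for_speech_sound(intlnk):
    """What PLAY, STREAM, SAY would store at >83C4: ISR address with LSb = 1."""
    return (intlnk | 1) & 0xFFFF


def hook_for_user_isr_only(intlnk):
    """ISR address with LSb = 0: skip speech/sound, go straight to the user ISR."""
    return intlnk & 0xFFFE


def isr_plan(hook):
    """List what would be serviced on an interrupt for a given hook value."""
    if hook == 0:                       # >83C4 cleared: fbForth ISR disabled
        return []
    steps = []
    if hook & 1:                        # LSb = 1: service speech/sound first
        steps.append("speech/sound")
    steps.append("user ISR, if any")
    return steps
```

For a made-up INTLNK content of >3020, isr_plan(hook_for_speech_sound(0x3020)) returns both steps, isr_plan(hook_for_user_isr_only(0x3020)) returns only the user ISR step, and isr_plan(0) returns nothing.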

 

Here is the current beta with the above changes:     fbForth300_8_Ra.bin     fbForth300.rpk

 

Current free space in ROM banks:
   bank 0:   278 bytes
   bank 1:    96 bytes
   bank 2:   104 bytes
   bank 3:   118 bytes

 

Let me know if I screwed something up.

 

...lee


3 hours ago, TheBF said:

Your idea of using small integers is called a byte code interpreter. 1..255 are the legal op codes.  GPL is one of those. I suppose TI BASIC is as well. 

 

Not byte code, I'm thinking words.  The xt is the index to the arrays.

 

More like this: 

 

DOCOLON EQU $
	DECT  RP
	MOV  IP,*RP       \ save IP to return to
	MOV  @PFA(W),IP
DONEXT EQU $          \ the address in register NEXT
	MOV  *IP+,W       \ W is an xt, a small even number
	MOV  @CFA(W),TEMP6
	B *TEMP6

DOVAR  EQU $
	DECT SP
	MOV @PFA(W),*SP    \ put PFA on stack. the loader made it an absolute address
	B   *NEXT

DOCONSTANT EQU $
	DECT SP
	MOV  @PFA(W),TEMP1   \ absolute address holding the constant
	MOV  *TEMP1,*SP      \ get the constant 
    B    *NEXT
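To check my own understanding of the table-indexed scheme, here is a small Python model of it. It is a sketch under several assumptions and not Geneve Forth or fbForth code: the xt is a small even number, cfa and pfa are parallel lists indexed by xt/2, a thread is a Python list of xts, and a dict stands in for memory. The VM class, the primitive names and the addresses 0x100 and 0x102 are all invented for the example.

```python
class VM:
    def __init__(self):
        self.cfa, self.pfa = [], []       # parallel tables indexed by xt // 2
        self.sp, self.rp = [], []         # data and return stacks
        self.ip = None                    # (thread, position) or None when idle
        self.memory = {}

    def define(self, code, param=None):
        self.cfa.append(code)
        self.pfa.append(param)
        return 2 * (len(self.cfa) - 1)    # the xt: a small even number

    def execute(self, xt):
        self.ip = None
        self.cfa[xt // 2](self, xt)
        while self.ip is not None:        # DONEXT
            thread, pos = self.ip
            w = thread[pos]               # MOV *IP+,W
            self.ip = (thread, pos + 1)
            self.cfa[w // 2](self, w)     # MOV @CFA(W),TEMP6 / B *TEMP6


def docolon(vm, w):
    vm.rp.append(vm.ip)                   # save IP to return to
    vm.ip = (vm.pfa[w // 2], 0)           # MOV @PFA(W),IP

def doexit(vm, w):
    vm.ip = vm.rp.pop()

def dovar(vm, w):
    vm.sp.append(vm.pfa[w // 2])          # push the variable's absolute address

def doconstant(vm, w):
    vm.sp.append(vm.memory[vm.pfa[w // 2]])   # PFA is the address holding the constant

def dofetch(vm, w):
    vm.sp.append(vm.memory[vm.sp.pop()])

def doplus(vm, w):
    vm.sp.append(vm.sp.pop() + vm.sp.pop())


def demo():
    vm = VM()
    vm.memory.update({0x100: 40, 0x102: 2})
    EXIT = vm.define(doexit)
    FETCH = vm.define(dofetch)
    PLUS = vm.define(doplus)
    X = vm.define(dovar, 0x100)
    TWO = vm.define(doconstant, 0x102)
    SUM = vm.define(docolon, [X, FETCH, TWO, PLUS, EXIT])   # : SUM X @ TWO + ;
    vm.execute(SUM)
    return vm.sp
```

Because a thread contains only table indexes, nothing in it depends on where the code and parameter fields actually live, which seems to be the attraction for relocatable or paged dictionaries.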

 

There are some other issues, like where a VARIABLE allocates storage... there needs to be an idea of DSEG or data segment that stays resident (if relocatable blocks can get paged out).  If blocks load into infinite SAMS memory, variables persist. 

 


If my legions |:) of testers do not discover any bugs in the next few days, I will release fbForth 3.0:0 on Monday or Tuesday.

 

Then, of course, the real fun begins—yeah, I am going to embark upon the manual rewrite! Misguided though they be 🤪, suggestions for inclusion in the new manual are welcome. :waving:

 

...lee


I have fbForth 3.0:0 ready to release. I need to do a little work on FBLOCKS (a day or two) before I release it. I think I will release them together. Thanks for your patience, all ye hordes of Forthers!

 

...lee


I have changed the 64-column editor to allow the user to post the menu in the bottom 5 lines of the 8-line SPLIT window with <CTRL+.>.

 

I was vacillating between user invocation of the menu and automatically posting it as with the 40/80-column editor. What do you think?

 

...lee

                       

PS: I will post fbForth 3.0:0 later today or early tomorrow—I promise.


On 6/3/2024 at 1:33 PM, Lee Stewart said:

PS: I will post fbForth 3.0:0 later today or early tomorrow—I promise.

 

OK—It took a little longer than I anticipated! All necessary files are near the top of the first post.

 

Now—On to updating the manual.

 

...lee
