fbForth fbForth—TI Forth with File-based Block I/O [Post #1 UPDATED: 06/05/2024]

+TheBF · May 7

R@ as a synonym for R

WE ARE THE BORG. RESISTANCE IS FUTILE. YOU WILL BE ASSIMILATED.

atrax27407 · May 8

Got everything downloaded and my multi-column FBLOCKS MENU display installed. Thanks, Lee!

+Lee Stewart · May 11

The current beta (fbForth 3.0:Q) fixes a bug discovered by Bob Carmany (@atrax27407) using MAME emulating an EVPC card, which has a 9938 VDP on board. The bug exists in 80-column text mode only when the VDP is a 9938. F18A ignored it. I needed to change VR13 from >10 to >01. That VDP register has to do with blinking text on the 9938. Today’s Zoom meeting discussion ran down the fix, with Erik Olson (@FarmerPotato) providing the key information leading to the fix. Thanks also to @matthew180, @Gary from OPA, and @mizapf (PM) for their input.

Here are the current betas:

fbForth300_8_Qa.bin (remove “_Qa” from filename for systems needing ROM type at its end)
fbForth300.rpk

...lee

+Lee Stewart · May 17

While considering changing BRANCH and 0BRANCH to using absolute addresses instead of offsets, I revisited the code for (+LOOP) . I looked at how @Willsy did it for TurboForth, to see whether I could implement it for fbForth 3.0 because it is faster. I may try it, but I do not want the bizarre behavior from Forth79 onward of going one loop too far with negative steps:

Return execution to the corresponding DO until the new index is equal to or greater than the limit (n>0), or until the new index is less than the limit (n<0).

It appears to me that the ANS folks have drunk the Kool-Aid with this comment from the Forth79 Standard,

(Comment: It is a historical precedent that the limit for n<0 is irregular. Further consideration of the characteristic is unlikely.)

and the reframing of the limit test to

If the loop index did not cross the boundary between the loop limit minus one and the loop limit, continue execution at the beginning of the loop.

It seems so strange that they would use the test of crossing between limit-1 and limit for both loop directions when limit-1 is outside the loop in the negative direction!
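For anyone following along, here is a minimal Python sketch of the difference under discussion. It is only a model, not fbForth or TurboForth code. `forth79_indices` follows the Forth-79 wording quoted above, while `exclusive_indices` is a hypothetical rule (the name and the rule are assumptions for illustration) that treats the limit as exclusive in both directions. Wraparound of 16-bit cells is ignored.

```python
def forth79_indices(limit, start, step):
    """Indices seen by the loop body under the Forth-79 +LOOP wording."""
    if step == 0:
        raise ValueError("step must be non-zero in this model")
    visited = []
    index = start
    while True:
        visited.append(index)        # the body always runs at least once
        index += step
        if step > 0 and index >= limit:
            break                    # n>0: stop when index >= limit
        if step < 0 and index < limit:
            break                    # n<0: stop only when index < limit
    return visited


def exclusive_indices(limit, start, step):
    """Hypothetical rule: the limit is never visited in either direction."""
    if step == 0:
        raise ValueError("step must be non-zero in this model")
    visited = []
    index = start
    while True:
        visited.append(index)
        index += step
        if step > 0 and index >= limit:
            break
        if step < 0 and index <= limit:
            break                    # n<0: stop as soon as index reaches limit
    return visited


print(forth79_indices(0, 5, -1))     # limit 0, start 5, step -1
print(exclusive_indices(0, 5, -1))
```

With a negative step the first function visits the limit itself (5 down to 0), while the second stops one pass earlier (5 down to 1). With a positive step the two agree.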

To me, it is a thinly veiled effort to make that behavior appear normal, when it is not. I’ll stop talking now.

...lee

Gary from OPA · May 17

I think the extra loop test on the limit is there to make sure the loop ends if, for some reason, the code being executed inside the loop messes with the limit itself, for example changing it to skip an iteration based on some action taken inside the loop.

+TheBF · May 17

"I'll stop talking now"

Made me chuckle. (Please don't stop talking)

In the comments in Camel Forth Brad says this about the Forth83/ANSI/ISO Forth DO LOOP.

\ ; '83 and ANSI standard loops terminate when the boundary of
\ ; limit-1 and limit is crossed, in either direction.  This can
\ ; be conveniently implemented by making the limit 8000h, so that
\ ; arithmetic overflow logic can detect crossing.  I learned this
\ ; trick from Laxen & Perry F83.

From what I know this was just a clever way to allow DO LOOP to count the full range of a native integer.

I would have NEVER figured out that kind of math.

I have not surveyed how a lot of other systems implement LEAVE, but Mark's method of pushing an exit address onto the R stack is unique I think.

(at least I think it's an exit address that he pushes)

Maybe you know all this already, but it's a good exercise for me to try to understand this again. It took me a long time to grok it.

Camel Forth and F83 use a tiny "leave" stack at compile time ( I reserved 4 cells) to keep track of each "DO" by pushing 0 onto the leave stack.

The word LEAVE compiles UNLOOP to pop the values off the R stack, but it also compiles a BRANCH instruction to a placeholder, now called "AHEAD" in modern systems.

The words LOOP and +LOOP perform the branch back using the clever computation, but they also execute the word RAKE, as it is called in Forth83.

RAKE is for all the LEAVEs.

RAKE simply takes each value from the leave stack and:

1. If it is a zero it does nothing.

2. If it is non-zero, that is an address, so it resolves the BRANCH (i.e., IF) that was compiled by LEAVE by compiling the word THEN.

And then by magic it works.
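The compile-time bookkeeping described above can be sketched in Python. This is a toy model, not the Camel Forth source: the token names, the use of absolute cell addresses instead of branch offsets, and the assumption that RAKE stops when it pops the zero marker pushed by DO are all simplifications made for illustration.

```python
def compile_tokens(tokens):
    """Compile a flat token list into a list of cells, resolving LEAVEs."""
    code = []          # the compiled "dictionary" cells
    leave_stack = []   # compile-time only: 0 marks a DO, non-zero is a hole
    do_addrs = []      # where each open loop body starts (for the back-branch)
    for tok in tokens:
        if tok == "DO":
            code.append("<DO>")
            do_addrs.append(len(code))     # loop body starts at the next cell
            leave_stack.append(0)          # marker for this DO
        elif tok == "LEAVE":
            code.append("UNLOOP")          # drop the loop frame at run time
            code.append("BRANCH")          # forward branch, target unknown yet
            leave_stack.append(len(code))  # remember where the hole is
            code.append(None)              # the hole itself
        elif tok == "LOOP":
            code.append("<LOOP>")
            code.append(do_addrs.pop())    # back-branch target
            # RAKE: patch every LEAVE hole belonging to this loop
            while True:
                hole = leave_stack.pop()
                if hole == 0:              # reached this loop's DO marker
                    break
                code[hole] = len(code)     # exit lands just past the loop
        else:
            code.append(tok)               # any other word compiles as itself
    return code


print(compile_tokens(["DO", "A", "LEAVE", "B", "LEAVE", "LOOP", "C"]))
```

A hole address can never be 0 here because a LEAVE always follows at least one compiled cell, so 0 is safe as the DO marker in this model.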

https://github.com/bfox9900/CAMEL99-ITC/blob/master/cc9900/SRC.ITC/ISOLOOPS.HSF

Something for you to ponder.

+Lee Stewart · May 17

1 hour ago, TheBF said:
In the comments in Camel Forth Brad says this about the Forth83/ANSI/ISO Forth DO LOOP.
\ ; '83 and ANSI standard loops terminate when the boundary of
\ ; limit-1 and limit is crossed, in either direction.  This can
\ ; be conveniently implemented by making the limit 8000h, so that
\ ; arithmetic overflow logic can detect crossing.  I learned this
\ ; trick from Laxen & Perry F83.
From what I know this was just a clever way to allow DO LOOP to count the full range of a native integer.

I would have NEVER figured out that kind of math.

I rather think it happened the other way round, i.e., someone discovered the clever, faster overflow method via the sign-bit toggle and that just so happened to screw up the termination in the negative direction, but no matter, we'll just pretend that is the way it ought to be! No, thank you!—I much prefer the (unfortunately, slower) figFORTH way of keeping the index from crossing outside the limit in either direction.

...lee

+TheBF · May 17

And since you have made a FIG Forth system it's expected to behave that way or it breaks a lot of code.

Reminds me of the old song "Be True to Your School"

+FarmerPotato · May 19

On 5/17/2024 at 7:59 AM, TheBF said:
In the comments in Camel Forth Brad says this about the Forth83/ANSI/ISO Forth DO LOOP.
\ ; '83 and ANSI standard loops terminate when the boundary of
\ ; limit-1 and limit is crossed, in either direction.  This can
\ ; be conveniently implemented by making the limit 8000h,

I've been trying to understand this since yesterday. I see the advantage from adjusting the loop limit, but the same adjustment must be done to the initial counter value, yes?

IF you DO that then I has to do extra work, yeah?

In TIForth the I word was just R.

+TheBF · May 19

Here is the code. Indeed, I has to do the same computation, so R@ is not the same as I or J.

\ conventional do loops use 2 cells on the RSTACK
[CC] cr .( Rstack based DO/LOOP ) [TC]

CODE <?DO>  ( limit ndx -- )
            *SP TOS CMP,        \ compare 2 #s
            @@1 JNE,            \ if they are not the same jump to regular 'do.' (BELOW)
            TOS POP,            \ remove limit
            TOS POP,            \ refill TOS
            IP RPOP,
            NEXT,

+CODE <DO>  ( limit indx -- )
@@1:        R0  8000 LI,        \ load "fudge factor" to LIMIT
            *SP+ R0  SUB,       \ Pop limit, compute 8000h-limit "fudge factor"
            R0  TOS ADD,        \ loop ctr = index+fudge
            R0  RPUSH,
            TOS RPUSH,
            TOS POP,            \ refill TOS
            NEXT,
ENDCODE

CODE <+LOOP>
            TOS *RP ADD,        \ save space by jumping into <loop>
            TOS POP,            \ refill TOS, (does not change overflow flag)
            @@2 JMP,
+CODE <LOOP>
            *RP INC,            \ increment loop
@@2:        @@1 JNO,            \ if no overflow then loop again
            IP INCT,            \ move past (LOOP)'s in-line parameter
            @@3 JMP,            \ jump to UNLOOP
@@1:        *IP IP ADD,         \ jump back
            NEXT,

+CODE UNLOOP
@@3:        RP  4 ADDI,         \ collapse rstack frame
            NEXT,
ENDCODE

CODE I      ( -- n)
            TOS PUSH,
            *RP    TOS MOV,   \ inner loop counter is on top of the rstack
            2 (RP) TOS SUB,   \ index = loop counter - fudge
            NEXT,
            ENDCODE

CODE J      ( -- n)
            TOS PUSH,
            4 (RP) TOS MOV,   \ outer loop index is on the rstack
            6 (RP) TOS SUB,   \ index = loopindex - fudge
            NEXT,
            ENDCODE
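A rough Python model of the arithmetic in the listing above may make the trick easier to follow. It mirrors `<DO>`, `<LOOP>`/`<+LOOP>` and `I` using 16-bit masks. The function names are made up for this sketch, and the overflow test is the usual two's-complement rule rather than anything taken from the 9900 data manual.

```python
MASK = 0xFFFF


def add16(a, b):
    """16-bit add; returns (result, signed_overflow)."""
    r = (a + b) & MASK
    # Signed overflow: both operands differ in sign from the result
    overflow = ((a ^ r) & (b ^ r) & 0x8000) != 0
    return r, overflow


def to_signed(x):
    """Interpret a 16-bit cell as a signed integer."""
    return x - 0x10000 if x & 0x8000 else x


def do_setup(limit, index):
    """Model of <DO>: returns (fudge, counter) as pushed on the rstack."""
    fudge = (0x8000 - limit) & MASK        # 8000h - limit
    counter = (index + fudge) & MASK       # loop counter = index + fudge
    return fudge, counter


def loop_index(fudge, counter):
    """Model of I: index = loop counter - fudge."""
    return (counter - fudge) & MASK


def run_loop(limit, index, step=1):
    """Return the signed indices the loop body would see."""
    fudge, counter = do_setup(limit, index)
    visited = []
    while True:
        visited.append(to_signed(loop_index(fudge, counter)))
        counter, overflow = add16(counter, step & MASK)
        if overflow:                       # JNO not taken: leave the loop
            break
    return visited


print(run_loop(10, 0))        # counts up from 0, limit 10
print(run_loop(0, 10, -1))    # counts down from 10, limit 0
```

Counting up, the counter overflows when the index would reach the limit, so 10 is never seen. Counting down, the overflow only happens on the step from limit to limit-1, so the body runs once with the index equal to the limit, which is the negative-step behavior debated earlier in the thread.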

+FarmerPotato · May 20

@TheBF I see it now. The inline assembly took a while to follow.

I was thinking that the carry bit would work too. Crossing from -1 to 0.

That got me wondering if there were any other status bit tricks.

Here's one of Charles Moore's thoughts (paraphrased)

Suppose a system or resident dictionary uses absolute addresses, but a user dictionary is relocatable and has addresses relative to a base register. The same kind of entry must be treated differently depending on which dictionary it is in. (Charles Moore 1970 p. 148)

I thought, the LSbit of a word pointer is not used (on a 16-bit machine). You could have two kinds of pointers, by testing the extra bit.

Using the MSBit, you add SLA W,1 after each MOV *IP+,W fetch. If it carries, you have an absolute resident pointer, range of 32K words. If it doesn't carry, you have a relocatable pointer to user dictionary. Plenty of room for refinements on top of that.

I considered also the LSBit, since MOV doesn't care about that bit (on a 16-bit bus.) I thought, why not use the parity status bit? But you don't get the parity for free on MOV, only on MOVB and the other 8 bit ALU instructions. Prolly really tricky for the compiler, anyway.
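As a thought experiment only, the MSBit variant could be modeled like this in Python. The encoding (tag in bit 15, word index in the low 15 bits), the helper names, and the base value are assumptions made for illustration, not anything from an existing Forth.

```python
MASK = 0xFFFF


def encode_resident(addr):
    """Tag an even absolute address as a resident pointer (MSBit = 1)."""
    assert addr % 2 == 0 and 0 <= addr <= 0xFFFE
    return 0x8000 | (addr >> 1)


def encode_relocatable(offset):
    """Store an even offset from the base register (MSBit = 0)."""
    assert offset % 2 == 0 and 0 <= offset <= 0xFFFE
    return offset >> 1


def decode_pointer(cell, base):
    """Model of SLA W,1 followed by a carry test."""
    carry = (cell >> 15) & 1          # the bit shifted out by SLA
    w = (cell << 1) & MASK            # word index becomes a byte address
    if carry:
        return ("resident", w)        # absolute address, used as-is
    return ("relocatable", (base + w) & MASK)


BASE = 0xA000                         # hypothetical user-dictionary base
print(decode_pointer(encode_resident(0x6010), BASE))
print(decode_pointer(encode_relocatable(0x0020), BASE))
```

Because every pointer is word-aligned, the 15 remaining bits still cover 32K words, which matches the range mentioned above.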

+TheBF · May 20

I understand what you are saying but I have not fully processed how that would work in real life.

It would not have occurred to me that there is extra storage capacity hidden in an address.

Pretty clever thinking by you.

I know MPE Forth compilers did relocatable overlays by loading them in two places and somehow comparing the differences to find the addresses that need adjusting.

I have never used them or built my own to fully grok that method.

+FarmerPotato · May 20

2 hours ago, TheBF said:

I understand what you are saying but I have not fully processed how that would work in real life.

I haven't figured it out either

It's a new concept of what a pointer is. That complicates any kind of pointer math, which code always does. Perhaps if weird pointer use is limited to execution tokens, which are supposed to be opaque, right?

Taking one step in: there are two inner interpreters; one for resident, one for relocatable. Call them A$NEXT and R$NEXT. Their main difference is that, once inside, each keeps NEXT pointing back to itself.

The cost to switch modes is an extra LI NEXT,xxx and a JMP. Resident words return back to A$NEXT and relocatable words return back to R$NEXT. A$NEXT keeps NEXT pointing back into itself.

NEXT is altered only when the interpreter executes the other kind of token. The cost to cross over is an extra LI. Both interpreters bloat up by the pointer test on W after MOV *IP+,W.

When A$NEXT encounters an A xt, it immediately branches through its CFA. If it encounters a R xt, it jumps out to a LI NEXT,R$NEXT and a JMP into the other interpreter.

The resident code is optimized and is compiled with absolute addresses. There are compiler STATE 1 (resident) and 2 (relocatable).

The R$NEXT interpreter has to do extra work. So user or R words run a bit slower.

Mainly, the R interpreter must examine the content of each CFA, because it may be an A or R pointer (there's that special bit again.) If R pointer, it must add a base register. It doesn't have to change NEXT in that case.

So there's some of the complexity.

Another wacky idea I had is that all xt are just small integers 1,2,3.. indexes into tables of CFA,PFA,NFA. That has advantages in that programs can occupy much much more than 64K. The upper 8 bits might even be a block number!
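The table idea might look something like the following Python sketch. Everything here is hypothetical (the class, the handler signature, the three parallel lists standing in for the NFA, CFA and PFA tables); it only shows an xt being a small integer index rather than an address.

```python
class TokenTables:
    """Parallel tables indexed by xt; index 0 is unused so xts start at 1."""

    def __init__(self):
        self.nfa = [None]   # names
        self.cfa = [None]   # code handlers
        self.pfa = [None]   # parameter fields

    def define(self, name, code, param=None):
        """Add a word and return its xt (a small integer)."""
        self.nfa.append(name)
        self.cfa.append(code)
        self.pfa.append(param)
        return len(self.nfa) - 1

    def execute(self, xt, stack):
        """Look up the handler and parameter field by index and run it."""
        self.cfa[xt](self, self.pfa[xt], stack)


def do_dup(tables, param, stack):
    stack.append(stack[-1])


def do_mul(tables, param, stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


def do_colon(tables, param, stack):
    # The parameter field of a colon word is a list of xts
    for xt in param:
        tables.execute(xt, stack)


tables = TokenTables()
DUP = tables.define("DUP", do_dup)
MUL = tables.define("*", do_mul)
SQUARE = tables.define("SQUARE", do_colon, [DUP, MUL])

stack = [7]
tables.execute(SQUARE, stack)
print(stack)
```

Since the xt never has to be a real address, the tables themselves could live anywhere, which is where the "more than 64K" advantage would come from.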

But I'm not doing any of these things. I'm using the basic interpreter I found in the Geneve Forth.

+TheBF · May 20

Sounds like you are going to be busy doing experiments.

The cool thing about threaded code is that you just need to define an "entry" routine and an "exit" routine for these new word types.

Then the compiler simply makes the new words with the entry compiled before the address list and the exit compiled after the address list, and it just works.

Your idea of using small integers is called a byte code interpreter. 1..255 are the legal op codes. GPL is one of those. I suppose TI BASIC is as well.

I have yet to make one but it would be interesting to see how it performs. The indexed addressing mode on 9900 would make it pretty quick.

Next would be something like this, I think, which is only marginally slower than current ITC.

l: _next               
            *IP+ W  MOVB,     \ move CFA into Working register & incr IP
            *W+  R5 MOVB,     \ move contents of CFA to R5 & INCR W
            OPTABLE(R5) B,    \ branch to the address in R5
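To make the general idea concrete, here is a small byte-code dispatch loop in Python. It is not a translation of the three instructions above, just a sketch of "fetch a byte, index a table, branch". The opcode numbers and handler names are invented for the example.

```python
def op_lit(stack, code, ip):
    """Push the byte that follows the opcode."""
    stack.append(code[ip])
    return ip + 1


def op_add(stack, code, ip):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)
    return ip


def op_mul(stack, code, ip):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)
    return ip


# Opcode 0 is left unused, in keeping with 1..255 being the legal op codes
OPTABLE = {1: op_lit, 2: op_add, 3: op_mul}


def run(code):
    """Inner interpreter: fetch an opcode byte and dispatch through the table."""
    stack = []
    ip = 0
    while ip < len(code):
        op = code[ip]                        # fetch the opcode, like *IP+
        ip += 1
        ip = OPTABLE[op](stack, code, ip)    # indexed branch to the handler
    return stack


program = bytes([1, 2, 1, 3, 2, 1, 4, 3])    # 2 3 + 4 *
print(run(program))
```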

+Lee Stewart · May 20

On 5/7/2024 at 2:36 PM, Lee Stewart said:

fbForth ISR disabled at bootup (enable by storing contents of INTLNK at >83C4, the console ISR hook)

PLAY , STREAM , SAY will display “ISR?” if fbForth ISR is disabled

I am changing how the fbForth 3.0 ISR operates. The first item above will still obtain, but I am implementing an idea I had during last Saturday’s Zoom meeting and taking a page from @FarmerPotato’s book,

23 hours ago, FarmerPotato said:

I thought, the LSbit of a word pointer is not used (on a 16-bit machine). You could have two kinds of pointers, by testing the extra bit.

I am using the LSb of the ISR hook to indicate whether to service speech/sound (LSb=1) or skip directly to servicing any user ISR (LSb=0). That way, the user can store the contents of INTLNK (LSb=0) at >83C4 and only service a user ISR.

I am also no longer having PLAY , STREAM , SAY check for the presence of the fbForth ISR. Rather, those words will now unconditionally load the fbForth ISR at >83C4 and set its LSb to 1 to ensure speech/sound processing. It will be on the user to disable the fbForth ISR by CLeaRing >83C4 when it is no longer desired or setting the LSb to 0 if only a user ISR needs servicing.
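In rough terms, the flag handling amounts to the following Python model. The helper names and the INTLNK value are hypothetical, and the real work is of course done in assembly inside the ISR.

```python
def with_speech_sound(hook):
    """What PLAY, STREAM and SAY are described as doing: set the LSb."""
    return hook | 0x0001


def user_isr_only(hook):
    """Clear the LSb so only a user ISR is serviced."""
    return hook & 0xFFFE


def decode_hook(hook):
    """Return None if the hook is cleared, else (service_speech_sound, entry)."""
    if hook == 0:
        return None                    # >83C4 cleared: fbForth ISR disabled
    return (bool(hook & 1), hook & 0xFFFE)


INTLNK_CONTENTS = 0x3020               # hypothetical even entry address

print(decode_hook(INTLNK_CONTENTS))                      # user ISR only
print(decode_hook(with_speech_sound(INTLNK_CONTENTS)))   # speech/sound too
print(decode_hook(0))                                    # ISR disabled
```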

Here is the current beta with the above changes: fbForth300_8_Ra.bin fbForth300.rpk

Current free space in ROM banks:
   bank 0:   278 bytes
   bank 1:    96 bytes
   bank 2:   104 bytes
   bank 3:   118 bytes

Let me know if I screwed something up.

...lee

atrax27407 · May 20

Downloaded, installed in my HSGPL without any errors at startup. I'll check the individual BLOCKS a bit later today.

+FarmerPotato · May 20

3 hours ago, TheBF said:

Your idea of using small integers is called a byte code interpreter. 1..255 are the legal op codes. GPL is one of those. I suppose TI BASIC is as well.

Not byte code, I'm thinking words. The xt is the index to the arrays.
