
fbForth—TI Forth with File-based Block I/O [Post #1 UPDATED: 06/05/2024]


Lee Stewart


I want to include in the fbForth word glossary the definition of NULL , whose true name is NUL or ASCII 0, i.e., the name field consists of only a single ASCII 0. NULL is actually only used here for bookkeeping purposes and anywhere it appears, including the definition, it should be understood as an ASCII 0. Here's what I have so far:

 

NULL [Literally NUL (ASCII 0)] [immediate word] Resident

( --- )

There is actually no word in fbForth with the name, ‘ NULL ’. The name field for NULL contains an ASCII 0. Every fbForth buffer, including the terminal input buffer, must end with an ASCII 0. When INTERPRET reaches it, it will search for it in the dictionary and will find what we are here calling NULL . NULL is the only way to exit the endless loop in INTERPRET . When NULL executes, it drops the top value on the return stack and thus returns, not to INTERPRET, but to the word that executed INTERPRET (usually QUIT or LOAD ). Here is its definition, keeping in mind that ‘ NULL ’ represents an actual NUL (ASCII 0):

: NULL BLK @ IF ?EXEC THEN R> DROP ; IMMEDIATE

 

My biggest problem is how to adequately represent the name of the word. Here I have indicated as clearly as I could that NULL is not its real name. If a person actually wished to redefine it, s/he could enter an ASCII 0 at the keyboard with <Ctrl+,>; but, I don't think I should help the user destroy the system! :-o

 

What do you think? @Willsy? @Vorticon? @jacquesg? Other Forthers?

 

...lee


I don't agree with the notion that this is an actual word. I think it is better described as a token, which is present at the end of buffers which causes INTERPRET to exit. I think it should be written up in the manual in this context, rather than the context of a word. Describing it as a word is just going to cause confusion.

 

FWIW :-)


Perhaps you're right; but, token or not in concept to us, it is, in fact, an actual word that INTERPRET will search for, find in the dictionary in the FORTH vocabulary and execute because it is an immediate word. INTERPRET does not know that it is a token for termination of the buffer—it is just another word to INTERPRET. But, when INTERPRET executes it, INTERPRET is exited. Here is the ALC from the fbForth resident dictionary:

*

*** NULL ***               [ IMMEDIATE word ]
*   This word is actually a true null, i.e., ASCII 0
       DATA L1144
L1145  DATA >C180
_NULL  DATA DOCOL,BLK,AT,ZBRAN,L1146-$,QEXEC
L1146  DATA FROMR,DROP,SEMIS 

*

The name field is line 4 and contains C180h. When the precedence bit and the terminator bits are masked out, it becomes 0100h. As you can see, the count byte contains 01h and the lone character that constitutes the word's name is 00h.
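The masking described above can be checked with a few lines of Python, used here only as a calculator. This is a sketch that assumes the bit layout implied elsewhere in this thread: 80h is the terminator bit, 40h the precedence bit, 20h the smudge bit, and the low five bits of the first byte hold the length. The constant names are made up for illustration.

```python
# Assumed bit layout of the name field (names are illustrative only)
TERMINATOR = 0x80   # marks the first and last bytes of the name field
PRECEDENCE = 0x40   # set for IMMEDIATE words
SMUDGE = 0x20       # set while a definition is incomplete
LENGTH_MASK = 0x1F  # low five bits of the length byte

name_field = 0xC180                 # the DATA word at L1145
length_byte = name_field >> 8       # C1h
char_byte = name_field & 0xFF       # 80h

is_immediate = bool(length_byte & PRECEDENCE)
name_length = length_byte & LENGTH_MASK
name_char = char_byte & ~TERMINATOR & 0xFF

# Mask out precedence and terminator bits in both bytes
masked = ((length_byte & ~(TERMINATOR | PRECEDENCE) & 0xFF) << 8) | name_char

print(f"{masked:04X}h immediate={is_immediate} length={name_length} char={name_char:02X}h")
```

The result is 0100h: a one-character name whose only character is ASCII 0, with the precedence bit marking the word as immediate.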

 

...lee


I think I'll just bag it. It seemed like a good idea at the time. Derick and Baker's FORTH Encyclopedia includes a very good explanation of NULL , so I thought it might be good to include it in the manual. It certainly would be confusing to a Forth neophyte and knowing that it exists and how it works is certainly not necessary to understanding how to write Forth words.

 

...lee


I need scrutiny of yet another glossary entry I'm struggling with:

 

COMPILE Resident

( --- )

COMPILE is a compile-only word that will execute when its containing word executes, which means that its containing word must be a compile-only word that executes during compilation, i.e., an immediate word. This effectively defers compilation of the word following COMPILE until the word containing them is executed within the definition of yet another word.

When the word containing COMPILE executes during the compilation of a new word, the execution address cfa of the word following COMPILE is copied (compiled) into the dictionary entry for the new word’s definition. For example,

: WORD1 COMPILE WORD0 ; IMMEDIATE

: WORD2 WORD1 … ;

When WORD2 is compiled, WORD1 executes, which executes COMPILE to place the cfa of WORD0 into the definition of WORD2 .
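The deferred compilation can be pictured with a toy model in Python. This is only a sketch under loud assumptions: word names stand in for cfas, a definition is a plain list of cells, and `define`, `execute`, `definitions` and `immediate` are hypothetical helpers, not anything in fbForth. `DUP` is an arbitrary stand-in for whatever else WORD2 might contain.

```python
definitions = {}   # word name -> list of compiled cells (names stand in for cfas)
immediate = set()  # names of words that execute during compilation

def execute(name, target):
    """Run an immediate word while `target` is the definition being built."""
    body = definitions[name]
    ip = 0
    while ip < len(body):
        cell = body[ip]
        ip += 1
        if cell == "COMPILE":
            target.append(body[ip])  # copy the cell that follows COMPILE
            ip += 1                  # and skip it so it is not executed now
        # execution of any other cell is omitted in this sketch

def define(name, source, is_immediate=False):
    """Compile a list of word names into a new definition."""
    body = []
    for word in source:
        if word in immediate:
            execute(word, body)      # immediate words run at compile time
        else:
            body.append(word)        # ordinary words are simply compiled
    definitions[name] = body
    if is_immediate:
        immediate.add(name)

define("WORD1", ["COMPILE", "WORD0"], is_immediate=True)
define("WORD2", ["WORD1", "DUP"])    # DUP is a placeholder for the rest of WORD2
print(definitions["WORD2"])
```

In this model WORD1 never appears in WORD2's body; what ends up there is WORD0, placed by COMPILE when WORD1 ran during WORD2's compilation.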

 

Does this make sense? Or, do I have more work to do? @Willsy et al.?

 

...lee


I was almost finished with the Cs in the fbForth glossary when I hit a snag with CREATE ! :mad: Words in fbForth can have names as long as 31 characters because there are 5 available bits in the name length byte—the other 3 are reserved for terminator, precedence and smudge bits. Everything is fine with CREATE until the user attempts to define a word (via : , <BUILDS , ASM: or CODE ) with a name longer than 31 characters. (Why anyone would want to do that I couldn't say; but, the point is that they can.) When CREATE encounters a word name longer than 31 characters, it only truncates the name. It does nothing with the name length byte. This is a serious error on the part of the TI Forth designers—and one I should probably correct in fbForth, but may not until cartridge time.

 

The problem is that the very next step in CREATE , after name truncation (if too long), is toggling the terminator and smudge bits: not setting, but toggling, i.e., XORing with A0h. All is well if the length byte is ≤ 31 because those high bits will be 0 and XORing with A0h will toggle the relevant bits on. If the length byte is > 31, the high 3 bits will be set or reset incorrectly and, of course, the stored length will likely not be 31. All CREATE would need to do is to store back into the length byte the result of the MIN operation that it uses to truncate the name when necessary. The following is the commented high-level Forth code for CREATE (for both fbForth and TI Forth):

*

HEX
: CREATE 
    HERE =CELLS DP !    ( adjust HERE to even cell boundary)
    LATEST ,            ( store nfa of latest defined word in our label field)
    -FIND               ( search for previous instance of new word)
    IF                  ( found it?)
        DROP NFA ID. 4 MESSAGE SPACE    ( yes; tell user it's a duplicate)
    THEN
    HERE DUP            ( -FIND has put new word at HERE, now our name field)
    C@ WIDTH @ MIN      ( truncate name to WIDTH [31] chars if necessary)
    1+ =CELLS ALLOT     ( reserve name field with even number of bytes)
    DUP 0A0 TOGGLE      ( DUP nfa; set smudge and starting terminator bits)
    HERE 1- 080 TOGGLE  ( set ending terminator bit)
    CURRENT @ !         ( store nfa in CURRENT vocabulary's 'latest' pointer)
    HERE 2+ ,           ( reserve space for code field and store pfa there)
                        ( NOTE: space NOT reserved for parameter field!)
; 

*

The fix is pretty simple—just insert DUP HERE C! after line 10 to store the correct length (same or truncated) in the length byte of the new name field.
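To see what the toggle does to an over-long name, here is a small Python sketch of the arithmetic only. It assumes the length byte left at HERE is the full, untruncated count, and that the bit layout is 80h terminator, 40h precedence, 20h smudge, low five bits length. The function names are hypothetical.

```python
WIDTH = 31  # default maximum name length

def length_byte_after_create(name_len, fixed=False):
    """Length byte after CREATE's A0h TOGGLE.

    With fixed=True the truncated length is stored first,
    which is the effect the DUP HERE C! fix is meant to have.
    """
    stored = min(name_len, WIDTH) if fixed else name_len
    return (stored ^ 0xA0) & 0xFF

def decode(byte):
    """Split a length byte into its assumed flag bits and length."""
    return {
        "terminator": bool(byte & 0x80),
        "precedence": bool(byte & 0x40),
        "smudge": bool(byte & 0x20),
        "length": byte & 0x1F,
    }

for n in (10, 33):
    print(n, decode(length_byte_after_create(n)))
print(33, "fixed", decode(length_byte_after_create(33, fixed=True)))
```

For a 33-character name the unfixed byte becomes 81h: the smudge bit is cleared instead of set and the recorded length is 1. With the truncated length stored first, the byte becomes BFh: terminator and smudge set, length 31.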

 

Of course, worrying about long names flies in the face of all that's holy in Forth—short names, short routines, .... After all, most word names in fbForth are less than 10 characters. I think the longest is EMPTY-BUFFERS at 13—and, I think the reason for that may well be to make the user think about what s/he is about to do while taking the time to type the word! :-o

 

Also, if I were to correct the problem with names longer than 31 characters the way I suggest above, fbForth would lose some generality for users that decide that WIDTH ought to be less than 31. If the user chose to limit WIDTH to 13, everything would be fine as long as the user keeps name lengths less than 32. Then, words with names longer than 13 characters and identical leading 13 characters would be unique as long as the lengths were different. That's how Forth systems that only store the first 3 characters of a name function à la Brodie's Starting Forth. But I digress....

 

I think all I'm going to do here is to briefly explain the situation in the glossary and be done with it until cartridge time—maybe. Thoughts?

 

...lee


That actually sounds like a good solution, Lee. The Forth user should be able to avoid the problem so long as they actually read the manual! :) And as you noted, most Forth users use much shorter names by default, so the problem should only rarely manifest in any event.


While attempting to flesh out the glossary entry for ENCLOSE , I commented ALC for it in the fbForth system code, which follows, for your edification, in the following spoiler:

 

 

 

*** ENCLOSE ***
*       ( addr1  char --- addr1  n1  n2  n3 )
*
       DATA L100A
L100B  DATA >8745,>4E43,>4C4F,>53C5
ENCLOS DATA $+2
       MOV  *SP+,TEMP1      pop delimiter to R1
       MOV  *SP,TEMP2       get string address to R2
       SWPB TEMP1           get delimiter to high byte
       SETO TEMP3           set char offset from addr1 to -1
* skip leading delimiters loop
ENCL1  INC  TEMP3           increment char offset from addr1
       CB   TEMP1,*TEMP2+   is char a delimiter? [increment addr1]
       JEQ  ENCL1           yes; look at next char
*
       DEC  TEMP2           no; restore to 1st non-delimiter char
       AI   SP,-6           reserve 3 more cells of stack space 
       MOV  TEMP3,@4(SP)    set n1 to offset of 1st non-delimiter char
       MOV  TEMP3,*SP       set n3 to offset of 1st non-delimiter char
       INC  TEMP3           offset of next char to test
       MOV  TEMP3,@2(SP)    set n2 to offset of next char to test
       MOVB *TEMP2,W        Copy 1st non-delimiter char to MSB of W (R10) ?????
       JNE  ENCL4           is it ASCII 0?
       B    *NEXT           yes; we're done
ENCL4  INC  TEMP2           no; increment addr1 to next char to test
* non-delimiter counting loop
ENCL2  MOV  TEMP3,@2(SP)    set n2 to offset of next char to test
       MOVB *TEMP2,W        copy next char to test to MSB of W (R10) ?????
       JEQ  ENCL3           is it ASCII 0?
       INC  TEMP3           no; increment offset
       CB   TEMP1,*TEMP2+   is it a delimiter? [increment addr1]
       JNE  ENCL2           no; loop
*
ENCL3  MOV  TEMP3,*SP       set n3 to offset of 1st char not tested or
*                               of NULL char after word
       B    *NEXT           we're done 

 

 

 

It seems pretty straightforward to me except for lines 22 and 28, which appear to me to be using W (R10, the fbForth inner interpreter current word pointer) in a non-standard way. I would expect W to only be used to hold CFAs of fbForth words; but, here, ENCLOSE is copying a character or 0 to W's high byte—that doesn't point to anything! What's going on? Of the handful of browsers of this thread, I would expect @Willsy to be the only one that can answer my question; but, anyone else is certainly welcome to jump in. :grin: One day I will work out what's going on—but, I'll never finish this manual if I don't quit going off on these tangents!! |:)

 

...lee

 


Okay, to me it looks as if they are simply using W as a scratch register. They can do this because W will be loaded again in NEXT. I have used the same technique many times in TF. In TF W is R6 and many times I have used it because I know the code will end with a call to NEXT which looks like this:

 

   MOV *PC+, W
   MOV *W, R7
   B *R7
So the previous value of W is not important to NEXT.

 

This is safe for ALC words that end in a call to NEXT. However, high-level words (i.e., colon definitions) most definitely cannot modify W.

 

HTH :-)


It certainly does help. Wait!—I just got it! :idea: With the help of your "scratch" comment, I see that it is simply a setup for the conditional jumps following each of those MOVBs! :dunce:

 

...lee


OK—here's the fleshed-out glossary entry for ENCLOSE :

 

ENCLOSE Resident

( addr1 char --- addr1 n1 n2 n3 )

The text scanning primitive used by WORD . From the text address addr1 and an ASCII-delimiting character char, is determined the byte offset n1 to the first non-delimiter character, the offset n2 to the delimiter after the text and the offset n3 to the first character not included, i.e., the character about to be read. This procedure will not process past an ASCII NUL (0), treating it as an unconditional parsing terminator.

WORD uses the output from ENCLOSE to advance IN by n3 and calculate the parsed word’s length as n2 - n1 for use in constructing the packed character string (see footnote 4 on page 11) for the word, which WORD copies to HERE .

If we let each ‘{}’ represent one character, which is either a non-delimiter character, ‘chr’, a delimiter character, ‘delim’, or the null character, ‘0’, then ENCLOSE allows three possible parsing scenarios after leading delimiter characters are skipped:

1) n1n3{0}n2

2) n1{chr}…{chr}n2n3{0}

3) n1{chr}…{chr}n2{delim}n3{chr | 0}…

The offsets, n1, n2 and n3 are shown above in the positions they indicate when returned on the stack by ENCLOSE . Where they are shown next to each other, they, in fact, have the same value. One thing to keep in mind is that n3 will never point to the position after an ASCII 0.

Scenario (1) above is important because it is the only way that INTERPRET , otherwise an infinite loop, can be forced to exit. The null character will be parsed as a single-character word that will be found in the dictionary and executed by INTERPRET , causing INTERPRET ’s demise.
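The three scenarios can be reproduced with a short Python model of the ALC listing earlier in this thread. It is a sketch, not the fbForth code: the function name `enclose` is hypothetical, the text is assumed to contain a terminating 0 byte, and the delimiter is assumed not to be 0.

```python
def enclose(text: bytes, delim: int):
    """Return (n1, n2, n3) for NUL-terminated `text`, following the ALC logic.

    Assumes `delim` is not 0 and `text` contains a terminating 0 byte.
    """
    i = 0
    while text[i] == delim:       # skip leading delimiters
        i += 1
    n1 = i                        # offset of first non-delimiter character
    if text[i] == 0:              # scenario 1: the "word" is the NUL itself
        return n1, n1 + 1, n1
    i += 1
    while True:
        n2 = i                    # offset of the character about to be tested
        if text[i] == 0:          # scenario 2: NUL ends the word, n3 stays on it
            return n1, n2, i
        i += 1
        if text[i - 1] == delim:  # scenario 3: delimiter ends the word, n3 is past it
            return n1, n2, i

BL = 32  # ASCII blank, the usual delimiter
print(enclose(b"  \0", BL))          # scenario 1
print(enclose(b" DUP\0", BL))        # scenario 2
print(enclose(b"  DUP SWAP\0", BL))  # scenario 3
```

In each case the word length is n2 - n1, and n3 never moves past the NUL, so the next call from the same buffer lands on the NUL again and falls into scenario 1.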

 

Comments? Not clear? Not even Ian Stewart could explain it better?? @Willsy? @Ksarul? Anyone?

 

...lee


I realize I am, once again, likely talking to myself—but, I am posting these glossary excerpts to get any comments for improvement of the fbForth 1.0 Manual while I'm still working on it. The following snippets concern the CASE ... ENDCASE construct and likely more than you ever wanted to know about how it works in fbForth.

 

Keep in mind that these definitions do not appear grouped together like this in the glossary—they are in ASCII sort order with all the other words in the glossary.

 

S-o-o-o—blue-pencil away:

 

CASE [immediate word] Resident

Compile time: ( --- csp 4 ) Runtime: ( n --- n )

Used in a colon definition to initiate the construct:

CASE

n1 OF … ENDOF

n2 OF … ENDOF

ENDCASE

At compile-time, CASE gets the value csp of CSP to the stack for later restoration at the end of ENDCASE ’s compile-time activity. It stores the current stack position in CSP to help ENDCASE track how many OF … ENDOF branch distances to process. It finally pushes 4 to the stack for compile-time error checking by OF and ENDCASE .

At runtime, CASE itself does nothing with the number n on the stack; but, it must be there for OF or ENDCASE to consume. If n = n1, the code between the immediately following OF and ENDOF is executed. Execution then continues after ENDCASE . If n does not match any of the values preceding any OF , the code between the last ENDOF and ENDCASE is executed and may use n; but, one cell must be left for ENDCASE to consume or a stack underflow will result. Execution then continues after ENDCASE .

ENDCASE [immediate word] Resident

Compile time: ( csp addr1 … addrn 4 --- ) Runtime: ( n --- )

Occurs in a colon definition as the termination of the CASE … ENDCASE construct.

At compile time, it uses the 4 for compile-time error checking. It uses the value in CSP put there by CASE to track the number of OF … ENDOF clauses for which it must calculate branch distances from the addresses (addr1 … addrn) that each ENDOF left on the stack.

At runtime, if all OF … ENDOF clauses fail, any code after the last ENDOF , including ENDCASE , will execute. ENDCASE will remove the number n left on the stack by the failure of the last OF .

If you include code between the last ENDOF and ENDCASE , it must leave at least one number on the stack for ENDCASE to consume to prevent stack underflow. See CASE .

OF [immediate word] Resident

Compile time: ( 4 --- addr 5 ) Runtime: ( n --- [ ] | n )

Occurs inside a colon definition as part of the OF … ENDOF construct inside of the CASE … ENDCASE construct.

At compile-time, checks for the value 4 on the stack left there by CASE or a previous ENDOF , compiles (OF) , leaves its address addr for branching resolution by ENDOF and leaves a 5 for its matching ENDOF to check.

At runtime, the value n is compared to the value which was on top of the stack when CASE ’s runtime action occurred. If the numbers are identical, the words between OF and ENDOF will be executed. Otherwise, n is put back on the stack for execution to continue after ENDOF . See CASE and ENDOF .

ENDOF [immediate word] Resident

Compile time: ( addr1 5 --- addr2 4 ) Runtime: ( --- )

Occurs in a colon definition as the termination of the OF … ENDOF construct within the CASE … ENDCASE construct.

At compile time, ENDOF checks for a 5 on the stack. It then compiles BRANCH and leaves its address addr2 for processing by ENDCASE . It next leaves 4 on the stack for compile-time error checking by the next OF or ENDCASE . It finally calculates the forward branch offset from addr1 to HERE for its matching OF and stores the value at the spot reserved for it at addr1.

At runtime, ENDOF causes execution to proceed after ENDCASE . See OF .
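The runtime behavior of the four words taken together can be modeled in a few lines of Python. This is a sketch of the stack effects described in the entries above, not of the compiled code: `run_case` is a hypothetical helper, a Python list stands in for the parameter stack, and each clause is a (value, action) pair standing in for `value OF … ENDOF`.

```python
def run_case(n, clauses, default=None):
    """Model CASE ... ENDCASE at runtime; returns the final stack."""
    stack = [n]                      # CASE itself does nothing with n
    for value, action in clauses:
        stack.append(value)          # the number preceding OF
        n1 = stack.pop()             # OF compares the top two cells
        top = stack.pop()
        if top == n1:
            action(stack)            # match: both consumed, clause body runs
            return stack             # ENDOF jumps to just after ENDCASE
        stack.append(top)            # no match: n goes back on the stack
    if default is not None:
        default(stack)               # code between the last ENDOF and ENDCASE
    stack.pop()                      # ENDCASE consumes one cell
    return stack

clauses = [
    (1, lambda s: s.append("one")),
    (2, lambda s: s.append("two")),
]
print(run_case(2, clauses))   # second clause matches
print(run_case(9, clauses))   # nothing matches; ENDCASE drops n
```

If a default action removes n without leaving something in its place, the final `stack.pop()` fails on an empty list, which is the model's counterpart of the stack underflow warned about under ENDCASE .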

...lee


Too much focus on how they work and not enough on what they are used *for* imo. The actual reason as to why one would use these words is lost imo.

 

I don't think anyone would care how they work, about using CSP and putting a 4 on the stack etc. How do I use 'em? :-)


I'm reading them, Lee, and I'm sure Willsy will chime in soon as well. ;) I was able to follow both of your last posts, so I would say they are clear enough. . .

 

 

I read them too, Lee, although I'm not much of an expert in Forth! But it's fun to read about challenges and progress in a TI project!

 

Yeah, I knew that—and most certainly appreciate it! Thanks for your support.

 

...lee


Thanks, Mark. I do see your point. However, I do want those reference elements in there as well as the "how to" parts. In deference to your advice, I think I will make a clearer separation between "compile time" and "runtime" behaviors for the handful of words that have those extensive descriptions. I will also add some detailed usage examples of all of the conditionals to § 2.5 "Control Structures" and include references to them in the glossary.

 

...lee


To someone like me who doesn't know Forth, reading the definitions feels a bit overwhelming. A somewhat exaggerated analogy from my perspective:

 

Some of the definitions are akin to (feel as if?) the Extended BASIC manual telling the user that upon executing CALL CLEAR, the XB ROM internal code writes 768 0x20 characters, with offset 0x60, to the VDP write port and when all iterations are complete, control is returned to XB.

 

Again, I don't know much about the language, just sharing with you my initial impressions. I will also mention that I have not looked at the full manual, just the recent glossary excerpts.


I think I'm getting punchy! I'm trying to come up with a good example of a DO loop that makes use of I (index of containing loop), J (index of next outer loop) and LEAVE (leave containing loop at next execution of LOOP ); but, I may be making it too complicated. IF ... ELSE ... THEN will have already been explained. Here it is:

: 8X8SRCH 	                                Search an 8x8, row-major array for a number
	( n addr --- F | c r T )	        In:  n = number to match; addr = array address;  Out:  false (0), if not found—or c = column; r = row; true (non-zero), if found
	8 0 DO	                                Array row loop
		8 0 DO	                        Array column loop
			OVER OVER	        Copy n and addr to top of stack
			J 8 * I +	        Convert row and column to address offset into array
			+ @	                Add offset to  addr and get value at that location
			= IF	                Do we have a match to n?
				DROP DROP	Yes; DROP  top 2 numbers from the stack
				I J 1 LEAVE	Leave column c, row r and 1 for outer loop test; leave inner loop when we next get to LOOP
			ELSE	                No match
				0	        Leave 0 for outer loop test
			THEN	
		LOOP	                        Inner loop end
		IF	                        Did we have a match?
			1	                Yes; leave true (1)
			LEAVE	                Leave outer loop at LOOP
		THEN	
	LOOP	                                Outer loop end
	DEPTH 2 = IF	                        # cells on stack = 2?
		DROP DROP 0	                Yes; loop exhausted with no match; DROP everything and leave false (0)
	THEN	
;	

Suggestions?

 

...lee


I think your post was posted 6 times! LMAO! I presume something went wrong at the AtariAge end - no big deal, but made me laugh!

 

 

	DEPTH 2 = IF	                        # cells on stack = 2?
		DROP DROP 0	                Yes; loop exhausted with no match; DROP everything and leave false (0)
	THEN	

 

Danger Will Robinson :grin:

 

This code presumes that the stack was empty before calling the word (apart from the arguments that the word itself needs, of course). As it's an example, that's okay, though I would personally mention it in the preamble to the example.

 

Or, save the depth to the return stack at the start. Then, at the end, if the depth now is equal to the saved depth, there was no match. If the depth now is equal to saved depth + 1 then there was a match. That would make the word safer to use.

 

As I say, your code is an example, so it's your call, though personally, I would try to make the code as robust as I could, and explain in the text why the code was written like that; it's a useful "aha" moment for the reader.

 

FWIW ;)

 

Mark
