apersson850 Posted February 4, 2022 Share Posted February 4, 2022 OK. That means that your code does run with interrupt service enabled, or the timing wouldn't work. When my program is running interrupts are disabled, so it saves some fraction there. My super-duper 99 also includes a real time clock, based on a clock chip from National, with a resolution of one millisecond. I check the time before and after launching the sort procedure, and get the time from that. I've written a library unit called realtime, which allows access to my clock. The clock is my own design, so I had to make the drivers for it, of course. When the program includes uses realtime, a timer data type becomes available. The unit supports dynamic creation and disposal of timers. When a timer has been created, it can be reset, started and stopped. And read, of course. Here is a syntactically incorrect example, which just shows the principle of how I use them. I could stop the timer before reading it, but it's not mandatory. Any number of timers can be created and run simultaneously. uses realtime; var timer: tmrtype; elapsed: timetype; begin new(timer); tmrreset(timer); tmrstart(timer); quicksort(n,array); tmrread(timer,elapsed); with elapsed do write(hour); write(minute); write(second); writeln(fraction); end; dispose(timer); end. 3 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 4, 2022 Author Share Posted February 4, 2022 6 hours ago, apersson850 said: OK. That means that your code does run with interrupt service enabled, or the timing wouldn't work. When my program is running interrupts are disabled, so it saves some fraction there. My super-duper 99 also includes a real time clock, based on a clock chip from National, with a resolution of one millisecond. I check the time before and after launching the sort procedure, and get the time from that. I've written a library unit called realtime, which allows access to my clock. The clock is my own design, so I had to make the drivers for it, of course. When the program includes uses realtime, a timer data type becomes available. The unit supports dynamic creation and disposal of timers. When a timer has been created, it can be reset, started and stopped. And read, of course. Here is a syntactically incorrect example, which just shows the principle of how I use them. I could stop the timer before reading it, but it's not mandatory. Any number of timers can be created and run simultaneously. uses realtime; var timer: tmrtype; elapsed: timetype; begin new(timer); tmrreset(timer); tmrstart(timer); quicksort(n,array); tmrread(timer,elapsed); with elapsed do write(hour); write(minute); write(second); writeln(fraction); end; dispose(timer); end. Very nice. That's the way to do it. Hardware! I have the 9901 timer running continuously in Camel99 Forth and I use it primarily for timing the "MS" word in Forth which waits for milli-seconds. ( I limit resolution to 10mS because I poll it in Forth (slow) and it also yields to any other tasks while it's waiting) ( the code below is in my Cross-compiler Forth. That's the reason for the special incantations. [CC] "cross-compiling" lets me use the host Forth interpreter for immediate commands [TC] "target-compiling" All definitions goe into the target image ) \ TICKTOCK.HSF TMS9901 hardware timer interface for Camel 99 Forth \ credit to: http://www.unige.ch/medecine/nouspikel/ti99/tms9901.htm#Timer \ impovements based on code from Tursi Atariage \ TMR! now loads from the Forth stack \ Jan 31, 2021 simplified JIFFS \ Dec 2021 removed JIFFS , replaced with TICKS which gives ~10mS resolution \ timer resolution: 64 clock periods, thus 64*333 = 21.3 microseconds \ Max Duration : ($3FFF) 16383 *64*333 ns = 349.2 milliseconds CROSS-ASSEMBLING [CC] DECIMAL [TC] CODE TMR! ( n -- ) \ load TMS9901 timer from stack 0 LIMI, R12 CLR, \ CRU addr of TMS9901 = 0 0 SBO, \ SET bit 0 to 1, Enter timer mode R12 INCT, \ CRU Address of bit 1 = 2 , I'm not kidding TOS 14 LDCR, \ Load 14 BITs from TOS into timer -1 SBZ, \ reset bit 0, Exits clock mode, starts decrementer 2 LIMI, TOS POP, NEXT, ENDCODE CODE TMR@ ( -- n) \ read the TMS9901 timer 0 LIMI, TOS PUSH, R12 2 LI, \ cru = 1 (honest, 2=1) -1 SBO, \ SET bit 0 TO 1, Enter timer mode TOS 14 STCR, \ READ TIMER (14 bits) -1 SBZ, \ RESET bit 1, exit timer mode 2 LIMI, NEXT, ENDCODE [CC] DECIMAL [TC] : TICKS ( n -- ) \ n must be less than 4000. 4000 TICKS ~= 100 mS TMR@ >R BEGIN PAUSE TMR@ R@ - ABS OVER > UNTIL R> 2DROP ; : MS ( n -- ) 10 / 0 ?DO 420 TICKS LOOP ; [CC] HEX [TC] I also use it for testing Assembly language code routines. Of it course it rolls over every 349 mS or so. I explored some code last year that reads the 9901 on the VDP interrupt and adds the difference from last reading to a 32bit int. I didn't go back to it to really test it. I should probably look into making that work as it would be a "reasonably" accurate fined grained timer. For a time I had a an 8K RAM chip in my SuperCart that had battery backed-up clock built in. The batteries were dead or dying and it was unreliable so I took it out. In Classic99 I could read the CLOCK file but that would be best for very long duration timing. Your solution is the best of the lot but iI suppose part of the "fun" is trying make it work with the 1978 hardware. (sometimes its not fun) 3 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 4, 2022 Author Share Posted February 4, 2022 I went looking on Rosetta Code for tasks that were not completed in Forth. This one looked like one I could accomplish so I put it up there. Remove vowels from a string - Rosetta Code 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 4, 2022 Author Share Posted February 4, 2022 And I found the entry for this one wanting as well. When you have something as versatile as Forth you have to show off a little bit don't you? So I added FOREACH. Loops/Foreach - Rosetta Code 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 4, 2022 Author Share Posted February 4, 2022 I revisited this 32 bit timer idea with fresh eyes and I think I have something that works well. The idea is the 9901 can only time 349 mS. The interrupt happens every 16mS but not reliably. So... Let's read the timer every interrupt and add it's value to a 32 bit accumulator variable. It's not likely the interrupts will be off for 300 milli-seconds at least in my system. That let's us keep a very big number as a timer. It's spinning away here on the screen. NEEDS MOV, FROM DSK1.ASM9900 NEEDS ISR' FROM DSK1.ISRSUPPORT CREATE T32 0 , 0 , \ 32 bit timer accumulator CODE TIMERISR \ read the timer, which runs continuously in Camel99 Forth 0 LIMI, R12 2 LI, -1 SBO, R0 14 STCR, -1 SBZ, 2 LIMI, \ add timer value to 32 bit accumulator R0 T32 2+ @@ ADD, OC IF, T32 @@ INC, ENDIF, RT, ENDCODE ISR' TIMERISR INSTALL 4 Quote Link to comment Share on other sites More sharing options...
apersson850 Posted February 5, 2022 Share Posted February 5, 2022 As a stock feature, the p-system has a 32-bit timer that's running all the time, driven by the VDP interrupt. It does of course stop each time you access a floppy disk or similar. It can be ccessed by the intrinsic Pascal procedure time. My real-time clock has nothing to do with that timer, though. It's implemented with a National Semiconductor MM58167A RTC chip on a card in the PEB. It's visible in some of the pictures in this album. 3 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 5, 2022 Author Share Posted February 5, 2022 17 hours ago, apersson850 said: As a stock feature, the p-system has a 32-bit timer that's running all the time, driven by the VDP interrupt. It does of course stop each time you access a floppy disk or similar. It can be ccessed by the intrinsic Pascal procedure time. My real-time clock has nothing to do with that timer, though. It's implemented with a National Semiconductor MM58167A RTC chip on a card in the PEB. It's visible in some of the pictures in this album. That's is a great looking TI-99. Thanks for sharing. I guess if I really wanted a real time clock I should make a board for my PEB. One day maybe. Still lots of S/W ideas to explore. 2 Quote Link to comment Share on other sites More sharing options...
GDMike Posted February 5, 2022 Share Posted February 5, 2022 And don't forget, tipi provides clock too. That's how I pull it. 3 Quote Link to comment Share on other sites More sharing options...
apersson850 Posted February 5, 2022 Share Posted February 5, 2022 Yes, now there are several options. When I made my hardware clock card there was none available on the market. Or I would have bought one. 3 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 9, 2022 Author Share Posted February 9, 2022 I am always amazed how a small change can make a difference. So I was testing my latest kernel build and I found the Benchie benchmark. It assigns a VALUE in the middle of the loop. I was getting a timing of 26.25 seconds whereas TurboForth could rip this off in 24.5 seconds. 5 CONSTANT FIVE 0 VALUE BVAR HEX 100 CONSTANT MASK : BENCHIE MASK 0 DO 1 BEGIN DUP SWAP DUP ROT DROP 1 AND IF FIVE + ELSE 1- THEN TO BVAR BVAR DUP MASK AND UNTIL DROP LOOP ; I wondered if assigning that VALUE with TO was the problem. My TO code just used LITERAL and !. Just bog standard Forth. Seemed like a long shot to me but what the heck. So I did this and created a literal operator that did the store and removes 2 instructions to push the stack down by using R1 for the address. CODE LIT! ( n addr -- ) \ combine function of LIT and ! *IP+ R1 MOV, TOS R1 ** MOV, TOS POP, NEXT, ENDCODE .( .) : VALUE CONSTANT ; : TO ( n -- ) ' >BODY \ compute PFA at compile time STATE @ IF POSTPONE LIT! , EXIT THEN ! ; IMMEDIATE And... just like that it's 24.5 seconds. 4 Quote Link to comment Share on other sites More sharing options...
Willsy Posted February 10, 2022 Share Posted February 10, 2022 Nice. I think TF does something similar? From memory, TF puts the address to write the value stack in line with a word called doTO. 10 TO FRED Might nievely compile to: LIT 10 LIT FRED ! TF compiles it as LIT 10 doTo FRED One less run through the inner interpreter. 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 10, 2022 Author Share Posted February 10, 2022 4 hours ago, Willsy said: Nice. I think TF does something similar? From memory, TF puts the address to write the value stack in line with a word called doTO. 10 TO FRED Might nievely compile to: LIT 10 LIT FRED ! TF compiles it as LIT 10 doTo FRED One less run through the inner interpreter. Sounds the same yes. In my case Store is slower when you cache TOS in a register so I really win using the extra register instead of the stack. But in the end it really is about removing cycles through the inner interpreter if you want to go faster. I have broken my DTC version somewhere in the branching mechanism. I don't think DTC is very practical for TI-99 because of size, but it it is certainly fun to watch it run with a two instruction NEXT. *IP+ W MOV, *W B, And the way I did it R11 becomes the W register which is pretty neat. I started reviewing something I call ASM Forth that I started but didn't finish. When I can see that it works well enough, I will put it up on Github. We might have to start another topic for that one. 4 Quote Link to comment Share on other sites More sharing options...
apersson850 Posted February 10, 2022 Share Posted February 10, 2022 20 minutes ago, TheBF said: ... fun to watch it run with a two instruction NEXT. *IP+ W MOV, *W B, And the way I did it R11 becomes the W register which is pretty neat. Yes, since the TMS 9900 doesn't have autoincrement deferred, like Digital's VAX, that's the best you can do. For those unfamiliar with the architecture, the VAX 11 from Digital was a 32 bit architecture, with an ortoghonal instruction set and plenty of addressing modes. Where the TI in its general addressing mode has a two bit addressing mode specifier, the VAX 11 had four bits. Thus a large number of addressing modes was possible. Autoincrement implied indirect, but you could also do autoincrement deferred, which meant that the register pointed not to the data, but to the address of the data. Double indirection, in other words. 5 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 10, 2022 Author Share Posted February 10, 2022 1 hour ago, apersson850 said: Yes, since the TMS 9900 doesn't have autoincrement deferred, like Digital's VAX, that's the best you can do. For those unfamiliar with the architecture, the VAX 11 from Digital was a 32 bit architecture, with an ortoghonal instruction set and plenty of addressing modes. Where the TI in its general addressing mode has a two bit addressing mode specifier, the VAX 11 had four bits. Thus a large number of addressing modes was possible. Autoincrement implied indirect, but you could also do autoincrement deferred, which meant that the register pointed not to the data, but to the address of the data. Double indirection, in other words. I have never used it but I understand that the 6809 could do DTC NEXT in one instruction as well. There is lot of stuff I don't understand but when I looked at the RISC 5 instruction set it seems extremely verbose to do simple things. I get the RISC idea but the choices made for what the instructions should be seem odd to me. And... nobody seems to care about sub-routine calling overhead in these new designs. I suppose because memory is so cheap. Just put code inline. Chuck Moore built machines that did sub-routine calls in 1 cycle and return was just bit 15 that could be set on your instruction so it was free. 2 Quote Link to comment Share on other sites More sharing options...
D-Type Posted February 11, 2022 Share Posted February 11, 2022 16 hours ago, TheBF said: I have never used it but I understand that the 6809 could do DTC NEXT in one instruction as well. : NEXT, Y ,++ [] JMP, ; \ 6809 DTC 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 13, 2022 Author Share Posted February 13, 2022 I have been looking at SCAN ( addr len char ) a non-standard word but one that is in GForth and many other systems. It is a very handy primitive for finding a character in a string. I wondered how to do it in Forth and it really responds well to the dual WHILE structure because there are two loop ending conditions. It could be done with an AND as well I think but might need a more stack shuffles. : SCAN ( adr len char -- adr' len') >R \ remember char BEGIN DUP WHILE ( len<>0) OVER C@ R@ <> WHILE ( R@<>char) 1 /STRING \ advance to next char address REPEAT THEN R> DROP \ drop char \ 32 bytes ; This made me realize I could do the dual WHILE trick in Forth Assembler now that I have the ANSI style loops as part of ASM9900. I am again impressed with the 9900 instruction set. The point of making an interpreter is normally to save some space but the 9900 does the same function in 10 bytes less. So smaller and faster. This is partly because of using registers rather than stack operations for the variables and also because the jumps in Assembler are short jumps and so take only 2 bytes per jump. This make me wonder if I could use short jumps for Forth? Hmmm... CODE SCAN ( adr len char -- adr' len' ) \ find matching char TOS SWPB, \ char stays in TOS 2 (SP) W MOV, \ address->w *SP+ R1 MOV, \ POP count into R1, BEGIN, R1 R1 MOV, NE WHILE, ( len<>0) *W TOS CMPB, NE WHILE, ( *R8<>R1) ( do: 1 \STRING ) W INC, \ inc. adr R1 DEC, \ dec. len REPEAT, ENDIF, W *SP MOV, \ store updated address on stack R1 TOS MOV, \ updated count to TOS NEXT, \ 26 bytes ENDCODE 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 24, 2022 Author Share Posted February 24, 2022 Sometimes I can make myself feel pretty stupid. I am working on a little debugger Forth for @FarmerPotato for Geneve 2020 and I wondered if I could incorporate wordlists into the system to hide the Assembler. It took me down the rabbit hole of how to make sure the dictionary would wake up correctly because I had not made that work in Camel99 yet. One of the neat things about Forth is that while you are compiling code you have access to the interpreter and can do processing inside the source code as well. This works perfectly when loading source code but if you save that code as a binary executable YOU HAVE TO REMEMBER TO DO IT at runtime. So in my WORDLIST implementation I used this feature to initialize the new Forth wordlist in three lines. Somehow I missed the importance of these lines all this time. DOH! One of those lines completely changes the dictionary search mechanism. (how dumb am I) Anyway I finally can build a big system with vocabularies, save it as a binary program and it starts with everything intact. This has great implications for the machine Forth compiler which needs many vocabularies to partition cross-compiler, cross-assembler and target Forth dictionaries. Here is what I needed to add : INIT-WORDLISTS ['] FIND12 'FIND ! CONTEXT @ @ FORTH-WORDLIST ! ONLY FORTH DEFINITIONS ; And now here is what it takes to build a new Forth with more features: \ building a big Forth system with WORD lists FEB 2022 Brian Fox \ starting point from the CAMEL99 Kernel INCLUDE DSK1.WORDLISTS \ load the assembler in it's own vocabulary VOCABULARY ASSEMBLER ASSEMBLER DEFINITIONS INCLUDE DSK1.ASM9900 \ load some tools ONLY FORTH DEFINITIONS INCLUDE DSK1.TOOLS \ create a new BOOT word : COLD WARM \ Init hardware, restore Forth dictionary INIT-WORDLISTS \ set find mechanism, init FORTH wordlist ABORT ; \ reset the interpreter LOCK \ lock end of dictionary at COLD INCLUDE DSK1.SAVESYS ' COLD SAVESYS DSK6.FATFORTH 6 Quote Link to comment Share on other sites More sharing options...
+FarmerPotato Posted February 24, 2022 Share Posted February 24, 2022 14 hours ago, TheBF said: And now here is what it takes to build a newForth with more features I’m very grateful! My understanding of how VOCABULARY works is poor. I know it changes the results of a dictionary lookup, and the compiled results aren’t affected by anything later. (CFAs are just pointers.) But when you switch vocabularies , does it change some pointers in the dictionary? Is that the problem you had with saving the state, there are some pointers left in the wrong state? I’ve used vocabulary in FORTI because the note compiler has words A B C D E F G. (And A# and A$ for flat.) These can only execute as words when the vocabulary is active, which is always inside the compiling word VOICE: — ; kind of like assembler, but for music. The A B C words compute a note number, modified by a LOT of context (like key and octave plus ornamentation) They push a word or two onto the dictionary. The VOICE: is used to make a sequence of notes , to be interpreted later by a player. And outside a VOICE: definition you never, ever want A to be anything but a hexadecimal number! (Even the assembler vocabulary uses A, (A comma ) still needing a lot of studying to really comprehend what’s going on with VOCABULARY. 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 24, 2022 Author Share Posted February 24, 2022 19 hours ago, FarmerPotato said: I’m very grateful! My understanding of how VOCABULARY works is poor. I know it changes the results of a dictionary lookup, and the compiled results aren’t affected by anything later. (CFAs are just pointers.) But when you switch vocabularies , does it change some pointers in the dictionary? Is that the problem you had with saving the state, there are some pointers left in the wrong state? I’ve used vocabulary in FORTI because the note compiler has words A B C D E F G. (And A# and A$ for flat.) These can only execute as words when the vocabulary is active, which is always inside the compiling word VOICE: — ; kind of like assembler, but for music. The A B C words compute a note number, modified by a LOT of context (like key and octave plus ornamentation) They push a word or two onto the dictionary. The VOICE: is used to make a sequence of notes , to be interpreted later by a player. And outside a VOICE: definition you never, ever want A to be anything but a hexadecimal number! (Even the assembler vocabulary uses A, (A comma ) still needing a lot of studying to really comprehend what’s going on with VOCABULARY. It is probably beyond a simple post to fully explain this but here goes. (I could not have done this 3 years ago so I guess that's progress for an old guy) Since the Forth dictionary is a linked list that begins at the last word defined, that constitutes a vocabulary in the simplest sense. However... VOCABULARY is not part of the ANSI Forth and has been replaced by WORDLIST. WORDLIST is a simpler thing, a little 3 field data structure that just returns its address onto the data stack. The address is called a 'wid'. (wordlist identifier) : WORDLIST ( -- wid) HERE 0 , \ init nfa of last word in wordlist WID-LINK @ , \ compile link to previous wordlist DUP WID-LINK ! \ link to previous wordlist 0 , \ name of this wordlist. Must be patched ; To make Forth search a WORDLIST you make it the "CONTEXT" wordlist with SET-CONTEXT. There are two variables that determine specific roles of dictionary searches made by the compiler/interpreter. VARIABLE CURRENT VARIABLE CONTEXT The CONTEXT "WORDLIST" is what the compiler searches to find definitions to build new definitions or to run an immediate command. CONTEXT holds a "pointer to a pointer" to the last word created in a WORDLIST. The CURRENT wordlist is the name space (wordlist) that new words will become part of when you define them. The good news is that it's simple to make a vocabulary (which most implementers do) with CREATE/DOES> . Just create the wordlist data structure and at run-time have it set itself as the CONTEXT wordlist. : VOCABULARY ( <text> ) CREATE WORDLIST LATEST @ SWAP 4 + ! \ update wordlist name field DOES> SET-CONTEXT ; In Fig-Forth or Forth 79 systems FORTH is the default vocabulary that is searched and where new words are built. If you invoke another vocabulary by using its name, the new vocabulary is searched first and when you get to the end of the it, the search continues on into the Forth vocabulary. So they are effectively connected by default. In ANSI Forth and Forth 83 we are given control of what gets searched and in what order. So if you have FORTH, ASSEMBLER and EDITOR vocabularies, you can control which one gets searched first, second and third in any order. This is more complicated by gives you great flexibility for complex jobs. In the new model the CONTEXT variable becomes an array of addresses (in my implementation). Each array cell points to the last word defined in each specific vocabulary. The system knows the number of vocabularies you want searched and it will search each one in the order that they are placed in the array. There is still only one CURRENT vocabulary where new definitions are added. I think I will leave it there for a little digestion. More info can be found here: WORDLIST - SEARCH (forth-standard.org) Here is my latest code for WORDLISTS Spoiler \ wordlist.fth for CAMEL99 FORTH Oct 2020 Brian Fox \ Code adapted from Web: https://forth-standard.org/standard/search \ Dec 2020: Removed SET-CURRENT to save precious bytes \ Jan 5, 2020: back migrated some enhancements from CODEX work \ Jun 4, 2021: Changed order of patching to work with TTY version \ Sep 25, 2021: Corrected SET-CONTEXT, Removed ROOT to save space. \ Feb 23, 2022: Added INIT-WORDLISTS for binary program startup \ -------- \ 'wid' is a word-list ID. \ In Camel Forth, wid is a pointer to a Name Field Address (NFA) \ ie: a counted string of the last word defined in the wordlist. \ The kernel program has a pre-defined CONTEXT array to hold the \ Forth wordlist plus 8 user defined wordlists. \ NEEDS .S FROM DSK1.TOOLS ( Debugging) HERE DECIMAL CREATE #ORDER 1 , \ No. of active wordlists starts at 1 VARIABLE WID-LINK \ Pointer to the most recently defined wordlist CREATE FORTH-WORDLIST 0 , 0 , LATEST @ , FORTH-WORDLIST WID-LINK ! \ set first WID in the chain : WORDLIST ( -- wid) HERE 0 , \ init nfa of last word in wordlist WID-LINK @ , \ compile link to previous wordlist DUP WID-LINK ! \ link to previous wordlist 0 , \ name of this wordlist. Must be patched ; HEX : .WID ( wid -- ) [ 2 CELLS ] LITERAL + @ ?DUP 0= IF EXIT THEN \ name field is empty. COUNT 1F AND TYPE SPACE ; \ : ]CONTEXT ( n -- addr) CELLS CONTEXT + ; \ context as array HEX ( Machine code is same size but faster) CODE ]CONTEXT ( n -- addr) A104 , \ TOS TOS ADD, 0224 , CONTEXT , \ TOS CONTEXT AI, NEXT, ENDCODE .( .) : GET-ORDER ( -- widn ... wid1 n ) \ *reversed order on stack #ORDER @ 0 DO #ORDER @ I - 1- ]CONTEXT @ LOOP #ORDER @ ; DECIMAL : SET-ORDER ( wid1x ... wid1 n -- ) \ n cannot be 0 DUP 0< IF DROP FORTH-WORDLIST DUP 2 THEN DUP #ORDER ! 0 ?DO I ]CONTEXT ! LOOP ; : ONLY ( -- ) TRUE SET-ORDER ; \ set search order to FORTH FORTH : SET-CONTEXT ( wid -- ) \ place 'wid' at beginning of search order >R GET-ORDER NIP \ remove 1st wordlist R> SWAP SET-ORDER \ put 'wid' first ; \ User API ... : ALSO ( -- ) GET-ORDER OVER SWAP 1+ SET-ORDER ; : PREVIOUS ( -- ) GET-ORDER NIP 1- SET-ORDER ; : DEFINITIONS ( -- ) CONTEXT @ CURRENT ! ; .( .) \ non-standard but nice to have : VOCABULARY ( <text> ) CREATE WORDLIST LATEST @ SWAP 4 + ! \ update wordlist name field DOES> SET-CONTEXT ; : ORDER ( -- ) CR GET-ORDER 0 DO .WID LOOP CR ." Current: " CURRENT @ .WID CR ; : FORTH ( -- ) FORTH-WORDLIST SET-CONTEXT ; \ patch FORTH-WORDLIST to existing dictionary CONTEXT @ @ FORTH-WORDLIST ! \ set the new search order and current vocabulary FORTH-WORDLIST DUP 2 SET-ORDER DEFINITIONS \ Forth 2012 6.1.1550, Extend FIND to search all active wordlists : FIND12 ( FIND12) ( c-addr -- c-addr 0 | xt 1 | xt -1 ) FALSE \ default flag CONTEXT #ORDER @ CELLS ( -- addr size) BOUNDS ?DO OVER I @ @ (FIND) ?DUP IF 2SWAP 2DROP LEAVE THEN DROP 2 +LOOP ; ' FIND12 'FIND ! ONLY FORTH DEFINITIONS : INIT-WORDLISTS ['] FIND12 'FIND ! CONTEXT @ @ FORTH-WORDLIST ! ONLY FORTH DEFINITIONS ; INIT-WORDLISTS CR HERE SWAP - DECIMAL SPACE . .( bytes) HEX 2 Quote Link to comment Share on other sites More sharing options...
+Lee Stewart Posted February 25, 2022 Share Posted February 25, 2022 17 hours ago, TheBF said: In the new model the CONTEXT variable becomes an array of addresses (in my implementation). Each array cell points to the last word defined in each specific vocabulary. The system knows the number of vocabularies you want searched and it will search each one in the order that they are placed in the array. There is still only one CURRENT vocabulary where new definitions are added. You must have written this too quickly—CURRENT and CONTEXT in the above context should be reversed. —[Corrected]— ...lee 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted February 25, 2022 Author Share Posted February 25, 2022 38 minutes ago, Lee Stewart said: You must have written this too quickly—CURRENT and CONTEXT in the above context should be reversed. ...lee Yes indeed. Caught me again. Thanks. I will edit it up. 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted March 7, 2022 Author Share Posted March 7, 2022 While reading comp.lang.forth I saw a link to this page which I think is the best summation of Chuck Moore's approach to programming that I have seen. It was written by the late Jeff Fox (no relation) who passed away of a heart attack at the age of 62 in 2011. Jeff worked closely with Chuck creating software for Chuck's new CPU designs. He is missed in the Forth community. http://www.ultratechnology.com/method.htm I personally like step 9 in this 10 step approach 9. Code. Build custom tools if they help. Write code so simple and clear that bugs simply can't happen. Make the code "right by design." :define using one-liners about this long ; Interactively test each Forth word. Extend the core language making your custom language and moving you toward your solution. Return to 1. UNTIL the code solution falls out. : HARD REMEMBERING TO MAKE NICE SHORT WORDS ; 4 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted March 12, 2022 Author Share Posted March 12, 2022 (edited) Pythagoras + a bit of Fudge I decided to remove my "improved" TI-FORTH SPRITE code from my DIRSPRIT ( direct sprites) library. I changed my coincidence code awhile back to just compare numbers in the sprite x,y values in VDP RAM, so it was not really being used. The TI-FORTH distance between sprites calculation was always sub-optimal returning the distance squared and if the sums overflowed it just returned 32767. I think I remember Lee fixed this with floating point math but I went down the rabbit hole of how to do it with integers. I had a pretty quick square root word for 16 bits which I found somewhere years ago, in Forth magazine I think. It's pretty clever but being 16 bits it doesn't work past 65535. With a 255x195 screen summing the squares can be done to 32 bit resolution so that's no problem. But I could not figure out how to get the square root of a 32 bit number without a lot more work. I realized that if when I did the 32 bit addition unsigned, I got a 1 in the upper bits of the double number. Using that 1 as an overflow flag I could perform the square root on the lower 16 bits and if there was an overflow I can fudge the value to give something useful. Not perfect Pythagorean computation but much easier to do than fighting with a 32 bit square root calculation I think. The fudge factor calculation takes the incorrect square root (in an over-flow), divides it by PI and adds it to 255. This cause the values to be slightly compressed as you get farther away but it would still work in a game. A little bit like those mirrors on your car that say "Things appear closer than they are". \ DISTANCE.FTH compute distance between any two sprites Mar 12 2022 NEEDS DUMP FROM DSK1.TOOLS NEEDS SPRITE FROM DSK1.DIRSPRIT NEEDS AUTOMOTION FROM DSK1.AUTOMOTION MARKER NEW DECIMAL : ^2 ( n n -- n) DUP * ; : DIFF ( x y x y -- dx dy) ROT - -ROT - ; : SUM ( dx^2 dy^2 -- d) 0 ROT 0 D+ ; \ sum squares to 32 bit resolution : SUM-SQUARES ( x y x y -- d ) DIFF ^2 SWAP ^2 SUM ; : SQRT ( n -- n ) -1 TUCK DO 2+ DUP +LOOP 2/ ; : PI/ ( n -- n ) 10000 31415 */ ; : DISTANCE ( x y x y -- n) SUM-SQUARES >R SQRT R> IF PI/ 255 + THEN ; : SP.DISTXY ( x y spr# -- dist ) POSITION DISTANCE ; : SP.DIST ( spr#1 spr#2 -- dist ) POSITION ROT SP.DISTXY ; And here is some test code \ test code DECIMAL \ char clr x y spr# \ ------------------------- CHAR 0 6 0 0 0 SPRITE CHAR 1 10 240 85 1 SPRITE CHAR 2 9 255 255 2 SPRITE CHAR 3 11 127 90 3 SPRITE CHAR 4 13 199 149 4 SPRITE 1 MAGNIFY CLEAR 0 0 SP.DIST . 0 1 SP.DIST . 1 2 SP.DIST . 2 3 SP.DIST . 0 3 SP.DIST . 0 4 SP.DIST . 0 2 SP.DIST . You can see the compression in the screen capture. Distance between sprite 0 and sprite 2 should 360, but it's only 335. This error only occurs after the distance exceeds 255. I may still pursue a 32bit Square root but this seems useable. Edited March 12, 2022 by TheBF wrong image 2 Quote Link to comment Share on other sites More sharing options...
+Lee Stewart Posted March 13, 2022 Share Posted March 13, 2022 52 minutes ago, TheBF said: I think I remember Lee fixed this with floating point math but I went down the rabbit hole of how to do it with integers. . . . I may still pursue a 32bit Square root but this seems useable. I ported a C program to take an unsigned double (32-bit) square to a single (16-bit) square root. It consumes 128 bytes in the fbForth dictionary: Spoiler /* from Craig McQueen’s response on stackoverflow.com to “Looking for an efficient integer square root algorithm for ARM Thumb2”: (https://stackoverflow.com/questions/1100090/looking-for-an-efficient-integer-square-root-algorithm-for-arm-thumb2) */ #define BITSPERLONG 32 #define TOP2BITS(x) ((x & (3L << (BITSPERLONG-2))) >> (BITSPERLONG-2)) struct int_sqrt { unsigned sqrt, frac; }; /* usqrt: ENTRY x: unsigned long EXIT returns floor(sqrt(x) * pow(2, BITSPERLONG/2)) Since the square root never uses more than half the bits of the input, we use the other half of the bits to contain extra bits of precision after the binary point. EXAMPLE suppose BITSPERLONG = 32 then usqrt(144) = 786432 = 12 * 65536 usqrt(32) = 370727 = 5.66 * 65536 NOTES (1) change BITSPERLONG to BITSPERLONG/2 if you do not want the answer scaled. Indeed, if you want n bits of precision after the binary point, use BITSPERLONG/2+n. The code assumes that BITSPERLONG is even. (2) This is really better off being written in assembly. The line marked below is really a "arithmetic shift left" on the double-long value with r in the upper half and x in the lower half. This operation is typically expressible in only one or two assembly instructions. (3) Unrolling this loop is probably not a bad idea. ALGORITHM The calculations are the base-two analogue of the square root algorithm we all learned in grammar school. Since we're in base 2, there is only one nontrivial trial multiplier. Notice that absolutely no multiplications or divisions are performed. This means it'll be fast on a wide range of processors. */ void usqrt(unsigned long x, struct int_sqrt *q) { unsigned long a = 0L; /* accumulator */ unsigned long r = 0L; /* remainder */ unsigned long e = 0L; /* trial product */ int i; for (i = 0; i < BITSPERLONG; i++) /* NOTE 1 */ { r = (r << 2) + TOP2BITS(x); x <<= 2; /* NOTE 2 */ a <<= 1; e = (a << 1) + 1; if (r >= e) { r -= e; a++; } } memcpy(q, &a, sizeof(long)); } /** * \brief Fast Square root algorithm * * Fractional parts of the answer are discarded. That is: * - SquareRoot(3) --> 1 * - SquareRoot(4) --> 2 * - SquareRoot(5) --> 2 * - SquareRoot(8) --> 2 * - SquareRoot(9) --> 3 * * \param[in] a_nInput - unsigned integer for which to find the square root * * \return Integer square root of the input value. */ uint32_t SquareRoot(uint32_t a_nInput) { uint32_t op = a_nInput; uint32_t res = 0; uint32_t one = 1uL << 30; // The second-to-top bit is set: use 1u << 14 for uint16_t type; use 1uL<<30 for uint32_t type // "one" starts at the highest power of four <= than the argument. while (one > op) { one >>= 2; } while (one != 0) { if (op >= res + one) { op = op - (res + one); res = res + 2 * one; } res >>= 1; one >>= 2; } return res; } HEX \ Registers: R0,R1 = udh,udl \ R2,R3 = root (nh,nl) \ R4,r5 = floating 1 (f1h,f1l) \ R6,R7 = wkh,wkl ASM: UDSQRT ( ud -- n ) *SP+ R0 MOV, \ pop udh to R0 *SP R1 MOV, \ udl to R1 R2 CLR, \ clear running.. R3 CLR, \ ..root (nh,nl) R5 CLR, \ set floating 1.. R4 4000 LI, \ ..to 4000 0000 \ get highest power of 4 <= square (udh,udl) BEGIN, R4 R0 C, \ f1h:udh? EQ IF, R5 R1 C, \ f1l:udl? THEN, H WHILE, R4 R5 C, \ f1h:f1l? H IF, \ bit in f1h? R4 2 SRL, \ yes..shift f1h right 2 bits EQ IF, \ bit shifted out? R5 4000 LI, \ set f1l to shifted-out bit THEN, ELSE, \ no..bit is in f1l R5 2 SRL, \ shift f1l right 2 bits THEN, REPEAT, \ calculate square root BEGIN, R4 R5 C, \ f1h:f1l? NE WHILE, \ f1h,f1l > 0? \ set up wkh,wkl = nh,nl + f1h,f1l to compare to udh,udl R2 R6 MOV, \ nh to wkh R3 R7 MOV, \ nl to wkl R5 R7 A, \ f1l + wkl OC IF, \ carry? R6 INC, \ yes..increment wkh THEN, R4 R6 A, \ f1h + wkh \ check if need to update running values of udh,udl & nh,nl R0 R6 C, \ udh:wkh? EQ IF, \ udh = wkh? R1 R7 C, \ udl:wkl? THEN, HE IF, \ udh,udl >= wkh,wkl? \ udh,udl = udh,udl - wkh,wkl R7 R1 S, \ udl - wkl NC IF, R0 DEC, \ reduce udh on no carry THEN, R6 R0 S, \ udh - wkh \ nh,nl = wkh,wkl + f1h,f1l R5 R7 A, \ wkl + f1l OC IF, \ carry? R6 INC, \ yes..increment wkh THEN, R4 R6 A, \ wkh + f1h R6 R2 MOV, \ wkh to nh R7 R3 MOV, \ wkl to nl THEN, \ f1h,f1l >> 2 R4 R5 C, \ f1h:f1l? H IF, \ bit in f1h? R4 2 SRL, \ yes..shift f1h right 2 bits EQ IF, \ bit shifted out? R5 4000 LI, \ set f1l to shifted-out bit THEN, ELSE, \ no..bit is in f1l R5 2 SRL, \ shift f1l right 2 bits THEN, \ nh,nl >> 1 R3 1 SRL, \ shift nl right 1 bit R2 1 SRL, \ shift nh right 1 bit OC IF, \ carry? R3 8000 ORI, \ set MSb of nl THEN, REPEAT, R3 *SP MOV, \ return n on stack ;ASM DECIMAL 1 hour ago, TheBF said: You can see the compression in the screen capture. Distance between sprite 0 and sprite 2 should be 360, but it's only 335. The TI-99/4A’s 192x256 resolution has a maximum pixel distance, corner to corner, of ~319 pixels: √(2552 + 1912) ≈ 318.6. With anything more, one of the sprites is off screen, so I guess you are involving off-screen distances in your calculations, but to what end? ...lee 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted March 13, 2022 Author Share Posted March 13, 2022 9 minutes ago, Lee Stewart said: I ported a C program to take an unsigned double (32-bit) square to a single (16-bit) square root. It consumes 128 bytes in the fbForth dictionary: The TI-99/4A’s 192x256 resolution has a maximum pixel distance, corner to corner, of ~319 pixels: √(2552 + 1912) ≈ 318.6. With anything more, one of the sprites is off screen, so I guess you are involving off-screen distances in your calculations, but to what end? ...lee Ah yes. Very nice work. Thanks. Amazing how many instructions it takes in assembler. I was just trying to see how far off the calculations were with the off-screen sprite. I will double check this thing with the 256 x 192 coordinates. I may be able to get closer on the reduced size. 1 Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.