+TheBF Posted October 21, 2019 Share Posted October 21, 2019 6 hours ago, moulinaie said: Hi all, I used my My Little Compiler to solve the problem: It solves the problem in 14 seconds (but only 7 seconds to calculate, the first 7 seconds are used to load the compiler from disk and compile the DATA lines). here is the source for the Pre-compiler: 100 CALL CLEAR $MLC F 110 10 3000 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END $SEVEN 0 DIMTABLE F 256 ; F points to the special 256 bytes buffer DIMTABLE G 256 ; G points to 256 bytes (for returning string) STARTDATA BYTES 0,7,4,1,8,5,2,9,6,3 BYTES 0,0,1,2,2,3,4,4,5,6 ENDDATA E ; E points to this multiplication table LET P 1 ; current power = 1 PUTTABLE F 0 7 ; current value is 7 at first byte LET Z 0 ; current LEN-1 of number in digits LET X 0 ; clear word for byte operations LET A 0 ; flag for "found" REPEAT LET N 0 ; max occurences of 7 LET C 0 ; current carry is zero FOR I 0 Z GETTABLE F I X ; X = digit GETTABLE E X Y ; multiplied by 7 ADD Y C ; plus carry ADD X 10 ; points to tenths GETTABLE E X C ; get new carry COMPARE Y 10 ; if more than a digit IF>= SUB Y 10 ; then reports... INC C ; ...one tenth on carry ENDIF PUTTABLE F I Y ; store new digit COMPARE Y 7 ; a "7" found? IF= INC N ; one more COMPARE N 6 ; six "7" found? IF= INC A ; flag for found ENDIF ELSE LET N 0 ; not a "7", reset counter ENDIF NEXT COMPARE C 0 ; if carry remains IF<> INC Z ; then one more digit PUTTABLE F Z C ; and stored ENDIF INC P COMPARE A 0 UNTIL<> LET J Z FOR I 0 Z GETTABLE F I X ; one digit from F table ADD X 48 ; in ASCII PUTTABLE G J X ; one character in G string DEC J NEXT INC Z ; correct len LET U 2 PUTTABLE U 0 G ; set A$ to G string with Z characters PUTPARAM 1 P ; set N to the current power $$ $END This is reduced to those few lines in XB : 100 CALL CLEAR 110 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO") 120 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END 130 PRINT "Compilation OK!" 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END 3000 DATA P 3010 DATA SEVEN 3020 DATA "TRF256TRG256_ABAT<TH10.00070401080502090603TH10.00000102020304040506TELAT>E=P1TPF0.7=Z0=X0=A0LA=N0=C0=I0LBTGFIXTGEXY+YC+X10TGEXC" 3030 DATA "CY10_C?<C-Y10ICLCTPFIYCY7_C!=CINCN6_D!=DIALD_DBDLC=N0LDIICIZ!>BCC0_B?=BIZTPFZCLBIPCA0?=A=JZ=I0LATGFIX+X48TPGJXDJIICIZ!>AIZ=U2" 3040 DATA "TPU0GP1P" 3050 DATA "" 3060 DATA "" Guillaume. Very clever! Willsy's original challenge required scrolling... If I do it this way I should expect very different results for sure! Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 21, 2019 Share Posted October 21, 2019 So simply removing the need to scroll was a big improvement. It also meant that I could scan the array for integer sevens rather than converting to a text string and scanning the string. Without improving the calculator the results in indirect threaded Forth are 33 Seconds. This is a reasonable ratio for Assembler to Indirect Threaded Forth. 3 to 5 times slower is typical. I will try Guillaumes calculation method and see if that buys me something. It may not because the UM/MOD ( "un-signed mixed division and modulo") routine is just a native DIV instruction that simultaneously computes 2 results. The overhead in Forth of shifting values and doing a subtract might actually be no improvement. We shall see. For complete comparison the compile time is also about 7 seconds. ( I have another compiler that compiles about 20% faster, but I broke it making some changes so it's offline) Spoiler \ lucien2 Version of FIG Forth PORTED to CAMEL99 Forth with re-work \ ** non-scrolling version ** \ NEEDS DUMP FROM DSK1.TOOLS NEEDS ELAPSE FROM DSK1.ELAPSE NEEDS ()@, FROM DSK1.CODEMACROS DECIMAL 180 CONSTANT SIZE \ variable fast fetchers VARIABLE i MACRO i@ i @, ;MACRO VARIABLE X MACRO X@ X @, ;MACRO VARIABLE POWER VARIABLE LENGTH CREATE A1 SIZE CELLS ALLOT CREATE A2 SIZE CELLS ALLOT CREATE PAD 256 ALLOT \ Integer Arrays that is indexed addressing MACRO ]A1@ ( ndx -- n) A1 ()@, ;MACRO MACRO ]A2@ ( ndx -- n) A2 ()@, ;MACRO MACRO ]A1! ( ndx -- n) A1 ()!, ;MACRO MACRO ]A2! ( ndx -- n) A2 ()!, ;MACRO \ Note: Used UM/MOD, un-signed division because it's faster : A1*7->A2 ( -- ) i OFF X OFF BEGIN i@ ]A1@ DUP 0= i@ LENGTH @ > AND X@ 0= AND 0= WHILE 7 * X@ + S>D 10 UM/MOD X ! i@ ]A2! i 1+! REPEAT DROP i@ LENGTH ! ; \ number converion helpers : <# ( -- ) PAD HP ! ; \ set HP to buffer for number conversion : #> ( -- pad length ) PAD HP @ OVER - ; : HOLD ( char -- ) HP @ C! HP 1+! ; \ hold digit in pad, bump pointer : A2>$ ( -- addr len) \ array to text string <# -1 LENGTH @ 1- DO I ]A2@ [CHAR] 0 + HOLD -1 +LOOP #> \ compute length [CHAR] 0 SKIP ; \ skip leading zeros : SEVENS? ( -- ?) 0 LENGTH @ 1- 0 DO 1+ I ]A2@ 7 = AND DUP 4 > IF LEAVE THEN LOOP ; : INTRO PAGE ." The 5 Sevens Problem" CR CR ." Press a key to start" KEY DROP ; : CALCULATOR CR ." Working ..." BEGIN A1*7->A2 A2 A1 LENGTH @ CELLS CMOVE POWER 1+! SEVENS? UNTIL ; : ERASE ( address length --) 0 FILL ; : EMPTY ( array --) SIZE CELLS ERASE ; : SETUP A1 EMPTY A2 EMPTY 7 A1 ! 2 POWER ! LENGTH OFF PAGE TICKER OFF ; : RESULTS CR CR CR CR ." The Answer is 7 ^" POWER @ 1- . CR .ELAPSED CR ; : RUNFASTER INTRO SETUP CALCULATOR RESULTS A2>$ TYPE ; \ CAMEL99 time 33 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 21, 2019 Share Posted October 21, 2019 Just for curiosity, the same code compiled with Direct threaded Forth gets us to 27.4 seconds, a 20% improvement. Compile time 5.23 seconds. (*with a bug in the string conversion. I don't use this compiler these days. Code is bigger) Quote Link to comment Share on other sites More sharing options...
+RXB Posted October 21, 2019 Share Posted October 21, 2019 Honestly if you want to delay everything with Basic or XB just use PRINT and scroll the entire screen. DISPLAY AT(row,col) is much faster then having to move the entire screen every time it adds a new power to display. RXB CALL HPUT(row,col,string or number) is very slightly faster then DISPLAY AT(row,col) (4 minute test is about 1 second faster, 21 minute test it is almost 3 seconds faster) My main point is SCROLL is insanely slow to use and always is slow, you are after all rewriting the entire screen each pass. 2 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 22, 2019 Share Posted October 22, 2019 On 10/21/2019 at 5:57 AM, moulinaie said: Knowing that 7 = 8 - 1, you can say that : value * 7 = shift-left (value,3) - value. That can be fast in assembly. But I don't know if your FORTH version has the binary shift instructions. Guillaume. I added a shift-left by 3 operator to my FORTH way back, to handle character patterns faster. It's called 8* so this is simple to try. I have cleaned up the computation section so for reference the '*" operator/ Time 27.33 seconds I tried the shift method in hi-level Forth and it is a bit slower that using the '*" operator. Time= 29.53 seconds : 7* ( n -- n') DUP 8* SWAP - ; This is because of the stack juggling operations needed and the overhead of running each of those Forth instructions. (The address interpreter is 3 instructions that run each time a Forth word ends) I tried removing the extra call to 7* by making a text macro. : 7* S" DUP 8* SWAP -" EVALUATE ; IMMEDIATE Time =28.41 I removed all Forth overhead with my INLINE[ ] compiler. CODE 7* INLINE[ DUP 8* SWAP - ] NEXT, ENDCODE TIME = 27.13 (20mS faster) So I re-wrote it in Forth Assembler. CODE 7* ( n -- n') TOS R1 MOV, TOS 3 SLA, R1 TOS SUB, NEXT, ENDCODE Time= 26.48 (3% better) The real bottle neck is probably the DIV instruction. And also Forth looping is much slower than assembler. I will have to implement your algorithm. -------------------- FYI for anyone following this thread, I have simplified the computation code quite a bit: : A1*7->A2 ( -- ) X OFF 0 \ index on stack BEGIN DUP ]A1@ 7* X@ + S>D 10 UM/MOD X ! OVER ]A2! 1+ DUP LENGTH@ > X@ 0= AND UNTIL LENGTH ! ; Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 23, 2019 Share Posted October 23, 2019 16 hours ago, TheBF said: So I re-wrote it in Forth Assembler. CODE 7* ( n -- n') TOS R1 MOV, TOS 3 SLA, R1 TOS SUB, NEXT, ENDCODE Using assembler is not FORTH... ? In this case, MLC can do assembler too, there is an integrated assembler into the PreCompiler: Here is the source code for SEVENA (A for assembler) : (you can see that Assembler can exchange data with the variables of MLC as A to Z are predefined into the assembler) 100 CALL CLEAR $MLC F 110 10 3000 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END $SEVEN DIMTABLE F 256 ; F points to the special 256 bytes buffer DIMTABLE G 256 ; G points to 256 bytes (for returning string) STARTDATA BYTES 0,7,4,1,8,5,2,9,6,3 BYTES 0,0,1,2,2,3,4,4,5,6 ENDDATA E ; E points to this multiplication table LET P 1 ; current power = 1 PUTTABLE F 0 7 ; current value is 7 at first byte ; R0 = pointer on F table ; R1 = pointer to E table ; R2 = 7 ; R3 = one digit and pointer into E ; R4 = one digit after multiplication ; R5 = carry ; R6 = number of digits-1 into F table ; R7 = number of consecutive "7" currently found ; R8 = flag for FOUND ; R9 = loop counter on digits while mul*7 ; R10 = 10 $[ LI R6,1 ; current LEN in digits in F table CLR R8 ; flag for "found" LI R10,>A00 ; R10 = byte 10 LI R2,>700 ; R2 = byte 7 MOV @E,R1 ; mutliplication table E ; start of MAIN loop (a852) CLR R7 CLR R5 MOV R6,R9 MOV @F,R0 ; DEBUT start of multiply loop (a85C) CLR R3 ; for byte operation MOVB *R0,R3 ; new digit SWPB R3 ; to word A R1,R3 ; E+digit -> points to the units of table*7 MOVB *R3,R4 ; get units AB R5,R4 ; plus carry MOVB @10(R3),R5 ; E+digit+10 -> points to tenths of table*7, new carry CB R4,R10 ; more than a digit? JLT +3 ; to AA SB R10,R4 ; if so, reports ten... AI R5,>100 ; ...on the carry (byte) ; AA MOVB R4,*R0+ ; store new digit CB R2,R4 ; is it a "7" ? JNE +6 ; to BB INC R7 ; if so, one more found CI R7,6 ; six "7" ? JNE +3 ; to CC INC R8 ; yes!!! end of search ! JMP +1 ; to CC ; BB CLR R7 ; not a "7"... reset counter ; CC DEC R9 ; loop counter JNE -25 ; no, return to DEBUT CI R5,0 ; a carry remains? JEQ +2 ; to DD INC R6 ; yes! One more digit MOVB R5,*R0 ; and store it ; DD INC @P ; power+1 CI R8,0 ; flag for found JEQ -40 ; still zero... back to MAIN DEC R6 ; to return LEN-1 MOV R6,@Z ; else, update number of digits $] LET J Z FOR I 0 Z GETTABLE F I X ; one digit from F table ADD X 48 ; in ASCII PUTTABLE G J X ; one character in G string DEC J NEXT INC Z ; correct len LET U 2 PUTTABLE U 0 G ; set A$ to G string with Z characters PUTPARAM 1 P ; set N to the current power $$ $END And the resulting X-BASIC program : 100 CALL CLEAR 110 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO") 120 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END 130 PRINT "Compilation OK!" 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END 3000 DATA P 3010 DATA SEVEN 3020 DATA "TRF256 TRG256 _A BA T< TH10.00070401080502090603 TH10.00000102020304040506 TE LA T>E =P1 TPF0.7 TT518 TT1 TT1224 TT522 TT2560" 3030 DATA "TT514 TT1792 TT-16288 TT-24536 TT1223 TT1221 TT-15802 TT-16352 TT-24534 TT1219 TT-12080 TT1731 TT-24383 TT-12013 TT-20219" 3040 DATA "TT-11933 TT10 TT-28028 TT4355 TT28938 TT549 TT256 TT-9212 TT-28414 TT5638 TT1415 TT647 TT6 TT5635 TT1416 TT4097 TT1223 TT1545" 3050 DATA "TT5863 TT645 TT0 TT4866 TT1414 TT-11259 TT1440 TT-24514 TT648 TT0 TT5080 TT1542 TT-14330 TT-24494 =JZ =I0 LA TGFIX +X48 TPGJX DJ" 3060 DATA "II CIZ !>A IZ =U2 TPU0G P1P" 3070 DATA "" 3080 DATA "" Again, here is the ZIP file with everything in it and the DSK1 forlder to test it under CLASSIC99. OLD DSK1.SEVENA RUN That's it ! Guillaume. seven_mlc.zip Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 23, 2019 Share Posted October 23, 2019 Well... the line between Forth and Assembler is very thin. Notice, I did not write the entire program in Assembler just three instructions. And after I do that a new word is in the Forth language called "7*". So is it Forth? All 100+ primitive operations in the language are exactly the same. I will leave it up to you. Thank you for the Assembler version. I will re-write it in Forth Assembler as an exercise. If I do it correctly, I should be able to use your code for the calculation section and keep the rest in Forth just to see what happens to performance. So MLC is a very slick system. It integrates with BASIC very well. I think more BASIC programmers should take advantage of it for time critical stuff. Something I notice is that it is very much like Forth only it uses PREFIX notation. I was playing with emulating MLC in Forth by redefining Forth. Here is how it might look (NOT FINISHED, NOT TESTED) It would not run as fast as MLC of course, but it would work. But you get an idea of what I mean from the example below. INCLUDE DSK1.DATABYTE \ BFOX library for TI99 BYTE DATA directives : DIMTABLE BUFFER: ; : LET ! ; : STARTDATA HERE ; : ENDDATA CONSTANT ; : PUTABLE SWAP CELLS + ! ; \ I need define variables before usage, ( I think MLC is using array indexed by the character name no?) VARIABLE A VARIABLE B VARIABLE C VARIABLE D ETC... : ; \ ; ; Comment is redefined to ';' :-) ; now we write reverse polish notation MLC 256 DIMTABLE F ; F points to the special 256 bytes buffer 256 DIMTABLE G ; G points to 256 bytes (for returning string) STARTDATA BYTE 0,7,4,1,8,5,2,9,6,3 BYTE 0,0,1,2,2,3,4,4,5,6 ENDDATA E ; E points to this multiplication table 1 P LET ; current power = 1 7 0 F PUTTABLE ; current value is 7 at first byte 0 E LET ; current LEN-1 of number in digits 0 X LET ; clear word for byte operations 0 A LET ; flag for "found" Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 23, 2019 Share Posted October 23, 2019 19 minutes ago, TheBF said: [] Thank you for the Assembler version. I will re-write it in Forth Assembler as an exercise. If I do it correctly, I should be able to use your code for the calculation section and keep the rest in Forth just to see what happens to performance. [] Here is how it might look (NOT FINISHED, NOT TESTED) It would not run as fast as MLC of course, but it would work. But you get an idea of what I mean from the example below. Hi ! I think if you translate from my to your assembler, the speed will be the same. I understand what you mean by emulating MLC ! I remember that when I wrote my FORTH compiler for the Atari ST, I used the same method as you to implement assembler. I turned it into prefixed notation. It's a very funny language! Guillaume. 1 Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 23, 2019 Share Posted October 23, 2019 (edited) Here is my solution in XB , no multiplication, no division, only integers (but the BASIC treat them as float...) I'd be curious to know what the speed is with a Basic Compiler. 100 DIM F(256) 110 FOR I=0 TO 9:: READ D(I),U(I)::NEXT I::P=1::F(0)=7::Z=0::A=0 120 C=0::N=0 130 FOR I=0 TO Z::X=F(I)::Y=U(X)+C::C=D(X)::IF Y>=10 THEN Y=Y-10::C=C+1 140 F(I)=Y::IF Y<>7 THEN N=0::GOTO 160 150 N=N+1::IF N=6 THEN A=1 160 NEXT I 170 IF C THEN Z=Z+1::F(Z)=C 180 P=P+1::IF A=0 THEN GOTO 120 300 PRINT "7 to the power of";P;"is" 310 FOR I=Z TO 0 STEP -1::PRINT CHR$(F(I)+48);::NEXT I 315 CALL SOUND(4000,440,0) 320 END 1000 DATA 0,0,0,7,1,4,2,1,2,8,3,5,4,2,4,9,5,6,6,3 It takes about 11 min 34 sec to find that the solution is 7^175 ! But it works. (I added four seconds of sound at the end to "wake me up"! Then , you just have to subtract 4 seconds to your chrono at the end of the sound.) With Option CPU Overdrive, it takes 1 min 20sec to complete. Guillaume. Edited October 23, 2019 by moulinaie 3 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 23, 2019 Share Posted October 23, 2019 Ah, so you know Forth well. I am "preaching to the choir" as we say in English. Atari ST... I always wanted one of those machines and a copy of Dragon Forth. Les memoires.... ? 1 Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 23, 2019 Share Posted October 23, 2019 47 minutes ago, TheBF said: Ah, so you know Forth well. I am "preaching to the choir" as we say in English. Atari ST... I always wanted one of those machines and a copy of Dragon Forth. Les memoires.... ? In French we say "je prèche des convaincus" (preaching conviced people) ! What's cool with FORTH is that you can extend it so easely. In my version, I wanted to organize memory my way and I didn't work with addresses but I tokenised the words into one single 16 bits word. So I was able to treat 65536 different instructions, looks enough. (including the user words). A text line was read and tokenised giving a serie of 16 bits words (with a pre-interpretation for IF...THEN...DO ..LOOP jumps, etc) Then the tokens were executed one by one. I still use the FORTH on the ST/TT. Guillaume. PS: In this case, you should use "les souvenirs" instead of "les mémoires". Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 23, 2019 Share Posted October 23, 2019 Thank you for the French lesson. I am an "Anglophone" here in Canada and so I don't get to use French very much. I love the language. So that method is commonly called a "byte code" interpreter, however you used integers. There is a system called Open-firmware that uses 255 byte codes for the low level operations. This makes very small programs. I don't remember how they index the user's words. I think they used banks of 255 bytes, but not sure anymore. Open-firmware booted Solaris and Apple Power PC machines and I believe it is used for booting some Linux distros. (not up to date on that) Thanks again for your input here. It's very interesting to see other solutions. Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 29, 2019 Share Posted October 29, 2019 Been dragging my a** for a week with the flu. But I took a run at re-writing this program in a what I consider to be Forth style, which I would define as factoring out the program into easily digestible pieces for easy debugging and also to add some clarity to what can be a hard language to read. This version compiles on Turbo Forth and CAMEL99 Forth. The run times are virtually identical 32.3 / 32.7 with Turbo Forth ahead due to it's big stash of code sitting in scratchpad RAM. But CAMEL99 performs respectably. I removed the double condition that ended the multiplier loop by using two loops that return to the same place. This made a slightly faster multiply step. One of the interesting bits is counting the sevens with no variable and no IF statement. The counter is on the data stack and IF is replaced by the AND operator. Where does the extra time get used up versus Assembler? For reference, writing the SEVENS? routine in Assembler results in 17 second timing. As shown earlier we can speed this up by 3% by replacing the multiply operator with a three instruction routine. I have yet to get the big integer multiplier code working in assembler to compare this method to Guillaume's method. I am missing something simple. ? So much code so little time. \ Sevens problem re-written in a factored Style. \ Turbo Forth compatible version DECIMAL 180 CONSTANT SIZE VARIABLE X VARIABLE POWER VARIABLE LENGTH CREATE A1 SIZE CELLS ALLOT CREATE A2 SIZE CELLS ALLOT CREATE PAD 256 ALLOT \ re-defined for best speed \ Expose memory BLOCKS as indexed arrays : ]A1 ( ndx -- n) CELLS A1 + ; : ]A2 ( ndx -- n) CELLS A2 + ; : A1*7->A2 ( -- ) 0 \ index on stack 0 X ! \ remainder storage BEGIN BEGIN DUP ]A1 @ 7 * X @ + 0 10 UM/MOD X ! OVER ]A2 ! 1+ DUP LENGTH @ > UNTIL X @ WHILE ( "while there is a remainder") REPEAT ( Do it again) LENGTH ! ; \ BIG number converion based on Forth internal method (modified) 48 CONSTANT '0' VARIABLE HP \ "HOLD" pointer. Where to put DIGIT in the string : <# ( -- ) PAD HP ! ; : #> ( -- pad length ) PAD HP @ OVER - ; : HOLD ( char -- ) HP @ C! 1 HP +! ; \ hold digit in pad, bump pointer : DIGIT ( n -- char) '0' + ; : A2>#S ( -- pad length ) 0 LENGTH @ DO I ]A2 @ DIGIT HOLD -1 +LOOP ; \ These are not in Turbo Forth kernel : /STRING ( c-addr1 u1 n -- c-addr2 u2 ) ROT OVER + -ROT - ; : SKIP ( addr len char -- addr' len') >R BEGIN OVER C@ R@ = WHILE 1 /STRING REPEAT R> DROP ; : A2$ ( -- addr len ) <# A2>#S #> '0' SKIP ; : SEVENS? ( -- ?) 0 \ flag/counter on stack LENGTH @ 1- 0 DO 1+ \ bump counter I ]A2 @ 7 = \ test for a '7'. TRUE= -1, FALSE=0 AND \ AND flag with count (replaces IF) DUP 5 = \ is count equal to 5? IF LEAVE THEN \ if so, leave the loop, return flag LOOP ; : INTRO PAGE ." The 5 Sevens Problem" CR CR ." Find the power of 7 with more than" CR ." 5 sequential sevens" CR CR ." Press key to start" KEY DROP ; : EMPTY CELLS 0 FILL ; : INITS ( -- ) A1 SIZE EMPTY A2 SIZE EMPTY 7 A1 ! 2 POWER ! 1 LENGTH ! ; : CALCULATOR A2 A1 BEGIN A1*7->A2 2DUP LENGTH @ CELLS CMOVE \ copy A2->A1 1 POWER +! SEVENS? UNTIL 2DROP ; : RUN INTRO INITS CR CR ." Working..." CALCULATOR CR CR ." The Answer is 7 ^" POWER @ 1- . CR A2$ TYPE ; \ CAMEL99 V2.5 0:32.7 \ TURBO Forth 0:32.3 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 29, 2019 Share Posted October 29, 2019 On 3/25/2011 at 7:56 AM, lucien2 said: Maybe versus Weiand's Forth: Compiled-Basic: 1m40s Weiand-Forth: 3m My first version runs in 3m with TF (5m with Weiand), so the new one must run in ~1m50s (I don't have TF to test it). We can't really compare, since it's not exactly the same algorithm. I'm new to forth, and I'm impressed with the performance against interpreted basic. Running Turboforth with its fast screen I/O the version that works like the Compiled BASIC, using my re-written version ran in 1:28~ Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 29, 2019 Share Posted October 29, 2019 8 hours ago, TheBF said: Been dragging my a** for a week with the flu. But I took a run at re-writing this program in a what I consider to be Forth style, which I would define as factoring out the program into easily digestible pieces for easy debugging and also to add some clarity to what can be a hard language to read. This version compiles on Turbo Forth and CAMEL99 Forth. The run times are virtually identical 32.3 / 32.7 with Turbo Forth ahead due to it's big stash of code sitting in scratchpad RAM. But CAMEL99 performs respectably. I removed the double condition that ended the multiplier loop by using two loops that return to the same place. This made a slightly faster multiply step. One of the interesting bits is counting the sevens with no variable and no IF statement. The counter is on the data stack and IF is replaced by the AND operator. Where does the extra time get used up versus Assembler? For reference, writing the SEVENS? routine in Assembler results in 17 second timing. As shown earlier we can speed this up by 3% by replacing the multiply operator with a three instruction routine. I have yet to get the big integer multiplier code working in assembler to compare this method to Guillaume's method. I am missing something simple. ? So much code so little time. \ Sevens problem re-written in a factored Style. \ Turbo Forth compatible version DECIMAL 180 CONSTANT SIZE VARIABLE X VARIABLE POWER VARIABLE LENGTH CREATE A1 SIZE CELLS ALLOT CREATE A2 SIZE CELLS ALLOT CREATE PAD 256 ALLOT \ re-defined for best speed \ Expose memory BLOCKS as indexed arrays : ]A1 ( ndx -- n) CELLS A1 + ; : ]A2 ( ndx -- n) CELLS A2 + ; : A1*7->A2 ( -- ) 0 \ index on stack 0 X ! \ remainder storage BEGIN BEGIN DUP ]A1 @ 7 * X @ + 0 10 UM/MOD X ! OVER ]A2 ! 1+ DUP LENGTH @ > UNTIL X @ WHILE ( "while there is a remainder") REPEAT ( Do it again) LENGTH ! ; \ BIG number converion based on Forth internal method (modified) 48 CONSTANT '0' VARIABLE HP \ "HOLD" pointer. Where to put DIGIT in the string : <# ( -- ) PAD HP ! ; : #> ( -- pad length ) PAD HP @ OVER - ; : HOLD ( char -- ) HP @ C! 1 HP +! ; \ hold digit in pad, bump pointer : DIGIT ( n -- char) '0' + ; : A2>#S ( -- pad length ) 0 LENGTH @ DO I ]A2 @ DIGIT HOLD -1 +LOOP ; \ These are not in Turbo Forth kernel : /STRING ( c-addr1 u1 n -- c-addr2 u2 ) ROT OVER + -ROT - ; : SKIP ( addr len char -- addr' len') >R BEGIN OVER C@ R@ = WHILE 1 /STRING REPEAT R> DROP ; : A2$ ( -- addr len ) <# A2>#S #> '0' SKIP ; : SEVENS? ( -- ?) 0 \ flag/counter on stack LENGTH @ 1- 0 DO 1+ \ bump counter I ]A2 @ 7 = \ test for a '7'. TRUE= -1, FALSE=0 AND \ AND flag with count (replaces IF) DUP 5 = \ is count equal to 5? IF LEAVE THEN \ if so, leave the loop, return flag LOOP ; : INTRO PAGE ." The 5 Sevens Problem" CR CR ." Find the power of 7 with more than" CR ." 5 sequential sevens" CR CR ." Press key to start" KEY DROP ; : EMPTY CELLS 0 FILL ; : INITS ( -- ) A1 SIZE EMPTY A2 SIZE EMPTY 7 A1 ! 2 POWER ! 1 LENGTH ! ; : CALCULATOR A2 A1 BEGIN A1*7->A2 2DUP LENGTH @ CELLS CMOVE \ copy A2->A1 1 POWER +! SEVENS? UNTIL 2DROP ; : RUN INTRO INITS CR CR ." Working..." CALCULATOR CR CR ." The Answer is 7 ^" POWER @ 1- . CR A2$ TYPE ; \ CAMEL99 V2.5 0:32.7 \ TURBO Forth 0:32.3 Hi, cool version! I like your "SEVENS?" procedure with the AND. Compared to the assembler, sure that this FORTH version is time consuming, for example in the use of ]A1 and ]A2, for every access to the table, the index is added to the base address. If there was something like a pointer to parse the table sequentially, it would be faster. Another idea, is it possible to write two procedures, the one to compute A1*7 into A2 and the second te compute A2*7 into A1. Then you won't have to CMOVE the table at each iteration. Guillaume. 1 Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 29, 2019 Share Posted October 29, 2019 (edited) Hello, I worked again on the MLC program with assembler. By default, MLC uses >A000 as its Workspace Pointer for registers R0-R15. The program modifies temporary this to use >83E0 as the workspace pointer (the zone used by GPL!). This is much faster now ! I run once the program (to get my "SEVEN" subprogram in memory), and then (on real hardware) FOR I=1 TO 30::CALL LINK("SEVEN",N,A$)::NEXT I With the old assembler version it took 60 seconds, so one calculation* was only 60/30 = 2 seconds. With the new version, it takes 42 seconds, so one calculation* is only 42/30 = 1,4 seconds. Waow! Sure, there is no display, but N (=power) and A$ (=result string) are correctly filled. Guillaume. (*) one calculation means to compute every power of seven until 175 to find the correct answer PROGRAM to paste into CLASSIC99 (if you already have MLC on your disk) : 100 CALL CLEAR 110 DIM IO(3) 120 CALL INIT::PRINT "Loading MLC Compiler..."::CALL LOAD("DSK1.NEWFLO")::CALL LINK("NEWFLO") 130 IO(1)=3000::CALL LINK("COMPIL",IO(),S$(),C$())::If IO(1) THEN PRINT "Error ";IO(1)::END 140 PRINT "Compilation OK!" 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END 3000 DATA P 3010 DATA SEVEN 3020 DATA "TRF256 TRG256 TRH22 _A BA T< TH10.00070401080502090603 TH10.00000102020304040506 TE LA T>E =P1 TPF0.7 TM-31776.22H TT736" 3030 DATA "TT-31776 TT518 TT1 TT1224 TT522 TT2560 TT514 TT1792 TT-16288 TT-24536 TT1223 TT1221 TT-15802 TT-16352 TT-24534 TT1219 TT-12080" 3040 DATA "TT1731 TT-24383 TT-12013 TT-20219 TT-11933 TT10 TT-28028 TT4355 TT28938 TT549 TT256 TT-9212 TT-28414 TT5638 TT1415 TT647 TT6" 3050 DATA "TT5635 TT1416 TT4097 TT1223 TT1545 TT5863 TT645 TT0 TT4866 TT1414 TT-11259 TT1440 TT-24514 TT648 TT0 TT5080 TT-14330 TT-24494" 3060 DATA "TT-16352 TT-24534 TT-16288 TT-24532 TT514 TT12288 TT-24506 TT-12048 TT-20286 TT1537 TT-11197 TT1542 TT5882 TT736 TT-24576" 3070 DATA "TMH22-31776 =U2 TPU0G P1P" 3080 DATA "" 3090 DATA "" Source program: ; this program searches for the first power of 7 ; that contains six consecutive "7" 100 CALL CLEAR $MLC F 110 10 3000 300 CALL LINK("SEVEN",N,A$) 310 PRINT "7 TO THE POWER OF ";N; " IS" 320 PRINT A$ 330 END $EQU \gplreg PADE0 ; >83E0 \mlcreg &HA000 $$ $SEVEN DIMTABLE F 256 ; F points to a 256 bytes buffer DIMTABLE G 256 ; G points to 256 bytes (for returning string) DIMTABLE H 22 ; to save GPL registers R0-R10 (11*2) STARTDATA BYTES 0,7,4,1,8,5,2,9,6,3 BYTES 0,0,1,2,2,3,4,4,5,6 ENDDATA E ; E points to this multiplication table LET P 1 ; current power = 1 PUTTABLE F(0) 7 ; current value is 7 at first byte ; R0 = pointer on F table ; R1 = pointer to E table ; R2 = 7 ; R3 = one digit and pointer into E ; R4 = one digit after multiplication ; R5 = carry ; R6 = number of digits-1 into F table ; R7 = number of consecutive "7" currently found ; R8 = flag for FOUND ; R9 = loop counter on digits while mul*7 ; R10 = 10 BMOVE \gplreg 22 H ; save GPL registers R0-R10 into H $[ LWPI \glpreg ; use fast registers ! LI R6,1 ; current LEN in digits in F table CLR R8 ; flag for "found" LI R10,>A00 ; R10 = byte 10 LI R2,>700 ; R2 = byte 7 MOV @E,R1 ; mutliplication table E MAIN_LOOP: CLR R7 CLR R5 MOV R6,R9 MOV @F,R0 MULTIPLY_LOOP: CLR R3 ; for byte operation MOVB *R0,R3 ; new digit SWPB R3 ; to word A R1,R3 ; E+digit -> points to the units of table*7 MOVB *R3,R4 ; get units AB R5,R4 ; plus carry MOVB @10(R3),R5 ; E+digit+10 -> points to tenths of table*7, new carry CB R4,R10 ; more than a digit? JLT AA: SB R10,R4 ; if so, reports ten... AI R5,>100 ; ...on the carry (byte) AA: MOVB R4,*R0+ ; store new digit CB R2,R4 ; is it a "7" ? JNE BB: INC R7 ; if so, one more found CI R7,6 ; six "7" ? JNE CC: INC R8 ; yes!!! end of search ! JMP CC: BB: CLR R7 ; not a "7"... reset counter CC: DEC R9 ; loop counter JNE MULTIPLY_LOoP: ; no, return to DEBUT CI R5,0 ; a carry remains? JEQ DD: INC R6 ; yes! One more digit MOVB R5,*R0 ; and store it DD: INC @P ; power+1 CI R8,0 ; flag for found JEQ MAIN_LOOP: ; still zero... back to MAIN MOV R6,@Z ; else, update number of digits MOV @F,R0 ; source F ascending MOV @G,R1 ; dest G descending LI R2,&H3000 ; "0" in high byte A R6,R1 ; R1 points to the end of STRING G EE: MOVB *R0+,R3 ; one digit from F AB R2,R3 ; to ASCII DEC R1 MOVB R3,*R1 ; character in G string DEC R6 JNE EE: lwpi \mlcreg ; back to my registers $] BMOVE H 22 \gplreg ; restore GPL values LET U 2 PUTTABLE U 0 G ; set A$ to G string with Z characters PUTPARAM 1 P ; set N to the current power $$ $END seven_mlc.zip Edited October 29, 2019 by moulinaie 1 Quote Link to comment Share on other sites More sharing options...
+TheBF Posted October 29, 2019 Share Posted October 29, 2019 1 hour ago, moulinaie said: Hi, cool version! I like your "SEVENS?" procedure with the AND. Compared to the assembler, sure that this FORTH version is time consuming, for example in the use of ]A1 and ]A2, for every access to the table, the index is added to the base address. If there was something like a pointer to parse the table sequentially, it would be faster. Another idea, is it possible to write two procedures, the one to compute A1*7 into A2 and the second te compute A2*7 into A1. Then you won't have to CMOVE the table at each iteration. Guillaume. If there isn't "something like a pointer" it's my fault. It's Forth. I have been thinking that as well. Managing two pointers on the stack will be slower ( I think) because I have to swap them back and forth to update them but I will give it a try. But I could make a couple of variables to hold the pointers and some auto incrementing fetch and store operators... that might work. The CMOVE is not a big slowdown compared to the computation. To test it I made MOVE16 word in Assembler that is 2X faster (moves 16bit words) and the speedup was very little. Of course each little part that is re-written in Forth Assembler makes a difference as we saw with your suggestion for 7*. Typically that is how you use these indirect threaded Forth systems. Write the program, find the bottlenecks and re-code the little pieces in Assembler. I really want to get the Assembler version of A1*7->A2 working as CODE to get a sense of what happens. I need to dig in to find my error. My last resort is to get my Native99 Forth compiler working well enough to compile the program like you are doing with MLC and see what happens. That will take a little more time since I left that project a while back and it will take me some time to re-familiarize myself and add what extensions I need to this program. Using the scratchpad RAM really makes a difference as you have noted. It is so sad we don't have more RAM on the 16bit buss. Quote Link to comment Share on other sites More sharing options...
apersson850 Posted October 29, 2019 Share Posted October 29, 2019 I really like the 64 KBytes internal 16-bit wide RAM I put into my console. It's relieving to not have to worry where code, and workspace, is located, as all is equally fast. 1 Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 29, 2019 Share Posted October 29, 2019 (edited) 8 minutes ago, TheBF said: Using the scratchpad RAM really makes a difference as you have noted. It is so sad we don't have more RAM on the 16bit buss. I think that most of you know this page, but in case... This is how to add the 32k on the 16 bits bus. http://www.mainbyte.com/ti99/32K16/32k16.html 3 minutes ago, apersson850 said: I really like the 64 KBytes internal 16-bit wide RAM I put into my console. It's relieving to not have to worry where code, and workspace, is located, as all is equally fast. How did you proceed? Guillaume. Edited October 29, 2019 by moulinaie Quote Link to comment Share on other sites More sharing options...
apersson850 Posted October 29, 2019 Share Posted October 29, 2019 Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible. So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control. Quote Link to comment Share on other sites More sharing options...
moulinaie Posted October 29, 2019 Share Posted October 29, 2019 1 minute ago, apersson850 said: Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible. So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control. That's impressive ! Is it a difficult modification of the console? Have you got some pictures? Guillaume. Quote Link to comment Share on other sites More sharing options...
apersson850 Posted October 29, 2019 Share Posted October 29, 2019 (edited) I didn't have any easy way of making a PCB at that time, so everything is piggy-backed on exisiting circuits. Picture of computer's board Wiring diagram Block schematics Slight modification to the design took place after the diagram was done. But this is way off topic in this thread. Edited October 29, 2019 by apersson850 1 Quote Link to comment Share on other sites More sharing options...
+RXB Posted October 29, 2019 Share Posted October 29, 2019 5 hours ago, apersson850 said: I didn't have any easy way of making a PCB at that time, so everything is piggy-backed on exisiting circuits. Picture of computer's board Wiring diagram Block schematics Slight modification to the design took place after the diagram was done. But this is way off topic in this thread. I will just stick with the SAMS 1 Meg card or the newer 4 Meg card. 1 Quote Link to comment Share on other sites More sharing options...
apersson850 Posted October 29, 2019 Share Posted October 29, 2019 Yes, much more memory. But not 16 bit wide and not designed in the 1980's, so it's a bit different. 1 Quote Link to comment Share on other sites More sharing options...
+9640News Posted October 29, 2019 Share Posted October 29, 2019 7 hours ago, apersson850 said: Well, I added 64 KBytes of static RAM, covering all the address range of the TMS 9900. By default, only 2000H-3FFFH and A000H-FFFFH is enabled. By using CRU bits (base address 400H), I can enable the remaining RAM, 8 KBytes at a time. I can also disable the RAM at the normal memory expansion addresses, which will allow a standard memory expansion, like a card in the PEB, to become visible. So it's possible to get full 64 K contiguous RAM, as well as use either the fast internal 32 K RAM or the standard speed 32 K RAM, as you like. This gives a total of 96 K RAM available, under software control. If something like what you described could be implemented on a card that would fit in the PEBox rather than modifying a console, you would have some pretty powerful features at your fingertips programmers could easily use. With the Geneve Operating system source code as a base, the Math, Video, and Keyboard XOP's could be tweaked as is. Probably wouldn't be much of a hurdle to even implement the memory management routines as well. With just the 64K, you would really just be capped at the video and keyboard routines which may work out well for the F18A MK2. Now, back to the topic of this thread............. 1 Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.