jschultzpedersen (author), posted June 7, 2023:

Hi again

While rewriting some parts of the code for my scroll test in assembler, I ran into a couple of issues.

For one thing, I need to write values between RAM and VRAM. The TF assembler manual has an example in section 8.2. This example clashes with information gleaned from the Editor/Assembler manual. The EA manual states that when you use the VDPWA/VDPWD addresses (VDPA/VDPW in TF), you need to set the two most significant bits of the address to 01. In other words, if you wish to start writing at address 0 in VRAM, you must write the address as $4000 + 0 = $4000, as in

    R0 $4000 LI,

and not just write 0, as in

    R0 CLR,

Doing the latter has the side effect of you being unable to write to address 0, because there is an autoincrement effect on this address that kicks in immediately.

The other thing may be a feature or a bug. I am not sure. Anyway, here it is...

    ASM: TEST ( -- )
      R1 XMAX LI,
      SP DECT,
      R1 *SP MOV,
    ;ASM

If I boot the system and run this, it provides the correct answer for the width of the screen. So if I start in 2 GMODE, it responds with 80 on the stack. However, if I then change to 1 GMODE or 0 GMODE and run TEST again, it still responds 80 and not 32 or 40. If I reboot and immediately change GMODE, it responds with the correct answer in the new mode, but again, it does not change if I change GMODE later. This is not an issue in regular TF code.

Here is the code I used to test the example from the manual. I have modified it to adapt to different screen modes as best I could, limited by the behaviour of XMAX, and modified the start value to $4000 as explained above. I also decided to move the stack value to a register, which should make the code a little bit faster by using register addressing instead of indirect addressing for the many repetitions of that particular code line.

PS. Remember that MPY returns its result as a 32-bit value in the current and next register, and it is right-justified. So the result of multiplying the screen width by the number of lines (SMAX) ends up in R1 and R2, with the useful part in R2. So R2 is the counter register.

    $8C02 CONSTANT VDPA  $8C00 CONSTANT VDPW
    VARIABLE SMAX  24 SMAX !

    ASM: WIPE ( ASCII -- )
      R1 XMAX LI,  SMAX @@ R1 MPY,
      R0 $4000 LI,
      *SP R7 MOV,  R7 SWPB,  SP INCT,
      BEGIN,
        R0 SWPB,  R0 VDPA @@ MOVB,
        R0 SWPB,  R0 VDPA @@ MOVB,
        R7 VDPW @@ MOVB,
        R0 INC,
        R2 DEC,
      EQ UNTIL,
    ;ASM

regards
Jesper
Lee Stewart, posted June 7, 2023:

3 hours ago, jschultzpedersen said: "For one thing I need to write values between RAM and VRAM. The TF assembler manual has an example in section 8.2. This example clashes with information gleaned from the Editor/Assembler manual. The EA manual states, that when you use the VDPWA/VDPWD addresses, you need to set the two most significant bits to 01. [...]"

Yeah, @Willsy needs to change that example. It should be something like this, which takes advantage of the auto-incrementing feature of writes to the VDP Write Data register:

    $8C02 CONSTANT VDPA      \ address of VDP address register
    $8C00 CONSTANT VDPW      \ address of VDP write register
    ASM: WIPE ( ascii -- )   \ create assembly language word "WIPE"
       R0 $0040 LI,          \ swapped screen address 0 with write-data flag
       R0 VDPA @@ MOVB,      \ write low byte to address register
       R0 SWPB,              \ get address high byte, with write-data flag
       R0 VDPA @@ MOVB,      \ write high byte to address register
       R2 960 LI,            \ counter
       *SP SWPB,             \ get ASCII value on stack in high byte
       BEGIN,                \ begin a loop (VDP address auto-increments)
          *SP VDPW @@ MOVB,  \ write to VDP write-data register
          R2 DEC,            \ decrement counter
       EQ UNTIL,             \ repeat loop if R2 is not 0
       SP INCT,              \ remove ASCII value from the stack (stack grows downward)
    ;ASM                     \ end definition of assembly language word

...lee
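The address rule discussed above can be checked from the console before committing it to assembler. The following is a minimal sketch, not TurboForth's own code: VWADDR is a hypothetical helper name, and it only computes the 16-bit value that has to reach the VDP address port for a write (14-bit VRAM address, top two bits forced to 01). The low byte of that value is sent first, then the high byte, which is why the pre-swapped literal $0040 stands in for $4000 in the WIPE example.

```forth
\ Sketch only: VWADDR is a hypothetical name, not a TurboForth word.
\ Keep the 14-bit VRAM address and force the top two bits to 01 (write).
: VWADDR ( vaddr -- cmd )  $3FFF AND  $4000 OR ;

0    VWADDR .   \ cmd for VRAM address 0 is $4000
$300 VWADDR .   \ cmd for VRAM address $0300 is $4300
```

Masking with $3FFF first means a stray high bit in the caller's address cannot turn the write setup into something else.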
TheBF, posted June 7, 2023:

Lee got here first, but here is another little tidbit of info. It is simpler to test these things from the console if you take arguments from the data stack. This is easy with the *SP+ indirect auto-increment on the SP register.

Here is an example word that I wrote up, but it is not tested.

    $8C02 CONSTANT VDPA
    $8C00 CONSTANT VDPW
    VARIABLE SMAX  24 SMAX !

    ASM: VFILL ( ascii Vaddr bytes -- )
      \ get the arguments from the data stack
      *SP+ R2 MOV,      \ pop loop counter
      *SP+ R0 MOV,      \ pop VDP address
      *SP+ R7 MOV,      \ pop ascii char
      R7 SWPB,          \ fix byte order
      R0 $4000 ORI,     \ set write bit on Vaddr

      \ set the VDP address once, 1st write goes to that address and VDP auto-increments
      R0 SWPB,  R0 VDPA @@ MOVB,
      R0 SWPB,  R0 VDPA @@ MOVB,

      \ loop only needs to count bytes, VDP chip does the rest
      BEGIN,
        R7 VDPW @@ MOVB,
        R2 DEC,
      EQ UNTIL,
    ;ASM
Lee Stewart, posted June 7, 2023:

4 hours ago, jschultzpedersen said: "The other thing may be a feature or a bug. I am not sure. Anyway here it is...

    ASM: TEST ( -- )
      R1 XMAX LI,
      SP DECT,
      R1 *SP MOV,
    ;ASM

[...] However, if I then change to 1 GMODE or 0 GMODE and run the TEST again, it still responds 80 and not 32 or 40. [...] This is not an issue in regular TF code."

There is no bug. The problem with TEST is that XMAX is a constant that executes only at the time TEST is compiled. Its value at that moment is assembled into the LI instruction as an immediate operand, so every time you invoke TEST, the value in R1 will be the same, regardless of changes to the 'constant' that occur after TEST was compiled.

The solution is to either pass XMAX to TEST on the stack or get it from its actual location, $A02C. Though using $A02C is likely safe enough for the foreseeable future, future revisions of TurboForth could change it.

...lee
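The first of the two solutions above (passing the value in on the stack) might look like the following. This is a sketch under stated assumptions: it relies on TheBF's untested VFILL ( ascii Vaddr bytes -- ) from the earlier post being loaded, it assumes the screen image starts at VRAM address 0 as in the WIPE examples, and SCREEN-CELLS and WIPE2 are hypothetical names.

```forth
\ Sketch only: SCREEN-CELLS and WIPE2 are hypothetical names.
\ Assumes VFILL ( ascii vaddr bytes -- ) from the earlier post is loaded.
: SCREEN-CELLS ( -- n )  XMAX 24 * ;         \ XMAX runs each time this word runs
: WIPE2 ( ascii -- )  0 SCREEN-CELLS VFILL ; \ fill from VRAM address 0

2 GMODE  42 WIPE2     \ uses the width XMAX reports in this mode
1 GMODE  42 WIPE2     \ uses the new width after the mode change
```

Because XMAX sits inside a colon definition, it is executed at run time rather than frozen into an immediate operand, so the assembler word itself never needs to know the screen width.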
Lee Stewart, posted June 7, 2023:

4 hours ago, Lee Stewart said: "The solution is to either pass XMAX to TEST on the stack or get it from its actual location, $A02C. Though using $A02C is likely safe enough for the foreseeable future, future revisions of TurboForth could change it."

One way to ensure you get the correct storage location for the screen's column width ( XMAX ) is to make use of some readily available carnal knowledge of TurboForth. The following code snippet is the listing I have of the definition of XMAX from a slightly older revision of TurboForth:

             Machine
    Address  Code     Assembly Source
    -------  -------  -------------------------------------------------------
                      ; XMAX ( -- xmax ) (constant)
                      ; places horizontal screen size (32/40/80) on stack
    771C     770A     xmaxh   data ioerrh,4
    771E     0004
    7720     584D             text 'XMAX'
    7722     4158
    7724     7726     gxmax   data $+2       ; <----Code Field (CFA points here)
    7726     C1A0             mov  @xmax,r6
    7728     A02C                            ; xmax address = CFA + 4
    772A     10D4             jmp  dovar

As you can see, the value A02C is located 4 bytes after the Code Field in the definition of XMAX. Since the ALC for this word is likely never to change (though its actual ROM location may), it should be safe enough to 'tick' the word to get its CFA and add 4 to get the ROM address of the cell that holds the address of the screen's column width (A02C, in this case). You can also add 2 to the fetched address to get the storage location of the screen's row height (always 24), which immediately follows it (currently A02E):

    ' XMAX 4 + @    \ current value: $A02C
    DUP 2+          \ current value: $A02E

Your code can now be

    $8C02 CONSTANT VDPA
    $8C00 CONSTANT VDPW
    ASM: WIPE ( ASCII -- )
       ' XMAX 4 + @         \ get address of xmax (probably $A02C)
       DUP @@ R1 MOV,       \ copy xmax to R1
       2+ @@ R1 MPY,        \ multiply xmax by ymax (probably at $A02E)
       R0 $0040 LI,         \ saves one SWPB,
       R0 VDPA @@ MOVB,
       R0 SWPB,
       R0 VDPA @@ MOVB,
       *SP+ R7 MOV,         \ pop ASCII value off stack
       R7 SWPB,
       BEGIN,
          R7 VDPW @@ MOVB,  \ copy ASCII value to next VRAM address
          R2 DEC,
       EQ UNTIL,
    ;ASM

...lee
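If several words need the same storage locations, the tick-and-offset lookup described above could be done once at load time and given names. This is only a sketch: XMAX-ADDR, YMAX-ADDR, COLS and ROWS are hypothetical names, and it assumes, as the listing suggests, that the cell 4 bytes past the CFA of XMAX holds the address of the column-width variable in the TurboForth revision being used.

```forth
\ Sketch only: all four names below are hypothetical.
\ Assumes the layout of XMAX matches the listing in the post above.
' XMAX 4 + @  CONSTANT XMAX-ADDR    \ $A02C in the revision listed above
XMAX-ADDR 2+  CONSTANT YMAX-ADDR    \ row count is stored in the next cell

: COLS ( -- n )  XMAX-ADDR @ ;      \ read at run time, so it follows GMODE
: ROWS ( -- n )  YMAX-ADDR @ ;
```

Inside an ASM: definition the constants could then presumably be used as symbolic operands in the same way SMAX was used earlier in the thread, for example XMAX-ADDR @@ R1 MOV, followed by YMAX-ADDR @@ R1 MPY, .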
jschultzpedersen (author), posted June 8, 2023:

Hi

Thanks! I think this latest piece of code will go into my little book of 'clever code for future use'. It answers several questions in one neat package 😃

regards
Jesper
jschultzpedersen (author), posted June 9, 2023:

Hi again

Just a supplementary question... What is the advantage of entering machine code via ASM over entering it via CODE (apart from avoiding the hassle of getting the actual hex codes for the instructions)? Do they not provide the same end result? Like in...

    HEX
    CODE: PLUS A534 ;CODE
    DECIMAL

or

    ASM: PLUS ( arg1 arg2 -- result )
      R4 *+ R4 ** A,
    ;ASM

regards
Jesper
TheBF, posted June 9, 2023:

50 minutes ago, jschultzpedersen said: "What is the advantage of entering MC code via ASM or CODE (apart from the hassle of getting the actual hex codes for the instructions). Do they not provide the same end result? [...]"

If you have memorized the entire instruction set in numerical form, there is no advantage at all. Charles Moore, who invented Forth, was very fond of writing programs in machine code for his CPUs. They only had 21 instructions, so it was simple to do.

In fact, on these small systems, after using the assembler to make and debug a word, most of the time we translate it to machine code in the source code so the Assembler is not needed when we compile the code. Mark has made a very nice ASM2CODE (or some such name) word that takes data from memory and saves it as a machine code word.
Lee Stewart, posted June 9, 2023:

2 hours ago, jschultzpedersen said: "ASM: PLUS ( arg1 arg2 -- result ) R4 *+ R4 ** A, ;ASM"

A little clearer and, certainly, easier to read, would be

    ASM: PLUS ( arg1 arg2 -- result )
      *SP+ *SP A,
    ;ASM

For convenience, TurboForth has the Forth pointer-register (IP, SP, RP, W, NXT) operands as single words instead of requiring the two words necessary for the other registers. For example,

                                          Reg + Mode Word   Reg + Mode Word
    Addressing Mode           Enhanced    (Wycove Forth)    (TI Forth)
    -----------------------   --------    ---------------   ---------------
    Indirect                  *SP         SP **             SP *?
    Indirect auto-increment   *SP+        SP *+             SP *?+
    Indexed                   @(SP)       SP ()             SP @(?)

...lee
Lee Stewart, posted June 9, 2023:

2 hours ago, TheBF said: "Mark has made a very nice ASM2CODE (or some such name) word that takes data from memory at saves it as a machine code word."

Yeah, the word is ASM>CODE and can be loaded from the UTILS disk. The syntax is

    ASM>CODE <name> <file>

<name> must be a word defined with ASM: ... ;ASM . A most convenient method is to use the Windows clipboard (CLIP) for <file> in Classic99:

    ASM>CODE PLUS CLIP

You can then paste it into a block or wherever.

...lee
jschultzpedersen (author), posted June 9, 2023:

Hi

Aha. I was thinking of using ASM to generate the MC code, since either method seems to generate the same MC code, and then get the hex codes for CODE via the DUMP utility. But this is easier. Thanks.

And don't worry - I have no intentions of learning all the codes for the 10 different formats used by MC code. 😃 I just wanted to understand the 'inner workings' of how instructions are implemented.

If there is no advantage to using CODE, I'll probably stick with ASM. Using CODE may save some bytes in the source code if not in the compiled code, but three months from now I would probably have forgotten what the hex code was supposed to do and how, unless I commented it lavishly. And once I have used BSAVE / BLOAD to store the actual MC code, the question of whether CODE or ASM compiles faster means nothing.

regards
Jesper
Lee Stewart, posted June 9, 2023:

23 minutes ago, jschultzpedersen said: "[...] If there is no advantage to using CODE, I'll probably stick with ASM. Using CODE may save some bytes in the source code if not in the compiled code, but three months from now I would probably have forgotten what the hex code was supposed to do and how, unless I commented it lavishly. [...]"

As Brian ( @TheBF ) indicated, the singular reason for using CODE: ... ;CODE is to avoid the 2784-byte overhead of loading the Forth TMS9900 Assembler. You definitely need the Assembler code for source code clarity, but the machine code for production, particularly for sizeable projects.

...lee
Willsy, posted June 12, 2023:

On 6/9/2023 at 8:02 PM, jschultzpedersen said: "If there is no advantage to using CODE, I'll probably stick with ASM. Using CODE may save some bytes in the source code if not in the compiled code, but three months from now I would probably have forgotten what the hex code was supposed to do and how, unless I commented it lavishly. And once I have used BSAVE / BLOAD to store the actual MC code, the question of whether CODE or ASM compiles faster, means nothing."

Yeah, that's pretty much it. And, as Brian said, the reason for CODE is to allow the fast loading of code words, without needing to load the assembler. If you're storing your application in a blocks file then CODE is your friend. You would build the application using the assembler, then, when finished, convert those ASM words to CODE words and incorporate them into your source code. It's very useful when using Classic99 with the CLIP device. You can copy them into a text file on your host PC.

Latterly, I've been building apps using text files rather than blocks. Blocks are fine until you want to insert a line halfway through your application! (Though there are some utilities to help move blocks around your disk. Type FILES when booted from the TF boot disk.) I write the code as a text file on the PC, save it (as a .txt file) in the Classic99 DSK1 directory, then use the TF Text File Interpreter (block 44 of the utils disk - at least on my copy!) to load those text files directly into TF.

Here's a video of me using it while working on a silly game I'm developing when the mood takes me. It's deliberately designed to look like a Sinclair ZX81 (which had no colour, no sound, and no graphics to speak of). Hence I add the grey background, which is the only colour the ZX81 has! Luckily the audio was disabled when recording, so you can't hear me cursing with each dumb ass mistake!

By the way, you can turn line display off in the text interpreter: FALSE TO SHOW should do it. It loads much faster. You can even turn it off in the text file itself, and then turn it on (TRUE TO SHOW) in the text file just before a word that you've modified. Then you can see it load, and turn the line display off again:

    : someWord ( -- ) blah blah blah ;
    true to show    \ we want to see the next word being loaded
    : anotherWord ( -- ) even more blah ;
    false to show   \ okay, as you were

(Video attachment: Untitled 24.mp4)
TheBF, posted June 12, 2023:

Minor point of order, Mark.

    s" myfile.txt" included

would make it standard Forth, if that is of any interest to you. Then you can define:

    : INCLUDE   BL PARSE INCLUDED ;
    \ OR
    : INCLUDE   BL WORD COUNT INCLUDED ;

(It was nice to see colour spelled in a civilized manner.)
Willsy, posted June 12, 2023:

Ooh nice! That works like a charm!

Regarding the colour/color debacle, I literally asked myself the other day "how do the Canadians spell colour?" and you've just answered it!
TheBF, posted June 12, 2023:

2 minutes ago, Willsy said: ""how do the Canadians spell colour?" and you've just answered it!"

Answer: The way G_d intended. 🤣
Willsy, posted June 12, 2023:

Another quick one, for the benefit of anyone interested. You can use BLK>FILE to copy a block, well... to a file. Again, in Classic99 (in Linux or Windows) it's very handy to export a block to a text editor for some deft editing:

(Video attachment: Untitled 25.mp4)
TheBF, posted June 12, 2023:

Can you post the source for that one, Mark?

I have struggled to make a small text file editor, and they are always way bigger than a block editor. A block-to-file converter lets people edit and test blocks of code interactively but save library code as files. I think for this little machine it's a good compromise.

The challenge, I guess, is keeping the text files formatted so they can easily come back into blocks for editing, but that's not too much trouble.
Willsy, posted June 13, 2023:

Sure. First, the code:

    --BLOCK-00020---------
    CR 16 CLOAD file-type TRUE VALUE cswtch
    .( BLK>FILE - dumps a range of blocks to a text file.)
    .( e.g. 1 21 BLK>FILE DSK2.BLKDUMP)
    : SZ ZEROS ! ;
    : HDR cswtch NOT DUP TO cswtch IF ." On" ELSE ." Off" THEN CR ;
    : BAR S" --BLOCK---------------" 2DUP >R >R DROP 8 + SWAP -1 SZ
    N>S 0 SZ ROT SWAP CMOVE S" " file-out #PUT DROP R> R> file-out
    #PUT ; .( HDR toggles headers on and off.)
    : BLK>FILE ( start end -- ) DEPTH 2 < NOT IF file-type DROP
    out-spec CMOVE [ CHAR O LITERAL ] out-spec + 1- C! set-out-file
    file-out #OPEN ABORT" Can't open output file" 1+ SWAP DO
    I cswtch IF BAR THEN DROP I BLOCK 16 0 DO DUP HERE 64 VMBR
    HERE 64 -TRAILING DUP IF file-out #PUT ELSE 2DROP S" " file-out
    #PUT THEN DROP 64 + LOOP DROP LOOP file-out #CLOSE ELSE
    TRUE ABORT" Syntax error" THEN ;

That's directly from the disk block (block 20 on the TF boot disk). Here's a tidier version (note: this is old code, and is likely really terrible):

    CR 16 CLOAD file-type
    TRUE VALUE cswtch
    .( BLK>FILE - dumps a range of blocks to a text file.)
    .( e.g. 1 21 BLK>FILE DSK2.BLKDUMP)
    .( HDR toggles headers on and off.)

    : SZ ZEROS ! ;

    : HDR
      cswtch NOT DUP TO cswtch
      IF ." On" ELSE ." Off" THEN CR ;

    : BAR
      S" --BLOCK---------------" 2DUP >R >R
      DROP 8 + SWAP -1 SZ N>S 0 SZ ROT SWAP CMOVE
      S" " file-out #PUT DROP
      R> R> file-out #PUT ;

    : BLK>FILE ( start end -- )
      DEPTH 2 < NOT IF
        file-type DROP out-spec CMOVE
        [ CHAR O LITERAL ] out-spec + 1- C!
        set-out-file
        file-out #OPEN ABORT" Can't open output file"
        1+ SWAP DO
          I cswtch IF BAR THEN DROP
          I BLOCK 16 0 DO
            DUP HERE 64 VMBR
            HERE 64 -TRAILING DUP IF
              file-out #PUT
            ELSE
              2DROP S" " file-out #PUT
            THEN
            DROP 64 +
          LOOP DROP
        LOOP
        file-out #CLOSE
      ELSE
        TRUE ABORT" Syntax error"
      THEN ;

It relies on common file routines on block 16:

    --BLOCK-00016---------
    \ Common routines for file utilities
    : file-type S" DV080SI" ;
    FBUF: thefile
    FBUF: file-out
    : >ftype file-type SWAP 1+ SWAP DROP SWAP CMOVE ;
    : rec-len BL WORD NUMBER 0> ABORT" Invalid record length"
      file-type DROP 3 + DUP 3 + SWAP DO BL I C! LOOP
      N>S file-type DROP 3 + SWAP CMOVE ;
    : D/V S" DV" >ftype rec-len ;
    : D/F S" DF" >ftype rec-len ;
    : I/F S" LF" >ftype rec-len ;
    : I/V S" LV" >ftype rec-len ;
    : open-file HERE COUNT thefile FILE
      thefile #OPEN ABORT" Can't open input file" ;
    : out-spec S" DV080SO" ;
    : get-filename BL WORD DUP -ROT HERE 1+ SWAP CMOVE
      DUP HERE 1+ + file-type ROT SWAP CMOVE 8 + HERE C! ;
    : set-out-file BL WORD DUP -ROT HERE 1+ SWAP CMOVE
      DUP HERE 1+ + out-spec ROT SWAP CMOVE 8 + HERE C!
      HERE COUNT file-out FILE ;
    : FT? ." Configured for" file-type 2- TYPE CR ;

The common file routines above allow the file utilities to work on different types of files - DV80, DF80 etc.
jschultzpedersen (author), posted June 21, 2023:

Hi again

I am trying to push the speed of my scroll routine to make it less than 1 VBL in duration. I am not quite there. But it made me think... I tried to make my own version of VMBW using the TurboForth workspace. My routine works like this:

    $8C00 CONSTANT VDPW
    $8C02 CONSTANT VDPA
    ASM: RAM->VRAM ( VRAM RAM LEN -- )
      *SP+ R2 MOV,  *SP+ R1 MOV,  *SP+ R0 MOV,
      R0 $4000 AI,
      R0 SWPB,  R0 VDPA @@ MOVB,
      R0 SWPB,  R0 VDPA @@ MOVB,
      BEGIN,
        R1 ** R7 MOV,
        R7 VDPW @@ MOVB,
        R7 SWPB,
        R7 VDPW @@ MOVB,
        R1 INCT,
        R2 DEC,
      EQ UNTIL,
    ;ASM

I assume the code behind VMBW is similar, if not exactly the same. And yet, my code runs about 30% slower. Is this because the TurboForth code is in 16-bit ROM space while my code is in 8-bit RAM, or is my code just inefficient? If the cause is the 16 vs. 8 bit code location, I will not spend more time trying to 'outrun' the TurboForth code 😀

regards
Jesper
Lee Stewart, posted June 21, 2023:

1 hour ago, jschultzpedersen said: "[...] I assume the code behind VMBW is similar, if not exactly the same. And yet, my code runs about 30% slower. Is this because the TurboForth code is in 16 bit ROM space while my code is in 8 bit RAM, or is my code just inefficient? [...]"

Your VMBW write loop is overly complicated. It should be

    BEGIN,
      R1 *+ VDPW @@ MOVB,
      R2 DEC,
    EQ UNTIL,

As to TurboForth's code for VMBW, it is on the 8-bit bus (executing from cartridge ROM) but differs in one respect, which makes it a bit faster: it puts VDPW into a register:

    R15 VDPW LI,
    BEGIN,
      R1 *+ R15 ** MOVB,
      R2 DEC,
    EQ UNTIL,

...lee
TheBF, posted June 21, 2023:

46 minutes ago, Lee Stewart said: "[...] As to TurboForth's code for VMBW, it is on the 8-bit bus (executing from cartridge ROM) but differs in one respect, which makes it a bit faster: it puts VDPW into a register [...]"

For Jesper's study: my tests have shown that keeping the VDP port address in a register buys you about 12% faster loops for VDP reads/writes versus symbolic addressing inside the loop.
TheBF, posted June 21, 2023:

Another thing that the game developers do around here is multiple writes inside the loop, to reduce the number of jumps. The simplest one is to write two bytes and decrement the loop counter by two. That should give you a version that is faster than the stock version, as long as you are writing an even number of bytes.

I have seen some examples of 8-byte writes inside the loop. I think using the AI, instruction gives you a way to decrement by 8 with good speed.

    R15 VDPW LI,
    BEGIN,
      R1 *+ R15 ** MOVB,
      R1 *+ R15 ** MOVB,
      R2 DECT,
    EQ UNTIL,
jschultzpedersen (author), posted June 21, 2023:

Hi again

I listen and learn.... Thanks!

    $8C00 CONSTANT VDPW
    $8C02 CONSTANT VDPA
    ASM: RAM->VRAM ( VRAM RAM LEN -- )
      *SP+ R2 MOV,  *SP+ R1 MOV,  *SP+ R0 MOV,
      R0 $4000 AI,
      R0 SWPB,  R0 VDPA @@ MOVB,
      R0 SWPB,  R0 VDPA @@ MOVB,
      R15 VDPW LI,
      BEGIN,
        R1 *+ R15 ** MOVB,
        R1 *+ R15 ** MOVB,
        R2 DECT,
      EQ UNTIL,
    ;ASM
    -->

    VARIABLE MYDATA
    : TEST
      HERE MYDATA !  10 ALLOT
      10 0 DO  65 I + I MYDATA @ + C!  LOOP
      30000 0 DO
        ( 0 MYDATA @ 10 VMBW )
        0 MYDATA @ 10 RAM->VRAM
      LOOP ;

With your improvements implemented, the RAM->VRAM word is apparently faster than the standard VMBW word. The test program ran at about 15 seconds vs. 19 seconds. Both versions write ten characters from RAM to the top left corner of the display 30000 times.

I think I can implement these methods in a couple of other places in my code and hopefully get similar speed increases. If it all works out, I will probably do an article for the TI*MES magazine, published by the English user club. It would be the natural conclusion to some earlier articles I wrote about scrolling in other languages for the TI99.

regards
Jesper
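One caveat with the two-bytes-per-pass loop in the post above: DECT subtracts 2 and EQ UNTIL only exits when the counter is exactly zero, so an odd length never terminates the loop, and a length of 0 wraps around and runs 32768 passes. A guard in high-level Forth avoids both without touching the fast loop. This is a sketch only: RAM->VRAM2 is a hypothetical wrapper name, and it assumes the RAM->VRAM word from the post above is already loaded.

```forth
\ Sketch only: RAM->VRAM2 is a hypothetical wrapper name.
\ Assumes RAM->VRAM ( vram ram len -- ) from the post above is loaded.
: RAM->VRAM2 ( vram ram len -- )
  DUP 1 AND ABORT" RAM->VRAM needs an even length"
  DUP IF
    RAM->VRAM            \ even, non-zero length: safe to run the fast loop
  ELSE
    DROP 2DROP           \ zero length: nothing to copy
  THEN ;
```

The check costs a few high-level words per call, so it is probably better kept out of a timing-critical scroll loop and used while developing and testing.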
jschultzpedersen (author), posted June 22, 2023:

Hi again

Things are now moving almost at VBL speed with the latest improvements. Somewhere between 55-60 refreshes per second. I can probably get there if I combine the current four ASM modules into one.

I took a small MPEG2 video - at least I think it is MPEG2. The camera is at least 15 years old 😀. Besides, I used medium quality, since high quality means a 50 MB file, which may or may not irritate others. I cannot get the built-in video recorder in Classic99 to work. I tried to download the Cinepak Codec for Windows 10, but Norton warned that it included a trojan horse. So I respectfully declined and dug out the old Cybershot camera. Since it only handles 25 frames per second, the scrolling gets a bit chopped up, running at twice that speed. But you get an idea of how it behaves currently.

regards
Jesper

(Video attachment: MOV02009.MPG)