Through the Trap Door


Asmusr

This is my port of the ZX Spectrum game 'Through the Trap Door' to the TI-99/4A, using a partially automated Z80-to-TMS9900 conversion. Unfortunately it wasn't possible to obtain the same frame rate as on the Spectrum, so shortly into the video I switch the emulator to 2x speed. I'm not sure I actually enjoy the game - it's very hard and sometimes frustrating - but here we go:

https://youtu.be/Yz_NefvU2Gg

The source code is quite advanced: it has its own byte-code graphics language, similar to GPL, and it uses a double-buffer system in CPU RAM. In each frame:

  1. It clears the active screen buffer in CPU RAM.
  2. It draws all the elements in the scene (including walls and floors) to the active screen buffer in two layers, using the GPL-like language mentioned above.
  3. It iterates through the active screen buffer and compares with the passive screen buffer, sending only changed characters to the VDP memory.
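
The three steps above can be modelled roughly as follows. This is only a Python sketch of the idea, not the game's code: `render_frame`, `draw_scene` and `vdp_write` are hypothetical names, and copying the active buffer into the passive one stands in for whatever buffer swap the game actually performs.

```python
# Hypothetical model of the per-frame loop; names are illustrative only.
BUFFER_SIZE = 3 * 768  # buffer length quoted in the post


def render_frame(draw_scene, active, passive, vdp_write):
    """Run one frame and return how many bytes were sent to the VDP."""
    # Step 1: clear the active buffer.
    for i in range(len(active)):
        active[i] = 0
    # Step 2: draw every scene element into the active buffer.
    draw_scene(active)
    # Step 3: send only the bytes that differ from the previous frame.
    sent = 0
    for addr, (new, old) in enumerate(zip(active, passive)):
        if new != old:
            vdp_write(addr, new)
            sent += 1
    # Remember this frame so the next one can be compared against it.
    passive[:] = active
    return sent


if __name__ == "__main__":
    active = bytearray(BUFFER_SIZE)
    passive = bytearray(BUFFER_SIZE)
    vram = {}

    def draw_scene(buf):
        buf[10] = 0x41  # pretend a single character is on screen

    # The first frame sends the new character; an identical second frame sends nothing.
    print(render_frame(draw_scene, active, passive, vram.__setitem__))
    print(render_frame(draw_scene, active, passive, vram.__setitem__))
```

Note that the clear, draw and compare passes all touch the whole buffer every frame, even when almost nothing on screen has changed.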

That sounded like a good scheme for the 9918A because the CPU to VDP transfer would be minimized, but in the game the screen buffer is 3*768 bytes long, and points 2 and 3 each take about 500,000 clock cycles, which results in about 3 frames per second. To really bring the game up to playable speed, I would have to cut that in half.
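
As a rough check of those numbers, here is the arithmetic in Python. It rests on two assumptions that are not spelled out in the post: that the CPU runs at the nominal 3 MHz of the TI-99/4A, and that the 500,000-cycle figure applies to each of the two steps separately.

```python
CPU_HZ = 3_000_000  # assumption: nominal TI-99/4A CPU clock, not stated in the post


def frames_per_second(cycles_per_frame, cpu_hz=CPU_HZ):
    """Frame rate when every frame costs the given number of CPU cycles."""
    return cpu_hz / cycles_per_frame


draw_cycles = 500_000     # point 2: drawing the scene
compare_cycles = 500_000  # point 3: comparing buffers and updating the VDP

current_fps = frames_per_second(draw_cycles + compare_cycles)
target_fps = frames_per_second((draw_cycles + compare_cycles) // 2)

print(current_fps)  # 3.0
print(target_fps)   # 6.0
```

Under those assumptions, halving the per-frame cost would only double the rate to about 6 frames per second.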

 

After the automatic conversion of the original Z80 code to TMS9900 code, I had to fill up all memory on the TI, including the >4000 and >6000 regions, to be able to run the code under emulation. That's a problem with automatic conversion: everything grows bigger, and it's only through lengthy peephole optimization that the code shrinks again. But, as one of the last steps, I split the code into game levels and added a ROM bank for each level, and ended up with a little bit of RAM free.

 

Source code: https://github.com/Rasmus-M/trapdoor

trapdoor8.bin

Edited by Asmusr

6 hours ago, Asmusr said:

https://youtu.be/Yz_NefvU2Gg

Screenshot_20230414-200533_Firefox.thumb.jpg.fbb6535a69b4c271754d920d6b1a9283.jpg


@Asmusr the approach used for the conversion is really interesting and intriguing. This should significantly reduce the time needed for 1:1 conversions from the Spectrum to the TI-99. 🙂

Are you thinking of applying this method to MSX1 or ColecoVision ports (they use the same CPU as the Speccy)? Do you think that, in those cases, having the same GPU as the TI could improve the final frame rate?

The result of this specific conversion is a really slow game, which does not suit the genre (action/platform) too well. But that should not affect games like Lords of Midnight and The Hobbit, which will surely be much appreciated by the 99ers. ;-) Maybe games with fewer graphics on screen would reach a more decent frame rate. I'm thinking of 16K games like Arcadia, Horace Goes Skiing, Pssst, etc. Will you try some other game to see the results?

Finally, could you explain the technique in more detail? Is the conversion executed at runtime on the TI, or on a PC?

 

Thanks.


Nice technical achievement! 👍
I remember seeing that game BITD. Nice graphics. I thought it was the Cookie Monster from Sesame Street, but apparently Berk comes from the British television show The Trap Door. I think Cookie Monster came first.
https://en.wikipedia.org/wiki/The_Trap_Door_(video_game)
https://en.wikipedia.org/wiki/The_Trap_Door
https://en.wikipedia.org/wiki/Cookie_Monster


4 hours ago, tmop69 said:

Finally, could you explain the technique in more detail? Is the conversion executed at runtime on the TI, or on a PC?

The conversion takes place on a PC, and to be honest there is a lot of work to do afterwards. In particular, everything to do with graphics, sound and input hardware has to be rewritten. Still, it would take a lot longer to develop a game like Through the Trap Door from scratch. Commented source code, or more likely a commented disassembly, is a requirement, and on the Spectrum there are many examples using SkoolKit. I'm not aware of anything similar for the MSX1 or ColecoVision. Pssst, The Hobbit, and Lords of Midnight all have disassemblies for the Spectrum, but I'm picking another game for my next project: something that should run fast enough, in a genre I think the TI community would enjoy. There is even a chance of improved graphics, music and speech.

 

 


One of the "Dizzy" games might have been a nice game to try and convert. In some of those games, hardly anything moves on the screen apart from the Dizzy character itself, repeatedly flapping its arms up and down. Mind, all of the Dizzy games ran on a 48K Spectrum.


7 hours ago, Retrospect said:

One of the "Dizzy" games might have been a nice game to try and convert. In some of those games, hardly anything moves on the screen apart from the Dizzy character itself, repeatedly flapping its arms up and down. Mind, all of the Dizzy games ran on a 48K Spectrum.

Yes, it looks like a good game to port, but there is no source code, I'm afraid.


On 4/23/2023 at 9:36 PM, Tursi said:

I wonder if we can analyze the final resulting code, and maybe come up with a handful of peephole optimizations that could be automatically applied to improve performance?

 

Here are a few examples for consideration:

 

       movb *ix,a                      ; LD A,(IX+$00)     ; 
       inc  ix                         ; INC IX            ; 

       movb @timer,a                   ; LD A,($DA14)      ;
       sb   one,a                      ; DEC A             ; 
       movb a,@timer                   ; LD ($DA14),A      ; 
	   
       .push hl                        ; PUSH HL           ;
       movb @hit,a                     ; LD A,($D300)      ;
       li   hl,frame_ref               ; LD HL,$CE1A       ; 
       cb   a,*hl                      ; CP (HL)           ;
       .pop hl                         ; POP HL            ;

       dec  de                         ; DEC DE            ; 
       movb d,a                        ; LD A,D            ; 
       socb @e,a                       ; OR E              ; 
       jne  start_loop                 ; JR NZ,$C4FD       ; 
   
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ; 

       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       cb   a,*hl                      ; CP (HL)           ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 

       li   de,prep_open               ; LD DE,$D241       ; Take care of little/big endian
       movb *ix,@l                     ; LD L,(IX+$00)     ;
       movb @1(ix),h                   ; LD H,(IX+$01)     ; 
       movb @e,*hl                     ; LD (HL),E         ;
       inc  hl                         ; INC HL            ;
       movb d,*hl                      ; LD (HL),D         ;

       li   hl,routine_1               ; LD HL,$CE14       ; 
       movb *hl,a                      ; LD A,(HL)         ; 
       inc  hl                         ; INC HL            ; 
       socb *hl,a                      ; OR (HL)           ; 
       .push af                        ; PUSH AF           ; Does not affect flags on Z80
       jne  somewhere                  ; CALL Z,$CE21      ; Take care of flags

 


44 minutes ago, Asmusr said:

Here are a few examples for consideration:

 

       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       inc  hl                         ; INC HL            ; 
       cb   a,*hl                      ; CP (HL)           ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 
       dec  hl                         ; DEC HL            ; 

 

       inct hl                         ; INC HL            ; 
                                       ; INC HL            ; 
       inct hl                         ; INC HL            ; 
                                       ; INC HL            ; 
       cb   a,*hl                      ; CP (HL)           ; 
       dect hl                         ; DEC HL            ; 
                                       ; DEC HL            ; 
       dect hl                         ; DEC HL            ; 
                                       ; DEC HL            ; 
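
A rewrite like the one above is the kind of thing a small peephole pass could apply automatically. Here is a minimal Python sketch, assuming the instructions have already been split into (mnemonic, operand) pairs. The function name is hypothetical, and the sketch ignores labels, jump targets and status flags, all of which a real pass would have to respect.

```python
def peephole_inc_dec(instructions):
    """Merge adjacent pairs of 'inc r' into 'inct r' and 'dec r' into 'dect r'."""
    pairs = {"inc": "inct", "dec": "dect"}
    out = []
    i = 0
    while i < len(instructions):
        op, arg = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if op in pairs and nxt == (op, arg):
            # Two identical single steps become one step of two.
            out.append((pairs[op], arg))
            i += 2
        else:
            out.append((op, arg))
            i += 1
    return out


if __name__ == "__main__":
    listing = [("inc", "hl")] * 4 + [("cb", "a,*hl")] + [("dec", "hl")] * 4
    for op, arg in peephole_inc_dec(listing):
        print(f"       {op:<4} {arg}")
```

An odd run is handled by leaving the last single `inc` or `dec` in place.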

 


2 hours ago, Asmusr said:

Here are a few examples for consideration:

 

       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ;
       a    hl,hl                      ; ADD HL,HL         ; 

 

       sla  hl,9                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ;
                                       ; ADD HL,HL         ; 

ADD HL,HL doubles HL, which is the same as shifting it one bit to the left. This is done 9 times, so a single shift left by 9 does the same job.
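
The equivalence is easy to check exhaustively for a 16-bit register. The Python sketch below models only the register value, not the status flags, so it says nothing about whether code that tests the carry afterwards would still behave the same; the function names are made up for the illustration.

```python
MASK16 = 0xFFFF


def add_hl_hl(hl):
    """One ADD HL,HL: double the register, keeping 16 bits."""
    return (hl + hl) & MASK16


def sla(hl, count):
    """Shift left by count bits, keeping 16 bits."""
    return (hl << count) & MASK16


def nine_doublings(hl):
    for _ in range(9):
        hl = add_hl_hl(hl)
    return hl


# Compare both forms for every possible 16-bit value.
assert all(nine_doublings(v) == sla(v, 9) for v in range(0x10000))
print(hex(nine_doublings(0x0001)))  # 0x200
```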

Edited by sometimes99er

9 hours ago, Asmusr said:

Here are a few examples for consideration:

 

       .push hl                        ; PUSH HL           ;
       movb @hit,a                     ; LD A,($D300)      ;
       li   hl,frame_ref               ; LD HL,$CE1A       ; 
       cb   a,*hl                      ; CP (HL)           ;
       .pop hl                         ; POP HL            ;

 

       cb   @hit,@frame_ref            ; PUSH HL           ;
                                       ; LD A,($D300)      ;
                                       ; LD HL,$CE1A       ; 
                                       ; CP (HL)           ;
                                       ; POP HL            ;
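
One thing worth checking before applying this kind of fusion automatically: going by the listing, the original sequence leaves the value of @hit in A, while the fused instruction does not touch A. The Python sketch below models just that difference. The two functions are hypothetical, only the "equal" outcome of the compare is modelled, and the addresses are the ones shown in the Z80 comments.

```python
HIT = 0xD300
FRAME_REF = 0xCE1A


def original_sequence(mem, regs):
    """Model of the five-instruction version; returns (equal, registers)."""
    regs = dict(regs)
    saved_hl = regs["hl"]                  # .push hl
    regs["a"] = mem[HIT]                   # movb @hit,a
    regs["hl"] = FRAME_REF                 # li   hl,frame_ref
    equal = regs["a"] == mem[regs["hl"]]   # cb   a,*hl
    regs["hl"] = saved_hl                  # .pop hl
    return equal, regs


def fused_sequence(mem, regs):
    """Model of the single memory-to-memory compare."""
    equal = mem[HIT] == mem[FRAME_REF]     # cb   @hit,@frame_ref
    return equal, dict(regs)


if __name__ == "__main__":
    mem = {HIT: 5, FRAME_REF: 5}
    regs = {"a": 0, "hl": 0x1234}
    print(original_sequence(mem, regs))  # same comparison result, A now holds 5
    print(fused_sequence(mem, regs))     # same comparison result, A still 0
```

So the replacement looks safe only where nothing later reads A expecting the value of @hit.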
Edited by sometimes99er

On 4/14/2023 at 12:52 PM, Asmusr said:

That sounded like a good scheme for the 9918A because the CPU to VDP transfer would be minimized, but in the game the screen buffer is 3*768 bytes long, and points 2 and 3 each take about 500,000 clock cycles, which results in about 3 frames per second. To really bring the game up to playable speed, I would have to cut that in half.

16-bit optimizations...

 

1. Are the buffers compared byte to byte?

 

Using word compare could be faster. The 9900's CB has to fetch both bytes anyway! It might be OK to write both bytes to the VDP, regardless of which one changed. It depends on the probability of adjacent characters changing together, which I think is quite likely.
 

2. I have a hunch there is an XOR optimization to be had...
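
To make the trade-off in point 1 concrete, here is a small Python model that counts comparisons and VDP writes for both strategies. It is only a sketch: the function names are hypothetical, the buffers are assumed to have an even length, and using XOR to detect a changed word is just one possible reading of the hunch in point 2.

```python
def diff_bytes(active, passive):
    """Byte-wise: one comparison per byte, one write per changed byte."""
    writes = [(i, active[i]) for i in range(len(active)) if active[i] != passive[i]]
    return len(active), writes


def diff_words(active, passive):
    """Word-wise: one comparison per 16-bit word, both bytes written on any change."""
    compares = 0
    writes = []
    for i in range(0, len(active), 2):
        compares += 1
        new = (active[i] << 8) | active[i + 1]    # big-endian word, as on the 9900
        old = (passive[i] << 8) | passive[i + 1]
        if new ^ old:  # a nonzero XOR means at least one of the two bytes changed
            writes.append((i, active[i]))
            writes.append((i + 1, active[i + 1]))
    return compares, writes


if __name__ == "__main__":
    passive = bytearray(8)
    active = bytearray(8)
    active[2], active[3], active[6] = 1, 2, 9  # one changed pair, one lone change

    print(diff_bytes(active, passive))  # 8 comparisons, 3 writes
    print(diff_words(active, passive))  # 4 comparisons, 4 writes
```

The word version halves the comparisons at the cost of one extra write for each lone changed byte, so it pays off when changes tend to come in adjacent pairs.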
