
StrangeCart


speccery


  • 2 weeks later...

Update: I released new firmware on the Wiki at GitHub. This firmware is from a couple of weeks ago; I just did not release it at the time, and I have no particular excuse for that omission. This version supports the new CALL GRAM command.

 

Added a new Wiki page to document the GRAM support with system GROM override. I am not happy with the material I created, but at least it is something.

 

I also released the source code of the modified MiniMemory cartridge image I use to integrate with TI BASIC; it is here at GitHub. This is perhaps not very interesting, since without understanding and documentation of the microcontroller firmware it is probably hard to do anything with it, but for information-sharing purposes it's there nevertheless.


Also added a section in the Wiki to describe the different types of cartridge images. It's perhaps obvious to people reading this thread here at AA, but others may not realize that the StrangeCart supports normal TI-99/4A cartridge emulation as well as augmented cartridges, which are StrangeCart specific and integrate TMS9900 code with ARM code.


I continue to be blown away by @Eric Lafortune's breakout game. Just checked that it can also be loaded from StrangeCart's SCD1 drive. Since it's a "regular" BASIC program that of course should work. I had kind of forgotten that I had also added a StrangeCart command dl_fs some time ago. With that one can use a terminal program (over USB serial to StrangeCart) to send a Basic program to the SCD1 drive. So just say:

dl_fs BREAKOUT

And then send the program with XMODEM. After this is done, from TI BASIC one can issue 

OLD SCD1.BREAKOUT
RUN

and off it goes. I totally have to try creating a game with this framework. I really like how it's using VDP memory for TMS9900 code snippets.

 

[EDIT: Forgot to mention here that I did some further updates to the StrangeCart Wiki]

Edited by speccery

2 hours ago, speccery said:

Just checked that it can also be loaded from StrangeCart's SCD1 drive. Since it's a "regular" BASIC program that of course should work.


Thanks for checking! Handling different storage hardware was tricky, actually. Drivers generally reserve buffers of different sizes at the top of VDP RAM. Loaded BASIC programs then get shifted down in memory by an unknown number of bytes. That presents a problem to some steps in the jailbreak process. @senior_falcon came up with the clever solution, which includes drawing known bytes in the VDP screen image table with regular BASIC statements.

 

The framework works well, but the extreme constraints require constant care and stamina. I liked the challenge and the "what if" aspect of it.


5 hours ago, dhe said:

Upgraded to the latest version - and everything is A-OK.

 

call carts(2) - "Strange2" - gave me a strange display and nothing else - standard console with TMS9918A.

Thanks for testing! I have done all my development on systems with the F18A. I do have a console with a TMS9918A (or I guess it's a TMS9929, but no difference), but I don't think I have a suitable display immediately available; I will see if I can wire something up to test with that. Anyway, it is interesting to hear the test result. The probable cause is that the demo outruns the memory access speed of the standard VDP. I will make a new version of the demo with some NOPs inserted; hopefully that helps. I will also try to get the INPUT statement support you asked for done in the coming days. It's one of the last big missing features of TI BASIC. Processing the input data itself and assigning it to variables is not hard; the reason I haven't jumped into this before is that the screen editor needs to be there, either by integrating with the normal TI BASIC screen editor implemented in GPL or by rolling my own. I will probably try to integrate with the GPL editor as the first preference, although I am not sure how hard it is to use outside of the standard BASIC GPL code.


40 minutes ago, speccery said:

Thanks for testing! I have done all my development on systems with the F18A. I do have a console with a TMS9918A (or I guess it's a TMS9929, but no difference), but I don't think I have a suitable display immediately available; I will see if I can wire something up to test with that. Anyway, it is interesting to hear the test result. The probable cause is that the demo outruns the memory access speed of the standard VDP. I will make a new version of the demo with some NOPs inserted; hopefully that helps. I will also try to get the INPUT statement support you asked for done in the coming days. It's one of the last big missing features of TI BASIC. Processing the input data itself and assigning it to variables is not hard; the reason I haven't jumped into this before is that the screen editor needs to be there, either by integrating with the normal TI BASIC screen editor implemented in GPL or by rolling my own. I will probably try to integrate with the GPL editor as the first preference, although I am not sure how hard it is to use outside of the standard BASIC GPL code.

If you want to try and narrow it down quicker, unless you're doing something very exotic for VDP access speed, the only place you SHOULD be able to overrun the VDP on the TI is by reading immediately after setting the VDP address. If you look for those spots and insert the NOP, that SHOULD in theory be adequate.

 

And I really, really want to know if you find it is not. :)

 


One other oddity I had. The addition of Force Command is very handy. The odd thing: with most of the carts started via call cart(x), when you reset the console (button or power off/on cycle) you go back to the 4A title screen.

With Force Command, when I did that, it took me directly to the Force Command prompt. I returned to normal easily enough by hitting the MCU reset button.


13 hours ago, Tursi said:

If you want to try and narrow it down quicker, unless you're doing something very exotic for VDP access speed, the only place you SHOULD be able to overrun the VDP on the TI is by reading immediately after setting the VDP address. If you look for those spots and insert the NOP, that SHOULD in theory be adequate.

 

And I really, really want to know if you find it is not. :)

 

Thanks @Tursi for the comment. CALL CARTS(2) runs one of the StrangeCart demos. This demo runs in GM2 without sprites. It uses dynamically created code, where the TMS9900 is running a sequence of LI R0,X commands. That code is in the cartridge ROM area, and looks like this:

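6200	LWPI >8C00          workspace on the VDP ports: R0 = data (>8C00), R1 = address (>8C02)
		LI R1,ADDR_LO       first write to the VDP address register
		LI R1,ADDR_HI       second write to the VDP address register
		LI R0,data          each LI R0 writes one data byte to the VDP
		LI R0,data
		LI R0,data
		... hundreds more LI instructions ...
		B  @BUSSTOP         tell the ARM this strip is done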

The workspace pointer points to the VDP; there are two writes to set up the VDP write address, followed by a ton of writes to the data register. At the end it branches to what I call the bus stop, where the TMS9900 signals the ARM that it's done writing that strip of data, and the ARM creates a new strip with new data (around 7K or something like that in one go). To my knowledge this is the fastest way to write data to the VDP.

 

Each LI instruction transfers one byte of data to the VDP memory. To transfer 6K (one high resolution black and white screen) the ARM generates four strips of code (the strip length is limited by the size of the cartridge memory space).
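As a rough illustration of the strip arithmetic above, here is a minimal Python sketch (not the StrangeCart firmware) that builds one strip as TMS9900 machine words and counts the strips needed for a 6K screen. The opcode values are the usual TMS9900 encodings (LI = >0200 + register, LWPI = >02E0, B @addr = >0460). The 7K strip budget follows the "around 7K" figure in the post; the bus stop address, the choice of placing the data byte in the high byte of the immediate, and all function names are assumptions made for the example.

```python
# Sketch only: models a "strip" of LI instructions as 16-bit words.
LI_R0, LI_R1 = 0x0200, 0x0201    # LI Rn,imm -> opcode word + immediate word
LWPI = 0x02E0                    # LWPI imm
B_SYMBOLIC = 0x0460              # B @addr

OVERHEAD_BYTES = 4 + 4 + 4 + 4   # LWPI, two address LIs, final B @BUSSTOP
BYTES_PER_LI = 4                 # each LI moves one data byte to the VDP


def build_strip(vdp_addr, data, busstop=0x7FF0):
    """Return one strip as a list of 16-bit words (busstop is hypothetical)."""
    words = [LWPI, 0x8C00]
    # Assumption: the byte the VDP sees is the high byte of the immediate.
    words += [LI_R1, (vdp_addr & 0xFF) << 8]                   # address low byte
    words += [LI_R1, (((vdp_addr >> 8) & 0x3F) | 0x40) << 8]   # high byte + write flag
    for byte in data:
        words += [LI_R0, byte << 8]
    words += [B_SYMBOLIC, busstop]
    return words


def data_bytes_per_strip(budget_bytes):
    """How many data bytes fit in a strip of the given code size."""
    return (budget_bytes - OVERHEAD_BYTES) // BYTES_PER_LI


def strips_needed(total_data_bytes, budget_bytes):
    """Number of strips required, rounding up."""
    per_strip = data_bytes_per_strip(budget_bytes)
    return -(-total_data_bytes // per_strip)   # ceiling division


if __name__ == "__main__":
    print(data_bytes_per_strip(7 * 1024))      # data bytes per 7K strip
    print(strips_needed(6 * 1024, 7 * 1024))   # strips for a 6K bitmap
```

Because every data byte costs four bytes of code, a 6K screen needs a little over 24K of LI instructions, which is why it cannot fit in fewer than four strips even if a strip filled the whole 8K cartridge window.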

 

There are no VDP reads, just these types of writes. It would be good news if the standard VDP can keep up with this.


6 hours ago, dhe said:

One other oddity I had. The addition of Force Command is very handy. The odd thing: with most of the carts started via call cart(x), when you reset the console (button or power off/on cycle) you go back to the 4A title screen.

With Force Command, when I did that, it took me directly to the Force Command prompt. I returned to normal easily enough by hitting the MCU reset button.

Force Command has been there from the start :) when you received the board. When you update the firmware, the flash memory of the microcontroller is erased and reprogrammed, but the vast majority of cartridge images are stored on the serial flash chip, which is external to the MCU. Because of this, the version of Force Command on your board is not the most recent. I think the Force Command ROM image includes a power-up routine or something similar which diverts the boot process to it immediately; the StrangeCart does not do anything different with it. The Wiki includes instructions on how to update the flash chip if you want to change the cartridge set (or you can ask me and I can provide you a new cartridge image file).


10 hours ago, speccery said:

Thanks @Tursi for the comment. CALL CARTS(2) runs one of the StrangeCart demos. This demo runs in GM2 without sprites. It uses dynamically created code, where the TMS9900 is running a sequence of LI R0,X commands. That code is in the cartridge ROM area, and looks like this:


6200	LWPI >8C00
		LI R1,ADDR_LO
		LI R1,ADDR_HI
		LI R0,data
		LI R0,data
		LI R0,data
		... hundreds more LI instructions ...
		B  @BUSSTOP

The workspace pointer points to the VDP; there are two writes to set up the VDP write address, followed by a ton of writes to the data register. At the end it branches to what I call the bus stop, where the TMS9900 signals the ARM that it's done writing that strip of data, and the ARM creates a new strip with new data (around 7K or something like that in one go). To my knowledge this is the fastest way to write data to the VDP.

 

Each LI instruction transfers one byte of data to the VDP memory. To transfer 6K (one high resolution black and white screen) the ARM generates four strips of code (the strip length is limited by the size of the cartridge memory space).

 

There are no VDP reads, just these types of writes. It would be good news if the standard VDP can keep up with this.

That counts as really exotic. :)

 

You can just insert the NOPs in the recommended areas, but since I assume that's being done for performance, these guidelines seem to hold true in my testing on both TI and ColecoVision:

 

- register accesses are zero wait state. Thus, you can write to VDP write-only registers as well as the address register at full speed.

- after writing a "read" address, you must wait 8uS before reading /each/ byte of data (so between reads as well as before the first one).

- after writing a "write" address, you must wait 8uS after /each/ written data byte before the next one can be written. Likewise, you cannot write registers safely during this time.

 

Basically, think about "is the VDP accessing RAM?" If yes, you need the delay. It's 8uS in graphics and bitmap mode, 3.1uS in text mode, and 3.5uS in multicolor mode.

 

If the screen is blanked, or you are in the vertical blank area (starting at the interrupt signal and for 4300uS after - but remember by the time you SEE the interrupt signal, time has already elapsed) - then the delay is 2uS across the board.
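The guidelines above boil down to a small lookup table. The following Python sketch simply encodes the figures quoted in this post so they can be checked against a measured access period; the function and table names are made up for the example, and it covers VDP RAM accesses only (register writes are described above as zero wait state).

```python
# Minimum spacing between VDP RAM accesses, in microseconds, using the
# figures quoted in the post. Names and structure are illustrative only.
ACTIVE_DISPLAY_GAP_US = {
    "graphics": 8.0,
    "bitmap": 8.0,
    "text": 3.1,
    "multicolor": 3.5,
}
BLANKED_GAP_US = 2.0   # screen blanked, or inside the vertical blank window


def min_gap_us(mode, blanked_or_vblank=False):
    """Return the required spacing between VDP RAM accesses."""
    if blanked_or_vblank:
        return BLANKED_GAP_US
    return ACTIVE_DISPLAY_GAP_US[mode]


def is_safe(access_period_us, mode, blanked_or_vblank=False):
    """True if back-to-back accesses at this period respect the guideline."""
    return access_period_us >= min_gap_us(mode, blanked_or_vblank)


if __name__ == "__main__":
    print(is_safe(7.6, "bitmap"))         # active display, bitmap mode
    print(is_safe(7.6, "bitmap", True))   # same period during blanking
```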

 

Unfortunately, I've found no evidence that turning sprites on or off helps this at all. I don't think the VDP can reallocate the spare cycles back to the CPU dynamically. Hard to prove though.

 

LI blows the "every write is safe on the TI" theory by being one of the few instructions that doesn't read-before-write. ;) LI is 12 cycles, plus two 8-bit bus accesses to read the instruction words, plus one 8-bit bus access to write the data, each costing 4 extra cycles, so 12+4+4+4 = 24 cycles, I believe. That's 8uS on the dot, but then we have that pesky +/- 5% on the CPU clock, so it could be as quick as 7.6uS (or as slow as 8.4uS). The CPU and the VDP are not in lockstep on the TI; the VDP runs on its own clock, which is more precise and can be considered accurate, so we actually have to consider the clock slip.

 

IOW: looks like it's too fast to be reliable across all systems. You need to be just 1.2 CPU cycles slower (or, I guess, 1.26 cycles on the fast clock ;) ) .
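The arithmetic behind those numbers can be reproduced in a few lines of Python. This is only a worked check of the figures in this post, assuming the nominal 3 MHz CPU clock and applying the +/- 5% directly to the instruction period, as the post does; the variable names are invented for the example.

```python
CPU_CLOCK_MHZ = 3.0          # nominal TI-99/4A CPU clock (assumed)
CLOCK_TOLERANCE = 0.05       # the +/- 5% quoted in the post
VDP_GAP_US = 8.0             # required spacing in graphics/bitmap mode

# 12 base cycles + two instruction-word reads + one data write, 4 cycles each
LI_CYCLES = 12 + 4 + 4 + 4


def period_us(cycles, scale=1.0):
    """Duration of `cycles` CPU cycles, optionally scaled for clock drift."""
    return cycles / CPU_CLOCK_MHZ * scale


nominal_us = period_us(LI_CYCLES)                     # 8.0
fast_us = period_us(LI_CYCLES, 1 - CLOCK_TOLERANCE)   # about 7.6
slow_us = period_us(LI_CYCLES, 1 + CLOCK_TOLERANCE)   # about 8.4

# How far the fastest case falls short of the VDP requirement, in cycles
shortfall_us = VDP_GAP_US - fast_us
extra_cycles_nominal = shortfall_us * CPU_CLOCK_MHZ
extra_cycles_fast = shortfall_us * CPU_CLOCK_MHZ * (1 + CLOCK_TOLERANCE)

if __name__ == "__main__":
    print(nominal_us, fast_us, slow_us)
    print(extra_cycles_nominal, extra_cycles_fast)
```

The shortfall works out to roughly 0.4 uS, which is where the 1.2 cycle (or 1.26 cycle on a fast clock) figure comes from.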

 

As for what to sneak in between... the fastest full instruction is an illegal opcode at 6 cycles + 4 to read it - 10 cycles for 3.3333 uS... would result in about 40% wasted time.

 

I wonder if your board could be programmed to hold READY for just 1 cycle between operations... it wouldn't even really matter when. ;) That'd get you up to practically all machines nearly all the time. With a 2 cycle hold, you should be rock solid. Unfortunately, I don't know if the READY line reaches the CPU from the cart port unless GROM is being accessed, so that might not work anyway.

 

 


8 hours ago, Asmusr said:

I think @PeteE has been using the LI technique for writing to the VDP, for instance in his Ghosts'n'Goblins clone. Perhaps he has experience with how it works on a real 9918A?

 

I've been using it for years now, including the Copper demo too, all writing data to a real 9918A without any problems.


3 hours ago, PeteE said:

I've been using it for years now, including the Copper demo too, all writing data to a real 9918A without any problems.

Thanks, great to hear! That means the issues that @dhe noticed on the StrangeCart demo are due to something else - which is hardly surprising given that the firmware is new. I will nevertheless try to build a test setup with a console with original VDP, just need to sort out a display for it.

 


  • 2 weeks later...

A quick update for @dhe: I have worked a little on the support for the INPUT statement. The StrangeCart Basic interpreter should now have a nearly complete implementation of the processing of the statement; I have tested it in my Mac development version and it seems to work, supporting an optional prompt and multiple variables. What is still missing is the editor side of it. Like I wrote before, I will either have to integrate the existing GPL editor routine at GROM@>2A42 into my code (if possible), or write an alternate version in C++ for the StrangeCart and interface with the TMS9900 using my existing methods to write to VDP memory and scan the keyboard. I would like to avoid writing yet another editor if possible, so if anyone knows how to use the existing GROM editor routine implemented by the console Basic GROM for other applications, please let me know :)


  • 4 weeks later...

I am repeating here what I wrote in the Bad Apple demo discussion thread, since I think this is a very cool new capability the StrangeCart has:

 

@Eric Lafortune I'm happy to report that I just got the Bad Apple demo running on the real iron! Very cool! As I am writing this, I have run the demo exactly once, so my code is just fresh out of the oven. I am using the StrangeCart to support this cartridge. The Bad Apple demo is unmodified, I am using your binary distribution with the 4.5 megabyte image.

 

For the technically minded, you might wonder how this is done. I added to the StrangeCart what I call the "streaming mode". My firmware only supports cartridge images up to 128K of ROM (plus some GROM on top of that). The reason for this is that cartridges are served from the internal RAM of the microcontroller on the StrangeCart. It has 192K of RAM, which is actually pretty big, but of course falls short of the 4.5 megs required for the demo. So now, when a cartridge is loaded and the cartridge image is larger than 128K, the firmware loads the first 128K of the cartridge into RAM. When the cartridge is started (from the TI's normal boot menu), it starts to execute from the first 128K loaded into RAM. Of course, with these paged cartridges only one 8K page is visible to the TMS9900 at a time.

 

It's worth noting that there is no READY line on the TI-99/4A cartridge bus. Thus a cartridge cannot introduce hardware wait states to the TMS9900. When the TMS9900 wants to fetch something from ROM, you had better have that data available right then, as there is only about half a microsecond to return the first byte. In other words, the data has to be already in RAM at that time.

 

The StrangeCart's CPU has two cores. One of the cores (M0+) spends 100% of its time serving the bus of the TMS9900. The new feature I added is that whenever the TMS9900 changes the 8K page of the large cartridge image, the M0+ writes the number of the newly selected page to the interprocessor mailbox in the MCU. This in turn causes an interrupt on the more powerful M4 CPU core. The interrupt routine updates some variables and returns.

For the M4 core I support what I call "cartridge service routines". When a cartridge is loaded (i.e. a cartridge for the TI-99/4A, containing TMS9900 code), for some types of cartridges the M4 core starts to execute a corresponding service routine. For these big 128K+ ROM images, I created a new service routine which monitors the variables set by the interrupt routine. When it sees a new page number written, it checks if the following page is already in RAM. If not, that page is loaded from the serial flash chip on the assumption that it will soon be needed.

Since there is only 128K of RAM, loading a new 8K page means that it has to be put somewhere, replacing an existing 8K page. I went with this logic: the first 64K of a cartridge ROM is always kept in RAM (I am assuming that the usage pattern is such that the code for a huge cartridge is kept there). For the remaining 64K of on-chip RAM, I search for the first 8K page frame which is not currently being used, and replace its contents with the new 8K page. In practice this means that page frames 8 (at 64K) and 9 (at 72K) are loaded in an alternating pattern with new data. When the Bad Apple demo code is working on the page frame at 64K, I load the next one into 72K. When the demo code switches to the next page, it will use data at 72K and the next page is loaded into 64K. This seems to work perfectly, and should work for any cartridge doing video decode type activity with a sequential access pattern.
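The replacement policy described above can be modelled in a few lines. This is a minimal Python sketch of the logic as read from this post, not the actual firmware (which runs on the M4 core). The class and method names are hypothetical, and it assumes strictly sequential page access with the first 128K of the image preloaded.

```python
NUM_FRAMES = 16      # 128K of cartridge RAM divided into 8K page frames
PINNED_FRAMES = 8    # the first 64K of the image always stays resident


class StreamingCache:
    """Toy model of the prefetch-next-page policy (names are illustrative)."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        # frames[i] is the cartridge page currently held in page frame i
        self.frames = list(range(min(NUM_FRAMES, total_pages)))
        self.loads = []   # (page, frame) pairs, in the order they were loaded

    def frame_of(self, page):
        return self.frames.index(page) if page in self.frames else None

    def on_page_switch(self, page):
        """Called when the TMS9900 selects `page`; prefetch page + 1.

        Returns the frame that was reloaded, or None if nothing was loaded.
        """
        nxt = page + 1
        if nxt >= self.total_pages or self.frame_of(nxt) is not None:
            return None
        in_use = self.frame_of(page)
        # First frame outside the pinned 64K that is not the one in use
        for frame in range(PINNED_FRAMES, len(self.frames)):
            if frame != in_use:
                self.frames[frame] = nxt   # stands in for the flash read
                self.loads.append((nxt, frame))
                return frame
        return None


if __name__ == "__main__":
    cache = StreamingCache(total_pages=576)   # 4.5 MB / 8K
    for page in range(0, 21):
        cache.on_page_switch(page)
    print(cache.loads)
```

Running the model with sequential page switches shows the same behaviour the post reports: once the preloaded 128K is exhausted, new pages land alternately in frames 8 and 9.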

 

The TMS9900 is so slow (and the compression so good) that it's only consuming a couple of 8K pages per second. As I think I wrote in the past, the TMS9900 in the TI-99/4A can only read data from ROM at a maximum rate of about 1 megabyte per second. While serving the TMS9900 bus, the StrangeCart can concurrently load data from the flash ROM chip at 6 megabytes per second. Thus loading happens much faster than the TI-99/4A can consume the data; the only remaining requirement is that the next page has already been preloaded when a page switch occurs.
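To put those rates side by side, here is a small back-of-the-envelope calculation in Python. It uses only the figures quoted above and treats "megabyte" as 10^6 bytes and "a couple of pages per second" as two, both of which are assumptions for the sake of the example.

```python
PAGE_BYTES = 8 * 1024
FLASH_BYTES_PER_S = 6_000_000      # "6 megabytes per second", taken as 6e6
CPU_READ_BYTES_PER_S = 1_000_000   # "about 1 megabyte per second", taken as 1e6
PAGES_PER_S = 2                    # "a couple of 8K pages per second"

# Time for the StrangeCart to pull one 8K page from the serial flash
page_load_ms = PAGE_BYTES / FLASH_BYTES_PER_S * 1000

# Time available per page at the demo's consumption rate
page_budget_ms = 1000 / PAGES_PER_S

# Shortest time in which the TMS9900 could possibly read a whole page
fastest_page_read_ms = PAGE_BYTES / CPU_READ_BYTES_PER_S * 1000

if __name__ == "__main__":
    print(round(page_load_ms, 2), page_budget_ms, round(fastest_page_read_ms, 2))
```

Under these assumptions a page load takes well under two milliseconds, against a budget of hundreds of milliseconds per page.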


Nice! Can it manage the 8MB Dragon's Lair demo now? The pages are accessed sequentially, so as long as you are pre-caching forward, it has a chance of working. ;)

 

(Edit: I guess the scene transitions may be an issue, since you can't predict which page it will jump to, and the target page contains the information needed to play the scene. So if that's wrong, you'll get garbage...)

Edited by Tursi

6 hours ago, Tursi said:

Nice! Can it manage the 8MB Dragon's Lair demo now? The pages are accessed sequentially, so as long as you are pre-caching forward, it has a chance of working. ;)

 

(Edit: I guess the scene transitions may be an issue, since you can't predict which page it will jump to, and the target page contains the information needed to play the scene. So if that's wrong, you'll get garbage...)

Perhaps it can! A couple of questions:

  • Does it need a working set of ROM pages, or does the code run from RAM with only data fetched from ROM?
  • After writing the address of the new bank, how soon is the first byte fetched? Every microsecond counts here....
  • When a new page is selected, does fetching start from a random address on a page or a known address (zero) on that page?

If data fetches start from the beginning of each page, I could just cache for example 32 or 64 bytes of each page into RAM ahead of time. With 8 megs there are 1024 pages, so preloading 32 or 64 bytes is not a problem, only taking 32k or 64k of RAM. During the fetches of those early bytes the loading of the rest could be done concurrently.
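The sizing in the paragraph above is easy to verify. The following Python lines are just that arithmetic written out; the function name is invented for the example.

```python
PAGE_BYTES = 8 * 1024
IMAGE_BYTES = 8 * 1024 * 1024   # the 8 MB image discussed above


def prefix_cache_size(image_bytes, prefix_bytes, page_bytes=PAGE_BYTES):
    """Return (number of pages, RAM needed to cache a prefix of every page)."""
    pages = image_bytes // page_bytes
    return pages, pages * prefix_bytes


if __name__ == "__main__":
    for prefix in (32, 64):
        pages, ram = prefix_cache_size(IMAGE_BYTES, prefix)
        print(pages, "pages,", ram // 1024, "K of RAM for", prefix, "byte prefixes")
```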

If the first address on a page is random, then it could at least be made to work by changing the game slightly so that a hint is provided. For example, the first data item would be read twice with a slight delay between the reads, discarding the result of the first read and giving the cartridge a bit of time to load the page.


Winter is coming, and I’d sure like to use the StrangeCart for some projects. (I have TI BASIC integration in Stevie working, at least on-the-fly decrunching to the editor. Crunching source text in the editor to TI BASIC tokens is next.) Having a TI BASIC performance boost would really be nice.

 

Are you planning on doing future hardware changes? What are the plans for StrangeCart availability? I’d like to get my hands on one for sure 😀 I’m not so worried about StrangeCart software upgrades; that’s part of the fun, I’d say: seeing how the StrangeCart evolves further.

Edited by retroclouds

16 hours ago, speccery said:

beginning of each page, I could just cache for example 32 or 64 bytes of each page into RAM ahead of time. With 8 megs there are 1024 pages, so preloading 32 or 64 bytes is not a problem, only taking 32k or 64k of RAM. During the fetches of those early bytes the loading of the rest could be done concurrently.

If the first address on a page is random, then it could at least be made to work by changing the game slightly so that a hint is provided. For example, the first data item would be read twice with a slight delay between the reads, discarding the result of the first read and giving the cartridge a bit of time to load the page.

That's a fascinating idea. Let me get that data for you. ;)


23 hours ago, speccery said:

Perhaps it can! A couple of questions:

  • Does it need a working set of ROM pages, or does the code run from RAM with only data fetched from ROM?
  • After writing the address of the new bank, how soon is the first byte fetched? Every microsecond counts here....
  • When a new page is selected, does fetching start from a random address on a page or a known address (zero) on that page?

The demo starts from a ROM header, while the real game starts from a tiny (256 byte) GROM. There are code layout differences and the real game swaps pages for code and game data, while the demo doesn't need to.

 

>> Does it need a working set of ROM pages, or the code runs from RAM and only data is fetched from ROM?

 

The program lives in a single page, but which page that is depends on whether you are playing with keyboard or joystick. However, the demo doesn't have keyboard support, so the one page is all of the program. It can't run from RAM though, since it doesn't require memory expansion. However, the video playback and several trampoline routines run from scratchpad RAM.

 

>> After writing the address of the new bank, how soon is the first byte fetched? Every microsecond counts here....
>> When a new page is selected, does fetching start from a random address on a page or a known address (zero) on that page?

 

So these are the big questions... starting the demo all works from one page, so it should be fine.

 

The demo changes pages for just two reasons:

- to display a still frame
- to play a video

 

For a still frame:

* Entry here at >8320. R3 has page address, R0 has page data in MSB
    MOV R0,*R3          * change page on cart
    MOVB *R1+,@>8C00    * read data into VDP memory


So that's about as tight as it can get. The bankswitch happens at the end of the first instruction, and the access happens at the beginning of the next one. However, it's just graphics data. If the first few bytes are wrong, it probably won't be serious. A few bytes of corruption. The demo already has some anyway, since the cart header sits on top of some of the graphical data (in fact, because of that it might not even be visible ;) ).

 

The return code runs from scratchpad as well:

	CLR @CARTPAGE
	B *R11

 

But, it's always returning to that same first code page.
 

For a video frame, and this is the same whether F18A or not:

 

    MOV R9,*R8+           bank in the next page of video data
    LI R10,>6020          video data start address
    LI R0,>0040           vdp address (swapped and with write flag set)
    BL @LOOP1             subroutine to copy a block (still in scratchpad)
LOOP1
    MOVB R0,*R14          set VDP address
    MOVB @>8301,*R14      set VDP address
    LI R0,192             loops to execute
    INC R10               discard one cart sample
    MOVB @BEEPVOL,*R15    beep tone
    MOVB *R10+,*R13       first data from cartridge


So even though it's all in scratchpad, it's still a lot of instructions between the page change and the first access. Again, it's just video data, so even if it was late it might be okay -- you'd need it to be right before the 5th byte, though, otherwise the sound chip setup can be corrupted and it probably won't be able to recover.
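For readers wondering about the "swapped and with write flag set" comment on LI R0,>0040 in the listing above, here is a short Python sketch of how such a constant can be derived. It assumes the usual VDP convention (low address byte first, then the high byte with >40 set to mark a write) and that the workspace sits at >8300, so that >8301 is the low byte of R0; the function name is made up for the example.

```python
def vdp_write_address_swapped(vdp_addr):
    """Pack a 14-bit VDP address for the two MOVB writes in the listing.

    The high byte of the result is sent first (MOVB R0,*R14) and carries the
    low byte of the VDP address; the low byte of the result is sent second
    (MOVB @>8301,*R14) and carries the high address bits plus the write flag.
    """
    low = vdp_addr & 0xFF
    high = ((vdp_addr >> 8) & 0x3F) | 0x40   # >40 marks a write access
    return (low << 8) | high


if __name__ == "__main__":
    print(hex(vdp_write_address_swapped(0x0000)))   # value loaded into R0
```

For VDP address >0000 this gives >0040, matching the constant in the listing.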

 

The end of video is a little tighter:

	CLR @CARTPAGE
	B @ABSRET

 

But again, always returning to that same initial code page.

 

All the scene data is in the main page as well, so, that seems pretty simple. There's a lot more jumping around of code in the real game, but the demo was kept simpler. It seems like it should be pretty feasible without any tweaks?

Edited by Tursi

2 hours ago, Tursi said:

All the scene data is in the main page as well, so, that seems pretty simple. There's a lot more jumping around of code in the real game, but the demo was kept simpler. It seems like it should be pretty feasible without any tweaks?

Thanks for sharing the details. It does look feasible without any tweaks.

Let me ask a (stupid) question, is the cartridge image for the demo readily available somewhere to test with?

 

All of this makes me wonder how fast the StrangeCart can actually load data from the flash chip, i.e. what the latency from address setup to data would be. It looks like it can't do it in real time, but I need to measure that. I will try to set up a test scenario using your code:

    MOV R0,*R3          * change page on cart
    MOVB *R1+,@>8C00    * read data into VDP memory

The question becomes how this would need to be altered (if at all) to support the game. The MOVB *R1+,... is tricky since the full address (R1) will only be known during the actual fetch, and it may be that this is where things fall apart. However, if returning garbage for the first two bytes does not matter, this might work as is.
