
PICO9918 - a TMS9918A drop-in replacement powered by a Pi Pico (RP2040)



3 hours ago, visrealm said:

Well. Here it is! PICO9918 v0.4 finally working.

 

@aotta you'll notice the PICO9918 version splash is much splashier here :)

The new splash routine is really nice (and less invasive)!!!

For anyone interested in SCART output, I had some initial success just by varying pixelClockKHz and hSyncParam.syncPixels, but the resolution has been wrong so far. I'm confident that with @visrealm's help we can get it working fine.

  • Like 3

Still waiting on my solder paste to arrive next week.  Hopefully, I will be able to get a v3 board built (I have a hot plate).

After that, I have a ColecoVision, some TIs and an ADAM to test it in.

 

 

  • Like 2

I've simplified the Thumb9900 assembler to use a 16KB jump table for instructions (located in flash memory, not to waste RAM, interesting that no performance was lost doing that).  I also fixed a bug in the X (execute) instruction, which was missing a necessary reload of R3 from the current PC value.  The conditional jumps have also been optimised.

 

Slightly over 3.5MIPS average now.
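
The jump-table dispatch described above can be sketched in C. This is only an illustration of the technique, not the Thumb9900 code: the `cpu_t` struct, the handler names and the 16-entry table indexed by the top four opcode bits are all hypothetical (the real table is 16KB and written in assembler). The `const` qualifier is what normally lets a Pico build leave the table in flash rather than copying it to RAM.

```c
#include <stdint.h>

/* Hypothetical emulator state: just enough for the sketch. */
typedef struct {
    uint16_t pc;           /* emulated program counter, in bytes */
    int      last_handler; /* which handler ran last, for illustration */
} cpu_t;

enum { HANDLER_OTHER = 0, HANDLER_ADD = 1, HANDLER_MOV = 2 };

typedef void (*op_handler_t)(cpu_t *cpu, uint16_t opcode);

/* Placeholder handlers: they only record that they were called. */
static void op_other(cpu_t *cpu, uint16_t opcode) { (void)opcode; cpu->last_handler = HANDLER_OTHER; }
static void op_add(cpu_t *cpu, uint16_t opcode)   { (void)opcode; cpu->last_handler = HANDLER_ADD; }
static void op_mov(cpu_t *cpu, uint16_t opcode)   { (void)opcode; cpu->last_handler = HANDLER_MOV; }

/* Read-only table indexed by the top 4 opcode bits.
 * On the TMS9900, >Axxx is A (add words) and >Cxxx is MOV. */
static const op_handler_t jump_table[16] = {
    op_other, op_other, op_other, op_other,  /* >0 .. >3 */
    op_other, op_other, op_other, op_other,  /* >4 .. >7 */
    op_other, op_other, op_add,   op_other,  /* >8 .. >B */
    op_mov,   op_other, op_other, op_other   /* >C .. >F */
};

/* Advance past the opcode word, then jump through the table. */
void dispatch(cpu_t *cpu, uint16_t opcode)
{
    cpu->pc = (uint16_t)(cpu->pc + 2);
    jump_table[opcode >> 12](cpu, opcode);
}
```

A finer-grained table (more opcode bits in the index) trades flash space for fewer decode steps inside each handler.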

TMS9900A.zip

  • Like 5

Posted (edited)
1 hour ago, JasonACT said:

I've simplified the Thumb9900 assembler to use a 16KB jump table for instructions (located in flash memory, not to waste RAM, interesting that no performance was lost doing that).  I also fixed a bug in the X (execute) instruction, which was missing a necessary reload of R3 from the current PC value.  The conditional jumps have also been optimised.

 

Slightly over 3.5MIPS average now.

TMS9900A.zip 13.81 kB · 2 downloads

Thanks Jason. I wouldn't be too concerned about RAM. Currently, the entire firmware binary is ~26KB. Add in around 20KB for runtime stuff (VRAM and incidentals). We're good on RAM.

Edited by visrealm
  • Like 2

3 hours ago, JasonACT said:

I've simplified the Thumb9900 assembler to use a 16KB jump table for instructions (located in flash memory, not to waste RAM, interesting that no performance was lost doing that).  I also fixed a bug in the X (execute) instruction, which was missing a necessary reload of R3 from the current PC value.  The conditional jumps have also been optimised.

 

Slightly over 3.5MIPS average now.

TMS9900A.zip 13.81 kB · 2 downloads

Nice!  So what's left to do a full ti99/4a emulation on a pico? 


1 hour ago, MarkB said:

Nice!  So what's left to do a full ti99/4a emulation on a pico? 

Thanks, but I think other people have done that already, try searching the forum, I'm sure I've read some posts that talk about it.  Probably @speccery thinking about it.  I'm just trying to make a little engine to match the F18A GPU speed, which is said to be 4-6MIPS...  I'm still quite a bit short - but that may not matter, it depends on the different use-cases, and no-one using it has chimed-in.  I'm also not quite sure how fast a TMS9900 needs to run to do 3.5MIPS, but it's certainly a few times more than a standard console runs at, so there's lots of room for a "standard emulator", I guess.

 

I'm not sure I can make it much faster though, I know Classic99 has a 64KB lookup table (or 2?) for flags being set "compared to zero", but I didn't really want to go that far on this little device...  Maybe I should?  But I don't think it'll get me to 4MIPS.  Anyway - some of this is just random thoughts.

  • Like 2

30 minutes ago, JasonACT said:

Thanks, but I think other people have done that already, try searching the forum, I'm sure I've read some posts that talk about it.  Probably @speccery thinking about it.  I'm just trying to make a little engine to match the F18A GPU speed, which is said to be 4-6MIPS...  I'm still quite a bit short - but that may not matter, it depends on the different use-cases, and no-one using it has chimed-in.  I'm also not quite sure how fast a TMS9900 needs to run to do 3.5MIPS, but it's certainly a few times more than a standard console runs at, so there's lots of room for a "standard emulator", I guess.

 

I'm not sure I can make it much faster though, I know Classic99 has a 64KB lookup table (or 2?) for flags being set "compared to zero", but I didn't really want to go that far on this little device...  Maybe I should?  But I don't think it'll get me to 4MIPS.  Anyway - some of this is just random thoughts.

Ok I couldn't find it when I searched - I'll try again.

(you also answered my next question - which was why do you need so many MIPS)

  • Like 2

1 hour ago, JasonACT said:

I'm still quite a bit short - but that may not matter, it depends on the different use-cases, and no-one using it has chimed-in.

Alex Kidd has an F18a based freerunning "water ripple" effect which might look a little different at a lower MIPS, but it wouldn't break anything.

 

The good thing is that the F18a has a bunch of interrupt based features which generally means people write less timing sensitive code...

  • Like 4

5 hours ago, JasonACT said:

Thanks, but I think other people have done that already, try searching the forum, I'm sure I've read some posts that talk about it.  Probably @speccery thinking about it.  I'm just trying to make a little engine to match the F18A GPU speed, which is said to be 4-6MIPS...  I'm still quite a bit short - but that may not matter, it depends on the different use-cases, and no-one using it has chimed-in.  I'm also not quite sure how fast a TMS9900 needs to run to do 3.5MIPS, but it's certainly a few times more than a standard console runs at, so there's lots of room for a "standard emulator", I guess.

 

I'm not sure I can make it much faster though, I know Classic99 has a 64KB lookup table (or 2?) for flags being set "compared to zero", but I didn't really want to go that far on this little device...  Maybe I should?  But I don't think it'll get me to 4MIPS.  Anyway - some of this is just random thoughts.

I worked on a full TI-99/4A emulation for the 32blit and the Pimoroni PicoSystem. The 32blit uses the STM32H750 processor and has no problem running a full console at full speed, but on the Pico it did not quite get there. If I remember correctly, the PicoSystem runs the RP2040 at 250MHz. My implementation of the TI wasn't fully complete: the 9918 VDP was not fully functional. It had enough functionality to run TI Invaders though :)

When emulating the full console on the RP2040, I think quite a lot of the load comes from just driving the display, which I think runs over an SPI bus. That means the Pico needs to maintain a 240x240 16bpp frame buffer and convert from the TMS9918 memory layout to that, which alone takes some horsepower, and then there is the TMS9900 core to take care of as well. I am sure that could be optimized further, for the picosystem is just a tad bit too small, and I also didn't get around to implementing a virtual keyboard.
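
To give a feel for the conversion cost mentioned above, here is a minimal sketch of expanding one TMS9918 pattern row into 16bpp pixels. The function name and the caller-supplied `palette` are assumptions for illustration; it also ignores that colour 0 is transparent on the real chip and should show the backdrop.

```c
#include <stdint.h>

/* Expand one 8-pixel pattern row into eight 16bpp pixels.
 * pattern: one byte from the pattern table, MSB is the leftmost pixel.
 * colour:  one byte from the colour table, high nibble = colour of 1 bits,
 *          low nibble = colour of 0 bits.
 * palette: hypothetical 16-entry table of 16bpp values supplied by the caller. */
void expand_pattern_row(uint8_t pattern, uint8_t colour,
                        const uint16_t palette[16], uint16_t out[8])
{
    uint16_t fg = palette[colour >> 4];
    uint16_t bg = palette[colour & 0x0F];

    for (int x = 0; x < 8; ++x) {
        out[x] = (pattern & (0x80u >> x)) ? fg : bg;
    }
}
```

A full 256x192 frame needs 6144 of these row expansions, before any scaling or cropping to fit a 240x240 panel.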

  • Like 5

11 hours ago, JasonACT said:

I'm not sure I can make it much faster though, I know Classic99 has a 64KB lookup table (or 2?) for flags being set "compared to zero", but I didn't really want to go that far on this little device...  Maybe I should?  But I don't think it'll get me to 4MIPS.  Anyway - some of this is just random thoughts.

Yeah, two! ;) One for bytes and one for words. But that's on a system with technically infinite resources, not necessarily what you want if memory is tight. But in theory it's slightly faster than a mask and test and set bits. (By the time I did it PCs were not sweating over emulation anymore.)

  • Like 3

On 7/22/2024 at 9:04 AM, Tursi said:

Yeah, two! ;) One for bytes and one for words. But that's on a system with technically infinite resources, not necessarily what you want if memory is tight. But in theory it's slightly faster than a mask and test and set bits. (By the time I did it PCs were not sweating over emulation anymore.)

The much smaller byte table makes a nice difference, the word table not so much when it's in FLASH, a little when it's in RAM - but yeah, that's a lot of RAM.
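
As a rough illustration of the byte table idea, the sketch below precomputes the "compared to zero" status bits for every byte value. It is not the Classic99 or Thumb9900 code; the names are made up, and it assumes the usual TMS9900 status layout with L> in the most significant bit, followed by A>, EQ, C, OV and OP.

```c
#include <stdint.h>

/* TMS9900 status bits (bit 0 of the status register is the MSB). */
#define ST_LGT 0x8000u /* logical greater than  */
#define ST_AGT 0x4000u /* arithmetic greater than */
#define ST_EQ  0x2000u /* equal                  */
#define ST_OP  0x0400u /* odd parity (byte ops)  */

/* 256 entries of 16 bits: 512 bytes in total. */
static uint16_t byte_status_table[256];

/* Fill the table once at start-up. */
void init_byte_status_table(void)
{
    for (unsigned v = 0; v < 256; ++v) {
        uint16_t st = 0;
        unsigned ones = 0;

        if (v == 0)             st |= ST_EQ;
        if (v != 0)             st |= ST_LGT;  /* unsigned > 0 */
        if (v != 0 && v < 0x80) st |= ST_AGT;  /* signed > 0   */

        for (unsigned b = v; b != 0; b >>= 1) ones += b & 1u;
        if (ones & 1u)          st |= ST_OP;

        byte_status_table[v] = st;
    }
}

/* One load replaces the compare, branch and bit-set sequence. */
static inline uint16_t byte_status(uint8_t v)
{
    return byte_status_table[v];
}
```

The word-sized equivalent would need 65536 entries, which is where the RAM-versus-flash trade-off discussed above comes in.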

 

On 7/21/2024 at 6:31 PM, visrealm said:

I wouldn't be too concerned about RAM.

Ok, that means I can re-arrange the code and get quicker access to the 16KB jump table which is now closer to the code.  I've also found more savings in many places, and I've upped the overclock ever so slightly to 252MHz which I believe is the real target we're using for VGA...

 

3.9MIPS average, with the cputest debug output turned off.  And I'm all out of tricks.  But nearly 2 times faster than GCC12, almost 3 times GCC10.  I'm sure I've read somewhere "don't hand code assembler, GCC will match or beat you" but I think it is worthwhile.
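
For a sense of scale, the figures above translate into a budget of host clock cycles per emulated instruction. The helper below is just that arithmetic; the function name is made up, and it assumes a single core with no allowance for flash wait states or anything else running alongside.

```c
#include <stdint.h>

/* Host clock cycles available per emulated instruction, scaled by 10
 * so one decimal place survives integer division.
 * clock_khz: host clock in kHz (252 MHz -> 252000)
 * kips:      emulated speed in thousands of instructions per second
 *            (3.9 MIPS -> 3900) */
uint32_t host_cycles_per_instr_x10(uint32_t clock_khz, uint32_t kips)
{
    return (clock_khz * 10u) / kips;
}
```

At 252MHz, 3.9MIPS works out to about 64.6 Pico cycles per '99 instruction, 4MIPS to 63, and the 6MIPS upper figure quoted for the F18A GPU would need 42.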

TMS9900A.zip

  • Like 6

Quick update on the SCART RGBs output that @aotta and I have been working on. He has confirmed that both PAL RGBs 720x576i @50Hz and NTSC RGBs 720x480i @60Hz are now working.

 

This would be a compile-time option (coming soon).
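
The relationship between pixel clock, line length and refresh rate for these interlaced modes can be sketched as follows. The numbers in the note after the code are the standard 13.5MHz sampling figures for 625-line and 525-line video, used here as assumptions; they are not taken from the PICO9918 firmware, and the function names are hypothetical.

```c
#include <stdint.h>

/* Horizontal line rate in Hz from the pixel clock and the total number
 * of pixel periods per line (active + blanking + sync). */
uint32_t line_rate_hz(uint32_t pixel_clock_khz, uint32_t total_pixels_per_line)
{
    return (pixel_clock_khz * 1000u) / total_pixels_per_line;
}

/* Field rate in millihertz for an interlaced mode, where each field
 * carries half of the frame's total lines. */
uint32_t field_rate_mhz(uint32_t pixel_clock_khz, uint32_t total_pixels_per_line,
                        uint32_t total_lines_per_frame)
{
    uint64_t pixel_clock_hz = (uint64_t)pixel_clock_khz * 1000u;
    uint64_t pixels_per_frame = (uint64_t)total_pixels_per_line * total_lines_per_frame;

    /* fields per second = 2 * frames per second */
    return (uint32_t)((pixel_clock_hz * 2u * 1000u) / pixels_per_frame);
}
```

With a 13500kHz pixel clock, 864 total pixels and 625 total lines give a 15625Hz line rate and a 50Hz field rate, while 858 total pixels and 525 total lines give roughly 15734Hz and 59.94Hz.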

  • Like 5
  • Thanks 2

I found a few more instructions (adds, subs, incs & decs) that could have a couple of thumb assembler instructions removed.  I also think I stuffed the '99s branch-and-link instruction - storing the tainted R3 (copy of PC+2) instead of a fresh copy from Pico R10.  Not entirely sure why all tests passed with that, except there's only a few times in the cputest code it would be hit, maybe it was never hit in its current state.  Still 3.9MIPS, but with the cputest debug output re-enabled, so a small gain along with the bugfix.  I think only bugfixes from here on...  I hope there are none.

TMS9900A.zip

  • Like 5

On 7/21/2024 at 6:40 PM, speccery said:

I worked on a full TI-99/4A emulation for the 32blit and the Pimoroni PicoSystem. The 32blit uses the STM32H750 processor and has no problem running a full console at full speed, but on the Pico it did not quite get there. If I remember correctly, the PicoSystem runs the RP2040 at 250MHz. My implementation of the TI wasn't fully complete: the 9918 VDP was not fully functional. It had enough functionality to run TI Invaders though :)

When emulating the full console on the RP2040, I think quite a lot of the load comes from just driving the display, which I think runs over an SPI bus. That means the Pico needs to maintain a 240x240 16bpp frame buffer and convert from the TMS9918 memory layout to that, which alone takes some horsepower, and then there is the TMS9900 core to take care of as well. I am sure that could be optimized further, for the picosystem is just a tad bit too small, and I also didn't get around to implementing a virtual keyboard.

It might be worth another look at this now if we have working and efficient 9918 and 9900 emulations.  Some CRU and some plumbing and we might be there.  I've seen projects that create audio output and there seems to be a library for USB host so a keyboard is possible.  Interfacing I/O to the real ti99 keyboard would be cool as well.  I must do some tinkering when my pico arrives.

  • Like 6

Posted (edited)

That'll work, @RickyDean

 

I have my own custom breakout board I originally designed for breadboarding. https://www.pcbway.com/project/shareproject/Breadboard_to_VGA_adapter_Panelized_844c6ef3.html

It's what I was using to test my v0.3 boards, only because I have a bunch of them.

 

Or you could just buy the VGA connectors and wire them up. 
https://a.aliexpress.com/_mOPy3hQ

 

For v0.4+, I'm using a 6pin 1.25mm JST connector which is compatible with some ITX VGA breakout adapters. https://a.aliexpress.com/_mMR1RLw

 

These are short though. I've ordered some of the connectors from above and some 8" JST 1.25mm cables to wire my own longer ones.

Edited by visrealm
  • Like 5

On 7/24/2024 at 7:45 PM, JasonACT said:

I think only bugfixes from here on...  I hope there are none.

My last "written on paper" TODO was to free up R1 (which is used almost everywhere for a copy of R12 - the 99's STATUS register) by changing the support routines to not use R1...  That's done now, so every '99 instruction that modifies the status register is 2 Pico instructions faster.  Over 4MIPS average now.

 

On 7/21/2024 at 9:55 PM, JasonACT said:

But I don't think it'll get me to 4MIPS.

Well, what do you know...

TMS9900A.zip

  • Like 5

This is my first "go" at the PIX instruction in Pico assembler.  It's not really tested, much like the C version isn't, as it mostly works on the "bitmap layer" of the F18A which I have no experience with.  I think of it as a sprite which is 4 colours + palette registers to make it useful, and the size/position is configurable.  There's also a "wide pixel" mode (2 pixels wide) which allows 16 colours, but the PIX instruction doesn't seem to consider that from what I can tell.

 

A couple of code comments have been fixed, and the conditional jumps have been moved closer to "start" so that if they don't jump, it doesn't cost 2 extra Pico clock cycles to get to the next instruction.

 

Anyway, I've left the C version of PIX in there, so when we're ready we can test both...  I think the assembler version is currently a little bit faster.

TMS9900A.zip

Edited by JasonACT
  • Like 3

7 hours ago, JasonACT said:

There's also a "wide pixel" mode (2 pixels wide) which allows 16 colours, but the PIX instruction doesn't seem to consider that from what I can tell.

No I don't think PIX works for fat pixel bitmaps, but I haven't actually tried it. 


1 minute ago, Asmusr said:

No I don't think PIX works for fat pixel bitmaps, but I haven't actually tried it. 

To be more precise, it works, but it only changes 2 of the four bits of a pixel.

  • Like 3

Wow, what a great team effort. The Pico is by far the most useful little controller out of the bunch, and I will be including the Pico footprint on board in my various TI designs. The doorway goes two ways: much can be done with GPIO and CRU co-operating, with a versatile programmer in the driver's seat. Such talent I see here in this forum; there is not much that cannot be done by the best of the best in this group. Regards, Arto.

  • Like 3