
Gopher2600 (continuing development on Github)


JetSetIlly


3 minutes ago, Andrew Davie said:

Also worth mentioning - there's a global flag for "Show Tooltips" in the Debugger Preferences. The aforementioned operation of tooltips with SHIFT to show is related to how tooltips work when they are turned off (my preference) via this flag.

Yes. Clicking on the icon in the menubar also toggles the global setting. The video shows the icon and the preferences checkbox working in unison.

 

 

 

Link to comment
Share on other sites

  • 1 month later...

I've been working hard, with help from @MarcoJ, to knock the bugs out of the ARMv7-M emulation in Gopher2600. This type of ARM is found in the PlusCart and UnoCart and @MarcoJ and @ZackAttack are making good use of it. The FPU emulation has proven especially challenging.

 

There's still a lot of work to be done but today I reached a significant milestone by getting Marco's XWing / TIE Fighter 3D model demo working. Short video captured from the emulator below:

 

 

To be clear, this is not real time. The emulation is currently running at about 5fps on my machine. By way of comparison, the ARM in the Harmony is emulated at well over 100fps.

 

In addition to the 3D models, Marco's demo features music which relies on the FPU. This is also working under emulation but there are tuning issues so I won't post the results here just yet. I think the tuning issues are due to incorrect timing between the VCS and the ARM (the timer2 peripheral in the ARM being updated incorrectly/inconsistently) but I'm not entirely sure about this. More research and experimentation required.

 

As part of the bug-finding process, I've improved the breakpoint system and the reliability/accuracy of local variable inspection. I've also added the ability to step by ARM instruction for a super-detailed understanding of what is happening in the ARM registers.

 

I'll polish up what I have and hopefully have a new release in a couple of weeks. For anyone who is interested, all code has been pushed to GitHub.

 

Steve

 

 

 

 

 

 

Link to comment
Share on other sites

13 hours ago, JetSetIlly said:

To be clear, this is not real time. The emulation is currently running at about 5fps on my machine. By way of comparison, the ARM in the Harmony is emulated at well over 100fps.

Why is that so? Does emulating the CPU require that many resources? Or is the code just not optimized yet? Or both?

Edited by Thomas Jentzsch
Link to comment
Share on other sites

1 minute ago, Thomas Jentzsch said:

Why is that so? Does emulating the CPU require that many resources? Or is the code just not optimized yet? Or both?

Great question. The main difference is that in Marco's ROMs and Zack's StrongARM ROMs, the ARM is being emulated for the entire screen, and not just during VBLANK and Overscan like you would typically see in CDFJ ROMs.

 

I don't think there's much more I can do in the way of optimisation. In the case of the StrongARM ROMs though, there are library calls that have been implemented natively and those ROMs run much faster.

 

The performance drop is similar to what would happen if, instead of implementing CDFJ natively, we executed the instructions in the Harmony ROM under emulation and relied on the Harmony implementation. Although the drop wouldn't be as dramatic because there are fewer instructions per colour clock in the case of the Harmony ARM.
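To put rough numbers on the "per colour clock" comparison, here is a minimal sketch of the arithmetic. The NTSC colour clock rate is standard; the 216 MHz STM32 figure is one quoted elsewhere in this thread, and the 70 MHz Harmony figure is an assumption for illustration only. Note that it counts ARM cycles rather than instructions, since instruction timings vary.

```go
package main

import "fmt"

// Clock rates in Hz. The NTSC colour clock is 3.579545 MHz.
// The ARM figures are assumptions for illustration: 216 MHz for the
// STM32 and 70 MHz for the ARM in the Harmony.
const (
	colourClockHz = 3_579_545.0
	stm32Hz       = 216_000_000.0
	harmonyHz     = 70_000_000.0
)

// armCyclesPerColourClock returns how many ARM cycles elapse during one
// TIA colour clock at the given ARM clock rate.
func armCyclesPerColourClock(armHz float64) float64 {
	return armHz / colourClockHz
}

func main() {
	fmt.Printf("STM32:   %.1f ARM cycles per colour clock\n", armCyclesPerColourClock(stm32Hz))
	fmt.Printf("Harmony: %.1f ARM cycles per colour clock\n", armCyclesPerColourClock(harmonyHz))
}
```

Under those assumptions the faster ARM has roughly three times as many cycles to emulate for every colour clock, before even considering that it runs for the whole screen.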

 

I'm not too worried about the performance though. I think the real utility for this is ROM development. Faster would be nice but I have a couple of ideas for how to improve the development experience in lieu of that.

Link to comment
Share on other sites

4 minutes ago, JetSetIlly said:

Great question. The main difference is that in Marco's ROMs and Zack's StrongARM ROMs, the ARM is being emulated for the entire screen, and not just during VBLANK and Overscan like you would typically see in CDFJ ROMs.

The ARM is running in parallel with the 6507 kernel code? I haven't seen any CDFJ ROM doing that, yes. But I suppose it should be possible there too.

 

But I wonder why this is even required to that extent. Developers know that having enough CPU power can lead to lazy programming. Quite the opposite of what you would expect from a 2600 game.

4 minutes ago, JetSetIlly said:

The performance drop is similar to what would happen if, instead of implementing CDFJ natively, we executed the instructions in the Harmony ROM under emulation and relied on the Harmony implementation. Although the drop wouldn't be as dramatic because there are fewer instructions per colour clock in the case of the Harmony ARM.

Got you.

4 minutes ago, JetSetIlly said:

I'm not too worried about the performance though. I think the real utility for this is ROM development. Faster would be nice but I have a couple of ideas for how to improve the development experience in lieu of that.

Yes and no. If the emulators cannot handle these ROMs for the players, then many people will be excluded from playing them. :sad: 

Link to comment
Share on other sites

7 minutes ago, Thomas Jentzsch said:

But I wonder why this is even required to that extent. Developers know that having enough CPU power can lead to lazy programming. Quite the opposite of what you would expect from a 2600 game.

This is a philosophical question really. It's something I've thought about and I do have my own answer but I don't have the time right now to write it out 🙂

 

10 minutes ago, Thomas Jentzsch said:

Yes and no. If the emulators cannot handle these ROMs for the players, then many people will be excluded from playing them. :sad: 

Marco and Zack are making these games regardless of whether emulation is possible. I view my role as making their life a little easier by providing tooling to make the development cycle faster and to provide insights into what is happening in their C programs.

 

But I definitely take your point about being able to play the games and I'm thinking hard about how to improve the situation.

 

Link to comment
Share on other sites

14 hours ago, Thomas Jentzsch said:

But I wonder why this is even required to that extent. Developers know that having enough CPU power can lead to lazy programming. Quite the opposite of what you would expect from a 2600 game.

Ironically, it's been the opposite for me. I tend to spend way more time than I should on the display kernels because this tech allows you to squeeze more out of the TIA.

14 hours ago, JetSetIlly said:

Marco and Zack are making these games regardless of whether emulation is possible. I view my role as making their life a little easier by providing tooling to make the development cycle faster and to provide insights into what is happening in their C programs.

This is a huge understatement. I use Gopher 99% of the time and only bother testing on real hardware when I'm going to share a build with other PlusCart users.

 

15 hours ago, Thomas Jentzsch said:

Yes and no. If the emulators cannot handle these ROMs for the players, then many people will be excluded from playing them. :sad: 

14 hours ago, JetSetIlly said:

But I definitely take your point about being able to play the games and I'm thinking hard about how to improve the situation.

How about compiling the game and emulator to webassembly and running it in web browsers? It might just be crazy enough to work.

Link to comment
Share on other sites

3 hours ago, ZackAttack said:

How about compiling the game and emulator to webassembly and running it in web browsers? It might just be crazy enough to work.

Funnily enough, that's something I've looked at and kept on the back burner for a couple of years.

 

I've built the emulator so that the 2600 engine can be compiled into other things, including other GUI systems and I have a version that can be compiled to WASM. It's only a proof of concept so no debugger etc. but it runs, albeit too slowly to be of any use. I don't know what to do about it to be honest.

 

Link to comment
Share on other sites

3 hours ago, ZackAttack said:

Ironically, it's been the opposite for me. I tend to spend way more time than I should on the display kernels because this tech allows you to squeeze more out of the TIA.

Yes, the 6507 code still needs heavy optimization for optimal results. But the ARM code? Not so much.

3 hours ago, ZackAttack said:

How about compiling the game and emulator to webassembly and running it in web browsers? It might just be crazy enough to work.

Crazy, yes. But IMO any game should run at full speed in Stella too.

Link to comment
Share on other sites

On 8/16/2023 at 7:46 PM, Thomas Jentzsch said:

Why is that so? Does emulating the CPU require that many resources? Or is the code just not optimized yet? Or both?

My ACE ROMs are written in a raw ARM format that has no emulation shortcuts yet. I have yet to improve on it.

 

Running code for a non-native processor is one of the greatest challenges in emulation generally, at least in real time. Modern processors have roughly reached their limit in clock speed (6 GHz for the Core i9-13900KS), until quantum processors become affordable. Emulating the STM32 on a cycle-by-cycle basis at 216 MHz is quite a stretch for a modern x86 processor. As a comparison, emulating the SNES (whose processor runs at only a few MHz) needed roughly a 200 MHz x86 processor in DOS mode with zSNES in the 90s, and attempting cycle-for-cycle emulation in byuu's emulator required a 2 GHz+ machine in the 2010s. Gopher2600 is a cycle-for-cycle emulator like byuu's, at least for ACE in its raw form.
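As a back-of-the-envelope illustration of why this is a stretch, the sketch below divides the host clock by the emulated clock using the two figures quoted above. It ignores multiple cores, instructions per cycle and everything else the emulator has to do, so treat it as an upper bound on the budget rather than a measurement.

```go
package main

import "fmt"

// hostCyclesPerGuestCycle returns the number of host clock cycles available
// for each emulated clock cycle if the emulation is to keep up in real time
// on a single host core.
func hostCyclesPerGuestCycle(hostHz, guestHz float64) float64 {
	return hostHz / guestHz
}

func main() {
	const hostHz = 6e9    // 6 GHz host, the figure quoted in the post
	const guestHz = 216e6 // 216 MHz STM32, the figure quoted in the post
	fmt.Printf("%.1f host cycles per emulated ARM cycle\n",
		hostCyclesPerGuestCycle(hostHz, guestHz))
}
```

That leaves fewer than 30 host cycles to fetch, decode and execute each emulated ARM cycle, along with the memory and peripheral bookkeeping that goes with it.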

 

The solution to optimize my ACE programs for real-time operation is to limit the scope of ARM code that runs during screen drawing, so that it's procedurally emulated instead of letting the game's own ROM driver do it. This is "dumbing down" the scope of what could be possible with STM32 emulation (in a universal emulator sense) in favour of agreeing on a common library that allows emulation to be faster for a limited subset. This is roughly what CDFJ is: a well-thought-out system of programming conventions for using the ARM processor, such that running on real hardware vs emulation is seamless and optimized for emulation. I'm not aware of people writing their own Harmony cart ARM driver other than DPC+/CDF/CDFJ/CDFJ+; such a custom driver would be a challenge to emulate unless a middleware emulation library was written.

 

Zack has made the STM32 emulate better with the StrongARM library. Its library, written in C, is compiled natively for whichever processor Gopher2600 is run on. There is also the ELF extension of this that improves things even more.

 

 

 

Link to comment
Share on other sites

Thanks for the input. Yes, emulation seems tricky, especially now that you are running at 216 MHz. Probably one could emulate the 6507 on one CPU and the ARM on another one (per thread), but that seems to be it, right?

 

What exactly is the StrongARM library? I understand that it is compiled for the target CPUs, but what's its content? And how much is it used? 

Edited by Thomas Jentzsch
Link to comment
Share on other sites

3 minutes ago, Thomas Jentzsch said:

Thanks for the input. Yes, emulation seems tricky, especially now that you are running at 216 MHz. Probably one could emulate the 6507 on one CPU and the ARM on another one (per thread), but that seems to be it, right?

I'm not sure there's any benefit to that because you need to keep the CPUs in synchronisation. Any advantage of parallel computation is lost because the synchronisation points are so frequent. That's my feeling anyway. It would be worth implementing and measuring.
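A toy sketch of the lockstep idea, to show how often the two processors would need to agree. The `cpu` type and `runLockstep` function are hypothetical and are not how Gopher2600 is structured; the ratio of 181 ARM cycles per 6507 cycle assumes a 216 MHz ARM and a 6507 at about 1.19 MHz.

```go
package main

import "fmt"

// cpu is a hypothetical stand-in for an emulated processor.
type cpu struct {
	name   string
	cycles int
}

// Step advances the processor by one cycle.
func (c *cpu) Step() { c.cycles++ }

// runLockstep advances the 6507 one cycle at a time and, after each 6507
// cycle, runs the ARM for armPerMOS cycles. It returns the number of points
// at which both processors had to agree on the state of the bus.
func runLockstep(mos, arm *cpu, mosCycles, armPerMOS int) int {
	syncPoints := 0
	for i := 0; i < mosCycles; i++ {
		mos.Step()
		for j := 0; j < armPerMOS; j++ {
			arm.Step()
		}
		syncPoints++ // a threaded design would need a barrier here
	}
	return syncPoints
}

func main() {
	mos := &cpu{name: "6507"}
	arm := &cpu{name: "ARM"}

	// 76 CPU cycles per scanline, 262 scanlines per NTSC frame.
	const mosCyclesPerFrame = 76 * 262
	// Assumed ratio: 216 MHz ARM / ~1.19 MHz 6507 is about 181.
	const armPerMOS = 181

	n := runLockstep(mos, arm, mosCyclesPerFrame, armPerMOS)
	fmt.Printf("%d sync points per frame (%s: %d cycles, %s: %d cycles)\n",
		n, mos.name, mos.cycles, arm.name, arm.cycles)
}
```

With a barrier needed after only a couple of hundred ARM cycles, the cost of coordinating two threads could plausibly outweigh the work done between barriers, which is the concern raised above. Measuring would settle it.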

 

11 minutes ago, Thomas Jentzsch said:

What exactly is the StrongARM library? I understand that it is compiled for the target CPUs, but what's its content? And how much is it used? 

The StrongARM library contains functions that facilitate the insertion of 6507 instructions into the data stream. For example, the StrongARM  function vcsJmp3() inserts a JMP instruction.

 

The implementation for Gopher2600 is here, to give you an idea of what's involved: https://github.com/JetSetIlly/Gopher2600/blob/master/hardware/memory/cartridge/elf/strongarm.go (Note that I've tied StrongARM to the ELF loader but it could theoretically be used with the ACE loader too)
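For readers who don't want to dig through the linked file, here is a much simplified sketch of what "inserting a JMP into the data stream" could look like. The `busStream` type and the `jmp3` method are hypothetical names made up for this example; the real `vcsJmp3()` signature and behaviour may well differ. The only hard fact used is that the 6507's absolute JMP is opcode 0x4C followed by a little-endian address.

```go
package main

import "fmt"

// busStream is a hypothetical queue of bytes that the cartridge will
// present to the 6507 on successive reads.
type busStream struct {
	data []uint8
}

// push appends bytes to the end of the stream.
func (b *busStream) push(v ...uint8) {
	b.data = append(b.data, v...)
}

// jmp3 queues a 6507 absolute JMP to target: opcode 0x4C followed by the
// low and high bytes of the address. The instruction takes three cycles,
// which is presumably what the 3 in vcsJmp3 refers to.
func (b *busStream) jmp3(target uint16) {
	b.push(0x4C, uint8(target), uint8(target>>8))
}

func main() {
	s := &busStream{}
	s.jmp3(0x1000) // 0x1000 is an arbitrary example address
	fmt.Printf("% X\n", s.data)
}
```

The point is that the ARM program decides, byte by byte, what the 6507 sees next, so the emulator can implement each such function directly instead of emulating the ARM instructions behind it.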

 

 

 

 

Link to comment
Share on other sites

On 8/16/2023 at 9:09 PM, JetSetIlly said:

Marco and Zack are making these games regardless of whether emulation is possible. I view my role as making their life a little easier by providing tooling to make the development cycle faster and to provide insights into what is happening in their C programs.

Gopher2600 is honestly the first time I have seen my ACE programs running under the hood. Before Gopher2600, the way I wrote them was to try out code on the PlusCart on a trial-and-error basis. The only feedback I had was whether the game crashed or the screen rolled on my Atari. It was slow and I tended to run code that was reliable rather than efficient. With Gopher2600's insight into how long various pieces of code take, I can make my software more efficient.

 

It has been a fun journey working with @JetSetIlly trying to find ways to improve the emulation of the STM32's ARM instructions in Gopher2600 by displaying diagnostic outputs that show different results when run on a PlusCart vs inside the emulator. This shows where the two diverge. The way forward is to throw different code at the emulator, compare it to the PlusCart, understand why it's different, and then write the fix for specific ARM instructions in the emulator.
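The comparison step can be sketched as finding the first point at which two recorded sequences of diagnostic values disagree. This is only an illustration of the idea: the `firstDivergence` function and the sample values are made up, and in practice the comparison described here was done by eye on screen output.

```go
package main

import "fmt"

// firstDivergence returns the index of the first entry at which the two
// traces differ, or -1 if they are identical. If one trace is a prefix of
// the other, the divergence is reported at the length of the shorter trace.
func firstDivergence(hardware, emulator []uint32) int {
	n := len(hardware)
	if len(emulator) < n {
		n = len(emulator)
	}
	for i := 0; i < n; i++ {
		if hardware[i] != emulator[i] {
			return i
		}
	}
	if len(hardware) != len(emulator) {
		return n
	}
	return -1
}

func main() {
	// Made-up diagnostic values, e.g. raw bit patterns of FPU results.
	hw := []uint32{0x3f800000, 0x40000000, 0x40400000}
	emu := []uint32{0x3f800000, 0x40000000, 0x40400001}
	fmt.Println("first divergence at index", firstDivergence(hw, emu))
}
```

Knowing the first index that differs narrows the search to the instructions executed just before that value was produced.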

 

The image below shows PlusCart and Gopher2600 running in parallel. A 3D variable was being watched using the 4 ASCII characters and the actual 3D plotting output. The PlusCart and emulator agreed on the variable, which proved Gopher2600 was processing the 3D transformations correctly. However, the two diverged in the 3D drawing. The 3D outline could be seen but errors were occurring in emulation that resulted in plotting artefacts. This led to the discovery of an FPU instruction that wasn't quite working and to writing the fix. After that, the 3D plotting was clean.

[Image: PlusCart and Gopher2600 running in parallel]

 

 

Link to comment
Share on other sites

45 minutes ago, JetSetIlly said:

I'm not sure there's any benefit to that because you need to keep the CPUs in synchronisation. Any advantage of parallel computation is lost because the synchronisation points are so frequent. That's my feeling anyway. It would be worth implementing and measuring.

 

The StrongARM library contains functions that facilitate the insertion of 6507 instructions into the data stream. For example, the StrongARM  function vcsJmp3() inserts a JMP instruction.

Ah, now I understand the problem better. Before I thought both CPUs would work mostly on their own.

45 minutes ago, JetSetIlly said:

The implementation for Gopher2600 is here, to give you an idea of what's involved: https://github.com/JetSetIlly/Gopher2600/blob/master/hardware/memory/cartridge/elf/strongarm.go (Note that I've tied StrongARM to the ELF loader but it could theoretically be used with the ACE loader too)

I only had a brief look. Are you "racing" the 6507 with the ARM now? So that there is no fixed 6507 (kernel) code, just code generated on-the-fly by the ARM?

Edited by Thomas Jentzsch
Link to comment
Share on other sites

8 minutes ago, Thomas Jentzsch said:

Are you "racing" the 6507 with the ARM now?

That's a good way of thinking about it.

 

10 minutes ago, Thomas Jentzsch said:

So that there is no fixed 6507 (kernel) code, just code generated on-the-fly by the ARM?

That's right. In a pure StrongARM ROM there is no DASM stage, it's all C (although it doesn't have to be C of course). The 6507 code could be different for every frame.

Link to comment
Share on other sites

12 minutes ago, JetSetIlly said:

That's right. In a pure StrongARM ROM there is no DASM stage, it's all C (although it doesn't have to be C of course).

Which differs from my ACE programs. I've been pre-compiling a 6502 program with dasm that is baked into the ACE ROM and is then interpreted by the ARM code in the ROM. I am hopeful of upgrading this to StrongARM style as an improvement.

Link to comment
Share on other sites

1 hour ago, Thomas Jentzsch said:

BTW: Did I miss the discussion about the StrongARM library etc.? Or has that not been discussed at AtariAge before?

StrongARM has been around for a while but @ZackAttack will know for sure. I think it came about as a consequence of the bus stuffing experiments.

Link to comment
Share on other sites

On 8/17/2023 at 11:48 AM, Thomas Jentzsch said:

BTW: Did I miss the discussion about the StrongARM library etc.? Or has that not been discussed at AtariAge before?

It's been so long, you probably just forgot about it. I got a little sidetracked with hardware development in my quest to make these ARM enhanced games publishable to carts, and then further sidetracked trying to make them support both 2600 and 7800. But thanks to recent collaborations there's been huge progress made and there should be some playable demos coming out later this year.

 

Link to comment
Share on other sites

  • 2 weeks later...

@ZackAttack has put together an intriguing ROM using the new ELF/StrongARM "bankswitching" format. This particular ROM, Wushu Masters,  has given me a real headache and for the longest time it looked like this:

[Image: Wushu Masters rendering incorrectly in Gopher2600]

 

Obviously wrong but it wasn't at all obvious what the problem was. All the ARM instructions it uses are used by other ROMs which work just fine.

 

This morning Zack pointed out that I wasn't executing the .init sections in the ELF file. I was loading them but not executing them. As the name suggests, these sections point to initialisation functions that must be run before the main() function.
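As a rough illustration of what "executing" such a section involves, the sketch below decodes a table of function addresses of the kind found in an `.init_array` section of a 32-bit little-endian ARM ELF file. That layout is an assumption; the post doesn't say exactly which init sections were involved. The `initFunctions` name and the sample bytes are made up.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// initFunctions decodes the contents of an .init_array style section.
// Each entry is a 4-byte little-endian function address. Trailing bytes
// that do not fill a whole entry are ignored.
func initFunctions(section []byte) []uint32 {
	var addrs []uint32
	for i := 0; i+4 <= len(section); i += 4 {
		addrs = append(addrs, binary.LittleEndian.Uint32(section[i:i+4]))
	}
	return addrs
}

func main() {
	// Made-up section contents holding two function addresses. Bit 0 is
	// set in each because addresses of Thumb code have bit 0 set.
	section := []byte{0x01, 0x00, 0x00, 0x08, 0x41, 0x00, 0x00, 0x08}

	for _, addr := range initFunctions(section) {
		// A loader would run the function at this address here, in
		// table order, before transferring control to main().
		fmt.Printf("run initialiser at 0x%08x\n", addr)
	}
}
```

Loading the bytes without walking the table leaves any globals that depend on those initialisers in whatever state memory happened to be in, which would fit the symptoms described.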

 

A quick change to the emulator and the problem immediately resolved itself:

 

 

Selectable characters, scrolling background and great colours. I hope @ZackAttack finishes this. It looks to be tremendous fun 🙂

 

 

Link to comment
Share on other sites

21 hours ago, JetSetIlly said:

@ZackAttack has put together an intriguing ROM using the new ELF/StrongARM "bankswitching" format. This particular ROM, Wushu Masters,  has given me a real headache and for the longest time it looked like this:

 

Very nice, looks like some updates have been made to the game since we played it on ZPH in 2019. Looking forward to playing it again when it's ready!

 

- James

Link to comment
Share on other sites

👍 

 

Have you thought about scrolling the background delayed and in larger chunks? You only have to scroll when a fighter gets close to a border. And then you scroll (depending on the other fighter) e.g. 5 times one PF pixel at e.g. 30 Hz. So that you have them centered again. Or even a bit to the opposite side, because one could predict further movement in the current direction.

 

This kind of scrolling has been used in multiple games (e.g. Thrust, Boulder Dash and Mappy). It makes the playfield scrolling look smooth.
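One way the suggestion could be sketched, using the example numbers from the post (5 playfield pixels per burst, one pixel every other frame for 30 Hz). The `scroller` type, the 8-pixel trigger margin and the single-fighter simplification are all assumptions for illustration, not anything taken from the game.

```go
package main

import "fmt"

const (
	screenWidth = 40 // playfield pixels across the screen
	margin      = 8  // hypothetical distance from a border that triggers a scroll
	chunk       = 5  // playfield pixels per scroll burst, as in the post
)

// scroller tracks the background offset and any scroll steps still owed.
type scroller struct {
	offset  int // current background offset in playfield pixels
	pending int // steps still to perform; the sign gives the direction
	frame   int // frame counter, used to halve the update rate
}

// update is called once per 60 Hz frame with the fighter's on-screen x
// position in playfield pixels. It returns the adjusted on-screen position.
func (s *scroller) update(x int) int {
	s.frame++
	if s.pending == 0 {
		if x >= screenWidth-margin {
			s.pending = chunk
		} else if x < margin {
			s.pending = -chunk
		}
	}
	// Scroll one playfield pixel on every other frame, i.e. at 30 Hz.
	if s.pending != 0 && s.frame%2 == 0 {
		step := 1
		if s.pending < 0 {
			step = -1
		}
		s.offset += step
		s.pending -= step
		x -= step // the fighter moves back towards the centre
	}
	return x
}

func main() {
	s := &scroller{}
	x := 34 // fighter close to the right border
	for i := 0; i < 10; i++ {
		x = s.update(x)
	}
	fmt.Println("offset:", s.offset, "fighter x:", x)
}
```

A fuller version would take the other fighter's position into account before committing to a burst, as the post notes.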

Link to comment
Share on other sites
