
Strange 7800 issue & analysis


Muddyfunster


This is going to be a bit of a story. The rabbit hole runs deep, so be warned, and please bear with me :)

 

Introduction :

 

I've been getting some strange graphical corruption on my NTSC 7800. I first noticed it a few months ago when I started developing Bernie & the Tower of Doom (BTOD). The issues only happened when I was running Bernie's code and seemed to start when I switched to a 256K + RAM image instead of 128K + RAM. I got my DF patched to the latest version (at the time). It didn't really help, but the issue didn't get any worse either: just a little intermittent glitching, and cold booting the console usually fixed it. I also thought this was likely my code, as it was only happening on BTOD (or, as an outside bet, my console is on the fritz).

 

image.thumb.jpeg.5ea7a573eb5d3de915c1f99ec696c35c.jpeg

 

Recently, however, it has become much worse. Rather than the issue showing up now and again, it now happens every single time I run a build of BTOD.

 

My setup:

 

NTSC 7800 + UAV + Dragonfly running 1.08 (not quite the latest version).

PAL 7800 + UAV + Dragonfly running 1.08 (not quite the latest version).

 

Symptoms:

 

See the images at the end of the post. The symptoms I'm experiencing are:

  • graphical glitching
  • colour glitching
  • general corruption
  • hard crashes and lockups

 

Sometimes the glitch isn't instant: it takes a few seconds and then one of the sprite's colours is now black, or a platform goes black. Sometimes there is some flickering on the bottom row. With the colour glitch, if I "kill" Bernie then he stays locked in his "dead" animation and won't trigger a loss of life or restart. Sometimes it's random sprite or character data appearing on a story screen.

 

When colour glitching occurs, for example on Screen 1 of Tower 1, it's always P3C2, P3C3, P1C1 and P1C2 that go black.

 

The issue only happens under BTOD's code, so in theory that should make diagnosing it quite straightforward, right? It's got to be my code...

 

That's what I thought. Well, here's the kicker: I cannot reproduce the issue when running the same build with the same Dragonfly on my PAL unit, or under emulation (BUP + A7800), or on MiSTer. As far as I'm aware, not a single post has been made about the public demo having glitches like these (see the pictures below). I know emulation isn't the same as hardware, but I wanted to tick as many boxes and be as rigorous as I could when trying to eliminate potential causes or devices.

 

I've posted some photos at the end of this post.

 

Testing:

 

So I see 3 things as potential causes: Console, Code, Dragonfly.

 

Console testing : I tested the code repeatedly on my NTSC 7800. Issues ranged from very minor glitching (random chars), to colour glitches, right through to hard crashes and everything in between. I repeated the same test with the same binary on the PAL unit and the code was flawless and rock solid (there is no PAL optimisation, it's just running the NTSC code with the wrong colours). I also tested various combinations of warm / cold boots, which made no difference. For me, this points at the NTSC console, as the code and DF seem good on the PAL unit.

 

I also tried lots of other games on the console + DF combination, (EXO, Koppers, other demos and releases). All good.

 

Dragonfly : I don't have a 2nd DF so I can't 100% eliminate that, but I tried testing from the dev link and from SD, and I also slowed down the transfer speed on the dev link in case it was a transfer issue. The results were consistent with the above: PAL fine, NTSC glitching. I also tried disabling the POKEY/YM add-ons in the DF, again with no change. I tried various combinations of ROM sizes and other ROMs. Other ROMs ran great; no change on BTOD.

 

Code : My other instinct was that there is some bug in the code that's causing the issues, or perhaps I have a weird hardware issue that only shows under certain circumstances.

 

The problem, however, is that the issue shows up in multiple places like a blanket issue rather than being localised to specific bits of code. Sometimes the glitching looked like I had doublewide enabled on regular text graphics, other times the corruption was just total (see images below). I have tried compiling with various builds of 7800Basic, going back as far as v0.20, and the results were consistent. I couldn't compile under versions older than 0.20 without compile errors, so I couldn't test those.

 

Sometimes the title page is OK, sometimes it glitches, etc. The code was running on a 256K + RAM framework, and I had managed to convince myself that the combination of 256K + RAM was the issue. So I changed the banking to 512K, just like EXO used, and the same issues happened. I tried switching certain things on and off, one at a time, to try to eliminate things in the code that may be causing an issue, like 

 

I tried a few things and there was no improvement one way or the other.

 

This is where things get completely weird. Finally I got to the "let's try this random thing, even though I'm 99.99% certain it will have no effect" kind of thinking.

 

BTOD right now has no sound or music, so I did not have the SET POKEYSUPPORT switch enabled in my 7800Basic code; with no music yet, there was no need. OK, whatever, I'll continue the process of elimination. I added SET POKEYSUPPORT ON and boom, the code is rock solid.

 

I thought this made no sense, so I switched it off again and the glitches came back. Back on: rock solid. I did about 10 cycles of this with the same result. Mind blown, as this makes no sense to me.

 

I swapped back to 256K+RAM with no POKEY, glitches came back.

 

Back to 256K+RAM with POKEY set, running fine. 

 

Adding the SET POKEYSUPPORT ON switch seems to fix these issues. 
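
For anyone wondering whether 10 on/off cycles could line up like that by luck, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for the sake of argument, that the switch has no effect and that each run glitches independently with some fixed probability; the function name and the independence assumption are mine, not anything measured on the console.

```python
def chance_of_perfect_split(p_glitch, cycles=10):
    """Probability that every OFF run glitches and every ON run is clean,
    assuming (hypothetically) the switch has no effect and each run glitches
    independently with probability p_glitch."""
    return (p_glitch ** cycles) * ((1.0 - p_glitch) ** cycles)

# Scan glitch rates from 0% to 100% and keep the rate that is most
# favourable to the "it was pure chance" explanation.
best_case_for_chance = max(chance_of_perfect_split(p / 100) for p in range(101))
print(f"{best_case_for_chance:.2e}")
```

Even at the most favourable glitch rate (50%), the chance of a perfect split over 10 cycles is about one in a million. That only says the switch is linked to the behaviour; it says nothing about why.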

 

Conclusions :

 

Things I know for sure :
 

My code works fine on my PAL console with or without POKEY being set in the code - Zero glitches

 

My code works on my NTSC console only with the POKEY set to on, otherwise there are glitches.

 

My code works fine under emulation with or without the POKEY set.

 

Things that I'm pretty certain of :

 

It only seems to be my NTSC console (no one else has reported it) & SET POKEYSUPPORT ON seems to fix it.

 

I think that there could be a pretty unique set of circumstances where the code on NTSC with POKEY set to off is causing some wild glitching.

 

(I don't have enough knowledge to speculate on how or why that might happen, I can only report the empirical evidence that I see).

 

I'd very much appreciate any input or ideas around this :) 

 

Media :

IMG_8731.thumb.jpg.29d2c2642f123f4a6ae7eff94d8075b0.jpg
IMG_8729.thumb.jpg.b013eb1aa69b4a00ad4026339a458e84.jpg
IMG_8730.thumb.jpg.1deb567476f3c94fc04e7f90240f718c.jpg
IMG_8732.thumb.jpg.fe02465a9741a6906616c3d03f9fdc35.jpg
IMG_8733.thumb.jpg.81c084757e53af5f073e35b3b46100e0.jpg
IMG_8734.thumb.jpg.b024b0e21a4c67d553b93d0eb57a4303.jpg
IMG_8735.thumb.jpg.7b4c93c91f2878e91956c1a58f575639.jpg


Those look very similar to the glitching I've seen on consoles that needed the A15 deglitch cap. Although there have been a few I've seen do similar with Ballblazer where the cap didn't seem to correct them, or it got BB working and then caused issues with other games. What CPU brand is installed in your NTSC 7800? I had one that wasn't nearly as bad, but it wasn't until I removed the Rockwell CPU from it and installed a Synertek I bought from Best Electronics that the issues went away.

 

Near as I can tell at this point, the Rockwells are just a problem waiting to happen in most cases and there really isn't much I've found to correct them.

 

NCR-made ones are better, but they tend to be the ones that need the deglitch cap added; at least, most of the consoles I've found with the cap installed from the factory also have NCR Sallys in them.

 

BTW... that is similar to the corruption that was being reported with the earlier E.X.O. builds after the 3rd or 4th screen.


2 minutes ago, -^CrossBow^- said:

Those look very similar to the glitching I've seen on consoles that needed the A15 deglitch cap. Although there have been a few I've seen do similar with Ballblazer where the cap didn't seem to correct them, or it got BB working and then caused issues with other games. What CPU brand is installed in your NTSC 7800? I had one that wasn't nearly as bad, but it wasn't until I removed the Rockwell CPU from it and installed a Synertek I bought from Best Electronics that the issues went away.

 

Near as I can tell at this point, the Rockwells are just a problem waiting to happen in most cases and there really isn't much I've found to correct them.

 

NCR-made ones are better, but they tend to be the ones that need the deglitch cap added; at least, most of the consoles I've found with the cap installed from the factory also have NCR Sallys in them.

 

BTW... that is similar to the corruption that was being reported with the earlier E.X.O. builds after the 3rd or 4th screen.

I was actually just thinking about that; I saw your video about the cap a while ago and it popped into my head.

 

So if that is indeed the case, it would manifest in other games as well, wouldn't it? (I think it was Ballblazer and another game you were demonstrating in the video.)

Edited by madmax_2069

3 minutes ago, madmax_2069 said:

I was actually just thinking about that; I saw your video about the cap a while ago and it popped into my head.

 

So if that is indeed the case, it would manifest in other games as well, wouldn't it? (I think it was Ballblazer and another game you were demonstrating in the video.)

Possibly, but Ballblazer seems to be the game most likely to show this. The console in the video was also having the same issues with earlier builds of E.X.O., and a few others reported similar graphics corruption with that game during testing. However, I believe nearly all of those 7800s had/have Rockwell CPUs in them. With the latest updates to the DF carts, many of those issues went away, so in those particular instances it was most likely a combination of the Rockwell CPU and the earlier code in the DF carts.

 

But yeah... several of his screenshots look very similar to the issues I've seen from the CPU not being happy about things.

 


Seems to me it's a signal timing issue. Something (phi2, halt, rw, ...) has drifted out of spec, and it's occasionally causing DF to do the wrong thing, like a write instead of a read, or an action on an address that hasn't fully come up on the bus. Enabling pokey is probably giving you a weak pull-down on the signal in question, and it's enough to push it back in-spec.


6 hours ago, -^CrossBow^- said:

Near as I can tell at this point, the Rockwells are just a problem waiting to happen in most cases and there really isn't much I've found to correct them.

Same.  Mine has a Synertek in place of the Rockwell that was originally in there, and after the swap was done a number of inexplicable small-but-noticeable glitches went away.


Gents, thank you for the responses, insights and feedback, very much appreciated.

 

I had read about the CPU "issues" with the 7800, we ran into it a bit with one or two users with E.X.O. but I wasn't on the receiving end myself during development.  I've spent a few hours this morning doing a bit more research. What a rabbit hole :)

 

I think my solution looks like replacing the Rockwell CPU with something else. I spy a few NCR 6502C's on Fleabay, I'll see about snagging one of those.

 

Thanks again for taking the time to read my ramble :)

 

 


Lewis, that sounds an awful lot like the issues I was having with 1942. I'd not had any issues with it on my PAL 7800 using either DF or Concerto, nor under emulation. When I received my NTSC 7800, the very first time I ran 1942 from DF I saw some graphics corruption. At this point it was a 144K ROM so I was able to run it from Concerto and that was fine. So it did seem to be the combination of my NTSC console and DF.

 

As the game progressed I saw other issues too: lockups, crashes and stack dumps. When it neared completion I asked a group of other DF owners to test and we saw issues on 60% of consoles. It wasn't a PAL/NTSC thing and I'm pretty sure it wasn't just Rockwell CPUs (although my NTSC 7800 has one). Rafal investigated and came up with the CPLD update. I don't know if there is scope for further improvement, but if it's not just your console then I am sure Rafal would be interested in taking a look.

 

If you can (and haven't already), it might be an idea to enable the stack dump. Most of the time it didn't help, but at one point I saw it reporting a BRK instruction at a location in non-bank-switched memory which should have contained a PHP. I'm pretty good at creating bugs but I don't see how I could have caused that. It suggested the CPU tried to fetch a PHP (opcode $08) but actually received a BRK ($00), so 1-bit corruption during the fetch. I also saw it hitting BRKs which were actually instruction operands. I think the opcode fetch was corrupted, turning a 2-byte instruction's opcode into a 1-byte one, and then the operand was fetched as if it were the next instruction (I'd seen this previously on Popeye but didn't have a handle on how it was happening).
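
To make those two failure modes concrete, here is a small Python sketch. It uses a handful of documented 6502 opcodes; the `LDA #$00 / RTS` byte sequence and the helper names are hypothetical illustrations, not code taken from 1942 or Popeye. The first helper shows which bit separates PHP from BRK, and the second walks a byte stream to show how a single flipped bit in an opcode fetch (LDA #imm $A9 read as TAY $A8) makes the CPU treat the old operand as the next instruction.

```python
# Minimal opcode table: opcode -> (mnemonic, instruction length in bytes).
# Only the few documented 6502 opcodes needed for this illustration.
OPCODES = {
    0x00: ("BRK", 1),
    0x08: ("PHP", 1),
    0x60: ("RTS", 1),
    0xA8: ("TAY", 1),
    0xA9: ("LDA #imm", 2),
}

def flipped_bits(expected, received):
    """Return the bit positions (0 = least significant) that differ."""
    diff = expected ^ received
    return [bit for bit in range(8) if diff & (1 << bit)]

def decode(stream):
    """Walk a byte stream and list the mnemonics the CPU would execute."""
    executed = []
    pc = 0
    while pc < len(stream):
        mnemonic, length = OPCODES[stream[pc]]
        executed.append(mnemonic)
        if mnemonic == "BRK":
            break  # the CPU would jump to the interrupt vector here
        pc += length
    return executed

# Case 1: PHP ($08) fetched as BRK ($00) is a single dropped bit.
print(flipped_bits(0x08, 0x00))

# Case 2 (hypothetical bytes): LDA #$00 ; RTS
intended = [0xA9, 0x00, 0x60]
# Same bytes, but bit 0 of the first opcode fetch is read as 0 ($A9 -> $A8).
corrupted = [0xA8, 0x00, 0x60]
print(decode(intended))
print(decode(corrupted))
```

In the corrupted case the decoder sees a 1-byte TAY, so the `$00` that was meant to be the LDA operand is fetched as an opcode and shows up as a BRK, which matches the "BRKs that were actually operands" symptom.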

 

Generally I was seeing some graphics corruption after 10 or 20 minutes of play. However sometimes I'd make a minor change to the code and it would occur almost straight away. It seemed to me that this change in behaviour was probably down to things being shifted around in memory. That could possibly be a factor with SET POKEYSUPPORT ON as it may pull in some additional code. Do you get the issues on the public demo?

 

When it came to getting others to test, I was generally getting a crash after an hour of play. So that no one had to play the game for that long, I created a version with invincibility so you could just leave it running (with something pressing the trigger so it was constantly firing) and look in on it later. I don't know how often you see the issues, but if they take some time to occur then it might be worth doing something like this.

 

Edited by playsoft

Thanks guys,

 

@playsoft Thanks for that insight Paul, I'll check that tonight.

On 3/5/2023 at 1:02 PM, playsoft said:

Generally I was seeing some graphics corruption after 10 or 20 minutes of play. However sometimes I'd make a minor change to the code and it would occur almost straight away

I've seen this too. Sometimes it would run nice and clean, then after 15 mins some corruption hits, usually in the same area each time.

 

For me it was happening sometimes on the public demo, but only on my NTSC box, not on PAL. 

 

As I've added more new features and stuff in the engine, it's become progressively worse, to the point where every run has an issue or crashes on my NTSC combination.

 

15 hours ago, Eagle said:

Can you attach a ROM that glitches?
Imho it’s timing. 

Thanks @Eagle.

 

The public test rom causes problems for me but not for any of the closed test group (ROM Attached).

 

We had a similar thing with E.X.O. with a couple of users reporting issues like this.

 

 

Bernie_TOD_145_pubdemo1.a78


13 minutes ago, Muddyfunster said:

Thanks Paul. I assume you are running 1.09 (latest)?

Yes, I'm running 1.09 although it was the 1.08 firmware that fixed things for me - I'll put it back to 1.08 tomorrow and see if it makes a difference (I'm not expecting it to). You may have already done so and have probably had the suggestion already, but it might be worth giving the DF cart contacts a clean. Hopefully @juansolo and @marauder666 can get to the bottom of it.


 

Thanks @playsoft @Karl G & @Eagle .

 

I've got a replacement Sally on the way and I'll be sending it and the unit to @juansolo & @marauder666 for "surgery" :) 

 

If that doesn't sort the issue out, then it could be that my 7800 has some other issue or issues causing the glitching.

 

Appreciate all the comments, testing and suggestions, thanks all.

 

 
