W65C02 or W65C816 on R1 of the Advanced PCB Remake For the Atari 800XL


Hi,

I'm not very familiar with PLDs, so this might be wrong...

Device    g22v10 ;

/* *************** INPUT PINS *********************/
PIN   1  =  CLK ;

/* *************** OUTPUT PINS *********************/
PIN   16 =  Q ;
PIN   14 =  Q2 ;
PIN   15 =  Q4 ;

Q.D  = !CLK ;
Q2.D = !Q.D ;
Q4.D = !Q2.D ;

I haven't added the VPA/VDA conditions because I don't know how you would combine them in your program.

According to the page mentioned, we need to consider preventing access to the bus when VPA & VDA != 0. If the system works fine without checking VPA/VDA, that means the invalid states are, for example, too short to have any influence on the whole system...

Edited by pancio
4 minutes ago, pancio said:

I haven't added the VPA/VDA conditions because I don't know how you would combine them in your program.

According to the page mentioned, we need to consider preventing access to the bus when VPA & VDA != 0. If the system works fine without checking VPA/VDA, that means the invalid states are, for example, too short to have any influence on the whole system...

I am positive that VPA/VDA aren't needed unless you are running the W65C816 in native mode.  If it's in emulation (6502) mode, you can ignore the pins.

1 hour ago, reifsnyderb said:

I am positive that VPA/VDA aren't needed unless you are running the W65C816 in native mode.  If it's in emulation (6502) mode, you can ignore the pins.

I cannot see anything in the datasheet that would confirm that VPA/VDA are not valid in 6502 mode. Moreover, assuming that it only works in native mode would sort of defeat the purpose, known as "20% power for free": the majority of legacy code that could benefit from it, such as Atari BASIC, the FP package, TBXL, or SysInfo, runs in the legacy "emulation" 6502 mode.

13 minutes ago, drac030 said:

I cannot see anything in the datasheet that would confirm that VPA/VDA are not valid in 6502 mode. Moreover, assuming that it only works in native mode would sort of defeat the purpose, known as "20% power for free": the majority of legacy code that could benefit from it, such as Atari BASIC, the FP package, TBXL, or SysInfo, runs in the legacy "emulation" 6502 mode.

Good point.  Thanks!

2 hours ago, pancio said:

I haven't added the VPA/VDA conditions because I don't know how you would combine them in your program.

According to the page mentioned, we need to consider preventing access to the bus when VPA & VDA != 0. If the system works fine without checking VPA/VDA, that means the invalid states are, for example, too short to have any influence on the whole system...

I saw @drac030's post, think @drac030 is correct, and think I misspoke.  VPA/VDA may be of use to run the CPU faster internally.  You'll need VPA, VDA, and a faster clock running into the PLD.  You may be able to use the Fast Phi clock pulse between the GTIA and ANTIC.

Most likely, the ATF16V8B would work.  You'd run VPA, VDA, and the clock pulse into the PLD.  There are several pins still available to do this.

So, you could set pins 5, 6, and 7 to VPA, VDA, and the clock (FPhi), respectively.  (When putting the footprints on the board, I tend to move the pins around.)

Then, you'd need some code like this:

Phi0_CPU = Phi0 & L_HALT &  VPA &  VDA #    /* Phi0 to CPU. */
           Phi0 & L_HALT & !VPA &  VDA #
           Phi0 & L_HALT &  VPA & !VDA #
           FPhi & L_HALT & !VPA & !VDA;     /* Use fast clock for internal operations. */

That may work...

I'll have to test that later on tonight with my board that is running the W65C816 processor.
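
As a quick sanity check of the equation above, here is a minimal Python sketch (not PLD code) that evaluates the same four product terms for every VPA/VDA combination and reports which input the output follows. The names `phi0_cpu` and `selected_clock` are made up for this illustration, and signals use 1 for high. It only models static logic levels; it says nothing about timing or about glitches when the output switches from one clock to the other.

```python
from itertools import product


def phi0_cpu(phi0: int, fphi: int, l_halt: int, vpa: int, vda: int) -> int:
    """Evaluate the four product terms of the posted equation (1 = high)."""
    return int(
        bool(phi0 and l_halt and vpa and vda)
        or bool(phi0 and l_halt and not vpa and vda)
        or bool(phi0 and l_halt and vpa and not vda)
        or bool(fphi and l_halt and not vpa and not vda)
    )


def selected_clock(vpa: int, vda: int, l_halt: int = 1) -> str:
    """Name the input that the output tracks for a given VPA/VDA state."""
    levels = list(product((0, 1), repeat=2))  # all (phi0, fphi) level pairs
    if all(phi0_cpu(p, f, l_halt, vpa, vda) == p for p, f in levels):
        return "Phi0"
    if all(phi0_cpu(p, f, l_halt, vpa, vda) == f for p, f in levels):
        return "FPhi"
    return "held low"


if __name__ == "__main__":
    for vpa, vda in product((0, 1), repeat=2):
        print(f"VPA={vpa} VDA={vda} -> {selected_clock(vpa, vda)}")
```

With L_HALT high, the output follows FPhi only when VPA and VDA are both low, and follows Phi0 in the other three cases. With L_HALT low, the output is held low.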

Maybe I'm missing something, but I'm not sure how speeding up single internal operation cycles is going to help when the 65816 is tied directly to the main memory bus. Any speedup from running the IO cycle at 14MHz will just be eaten by additional time to synchronize back to the 1.79MHz clock, unless two or more IO cycles occur in a row. That generally requires something like native mode, an unaligned direct page, or write posting. It might work in Rapidus' case where the slower 40MHz clock is for wait states and it can shift the phase of the clock, but that can't be done for the main 1.79MHz system clock.
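
The synchronization argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: the fast clock is exactly 8x the bus clock (roughly 14MHz against 1.79MHz), every bus cycle must start on a bus-clock boundary and lasts one full bus cycle, and an accelerated internal cycle takes a single fast tick. The function `total_fast_ticks` and the 'B'/'I' cycle strings are hypothetical and not taken from any real hardware; the model does not cover designs that can shift the phase of the slow clock.

```python
FAST_TICKS_PER_BUS_CYCLE = 8  # assumption: ~14MHz fast clock, ~1.79MHz bus clock


def total_fast_ticks(cycles: str, accelerate: bool) -> int:
    """Count fast-clock ticks for a cycle string: 'B' = bus, 'I' = internal."""
    t = 0
    for c in cycles:
        if c == "I" and accelerate:
            t += 1  # internal cycle runs for one fast tick
        else:
            t += -t % FAST_TICKS_PER_BUS_CYCLE  # wait for the next bus-clock edge
            t += FAST_TICKS_PER_BUS_CYCLE       # then spend one full bus cycle
    # Finish on a bus-clock boundary so sequences can be compared fairly.
    return t + (-t % FAST_TICKS_PER_BUS_CYCLE)


if __name__ == "__main__":
    for seq in ("BIB", "BIIB"):
        slow = total_fast_ticks(seq, accelerate=False)
        fast = total_fast_ticks(seq, accelerate=True)
        print(f"{seq}: {slow} ticks normal, {fast} ticks accelerated")
```

Under these assumptions, a lone internal cycle between two bus cycles gains nothing because the saved time is spent waiting for the next bus-clock edge, while two internal cycles in a row save one full bus cycle.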

 

5 hours ago, reifsnyderb said:

I saw @drac030's post, think @drac030 is correct, and think I misspoke.  VPA/VDA may be of use to run the CPU faster internally.  You'll need VPA, VDA, and a faster clock running into the PLD.  You may be able to use the Fast Phi clock pulse between the GTIA and ANTIC.

I'll have to test that later on tonight with my board that is running the W65C816 processor.

Using Fast Phi is a disaster.  That doesn't work at all.  Tying the fast clock directly to the oscillator works... but breaks compatibility with the SIDE3 cartridge.  It booted to BASIC and ran some cartridges (e.g. PacMan and Donkey Kong).  I didn't do any more testing, as SIDE3 wasn't working and the benchmarking software is on that cartridge.

On another note, a boneheaded maneuver like accidentally tying the fast clock PLD input to the /halt signal, instead of the oscillator, doesn't work either.

Edited by reifsnyderb
34 minutes ago, phaeron said:

Maybe I'm missing something, but I'm not sure how speeding up single internal operation cycles is going to help when the 65816 is tied directly to the main memory bus. Any speedup from running the IO cycle at 14MHz will just be eaten by additional time to synchronize back to the 1.79MHz clock, unless two or more IO cycles occur in a row. That generally requires something like native mode, an unaligned direct page, or write posting. It might work in Rapidus' case where the slower 40MHz clock is for wait states and it can shift the phase of the clock, but that can't be done for the main 1.79MHz system clock.

 

Yeah, pretty much the case.  I don't think there are enough internal cycles to make it worth the effort, either.

12 hours ago, phaeron said:

Maybe I'm missing something, but I'm not sure how speeding up single internal operation cycles is going to help when the 65816 is tied directly to the main memory bus. Any speedup from running the IO cycle at 14MHz will just be eaten by additional time to synchronize back to the 1.79MHz clock, unless two or more IO cycles occur in a row.

I do not know *how* exactly it is implemented, I just know that it works in Rapidus and that it also worked in that early turbo board I mentioned: it had a physical switch to control whether the internal CPU cycles are executed with waitstates or with full speed (6x standard clock, 10.638 MHz), and when you set the waitstates to off, the effect on the CPU speed was similar to what happens normally when you switch off the Antic DMA - about 22% speedup on average.

 

But I of course agree that 20% may not really be worth the effort.


That does bring up a point: what will happen when ANTIC is switched off, as some utilities do? Will the new design detect this and run at full speed during that period, or during the reduced display lists sometimes used to gain more processing time? 22 percent of course has to be worth it. Each increase adds to overall productivity. Making the Atari self-sufficient enough to do its own heavy lifting when you are developing programs for the Atari itself sounds satisfying to have.

Edited by _The Doctor__
27 minutes ago, drac030 said:

I do not know *how* exactly it is implemented, I just know that it works in Rapidus and that it also worked in that early turbo board I mentioned: it had a physical switch to control whether the internal CPU cycles are executed with waitstates or with full speed (6x standard clock, 10.638 MHz), and when you set the waitstates to off, the effect on the CPU speed was similar to what happens normally when you switch off the Antic DMA - about 22% speedup on average.

 

But I of course agree that 20% may not really be worth the effort.

It may be a matter of what the "fast" clock speed is.  The faster the "fast" clock speed is the more it makes sense to speed up the internal cycles.  3.58MHz may not be fast enough to make enough difference to make it worthwhile.
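
One rough way to see why the fast clock's speed matters is an Amdahl's-law style estimate. This is a minimal sketch, assuming that a fraction `internal_fraction` of all CPU cycles are internal cycles, that those cycles run `clock_ratio` times faster, and that there is no resynchronization cost. The 0.2 fraction used below is a made-up illustration value, not a measurement; a ratio of 2 would correspond to 3.58MHz against a 1.79MHz bus clock.

```python
def estimated_speedup(internal_fraction: float, clock_ratio: float) -> float:
    """Amdahl-style estimate: only the internal cycles get faster."""
    bus_part = 1.0 - internal_fraction
    internal_part = internal_fraction / clock_ratio
    return 1.0 / (bus_part + internal_part)


if __name__ == "__main__":
    hypothetical_fraction = 0.2  # assumption for illustration only
    for ratio in (2, 6, 8):
        speedup = estimated_speedup(hypothetical_fraction, ratio)
        print(f"fast clock = {ratio}x bus clock: about {speedup:.3f}x")
```

Whatever the ratio, this estimate can never exceed 1 / (1 - internal_fraction), so the gain from a faster internal clock flattens out quickly.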

 

 

23 minutes ago, _The Doctor__ said:

That does bring up a point: what will happen when ANTIC is switched off, as some utilities do? Will the new design detect this and run at full speed during that period, or during the reduced display lists sometimes used to gain more processing time? 22 percent of course has to be worth it. Each increase adds to overall productivity. Making the Atari self-sufficient enough to do its own heavy lifting when you are developing programs for the Atari itself sounds satisfying to have.

I think the answer to your question is that the CPU and ANTIC still have to share the same memory.  Right now I am thinking the best option may be to try to interleave ANTIC and the CPU like Woz did with the Apple II.  The W65C816 board will be set up to experiment with this idea.  However, when the CPU has to drop to low speed to access I/O devices and ANTIC, the interleaving will be temporarily suspended.  At least this is the theory.


Hi @reifsnyderb,

 

Maybe this can help you. I am not sure if you have read the thread on when Rapidus was first announced. There is some information on how things were implemented and may provide some guidance:

 

https://forums.atariage.com/topic/246802-rapidus-accelerator/?do=findComment&comment=3477271

 

11 minutes ago, scorpio_ny said:

Hi @reifsnyderb,

 

Maybe this can help you. I am not sure if you have read the thread on when Rapidus was first announced. There is some information on how things were implemented and may provide some guidance:

 

https://forums.atariage.com/topic/246802-rapidus-accelerator/?do=findComment&comment=3477271

 

Very interesting.  It's too bad there aren't any schematics...


I realized I had another way to do some nice benchmarking of internally accelerating the CPU by using the VPA and VDA lines.  So, I tried it and here are my results:

I used the AHL test as follows:

Note:  Each test was run twice.

OS:  OSR3
BASIC:  Atari Rev. C
  Normal Speed:  191 and 191 seconds
  CPU Internally Accelerated:  148 and 149 seconds
  Accelerated Speed:  128.6%

OS:  OSR6.4 (w/ variable speed Fast Math running at high speed)
BASIC:  Altirra BASIC 1.58
  Normal Speed:  36 and 36 seconds
  CPU Internally Accelerated:  29 and 28 seconds
  Accelerated Speed:  126.3%

The speed increase of both OS/BASIC combinations is comparable.  Now that I found a way to get a concrete answer, I have certainly changed my mind about this.  I'll have to do more testing so as to see if there are any other issues.  I figure it's better to figure this stuff out now, before I get a dedicated W65C816 board made.
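
The percentages above are consistent with dividing the mean normal run time by the mean accelerated run time. A small sketch of that arithmetic follows; the function name `speed_percent` is just for illustration, and the averaging is an assumption about how the figures were derived.

```python
def speed_percent(normal_runs, accelerated_runs):
    """Relative speed in percent: mean normal time / mean accelerated time."""
    normal = sum(normal_runs) / len(normal_runs)
    accelerated = sum(accelerated_runs) / len(accelerated_runs)
    return 100.0 * normal / accelerated


if __name__ == "__main__":
    # OSR3 with Atari BASIC Rev. C
    print(round(speed_percent([191, 191], [148, 149]), 1))  # 128.6
    # OSR6.4 with Altirra BASIC 1.58
    print(round(speed_percent([36, 36], [29, 28]), 1))      # 126.3
```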

 

On a negative note, I'll have to figure out why SIDE3 isn't working in this configuration.

Edit to add:  This may be more dramatic if the internal CPU speed were 14MHz.

Edited by reifsnyderb

Another thing occurred to me while I was reading the thread (apologies, I am not an electronics expert like yourself; I'm just conducting a thought experiment). It seemed that the Rapidus was using the PBI bus because it had its own ID number. This got me thinking. Would it make sense to split the project up? Goal 1: update your motherboard PCB design to easily accommodate the newer CPUs as full replacements at normal speeds. This change could be rolled in as the next-phase PCB revision. Goal 2: implement the acceleration portion as a 1090 16-bit accelerator card. Maybe this would make sense because:

1. It may be simpler to implement as a card than to create a whole new motherboard design (not sure if this is true)

2. It would be another selling point for acquiring the 1090.

3. If you get it to work, it could be available to a wider audience, since it could be used with a stock XL (XE?) computer with the 1090 expansion as well.

4 minutes ago, scorpio_ny said:

Another thing occurred to me while I was reading the thread (apologies, I am not an electronics expert like yourself; I'm just conducting a thought experiment). It seemed that the Rapidus was using the PBI bus because it had its own ID number. This got me thinking. Would it make sense to split the project up? Goal 1: update your motherboard PCB design to easily accommodate the newer CPUs as full replacements at normal speeds. This change could be rolled in as the next-phase PCB revision. Goal 2: implement the acceleration portion as a 1090 16-bit accelerator card. Maybe this would make sense because:

1. It may be simpler to implement as a card than to create a whole new motherboard design (not sure if this is true)

2. It would be another selling point for acquiring the 1090.

3. If you get it to work, it could be available to a wider audience, since it could be used with a stock XL (XE?) computer with the 1090 expansion as well.

I would love to do a 1090XL accelerator card.  I haven't figured out a way to do it, though.  A big issue is both the Atari computer and 1090XL were designed so that the computer's CPU can't be turned off externally and the I/O devices can't be accessed directly by cards on the 1090XL.

 

If I can figure out a way to do this, I'll get it done.

Edited by reifsnyderb
17 minutes ago, reifsnyderb said:

I would love to do a 1090XL accelerator card.  I haven't figured out a way to do it, though.  A big issue is both the Atari computer and 1090XL were designed so that the computer's CPU can't be turned off externally and the I/O devices can't be accessed directly by cards on the 1090XL.

 

If I can figure out a way to do this, I'll get it done.

Oh wow! That is very interesting! Thank you for the information!


The results of more testing of the internal CPU acceleration are mostly positive.  The 1091XL with an OS Extender card works.  MULE was also running for around half an hour and was stable.  One issue I noticed was some sort of possible PMG issue at the top of the MULE land development screen.  The Donkey Kong and PacMan cartridges appear fine.  The system appears stable.

Edited by reifsnyderb
3 hours ago, scorpio_ny said:

It seemed that the Rapidus was using the PBI bus because it had its own id number.

The Rapidus occupies PBI #0 just to get things running: the PBI devices are guaranteed to be executed/initialized during system startup, thus the Rapidus presents itself as a PBI device #0 to a) get the FPGA core loaded after power up, b) get the configuration menu available to the user, when he presses Inv/Reset.


I added a jumper to make it easier to test the board.  If the jumper is on, the system runs about 25% faster.  If it is off, it runs at normal speed.

 

The right blue jumper is the "turbo" jumper and the left blue jumper is for MapRAM.

 

[Image: jumperadded.thumb.jpg]

Here's the underside of the board.  It's quite busy with the addition of the wires for the CPU, MapRAM, and crystal.

[Image: underside.thumb.jpg]


@pancio

I discovered that VPA and VDA need hold-up resistors on them when used for internal CPU throttling.  Without these resistors, the computer doesn't always boot.  When this happens, the clock signal going to the processor is a mess.  I am theorizing that if VPA and VDA are floating while they are being used to throttle the CPU clock, some sort of feedback loop can happen, which results in system failure.  I put 4.7k ohm hold-up resistors on both pins and it fixed the problem.

