4A50 cart--hardware details
I haven't gone into much detail elsewhere about the 4A50 cart and some of the techniques it uses, but since people may find it interesting I'll discuss it here.

The heart of the cartridge is a Xilinx CPLD. This device has 36 macrocells connected to 32 I/O pins. While it's a step up from the 22V10 used in Al's bankswitch carts, it's still very cheap as such devices go.

Another key to the cartridge is a 14.31818MHz oscillator. Although many RAM-plus carts get by with some simple RC circuitry for timing, using a crystal oscillator makes it possible for the CPLD to "know" the cycle phase of the Atari's processor. This is essential for the "magic RAM write trick" discussed below.

Finally, the cartridge contains 64KB of flash (treated as ROM) and 32KB of SRAM. Both of these are simple, common, ordinary chips.

In addition to supporting 4A50 bankswitching, I wanted the cart to be able to support other forms by reprogramming the CPLD. The 32 I/O pins thus break down as:
- AD0-AD6 -- Tied to A0-A6 on the 2600 and A0-A6 of the RAM and ROM
- AD7-AD10 -- Tied to A7-A10 on the 2600 and, via resistors, to AQ7-AQ10
- AD11-AD12 -- Tied to A11-A12 on the 2600
- AQ7-AQ10 -- Tied to A7-A10 on the RAM and ROM and, via the aforementioned resistors, to AD7-AD10
- AQ11-AQ14 -- Tied to A11-A14 on the RAM and ROM
- RamRW -- Tied to the R/W pin on the RAM, and A15 on the ROM
- RAMCS -- Chip-select of the RAM
- ROMCS -- Chip-select of the ROM (flash)
- D0-D7 -- Tied to the data bus; used as inputs and for the 'bus-keeper' function
- XtalIn -- Input from the 14.31818MHz crystal
- spares -- Used for debugging; may also be usable for adding an EEPROM, LED, or other feature.

The CPLD is thus capable of mapping any address to any 128-byte block of RAM or ROM, but when it's plugged into the 2600 it does not have control of the lower address bits even though it can see them.

The use of resistors on A7-A10 provides a couple of benefits:
- It allows the chip to control A7 if needed (as in Superchip games) but does not waste a macrocell if such ability is not needed.
- In bank-switching schemes, like 4A50, where the output address should either be taken from a particular set of latches or from the input address, the resistors provide an "almost-free" multiplexor. When the AQ outputs are enabled, the address pins will be driven by their corresponding latches; otherwise they'll be driven by the input address.
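
As a rough illustration of the "almost-free" multiplexer, here is a small behavioral model in Python. It is only a sketch: the names `address_pin` and `memory_address` are made up for this example, `latches` is assumed to hold the latched values for bits 7-10 in their natural bit positions, and the analog side (resistor values, drive strength) is ignored entirely.

```python
def address_pin(input_bit: int, latch_bit: int, aq_enabled: bool) -> int:
    """Level seen by the RAM/ROM on one of A7-A10.

    With the AQ output enabled, the CPLD's driver wins over the resistor and
    the memory sees the latch. With it disabled (tri-stated), the resistor
    passes the 2600's address bit through.
    """
    return latch_bit if aq_enabled else input_bit


def memory_address(cpu_addr: int, latches: int, aq_enabled: bool) -> int:
    """Build bits 0-10 of the memory address for one access."""
    low = cpu_addr & 0x7F  # A0-A6 always come straight from the 2600
    high = 0
    for bit in range(7, 11):  # A7-A10 go through the resistor "mux"
        in_bit = (cpu_addr >> bit) & 1
        latch_bit = (latches >> bit) & 1
        high |= address_pin(in_bit, latch_bit, aq_enabled) << bit
    return high | low
```

With the outputs disabled, `memory_address(0x07FF, 0x0000, False)` passes the 2600's address through unchanged; with them enabled, the same call with `True` replaces bits 7-10 with the (all-zero) latches while A0-A6 still come from the 2600.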
The first prototype only had the A7 resistor on-board (A8-A10 resistors were soldered on later). The next batch of prototypes will include them on-board.

Although the 4A50's ability to bank-switch much more memory than earlier designs is simply a consequence of the larger memory chips used, there are a few aspects of its design that are unique. These include "magic RAM writes", "hotspot MSB discrimination", and "memory-mode presets".

Magic RAM writes

To understand magic RAM writes, one must first understand how existing RAM cartridges work. Because the 2600 does not provide a read/write signal out to the cartridge port, cartridges have no inherent way of knowing whether a particular cycle is a read or a write. What most RAM cartridges do is allocate two ranges of addresses for the RAM: one for reads and one for writes. A read access to the read range will read the corresponding memory location; a write access to the write range will write it. Write accesses to the read range will produce bus contention (bad), and read accesses to the write range will cause garbage data to be read and written (and the data read may not match the data written).

In something like the Superchip with 128 bytes of RAM, doubling the address space from 128 bytes to 256 is no big deal. There are still 3840 bytes of address space left for the cartridge. With larger RAMs, however, things become problematic. In 3E bankswitching, RAM banks are limited to 1K because 1K of RAM uses up 2K of address space.

The magic RAM write trick eliminates the requirement for a separate RAM write range. It does this by taking advantage of a few observations:
- Modern RAM chips can be read in less than half a cycle.
- The Xilinx CPLD features a "bus hold" function that will weakly try to hold the data bus high when it's high and low when it's low.
- Neither the 6507 nor any of the other chips on the data bus have any trouble overpowering the CPLD's bus-hold circuitry when they "want" to, but the bus-hold circuitry can keep the bus state stable when nothing is deliberately driving it.
- Writing a RAM address with the data it already held is harmless.
- The 6507 only drives the data bus during phi2 when it wants to perform a write cycle.
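
Taken together, these observations suggest a cycle in which the RAM is read first and then unconditionally written with whatever ends up on the data bus. The Python below is a toy model of that idea, not the CPLD logic itself: `magic_ram_cycle` and its parameters are names invented for this sketch, the bus-hold circuitry is reduced to a plain variable, and all timing is collapsed into three sequential steps.

```python
def magic_ram_cycle(ram: bytearray, addr: int, cpu_write=None) -> int:
    """Model one 6507 cycle that addresses cart RAM.

    ram        -- bytearray standing in for the SRAM
    addr       -- RAM address selected for this cycle
    cpu_write  -- byte the 6507 drives during phi2, or None for a read cycle
    Returns the byte left on the data bus at the end of the cycle.
    """
    # phi1: the RAM is read; bus-hold keeps this value on the bus afterwards.
    bus = ram[addr]

    # phi2: /WE is already asserted. On a write cycle the 6507 overpowers the
    # bus-hold circuitry; on a read cycle nothing drives the bus, so it stays.
    if cpu_write is not None:
        bus = cpu_write & 0xFF

    # Chip-select released: whatever is on the bus gets written into the RAM.
    ram[addr] = bus
    return bus
```

A read cycle (`cpu_write=None`) writes the old value back, which is harmless; a write cycle stores the processor's data. Neither case needs the cartridge to know in advance which kind of cycle it is.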
Using all of these facts together, the 4A50 cart RAM cycle performs a read after the address is stable, then--while keeping the chip selected--hits the /WE line shortly before the 6507 drives phi2. The RAM chip-select is then released shortly before the 6507 releases phi2. If the 6507 is performing a read cycle, it will read the data that was put on the bus during phi1 and held there by the CPLD. That data will then get written back to the RAM when the chip-select is released. If the 6507 is performing a write cycle, the data that was read from the RAM will be overwritten by the processor's data, and that will get written into RAM when chip-select is released.

This technique produces nice waveforms on the data bus with the Heavy Sixer and the 2600jr. The 7800 seems to have pullups on the data bus which make things somewhat marginal, but testing instructions with many consecutive bus float states (e.g. "LDA (0,X)") suggests that things should be stable there as well.

Note that the magic RAM write trick requires that the cartridge know when phi2 is going to start/end. This can be inferred by counting 14.31818MHz clocks following a change of A0. This will work nicely for NTSC machines, but PAL machines will require the use of a different oscillator. Otherwise, code executing in RAM that performs an STA WSYNC that idles the CPU for 75 cycles will likely fail.

Hotspot discrimination

On most 6502-based systems, reading a random RAM address is generally pretty harmless. On a 2600 running a 4K cart, the only read operation that will have any sort of side-effect is a read of the RIOT timer (which clears the interrupt latch). Unless software happens to use this latch (the interrupts themselves are not used) even that read will be harmless.
Thus, it is safe to do things like use the "BIT abs" opcode to harmlessly skip over a two-byte instruction without worrying about what instruction is being skipped.

On typical bank-switch carts, there are a few more addresses which can cause trouble, but there still aren't a whole lot. Superchip carts add bigger 'problematic' address ranges (reading the "write addresses" of RAM is bad), and the Supercharger has an even bigger one. Any accidental access of the form $N0xx (N being odd; xx being anything) can spell disaster. So trying to skip over something like "ORA #$10" will trigger an access to $1009. Oops.

To alleviate this problem somewhat, the 4A50 cartridges will ignore any access to a bank-switch hotspot unless the previous byte fetched was $6X or $7X. Thus, the hotspot at $68A9 will be triggered by "BIT $68A9" but not by "BIT $08A9" or by a BIT instruction that's used to skip over an "LDA #$08" instruction (which would access address $08A9). Although a programmer could still get tripped up by trying to skip over e.g. an "LDA #$68" instruction, operands of the form $6x-$7x seem like they'd be less common than many other values.

Memory Mode Presets

One thing that's often desirable in a bank-switching scheme is to allow the programmer to switch among many banks efficiently. Unfortunately, conventional techniques for allowing this require bank-switching hardware to include registers to hold the different bank selections among which the program will switch. In a part with 36 macrocells, that approach simply isn't going to fly; even in a part with 72 macrocells, it can only go so far.

What the 4A50 bank-switch method does to alleviate this is to use RAM instead. RAM addresses from $E8 to $FF are reserved for magic bank-switch hotspots.
Reading or writing one of those will cause the value read or written to be loaded into one of the CPLD's bank-switch registers; consequently, it will "feel" as though the CPLD has twelve more bank-switch registers than it really does.

Note that accesses from $01E8-$01FF will not trigger these hotspots. Although it is important that any of those hotspots that is actually used not overlap the stack, code which doesn't need all twelve of the hotspots may place the stack on top of the ones it doesn't use.
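
The two address checks described in the last two sections can be sketched in Python as follows. This is an illustration under assumptions, not the CPLD equations: the function names are invented here, the "$6X or $7X" test is modeled as a mask on the top three bits of the previously fetched byte, and the preset check assumes the cart simply compares the 13 address bits it can see against $00E8-$00FF (how other mirrors are decoded is not covered by the text above).

```python
def hotspot_armed(prev_byte: int) -> bool:
    """True if the previously fetched byte was $6x or $7x.

    Both ranges share the top three bits 011, so one mask covers them.
    """
    return (prev_byte & 0xE0) == 0x60


def is_preset_address(addr: int) -> bool:
    """True for $00E8-$00FF; false for the stack-page copy at $01E8-$01FF.

    Assumption: only the 13 address lines visible on the cartridge port matter.
    """
    return 0x00E8 <= (addr & 0x1FFF) <= 0x00FF
```

For "BIT $68A9" the bytes fetched are $2C, $A9, $68, so the byte seen just before the access is $68 and `hotspot_armed(0x68)` is true. When BIT is used to skip over "LDA #$08", the last byte fetched is $08 and the access is ignored.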