Cart port read write timing

Van · December 9, 2012

I have a question: does anyone know the timing for the cart port? I'm working on a hardware project, very similar to a bank switch controller, and trying to determine the setup timing for the data lines. The lack of an RD/WR strobe to sync things is bothering me.

For a RD, how much setup time do I have before the data needs to be stable for the 6507, and for a WR, how long after the start of the access before I can sample the data bus?

I don't think a read will be too much of a problem; quick logic should be able to stabilize long before the 6507 reads the bus. But I'm worried about when the data bus is stable before I sample it. Are there any hardware delay tricks used in Super-carts for the RAM WR timing? Cascaded gates or the like?

Edited December 9, 2012 by Van

+batari · December 14, 2012

Hello,

You have a lot of time for a read, roughly 500 ns after an address change is fine, maybe even 525 ns. Writes are a problem though as depending on the addressing mode, the write may occur on either the first or second cycle following an address change and the data valid window is small. The best way to deal with this, I think, is to capture the data bus before an address change (50-100ns is safe, maybe more). One way to do this is use an oscillator and capture values every 100ns or less and use the second to last value once the address changes. You can also time forward but you need two writes if the address stays the same for two cycles.
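A minimal Python sketch of the oversampling idea described above, offered as a model rather than a hardware design. It assumes the sampler sees (address, data) pairs at a fixed rate of roughly one every 100 ns; the function name `capture_writes` and the `lookback` parameter are hypothetical. It keeps the last two data samples and, when the address changes, reports the second-to-last one for the address that just ended.

```python
from collections import deque

def capture_writes(samples, lookback=2):
    """Backward-timed capture (illustrative model only).

    samples: iterable of (address, data) pairs taken at a fixed rate.
    Returns one (address, data) pair per detected address change, using
    the data sampled `lookback` samples before the change was noticed.
    Every bus cycle is reported; decoding which addresses are write
    ports is left to the caller.
    """
    history = deque(maxlen=lookback)  # most recent data samples for this address
    captured = []
    prev_addr = None
    for addr, data in samples:
        if prev_addr is not None and addr != prev_addr:
            # Address just changed: the oldest retained sample is the
            # "second to last" value when lookback is 2.
            if len(history) == lookback:
                captured.append((prev_addr, history[0]))
            history.clear()
        history.append(data)
        prev_addr = addr
    return captured
```

With a final sample that catches the bus mid-transition, the glitchy value is skipped and the stable one before it is used.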

For more info on timing, there was a document posted about the Graduate addon. I can't be bothered to find it now but it's out there on this forum.

Van · December 14, 2012

Thanks for the pointers. I didn't think reads would be too much of a problem. I'm designing with '595 and '165 shift registers, working on memory-mapped I/O ports. There is about 30-40 ns of propagation delay through the address decoding, so it should be good before the end of the read cycle; in fact, the '595's output should hold past the address bus by 10-15 ns.

+batari wrote: "Writes are a problem though as depending on the addressing mode, the write may occur on either the first or second cycle following an address change and the data valid window is small."

Yeah, this is where I thought there would be a problem. I really need to strobe the '165's /LD before the end of the write window, but after the data bus settles.

Which addressing modes pose the greatest problem? The nature of the kernel I'm planning demands strict rules for access to these I/O registers, so that my PIC can follow what the 6507 is doing. So I would like to avoid the worst case modes.

Thanks again, searching now for 'Graduate addon'

+batari · December 14, 2012

The absolute writes are easiest as you can time these forward without too much trouble. The indexed modes generally hold the address for two cycles and do the write on the second cycle. You don't know if you're encountering one of these until the second cycle. But, if the indexed modes wrap a page, then you will only need to worry about the single cycle again.

If you're trying to create hardware to play existing games, you have to handle both. Timing backward from an address change from a write address works well for all kinds of writes.

Timing forward is usually easier, though. If you are making new hardware and only need to handle your own programming, you could hypothetically just handle the single-cycle writes and make sure any indexed writes always wrap a page. Or, I suppose you could just handle the two-cycle writes, i.e. always use indexed writes that don't wrap a page.
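To make the single-cycle versus two-cycle distinction concrete, here is a small Python model of the bus cycles for STA absolute and STA absolute,X. It follows the documented NMOS 6502 behaviour (an indexed store does a dummy read at the address formed without the page carry, then writes at the corrected address). The function names and the example addresses are hypothetical, and only A0-A12 are modelled since that is all the cart port sees.

```python
ADDR_MASK = 0x1FFF  # the 6507 brings out only A0-A12

def sta_absolute(pc, target):
    """Bus cycles of STA abs as (address, 'R' or 'W') pairs."""
    return [
        (pc & ADDR_MASK, "R"),        # opcode fetch
        ((pc + 1) & ADDR_MASK, "R"),  # operand low byte
        ((pc + 2) & ADDR_MASK, "R"),  # operand high byte
        (target & ADDR_MASK, "W"),    # the write itself
    ]

def sta_absolute_x(pc, base, x):
    """Bus cycles of STA abs,X, including the dummy read."""
    # Address before the page carry is applied to the high byte.
    uncorrected = (base & 0xFF00) | ((base + x) & 0x00FF)
    return [
        (pc & ADDR_MASK, "R"),
        ((pc + 1) & ADDR_MASK, "R"),
        ((pc + 2) & ADDR_MASK, "R"),
        (uncorrected & ADDR_MASK, "R"),  # dummy read
        ((base + x) & ADDR_MASK, "W"),   # the write itself
    ]

def write_cycle_position(cycles):
    """1 if the write is on the first cycle at its address, 2 if on the second."""
    idx = next(i for i, (_, rw) in enumerate(cycles) if rw == "W")
    addr = cycles[idx][0]
    n = 1
    while idx - n >= 0 and cycles[idx - n][0] == addr:
        n += 1
    return n
```

For example, an indexed store that stays within the page holds the write address for two cycles, while one that wraps a page shows it for only one, matching the rule of thumb above.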

I did find the graduate document - it's been a while since I read it but someone did draw up a few timing diagrams that helped me:

http://www.atarimuseum.com/archives/pdf/videogames/2600/graduate_design_notes.pdf

Van · December 14, 2012

Thanks so much! I was up far too late last night reading the Graduate design notes, good info there. Also spent a lot of the last few hours reading the Chimera thread. I'm late to the party, but hate to see it die :-( At least you have carried the Harmony forward from those ashes, WELL DONE!! They'll have to pry my HCart from my cold hands, LOVE it.

To clarify, my project's objective is a MIDI interface with a co-processor (PIC) doing the MIDI Rx and note processing, not necessarily in a cart footprint. The PIC is linked to the 6507 through the cart interface via some (8 max) memory-mapped registers (drop boxes). I was thinking 4 Rd and 4 Wr, but just Rd access could be OK too for the proof-of-concept. I can see 'complex' issues for Wr functions that are hard to justify for this specialized hardware design. The A2600 would be running a basic 4K kernel on an EEPROM that just keeps loading TIA sound params from the MM registers; in fact, I'm not even sure about using video. Without the overhead of refreshing, the kernel would free up CPU cycles for a higher sample rate, and could even approach a crude Covox function.

I've been working on MIDIBox projects for a while, so my first thought was to leverage the code base from those projects for the heavy lifting of the serial port and MIDI protocol. Their HW model relies on a 'SPI-like' bus and chains of shift registers to interact with UI, SIDs, POKEYs, etc. (too many projects, too little time). I think this approach could work here also, as long as the kernel's TIA update cycle allows the PIC to reload the serial registers between the reads. Access is asynchronous on both sides. Of course, if the PIC didn't have a 'new' update, the worst case is that the TIA would be updated with the same value it had at the last update. With a custom kernel, access on both sides can be synchronized to satisfy both CPUs. Fingers Xed.

I've been working up a prototype schematic that seems close to complete, but 'I don't know what I don't know'! Will be breadboarding soon.

Thanks again.

supercat · December 18, 2012

A useful trick with writes, which I invented in 2006 I think, is to use a "bus keeper" circuit which weakly pulls pins on the data bus high if they're high, and low if they're low. On the first half of each cycle, read a byte from RAM at the appropriate address and output it briefly on the bus (whereupon the bus-keeper circuit will effectively latch it). Just before the end of the second half, copy the value on the bus. If the 6507 performed a read, the data will still be on the bus. If it performed a write, the new data will be on the bus.
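The bus-keeper sequence can be sketched as a tiny Python model. This is only a behavioural illustration under simple assumptions: the class name `BusKeeperRam` is hypothetical, each call stands for one full bus cycle, and a CPU write is modelled as the 6507 overriding the weakly held value during the second half of the cycle.

```python
class BusKeeperRam:
    """Behavioural model of RAM behind a bus keeper, with no R/W line."""

    def __init__(self, size=256):
        self.ram = bytearray(size)
        self.bus = 0x00  # value currently held on the data bus by the keeper

    def cycle(self, address, cpu_write_data=None):
        """Run one bus cycle; cpu_write_data is None when the 6507 reads."""
        # First half: briefly drive the stored byte; the keeper latches it.
        self.bus = self.ram[address]
        # Second half: on a write the 6507 drives the bus and overrides the
        # weak keeper. On a read nothing else drives, so the value stays.
        if cpu_write_data is not None:
            self.bus = cpu_write_data & 0xFF
        # Just before the end of the cycle: copy the bus back into RAM.
        self.ram[address] = self.bus
        return self.bus  # what the 6507 would see on a read
```

The same three steps run on every cycle, so the hardware never needs to know whether the access was a read or a write: a read rewrites the old byte, and a write stores the new one.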

Van · December 18, 2012

Very interesting idea Supercat. Reminds me of the 'bus stuffing' technique, with the Graduate I think. Does your design include a uController directly connected to the cart bus? I can't think of how to pull it off with MSI logic alone.

My design includes a PIC co-processor, but it's loosely linked with a SPI bus; data transfers are very asynchronous. The A2600 interface looks like a very small SPI memory to the uC. So the cart bus interface is simple in that it's just decoder logic and shift registers at this point. I can post a schematic, but it's still very much a WIP. After I test the basic hardware, I'll be moving it to a CPLD.

Maybe someday down the road, a modified Melody cart could be the interface. There is a SPI interface already included in the PCB layout; it just needs a software change. Not a lot different than a memory mapper.

supercat · December 19, 2012

Van wrote: "Very interesting idea Supercat. Reminds me of the 'bus stuffing' technique, with the Graduate I think. Does your design include a uController directly connected to the cart bus? I can't think of how to pull it off with MSI logic alone."

That design used a Xilinx XC9536XL CPLD with 36 macrocells, clocked at 14.31818 MHz (12 clock cycles per bus cycle), which was accurate enough that the timing would still be correct at the end of a 75-cycle fetch following an STA WSYNC. I didn't watch for changes on all the address bits (not enough logic to do that). Instead, I resynched the clock every time I saw a change on A0. I'm pretty happy with the bank scheme I implemented back then, even though nowadays one could do better with an ARM.
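As a rough check on the clock ratio, the arithmetic can be written out in Python. This assumes an NTSC console, where the 6507 runs at one third of the 3.579545 MHz colorburst frequency, and a CPLD oscillator at four times colorburst. The one-tick slip allowance in `max_clock_error` is an illustrative assumption, not a figure from the design.

```python
NTSC_COLORBURST_HZ = 3_579_545

# Assumed clocks: CPLD at 4x colorburst, 6507 bus at colorburst / 3.
cpld_clock_hz = 4 * NTSC_COLORBURST_HZ
bus_clock_hz = NTSC_COLORBURST_HZ / 3
ticks_per_bus_cycle = cpld_clock_hz / bus_clock_hz  # CPLD ticks per 6507 cycle

def max_clock_error(free_run_cycles, allowed_slip_ticks=1.0, ticks_per_cycle=12):
    """Largest relative frequency error that keeps the free-running
    counter within `allowed_slip_ticks` after `free_run_cycles` bus
    cycles without a resync (illustrative tolerance only)."""
    return allowed_slip_ticks / (free_run_cycles * ticks_per_cycle)

print(f"CPLD clock: {cpld_clock_hz / 1e6:.5f} MHz")
print(f"Ticks per bus cycle: {ticks_per_bus_cycle:.3f}")
print(f"Error budget over 75 cycles: {max_clock_error(75) * 100:.3f} %")
```

Under these assumptions the counter may drift by about 0.1 % before it slips a full tick over 75 unsynchronized cycles, which gives a feel for why resyncing on A0 changes is enough most of the time.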

Van · December 19, 2012

An XC9536XL, your Kung Fu is greater than mine!

I am targeting an XC9572 now; I have some on hand. But if things work, I would have to think about the XLs and the voltage issues. Did you have any problems driving the bus with the 3.3V output levels?

Yeah, an ARM is the way of the future. Love my Harmony cart, but it just seems 'wrong' that the cart has something like 20x the MIPS of the host.

supercat · December 20, 2012

Van wrote: "I am targeting an XC9572 now; I have some on hand."

While building a cartridge using a CPLD plus discrete RAM and ROM is an interesting challenge, I don't think it's really viable as a production-worthy approach. Using an ARM is just so much cheaper and easier.

That having been said, my banking scheme was designed around the goal of using macrocells as efficiently as possible. It had four bankable areas:

$1000-$17FF -- Bankable to any 2K chunk of RAM or ROM

$1800-$1DFF -- Bankable to the first 1.5K of any 2K chunk of RAM or ROM

$1E00-$1EFF -- Bankable to any 256-byte chunk of RAM or ROM

$1F00-$1FFF -- Fixed at last bank of ROM, but accesses will copy bits 3-6 of the address to bits 8-11 of the address for the $1E00-$1EFF bank.

The "interesting" behavior of the last bank was designed around high-res graphics. In particular, if $1F00-$1F5F was filled with $01,$02,$04,$08,$10,$20,$40,$80 and $1F80-$1FDF was filled with the complements of those, then a "pixel set" for coordinates in X and Y registers would be:

 lda $1F00,x
 ora $1E00,y
 sta $1E00,y

and a "pixel clear" would be:

 lda $1F80,x
 and $1E00,y
 sta $1E00,y
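A small Python model may help show why those three-instruction sequences work. It is a sketch under assumptions: the class `HiResBank` and its method names are hypothetical, the $1Exx window is modelled as sixteen 256-byte pages in one flat buffer, and any higher bank bits are ignored. Reading $1F00+x returns the bit mask for the pixel and, as a side effect, copies address bits 3-6 into the page select for $1E00-$1EFF.

```python
# Table contents as described: $01,$02,...,$80 repeating, and complements.
SET_MASKS = [1 << (i & 7) for i in range(0x60)]    # $1F00-$1F5F
CLEAR_MASKS = [m ^ 0xFF for m in SET_MASKS]        # $1F80-$1FDF

class HiResBank:
    def __init__(self):
        self.ram = bytearray(16 * 256)  # 16 selectable pages (assumed layout)
        self.page = 0                   # bits 8-11 of the $1Exx bank address

    def _touch_1f(self, offset):
        """Model the side effect of any access to $1F00+offset."""
        self.page = (offset >> 3) & 0x0F  # address bits 3-6 -> bank bits 8-11

    def pixel_set(self, x, y):
        self._touch_1f(x)                 # lda $1F00,x
        a = SET_MASKS[x]
        a |= self.ram[self.page * 256 + y]    # ora $1E00,y
        self.ram[self.page * 256 + y] = a     # sta $1E00,y

    def pixel_clear(self, x, y):
        self._touch_1f(0x80 + x)          # lda $1F80,x
        a = CLEAR_MASKS[x]
        a &= self.ram[self.page * 256 + y]    # and $1E00,y
        self.ram[self.page * 256 + y] = a     # sta $1E00,y
```

In this model X picks both the byte column (X divided by 8) and the bit within it (X modulo 8), while Y picks the row, so no shifting or table-indexed page arithmetic is needed on the 6507 side.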

I also discovered that the same hardware worked very well for my game Ruby Runner; I defined the codes for various object types in such a way as to allow my board-scanning loop to be something like:

; Running from 6507 address $1E00
 lax (scanPtr),y
 nop $1F00,x ; bank-switch
 ... start processing code for that type of object

Rather than have to use a computed jump table to dispatch code for each type of object, I simply used the $1Fxx area to select one of 16 pieces of code to execute.

If I were designing a CPLD-based banking system today, I think I would specify that code should only be run from addresses $1800-$1FFF or $0880-$08FF. If code is only executed from those addresses (and in particular, not from $1000-$17FF), then the normally-inaccessible address bits A13-A15 for any access to $1000-$17FF will have been on the data bus D5-D7 during the last access that wasn't in that range. Thus, one could fairly easily set up a scheme in which $1000-$17FF was one bankable "data" area, $3000-$37FF was another, $5000-$57FF was another and $7000-$77FF was a fourth. It would be especially nice if one could have the start address of each such area be specified with resolution of a page rather than a 2K block, but I don't think that would be feasible in an XC9572; you'd probably have to go up at least one more size for that.
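The address-recovery idea can be illustrated with a short Python sketch. It assumes an absolute-mode operand, where the operand's high byte is the last thing fetched before the data access, so its top three bits are what the cart last saw on D5-D7. The function names are hypothetical.

```python
def hidden_address_bits(last_bus_byte):
    """A13-A15 as seen on D5-D7 during the previous (non-data) access."""
    return (last_bus_byte >> 5) & 0x07

def full_address(last_bus_byte, addr13):
    """Combine the recovered A13-A15 with the 13 address bits the 6507 exposes."""
    return (hidden_address_bits(last_bus_byte) << 13) | (addr13 & 0x1FFF)

def data_area(last_bus_byte, addr13):
    """Return 0-3 for the $1000/$3000/$5000/$7000 data areas, else None."""
    addr13 &= 0x1FFF
    if not 0x1000 <= addr13 <= 0x17FF:
        return None  # not an access to the bankable data window
    area = hidden_address_bits(last_bus_byte)
    return area if area < 4 else None
```

For instance, LDA $3000 assembles to the bytes AD 00 30. The cart sees $30 on the data bus and then an access to $1000 on A0-A12, from which the full $3000 address and data area 1 follow.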
