
Part 2 - CDFJ Details


SpiceWare


CDFJ is built around data streams.  A data stream is a sequence of data elements made available over time - basically a list of values such as:

  • 10
  • 55
  • 20
  • 25
  • ...

 

The data stream will auto-advance so that the first time you read it you'd get 10, the next time you'd get 55, then 20 and so on. Data streams are very helpful during the kernel as you can update any TIA register in just 5 cycles:

LDA #DS0DATA
STA GRP0
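
To make the auto-advance idea concrete, here is a rough C model of a single data stream. It is only an illustration: the `Stream` type, `stream_read()` and `example_values` are made-up names for this sketch, not part of the CDFJ driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one data stream: a position in a buffer
   that moves forward by one element on every read. */
typedef struct {
    const uint8_t *data;   /* values the stream will hand out */
    size_t length;         /* number of values in data */
    size_t position;       /* index of the next value to return */
} Stream;

/* Return the current value, then auto-advance to the next one. */
uint8_t stream_read(Stream *s)
{
    assert(s->position < s->length);   /* model only: no wrap-around */
    return s->data[s->position++];
}

/* The example list from the text above. */
const uint8_t example_values[] = { 10, 55, 20, 25 };
```

Reading a `Stream` set up over `example_values` four times hands back 10, 55, 20 and 25 in that order, which is what each `LDA #DS0DATA` does from the 6507's point of view.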

 

 

General Purpose Data Streams

 

There are 32 general purpose data streams named DS0DATA thru DS31DATA. Each data stream has an increment value associated with it for the auto-advance feature. Example increments:

  • 1.0 for 1LK player graphics
  • 0.20 to repeat chunky playfield graphics over 5 scanlines
  • 2.0 to skip every other value. This is extremely useful for interlaced bitmap graphics, which are typically seen as 96 or 128 pixels across.
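
A fractional increment can be pictured as a fixed-point position where only the integer part picks the value. The sketch below assumes an 8.8 fixed-point layout (256 means 1.0) purely for illustration; the real driver's internal format may differ, and `FracStream` and `frac_stream_read()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_ONE 256u   /* assumed format for this model: 256 == 1.0 */

typedef struct {
    const uint8_t *data;
    size_t length;
    uint32_t position;    /* fixed point, integer part is the index */
    uint32_t increment;   /* fixed point, added after every read */
} FracStream;

/* Return the value under the integer part of the position,
   then advance the position by the stream's increment. */
uint8_t frac_stream_read(FracStream *s)
{
    size_t index = s->position >> 8;   /* drop the 8 fraction bits */
    assert(index < s->length);
    s->position += s->increment;
    return s->data[index];
}
```

With this model an increment of `FIXED_ONE` returns each value once, `FIXED_ONE / 2` returns each value twice, `2 * FIXED_ONE` skips every other value, and 0 keeps returning the same value. Note that a value like 0.20 can only be approximated in binary fixed point.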

 

 

Communication Data Stream

 

There is a dedicated communication data stream named DSCOMM used for transferring data between the 6507 and ARM processors.

 

 

Jump Data Streams

 

There are 2 data streams for jumps named DSJMP1 and DSJMP2.  These override JMP $0000 and JMP $0001 respectively, providing 3 cycle flow control within the kernel. This means that instead of counting scanlines and branching, your kernel would look something like this one from Draconian:

; data stream usage for game screen
DS_GRP0             = DS0DATA
DS_GRP1             = DS1DATA
DS_HMP0             = DS2DATA
DS_HMP1             = DS3DATA
DS_MISSILE0         = DS4DATA   ; HMM0 and ENAM0
DS_MISSILE1         = DS5DATA   ; HMM1 and ENAM1
DS_BALL             = DS6DATA   ; HMBL and ENABL
DS_COLOR            = DS7DATA   ; color change for players and ball only
DS_SIZE             = DS8DATA   ; size change for all objects

NormalKernel:               ;   20
        lda #DS_SIZE        ; 2 22 <- just to keep stream in sync
nk1:    lda #DS_COLOR       ; 2 24 2 33 from Resm0Strobe28 <- just to keep stream in sync
        lda #DS_HMP0        ; 2 26
        sta HMP0            ; 3 29
        lda #DS_HMP1        ; 2 31
        sta HMP1            ; 3 34
        lda #DS_MISSILE0    ; 2 36
        tax                 ; 2 38
        stx HMM0            ; 3 41
        lda #DS_MISSILE1    ; 2 43
        tay                 ; 2 45
        sty HMM1            ; 3 48
        lda #DS_BALL        ; 2 50
        sta HMBL            ; 3 53
        sta ENABL           ; 3 56
        lda #DS_GRP0        ; 2 58
        sta GRP0            ; 3 61
        lda #DS_GRP1        ; 2 63
        sta WSYNC           ; 3 66/0
        sta HMOVE           ; 3  3
        sta GRP1            ; 3  6 <- also updates GRP0 and BL
        DIGITAL_AUDIO       ; 5 11
        stx ENAM0           ; 3 14
        sty ENAM1           ; 3 17
        jmp FASTJMP1        ; 3 20
        
ExitKernel:
        ...
        
Resm0Strobe23:                  ;   20
        sta RESM0               ; 3 23
        lda #DS_SIZE            ; 2 25
        sta NUSIZ0              ; 3 28  <- changes missile size
        jmp nk1                 ; 3 31
        
Resm0Strobe28: 
        ...

 

The data stream DSJMP1 is initially filled with addresses for NormalKernel, and ends with the address for ExitKernel.  The player, missile and ball reuse routines will change individual values in DSJMP1 to jump to reposition kernels such as Resm0Strobe23. 
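
On the C side, preparing such a jump stream could look roughly like the sketch below. The addresses, `KERNEL_LINES`, and both function names are placeholders invented for this example; in a real project the addresses would come from the assembler and the buffer would live in Display Data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KERNEL_LINES 8   /* made-up height, real kernels are far taller */

/* Placeholder 6507 addresses; the real ones come from the assembler. */
enum {
    ADDR_NORMAL_KERNEL   = 0x1100,
    ADDR_RESM0_STROBE_23 = 0x1180,
    ADDR_EXIT_KERNEL     = 0x1200
};

/* One jump target per scanline plus the final exit entry. */
uint16_t jump_stream[KERNEL_LINES + 1];

/* Default state: every line runs NormalKernel, the last entry exits. */
void fill_jump_stream(void)
{
    for (size_t line = 0; line < KERNEL_LINES; line++)
        jump_stream[line] = ADDR_NORMAL_KERNEL;
    jump_stream[KERNEL_LINES] = ADDR_EXIT_KERNEL;
}

/* Reuse routine: make one scanline jump to a reposition kernel instead. */
void patch_jump(size_t line, uint16_t target)
{
    assert(line < KERNEL_LINES);   /* never overwrite the exit entry */
    jump_stream[line] = target;
}
```

Calling `fill_jump_stream()` once per frame and then `patch_jump(line, ADDR_RESM0_STROBE_23)` for each line that needs a reposition keeps the 6507 kernel free of any scanline counting.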

 

 

Audio Data Stream

 

Lastly there's an audio data stream named AMPLITUDE. It will return a stream of data to play back a digital sample, or to play back 3 voice music with custom waveforms. The macro DIGITAL_AUDIO in the above Draconian kernel is defined as:

    MAC DIGITAL_AUDIO
        lda #AMPLITUDE
        sta AUDV0
    ENDM

 

 

6507 Interface

 

From the Atari's point of view CDFJ only has 4 registers defined in the cartridge space.

  • DSWRITE at $1FF0
  • DSPTR at $1FF1
  • SETMODE at $1FF2
  • CALLFN at $1FF3

 

DSPTR is used to set the Display Data address for the DSCOMM data stream - basically setting the RAM location the 6507 code wishes to write to.  Write the low byte of the address first, then the high byte.

 

DSWRITE writes to the address set by DSPTR.  After writing, DSPTR advances to the next RAM location in preparation for the next write:

; define storage in Display Data
_DS_TO_ARM:        
_SWCHA:     ds 1        ; controller state to ARM code
_SWCHB:     ds 1        ; console switches state to ARM code
_INPT4:     ds 1        ; firebutton state to ARM code
_INPT5:     ds 1        ; firebutton state to ARM code

...

        ldx #<_DS_TO_ARM
        stx DSPTR
        ldx #>_DS_TO_ARM
        stx DSPTR
        ldx SWCHA           ; read state of both joysticks
        stx DSWRITE         ; written to _SWCHA 
        ldx SWCHB           ; read state of console switches
        stx DSWRITE         ; written to _SWCHB 
        ldx INPT4           ; read state of left joystick firebutton
        stx DSWRITE         ; written to _INPT4 
        ldx INPT5           ; read state of right joystick firebutton
        stx DSWRITE         ; written to _INPT5 
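
For readers who think in C, the DSPTR/DSWRITE behaviour described above can be modelled roughly like this. It is a sketch of the documented behaviour only: `CommModel`, the RAM size, and the `have_low_byte` flag are assumptions made for the model and say nothing about how the driver tracks this internally.

```c
#include <assert.h>
#include <stdint.h>

#define DISPLAY_DATA_SIZE 4096u   /* assumed size for this model only */

typedef struct {
    uint8_t  ram[DISPLAY_DATA_SIZE];
    uint16_t pointer;        /* address the next DSWRITE will hit */
    int      have_low_byte;  /* model-only flag: low byte already written? */
} CommModel;

/* DSPTR: first write supplies the low byte, second write the high byte. */
void model_write_dsptr(CommModel *m, uint8_t value)
{
    if (!m->have_low_byte) {
        m->pointer = value;
        m->have_low_byte = 1;
    } else {
        m->pointer |= (uint16_t)(value << 8);
        m->have_low_byte = 0;
    }
}

/* DSWRITE: store the byte, then advance to the next RAM location. */
void model_write_dswrite(CommModel *m, uint8_t value)
{
    assert(m->pointer < DISPLAY_DATA_SIZE);
    m->ram[m->pointer] = value;
    m->pointer++;
}
```

Two `model_write_dsptr()` calls followed by four `model_write_dswrite()` calls mirror the assembly above: the four bytes land in consecutive locations starting at the address that was set.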

 

SETMODE controls Fast Fetch Mode and Audio Mode.  Fast Fetch mode overrides the LDA #immediate mode instruction and must be turned on to read from the data streams.  Audio Mode selects between digital sample mode or 3-voice music mode.

 

CALLFN is used to call the function main() in your C code. The value written to CALLFN determines if an interrupt will run to periodically update AUDV0. The interrupt is needed when playing back digital samples or 3-voice music.

        ldy #$FE    ; generate interrupt to update AUDV0 while running ARM code
        sty CALLFN

        ldy #$FF    ; do not update AUDV0
        sty CALLFN

 

 

C Interface
 

From the C code a number of functions have been defined to interact with CDFJ and Display Data:

  • setPointer()
  • setPointerFrac()
  • setIncrement()
  • setWaveform()
  • setSamplePtr()
  • setNote()
  • resetWave()
  • getWavePtr()
  • getPitch()
  • getRandom32()
  • myMemset()
  • myMemcpy()
  • myMemsetInt()
  • myMemcpyInt()

 

 This section will be expanded upon later.


Hi Darrell, thanks for the explanation on data streams.

Question: what is the role of the DSCOMM data stream when moving data between 6507 and ARM, in context with DSPTR and DSWRITE?

 

Another question: for my first CDFJ project I've been pushing the limits of the ARM chip until it wasn't able to finish before the end of the VBLANK. I had to find out on real hardware, as I understand Stella doesn't take the ARM cycles into account.

While I fixed the problem by optimizing my C code, I find it hard to figure out how many 6507 cycles my ARM code is actually burning.

What do you normally do in a case like this?

 

Thanks,

Dion


The regular data streams are 0-31, DSCOMM is 32.  The addresses for regular data streams can only be set by the C code. The address for DSCOMM is set by 6507 code using DSPTR, though could also be set by C code. DSWRITE only writes to DSCOMM.

 

If Display Data is set like this:

_DS_TO_ARM:        
_SWCHA:     ds 1        ; controller state to ARM code
_INPT4:     ds 1        ; firebutton state to ARM code

_DS_FROM_ARM:
_COLUBK:    ds 1        ; background color from ARM code

 

Then the 6507 can send the values to the ARM, call the ARM code which calculates background color based on the joystick, then get the calculated background color by doing this:

        ldx #<_DS_TO_ARM
        stx DSPTR
        ldx #>_DS_TO_ARM
        stx DSPTR
        
        ldx SWCHA           ; read state of both joysticks
        stx DSWRITE         ; written to _SWCHA 
        ldx INPT4           ; read state of left joystick firebutton
        stx DSWRITE         ; written to _INPT4 
        ldx #$FF            ; flag to Run ARM code w/out digital audio interrupts
        stx CALLFN          ; runs main() in the C code

        lda #DSCOMM         ; read value in _COLUBK
        sta COLUBK          ; set the background color        
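
The C half of that round trip might look something like the sketch below. Everything here is illustrative: `display_ram` stands in for Display Data, the offsets simply mirror the order of `_SWCHA`, `_INPT4` and `_COLUBK` in the layout above, and the colour rule is arbitrary.

```c
#include <stdint.h>

/* Placeholder offsets matching the order of the labels in the example;
   a real project would take them from the assembler's symbol output. */
enum { OFS_SWCHA = 0, OFS_INPT4 = 1, OFS_COLUBK = 2, RAM_SIZE = 3 };

/* Stand-in for Display Data RAM shared by the 6507 and the ARM. */
uint8_t display_ram[RAM_SIZE];

/* What the code reached through CALLFN might do: read the inputs the
   6507 sent, then leave a result for the 6507 to fetch via DSCOMM. */
void update_background(void)
{
    uint8_t joysticks = display_ram[OFS_SWCHA];
    uint8_t fire      = display_ram[OFS_INPT4];

    /* Arbitrary demo rule: the joystick bits choose the colour, and a
       pressed fire button (bit 7 clear) clears the low nibble. */
    uint8_t color = joysticks;
    if ((fire & 0x80) == 0)
        color &= 0xF0;

    display_ram[OFS_COLUBK] = color;
}
```

Because `_COLUBK` sits directly after the two input bytes, the pointer is already in the right place when the 6507 does its `lda #DSCOMM` after the call returns.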
          

 



 


To see how long routines run I set up the 6507 code to send time remaining (from RIOT register INTIM) to the C code. The C code can then display it using the score. I'll most likely cover that in Part 4 or 5 of the CDFJ tutorial.

 

If you don't want to wait that long you can check out this blog entry to see how I did it for DPC+. Do note it's a bit messed up (specifically the code blocks) due to the recent forum upgrade, but if you download the source you can see it. Look for TimeLeftOS and TimeLeftVB in the 6507 code. Also note the DPC+ registers start with DF instead of DS.  That's a misnomer because Data Fetcher implies read-only even though they can also be used to write data. This is because the names come from DPC where the only thing you could use them for was to fetch values from Display Data. In DPC+ these registers evolved to also include the ability to write to Display Data.
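
As a rough idea of what the C side of such a measurement could do, the sketch below keeps track of the lowest INTIM value it has been handed. It assumes the timer was started with TIM64T, so one tick is 64 cycles; the variable and function names are made up for this example.

```c
#include <stdint.h>

#define CYCLES_PER_TICK 64u   /* assumes the timer was started via TIM64T */

uint8_t lowest_time_left = 0xFF;   /* worst case seen so far */

/* Call once per frame with the INTIM value the 6507 sent over. */
void record_time_left(uint8_t intim)
{
    if (intim < lowest_time_left)
        lowest_time_left = intim;
}

/* Rough number of 6507 cycles still free in the worst frame so far. */
uint32_t worst_case_free_cycles(void)
{
    return (uint32_t)lowest_time_left * CYCLES_PER_TICK;
}
```

Showing `lowest_time_left` rather than the current value makes occasional slow frames easier to spot, since the display doesn't flicker back to a comfortable number on the next frame.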


On 12/5/2019 at 5:21 PM, SpiceWare said:

To see how long routines run I set up the 6507 code to send time remaining (from RIOT register INTIM) to the C code. The C code can then display it using the score.

...

Thanks! Yes, it makes sense to use the score to display the remaining time for Vertical Blank and Overscan. I guess I first need to get the score-display routines working?


On 12/5/2019 at 5:21 PM, SpiceWare said:

To see how long routines run I set up the 6507 code to send time remaining (from RIOT register INTIM) to the C code. The C code can then display it using the score.

...

Displaying the remaining 'free' time in the score-display works for me!

Btw I found that using mod (%) or div (/) in C with anything other than a power-of-two number will stop execution of the Thumb routine that was called. 

Converting an integer to an array of digits using division is clearly not the way to go on this ARM CPU?

 

[Image: Screenshot_7.png]

Edited by Dionoid

Great! Nice 7 digit score!

 

Yep - the ARM in the Harmony/Melody supports multiplication, but not division.  I know the compiler's smart enough to use shifting to divide by powers of 2, but hadn't tried other values to see what would happen.
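
Whatever the exact cause, one way to sidestep / and % when splitting a score into digits is repeated subtraction of powers of ten. This is only a sketch: `to_digits()` and `SCORE_DIGITS` are names invented here, and it isn't necessarily how either score display in this thread was written.

```c
#include <stdint.h>

#define SCORE_DIGITS 7

/* Split value into decimal digits, most significant first, using only
   comparison and subtraction so no divide routine is ever needed.
   Values above 9,999,999 are clamped. */
void to_digits(uint32_t value, uint8_t digits[SCORE_DIGITS])
{
    static const uint32_t powers[SCORE_DIGITS] = {
        1000000u, 100000u, 10000u, 1000u, 100u, 10u, 1u
    };

    if (value > 9999999u)
        value = 9999999u;

    for (int i = 0; i < SCORE_DIGITS; i++) {
        uint8_t digit = 0;
        while (value >= powers[i]) {   /* at most 9 iterations */
            value -= powers[i];
            digit++;
        }
        digits[i] = digit;
    }
}
```

The inner loop runs at most 9 times per digit, so the worst case is 63 subtractions for a 7 digit score. Keeping the score as separate digits (or BCD) in the first place avoids the conversion altogether.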

 

I've finished writing code for Part 3 - starts out with the splash screen

 

[Image: collect3.png]

 

after a couple seconds it goes to the menu screen

 

[Image: collect3_1.png]

 

hitting RESET in the menu takes you to the game screen, where you can move the players around. Hitting SELECT takes you back to the menu.

[Image: collect3_2.png]

 

Next I need to go thru the code to clean it up and comment it better so I can post it.


  • 1 year later...
On 1/17/2021 at 7:13 AM, Andrew Davie said:

I have found a seemingly valid use-case for auto-increment of 0.  Here's hoping it works!

 

It does, I used it in Draconian:

  setIncrement(DS_STATION_COLOR,  0, 0);

 

Draconian uses auto-detection for NTSC/PAL/SECAM and populates a color table in Display Data RAM for each object. The value for the station is read multiple times during the score kernel to set the color of the initial copy of P1, which is used to draw the radar.  Don't recall why I didn't do the same for the FormationColor, which is loaded from ZP RAM.  Possibly timing was tight and I needed to free up 1 cycle.

 

ScoreKernel:
        lda #<DS_PF0L           ; 2  2
        sta PF0                 ; 3  5
        sta ENABL               ; 3  8 - on VDEL
        DIGITAL_AUDIO           ; 5 13
        lda #<DS_PF1L           ; 2 15
        sta PF1                 ; 3 18
        lda #<DS_PF2L           ; 2 20
        sta PF2                 ; 3 23
        lda #<DS_RADAR_GRP0     ; 2 25
        sta GRP0                ; 3 28 - on VDEL
        lda #<DS_RADAR_GRP1     ; 2 30
        sta GRP1                ; 3 33 - updates GRP0 and BL as well
        lda #<DS_STATION_COLOR  ; 2 35
        sta COLUP1              ; 3 38
        lda #<DS_FORMATION_GRP0 ; 2 40
        sta GRP0                ; 3 43 - on VDEL
        lda #<DS_PF0R           ; 2 45
        sta PF0                 ; 3 48 - PF0R, 28-49
        lda #<DS_PF1R           ; 2 50
        sta PF1                 ; 3 53 - PF1R, 39-54
        lda #DS_FORMATION_GRP1  ; 2 55
        tay                     ; 2 57
        lda #<DS_PF2R           ; 2 59
        sta PF2                 ; 3 62 - PF2R, 50-65
        lda FormationColor      ; 3 65
        sty GRP1                ; 3 68 - updates GRP0 as well
        sta COLUP1              ; 3 71
        SLEEP 2                 ; 2 73
        jmp FASTJMP             ; 3 76

 


  • 4 months later...
On 6/19/2021 at 9:04 AM, Lillapojkenpåön said:

Maybe this has been covered somewhere but I've never seen it: how exactly do the CDFJ registers auto increment or decrement? How does the ARM even know that the 6502 is reading or writing to that register? How does the magic change happen?

 

Covered in the comments of this old DPC+ARM blog entry, so no surprise you hadn't run across it.

 

On 5/1/2015 at 10:15 AM, SpiceWare said:

There's no connection between the 6507 and the ROM. The connections are this:

6507 <--> ARM <--> Flash (ROM) and SRAM (RAM)

The ARM emulates everything about the cartridge, not just the DPC+ data fetchers. This emulation includes the bank switching hardware, any expansion RAM, and the ROM - this includes the connection between the 6507 and the simplest 2K and 4K ROMs.

 

Since the ARM is emulating everything it knows when the 6507 reads from a specific address and does whatever is required to return the expected value on the databus, be it a value from ROM, RAM, a datastream, a value calculated on the fly for 3-voice music, etc.

 

 

The Further Reading section in Part 1 also covers what the ARM is doing. @cd-w's Perfect Harmony goes into detail on how it handles a ROM cartridge with bankswitching:

 

Quote

The software on the harmony cart is also very straightforward. It doesn't attempt to synchronize clocks with the Atari. Instead, it just sits in a permanent loop, servicing requests from the Atari as follows:


1) Wait for an address request from the Atari (A12 high).
2) Read address bus (A0-A11).
3) Fetch data from flash memory at the requested address (may perform bankswitching here).
4) Assert data on the data bus (D0-D7).
5) Wait until there is a change of address (A0-A12).
6) Repeat forever.

This approach is necessary as the cartridge port doesn't output the 6502 clocks (phi1 and phi2) and also doesn't have a R/W line to distinguish between reads and writes. The wait for A12 at the beginning has the effect of keeping the ARM and Atari loosely synchronized.

 

There's just enough time during (3) for it to do the various things required to emulate the BUS, DPC, DPC+, CDFJ, etc coprocessor assisted cartridges.


Correct, in step 3 it checks if the address corresponds to a "CDFJ Register" and reacts accordingly.

 

Additionally, in the case of FastFetch and FastJump, it's monitoring the ROM values it puts on the databus to see if they correspond to a LDA #xx, JMP $0000, or JMP $0001 instruction.  If so, it overrides the xx, $0000, or $0001 value that would have next been put on the databus.
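
A heavily simplified C model of the FastFetch part is sketched below. It only looks at the last byte served, so it cannot tell a real `LDA #` opcode from a data byte that happens to be $A9, and it ignores FastJump and bankswitching entirely. `CartModel` and `cart_read()` are hypothetical names, not taken from the driver source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_LDA_IMMEDIATE 0xA9u
#define STREAM_COUNT 32u          /* general purpose streams only */

typedef struct {
    const uint8_t *rom;           /* cartridge image */
    size_t rom_size;
    const uint8_t *stream_data[STREAM_COUNT];
    size_t stream_pos[STREAM_COUNT];
    int fast_fetch_enabled;
    int last_was_lda_immediate;   /* did we just serve an LDA # opcode? */
} CartModel;

/* Serve one read request from the 6507. */
uint8_t cart_read(CartModel *c, size_t address)
{
    assert(address < c->rom_size);
    uint8_t value = c->rom[address];

    if (c->last_was_lda_immediate) {
        c->last_was_lda_immediate = 0;   /* this byte is the operand */
        if (c->fast_fetch_enabled && value < STREAM_COUNT) {
            uint8_t stream = value;      /* operand names a stream */
            assert(c->stream_data[stream] != NULL);
            value = c->stream_data[stream][c->stream_pos[stream]++];
        }
        return value;
    }

    c->last_was_lda_immediate = (value == OPCODE_LDA_IMMEDIATE);
    return value;
}
```

In this model an operand of 32 or above passes through untouched, so an ordinary immediate load of a larger value still behaves normally while Fast Fetch is on.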


  • 8 months later...

When/how is what DSCOMM is "pointing" to ... reset?  That is, when is the pointer reset to point to the first item?

That is, if I load from DSCOMM, and there are a bunch of things I'm loading that way, how do I know what it's feeding me?

 

 

Edit: OK, I think I might understand... please correct where wrong!

 

It's basically hardwired, or intended to be used, such that you first write vars which go to the storage at DS_TO_ARM.  More specifically, the address of this storage is written as lo/hi to DSPTR.  Then you write multiple bytes to DSWRITE, which are then stored in that DS_TO_ARM memory area.  The label "DS_FROM_ARM" is superfluous; it's there as a visual placeholder only.  As soon as you start reading from DSCOMM (lda #DSCOMM), you are in fact reading from whatever DSPTR currently points to (which increments by one with each write to DSWRITE, or read via DSCOMM).  There is no "reset" per se, other than writing DSPTR.  You can either write it, or assume that once you've written your vars for TO ARM, you can do some ARM stuff and then read the next n bytes (however many are in your FROM section) via "lda #DSCOMM".

 

Edited by Andrew Davie

3 hours ago, Andrew Davie said:

Edit:OK I think I might understand...

 

That is correct.

 

If you have _DS_FROM_ARM somewhere other than directly after the _DS_TO_ARM section you could either write the _DS_FROM_ARM address to DSPTR in the 6507 code, or use setPointer(DSCOMM, _DS_FROM_ARM) in the C code.

