
CDFJ support?


Andrew Davie


1 hour ago, JetSetIlly said:

One thing we still need to figure out is how the STM32F differs from the LPC2000 internally. I believe we now understand the memory model differences, but there are still questions about the timer and whether there is a MAM and/or how it works in the STM32F.

 

In the meantime, I believe we can auto-detect differences between the memory models quite easily.

 

Auto-detection of which model a CDF binary is targeting can be done through inspection of 0x863 and 0x867:


        if data[0x863]&0x20 == 0x20 && data[0x867]&0x20 == 0x20 {
            memModel = memorymodel.PlusCart
        } else {
            memModel = memorymodel.Harmony
        }

 

And similarly for DPC+ the differences are in 0xc4b and 0xc4f:


        if data[0xc4b]&0x20 == 0x20 && data[0xc4f] == 0x20 {
            memModel = memorymodel.PlusCart
        } else {
            memModel = memorymodel.Harmony
        }

 

There are other differences but these are the ones I'm using.
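
For readers following along in C rather than Go, the two checks above could be sketched as below. This is only an illustration: the enum and function names are hypothetical, the fallback to Harmony for images that are too short is an assumption, and the DPC+ test mirrors the conditions exactly as written in the snippet above. The inspected bytes are presumably the high bytes of little-endian RAM addresses (0x20000000 on the STM32 versus 0x40000000 on the LPC2000), but that reading is an assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this sketch. */
typedef enum { MODEL_HARMONY, MODEL_PLUSCART } mem_model_t;

/* CDF/CDFJ: test bit 5 of the bytes at 0x863 and 0x867. */
mem_model_t detect_cdf_model(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    if (len <= 0x867)               /* assumption: too short means Harmony */
        return MODEL_HARMONY;
    if ((data[0x863] & 0x20) == 0x20 && (data[0x867] & 0x20) == 0x20)
        return MODEL_PLUSCART;
    return MODEL_HARMONY;
}

/* DPC+: same idea at 0xc4b and 0xc4f, with the second byte compared
 * for equality, exactly as in the snippet above. */
mem_model_t detect_dpcp_model(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    if (len <= 0xc4f)               /* assumption: too short means Harmony */
        return MODEL_HARMONY;
    if ((data[0xc4b] & 0x20) == 0x20 && data[0xc4f] == 0x20)
        return MODEL_PLUSCART;
    return MODEL_HARMONY;
}
```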

 

 

I have been thinking about the detection too. I would suggest we change the signature or the version in the driver code.

 

But this should be discussed with Chris (@cd-w), Darrell (@SpiceWare), Fred (@batari) and John (@johnnywc) once we get CDFJ up and running on the STM32F4.

 

 


On 12/10/2021 at 7:39 AM, Thomas Jentzsch said:

Got Cubix running in Stella using STM32F addresses. I did not change Andrew's code at all, just used the attached fixed files.

 

It doesn't work in gopher2600 though. So one of the emulators must be wrong. :) 

Attached files: cubiks.bin (32 kB), defines_cdfj.h (5.89 kB), custom.boot.lds (1.67 kB), custom.S (1.59 kB)

I made a bit of a hack/change to the makefile for Cubiks so I can easily build either version without too much thought.

To switch, run either "make STM" or "make LPC", followed by "make". This creates a binary using the correct header files.

The STM build is working OK for me on Gopher.


 

Here is the current, non-working state of the CDFJ driver for PlusCart/UnoCart. The digital audio part and the music data fetcher update are still missing. The driver only handles the 0x4a subversion (CDFJ). Our collect3 binary only produces a black screen with this driver.

 

/*
 * cartridge_emulation_cdfj.c
 *
 *  Created on: 05.12.2021
 *      Author: stubig
 */
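#include <ctype.h>
#include <stdbool.h>   // bool
#include <stdint.h>    // uint8_t, uint16_t, uint32_t
#include <stdlib.h>
#include <string.h>    // memcpy
#include "cartridge_emulation.h"
#include "cartridge_firmware.h"
#include "global.h"

#include "cartridge_emulation_dpcp.h"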

#define LDA_IMMEDIATE 0xA9
#define JMP_ABSOLUTE  0x4C

#define DSCOMM 0x20                // datastream used for DSPTR and DSWRITE
#define DSJMP_BASE (uint16_t)0x21  // datastream base used for JMP FASTJMP1 and FASTJMP2 (0x22)



void emulate_CDFJ_cartridge( uint32_t image_size)
{
    uint8_t prev_rom = 0;

    uint16_t addr, addr_prev = 0, addr_prev2 = 0, data = 0, data_prev = 0;

    uint8_t* ccm = CCM_RAM;
    memcpy(ccm, buffer, 0x800); // CDFJ ARM Driver code (not really needed)

    uint8_t *myProgramImage = buffer + 4*1024, *bankPtr = buffer + 28*1024;
    uint8_t *myDisplayImage = ccm + 0x800;

    uint8_t *myAmplitudeStream = ccm + 0x23;
    uint8_t *myFastjumpStreamIndexMask = ccm + 0xfe;
    uint8_t *myDatastreamBase = ccm + 0x0098;
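    // 0x0098 and 0x0124 are byte offsets, so add them before casting to uint32_t*
    // (adding them to a uint32_t* would scale them by 4)
    uint32_t *myDatastream32Base = (uint32_t*)(0x10000000 + 0x0098);
    uint32_t *myDatastreamCommPointer32 = (uint32_t*)(0x10000000 + 0x0098 + (DSCOMM * 4));

    uint32_t *myDatastreamIncrement32Base = (uint32_t*)(0x10000000 + 0x0124);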
    uint8_t *myWaveformBase = ccm + 0x01b0;


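    // indexed by (addr & 7): 0x1FF4 (index 4) selects bank 0; 0x1FFB (index 3) is not a valid CDFJ hotspot
    uint16_t bankAddr[8] = {0x4000, 0x5000, 0x6000, 0x7000, 0x0000, 0x1000, 0x2000, 0x3000 };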

    uint16_t myDatastreamIndex;

    // Assuming mode starts out with Fast Fetch off and 3-Voice music,
    // need to confirm with Chris
    bool myFastFetchOn = false, myDigitalAudioOn = false, myFastJumpActive = false;


    uint32_t thumb_code_entry_point = (*(volatile uint32_t*)(&buffer[0x808]));//(uint32_t)0x20004d85; //buffer + 0x3bf4 + 1; //0x0c00;


    if (!reboot_into_cartridge()) {
        return ;
    }

    __disable_irq();    // Disable interrupts

    while (1)
    {
        while (((addr = ADDR_IN) != addr_prev) || (addr != addr_prev2))
        {
            addr_prev2 = addr_prev;
            addr_prev = addr;
        }

        // got a stable address
        if (addr & 0x1000)
        { // A12 high
            if (addr >= 0x1FF0 && addr <= 0x1FFB){
                switch(addr){
                    case 0x1FF0:  // DSWRITE
                        while (ADDR_IN == addr) { data_prev = data; data = DATA_IN; }
                        myDisplayImage[ (myDatastreamCommPointer32[0] >> 20) ] = (uint8_t)(data_prev DATA_IN_SHIFT);
                        myDatastreamCommPointer32[0] += 0x00100000;
                        break;
                    case 0x1FF1:  // DSPTR
                        while (ADDR_IN == addr) { data_prev = data; data = DATA_IN; }
                        myDatastreamCommPointer32[0] <<= 8;
                        myDatastreamCommPointer32[0] &= 0xf0000000;
                        myDatastreamCommPointer32[0] |= (((uint32_t)data_prev) << 20);
                        break;
                    case 0x1FF2:  // SETMODE
                        while (ADDR_IN == addr) { data_prev = data; data = DATA_IN; }
                        data = (uint8_t)(data_prev DATA_IN_SHIFT);
                        myFastFetchOn = ((data & 0x0F) == 0);
                        myDigitalAudioOn = ((data & 0xF0) == 0);
                        break;
                    case 0x1FF3:  // CALLFN
                        while (ADDR_IN == addr) { data_prev = data; data = DATA_IN; }
                        addr_prev = ADDR_IN;
                        DATA_OUT = ((uint16_t)0xEA)DATA_OUT_SHIFT;                // (NOP)
                        SET_DATA_MODE_OUT;

                        ((int (*)())thumb_code_entry_point)(data_prev);
                        // disable_irq_timer();
                        // __disable_irq();
                        // now send the VCS Program Counter to last address
                        addr = ADDR_IN;
                        while (ADDR_IN == addr);

                        addr = ADDR_IN;
                        DATA_OUT = ((uint16_t)0x4C)DATA_OUT_SHIFT;                // (JMP)
                        while (ADDR_IN == addr);

                        addr = ADDR_IN;
                        DATA_OUT = (addr_prev & 0xff)DATA_OUT_SHIFT;    // (Low Byte of new addr)
                        while (ADDR_IN == addr);

                        addr = ADDR_IN;
                        DATA_OUT = (addr_prev >> 8)DATA_OUT_SHIFT;    // (High Byte of new addr)
                        addr_prev = addr;                // set addr_prev for next loop
                        while (ADDR_IN == addr);
                        SET_DATA_MODE_IN;

                        break;

                    default: // Bank switching
                        bankPtr = &myProgramImage[ bankAddr[ ( addr & 7 ) ] ];
                        while (ADDR_IN == addr)
                            ;
                        SET_DATA_MODE_IN;
                        break;
                }
            }else{
                if(myFastFetchOn){
                    if(prev_rom == JMP_ABSOLUTE ){
                        prev_rom = bankPtr[addr & 0xFFF];
                        if( (prev_rom & 0xfe) == 0 && bankPtr[(addr + 1) & 0xFFF ] == 0){
                            myDatastreamIndex = (((uint16_t)prev_rom) ) + DSJMP_BASE;
                            myFastJumpActive = true;
                            prev_rom = myDisplayImage[ myDatastream32Base[myDatastreamIndex] >> 20 ];
                            // increment datastream
                            myDatastream32Base[myDatastreamIndex] += 0x00100000;
                        }
                        goto _end_cycle;
                    }else if(prev_rom == LDA_IMMEDIATE ){
                        myDatastreamIndex = ((uint16_t)bankPtr[addr & 0xFFF]);
                        prev_rom = myDisplayImage[ myDatastream32Base[myDatastreamIndex] >> 20 ];
                        // increment datastream fractional
                        myDatastream32Base[myDatastreamIndex] += ( myDatastreamIncrement32Base[myDatastreamIndex] << 12);

                        goto _end_cycle;
                    }else if(myFastJumpActive){
                        myFastJumpActive = false;
                        prev_rom = myDisplayImage[ myDatastream32Base[myDatastreamIndex] >> 20 ];
                        // increment datastream
                        myDatastream32Base[myDatastreamIndex] += 0x00100000;
                        goto _end_cycle;
                    }
                    goto _normal_rom_access;

                } else {

                    // normal rom access
                    _normal_rom_access:
                    prev_rom = bankPtr[addr & 0xFFF];
                    _end_cycle:
                    DATA_OUT = ((uint16_t) prev_rom)DATA_OUT_SHIFT;
                    SET_DATA_MODE_OUT;
    //                updateMusicModeDataFetchers();
                    while (ADDR_IN == addr)
                            ;
                    SET_DATA_MODE_IN;
                }
            }
        }
    }

    __enable_irq();
}

 


3 hours ago, Thomas Jentzsch said:

if(addr & 0x1000)

I doubt I can help much, but shouldn't you do "addr &= 0x1FFF" before the switch?

Yes, but the upper GPIOs should be pulled down, so 0x1FFF is the maximum we can get on the GPIO port. "addr &= 0x1FFF" isn't done in any of the other bankswitching schemes either.

 


A brief update on the (slow) progress.


After building several test ROMs by replacing bank 6 of @JetSetIlly's test ROM, I can now confirm that the following combinations are working on the PlusCart:

  • A standard 2K ROM (combat) in bank 6.
  • A custom 2K ROM which writes to DSWRITE, DSPTR and SETMODE and reads back the saved values with LDA_IMMEDIATE in FastFetchMode.

untested:

  • JMP_ABSOLUTE in FastFetchMode (can only be tested with valid data in one of the FASTJMP data streams).

failed:

  • Calling the "Initialize()" function via CALLFN.

 

The ARM entry point (&buffer[0x808]) from my previous code post is definitely wrong:

    uint32_t thumb_code_entry_point = (*(volatile uint32_t*)(&buffer[0x808]));

But using the bootloader entry point (0x20000808) or the entry point for the custom main function stored at the end of the bootloader (&buffer[0x864]) didn't work either.

 

I had no luck with the bootloader entry point with my previous DPC+ tests, but the main function address at the end of the DPC+ bootloader (&buffer[0xc4c]) worked.

 

So I am not sure if we still have the wrong entry point, a problem with the bootloader, or if something is wrong in the CALLFN area of my code.

 

 

 

 


11 minutes ago, Al_Nafuur said:

But using the bootloader entry point (0x20000808) or the entry point for the custom main function stored at the end of the bootloader (&buffer[0x864]) didn't work either.

 

0x20000808 is the start of the Thumb code. The ARM code starts at 0x20000800 - the first two ARM instructions are all about entering Thumb mode.

 

(In Stella/Gopher2600 we enter at 0x20000808 because neither emulator emulates ARM mode, only Thumb mode. So depending on what your driver is doing you might need to enter at 0x20000800.)
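
One general point that may matter on real hardware: on Cortex-M cores such as the STM32F407, an indirect call through a function pointer takes the execution state from bit 0 of the target address, and a target with bit 0 clear faults because the core cannot switch to ARM state. The sketch below only shows the bit manipulation. The function names are hypothetical, and whether the addresses stored in the binary already have bit 0 set is not something this thread establishes.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero if the address is usable as an indirect call target
 * on a Thumb-only core (bit 0 set). */
int is_thumb_call_target(uint32_t addr)
{
    return (addr & 1u) != 0;
}

/* Turn the address of a Thumb instruction into a call target
 * by setting bit 0. Already-tagged addresses are unchanged. */
uint32_t thumb_call_target(uint32_t code_addr)
{
    uint32_t target = code_addr | 1u;
    assert(is_thumb_call_target(target));
    return target;
}
```

For example, the Thumb code at 0x20000808 would be called through the value 0x20000809.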

 

11 minutes ago, Al_Nafuur said:

So I am not sure if we still have the wrong entry point, a problem with the bootloader, or if something is wrong in the CALLFN area of my code.

 

What does your driver do when the ARM program has finished? In Gopher2600 I had to be careful about how/when I put the JMP instruction on the bus because of phantom reads.


22 minutes ago, JetSetIlly said:

0x20000808 is the start of the Thumb code. The ARM code starts at 0x20000800 - the first two ARM instructions are all about entering Thumb mode.

 

(In Stella/Gopher2600 we enter at 0x20000808 because neither emulator emulates ARM mode, only Thumb mode. So depending on what your driver is doing you might need to enter at 0x20000800.)

The STM32F407 only has the Thumb instruction set.

 

 

Quote

What does your driver do when the ARM program has finished? In Gopher2600 I had to be careful about how/when I put the JMP instruction on the bus because of phantom reads.

 

during the execution the data bus is set to 0xEA (6502 NOP instruction)

                        DATA_OUT = ((uint16_t)0xEA)DATA_OUT_SHIFT;                // (NOP)
                        SET_DATA_MODE_OUT;

on return from the custom function I am skipping one address bus change, to prevent jumping in at the end of a 6502 cycle (where the 6502 is already reading the data bus):

                        addr = ADDR_IN;
                        while (ADDR_IN == addr);                       // wait for end of current 6507 cycle.

                        addr = ADDR_IN;
                        DATA_OUT = ((uint16_t)0x4C)DATA_OUT_SHIFT;                // (JMP)
                        while (ADDR_IN == addr);

                        addr = ADDR_IN;
                        DATA_OUT = (addr_prev & 0xff)DATA_OUT_SHIFT;    // (Low Byte of new addr)
                        while (ADDR_IN == addr);

                        addr = ADDR_IN;
                        DATA_OUT = (addr_prev >> 8)DATA_OUT_SHIFT;    // (High Byte of new addr)
                        addr_prev = addr;                // set addr_prev for next loop
                        while (ADDR_IN == addr);
                        SET_DATA_MODE_IN;                        

 

 

This is similar to the (working) part of the DPC+ emulation:

https://gitlab.com/firmaplus/atari-2600-pluscart/-/blob/master/source/STM32firmware/PlusCart/Src/cartridge_emulation_dpcp.c#L353

 

 


11 minutes ago, JetSetIlly said:

 

Oh yes. I hadn't noticed that.

 

Are we 100% sure that the gcc flags, -mcpu=arm7tdmi and -march=armv4t, are okay for the Cortex-M4?

If the code runs in Thumbulator, shouldn't it run on the STM32F407 too?

 

But I seem to remember that I have adjusted the compile flags for DPC+ too.

 


 

-mthumb

 

but this should be needed for the emulators too..

 

14:08:12 **** Build of configuration Default for project Collect_demo ****
make custom2 
arm-none-eabi-gcc    -c -o custom.o src/custom.S
arm-none-eabi-gcc -mcpu=arm7tdmi -march=armv4t -mthumb  -Wall -ffunction-sections -save-temps -g -Wa,-a,-ad,-alhms=bin/main.lst  -Os     -c -o main.o main.c
arm-none-eabi-gcc -mcpu=arm7tdmi -march=armv4t -mthumb  -Wall -ffunction-sections -save-temps -g -Wa,-a,-ad,-alhms=custom.o  -Os   -o bin/custom2.elf custom.o main.o -T src/custom.boot.lds -nostartfiles -Wl,-Map=bin/custom2.map,--gc-sections 
arm-none-eabi-objcopy -O binary -S bin/custom2.elf bin/custom2.bin
arm-none-eabi-size custom.o main.o bin/custom2.elf
   text	   data	    bss	    dec	    hex	filename
     80	      0	      0	     80	     50	custom.o
    156	      0	      0	    156	     9c	main.o
    224	      0	      4	    228	     e4	bin/custom2.elf

14:08:16 Build Finished. 0 errors, 0 warnings. (took 4s.200ms)

 


1 minute ago, Al_Nafuur said:

 

-mthumb

 

but this should be needed for the emulators too..

 



 

 

Yes. -mthumb just says to produce thumb instructions instead of arm instructions.

 

You could try -mcpu=cortex-m4 and -march=armv7e-m. That doesn't work on Gopher2600 as it currently stands, but it might be necessary for the hardware.


10 minutes ago, Thomas Jentzsch said:

I vaguely remember a discussion regarding the Harmony, that sometimes non-Thumb code was created (especially for math functions IIRC) even though Thumb only was selected. Could this be the case here too?

 

Maybe when compiling for -mcpu=cortex-m4 and -march=armv7e-m.

 

This is from the objdump for that binary.

 

[Image: objdump excerpt from that binary]

 

tbb isn't a Thumb instruction.


Here are some excerpts from the discussions I mentioned:

Quote

I disassembled the code and found that the assembler is replacing the "LDR R4, =0xF80" with the 32-bit Thumb instruction "mov.w R4, #0xF80". Apparently, this is an ARMv6T2 Thumb instruction, but these chips are ARMv4T, I believe, so I don't know that this will work.

I was able to fix this problem by adding ".cpu  arm7tdmi" to the top of the assembly file (even though I am using -mcpu=arm7tdmi -march=armv4t as compile flags).   I suspect this is a bug in gcc - I'm using the latest linaro toolchain from here, which hasn't been updated since 2019 (as all the new development is now for Cortex):

Quote

I have not found any way to prevent GCC generating ARM code for the divide instructions (I'm using the Linaro builds that do not support Cortex M0).   My solution was to take the divide functions from the GCC library and compile it directly into my code.

Maybe that helps.


I am not sure:

Quote

Afaik, -mthumb forces the compiler to emit only thumb instructions, but the -march and -mcpu settings tell it that the CPU supports 32-bit ARM, too, so it links against a corresponding version of the standard library (ARM in this version does not support division, so it generates a library call). I think Batari's suggestion of switching to a thumb-only architecture (like Cortex-M) should "fix" the issue, though, as it will force gcc to link against a thumb-only version of the standard library.

 


Just now, Thomas Jentzsch said:

I am not sure:

 

The problem instruction I immediately stumbled on is a Thumb-2 instruction rather than a Thumb instruction.

 

@Al_Nafuur's observation that the STM32F407 only supports the "Thumb" instruction set is true, but after reading a bit further I see that it actually supports "Thumb-2", which includes 32-bit instructions like TBB.
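
To make the Thumb versus Thumb-2 distinction concrete, here is a rough sketch of how a binary could be scanned for table branch instructions (TBB/TBH). It relies on two encoding facts from the ARMv7-M architecture: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction, and TBB/TBH are encoded as 0xE8Dn followed by 0xF00m or 0xF01m. The function names are hypothetical, and a linear scan like this can misfire on literal-pool data. Note also that the two halves of a classic Thumb BL use the 0b11110 and 0b11111 prefixes, so the prefix test alone does not prove that Thumb-2 code is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Non-zero if `hw` is the first halfword of a 32-bit Thumb encoding. */
int is_32bit_thumb_prefix(uint16_t hw)
{
    unsigned top5 = (unsigned)hw >> 11;
    return top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu;
}

/* Non-zero if the halfword pair encodes TBB or TBH. */
int is_tbb_or_tbh(uint16_t hw1, uint16_t hw2)
{
    return (hw1 & 0xFFF0u) == 0xE8D0u && (hw2 & 0xFFE0u) == 0xF000u;
}

/* Return the halfword index of the first TBB/TBH found, or -1. */
long find_table_branch(const uint16_t *code, size_t count)
{
    assert(code != NULL || count == 0);
    for (size_t i = 0; i + 1 < count; i++) {
        if (is_tbb_or_tbh(code[i], code[i + 1]))
            return (long)i;
    }
    return -1;
}
```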

 

The question is whether building with -mcpu=arm7tdmi -march=armv4t will work on the STM32F407. I believe it should, but it would be good to ascertain whether the alternative flags produce working builds for the PlusCart without any other changes.

 

If the alternative flags do work then @Al_Nafuur's CALLFN code is correct.

 


I found the "-mno-thumb-interwork" switch in some of the DPC+ projects.

 

But as far as I understand the discussion, ARM instructions are not wanted for the Harmony projects either (because the MCU is switched to Thumb mode).

 

With ARM instructions our test ROMs shouldn't work in gopher2600 and Stella (but they do!)

 

Also, the collect3 example is not using any divisions in "main()" and "Initialize()", which are the functions I am trying to call.

 


1 minute ago, Al_Nafuur said:

I found the "-mno-thumb-interwork" switch in some of the DPC+ projects.

 

But as far as I understand the discussion, ARM instructions are not wanted for the Harmony projects either (because the MCU is switched to Thumb mode).

 

With ARM instructions our test ROMs shouldn't work in gopher2600 and Stella (but they do!)

 

The test ROMs we've been building with have used the -mcpu=arm7tdmi -march=armv4t flags. With the -mthumb flag this produces only 16 bit thumb instructions, which are supported by the emulators. For cortex-m4 / armv7e-m, GCC will produce 32-bit thumb instructions, which aren't supported (yet).
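
One way to check which instruction set a given set of flags actually selects is to look at the compiler's predefined macros. The sketch below assumes GCC or Clang, which define `__thumb__` when generating Thumb code and additionally `__thumb2__` when the target supports Thumb-2. The function names are hypothetical. When built for the host it simply reports that the target is not 32-bit ARM.

```c
#include <assert.h>
#include <stdio.h>

/* Report the instruction set the compiler is generating code for.
 * __thumb2__ is tested first because __thumb__ is also defined
 * whenever __thumb2__ is. */
const char *target_instruction_set(void)
{
#if defined(__thumb2__)
    return "Thumb-2";
#elif defined(__thumb__)
    return "Thumb";
#elif defined(__arm__)
    return "ARM";
#else
    return "not a 32-bit ARM target";
#endif
}

void report_target(void)
{
    const char *name = target_instruction_set();
    assert(name != NULL);
    printf("compiled for: %s\n", name);
}
```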

 

1 minute ago, Al_Nafuur said:

 

Also, the collect3 example is not using any divisions in "main()" and "Initialize()", which are the functions I am trying to call.

 

 

In the case that I found, it has nothing to do with division. TBB is a type of branch instruction.

 

 


6 minutes ago, JetSetIlly said:

 

I think we have our wires crossed. I've confused things with my suggestion. Sorry.

 

Of course there still might be something "wrong" with the binary's custom CDFJ code, but it can't be ARM instructions (otherwise it wouldn't be working in the emulators either).

 

If it is the binary, then it must be something else. Something that works in thumbulator but not on the STM32F4.


Just now, Al_Nafuur said:

 

Of course there still might be something "wrong" with the binary's custom CDFJ code, but it can't be ARM instructions (otherwise it wouldn't be working in the emulators either).

 

Agreed. I was really wondering if there was something about the Cortex-M4 architecture that required specific compile flags. Probably not but you never know.

 

Personally, I would try running a binary on the hardware that has been built with the flags I suggested. If that also doesn't work then we've eliminated a possibility at least.

 


6 minutes ago, JetSetIlly said:

 

Agreed. I was really wondering if there was something about the Cortex-M4 architecture that required specific compile flags. Probably not but you never know.

 

Personally, I would try running a binary on the hardware that has been built with the flags I suggested. If that also doesn't work then we've eliminated a possibility at least.

 

 

Can you post the binary you built with the "-mcpu=cortex-m4" flags?

 
