Introducing picocart - it works

JasonACT · June 17, 2023

No, there's no hardware CRUIN override, it's all software (see the program snip in the image posted above as a tease).

Since I don't have access (from the Pico) to the 256-byte scratchpad (though I do for all my expanded memory), the LOAD interrupt routine reads a WORD from that inaccessible memory into mine (or from wherever it is, decoded from the CRU instruction and the stored return WP) and immediately writes it back to where it came from. In between those two instructions, the Pico modifies the value (now in Pico memory that I do have access to), so when it's written back it's my desired value.

I just keep track of which column is selected, to know which row of data to use to modify that "WORD with the original CRU values" before it's written back.
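For readers following along, the write-back trick can be sketched roughly like this in C. This is not the actual picocart code: the struct, the function name and the assumption that the CRU input bits sit in the high byte of the fetched word are all made up for illustration.

```c
#include <stdint.h>

#define KEY_COLUMNS 8

/* Hypothetical key matrix kept on the Pico: one byte per keyboard column,
 * one bit per row, 0 = key pressed (active low). */
typedef struct {
    uint8_t rows[KEY_COLUMNS];
    uint8_t selected_column;   /* tracked from the console's column-select writes */
} key_matrix_t;

/* Assumption: the fetched word carries the original CRU input bits in its
 * high byte. Replace them with the row data for the selected column and
 * leave the low byte untouched. */
uint16_t patch_cru_word(const key_matrix_t *km, uint16_t original)
{
    uint8_t row_bits = km->rows[km->selected_column % KEY_COLUMNS];
    return (uint16_t)(((uint16_t)row_bits << 8) | (original & 0x00FFu));
}
```

The Pico would call something like patch_cru_word() on the word sitting in its own memory between the two instructions of the LOAD handler, so the console writes the patched value back to the scratchpad itself.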

Tursi · June 17, 2023

Ahhh, okay, that makes sense. I've been trying for some time to figure out how you were overriding the keyboard CRU.

That's actually what Classic99 used to do under the hood too (essentially).

speccery · June 18, 2023

Just completed the revision of the Picocart, this is version 1.1. It fixes the bugs I had in my mind, and adds a micro SD card. I also added a PSRAM chip (underside) to be able to support large cartridges (assuming the software is able to run fast enough - not sure). It will be interesting to see how this turns out, and if the new components will fit...

HOME AUTOMATION · June 18, 2023

May I please see the "(underside)"?

I certify that I am over the age of 13.🎂

Edited June 18, 2023 by HOME AUTOMATION

speccery · June 18, 2023

@HOME AUTOMATION good one!

I noticed that KiCad has another rendering mode. I also tested component sizes on printouts of the top and bottom copper layers, and realised the footprints I chose were too wide.

JasonACT · June 19, 2023

Nice work, I look forward to your progress posts.

What I have researched may come in handy: not including the 16KB XIP cache, the Pico has 6 memory blocks, 4 x 64KB & 2 x 4KB. Though contiguous, they are all on different buses (so to speak: any device or CPU working alone in one of the blocks gets full-speed access with no clashes). The trouble I was having, I think, is that when I pressed a key on the keyboard, the USB code (a lot of the interrupt stuff runs from RAM) was clashing with my time-sensitive TI bus server (also in RAM). The standard "no_flash" decoration on functions sticks it all in the first 64KB (unless you run out and it flows over into the next 64KB).

Now the 2 x 4KB blocks are set up for CPU1 & CPU0 GCC stacks (in that order, so if you're not using CPU1, CPU0 can overflow into the first 4KB block to have an 8KB stack).

If your "special" function is 2KB or less, you can specify "int __scratch_x("fname") fname () {" (or __scratch_y for the CPU0 block) and, while you lose half the stack, you get your important function into full-speed memory that nothing else in the device uses. Mine uses 24 bytes of stack, so that's not a problem for me at all... What is a problem is that my function is now 360 bytes longer than 2KB. The linker complains the default 2KB stack no longer fits, and the Arduino distribution I'm using has this hardcoded and compiled into a libpico.a file which I can't rebuild. I've changed the linker script, though, to move things around a bit to make it fit, and that seems to run OK.
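As a minimal sketch of that decoration (assuming the Pico SDK's pico/platform.h is on the include path when building for the device): the function name bus_server and its body are placeholders, and the empty fallback macro is only there so the snippet also compiles on a desktop compiler.

```c
#include <stdint.h>

/* On a Pico build, pull in the SDK header that defines __scratch_x(group). */
#if defined(__has_include)
#  if __has_include("pico/platform.h")
#    include "pico/platform.h"
#  endif
#endif

/* Fallback for non-Pico builds only: the decoration becomes a no-op. */
#ifndef __scratch_x
#define __scratch_x(group)
#endif

/* Hypothetical time-critical routine placed in the 4KB scratch X block.
 * The body is a stand-in; a real bus server would go here. */
uint32_t __scratch_x("bus_server") bus_server(uint32_t address)
{
    return address & 0xFFFFu;
}
```

Whether the function and the stack both fit in the 4KB block is decided by the linker script, which is why a function over 2KB needs the script changed as described above.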

I'm still fighting a random issue with repeated file record access, and as soon as I put in a SERIAL.printf to debug it, it starts to behave - damn it. Now you might think it sounds like a speed-sensitive issue (prints slow it down), but it doesn't matter if I put the printf somewhere it's never called: just by pulling in all the printf routines (the linker detects it's used, so links in a bunch of other stuff too) the flash file grows and the issue goes away. I suspect something is corrupting RAM, and what gets hit moves around depending on what I'm linking in, or how much code I've written.

My printfs show it's not me writing bytes where I shouldn't - so I'm stuck, for the moment.

speccery · June 19, 2023

@JasonACT thanks for your comments and sharing your findings!

In my StrangeCart project I came across similar issues. The LPC54114 chip I use in that project doesn't have a cache. It doesn't require it as badly as the RP2040, since it's using on-chip Flash memory over a 128-bit bus. But similarly to what you have experienced, when both CPU cores are accessing the same memory block, one of the cores has to wait. In the case of the LPC54114 it has four SRAM memory blocks, two 64k blocks and two 32k blocks, plus the flash. There's a cool on-chip crossbar switch which enables concurrent access to different memory blocks. I have structured the firmware so that the CM0+ core is always running from its own SRAM block, while the CM4 core is free to roam around. This way the CM0+ performance is consistent. It does occasionally access other memory blocks, but typically only to fetch ROM/GROM data. These accesses occur rarely compared to handling the TI-99/4A bus, so in practice the timing is predictable.

JasonACT · June 20, 2023

This hasn't fixed my weird issue, but apparently you can also set the priority of the 4 bus masters (DMA read, DMA write, CPU 0 and CPU 1) on the Pico:

#include "hardware/structs/bus_ctrl.h"

...

bus_ctrl_hw->priority = BUSCTRL_BUS_PRIORITY_PROC1_BITS;

I should have read the doco earlier: this says CPU 1 won't be slowed down by any SRAM clash, and will give cycle-accurate performance.

The doco also says banks 0-3 are word-striped for performance (unless you use the alternate mirror address, which my dist. doesn't). This may explain why it starts to work when I link in a bit more code, i.e. where things end up in the stripe pattern may affect the performance of some things. But I would have thought the bus priority setting above would have fixed this.
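To make the striping concrete, here is a small sketch of how an address in the striped region maps to a bank. It assumes the RP2040 layout in which the striped alias starts at 0x20000000 and consecutive 32-bit words rotate through banks 0-3; the function name is made up, so check the datasheet before relying on it.

```c
#include <stdint.h>

#define SRAM_STRIPED_BASE 0x20000000u  /* assumed start of the word-striped alias */
#define SRAM_STRIPED_SIZE 0x00040000u  /* 4 x 64KB */

/* Returns the bank (0-3) that serves a byte address in the striped alias,
 * or -1 if the address is outside that region. */
int striped_bank(uint32_t addr)
{
    if (addr < SRAM_STRIPED_BASE || addr - SRAM_STRIPED_BASE >= SRAM_STRIPED_SIZE)
        return -1;
    /* Address bits 3:2 select the bank, so each successive word moves on one bank. */
    return (int)((addr >> 2) & 3u);
}
```

Under this mapping, two routines that each run from "one block" in the linker map are actually spread over all four banks, so shifting code by a few words changes which accesses collide.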

Oh, and I also see (after a lot of testing) that my TI has locked up due to the Pico instability, by reading >981A (effectively the GROM read address port) and the console groms have decided to hold the ready signal low for some reason. This is at ROM address >0060, which is apparently a GROM address write instruction. I may never know what leads to this, but it does it quite reliably every time I'm part or most of the way through reading in the Tombstone City file TOMBA in the EA editor.

speccery · June 20, 2023

This is useful information for the bus control stuff.

I have implemented GROM support many times already (with an FPGA and several different microcontrollers), but I don't think I have encountered the issue you mention. Perhaps I have been lucky. I wonder if this is some kind of glitching. In your circuit you mention you are relying on the Pico to tolerate 4.9V. Is the data bus buffered or directly connected to the Pico? I don't think this is the problem, I'm just curious.

JasonACT · June 20, 2023

Everything is buffered except A15 (2nd instance for fast read access), MEMEN, DBIN, WE & CRUCLK. LOAD, EXTINT & READY are via a 2N7000 transistor, when I want to pull down the signal. (It's not me pulling READY low either, when the TI has found itself in that dark hole.)

JasonACT · June 20, 2023

I chuckle to myself.

I've written my own hex string formatter, and send bytes out to the serial port directly (so as to keep the consistent crash happening with my current build). I also store the last 16000 addresses (ROM and my RAM - along with the byte values for my RAM, which I have). When it crashes, I dump it all out of the serial port, and I can see a trace that includes the last DSR read that worked and this DSR read that didn't work.........
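A trace buffer and printf-free hex formatter along those lines could look something like the sketch below. The 16000-entry depth comes from the post; the struct layout and function names are invented for illustration and are not the actual firmware.

```c
#include <stdint.h>
#include <stddef.h>

#define TRACE_DEPTH 16000u

typedef struct {
    uint16_t addr[TRACE_DEPTH];
    uint8_t  data[TRACE_DEPTH];
    size_t   next;    /* slot the next record goes into */
    size_t   count;   /* valid records, capped at TRACE_DEPTH */
} trace_t;

/* Record one bus access, overwriting the oldest entry once the buffer is full. */
void trace_record(trace_t *t, uint16_t addr, uint8_t data)
{
    t->addr[t->next] = addr;
    t->data[t->next] = data;
    t->next = (t->next + 1u) % TRACE_DEPTH;
    if (t->count < TRACE_DEPTH)
        t->count++;
}

/* Oldest-first access: index 0 is the oldest record still held. */
uint16_t trace_addr_at(const trace_t *t, size_t i)
{
    size_t start = (t->count < TRACE_DEPTH) ? 0u : t->next;
    return t->addr[(start + i) % TRACE_DEPTH];
}

/* Minimal hex formatter that avoids printf: writes 4 digits plus a NUL. */
void hex16(uint16_t v, char out[5])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 3; i >= 0; i--) {
        out[i] = digits[v & 0xFu];
        v >>= 4;
    }
    out[4] = '\0';
}
```

Because nothing here pulls in the printf family, adding it should leave the rest of the link map largely undisturbed, which is the point when the bug moves with code size.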

When the TI-EA DSR @>2D42 finishes, it does a RTWP, which enables interrupts. On that last time, when it fails, it decides it needs to immediately run the console interrupt (VDP probably), and I can see that finishes properly and also does a RTWP... after which the next address is >0108 or >0109 (I can't tell, ROM addresses always have A15 set high), but I'm pretty sure that isn't where it came from. 5 memory accesses later it's locked up in the GROM address read (I wonder if TI knew this was a fairly big problem with the console and GROMs?).

I have a bad feeling though, those byte values (for the bad address) look a lot like DSR PAB values. 01, 08, 09...

Good news is, it doesn't appear to be a Pico timing issue, especially after all these bus optimisations. I'm recording things fairly reliably after all.

Bad news is, that bad feeling means I've maybe screwed something up in the DSR and should change my code to not use the scratch pad since I've got oodles of DSR RAM.

Tursi · June 20, 2023

6 hours ago, JasonACT said:

and the console groms have decided to hold the ready signal low for some reason

It may be important to know that GROMs leave ready low when they are inactive, and only raise it when an actual access cycle is complete. That is, for a GROM, they are always "not ready" except when a write completes or data from a read is available. They return to "not ready" as soon as the select is released.

This doesn't hold the CPU normally because the GROM Ready signal is only connected to the CPU's hold circuitry during an access.
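A tiny behavioural model of that description, written as a C sketch: the type and function names are made up, and it only models the READY logic described here, not any real GROM timing.

```c
#include <stdbool.h>

typedef struct {
    bool selected;        /* GROM select asserted by the console */
    bool cycle_complete;  /* write finished, or read data available */
} grom_model_t;

/* The GROM's own READY output: low unless a selected access has completed. */
bool grom_ready(const grom_model_t *g)
{
    return g->selected && g->cycle_complete;
}

void grom_select(grom_model_t *g)   { g->selected = true;  g->cycle_complete = false; }
void grom_complete(grom_model_t *g) { if (g->selected) g->cycle_complete = true; }
void grom_release(grom_model_t *g)  { g->selected = false; g->cycle_complete = false; }

/* What the CPU's hold circuitry sees: GROM READY only counts during a GROM access. */
bool cpu_ready(const grom_model_t *g)
{
    return g->selected ? grom_ready(g) : true;
}
```

In this model an idle GROM is permanently "not ready" yet the CPU runs freely, and the CPU only stalls between grom_select() and grom_complete().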

speccery · June 20, 2023

1 hour ago, Tursi said:

It may be important to know that GROMs leave ready low when they are inactive, and only raise it when an actual access cycle is complete. That is, for a GROM, they are always "not ready" except when a write completes or data from a read is available. They return to "not ready" as soon as the select is released.

True, it's something I realised not too long ago. It makes a microcontroller implementation easy, since by default the code can keep GREADY low. This automatically stalls the processor when it accesses a GROM, and thus gives the code as much time as it needs to respond. Or more to the point, the code can prioritise normal memory cycles and monitor GROM accesses less often. This is how it works on the StrangeCart. Even with this kind of prioritisation, at least in my case, the MCU will be ready much, much faster than ordinary GROM chips.

Edited June 20, 2023 by speccery

Tursi · June 20, 2023

Yeah, helped with UberGROM too, even though an 8MHz AVR is still easily 3 times faster.

JasonACT · June 21, 2023

On 6/20/2023 at 7:03 PM, JasonACT said:

Good news is, it doesn't appear to be a Pico timing issue

Totally a Pico timing issue, it turns out. Once the exact place it happens is known, everything else becomes obvious; as long as you dump scratch into Pico memory for final analysis. >0108 was correct, and nothing to do with my DSR. GROM being accessed with a WS pointer set to >9800 also explains how the READY signal was tricked into staying low.

I don't reckon an emulator would do this any justice.

speccery · June 27, 2023

New PCBs arrived from manufacturing. I haven’t had time to do anything with them except just look, and they don’t look bad at first glance. Looking forward to having the time and energy to build a couple and adapt the firmware.

JasonACT · June 28, 2023

I'll be very interested in what you can come up with, with the PSRAM. I've put a 2MB one on my board...

But currently, I'm holding *READY low while I copy data out of it into Pico RAM (@>6000 on the TI) when a page changes.

Things like "2MegCART" and MegaDemo work really, really well, because they appear to just copy into expansion memory and you don't notice any pause. FCMD, however, pages all the time, and runs like a BASIC program (i.e. slowly... I have the LED turning on and off with SD or PSRAM access so I can see).

speccery · June 28, 2023

@JasonACT I will keep you posted on how progress with the new PCBs and updated firmware goes. I have 64Mbit PSRAM chips, 8 megabytes. Unlike on the side port, there is no *READY line on the cartridge port. So whatever I end up doing, it needs to be very fast. My plan is to use fast QPI mode transfers, but I know it is going to be a challenge to do them in real time. The critical path will be the ability to respond to read requests with the first byte in time. Once in burst mode, data will move fast.

On the StrangeCart I have used a flash ROM in SPI mode with a 48MHz clock to stream data from the ROM. This has worked well for cartridges with known access patterns such as the cool Bad Apple demo (5 megs), but random access is a challenge.

JasonACT · June 28, 2023

I forgot to mention, I've changed the 220ohm resistor tonight in the Speech PWM circuit, to a 470ohm one. I had changed the PWM resolution to 7 bits (instead of 8) to make speech about the right volume level - but I've changed it back to 8 bits now. It's still quite loud, compared to beeps and honks.

The digital noise being introduced (Pico stuff happening) is about half what it was, though - so I don't think it's my (admittedly bad) circuit design (where I didn't bother adding caps anywhere other than where absolutely needed). I think I can go with a 1K resistor and cut down noise a bit more, while keeping the volume about right.

I'm fairly happy though, with how it stands now.

JasonACT · June 30, 2023

Wins & Losses...

So I noticed that WarZoneII wasn't being controlled by my PlayStation 4 USB controller, so I checked out the code and saw a lot of TB instructions. I've come up with a fool-proof way to work out how to reset the PC to the last jump, by saving the last 5 CPU addresses read (I say fool-proof, but I know it's actually teetering). If the *LOAD trigger point is one of those addresses (+2) then all is good; otherwise a JMP happened after a TB, which I can now reset using that cache. I also now see the console GROM reading keyboard CRUs via a CPU "X" instruction at >0604... I have not bothered to decode that; it only happens when FCTN-= is pressed, and seems to work OK without me doing anything extra... If I come across a GROM that does its own keyboard scan, I may reconsider it.
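The "last 5 addresses" check could be sketched along these lines. Only the detection step is shown; how the PC is then reset is not described in the post, so it is left out. All names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define HISTORY_DEPTH 5u

typedef struct {
    uint16_t addr[HISTORY_DEPTH];
    unsigned next;   /* slot the next address goes into */
    unsigned count;  /* valid entries, capped at HISTORY_DEPTH */
} addr_history_t;

/* Remember one CPU read address, dropping the oldest once 5 are held. */
void history_push(addr_history_t *h, uint16_t addr)
{
    h->addr[h->next] = addr;
    h->next = (h->next + 1u) % HISTORY_DEPTH;
    if (h->count < HISTORY_DEPTH)
        h->count++;
}

/* True if the *LOAD trigger address is one of the remembered addresses + 2,
 * i.e. execution ran straight on rather than jumping after the TB. */
bool trigger_follows_history(const addr_history_t *h, uint16_t trigger)
{
    for (unsigned i = 0; i < h->count; i++) {
        if ((uint16_t)(h->addr[i] + 2u) == trigger)
            return true;
    }
    return false;
}
```

When trigger_follows_history() returns false, a jump is assumed to have happened after the TB, and the remembered addresses are what would be used to work out where to send the PC back to.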

Feeling good about all of that, I thought I could save 40KB of Pico RAM by making GROM access go directly to the new PSRAM. How wrong was I! I've got it clocked at 125MHz (it seems half the CPU speed [250MHz] is the upper limit) but I still needed to hold the *READY line low for a lot longer than a real GROM does... While I can go into Editor/Assembler and edit a file, I can see that building the screens is like an Extended BASIC program now (faster than BASIC, but not all that great).

I've looked at the timing diagrams of the PSRAM chips today, and I think you need to put them into Q mode to have any chance; even SPI mode with a Q read isn't going to be fast enough. But this is all guesswork: I didn't bother to get out the logic analyser to see the carnage once I could see the screen being drawn so slowly in the E/A cart.
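A rough back-of-envelope for the three modes mentioned, as a sketch only. It assumes an 8-bit command, a 24-bit address and one data byte per read, and takes the wait (dummy) cycle count as a parameter because that is chip and command specific; none of the numbers come from a particular datasheet.

```c
#include <stdint.h>

/* Clock cycles for a single-byte read: 8 command bits, 24 address bits and
 * 8 data bits, with the command and the address/data phases each shifted
 * over 1 or 4 lines, plus chip-specific wait cycles (an assumption here). */
unsigned read_cycles(unsigned cmd_lines, unsigned addr_data_lines, unsigned wait_cycles)
{
    return 8u / cmd_lines + 24u / addr_data_lines + wait_cycles + 8u / addr_data_lines;
}

/* Duration in whole nanoseconds (rounded down) at a given clock in MHz. */
unsigned cycles_to_ns(unsigned cycles, unsigned clock_mhz)
{
    return (cycles * 1000u) / clock_mhz;
}
```

Ignoring wait cycles, plain SPI is read_cycles(1, 1, 0) = 40 clocks (320 ns at 125MHz), SPI with a quad read is read_cycles(1, 4, 0) = 16 clocks, and full QPI is read_cycles(4, 4, 0) = 10 clocks, before any chip select setup or software overhead is added.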

I'm still stoked with all this though

JasonACT · June 30, 2023

Ah, all those worried about developing with a Pico, and the re-flashing of its 2MB F-ROM... Don't fret, the boot select button dies much earlier!

2 part epoxy to the rescue, holding down the tiny metal cap, literally spread with the eye of a needle:

JasonACT · July 1, 2023

15 hours ago, JasonACT said:

Feeling good about all of that, I thought I could save 40KB of Pico RAM by making GROM access go directly to the new PSRAM. How wrong was I! I've got it clocked at 125MHz (it seems half the CPU speed [250MHz] is the upper limit) but I still needed to hold the *READY line low for a lot longer than a real GROM does... While I can go into Editor/Assembler and edit a file, I can see that building the screens is like an Extended BASIC program now (faster than BASIC, but not all that great).

I think I've got it working, but as with the E/A file-load timing fix I needed to do, only time will tell. Things I'm doing:

- Since SD and PSRAM are on a single SPI, I default modes to prioritise PSRAM (i.e. at the end of all SD accesses) so it's always ready
- Start the PSRAM read early (before A15 [@>9800] goes from high to low)
- No longer turn on the LED; on the PicoW this needs to talk to the WiFi chip to get its GPIO to control the LED
- Stop using the Arduino SPI wrapper (which wasn't RAM resident) and in fact create my own new optimised function in the 4KB CPU1 stack area

With only some of these things, it was almost working, except once in a while I would see "3 for review module library" so I knew it was close, but not quite there without READY being held low. Now I don't have to worry about the READY signal, and I got my 40KB back, so I can now power up the WiFi chip which the doco says needs 40KB. If that's not bonus enough, the new faster PSRAM read function has made my emulated ROM page switches quite a lot faster too.

JasonACT · July 1, 2023

It was working so well, I needed to see what A15 and *CE on my PSRAM were doing:

Yep, started before A15 (top line) went low, finishes well before the console GROMs release READY.

Not entirely obvious here, but this is the PicoW's LED being switched on and off (with me holding READY to allow it to work).

That's 1ms for the Pico to talk to the WiFi chip to tell it to switch its GPIO controlling the LED.

No wonder I could see the screen being drawn like an XB program!

I kind of don't want to use the LED anymore for SD card access! I can also see that PSRAM on SPI is no good for ROM; it needs to be QSPI.

JasonACT · July 1, 2023

Oh no, at 250MHz I can see in image #1 I only just start the PSRAM read before the first half of the TI 8 bit read cycle has finished.

My terribly complicated implementation here wouldn't allow even QSPI to work, by the look of it.

speccery · July 2, 2023

I assembled one of the new PCBs. I haven't had enough time to do meaningful testing yet. I did port the firmware to accommodate some of the pinout differences. But I just had to see if I can get the neopixels running - and I did. So these are the two multicolour LEDs just above the micro SD connector.

The board has not been plugged in to the TI yet. I am powering the logic with the two wires on the left, bringing 5V from a lab power supply to the board. My plan is to test this tomorrow in the actual TI - fingers crossed it won't blow up anything. That testing will be with the old firmware features, i.e. I will first see if this can do what the previous version did and emulate a cart with both ROM and GROM. Extended Basic would be a good test drive, as it is a banked ROM cartridge. If that works I will try to bring up the SD card support and then start to wrestle with the PSRAM. Or perhaps in the other order, PSRAM first, as that is more interesting.

Edited July 2, 2023 by speccery
