matthew180 Posted January 14, 2013 (edited)

I'm starting this thread to promote F18A development, answer specific questions about programming the F18A, and serve as a place to look for links to updated documentation and, eventually, firmware updates.

This first post will always have the latest documents and updates attached, so there is no need to go digging through the thread to find the most recent information. I also hope it will contain questions, answers, and code examples. I would like to keep this thread technical and on-topic, so if you have other general F18A questions or comments, please start a new thread or use the other existing F18A thread.

* Documentation: On-going. This is something I hope to complete, but until then Rasmus has collected many of the F18A programming posts from the forum and created a PDF of them (thank you Rasmus!) See the files attached to this thread, and please ask F18A technical questions in this thread.

The main F18A webpage (http://codehackcreate.com/archives/30) has the main feature list, as well as an initial post on getting started with programming the F18A. As I add documentation, I will post it on the website first, then make an update here to let anyone interested know there is something new.

* Register Use Spreadsheet: Libre Office / Open Office .ods format. This is the primary spreadsheet I used while developing the F18A, and all functionality was documented in the spreadsheet first, then converted into HDL. That means the spreadsheet is always up to date with respect to the F18A's functionality.

While some of the F18A's features require more documentation to use, much of the functionality is self-explanatory and can be used just by looking at the spreadsheet and reading the notes.
For example, it does not take much guessing to figure out what the "horizontal scroll register" does.

*************
COMPATIBILITY
*************

Pin-compatible replacement for the TMS9918A, 9928, 9929, and TMS9118 Video Data Processors.

The F18A has been tested in the following systems:

TI-99/4A Home Computer
ColecoVision Game Console*
ColecoVision ADAM Computer#
Toshiba HX-10 MSX1 Computer
Toshiba Pasopia-IQ MSX1 Computer
JVC Victor HC-7 MSX1 Computer
Yamaha CX5M MSX1 Computer@
SpectraVideo 328 Computer*@
Tomy Tutor Computer*@
SEGA SG-1000 Game Console
SEGA SC-1000II (replaced a TMS9118 VDP)
Telegames Personal Arcade
Powertran Cortex Computer

* Note1: These systems are known to have the original VDP soldered directly to the system circuit board and will require desoldering and a socket installed.
# Note2: The ADAM computer requires an "offset board" to keep the F18A inside the main PCB outline. This is an available option when ordering an F18A.
@ Note3: These systems are known to require the USR4 jumper to be removed because the main system uses the CPUCLK output from the VDP as the main system clock.

************************
F18A FIRMWARE Change Log
************************

F18A V1.9 Dec 31, 2018 (CRC: 147A)
* Prepare for open source release.
* Split up the original "core" to create a top-module for the stand-alone F18A, and a "main core" that can be used as part of a larger SoC.
* Fixed the VGA horizontal timing error caused by treating the pixel time as 40ns instead of 39.68ns. Because events were being counted in "pixels", this caused the horizontal sync pulse to be slightly off, and the overall line time to be 32us instead of 31.746us. This error meant each line was around 6.4 pixels too long, and pushed the total frame rate to 59.2Hz. This error was enough to cause games to fail (Pole Position on the 99/4A), and some monitors to not sync properly when run through video converters. The timing error also caused many problems for the PAL ColecoVision.
* Removed sprite-linking.
This was an unused feature and helped free up FPGA resources to allow the core to better fit in the Spartan-3E 250K.
* Removed programmable GROMCLK divisor. Unused feature, free up resources.
* Register mode and cd_i inputs to CPU component.

V1.8 - Aug 24, 2016 (CRC: F981)
* Fixed sprite collision bug where sprite collisions were being incorrectly detected outside of the active display, after line 191 or 239 depending on the line mode.
* Added hybrid VR write restriction to mask VR writes to three-bits when the F18A is locked, like the real 9918A does. However, if mode bit M4 is set (80-columns), writes to VRs over VR7 are *ignored* instead of masked to three-bits. This allows various 9938 programs to work (or continue to work), as well as continue to support TurboForth that writes to VRs 0..15 to set up 80-columns (if straight masking was used, VRs 8..15 would over-write VR 0..7).

V1.7 - Jan 1, 2016 (CRC: A3B5)
* Fixed Bitmap-Layer (BML) display bug
* Fixed GPU's PIX instruction to properly calculate BML addresses
* Added power-on graphic that shows the current firmware version

V1.6 - Apr 26, 2015 (CRC: 40CC)
* Removed fixed tile functionality
* Removed border scroll limit functionality
* Removed banner functionality
* Removed host-side 32-bit counter
* Removed host-side 32-bit RNG
* Removed GPU 32-bit counter
* Removed GPU 32-bit RNG
* Removed the sprite "disable value" (>F8) in the sprite Y-location when ROW30 is enabled.
* Added second tile layer with its own NTBA, h/v page sizes, and h/v scroll regs
* Added ECM2/3 pattern table size selections for tiles and sprites.
* Added host-side segmented counter with 10ns accuracy.
* Added configurable HSYNC and VSYNC GPU triggers.
* Added fat-pixel (2x1) with 16-color support to the bitmap layer (BML).
* Added 1x1 page scroll support for T40 and T80 modes.
* Added option to reset most VDP registers to their power-on values.
* Added option to disable Tile Layer 1, which includes GM1, GM2, MCM, T40, and T80. Sprites, the BML, and TL2 are still
active and can be enabled/disabled independently.
* Added option to allow attribute byte to be fg/bg color select in T40 and T80.
* Added per-position tile attribute support.
* Added DMA capability to the GPU:
    8xx0 - MSB src
    8xx1 - LSB src
    8xx2 - MSB dst
    8xx3 - LSB dst
    8xx4 - width
    8xx5 - height
    8xx6 - stride
    8xx7 - 0..5 | !INC/DEC | !COPY/FILL
    8xx8 - trigger
  FILL (active high) will read a single byte at the src address and fill the destination with that byte.
  src, dst, width, height, and stride are copied to dedicated counters when the DMA is triggered, thus the original values remain unchanged.
* Added USR3 jumper to control GROMCLK/CPUCLK output on pin37 to provide support for 9128/29
* Added USR2 jumper to disable/enable simulated scan lines (every other VGA scan line has its color reduced by 50%.) Also controllable via a new VDP register bit.
* Added a 5th sprite reporting option instead of reporting the max-sprite, which on the F18A might be different than the original VDP because all 32 sprites can be on a single scan line.
* Added a new register (VR51) to limit the maximum sprite processed. This has nothing to do with the number of sprites that can be visible on a scan line, which is controlled by a separate register (VR30). This register is always active and can be used instead of the >D0 byte in the sprite Y-location, and is the only way to limit sprite processing early when ROW30 is enabled.
* Changed the GPU interlock so that polling the VDP status register will not cause the GPU to pause. This should greatly increase GPU performance during heavy VDP interrupt polling.
* Fixed T80 NTBA two LSbit problem. They are ignored (set to "00") when the F18A is locked to provide compatibility with the 9938 and avoid problems with software that sets the two LSbits of the NTBA to other than "11" as the 9938 documentation specifies they should be. This limits the T80 name table to 4K boundaries.
When the F18A is unlocked, all 4-bits of the NTBA are used and the T80 name table can be located on 1K boundaries.
* Fixed the 5th number update during a scan line. As long as the 5S flag is zero, the 5th number register follows the sprite scanning sequence. Seems to be a transparent latch that follows the input (current sprite being scanned) until latched by the 5S flag. If the status register is being polled and 5S is reset mid frame, then the 5th number begins following the scanned sprites again. This bug is known to have affected Miner49er on the 99/4A.

V1.5 - July 2013
Not really a *bug* fix since the problem it corrects exists on the real 9918A, and only has to do with sporadic collision bit reporting during heavy polling of the original 9918A VDP status register. This was discovered while Rasmus was writing Titanium. The 9918A was not designed to have its status register polled, which is why it provides an interrupt output.

I don't think the original 9918A designers took the hazard into consideration, but I decided to make this correction because it is what the original designers would have done given their preference (and I asked Karl Guttag about it). Thus, the F18A implements what you would consider the "expected behavior", and will work as expected where the original 9918A might not. I did not make this decision lightly.

V1.4 - April 2013
Fixed the sprite collision bug and a GPU bug with the divide circuit. The sprite bug mostly affects XB when a program uses CALL COINC(ALL). Most assembly games probably don't rely on the collision bit alone for sprites and perform coordinate testing, which is most likely why the bug slipped through all the testing (and I tested with a *lot* of games on a lot of platforms).

V1.3 - July 2012
Original release firmware.

********
UPDATING
********

The In-System firmware update is available for 99/4A users. I am very thankful to Rasmus and Tursi for their help in making this possible. You can download the F18AUpdate_vXX.zip file below.
Detailed instructions are available on my website here: http://codehackcreate.com/archives/418

Alternatively you can update your F18A in any system via a JTAG programming cable. You can purchase a JTAG programming cable for about $59 USD from Digilent:

JTAG HS3 programming cable

This is very inexpensive for a JTAG cable (my Xilinx-brand cable was over $250!), and Digilent makes quality gear.

You also need the Xilinx ISE-Webpack tools:

http://www.xilinx.com/support/download/index.htm

This is a free download from Xilinx, but it is BIG! About 6GB the last time I checked. There is a smaller download that contains just the programming tools called "Lab Tools" and is only about 1G. I'm still looking for a smaller / simpler solution. You will have to create an account (which is free). The primary program you need is called IMPACT and is used to program the FPGA and SPI-flash.

Once you get the tools installed, download and unzip the f18a_250k_vXX.zip file. In the zip file you will find the MCS file:

f18a_250k_vXX.mcs

The .mcs file is used to update the SPI-flash ROM attached to the FPGA. Here are the quick instructions.

The term "system" means your 99/4A, ColecoVision, MSX, etc., and "PC" means the modern personal computer you are running the Xilinx tools on.

0. Make sure your system is powered OFF to begin
1. Open your system to get physical access to the F18A
2. Plug the JTAG programmer in to your PC (via USB) and the F18A (via JTAG)
3. Power ON your system
4. Launch the Xilinx IMPACT tool
5. Double-click on "Boundary Scan", then right-click in the main area and select "initialize chain"
6. The FPGA should be detected and show up in the big area. A window will open with device properties, just click "ok"
7. Above the FPGA icon should be a dotted line with "SPI/BPI ?" in it. Right-click on that box and select "Add SPI/BPI Flash..."
8. Navigate to the f18a_250k_vXX.mcs file you extracted from the .zip file and choose "Open"
9. Select "SPI PROM" and "M25P80" from the two drop-down selections and click "OK"
10. The box above the FPGA should now say "FLASH" in it. Right-click the box and select "Program"

Once the programming is finished, cycle power on your system and make sure it comes up.

********
Examples
********

Included in the zip file is a demos disk that shows many of the enhanced features of the F18A. The source for all the programs is included. I did not write these programs and I am very thankful to Rasmus and Tursi for contributing them.

Attachments: rasmus_scroll.zip, F18A documentation.pdf, f18a_register_use.zip, F18A_V19.zip

Edited January 2, 2019 by matthew180
+5-11under Posted January 14, 2013

I'm interested in creating games that could be enhanced by the F18A, but would work normally without the F18A. For instance, I'd like to know how to do the following, if possible:

- Check to see if an F18A is present
- change one or more of the colours
- Use 2 or more colors per sprite

Any help would be appreciated!
Tursi Posted January 14, 2013

Matthew has a supported register-testing method for detecting the F18A, but I coded a simpler method into my slideshow program, which was simply to upload a tiny GPU program to VDP memory that changed a byte and then stopped. Then set the registers to execute the program and check the VDP memory. If it changed, there was a GPU present.

Enhancing titles for F18A seems the best way forward!
matthew180 Posted January 15, 2013

And Tursi didn't share his idea until now. That is a good idea, using the GPU to set a byte is probably a lot less code than the test I wrote. I'll use Tursi's method below in the example, but before you can detect the F18A you have to unlock it.

The F18A defaults to a "locked" mode of operation to prevent legacy software from accidentally enabling any of the enhanced features. I added the lock because during testing some ColecoVision games were causing strange behavior which I discovered was due to the software writing to VDP registers over register 7. Since the 9918A only has 8 registers (0 to 7), it did not matter, and the higher values were simply masked to a number between 0 and 7. But, the F18A supports VDP register values from 0 to 63, which is how you take advantage of the new features. This is also how the 9938/58 add additional features, and the datasheet for the 9918A indicates registers over 7 are reserved. However, that didn't stop some software from not following the rules, and on the 9918A/9928/9929 the bad behavior did not have any impact. But, the F18A has to protect itself from that old software, thus it powers up locked.

Since the unlocking sequence has to be performed "in band", i.e. using the standard 9918A registers, I had to come up with a sequence that would never happen on the real 9918A. VR1 is probably the most critical VDP register since it contains most of the mode bits plus the memory size bit, thus it is VR1 in the form of VR57 (VR57 is the same as VR1 on a non-F18A system) that is used to unlock the F18A.

Unlocking is done by writing >1C to VR57 twice, which on a real 9918A VDP is the same as writing to VR1. The value >1C was chosen because it sets the bits in VR1 to something you would never do on a real 9918A, even accidentally, because it makes the VDP almost useless.
And to write such a value twice, consecutively, is hopefully beyond all probability of happening accidentally. Value >1C in VR1 looks like this:

  |4/16K|BLANK| IE0 | M1  | M2  |  X  |SIZE | MAG |
  |  0  |  0  |  0  |  1  |  1  |  1  |  0  |  0  |

By writing >1C on the real 9918A you are selecting 4K VRAM, blanking the screen, disabling interrupts, setting both M1 and M2 to '1' (which is an illegal mode), and writing a '1' to the unused bit-5 that the datasheet indicates should always be '0'. This would pretty much make the real 9918A useless, and any working software would never operate with this combination of bits in VR1.

A write to VR57 (binary: 111001) lands in VR1 on the 9918A, which only sees the low 3-bits "001". The write must be done twice in a row with no other CPU-to-VDP access in between. On the F18A you will be writing to VR57, not VR1, and after two consecutive writes the ERM (Enhanced Register Mode) will be unlocked. Any further writes to VR57 after being unlocked will re-lock the F18A.

Because writing >1C to VR1 on the real 9918A would mess up the video mode and other critical VDP configuration, a write to VR1 should immediately follow the unlock sequence if you care to detect the F18A and write software that works on both the 9918A and F18A. Thus you would have something like:

  VDPERM LIMI 0          * Interrupts must be off
         LI   R0,>391C   * VR1/57, value 00011100
         BL   @VWTR      * Write once
         BL   @VWTR      * Write twice, unlock
         LI   R0,>01E0   * VR1, value 11100000, a sane setting
         BL   @VWTR      * Write

Note that I'm using my version of VWTR here, not the E/A (or XB) versions. Now you can test for the F18A, which is coming up in my next post.
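The register arithmetic behind the unlock write can be checked off-target. The following is a minimal Python sketch, not code that runs on the 99/4A: the helper names `vwtr_bytes`, `stock_register`, and `decode_vr1` are made up for illustration, and the 3-bit masking on a stock VDP is taken from the description above.

```python
def vwtr_bytes(reg, value):
    """Two bytes sent to the VDP address port for a register write.

    The value goes first, then the register number with the top two
    bits set to 10 (the same thing ORI R0,>8000 does in VWTR).
    """
    return (value & 0xFF, 0x80 | (reg & 0x3F))


def stock_register(reg):
    """Register a stock 9918A actually writes: only the low 3 bits count."""
    return reg & 0x07


def decode_vr1(value):
    """Split a VR1 value into its named bits, MSB first."""
    names = ["4/16K", "BLANK", "IE0", "M1", "M2", "X", "SIZE", "MAG"]
    return {name: (value >> (7 - i)) & 1 for i, name in enumerate(names)}


value_byte, register_byte = vwtr_bytes(57, 0x1C)
print(hex(value_byte), hex(register_byte))  # bytes written to the address port
print(stock_register(57))                   # register seen by a stock 9918A
print(decode_vr1(0x1C))                     # bit fields of the unlock value
```

The decoded fields match the table above: M1, M2, and the unused bit are set, everything else is clear.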
matthew180 Posted January 15, 2013 (edited)

To test for the F18A, I'm going to use Tursi's idea of using the GPU, which should make for a smaller test. Assuming the F18A unlock sequence has been performed, a small GPU program will be loaded to the VRAM and executed that will change 1 byte in VRAM. If the byte changed, the F18A is present, otherwise the system is running a stock VDP.

The GPU is a slightly modified 9900 CPU so you can use any standard 9900 assembler to write code for the F18A's GPU. Since the GPU is inside the VDP it can only access the VRAM, plus an additional 2K of memory above the normal 16K of VRAM. The GPU's memory map looks like this:

  VRAM 14-bit, 16K @ >0000 to >3FFF (0011 1111 1111 1111)
  GRAM 11-bit, 2K  @ >4000 to >47FF (0100 x111 1111 1111)
  PRAM  7-bit, 128 @ >5000 to >5x7F (0101 xxxx x111 1111)
  VREG  6-bit, 64  @ >6000 to >6x3F (0110 xxxx xx11 1111)
  current scanline @ >7000 to >7xx0 (0111 xxxx xxxx xxx0)
  blanking         @ >7001 to >7xx1 (0111 xxxx xxxx xxx1)
  32-bit counter   @ >8000 to >8xx6 (1000 xxxx xxxx x110)
  32-bit rng       @ >9000 to >9xx6 (1001 xxxx xxxx x110)
  F18A version     @ >A000 to >Axxx (1010 xxxx xxxx xxxx)
  GPU status data  @ >B000 to >Bxxx (1011 xxxx xxxx xxxx)

"GRAM" means GPU-RAM and has nothing to do with "GROM or GRAM" of the TI console. It is just a coincidence. PRAM is the palette RAM in the F18A, and VREG is the VDP registers to which the GPU has full read/write access.

The program will be loaded up high in VRAM. I like >3F00 for no particular reason, other than it is 256 bytes from the top of VRAM and probably unused unless there is disk access going on (which there won't be during the test). This is the code that will be loaded into VRAM for the GPU to execute:

  0000 3F00         DEF  MAIN
                    AORG >3F00
            MAIN
  3F00 04E0         CLR  @>3F00
  3F02 3F00
  3F04 0340         IDLE
  3F06 0000         END

That is a total of 6 bytes of assembly, which is pretty small for the test.
The GPU will clear the word at >3F00, which in this case is the CLR instruction's opcode itself. You have to love self modifying code. :-) After the code runs, the value at VRAM >3F00 will be >00 if the F18A is present, otherwise it will be >04 on a stock VDP.

This is the code to load the program to VRAM. I'm including all the support routines here too so it is a complete program:

         DEF  MAIN

  * VDP Memory Map
  *
  VDPRD  EQU  >8800             * VDP read data
  VDPSTA EQU  >8802             * VDP status
  VDPWD  EQU  >8C00             * VDP write data
  VDPWA  EQU  >8C02             * VDP set read/write address

  * Workspace
  *
  WRKSP  EQU  >8300             * Workspace
  R0LB   EQU  WRKSP+1           * R0 low byte reqd for VDP routines

  GPU    DATA >04E0             * 3F00 04E0  CLR @>3F00
         DATA >3F00             * 3F02 3F00
         DATA >0340             * 3F04 0340  IDLE
  GPUEND

  MAIN   LIMI 0
         LWPI WRKSP

  * F18A Unlock
         LI   R0,>391C          * VR1/57, value 00011100
         BL   @VWTR             * Write once
         BL   @VWTR             * Write twice, unlock
         LI   R0,>01E0          * VR1, value 11100000, a real sane setting
         BL   @VWTR             * Write reg

  * Copy GPU code to VRAM
         LI   R0,>3F00
         LI   R1,GPU
         LI   R2,GPUEND-GPU
         BL   @VMBW

  * Set the GPU PC which also triggers it
         LI   R0,>363F
         BL   @VWTR
         LI   R0,>3700
         BL   @VWTR

  * Compare the result in >3F00
         LI   R0,>3F00
         BL   @VRAD
         MOVB @VDPRD,R0
         JEQ  PASS

  * FAIL
  *      (stock VDP: your non-F18A path goes here)
  PASS
  *      (F18A present: your F18A path goes here)

  *********************************************************************
  *
  * VDP Set Write Address
  *
  * R0   Address to set VDP address counter to
  *
  VWAD   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
         ORI  R0,>4000          * Set the two MSbits to 01 for write
         MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
         ANDI R0,>3FFF          * Restore R0 top two MSbits
         B    *R11
  *// VWAD / VRAD

  *********************************************************************
  *
  * VDP Set Read Address
  *
  * R0   Address to set VDP address counter to
  *
  VRAD   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM read address
         ANDI R0,>3FFF          * Make sure the two MSbits are 00 for read
         MOVB R0,@VDPWA         * Send high byte of VDP RAM read address
         B    *R11
  *// VRAD

  *********************************************************************
  *
  * VDP Multiple Byte Write
  *
  * R0   Starting write address in VDP RAM
  * R1   Starting read address in CPU RAM
  * R2   Number of bytes to send to the VDP RAM
  *
  * R1 is modified by the value of R2
  * R2 is changed to 0
  *
  VMBW   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
         ORI  R0,>4000          * Set the two MSbits to 01 for write
         MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
  VMBWLP MOVB *R1+,@VDPWD       * Write byte to VDP RAM
         DEC  R2                * Byte counter
         JNE  VMBWLP            * Check if done
         ANDI R0,>3FFF          * Restore R0 top two MSbits
         B    *R11
  *// VMBW

  *********************************************************************
  *
  * VDP Write To Register
  *
  * R0 MSB    VDP register to write to
  * R0 LSB    Value to write
  *
  VWTR   MOVB @R0LB,@VDPWA      * Send low byte (value) to write to VDP register
         ORI  R0,>8000          * Set up a VDP register write operation (10)
         MOVB R0,@VDPWA         * Send high byte (address) of VDP register
         ANDI R0,>3FFF          * Restore R0 top two MSbits
         B    *R11
  *// VWTR

         END

This code triggers the GPU:

  * Set the GPU PC which also triggers it
         LI   R0,>363F
         BL   @VWTR
         LI   R0,>3700
         BL   @VWTR

The PC (program counter) in the GPU is 16-bit, just like the normal 9900, so it takes two bytes to set up the address. VR54 (>36) is the MSB and VR55 (>37) is the LSB. After writing the LSB to VR55, the GPU automatically triggers and begins execution at the address just set up. In this case it executes the CLR instruction, then goes idle via the IDLE instruction, which is perfectly fine on the GPU (don't use IDLE in the 9900 in your 99/4A though!)

Now the value at >3F00 is tested. The VRAD routine sets up a VDP read address without doing a read.

  * Compare the result in >3F00
         LI   R0,>3F00
         BL   @VRAD
         MOVB @VDPRD,R0
         JEQ  PASS

The MOVB moves the byte at >3F00 in VRAM into the MSB of R0. R0 will now be >0000 if the GPU was present, or >0400 on a stock VDP. The MOVB instruction automatically compares the byte it moved to zero and sets the status bits, so the JEQ will cause a jump if the byte is zero, i.e. the F18A is present. Or you can put JNE if you need the opposite jump.
Note that writing to VR54 and VR55 is the same as VR6 and VR7 on a stock VDP, so if the test for the F18A fails, you should restore those values to something sensible, or simply set up your VDP accordingly now that you know if you have an F18A or stock VDP (9918A/9928/9929).

Edited January 15, 2013 by matthew180
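To make the detection logic easier to follow, here is a rough Python model of it. This is only a sketch under stated assumptions: VRAM is modelled as a `bytearray`, the GPU's effect is reduced to "clear the word at >3F00", and the names `gpu_pc_writes` and `run_detection` are invented for this illustration. It does not emulate the real hardware.

```python
GPU_PROGRAM = bytes([0x04, 0xE0, 0x3F, 0x00, 0x03, 0x40])  # CLR @>3F00, IDLE
LOAD_ADDR = 0x3F00


def gpu_pc_writes(addr):
    """Register writes that set the GPU PC: MSB to VR54, then LSB to VR55.

    On the F18A the VR55 write also triggers the GPU.
    """
    return [(54, (addr >> 8) & 0xFF), (55, addr & 0xFF)]


def run_detection(vram, has_f18a):
    """Load the GPU program, 'run' it, and test the first byte."""
    vram[LOAD_ADDR:LOAD_ADDR + len(GPU_PROGRAM)] = GPU_PROGRAM
    if has_f18a:
        # Model of the GPU executing CLR @>3F00: the word becomes >0000.
        vram[LOAD_ADDR] = 0x00
        vram[LOAD_ADDR + 1] = 0x00
    # A stock VDP has no GPU, so the >04 opcode byte is still there.
    return vram[LOAD_ADDR] == 0x00


print(gpu_pc_writes(LOAD_ADDR))                     # writes for VR54 and VR55
print([reg & 0x07 for reg, _ in gpu_pc_writes(LOAD_ADDR)])  # registers hit on a stock VDP
print(run_detection(bytearray(0x4000), has_f18a=True))
print(run_detection(bytearray(0x4000), has_f18a=False))
```

The second print shows why VR6 and VR7 need to be set up again after a failed test: on a stock VDP the two PC writes land there.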
matthew180 Posted January 15, 2013 (edited)

- change one or more of the colours

By "change colour" I'm going to assume that you are talking about changing a palette register. The F18A has 64 palette registers (PR) that are 12-bits each, which gives it a color palette of 4096 colors. Which Palette Register (PR) is used to specify the color of a given pixel depends on a lot of settings.

The PRs are grouped into "banks" depending on how many bits are used to resolve a pixel's color. In the 9918A compatible modes (1-bit per pixel (bpp)), there are 4-banks of 16-colors each. VR49 has two bits that control which of the 4-banks will be used for tiles in 1-bpp modes, which means you could set up each of the 4-banks with 16 different colors and change to the new palette with a single register write.

In the Enhanced Color Modes (ECM), there are more bits used to specify a single pixel's color, and thus the number of palette banks grows, but the number of colors in each bank shrinks. With 2-bpp, there are 16-banks with 4-colors each, and the 2-bits for each pixel select the color from the bank. With 3-bpp, there are 8-banks with 8-colors each.

There are a lot of options for colors, but in this example I'll stick to just updating the palette registers themselves, which allows you to use any of the 4096 colors.

*NOTE* palette changes survive a soft-reset! If you modify the palette and then exit, those changes will remain in effect until the system is power-cycled or hard reset (a cartridge is plugged in, etc.)

Palette registers are numbered 0 to 63 and consist of 12-bits to specify a color in the format:

  |  BYTE 1  |  BYTE 2  |
  | ----rrrr | ggggbbbb |

Because there is only one "data write port" to the 9918A (mapped at address >8C00 on the 99/4A) and subsequently to the F18A as well, the F18A has a "Data Port Mode", controlled by a bit in VR47, to select between writing data to VRAM or to Palette Registers.
Two byte writes are required to update a single palette register. Palette registers are written to (they cannot be read by the host CPU) by setting the Data Port Mode (DPM) bit in VR47 to 1, then writing to the VDP data port as normal. After the second byte is written, the 12-bit color will be latched into the specified palette register. Side note: a nice advantage of the GPU is that it has full read/write access to the palette registers using normal word instructions like MOV.

If the auto increment bit in VR47 was not set, then the DPM automatically falls back to the default "write to VRAM" mode after the second palette byte has been written. If a large number of palette registers need to be updated, setting the auto increment flag will keep the DPM in palette mode until VR47 is written again to return to normal VRAM write mode, or the palette address rolls over to 0, which will force the DPM back to VRAM mode. This is a fail-safe to prevent the VDP from inadvertently getting stuck in the write-to-palette DPM.

DPM is also exited any time *any* VDP status register is read, the VDP is externally reset, the palette address rolls over to 0, or by setting VR47 DPM bit to 0.

VR47 controls data-port mode and palette address:

  |  0  |    1     | 2 3 4 5 6 7  |
  | DPM | AUTO INC | PAL REG ADDR |

  DPM = Data Port Mode
    0 = VDP data writes go to VRAM as normal
    1 = VDP data writes go to the palette

  AUTO INC = Auto Increment
    0 = Do NOT increment the palette address after a single *palette* write, which consists of *TWO* bytes written to the VDP. After the second byte has been received, the addressed palette register will be updated and the Data Port Mode defaults back to normal VRAM for writes to the VDP. This mode of operation is intended for updating a single palette register at a time.
    1 = Increment the palette address every time a palette register is updated, which consists of *TWO* bytes written to the VDP with the DPM bit = 1. This allows multiple palette registers to be updated consecutively and quickly.

  PAL REG ADDR = Palette Register Address
    This is the address of the single palette register to update, or the first palette register to update when AUTO INC = 1.

Here is an example of changing PR4, which is normally "dark blue" (RGB: 54F), to a pure blue (RGB: 00F):

         LI   R0,>2F84          * Reg 47, value: 1000 0100, DPM = 1, AUTO INC = 0, PR4.
         BL   @VWTR
         LI   R1,>000F          * RGB: 00F, or pure blue in place of "dark blue"

  * Two bytes written to the VDP now go to PR4
         MOVB R1,@VDPWD         * Byte 1: ----rrrr
         SWPB R1
         MOVB R1,@VDPWD         * Byte 2: ggggbbbb

After the second write (MOVB instruction), the DPM will fall back to normal VRAM mode since the auto-increment bit in VR47 was not set.

If you were going to update a whole set of registers, then you could use the auto-increment feature. If your update does not cause the PR number to roll over to zero, then you must leave DPM mode after updating:

  * Update the first 7 palette values from the host CPU
  * Palette 0 is not updated to keep the screen color stable.
         LI   R0,PAL0+2
         LI   R1,>0111          * Add 1 to each R,G,B value
         LI   R2,7
  INCPAL A    R1,*R0+           * Update the 12-bit color
         DEC  R2
         JNE  INCPAL

         LI   R0,>2FC1          * Reg 47, value: 1100 0001, DPM = 1, AUTO INC = 1, start PR1.
         BL   @VWTR

  * Every two bytes written to the VDP now go to the palette registers.
         LI   R0,PAL0+2
         LI   R2,14             * Each 12-bit palette entry requires 2 bytes
  UPDPAL MOVB *R0+,@VDPWD
         DEC  R2
         JNE  UPDPAL

         LI   R0,>2F00          * Reg 47, value: 0000 0000, exit DPM
         BL   @VWTR

  **
  * Standard color palette
  *
  * 12-bit color format: ----rrrrggggbbbb
         EVEN
  PAL0   DATA >0000             * Transparent
         DATA >0000             * Black
         DATA >02C3             * Medium Green
         DATA >05D6             * Light Green
         DATA >054F             * Dark Blue
         DATA >076F             * Light Blue
         DATA >0D54             * Dark Red
         DATA >04EF             * Cyan
         DATA >0F54             * Medium Red
         DATA >0F76             * Light Red
         DATA >0DC3             * Dark Yellow
         DATA >0ED6             * Light Yellow
         DATA >02B2             * Dark Green
         DATA >0C5C             * Magenta
         DATA >0CCC             * Gray
         DATA >0FFF             * White
  PAL0E

Edited January 15, 2013 by matthew180
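The VR47 values and palette bytes used above are easy to get wrong by hand, so here is a small Python sketch that builds them. It only models the bit layouts described in this post; `vr47` and `palette_bytes` are hypothetical helper names, not part of any F18A tool.

```python
def vr47(dpm, auto_inc, pal_addr):
    """Build a VR47 value: DPM in the MSbit, AUTO INC next, then a 6-bit address."""
    return ((dpm & 1) << 7) | ((auto_inc & 1) << 6) | (pal_addr & 0x3F)


def palette_bytes(r, g, b):
    """Pack 4-bit R, G, B into the two bytes written to the data port.

    Byte 1 is ----rrrr and byte 2 is ggggbbbb.
    """
    return (r & 0x0F, ((g & 0x0F) << 4) | (b & 0x0F))


# Single palette register update of PR4 to pure blue.
print(hex(vr47(1, 0, 4)), [hex(x) for x in palette_bytes(0x0, 0x0, 0xF)])

# Auto-increment update starting at PR1.
print(hex(vr47(1, 1, 1)))

# Number of colors a 12-bit palette entry can describe.
print(2 ** 12)
```

The first value matches the `>84` written in the single-register example and the second matches the `>C1` used for the auto-increment loop.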
+InsaneMultitasker Posted January 15, 2013

Controller cards (i.e., TI, CorComp, BwG, CF7+) which use VDP for disk buffers usually store a copy of the last accessed disk's sector 0 from 0x3ef5 to 0x3ff4. I do not recall if this sector is ever flushed to disk or if it is a read-only copy. If the former, blanking out 0x3F00 could result in changing the disk bitmap. It's improbable, but worst case the sectors at this bitmap address would be marked as available, creating a future over-write condition. I recommend you save and restore this byte to be safe.
matthew180 Posted January 15, 2013

Good to know, and anyone doing F18A detection will have to determine their situation. In these examples I'm assuming the programs are probably games (but maybe not), and that this detection is done at program initialization, i.e. before any disk access or such. Also, in this case all 6-bytes would need to be saved/restored. However, it is also possible to write the bytes to any VRAM address, i.e. the name table is a perfectly good place too, and probably safer since the name table is usually under user program control, even in environments like XB.
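The save/restore idea discussed here can be sketched in Python as a wrapper around whatever test writes to VRAM. This is an illustration only: VRAM is again a `bytearray`, and `detect_preserving_vram` and `fake_test` are hypothetical names; on real hardware the save and restore would be VDP reads and writes around the GPU test.

```python
def detect_preserving_vram(vram, run_test, addr=0x3F00, length=6):
    """Run a detection routine, then put back the VRAM bytes it overwrote.

    All 6 bytes of the GPU program area are saved, not just the first one.
    """
    saved = bytes(vram[addr:addr + length])
    try:
        return run_test(vram)
    finally:
        vram[addr:addr + length] = saved


def fake_test(vram):
    """Stand-in for the real test: scribbles over the 6 bytes and reports a hit."""
    vram[0x3F00:0x3F06] = bytes([0x00, 0x00, 0x3F, 0x00, 0x03, 0x40])
    return True


vram = bytearray(0x4000)
vram[0x3F00:0x3F06] = b"BITMAP"          # pretend this is disk buffer data
result = detect_preserving_vram(vram, fake_test)
print(result, bytes(vram[0x3F00:0x3F06]))
```

Choosing a load address inside the name table, as suggested above, avoids the need for the wrapper when the program owns that memory.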
Willsy Posted January 15, 2013

This is really good information. Thanks. It seems a lot more straightforward to do register/palette writes etc. on the GPU side, rather than the 9900 side. I'll probably explore this method more.
matthew180 Posted January 15, 2013 (edited)

- Use 2 or more colors per sprite

Multi-color sprites. The color enhancements to sprites and tiles, as far as the pattern representation goes, work the same way. As you know, the original VDP could support 2-colors per sprite with one of the colors always being transparent, thus we tend to think of sprites as having one color. The sprite's color could be any of the sixteen original colors, including transparent, which presents some interesting possibilities.

In the original sprite pattern data, a '0' bit specifies a transparent pixel and also does not count in collision detection. However, a '1' bit in the pattern specifies that the sprite has a pixel at that location, regardless of the color. That is an important distinction to keep in mind, because it allows you to have a pixel (1-bit in the pattern) that is transparent, i.e. the sprite's main color is set to 0.

So, the bit patterns in the original sprite (and tile) modes don't actually represent the color, they represent whether the sprite has a pixel at that location. The color data is derived from a different location, and in the case of sprites the color comes from the Sprite Attribute Table (SAT) entry for the given sprite.

When moving to the ECM (enhanced color modes) for sprites and tiles, the pattern data itself *does* actually select a color from a palette. However there is a distinction between tiles and sprites when it comes to the "zero-index", i.e. color data "0", "00", or "000". For sprites the zero-index is *ALWAYS* transparent, so the number of actual colors you can display is 1, 3, or 7 vs 2, 4, or 8. In the ECMs for tiles there is an attribute byte for each tile "name" (0-255) where you can specify if the zero-index is transparent or the color at that index.

To provide more pixel data, the extra pattern bits need to come from somewhere.
To keep some sort of compatibility with existing patterns, I chose to implement the extra bits via "bit planes". This allows you to start off with some existing sprite or tile patterns, and expand them to support more colors at a later time in your development. Also, the 3-bpp mode would not have packed neatly into a single byte had I tried to use a linear bit-packing method. The down-side to bit-planes is that making patterns is more of a pain. Luckily sometimes99er has implemented some initial support for the multi-color sprites via his sprite editor.

For example, for a 2-bpp pattern you need two bits to specify which of the 4 colors to use for a given pixel. There are four possible values:

bits | index color
 00  | 0
 01  | 1
 10  | 2
 11  | 3

This is simply binary representation, and for 3-bpp it becomes "000" to "111". Note that the least significant bit comes from the original pattern table (bit plane), and the 2nd or 3rd bits come from subsequent bit-planes. For example, here are two pattern bytes that will have four pixels next to each other, each being one of the four possible colors in a given palette (which is specified by a tile's or sprite's attribute byte).

01010000 pattern-plane 0
00110000 pattern-plane 1, 2K (2048) byte-offset from bit-plane 0 (the original pattern table)
--------
01230000 color index values.

For each byte that makes up a sprite's or tile's pattern, there are one or two more bytes in the additional pattern-planes that are used to make the final color index for a given pixel. The additional pattern-planes are always 2K (and 4K for 3-bpp) bytes offset from the Sprite Pattern Generator Table (SPGT). This means that in 1-bpp the SPGT is the normal 2K, for 2-bpp the SPGT is 4K, and for 3-bpp it is 6K.

To get a pixel's color index you combine the bits *vertically* from each pattern byte in each plane. So, the second pixel is "01", or index 1. The byte from the first pattern-plane represents the LSbit in the final index value.
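As a cross-check on the bit-plane layout, here is a small Python sketch (meant to run on a PC, not on the 99/4A or the GPU) that combines one byte from each pattern-plane into the eight color index values for that pixel row. The function name `plane_bytes_to_indices` is made up for this illustration and is not part of any F18A tool.

```python
def plane_bytes_to_indices(*planes):
    """Combine one byte per bit-plane into 8 color indices, leftmost pixel first.

    planes[0] is the byte from the original pattern table (LSbit of the index);
    planes[1] and planes[2], if given, come from the additional planes.
    """
    indices = []
    for pixel in range(8):
        shift = 7 - pixel                  # the leftmost pixel is the byte's MSbit
        index = 0
        for plane_no, byte in enumerate(planes):
            index |= ((byte >> shift) & 1) << plane_no
        indices.append(index)
    return indices


# The 2-bpp example: plane 0 = 01010000, plane 1 = 00110000
print(plane_bytes_to_indices(0b01010000, 0b00110000))  # prints [0, 1, 2, 3, 0, 0, 0, 0]
```

Passing three bytes instead of two gives the 3-bpp case in the same way.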
Shown below is the sprite data using 3-bpp to show all eight colors in a single row:

bits | color index
000  | 0 (0 is always transparent for sprites)
001  | 1
010  | 2
011  | 3
100  | 4
101  | 5
110  | 6
111  | 7

 the bits are combined vertically
 | | | | | | | |
 V V V V V V V V
|0|1|0|1|0|1|0|1| LSbit pattern-plane 0, 2K total
|0|0|1|1|0|0|1|1|       pattern-plane 1, 4K total, 2048 bytes offset from the SPGT
|0|0|0|0|1|1|1|1| MSbit pattern-plane 2, 6K total, 4096 bytes offset from the SPGT
 | | | | | | | |
 V V V V V V V V
|0|1|2|3|4|5|6|7| color index values.

The first step to making a multi-color sprite is to make a pattern that has data for all the bit-planes, which depends on the number of colors you want. You set up the sprite tables as you normally would, load the patterns, set up the SAT, and finally enable the ECM for sprites. VR49 controls the ECM for both tiles and sprites:

|    0     |   1   |   2     3   |   4    |  5   |   6     7   |
| FIXED_EN | ROW30 | ECMT0 ECMT1 | Y_REAL | LINK | ECMS0 ECMS1 |

The bit fields for ECM(T)iles and ECM(S)prites are:

00 - 0 - original 9918A mode
01 - 1 - 1-bpp
10 - 2 - 2-bpp
11 - 3 - 3-bpp

Thus, to enable sprites to use 2-bpp just write >02 to VR49. Heh, all that talking just to say that... Really all the detail is in setting up the patterns, which is just additional data written to the VRAM.

Edited September 23, 2013 by matthew180
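For anyone who prefers to compute the VR49 value rather than work it out by hand, here is a minimal Python sketch that packs the fields from the table into one byte. It assumes the table uses TI bit numbering (bit 0 is the MSbit), which is consistent with >02 selecting 2-bpp sprites. The function and parameter names are hypothetical, chosen just for this example.

```python
def vr49_value(ecm_sprites=0, ecm_tiles=0, fixed_en=0, row30=0, y_real=0, link=0):
    """Pack the VR49 fields into a byte. Assumes bit 0 in the table is the MSbit."""
    if not (0 <= ecm_sprites <= 3 and 0 <= ecm_tiles <= 3):
        raise ValueError("ECM fields are two bits each (0-3)")
    return (((fixed_en & 1) << 7) | ((row30 & 1) << 6) | (ecm_tiles << 4)
            | ((y_real & 1) << 3) | ((link & 1) << 2) | ecm_sprites)


print(">%02X" % vr49_value(ecm_sprites=2))  # 2-bpp sprites only, prints >02
```

The byte still has to be written to VR49 with a normal VDP register write after the F18A has been unlocked.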
matthew180 Posted January 16, 2013 (edited)

Plotting Pixels on the Bitmap Layer (BML) and GM2.

The GPU can plot a BML pixel, given an XY location, in a single instruction. It can also read a pixel, conditionally set a pixel based on the current pixel color, read and write a pixel at the same time, just calculate a pixel's VRAM address, or calculate a GM2 pixel's address!

I call the new instruction PIX, and it uses the same opcode as the 9900's XOP instruction, so you can use any 9900 assembler to code the PIX instruction. The F18A GPU does not have a Workspace Pointer (since its registers are hard-wired instead of memory-based), so XOP was not implemented.

The XOP format is multi-addressing for the source, and workspace register for the destination. This makes it very flexible for the PIX instruction. Here are the options you can use with PIX:

Format: MAxxRWCE xxOOxxPP

M  - 1 = calculate the effective address for GM2 instead of the new bitmap layer
     0 = use the remainder of the bits for the new bitmap layer pixels
A  - 1 = retrieve the pixel's effective address instead of setting a pixel
     0 = read or set a pixel according to the other bits
R  - 1 = read current pixel into PP, only after possibly writing PP
     0 = do not read current pixel into PP
W  - 1 = do not write PP
     0 = write PP to current pixel
C  - 1 = compare OO with PP according to E, and write PP only if true
     0 = always write
E  - 1 = only write PP if current pixel is equal to OO
     0 = only write PP if current pixel is not equal to OO
OO - pixel to compare to existing pixel
PP - new pixel to write, and previous pixel when reading

The source value is the XY location as two bytes, the X being the MSB. Since the XOP supports multiple addressing for the source parameter, you can use a register or memory location. XY values are 0 to 255.

The destination parameter is the PIX instruction as indicated above.
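To make the MAxxRWCE xxOOxxPP layout a little more concrete, here is a hedged Python sketch that builds the 16-bit value you would load into the destination register. It only reflects the bit positions given in the format line; the function `pix_word` and its parameter names are invented for this example.

```python
def pix_word(pp=0, oo=0, gm2_address=False, address_only=False,
             read=False, no_write=False, compare=False, if_equal=False):
    """Build the 16-bit PIX operand laid out as MAxxRWCE xxOOxxPP."""
    if not (0 <= pp <= 3 and 0 <= oo <= 3):
        raise ValueError("PP and OO are two-bit pixel values (0-3)")
    word = (oo << 4) | pp          # low byte: xxOOxxPP
    if gm2_address:
        word |= 0x8000             # M
    if address_only:
        word |= 0x4000             # A
    if read:
        word |= 0x0800             # R
    if no_write:
        word |= 0x0400             # W (1 = do NOT write)
    if compare:
        word |= 0x0200             # C
    if if_equal:
        word |= 0x0100             # E
    return word


# Read the existing pixel and write "01" in its place
print(">%04X" % pix_word(pp=1, read=True))  # prints >0801
```

A conditional write such as "write 2 only if the current pixel is 0" would be `pix_word(pp=2, oo=0, compare=True, if_equal=True)`, which gives >0302.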
If you use the M or A operations (calculate addresses only), the destination register will contain the address after the instruction has executed. If you use the R operation, the read pixel will be in PP (it overwrites the LSbits). You can read and write at the same time, in which case the PP bits are written first and then replaced with the original pixel bits.

Example (this is code running on the GPU):

       LI   R0,>2020     * xy=32,32
       LI   R1,>0001     * write a pixel of "01"
       XOP  R0,R1        * PIX R0,R1

- or -

       EVEN              * make sure XPIX is an even address
XPIX   BYTE 50
YPIX   BYTE 50
       .
       .
       .
       LI   R1,>0801     * Read existing pixel at XPIX,YPIX and write a "01" pixel in its place
       XOP  @XPIX,R1

- or -

       LI   R1,>0302     * ONLY write a 2("10") pixel if the current pixel is 0("00")
       XOP  @XPIX,R1

- or -

       LI   R1,>0213     * ONLY write a 3("11") pixel if the current pixel is NOT 1("01")
       XOP  @XPIX,R1

- or -

       LI   R1,>8000     * Get the GM2 effective address of the pixel at XY location
       XOP  @XPIX,R1
* R1 now contains the VRAM address of the byte containing the pixel.
* Doing (XPIX AND >07) will isolate the bit in the specified byte.

The PIX instruction was really designed to assist with the BML, so using it with GM2 does require a little extra work to update the pixel in the appropriate byte. However, the address of the byte that contains the pixel to be updated is calculated for you, which replaces all this code (from the E/A manual page 336):

       MOV  R1,R4
       SLA  R4,5
       SOC  R1,R4
       ANDI R4,>FF07
       MOV  R0,R5
       ANDI R5,7
       A    R0,R4
       S    R5,R4

This is a very nice routine, and it took me a long time to figure out how it worked. But once I did, I was very impressed with what was going on, and I was also intrigued to know that all this code does is bit-twiddling.
Since bit-twiddling is something that takes a lot of work via programming, but something that hardware does naturally, all that code can be replaced with a single bit of hardware (shown here as HDL):

gm2_addr <= "00" & (
    (pgba & src_oper(8 to 12) & "00000" & src_oper(13 to 15)) +  -- y / 8 * 256 + (y % 8)
    ("0000" & src_oper(0 to 4) & "000"));                        -- + (x AND >F8) (mask out the pixel index bits)

Two dashes -- are comments in HDL. So, one adder and some bit twiddling and the address is calculated in 10ns. Not that you need to know that, but I thought it was interesting.

Edited September 23, 2013 by matthew180
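For readers who want to convince themselves that the E/A routine and the one-line formula in the HDL comments agree, here is a short Python sketch that models both on a PC. It computes only the byte offset inside the pattern table (the table base that the HDL adds via `pgba` is left out), and the helper names are made up for this example. The `pixel_mask` helper assumes the usual GM2 convention that the leftmost pixel of a byte is its MSbit.

```python
def gm2_byte_offset(x, y):
    """Offset of the byte holding pixel (x, y): y / 8 * 256 + (y % 8) + (x AND >F8)."""
    return ((y >> 3) << 8) + (y & 7) + (x & 0xF8)


def ea_manual_offset(x, y):
    """Step-by-step model of the E/A manual routine, with R0 = x and R1 = y."""
    r4 = y                       # MOV  R1,R4
    r4 = (r4 << 5) & 0xFFFF      # SLA  R4,5
    r4 |= y                      # SOC  R1,R4
    r4 &= 0xFF07                 # ANDI R4,>FF07
    r5 = x & 7                   # MOV  R0,R5 / ANDI R5,7
    r4 = (r4 + x) & 0xFFFF       # A    R0,R4
    r4 = (r4 - r5) & 0xFFFF      # S    R5,R4
    return r4


def pixel_mask(x):
    """Bit mask for pixel x within its byte (assumes leftmost pixel = MSbit)."""
    return 0x80 >> (x & 7)


print(">%04X" % gm2_byte_offset(255, 191))  # last byte of a 6K pattern table, prints >17FF
```

Both functions return the same value for every on-screen coordinate, which is all the hardware adder has to reproduce.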
matthew180 Posted March 20, 2013

See the first post in this thread for the firmware download and instructions for people using a JTAG cable for the update. The in-system update is not ready yet, sorry.

March 20, 2013 firmware update:
* Fixed the sprite collision bug
* Fixed the text1 (40-column mode) bug when the F18A 30-row option is enabled
* Updated default built-in GPU support functions
TheMole Posted April 4, 2013

Is there a list of software out there that is written to support the F18A (regardless of platform, CV is also OK)? I know of Tursi's slideshow program, but apart from that I have no idea.

You asked before what might be missing out there to get some traction on F18A (game) development. I think a set of higher level language constructs might be useful for some of us (e.g. some XB CALL LINK routines or somesuch).

Either way, now that I've received mine, my first project will be porting my Alex Kidd proof-of-concept to the F18A. Hopefully I can turn that into a full side-scrolling platform game (if not a remake of the original). With a bit of luck I can set up the hardware this weekend.

Asmusr Posted April 4, 2013

Could you post a list of the updated built-in GPU support functions?
matthew180 Posted April 6, 2013

The pre-loaded routines are probably not as much as you might think. Rather than try to guess what everyone might need, I decided to keep it to a minimum (and not hold up the update any longer) and just provide some sort of software library later. The initial firmware had two pre-loaded routines, a block copy and font load. In the v1.4 update I just added a minimal number of routines, and made room for the user to add their own routines using the same parameter mechanism and vector table.

BLKCPY * Block Copy
FONTLD * Font Load
GETINF * Get catalog version, free memory, vector tables
GETIDX * Get a catalog index entry
BLOBLD * Load a data blob from the catalog

The firmware has a catalog file that I had hoped to fill with more routines, sound data, patterns, etc. but it never happened (too much work, not enough time). The catalog currently contains the pre-loaded code itself (so the F18A can be software reset if desired), the default palettes, and about 22 character sets (patterns for tiles 0 to 255).

The pre-loaded code is attached to this post.

gpu_preload.zip
Asmusr Posted April 12, 2013

Perhaps a crazy idea, but would it be possible to run an assembler on the F18A GPU? Being able to assemble fast would reduce the pain of programming on a real TI considerably. But I guess it would be difficult to fit an assembler and the source code into the 16K VDP memory + 2K, or what?

This brings me to my actual question: If you want to run some code on the GPU that uses branch instructions and not just jump instructions, how would you produce and 'upload' that code?

Thanks.
matthew180 Posted April 13, 2013

You could run an assembler on the F18A, but you would probably have to write it from scratch. I don't think the E/A could be fixed up to run on the GPU.

As for loading GPU code, you assemble using any 99/4A compatible assembler (E/A, asm994a, etc.), but you have to use AORG with a VRAM value from >0000 to >47FE, just like you do for cartridge development (except the cart uses AORG >6000 and is in host CPU RAM, not VDP VRAM). Once you have the opcodes, you have to include them in an E/A loadable assembly program, or load the data from disk at runtime. It gets a little tedious I know, and eventually I hope to have a few tools to help with development.

Using the branch instructions is just like using any others; you use them as you would in any assembly program. With AORG, all the references are resolved at assembly time and the code will only run correctly if loaded at the specified address. I did a lot of this kind of code when testing the F18A. Here is an example of the GPU and host (99/4A) programs I used to validate the GPU's jump instructions.

This is the GPU code, i.e.
to be included in the host assembly program and loaded to VRAM for execution by the GPU:

* F18A GPU Test
* Matthew Hagerty
* June 13, 2012
*
* Test jump instructions

       DEF  MAIN
       AORG >3F10
MAIN   IDLE
       MOV  @>3F00,@JINST     * Jump opcode to execute at >3F00, >3F01
       CLR  R0
       LI   R14,JINST
       MOV  @>3F02,R15        * Flag value at >3F02, >3F03
       RTWP                   * R14->PC, R15->status flags
JINST  DATA 0                 * Replaced by opcode
       INC  R0                * Only executed if jump falls through
       MOV  R0,@>3F00         * R0 result in >3F00, >3F01
       B    @MAIN
       END

I take the listing and convert the opcodes to DATA statements to be included in the host assembly program:

Asm994a TMS99000 Assembler - v3.010

                * Asm994a Generated Register Equates
                *
     0000 0000  R0     EQU  0
     0000 0001  R1     EQU  1
     0000 0002  R2     EQU  2
     0000 0003  R3     EQU  3
     0000 0004  R4     EQU  4
     0000 0005  R5     EQU  5
     0000 0006  R6     EQU  6
     0000 0007  R7     EQU  7
     0000 0008  R8     EQU  8
     0000 0009  R9     EQU  9
     0000 000A  R10    EQU  10
     0000 000B  R11    EQU  11
     0000 000C  R12    EQU  12
     0000 000D  R13    EQU  13
     0000 000E  R14    EQU  14
     0000 000F  R15    EQU  15
                *
   1            * F18A GPU Test
   2            * Matthew Hagerty
   3            * June 13, 2012
   4            *
   5            * Test jump instructions
   6
   7 0000 3F10         DEF  MAIN
   8                   AORG >3F10
   9 3F10 0340  MAIN   IDLE
  10 3F12 C820         MOV  @>3F00,@JINST     * Jump opcode to execute at >3F00, >3F01
  10 3F14 3F00
  10 3F16 3F24
  11 3F18 04C0         CLR  R0
  12 3F1A 020E         LI   R14,JINST
  12 3F1C 3F24
  13 3F1E C3E0         MOV  @>3F02,R15        * Flag value at >3F02, >3F03
  13 3F20 3F02
  14 3F22 0380         RTWP                   * R14->PC, R15->status flags
  15 3F24 0000  JINST  DATA 0                 * Replaced by opcode
  16 3F26 0580         INC  R0                * Only executed if jump falls through
  17 3F28 C800         MOV  R0,@>3F00         * R0 result in >3F00, >3F01
  17 3F2A 3F00
  18 3F2C 0460         B    @MAIN
  18 3F2E 3F10
  19 3F30 0000         END
  19

Assembly Complete - Errors: 0, Warnings: 0

------ Symbol Listing ------
JINST  ABS:3F24 JINST
MAIN   ABS:3F10 MAIN
R0     ABS:0000 R0
R1     ABS:0001 R1
R10    ABS:000A R10
R11    ABS:000B R11
R12    ABS:000C R12
R13    ABS:000D R13
R14    ABS:000E R14
R15    ABS:000F R15
R2     ABS:0002 R2
R3     ABS:0003 R3
R4     ABS:0004 R4
R5     ABS:0005 R5
R6     ABS:0006 R6
R7     ABS:0007 R7
R8     ABS:0008 R8
R9     ABS:0009 R9

Here is the host-side assembly with the GPU program included as data. It will be copied to the VRAM at the location used in the AORG, which is >3F10 in this case. Once in VRAM, the GPU can execute the code.

* F18A CPU to GPU jump instruction test driver
* Matthew Hagerty
* June 4, 2012
*
* 99/4A driver for the GPU jump instruction test.
* Each jump instruction is executed, then the GPU executes
* the same jump instruction and the jump / no-jump result
* is compared.

       DEF  MAIN

* VDP Memory Map
*
VDPRD  EQU  >8800             * VDP read data
VDPSTA EQU  >8802             * VDP status
VDPWD  EQU  >8C00             * VDP write data
VDPWA  EQU  >8C02             * VDP set read/write address

* Workspace
*
WRKSP  EQU  >8300             * Workspace
R0LB   EQU  WRKSP+1           * R0 low byte reqd for VDP routines
R1LB   EQU  WRKSP+3           * R1 low byte
R2LB   EQU  WRKSP+5           * R2 low byte
R3LB   EQU  WRKSP+7           * R3 low byte
R4LB   EQU  WRKSP+9           * R4 low byte
R5LB   EQU  WRKSP+11          * R5 low byte
R6LB   EQU  WRKSP+13          * R6 low byte
* R5  R6  R7  R8  R9  R10 R11 R12 R13 R14 R15
* 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31

SAVR0  DATA 0
PASFAL DATA 0                 * Pass/fail flag
TXTPAS TEXT 'pass'
HEX    TEXT '0123456789ABCDEF'
SPACE  BYTE 32                * space character
       EVEN

GPU    DATA >0340             * 3F10 0340 MAIN   IDLE
       DATA >C820             * 3F12 C820        MOV  @>3F00,@JINST * Jump opcode to execute at >3F00, >3F01
       DATA >3F00             * 3F14 3F00
       DATA >3F24             * 3F16 3F24
       DATA >04C0             * 3F18 04C0        CLR  R0
       DATA >020E             * 3F1A 020E        LI   R14,JINST
       DATA >3F24             * 3F1C 3F24
       DATA >C3E0             * 3F1E C3E0        MOV  @>3F02,R15 * Flag value at >3F02, >3F03
       DATA >3F02             * 3F20 3F02
       DATA >0380             * 3F22 0380        RTWP * R14->PC, R15->status flags
       DATA >0000             * 3F24 0000 JINST  DATA 0 * Replaced by opcode
       DATA >0580             * 3F26 0580        INC  R0 * Only executed if jump falls through
       DATA >C800             * 3F28 C800        MOV  R0,@>3F00 * R0 result in >3F00, >3F01
       DATA >3F00             * 3F2A 3F00
       DATA >0460             * 3F2C 0460        B    @MAIN
       DATA >3F10             * 3F2E 3F10
GPUEND

MAIN   LIMI 0
       LWPI WRKSP

* F18A blind unlock, no testing for success or failure.
* Perform the Enhance Register Mode (ERM) unlock sequence
* for the F18A.
       LI   R0,>391C          * VR1/57, value 00011100
       BL   @VWTR             * Write once
       BL   @VWTR             * Write twice, unlock

       LI   R0,>01E0          * VR1, value 11100000, a real sane setting
       BL   @VWTR             * Write reg

* Copy GPU code to VRAM
       LI   R0,>3F10
       LI   R1,GPU
       LI   R2,GPUEND-GPU
       BL   @VMBW

       LI   R0,33             * 1,1 screen location to start output
       LI   R7,33             * Screen next row
       LI   R3,JLIST
       LI   R13,WRKSP         * Used in the RTWP instruction and will never change
       LI   R14,JINST

* Instruction loop
ILOOP  BL   @VWAD
       MOVB *R3+,@VDPWD       * Display the instruction name
       MOVB *R3+,@VDPWD
       MOVB *R3+,@VDPWD
       MOVB *R3+,@VDPWD
       AI   R0,5              * Adjust past the name and add a space

       MOV  *R3,@JINST        * Copy the jump opcode for execution
       CLR  R15               * Reset the count
       CLR  @PASFAL           * Clear the pass/fail flag

CLOOP  CLR  R5
       RTWP                   * R13->WP, R14->PC, R15->status flags
JINST  DATA 0                 * Replaced with jump instruction opcode
       INC  R5

* Copy to VRAM for GPU
       MOV  R0,@SAVR0         * Temp save R0
       LI   R0,>3F00
       BL   @VWAD
       MOV  *R3,R0
       MOVB R0,@VDPWD         * Jump opcode >3F00, >3F01
       MOVB @R0LB,@VDPWD
       MOV  R15,R0
       MOVB R0,@VDPWD         * Current flag to >3F02, >3F03
       MOVB @R0LB,@VDPWD

* Set the GPU PC which also triggers it
       LI   R0,>363F
       BL   @VWTR
       LI   R0,>3712
       BL   @VWTR

* Compare the result in >3F01
       LI   R0,>3F01
       BL   @VRAD
       MOV  @SAVR0,R0         * Restore R0
       CB   @VDPRD,@R5LB
       JEQ  CNEXT

* Test failed, display the value that failed
       INC  @PASFAL
       BL   @HEXDMP
       JMP  BAIL              * Skip the rest of the flags

CNEXT  AI   R15,>0400         * Increment the flags
       JNE  CLOOP

BAIL
* Check if test passed (nothing has been displayed yet)
       MOV  @PASFAL,@PASFAL   * Compare to 0
       JNE  JNEXT             * If there was a failure, do not display 'PASS'

* If the flags failed, display 'flag', otherwise 'pass'
       LI   R1,TXTPAS
       LI   R2,4              * R0 is already set up
       BL   @VMBW

JNEXT  AI   R7,32             * Next screen row
       MOV  R7,R0
       INCT R3
       MOV  *R3,R4            * Next instruction
       JEQ  DONE
       B    @ILOOP

DONE   JMP  DONE

JLIST  TEXT 'JEQ '
       JEQ  $+4
       TEXT 'JGT '
       JGT  $+4
       TEXT 'JH  '
       JH   $+4
       TEXT 'JHE '
       JHE  $+4
       TEXT 'JL  '
       JL   $+4
       TEXT 'JLE '
       JLE  $+4
       TEXT 'JLT '
       JLT  $+4
       TEXT 'JMP '
       JMP  $+4
       TEXT 'JNC '
       JNC  $+4
       TEXT 'JNE '
       JNE  $+4
       TEXT 'JNO '
       JNO  $+4
       TEXT 'JOC '
       JOC  $+4
       TEXT 'JOP '
       JOP  $+4
       DATA 0

**
* Display R15 as a hex number at screen location in R0
* Uses R5
*
HEXDMP MOV  R11,R5            * Save return address
       BL   @VWAD
       MOV  R5,R11            * Restore return address
       MOV  R15,R5
       ANDI R5,>F000          * Isolate the first digit
       SRL  R5,12             * Convert to a number
       MOVB @HEX(R5),@VDPWD   * Convert to ASCII and write to the screen
       MOV  R15,R5
       ANDI R5,>0F00
       SRL  R5,8
       MOVB @HEX(R5),@VDPWD
       MOV  R15,R5
       ANDI R5,>00F0
       SRL  R5,4
       MOVB @HEX(R5),@VDPWD
       MOV  R15,R5
       ANDI R5,>000F
       MOVB @HEX(R5),@VDPWD
       MOVB @SPACE,@VDPWD
       AI   R0,5              * 4 digits plus a space
       B    *R11
*// HEXDMP

*********************************************************************
*
* VDP Set Write Address
*
* R0   Address to set VDP address counter to
*
VWAD   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set the two MSbits to 01 for write
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
       ANDI R0,>3FFF          * Restore R0 top two MSbits
       B    *R11
*// VWAD / VRAD

*********************************************************************
*
* VDP Set Read Address
*
* R0   Address to set VDP address counter to
*
VRAD   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ANDI R0,>3FFF          * Make sure the two MSbits are 00 for read
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
       B    *R11
*// VRAD

*********************************************************************
*
* VDP Multiple Byte Write
*
* R0   Starting write address in VDP RAM
* R1   Starting read address in CPU RAM
* R2   Number of bytes to send to the VDP RAM
*
* R1 is modified by the value of R2
* R2 is changed to 0
*
VMBW   MOVB @R0LB,@VDPWA      * Send low byte of VDP RAM write address
       ORI  R0,>4000          * Set the two MSbits to 01 for write
       MOVB R0,@VDPWA         * Send high byte of VDP RAM write address
VMBWLP MOVB *R1+,@VDPWD       * Write byte to VDP RAM
       DEC  R2                * Byte counter
       JNE  VMBWLP            * Check if done
       ANDI R0,>3FFF          * Restore R0 top two MSbits
       B    *R11
*// VMBW

*********************************************************************
*
* VDP Write To Register
*
* R0 MSB    VDP register to write to
* R0 LSB    Value to write
*
VWTR   MOVB @R0LB,@VDPWA      * Send low byte (value) to write to VDP register
       ORI  R0,>8000          * Set up a VDP register write operation (10)
       MOVB R0,@VDPWA         * Send high byte (address) of VDP register
       ANDI R0,>3FFF          * Restore R0 top two MSbits
       B    *R11
*// VWTR

       END
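As a rough aid for checking values like >363F and >3712 on a PC before typing them into a host program, here is a Python sketch of the byte sequences the VWAD and VWTR routines send to the VDP address port, plus the two R0 values used to start the GPU. The function names are invented for this example. The MSB-to-VR54 (>36) and LSB-to-VR55 (>37) split is read off the listing above, so treat it as an assumption rather than a statement from the register documentation.

```python
def vdp_write_address_bytes(addr):
    """Bytes sent to the address port to set a VRAM write address (as in VWAD)."""
    addr &= 0x3FFF
    return [addr & 0xFF, (addr >> 8) | 0x40]   # low byte first, then high byte with 01 in the MSbits


def vdp_register_write_bytes(reg, value):
    """Bytes sent to the address port for a register write (as in VWTR)."""
    if not 0 <= reg <= 0x3F:
        raise ValueError("register number must fit in six bits")
    return [value & 0xFF, reg | 0x80]           # value first, then register with 10 in the MSbits


def gpu_start_words(pc):
    """R0 values to pass to VWTR: assumed VR54 (>36) = PC MSB, VR55 (>37) = PC LSB."""
    return [(0x36 << 8) | ((pc >> 8) & 0xFF), (0x37 << 8) | (pc & 0xFF)]


print([">%04X" % w for w in gpu_start_words(0x3F12)])  # prints ['>363F', '>3712']
```

In the test driver the start address is >3F12, one word past MAIN, so the GPU skips the IDLE at >3F10 when it is triggered.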
Manic1975 Posted April 21, 2013

Matthew, is there any news about the in-system update? It would be nice to update my F18A and play all the Extended BASIC games.

matthew180 Posted April 21, 2013 (edited)

Sorry, no updates yet. :-( I'm spending as much spare time on it as I can.

Edited April 21, 2013 by matthew180

Manic1975 Posted April 22, 2013

Is there any difference when updating the F18A with a JTAG cable? What would I need to update the F18A this way?

matthew180 Posted April 22, 2013

A potential work-around for the XB games might be to limit the number of sprites so the CALL COINC(ALL) only acts on the visible sprites. I think there is a way to do that, but I don't remember.

Updating via JTAG is the same way I initially program the firmware. The advantage of using the JTAG cable is that it is fast (about 20 seconds for the update) and it does not matter if something goes wrong; if you lose power during the update, etc., you can just start over. With the "in place" update, if there is a problem and power is lost, the F18A will have to be updated with a JTAG cable.

Look at the first post in this thread for details on how to update via a JTAG cable. I'm not sure about outside of the U.S., but Digilent sells a JTAG programmer for $50 USD, which is pretty cheap for a JTAG cable. You also need the Xilinx tools (as noted), and I suggest getting just the "Lab Tools" (also noted above) since it is a smaller download.

matthew180 Posted April 26, 2013

Small update: I attached a new f18a_250k.zip file to the first post with a fix for the GPU DIV instruction. If you have a JTAG programming cable, you can update your F18A firmware.

Manic1975 Posted June 1, 2013

Hello Matthew, I bought a JTAG programming cable and have updated my F18A firmware. I followed your instructions and everything works just fine! Thank you for the good instructions and firmware. Please keep publishing new firmware this way.
Asmusr Posted September 12, 2013

Quote from matthew180:
When moving to the ECM (enhanced color modes) for sprites and tiles, the pattern data itself *does* actually select a color from a palette. This is different than the 1-bpp original mode because, for example, in 2-bpp mode a "00" pixel means palette index 0, which will cause a color to be displayed. Maybe. In ECMs there is a setting to specify if a "00" or "000" pixel value means the pixel is transparent, or if the pixel should use the 0-index in the palette. So, just like the original mode for sprites where you have 2-colors but one is always transparent, you can select if you want sprites to have transparent pixels or index colors. When you select to have transparent colors for "0", "00", or "000" pixels, you sacrifice 1 color.

Hi Matthew, I'm reading this again because I'm thinking about using multi-color sprites for my Scramble game (and fall back to monochrome sprites if an F18A is not detected) but I'm not sure I fully understand how it works. Do all sprites share the same palette of 4 or 8 colors, or can each sprite use a different palette?

Quote from matthew180:
To provide more pixel data, the extra pattern bits need to come from some where. To keep some sort of compatibility with existing patterns, I chose to implement the extra bits via "bit planes". This allows you to start off with some existing sprite or tile patterns, and expand them to support more colors at a later time in your development. Also, the 3-bpp mode does not pack neatly into a single byte had I tried to use a linear bit-packing method.

I'm not sure what you mean by "This allows you to start off with some existing sprite or tile patterns, and expand them to support more colors at a later time in your development."? It would be nice if you could keep the monochrome sprite pattern table unmodified and just add more planes when you switch to multi-color sprites, but that's not how it works, right? You will also need to replace the first plane/table.

Quote from matthew180:
For each byte that makes up a sprite's pattern, there are one or two more bytes in the additional planes that are used to make the final color index for a given pixel. The additional pattern planes are always 2K (and 4K for 3-bpp) bytes offset from the Sprite Pattern Generator Table (SPGT). This means that in 1-bpp the SPGT is the normal 2K, for 2-bpp the SPGT is 4K, and 3-bpp it is 6K.

Could a future version of the firmware include an option only to have 1K or 512 bytes between the planes? This could save a lot of RAM if you only have a few sprite patterns. Having to reserve 6K for sprite patterns would be problematic in many cases.

Thanks,
Rasmus
sometimes99er Posted September 13, 2013

Quote from Asmusr:
Could a future version of the firmware include an option only to have 1K or 512 bytes between the planes? This could save a lot of RAM if you only have a few sprite patterns. Having to reserve 6k for sprite patterns would be problematic in many cases.

Don't forget the power of overlapping areas. In principle TI Basic and XB have the PDT laid over the SIT, CT, SAL and SDT.