JasperAK Posted January 20, 2021

8 hours ago, Andrew Davie said: Can't get much closer than this without lots of work, and basically I'm not gunna. Just thought I'd share the effort since you were interested enough to ask.

That is so amazing. Thank You. There is something about that color scheme that gets me.
+Andrew Davie (Author) Posted January 21, 2021

3 hours ago, SpiceWare said: Correct. My custom mode file (syntax highlighting rules) for jEdit colorizes binary, which really makes graphics in source code stand out. Since the categories went away I created a jEdit index for my blog entries related to it.

That's lovely and what I need, but I'm firmly committed to using Atari Dev Studio these days. It's probably easy enough to fiddle with the syntax highlighting myself... will look into it.

BTW: while I have your attention, the getRandom() function in your CDFJ collect source code didn't work well at all for me (randomly moving the pieces). I recall I was using the low bits to select squares. Perhaps it was my coding/usage, but I replaced it with the following which seems to be doing an excellent job...

    unsigned int m_z = 2342;
    unsigned int m_w = 122561;

    unsigned int getRandom32b() {
        m_z = 36969 * (m_z & 65535) + (m_z >> 16);
        m_w = 18000 * (m_w & 65535) + (m_w >> 16);
        return (m_z << 16) + m_w;
    }

The initial values I just chose... randomly.
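For readers who want to try the generator outside the cart, here is a self-contained sketch of the function posted above. The arithmetic matches the usual description of a two-stream multiply-with-carry generator. The fixed-width types and the randomCoord() helper, which picks a board coordinate from the top three bits instead of the low bits, are assumptions added for illustration and are not from the collect demo.

```c
#include <assert.h>
#include <stdint.h>

/* Seeds as given in the post. */
static uint32_t m_z = 2342;
static uint32_t m_w = 122561;

/* Same arithmetic as getRandom32b() above, written with fixed-width types. */
static uint32_t getRandom32b(void)
{
    m_z = 36969u * (m_z & 65535u) + (m_z >> 16);
    m_w = 18000u * (m_w & 65535u) + (m_w >> 16);
    return (m_z << 16) + m_w;
}

/* Hypothetical helper: board coordinate 0..7 taken from the top three bits. */
static unsigned randomCoord(void)
{
    unsigned coord = getRandom32b() >> 29;
    assert(coord < 8u);
    return coord;
}
```

Whether the high bits behave better than the low bits for a given use is something to measure rather than assume; the helper only shows how to select them.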
+SpiceWare Posted January 21, 2021

39 minutes ago, Andrew Davie said: That's lovely and what I need, but I'm firmly committed to using Atari Dev Studio these days. It's probably easy enough to fiddle with the syntax highlighting myself... will look into it.

I'd mentioned it to @mksmith about a year ago. He looked into it and said the background colors could not be set. I told him that even just different foreground colors would help, as that's how I'd originally implemented it. Don't know if he looked into it anymore after that.

39 minutes ago, Andrew Davie said: BTW: while I have your attention, the getRandom() function in your CDFJ collect source code didn't work well at all for me (randomly moving the pieces). I recall I was using the low bits to select squares. Perhaps it was my coding/usage, but I replaced it with the following which seems to be doing an excellent job...

I've been using that since I started doing DPC+ projects. I thought I got it from the wiki page I linked to, but don't see the code - possibly the page was edited, or I found it somewhere else.

The ARM has an inline barrel shifter so bit-shifting is done for free. I cover that here in the comments of Part 3 of the CDFJ tutorial. So I tend to use different bits, such as this snippet of the robot initial position routine in Frantic:

    gSpriteX[spot] = 12 + (x * 28) + (((getRandom32() & 0xff) * 18) >> 8);
    gSpriteY[spot] = 10 + (y * 60) + (((getRandom32() & 0xffff) * 43) >> 16);

    // start each robot's eye in a different position
    gSpriteAnimFrame[spot] = (6 * (getRandom32() & 0xffffff)) >> 24;
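The Frantic snippet scales a masked random value into a small range with a multiply and a shift instead of a modulo. A minimal sketch of that idea follows; scaleToRange() is a hypothetical helper written for this note, not a function from Frantic or the CDFJ sources.

```c
#include <assert.h>
#include <stdint.h>

/* Map the low `bits` bits of r onto 0..range-1 with one multiply and one shift.
   The caller must keep ((1 << bits) - 1) * range below 2^32. */
static uint32_t scaleToRange(uint32_t r, unsigned bits, uint32_t range)
{
    assert(bits > 0u && bits < 32u);
    uint32_t mask = (1u << bits) - 1u;
    return ((r & mask) * range) >> bits;
}
```

With 8 bits and a range of 18, the largest input (0xff) gives (255 * 18) >> 8 = 17, so the result always stays within 0..17, which is the span added to gSpriteX above.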
+Andrew Davie (Author) Posted January 21, 2021 (edited)

Here's how I define the pieces... I manually toggle the 1's and 0's in these blocks of code. Yes, I could write a tool to convert a graphics image (I did that for the earlier engine), but you know, you start doing something manually just to figure it out, and then you find you're nearly done and the diversion to write the tool seems greater than just continuing hacking away the way you've been doing it. You're nearly finished, and yet a year later you realise you're still hacking away manually and you could have written that tool after all and it would have been much easier. But.. c'est la vie... it's too late now so you continue to hack.

    const unsigned char WHITE_KNIGHT[] = {
        // RED    BLUE     GREEN    MASK        SCANLINE #
        0b00001, 0b00001, 0b00001, 0b11110, // 0
        0b00001, 0b00001, 0b00001, 0b11110, // 1
        0b00101, 0b00101, 0b00101, 0b11010, // 2
        0b00110, 0b00110, 0b00110, 0b11000, // 3
        0b01011, 0b01011, 0b01010, 0b10000, // 4
        0b11011, 0b11011, 0b01010, 0b00000, // 5
        0b11110, 0b11110, 0b01110, 0b00000, // 6
        0b11111, 0b11111, 0b11110, 0b00000, // 7
        0b11111, 0b11111, 0b11110, 0b00000, // 8
        0b11110, 0b11110, 0b11110, 0b00000, // 9
        0b01111, 0b01111, 0b01110, 0b00000, // 10
        0b11111, 0b11111, 0b11110, 0b00000, // 11
        0b00110, 0b00110, 0b00110, 0b10000, // 12
        0b01111, 0b01111, 0b01110, 0b10000, // 13
        0b01111, 0b01111, 0b01110, 0b10000, // 14
        0b01110, 0b01110, 0b01110, 0b10001, // 15
        0b01100, 0b11100, 0b01100, 0b00001, // 16
        0b11010, 0b11010, 0b11000, 0b00001, // 17
        0b11111, 0b11111, 0b11110, 0b00000, // 18
        0b11111, 0b11111, 0b11110, 0b00000, // 19
        0b11111, 0b11111, 0b11110, 0b00000, // 20
        0b11111, 0b11111, 0b11110, 0b00000, // 21
        0b00000, 0b00000, 0b00000, 0b00000, // 22
        0b11111, 0b00000, 0b00000, 0b00000, // 23
    };

Each chess piece is defined as a 5-bit x 24 scanline "image" represented as binary in the code above. Each line in the definition gives a single scanline of 5 pixels.
Each pixel is one of the bits (left to right) in the binary numbers. There are three columns - one for each colour, and a fourth column giving a mask. So on line 1 of the screen where the shape is drawn, we'd see the pixels in the "RED" column, scanline 1. On line 2, the colour "rolls" so we'd see the pixels from the "BLUE" column. On line 3, the "GREEN" column would be used. So, we're continually "rolling" through RED/BLUE/GREEN/RED/BLUE/GREEN... etc... as we go down the scanlines. But the next frame we start with BLUE instead of RED. And the frame after that starts with GREEN... and then RED... etc.

So, the 4th and final byte in each "line" is a mask. This has 0 where there is any pixel in use in the corresponding RED/GREEN/BLUE columns. You'd think you could generate this mask given you just have the RED/GREEN/BLUE already, but this is not the case. Even if your RGB pixels are completely unused, you may wish to mask out background before you draw. For example, black pixels - should they be transparent (showing background instead of black)... or opaque... showing black? The mask is where you determine this. If you want black, then you set the mask pixel as 0 (= used). If you want transparent, set to 1 (= unused). This is useful for putting black borders around objects, or having non-see-through stuff (like an eye on the knight). Effectively we're going ....

    screen[y][x] = (screen[y][x] & mask) | pixel

The squares are coloured simply by drawing a similarly-defined "square" shape. It works exactly the same way, but has all-BLUE pixels, and of course the mask is set to all-zero; causing all the previous contents to be "erased" to the square colour. So to draw a piece, we first draw the blank-square piece (blue or black), and then we draw the piece over that.

This differs significantly from the pure 6507 version of the earlier engine, which had to pre-define all the shifted versions of all the pieces on both square colours for all the pieces for both piece colours.
It took quite a few banks of shape definitions. Here I'm trading the power of the ARM to just define the pieces in isolation, and use the ARM's efficient shifting and speed to do the draw in "efficient" stages. I have separate code for each of the 8 horizontal positions of the squares, specifically because you have to shift/mask/mirror the image based on the skewy 2600 PF registers. Specifically, PF0, PF2 are mirrored and PF1 is not. But the squares are 5-pixels wide, so each of the 8 positions is a unique combination of mirroring/masking/shifting. For example, here's file "C"...

    // column C = PF1 D1D0, PF2<-D0D1D2
    int column = 2;
    image = *charSet[screen[row][column]];
    for (int y = 0; y < 24; y++) {
        int scanline = row * 24 + y;
        int lineIndex = y * 4;
        int maskIndex = lineIndex + 3;
        int colourIndex = lineIndex + ((_rgb & 3));

        // the SQUARE first
        // remove span of pixels used
        arena_pf1_left[scanline] &= 0b11111100;
        arena_pf2_left[scanline] &= 0b11111000;

        // add the square colour
        if (!((row + column) & 1)) {
            arena_pf1_left[scanline] |= (squareBaseImage[colourIndex] >> 3) & 0b00000011;
            arena_pf2_left[scanline] |= (BitReversal(squareBaseImage[colourIndex]) >> 5) & 0b00000111;
        }

        if (screen[row][column]) {
            arena_pf1_left[scanline] &= (image[maskIndex] >> 3) | 0b11111100;
            arena_pf1_left[scanline] |= (image[colourIndex] >> 3) & 0b00000011;
            arena_pf2_left[scanline] &= (BitReversal(image[maskIndex]) >> 5) | 0b11111000;
            arena_pf2_left[scanline] |= (BitReversal(image[colourIndex]) >> 5) & 0b00000111;
        }

        _rgb++;
        if (_rgb > 2) _rgb = 0;
    }

It's not terribly efficient - and I draw all 64 squares every single frame. That's why I doubt this will run on hardware. But that's an easy fix.... triple-buffer the screen, and then just present the buffers on demand. And rather than drawing 64 pieces every single frame, I just draw one piece/frame, or as time allows. That should work, provided I have RAM for the triple-buffer.
It's all a learning experience becoming familiar with the limitations of CDFJ. Especially hard without the hardware I need to test it's all working properly. So, I guess I'm saving up for a Harmony Encore cartridge!

Edit: I realise...
(a) I'm using char-based accesses and ARM is 32-bit-based. Inefficient.
(b) since I'm only using 20 bits/line, I could/should mangle them all together and use the ARM masking/shifting to extract. This would reduce the 72-byte requirement for each down to 24 bytes.

So, yeah, that's the way to go. I'll do that ASAP. And then, we also have the (a) sorted, as we'll just declare them as 24 ints.

Edited January 21, 2021 by Andrew Davie
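The masked draw described in this post, screen = (screen & mask) | pixel, can be tried in isolation. The sketch below is a minimal model of one 5-pixel scanline of one colour column. composite() is a hypothetical name, and the real code additionally shifts and mirrors the bits into the PF registers.

```c
#include <assert.h>
#include <stdint.h>

/* Draw one 5-pixel scanline of a shape over the existing contents.
   A mask bit of 1 keeps the background; a mask bit of 0 replaces it
   with the corresponding shape bit (which may itself be 0, i.e. black). */
static uint8_t composite(uint8_t screen, uint8_t mask, uint8_t pixels)
{
    assert((pixels & 0xE0u) == 0u); /* shapes are only 5 bits wide */
    return (uint8_t)(((screen & mask) | pixels) & 0x1Fu);
}
```

Using scanline 2 of the knight (pixels 00101, mask 11010) on an all-set background leaves every bit set, while a line with pixels 00000 and mask 00000 forces all five pixels to black, which is how an opaque outline or eye is produced.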
+mksmith Posted January 21, 2021

2 hours ago, SpiceWare said: I'd mentioned it to @mksmith about a year ago. He looked into it and said the background colors could not be set. I told him that even just different foreground colors would help, as that's how I'd originally implemented it. Don't know if he looked into it anymore after that.

Hi guys, this is still on the list at this point, as the way the rendering engine works in VS Code makes it almost impossible. It's been 10 months since I opened an item on GitHub, and VS Code has made some changes (I believe) around rendering, so they may have some enhancements in this area (there has been a lot of feedback asking to allow this sort of thing). I'll take another look shortly and see if it might be possible.
+Andrew Davie (Author) Posted January 22, 2021

I've spent many hours trying to get some good shading/detail on the pieces. I've mostly finished white; black pieces have a bit of work to do here and there. But most of my time has been spent with that really mind-killing reverse-order of PF0 and PF2 and trying to correctly mask a 5-pixel-wide shape onto a mirrored-normal-mirrored stretch of 20 pixels. Getting there, but I still have some masking issues (removing the previous content) so the blue squares disappear here and there, and columns C and G have incorrect colours for some pieces (because of what's left behind).

chess8.mp4

But overall, I think it's looking nice. I tried to record/convert the video to preserve the shimmer. I still haven't seen this on a CRT.
+Andrew Davie (Author) Posted January 22, 2021

So I ended up following my own advice from earlier, and rewrote the code to use the compacted shape definitions. I was hoping to be able to do some macro string manipulation so I could pass strings with discernible visuals for 0 and 1 (the shaded ASCII block characters would have done nicely). But I couldn't see any way to do a substring operator in the macro, so I ended up just using 1s and 0s. Here's what I ended up with...

    // range: all 0 - 0b11111
    #define B(red, green, blue, mask) \
        ((0b##red & 0b11111) << 0) \
        | ((0b##green & 0b11111) << 5) \
        | ((0b##blue & 0b11111) << 10 ) \
        | ((0b##mask & 0b11111) << 15)

    // Pieces defined thus...
    const unsigned int WHITE_PAWN[] = {
        //  R      G      B      MASK
        B( 00000, 00000, 00000, 00000 ), // 0
        B( 00000, 00000, 00000, 00000 ), // 1
        B( 00000, 00000, 00000, 00000 ), // 2
        B( 00100, 00000, 00100, 00100 ), // 3
        B( 00100, 00100, 00100, 00100 ), // 4
        B( 00100, 00100, 00100, 00100 ), // 5
        B( 00100, 00100, 00100, 00100 ), // 6
        B( 00100, 00000, 00000, 00100 ), // 7
        B( 00000, 00000, 00000, 00100 ), // 8
        B( 01110, 01100, 01110, 01110 ), // 9
        B( 01110, 01100, 01110, 01110 ), // 10
        B( 00000, 00000, 00000, 01110 ), // 11
        B( 00100, 00100, 00100, 00100 ), // 12
        B( 00100, 00100, 00100, 00100 ), // 13
        B( 00100, 00100, 00100, 00100 ), // 14
        B( 00100, 00100, 00100, 00100 ), // 15
        B( 00100, 00100, 00100, 00100 ), // 16
        B( 00100, 00100, 00100, 00100 ), // 17
        B( 01110, 01100, 01110, 01110 ), // 18
        B( 01110, 01100, 01110, 01110 ), // 19
        B( 01110, 01100, 01110, 01110 ), // 20
        B( 00000, 01110, 00000, 01110 ), // 21
        B( 01110, 00000, 00000, 01110 ), // 22
        B( 00000, 00000, 00000, 00000 ), // 23
    };
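To make the packed layout concrete, here is a small sketch that packs one scanline the same way as the B() macro (red in bits 0-4, green in 5-9, blue in 10-14, mask in 15-19) and pulls a field back out. PACK() and plane() are illustrative names, and hexadecimal values are used so the sketch does not depend on the 0b literal extension.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as B() in the post, taking plain numeric arguments. */
#define PACK(red, green, blue, mask)           \
    ((((uint32_t)(red)   & 0x1Fu) <<  0) |     \
     (((uint32_t)(green) & 0x1Fu) <<  5) |     \
     (((uint32_t)(blue)  & 0x1Fu) << 10) |     \
     (((uint32_t)(mask)  & 0x1Fu) << 15))

/* Extract one 5-bit field: index 0 = red, 1 = green, 2 = blue, 3 = mask. */
static uint32_t plane(uint32_t packed, unsigned index)
{
    assert(index < 4u);
    return (packed >> (index * 5u)) & 0x1Fu;
}
```

Scanline 3 of the pawn (00100, 00000, 00100, 00100) packs to 4 + 4096 + 131072 = 135172, and plane(value, 1) returns 0 for its empty green field. Each scanline occupies one 32-bit word in this layout.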
+Andrew Davie (Author) Posted January 22, 2021

Well, I figured out a kind of workaround that does make things a bit easier... even without highlighting...

    // range: all 0 - 0b11111
    #define B(red, green, blue, mask) \
        ((red & 0b11111) << 0) \
        | ((green & 0b11111) << 5) \
        | ((blue & 0b11111) << 10 ) \
        | ((mask & 0b11111) << 15)

    #define _____ 0b00000
    #define ____X 0b00001
    #define ___X_ 0b00010
    #define ___XX 0b00011
    #define __X__ 0b00100
    #define __X_X 0b00101
    #define __XX_ 0b00110
    #define __XXX 0b00111
    #define _X___ 0b01000
    #define _X__X 0b01001
    #define _X_X_ 0b01010
    #define _X_XX 0b01011
    #define _XX__ 0b01100
    #define _XX_X 0b01101
    #define _XXX_ 0b01110
    #define _XXXX 0b01111
    #define X____ 0b10000
    #define X___X 0b10001
    #define X__X_ 0b10010
    #define X__XX 0b10011
    #define X_X__ 0b10100
    #define X_X_X 0b10101
    #define X_XX_ 0b10110
    #define X_XXX 0b10111
    #define XX___ 0b11000
    #define XX__X 0b11001
    #define XX_X_ 0b11010
    #define XX_XX 0b11011
    #define XXX__ 0b11100
    #define XXX_X 0b11101
    #define XXXX_ 0b11110
    #define XXXXX 0b11111

    const unsigned int WHITE_PAWN[] = {
        //  R      G      B      MASK
        B( _____, _____, _____, _____ ), // 0
        B( _____, _____, _____, _____ ), // 1
        B( _____, _____, _____, _____ ), // 2
        B( __X__, _____, __X__, __X__ ), // 3
        B( __X__, __X__, __X__, __X__ ), // 4
        B( __X__, __X__, __X__, __X__ ), // 5
        B( __X__, __X__, __X__, __X__ ), // 6
        B( __X__, _____, _____, __X__ ), // 7
        B( _____, _____, _____, __X__ ), // 8
        B( _XXX_, _XX__, _XXX_, _XXX_ ), // 9
        B( _XXX_, _XX__, _XXX_, _XXX_ ), // 10
        B( _____, _____, _____, _XXX_ ), // 11
        B( __X__, __X__, __X__, __X__ ), // 12
        B( __X__, __X__, __X__, __X__ ), // 13
        B( __X__, __X__, __X__, __X__ ), // 14
        B( __X__, __X__, __X__, __X__ ), // 15
        B( __X__, __X__, __X__, __X__ ), // 16
        B( __X__, __X__, __X__, __X__ ), // 17
        B( _XXX_, _XX__, _XXX_, _XXX_ ), // 18
        B( _XXX_, _XX__, _XXX_, _XXX_ ), // 19
        B( _XXX_, _XX__, _XXX_, _XXX_ ), // 20
        B( _____, _XXX_, _____, _XXX_ ), // 21
        B( _XXX_, _____, _____, _XXX_ ), // 22
        B( _____, _____, _____, _____ ), // 23
    };

Quite the glorious hack.
+Andrew Davie (Author) Posted January 22, 2021 (edited)

Here's a version. Firstly, it's pretty much @SpiceWare's collect demo with just a few tweaks. All the special CDFJ magic is nothing to do with me. So, thanks again for that.

What this is is a simple Interleaved Chronocolour(TM) display of a chessboard. The board is generated every frame in ARM code, and frankly I don't see how this could work on real hardware. If you have a Harmony Cart, you might try to run it - but as noted... doubt it will work. But, it works on Stella and that's all I really wanted.

If you do run on Stella, you should (to be fair) set the phosphor setting (TAB/Video&Audio/TV Effects) to something which you think approximates a real TV, in terms of phosphor persistence. I find that anything from 50% upwards seems "fair" to me, but YMMV.

I guess at this stage that's a wrap. I've done all I wanted to do and this one can be put in the filing cabinet.

chess9.mp4
CDFJChess.bin

Edited January 22, 2021 by Andrew Davie
+SpiceWare Posted January 22, 2021

9 hours ago, Andrew Davie said: Well, I figured out a kind of workaround that does make things a bit easier... even without highlighting...

That's how I used to do it:

graphics.h

    ; graphics
        SEG.U VARS
    ; preceding with zz so these variables don't replace others in Stella's debugger
    zz________ = %00000000; $0 0
    zz_______X = %00000001; $1 1
    zz______X_ = %00000010; $2 2
    zz______XX = %00000011; $3 3
    zz_____X__ = %00000100; $4 4
    zz_____X_X = %00000101; $5 5
    zz_____XX_ = %00000110; $6 6
    ...
+SpiceWare Posted January 22, 2021

59 minutes ago, Andrew Davie said: The board is generated every frame in ARM code, and frankly I don't see how this could work on real hardware. If you have a Harmony Cart, you might try to run it - but as noted... doubt it will work.

Board looks correct, though the screen rolls.

One of the things I do in my projects early on is add a way to show VB and OS time remaining to make it easier to figure out where I'm having timing issues. From Part 8 - Score & Timer:

Quote: One of the challenges when developing DPC+ARM code is that Stella does not emulate how long ARM code takes to run. As far as it's concerned, all ARM code will finish executing in 0 cycles of 6507 time. Because of this it's very easy to write something that will run just fine in emulation, but will cause screen jitters and/or rolls, or even a fatal crash when run on a real Atari. We're already checking timers in our 6507 code, so we can easily save those values and display them in the score.
...
Left B and Right A - Timing Remaining in Vertical Blank and Overscan
+Andrew Davie (Author) Posted January 22, 2021

1 minute ago, SpiceWare said: Board looks correct, though the screen rolls: One of the things I do in my projects early on is add a way to show VB and OS time remaining to make it easier to figure out where I'm having timing issues. From Part 8 - Score & Timer:

Thanks for testing and for the timer info. Can you explain how things work on the cart itself? How does the ARM have "heaps" of time to draw stuff, when the 6507 is running too? Specifically, in the PlusCart for example, the address bus needs to be serviced pretty regularly and quickly. I'm unclear/unsure what exactly the relationship is between the CDFJ cart and the 6507. How does the ARM "get away" with having heaps of time to do stuff, and yet the 6507 is still running? Is the bus servicing interrupt driven?
+SpiceWare Posted January 22, 2021

30 minutes ago, Andrew Davie said: Can you explain how things work on the cart itself?

The ARM runs in 2 modes.

1. When 6507 code is running the ARM is put in a very tight loop monitoring the address bus and reacting as needed to implement the CDFJ registers and such.
2. When the 6507 code does LDY #$FF*/STY CALLFN to trigger your custom C code the ARM captures the current 6507 address, puts a NOP on the databus to idle the 6507, runs your custom code, then puts a JMP ADDRESS+1 on the databus to return the 6507 to the instruction after STY CALLFN.

At the bottom of this blog entry is Further Reading with a few links to @cd-w blog entries about the Harmony that should prove enlightening.

* If you used LDY #$FE/STY CALLFN for digital audio support then an interrupt is used on the ARM to have it output the proper values on the databus to update AUDV0 once per scanline. I'm not sure of the specifics, but it probably outputs LDA #xx/STA AUDV0/NOP
+SpiceWare Posted January 22, 2021

In regards to #2, as the 6507 executes the NOPs the PC will be incrementing. So the STY CALLFN should be towards the beginning of the ROM address space. If you have it towards the end you risk the PC wrapping around to $0000, which would crash the 6507 code.
+Andrew Davie (Author) Posted January 23, 2021 (edited)

@SpiceWare thanks for the explanation. I understand. I obviously need to get that timing info onscreen ASAP.

Meanwhile, it only works if I can either draw the entire board in the available time... OR I set up a buffered board (3 frames for RGB) on the ARM, and feed these directly as the original PrepArenaBuffers() did. That would require a 2nd set (3 frames for RGB) to be available to draw into while the buffered set are being displayed. A fair whack of (ARM) memory; 6 bytes x 192 scanlines x 3 colours (RGB) * 2 buffer sets = 6912 bytes. The current draw takes just the 6 x 192 bytes.

I've had a go at optimising the current draw, just to see if it will fit. It's almost twice as quick, I'm pretty sure. I guess I'm asking nicely for someone to test the attached binary on their Harmony Cart and see if I've fixed the timing and the screen no longer rolls.

For posterity, here is the complete screen draw source code...

    void PrepArenaBuffers() {
        // This function loads the selected Arena layout into the 6 playfield buffers.

        // Set the cycling colours for ICC
        int cno = cbase;
        for (int i = 1; i < 192; i++) { //???
            if (cno > 2) cno = 0;
            RAM[_BUF_COLUPF + i] = ColorConvert(RGB[cno]);
            cno++;
        }

        unsigned char *arena_pf0_left = RAM + _BUF_PF0_LEFT;
        unsigned char *arena_pf1_left = RAM + _BUF_PF1_LEFT;
        unsigned char *arena_pf2_left = RAM + _BUF_PF2_LEFT;
        unsigned char *arena_pf0_right = RAM + _BUF_PF0_RIGHT;
        unsigned char *arena_pf1_right = RAM + _BUF_PF1_RIGHT;
        unsigned char *arena_pf2_right = RAM + _BUF_PF2_RIGHT;

        #define PF0 0
        #define PF1 1
        #define PF2 2

        unsigned char *arenas[][3] = {
            { arena_pf0_left, arena_pf1_left, arena_pf2_left },
            { arena_pf0_right, arena_pf1_right, arena_pf2_right },
        };

        // Choose a random square. If there's a piece there, try to move it randomly
        int x = getRandom32b() & 7;
        int y = getRandom32b() & 7;
        if (screen[y][x] && (getRandom32b() & 0xFF) < 0x20) {
            int tox = getRandom32b() & 7;
            int toy = getRandom32b() & 7;
            if (!screen[toy][tox]) {
                screen[toy][tox] = screen[y][x];
                screen[y][x] = 0;
            }
        }

        int rgb = cbase;
        int scanline;
        int mask, mask2;
        unsigned int piece;
        const unsigned int *im;
        int boardSquareImage;
        int pixels;

        // The draw of all pieces
        for (int row = 0; row < 8; row++) {
            for (int half = 0; half < 2; half++) {

                // column A = PF0<-D4D5D6D7, PF1 D7
                piece = screen[row][half * 4];
                if (!piece) piece += row & 1;
                im = *charSet[piece];
                boardSquareImage = *(*charSet[row & 1]);
                mask = boardSquareImage >> 15;
                scanline = row * 24;
                for (int y = 0; y < 24; y++) {
                    if (rgb > 2) rgb = 0;
                    mask2 = *im >> 15;
                    pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2) | (*im >> (rgb * 5))) & 0b11111;
                    mask2 |= boardSquareImage >> 15;
                    arenas[half][PF0][scanline] = (arenas[half][PF0][scanline] & (~BitReversal(mask2 >> 1) & 0b11110000)) | (BitReversal(pixels >> 1) & 0b11110000);
                    arenas[half][PF1][scanline] = (arenas[half][PF1][scanline] & ~(mask2 << 7)) | ((pixels << 7) & 0b10000000);
                    scanline++;
                    im++;
                    if (++rgb > 2) rgb = 0;
                }

                // column B = PF1 D6D5D4D3D2
                piece = screen[row][1 + half * 4];
                if (!piece) piece += (row + 1) & 1;
                im = *charSet[piece];
                boardSquareImage = *(*charSet[(row + 1) & 1]);
                scanline = row * 24;
                for (int y = 0; y < 24; y++) {
                    mask2 = *im >> 15;
                    pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2) | (*im >> (rgb * 5))) & 0b11111;
                    mask2 |= boardSquareImage >> 15;
                    arenas[half][PF1][scanline] = (arenas[half][PF1][scanline] & ~(mask << 2)) | ((pixels << 2) & 0b01111100);
                    scanline++;
                    im++;
                    if (++rgb > 2) rgb = 0;
                }

                // column C = PF1 D1D0, PF2<-D0D1D2
                piece = screen[row][2 + half * 4];
                if (!piece) piece += (row + 2) & 1;
                im = *charSet[piece];
                boardSquareImage = *(*charSet[(row + 2) & 1]);
                scanline = row * 24;
                for (int y = 0; y < 24; y++) {
                    mask2 = *im >> 15;
                    pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2) | (*im >> (rgb * 5))) & 0b11111;
                    mask2 |= boardSquareImage >> 15;
                    arenas[half][PF1][scanline] = (arenas[half][PF1][scanline] & ~(mask >> 3)) | ((pixels >> 3) & 0b11);
                    arenas[half][PF2][scanline] = (arenas[half][PF2][scanline] & ~(BitReversal(mask) >> 5)) | ((BitReversal(pixels) >> 5) & 0b111);
                    scanline++;
                    im++;
                    if (++rgb > 2) rgb = 0;
                }

                // column D = PF2<-D3D4D5D6D7
                piece = screen[row][3 + half * 4];
                if (!piece) piece += (row + 3) & 1;
                im = *charSet[piece];
                boardSquareImage = *(*charSet[(row + 3) & 1]);
                scanline = row * 24;
                for (int y = 0; y < 24; y++) {
                    mask2 = *im >> 15;
                    pixels = (((boardSquareImage >> (rgb * 5)) & ~mask2) | (*im >> (rgb * 5))) & 0b11111;
                    mask2 |= boardSquareImage >> 15;
                    arenas[half][PF2][scanline] = (arenas[half][PF2][scanline] & ~BitReversal(mask)) | (BitReversal(pixels) & 0b11111000);
                    scanline++;
                    im++;
                    if (++rgb > 2) rgb = 0;
                }
            }
            if (++cbase > 2) cbase = 0;
        }
    }

CDFJChess.bin

Edited January 23, 2021 by Andrew Davie
Dionoid Posted January 23, 2021

On 1/21/2021 at 3:20 AM, Andrew Davie said:

    // range: all 0 - 0b11111
    #define B(red, green, blue, mask) \
        ((0b##red & 0b11111) << 0) \
        | ((0b##green & 0b11111) << 5) \
        | ((0b##blue & 0b11111) << 10 ) \
        | ((0b##mask & 0b11111) << 15)

    // Pieces defined thus...
    const unsigned int WHITE_PAWN[] = {
        //  R      G      B      MASK
        B( 00000, 00000, 00000, 00000 ), // 0
        B( 00000, 00000, 00000, 00000 ), // 1
        B( 00000, 00000, 00000, 00000 ), // 2
        B( 00100, 00000, 00100, 00100 ), // 3
        B( 00100, 00100, 00100, 00100 ), // 4
        B( 00100, 00100, 00100, 00100 ), // 5
        ...

Alternatively, you could define this data in 6502 assembly, and call it from C/ARM directly. I store all my CDFJ data in assembly banks, as I found C arrays use overhead-bytes in ROM. Also, as you're only using 20 bits per line, you don't need the full 32 bits of int.
+Andrew Davie (Author) Posted January 23, 2021

Just now, Dionoid said: Alternatively, you could define this data in 6502 assembly, and call it from C/ARM directly. I store all my CDFJ data in assembly banks, as I found C arrays use overhead-bytes in ROM. Also, as you're only using 20 bits per line, you don't need the full 32 bits of int.

Noted, and thanks for the comments. I was wondering what the advantages were. Having said that, the issue here is speed, not memory.
+Andrew Davie (Author) Posted January 23, 2021 (edited)

Just recording the best RGB triplet I've found so far...

    const int RGB[] = { 0x32, 0xD6, 0x82 };

White pieces use colours 0, 1, 2 (and 0, 2 for shading)
Black pieces use colour 0 (and 1 for shading)
Squares use colour 3

NOTE: It doesn't look like this to the eye... each line "shimmers", quite unlike interlaced flicker, but kind of like it. It's a 20Hz colour interlace with the lines in a static (same) position. Hard to describe, even harder to emulate.

These result in fairly "rich/vibrant" colours. Image is 1/3 frames of the ICC triplet (i.e., just a screen grab). Stella phosphor is set to 30% per others' recommendations.

Edit: brighter with...

    const int RGB[] = { 0x36, 0xDA, 0x86 };

Edited January 23, 2021 by Andrew Davie
+SpiceWare Posted January 23, 2021

12 hours ago, Andrew Davie said: I guess I'm asking nicely for someone to test the attached binary on their Harmony Cart and see if I've fixed the timing and the screen no longer rolls.

Still rolls

12 hours ago, Andrew Davie said: A fair whack of (ARM) memory; 6 bytes x 192 scanlines x 3 colours (RGB) * 2 buffer sets = 6912 bytes.

In CDFJ the data streams for the 6507 are restricted to the 4K Display Data RAM. Display Data wraps, so after fetching $0fff the next byte would be returned from $0000. The forthcoming CDFJ+ supports more memory, and Display Data is no longer limited to 4K.

On 11/16/2020 at 2:42 PM, SpiceWare said: ... CDFJ+ is currently in development. Still uses 2K of ROM and 2K of RAM, but supports a newer Melody board with configurations of:

64K ROM & 16K RAM
128K ROM & 16K RAM
256K ROM & 32K RAM
512K ROM & 32K RAM

There's at least one game in the pipeline that'll be using CDFJ+.
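The memory arithmetic in this exchange can be checked mechanically. The sketch below recomputes the buffer sizes quoted in the thread and compares them with the 4K Display Data limit. The constant names and wrapDisplayAddress() are illustrative assumptions and are not part of the CDFJ sources; the wrap function only models the "$0fff is followed by $0000" behaviour described in the post.

```c
#include <assert.h>

enum {
    PF_BYTES_PER_SCANLINE = 6,    /* PF0/PF1/PF2 for the left and right halves */
    VISIBLE_SCANLINES     = 192,
    ICC_FRAMES            = 3,    /* one buffer per colour phase */
    BUFFER_SETS           = 2,    /* one set displayed, one set drawn into */
    DISPLAY_DATA_BYTES    = 4096, /* 4K Display Data RAM */

    SINGLE_FRAME_BYTES        = PF_BYTES_PER_SCANLINE * VISIBLE_SCANLINES,
    DOUBLE_BUFFERED_ICC_BYTES = SINGLE_FRAME_BYTES * ICC_FRAMES * BUFFER_SETS
};

static_assert(SINGLE_FRAME_BYTES == 1152, "6 x 192 bytes for the current draw");
static_assert(DOUBLE_BUFFERED_ICC_BYTES == 6912, "figure quoted in the thread");
static_assert(DOUBLE_BUFFERED_ICC_BYTES > DISPLAY_DATA_BYTES,
              "the double-buffered scheme exceeds 4K");

/* Model of the wrap: offsets into Display Data are taken modulo 4K. */
static unsigned wrapDisplayAddress(unsigned address)
{
    return address & 0x0FFFu;
}
```

Under these figures a single frame's six playfield buffers (1152 bytes) fit comfortably, while the 6912-byte double-buffered ICC layout does not fit in 4K of Display Data.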
+Andrew Davie (Author) Posted January 23, 2021

3 hours ago, SpiceWare said: Still rolls

Thanks for testing. That's a shame. The frame itself is only 274 scanlines; hopefully that's not going to cause the TV issues. Just checking.
+SpiceWare Posted January 24, 2021

3 hours ago, Andrew Davie said: Thanks for testing. That's a shame. The frame itself is only 274 scanlines; hopefully that's not going to cause the TV issues. Just checking.

C=1084S monitor, can handle 274 just fine.
+Andrew Davie (Author) Posted January 24, 2021

This is a "how it works" version showing in slow-motion the three ICC frames. This is playing at 1/8 normal cycling speed, so you can clearly see the makeup of each ICC 'pixel'. If you look, for example, at the green dot in both the normal (60 Hz) version, and in this (7.5 Hz) version, you can see how the green scanlines "roll" over 3 lines. But in the fast version this isn't readily apparent, and the whole 3 lines merges into a reasonable "green".

CDFJChess.bin
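The rolling can be modelled in a couple of lines. This is a minimal sketch of the scheme as described in the prose of this thread, assuming the colour component advances by one on every scanline and by one on every frame; it is not the posted draw code, and iccComponent() is a hypothetical name.

```c
#include <assert.h>

enum { ICC_COMPONENTS = 3, SCANLINES_PER_SQUARE = 24 };

/* A square is a whole number of colour cycles tall, so in this model every
   square starts on the same component as the square above it. */
static_assert(SCANLINES_PER_SQUARE % ICC_COMPONENTS == 0,
              "square height must be a multiple of the colour cycle");

/* Index (0, 1 or 2) of the colour component shown on `scanline` in `frame`. */
static unsigned iccComponent(unsigned frame, unsigned scanline)
{
    return (frame + scanline) % ICC_COMPONENTS;
}
```

Any given scanline returns to its starting component after three frames, which at 60 frames per second is the 20 Hz cycle mentioned earlier in the thread.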
+Andrew Davie (Author) Posted January 24, 2021 (edited)

14 hours ago, SpiceWare said: Still rolls

It's been bugging me big-time. So close. OK, I've worked a new algorithm for drawing the screen. This is at least twice as quick as the last version. I'm running out of magic tricks for speedups. Fingers crossed that this one will work though.

CDFJChess.bin

Edited January 24, 2021 by Andrew Davie (Even more speedup (TM))
+Andrew Davie (Author) Posted January 24, 2021

This is the entirety of the new draw. Not much room left for optimisation of code. Perhaps data structures might be changed for efficiency; but there's not a lot there I can see to improve.

    // The draw of all pieces
    for (int row = 0; row < 8; row++) {
        for (int half = 0; half < 2; half++) {

            // column A = PF0<-D4D5D6D7, PF1 D7
            // column B = PF1 D6D5D4D3D2
            // column C = PF1 D1D0, PF2<-D0D1D2
            // column D = PF2<-D3D4D5D6D7

            piece = screen[row][half * 4] + (row & 1) * 32;
            im = *charSet[piece];
            piece2 = screen[row][1 + half * 4] + ((row + 1) & 1) * 32;
            im2 = *charSet[piece2];
            piece3 = screen[row][2 + half * 4] + ((row + 2) & 1) * 32;
            im3 = *charSet[piece3];
            piece4 = screen[row][3 + half * 4] + ((row + 3) & 1) * 32;
            im4 = *charSet[piece4];

            scanline = row * 24;
            for (int y = 0; y < 24; y++) {
                int shifter = rgb * 5;
                pixels = *im++ >> shifter;
                pixels2 = *im2++ >> shifter & 0b11111;
                pixels3 = *im3++ >> shifter & 0b11111;
                pixels4 = *im4++ >> shifter & 0b11111;

                arenas[half][PF0][scanline] = BitReversal(pixels >> 1);
                arenas[half][PF1][scanline] = ((pixels << 7) & 0b10000000) | (pixels2 << 2) | (pixels3 >> 3);
                arenas[half][PF2][scanline] = BitReversal(pixels3) >> 5 | BitReversal(pixels4);

                scanline++;
                if (++rgb > 2) rgb = 0;
            }
        }
    }
+Andrew Davie (Author) Posted January 24, 2021 (edited)

I see one more thing; replace 'BitReversal' with table lookup.

... and if I get really desperate (I'm close)... change the shape definitions to have both normal and mirrored shapes so that I don't have to do the bit reversal at runtime.

Edited January 24, 2021 by Andrew Davie
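A table-driven replacement might look like the sketch below. It assumes BitReversal() reverses the 8 bits of a byte, which is what the masks in the posted draw code suggest but is not confirmed by the source; reverseTable, initReverseTable() and reverse8() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 256-entry lookup: reverseTable[b] holds b with its 8 bits in reverse order. */
static uint8_t reverseTable[256];

/* Fill the table once at start-up. */
static void initReverseTable(void)
{
    for (unsigned i = 0; i < 256; i++) {
        unsigned value = i;
        unsigned reversed = 0;
        for (int bit = 0; bit < 8; bit++) {
            reversed = (reversed << 1) | (value & 1u);
            value >>= 1;
        }
        reverseTable[i] = (uint8_t)reversed;
    }
    assert(reverseTable[0x01] == 0x80); /* sanity check: bit 0 lands in bit 7 */
}

/* One array read per call instead of a loop or a chain of shifts. */
static inline uint8_t reverse8(uint8_t value)
{
    return reverseTable[value];
}
```

For a 5-bit shape scanline such as 0x1F, reverse8() yields 0xF8, which lines the pixels up with the upper five bits of PF2. The table costs 256 bytes; it could also be declared const and precomputed so that it lives in ROM, and whether a memory read actually beats the current routine on the cart would need measuring on hardware.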