Multicolor mode - the mode everybody wants


Asmusr

On 11/20/2022 at 12:40 PM, SteveB said:

Please make this two contests ... one in assembler and one in XB or other language ..

Fair enough, but the games in other language category shouldn't be allowed to use assembly at all 🙂.

 

Seriously, all entries would have to use assembly to a certain degree. What if you write two lines of XB and the rest is assembly support? And what about C, is that allowed to compete with XB?

 

We could also make the rules so that all entries are in their own category, then everybody wins 🙂.


Here is a demo in multicolor mode with 'sprites' and scrolling.

I'm using a linear screen buffer with one byte per pixel. The background image is 128x128 and is also stored in a RAM buffer (16 KiB). Each frame a screen is copied from the image into the screen buffer (depending on scroll position), then the 'sprites' are added, and finally the screen buffer is copied to the VDP. 
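To make the per-frame steps concrete, here is a rough Python sketch of the same pipeline (the demo itself is assembly; this is only a model). The function names, the sprite tuple format, treating colour 0 as transparent, and the use of the standard E/A name table layout when packing are all assumptions for illustration, not details taken from the demo.

```python
SCREEN_W, SCREEN_H = 64, 48    # multicolor mode resolution
IMAGE_W, IMAGE_H = 128, 128    # background image size from the post (16 KiB)

def compose_frame(image, scroll_x, scroll_y, sprites):
    """Copy a 64x48 window of the background, then draw the 'sprites' into it.

    image   : 128*128 bytes, one colour (0-15) per pixel
    sprites : list of (x, y, width, height, pixels); colour 0 is treated
              as transparent (an assumption for this sketch)
    """
    screen = bytearray(SCREEN_W * SCREEN_H)
    for y in range(SCREEN_H):
        src = (scroll_y + y) * IMAGE_W + scroll_x
        screen[y * SCREEN_W:(y + 1) * SCREEN_W] = image[src:src + SCREEN_W]
    for sx, sy, w, h, pixels in sprites:
        for j in range(h):
            for i in range(w):
                colour = pixels[j * w + i]
                x, y = sx + i, sy + j
                if colour and 0 <= x < SCREEN_W and 0 <= y < SCREEN_H:
                    screen[y * SCREEN_W + x] = colour
    return screen

def pack_for_vdp(screen):
    """Pack the byte-per-pixel buffer into the 1536-byte pattern table.

    Assumes the standard name table layout (>00->1F on four rows, then
    >20->3F, ...). Left pixel goes in the high nibble.
    """
    out = bytearray(SCREEN_W * SCREEN_H // 2)
    for y in range(SCREEN_H):
        for x in range(0, SCREEN_W, 2):
            offset = ((y // 8) * 32 + x // 2) * 8 + (y % 8)
            left = screen[y * SCREEN_W + x] & 0x0F
            right = screen[y * SCREEN_W + x + 1] & 0x0F
            out[offset] = (left << 4) | right
    return out
```

The byte-per-pixel buffer costs twice the RAM of a packed one, but it keeps the sprite drawing free of nibble masking; the packing happens once, in the final copy.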

multicolor2-8.bin


This MC talk has got my imagination going (but then what doesn't?) I did not see discussion yet of alternatives for screen table layout. 

Assuming the screen image table is static, and  you rewrite the whole pattern table on each frame. 

 

The "by the E/A" book scheme was 0-1F repeated 4 times. Then 20-3F , etc. 

 

I imagine a more contiguous arrangement in column-major order. 
 

Down one screen column: 0 four times, then 1 four times, etc. Next column starts with 6. 
 

CPU frame buffer would also be organized column-major. In ray casting, this would be a natural arrangement, as you fill  pixels column by column. The inner loop deals with packing 2 columns  per byte. 

 

Screen blits could push the entire pattern table in one go. 
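As a sketch of the two layouts (helper names are made up for illustration), this builds both name tables and computes where a given pixel ends up in the pattern table. With the column-major table the offset collapses to (x // 2) * 48 + y, so a CPU buffer organized by column pairs could be pushed to the VDP as one linear block.

```python
COLS, ROWS = 32, 24   # name table size; each entry shows a 2x2 pixel slice of its pattern

def name_table_book():
    """E/A book layout: >00->1F on four rows, then >20->3F on four rows, etc."""
    return [[(row // 4) * COLS + col for col in range(COLS)] for row in range(ROWS)]

def name_table_column_major():
    """Down a column: 0 four times, 1 four times, ... next column starts with 6."""
    return [[col * 6 + row // 4 for col in range(COLS)] for row in range(ROWS)]

def pattern_offset(name_table, x, y):
    """Offset into the pattern table of the byte holding pixel (x, y), 64x48 screen.

    A name table entry on row r displays bytes (r % 4) * 2 and (r % 4) * 2 + 1
    of its 8-byte pattern; each byte is one pixel row of two pixels.
    """
    row = y // 2
    name = name_table[row][x // 2]
    return name * 8 + (row % 4) * 2 + (y % 2)
```

For example, pattern_offset(name_table_column_major(), 2, 0) gives 48, the first byte of the second screen column.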
 


Next version of my demo, with a pretty decent frame rate. I tried rendering a couple of ways. The fastest I was able to come up with is this version, where the source frame buffer 128*64 is in memory in two versions: "normal" version and a version scrolled to the left by one pixel. The code is drawing a spinning object in there, which is drawn twice per frame, to both frame buffers (the normal and left shifted one). This approach allows fast writing to VDP RAM, without having to do shifting of the whole picture on each frame.
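A small Python model of the two-buffer idea, for one row of pixels. This is not the code from multicolor.asm; the function names and the zero padding at the right edge are assumptions for the sketch. The point is that a window starting at an odd pixel in the normal buffer is byte-aligned in the shifted buffer, so the copy to the VDP never has to shift nibbles.

```python
def pack(pixels):
    """Pack an even number of 4-bit pixels two per byte, left pixel in the high nibble."""
    return bytes((pixels[i] << 4) | pixels[i + 1] for i in range(0, len(pixels), 2))

def make_buffers(row):
    """Return (normal, shifted): the row packed as-is and scrolled left by one pixel."""
    normal = pack(row)
    shifted = pack(list(row[1:]) + [0])   # pad the right edge with colour 0
    return normal, shifted

def window(normal, shifted, x, width):
    """Bytes for `width` pixels starting at pixel x, copied without any shifting."""
    source = shifted if x & 1 else normal
    start = x // 2
    return source[start:start + width // 2]
```

The cost is that everything drawn into the frame buffer has to be drawn twice, which is what the demo does with the spinning object.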

Edit: the screen capture below is from js99er.net, and seems to show tearing, i.e. it's partially updated about halfway through.

image.thumb.png.8be7ec9648caf16120ec942f0248b424.png

multicolor.asm MULTICOLOR.bin

Edited by speccery

13 hours ago, FarmerPotato said:

This MC talk has got my imagination going (but then what doesn't?) I did not see discussion yet of alternatives for screen table layout. 

Assuming the screen image table is static, and  you rewrite the whole pattern table on each frame. 

 

The "by the E/A" book scheme was 0-1F repeated 4 times. Then 20-3F , etc. 

 

I imagine a more contiguous arrangement in column-major order. 
 

Down one screen column: 0 four times, then 1 four times, etc. Next column starts with 6. 
 

CPU frame buffer would also be organized column-major. In ray casting, this would be a natural arrangement, as you fill  pixels column by column. The inner loop deals with packing 2 columns  per byte. 

 

Screen blits could push the entire pattern table in one go. 
 

That's what I was describing here:

I also made a mock raycaster demo to test the speed:

It's still not a very clean layout, because you have 2 pixels per column to deal with, but it can be very useful in some cases.

Edited by Asmusr

If we want to scroll the screen by pixels in multicolor mode as fast as possible, here is another idea:

 

Let's think of the multicolor screen as consisting of characters like in graphics mode, but in this case each character is 2 pixels wide and 8 pixels tall. In the name table (screen image table) a character is represented by a column of four identical numbers (names). There can be 32x6=192 such characters on the screen simultaneously, but we don't have to use a different character at each position, and we have a total of 256 characters to choose from.

 

Now we apply smooth scrolling techniques from graphics mode to these characters. We have a set of original characters and a map that we want to scroll through. We generate a new map by identifying unique pairs of adjacent characters in the original map, and from these 'transition characters' we generate a new set of characters for each scroll offset within a character. There is a full explanation of how this works in the Smooth scrolling thread https://forums.atariage.com/topic/210888-smooth-scrolling.

 

For horizontal scrolling there are only two scroll offsets, 0 and 1, so we can easily store a character set for each offset in VDP RAM. Every frame we flip between the character sets and every second frame we need to scroll the name table by one column. We can split the work between frames so we only have to update 768/2=384 bytes each frame.
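A minimal Python sketch of the horizontal case, assuming a character is 8 bytes with two 4-bit pixels per byte (left pixel in the high nibble). The function name and data structures are made up for illustration. It finds the unique adjacent pairs in the map, renumbers the map in terms of those pairs, and builds one character set per scroll offset.

```python
def build_transition_sets(charset, tile_map):
    """charset  : dict mapping original character number -> 8 pattern bytes
    tile_map : list of rows, each a list of original character numbers

    Returns (new_map, set0, set1). new_map refers to transition characters,
    one per unique (left, right) pair, and is one column narrower than the
    original. set0 holds the patterns at scroll offset 0, set1 at offset 1.
    """
    pairs = {}                      # (left, right) -> transition character number
    new_map = []
    for row in tile_map:
        new_row = []
        for left, right in zip(row, row[1:]):
            new_row.append(pairs.setdefault((left, right), len(pairs)))
        new_map.append(new_row)

    set0, set1 = [], []
    for left, right in pairs:       # dicts keep insertion order, so index == number
        set0.append(bytes(charset[left]))
        # Offset 1: right pixel of the left char, then left pixel of the right char.
        set1.append(bytes(((a & 0x0F) << 4) | (b >> 4)
                          for a, b in zip(charset[left], charset[right])))
    return new_map, set0, set1
```

The map only works for this technique if the number of unique pairs stays at or below 256, which is where the need for repetition in the graphics comes from.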

 

For vertical scrolling there are 8 scroll offsets (0 - 7) so we need to store 8 character sets. Since we don't have 16K VDP RAM available for this, we would need to limit the character sets to 128 characters (corresponding to 128 adjacent pairs in the original map). On the other hand we would have 8 frames to scroll the name table by 4 rows, so only 768/8=96 bytes to transfer to the VDP each frame!
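The vertical case and the memory arithmetic can be sketched the same way (function names are assumptions for illustration). A transition character at offset k is the bottom 8 - k pixel rows of the upper character followed by the top k rows of the lower one.

```python
def vertical_transition(top, bottom, offset):
    """Pattern seen after scrolling `offset` pixels (0-7) from `top` towards `bottom`.

    top, bottom : 8 pattern bytes each, one byte per pixel row
    """
    return bytes(top[offset:] + bottom[:offset])

def charset_bytes(num_chars, num_offsets=8, bytes_per_char=8):
    """VDP RAM needed to hold one character set per scroll offset."""
    return num_chars * num_offsets * bytes_per_char

full = charset_bytes(256)        # 16384 bytes: all 16K of VDP RAM
half = charset_bytes(128)        # 8192 bytes: leaves room for the name table etc.
per_frame = 768 // 8             # 96 name table bytes to transfer per frame
```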

 

The downside of this technique is that you cannot have unique graphics all over the screen, but would need some repetition. How much this would mean in practice is difficult to say without trying. We would probably want to work with metatiles of 4 characters next to each other to get a square of 8x8 pixels. Each metatile would use 3 characters internally and each adjacent pair of metatiles would use one more (that's for horizontal scrolling).

 

Another downside is that drawing multicolor sprites on top of the characters would be impossible, so we would have to use regular sprites only.

 

Unfortunately Magellan is not geared to handle this type of graphics, so it would require some tool development before it could be tried in practice.

 

 

Edited by Asmusr

12 hours ago, Asmusr said:

That's what I was describing here:

I also made a mock raycaster demo to test the speed:

It's still not a very clean layout, because you have 2 pixels per column to deal with, but it can be very useful in some cases.

Clear now!  I did not get it the first time. 

 

I'm thinking through a raycaster and how to represent geometry, as an alternative to keeping a gigantic raster map. I think about this while yardwork consumes my time.

 

I remember the magazine articles on Quake's binary space partitioning (BSP) trees, but that's too much.

 

I would like to try a straightforward line segment (i.e. walls) database for a raycaster. To start with, all I want is 4 bounding walls and an obstacle (a 4-sided box) inside it.

 

I'm familiar with (i.e. loved) Preparata & Shamos' Computational Geometry book and its algorithms for sorting and selecting line segments, though surely there are newer books. Line segments for walls, pre-sorted. Visibility tests: select objects in the frustum, normal test for facing, calculate intersection with rays. Reuse results for the next frame, add/drop visible objects dynamically.
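A brute-force starting point for the "4 bounding walls plus a box" case might look like the sketch below. It is plain floating-point Python for illustration only (the 9900 would need fixed point), it skips the sorting and frame-to-frame reuse mentioned above, and all names and coordinates are made up.

```python
def ray_segment(ox, oy, dx, dy, x1, y1, x2, y2):
    """Distance t along the ray (ox, oy) + t*(dx, dy) to the segment, or None."""
    ex, ey = x2 - x1, y2 - y1
    denom = dx * ey - dy * ex                         # cross(D, E)
    if abs(denom) < 1e-12:
        return None                                   # ray parallel to the wall
    t = ((x1 - ox) * ey - (y1 - oy) * ex) / denom     # position along the ray
    u = ((x1 - ox) * dy - (y1 - oy) * dx) / denom     # position along the wall, 0..1
    if t > 0 and 0 <= u <= 1:
        return t
    return None

def box(x0, y0, x1, y1):
    """Four wall segments of an axis-aligned box."""
    return [(x0, y0, x1, y0), (x1, y0, x1, y1), (x1, y1, x0, y1), (x0, y1, x0, y0)]

def nearest_hit(ox, oy, dx, dy, walls):
    """Smallest hit distance over all walls, or None if the ray hits nothing."""
    hits = [t for w in walls if (t := ray_segment(ox, oy, dx, dy, *w)) is not None]
    return min(hits) if hits else None

# Room of 10x10 units with a 2x2 obstacle in the middle (hypothetical numbers).
WALLS = box(0, 0, 10, 10) + box(4, 4, 6, 6)
```

With only 8 segments, testing every wall for every ray is probably fine; the pre-sorting only starts to pay off with a larger wall database.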

 

But I really should not start a new project.  Yet, yesterday I wrote all the initial MC routines I will need...

 


On 11/27/2022 at 11:40 AM, Asmusr said:

For horizontal scrolling there are only two scroll offsets, 0 and 1, so we can easily store a character set for each offset in VDP RAM. Every frame we flip between the character sets and every second frame we need to scroll the name table by one column. We can split the work between frames so we only have to update 768/2=384 bytes each frame.

 

For vertical scrolling there are 8 scroll offsets (0 - 7) so we need to store 8 character sets. Since we don't have 16K VDP RAM available for this, we would need to limit the character sets to 128 characters (corresponding to 128 adjacent pairs in the original map). On the other hand we would have 8 frames to scroll the name table by 4 rows, so only 768/8=96 bytes to transfer to the VDP each frame!

Thanks for the detailed explanation, although I'm not sure where the 8 scroll offsets vertically come from. Need to think about this more.

 

The version I posted on Saturday uses about 65000 clock cycles per frame (measured with my own emulator, so take it as a rough number) for the transfer from main memory to VDP, irrespective of whether the frame sits at an even or odd horizontal pixel offset, since it's using separate source frame buffers for horizontally even/odd aligned frames. This translates to around 46 fps, although that probably gets scaled down to 30 fps since the code waits for frame sync before each swap of pattern definitions. I think for now this performance is high enough for me as a proof of concept. Next I need to think about what kind of a game would be interesting with these super chunky pixels :) Of course, depending on the game, scrolling might not be required, but developing this code, which is also good for scrolling, is a good way to understand what kind of screen update frequencies we can expect in general. Like I mentioned in the Pandemic Zoom call on Saturday, I'm intrigued by the idea of making a lunar lander, with Artemis happening and all :)
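For reference, the arithmetic behind those two numbers, assuming a 3 MHz CPU clock and a 60 Hz (NTSC) display; both constants are assumptions, and a 50 Hz machine would give different results.

```python
import math

CPU_HZ = 3_000_000    # assumed TMS9900 clock in the 4A
VSYNC_HZ = 60         # assumed NTSC frame rate

def free_running_fps(cycles_per_frame):
    """Frame rate if the code never waits for the VDP."""
    return CPU_HZ / cycles_per_frame

def vsync_locked_fps(cycles_per_frame):
    """Frame rate when every frame waits for the next vertical sync."""
    cycles_per_vblank = CPU_HZ / VSYNC_HZ               # 50000 cycles
    vblanks_needed = math.ceil(cycles_per_frame / cycles_per_vblank)
    return VSYNC_HZ / vblanks_needed

print(round(free_running_fps(65000), 1))   # 46.2
print(vsync_locked_fps(65000))             # 30.0
```

Getting to a locked 60 fps would mean fitting the whole frame, transfer included, into 50000 cycles.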

Edited by speccery

2 hours ago, speccery said:

Thanks for the detailed explanation, although I'm not sure where the 8 scroll offsets vertically come from. Need to think about this more.

Because a multicolor character is 8 multicolor pixels tall, so transitioning between two characters takes 8 frames.


3 hours ago, speccery said:

The version I posted on Saturday uses about 65000 clock cycles per frame (measured with my own emulator, so take it as a rough number) for the transfer from main memory to VDP, irrespective of whether the frame sits at an even or odd horizontal pixel offset, since it's using separate source frame buffers for horizontally even/odd aligned frames.

Yes, the approach with two source frame buffers sounds like a good solution if you only have to update a limited part of the screen every frame, e.g. if your background is a static image and you draw your multicolor sprites in empty areas so you can erase them again.


11 hours ago, pixelpedant said:

Yeesh.  That is beautiful and wonderful and nuts. 

 

MSX developers from forever ago, always showing us how it ought to be done. 

 

How dare they.

Yeah. In some ways, the architecture of the MSX makes this kind of stuff easier. The Z80 is much better at moving bytes around than the 9900. It's a shame the 4A didn't have a 9995 - but it didn't exist when the 4A went into production. 

