42bs, posted May 24:
Blitter registers must be written with long access. That is the reason for the (r14+x) opcode. But the blitter likely needs more time to draw than it takes to shuffle everything into the right format. Are you using the painter's algo to draw the stripes or just the visible parts?
SainT (Author), posted May 25:
9 hours ago, 42bs said: "Are you using the painter's algo to draw the stripes or just the visible parts?"
Just visible spans, front to back. It's amazingly efficient and easy to implement. Very elegant algorithm. 😄
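For readers who have not met the technique, here is a minimal sketch of drawing one screen column as visible spans, front to back. It is not SainT's code: the `Sample` struct, the `draw_column` function, and the `camera_height`, `horizon` and `scale` parameters are all hypothetical names chosen for illustration. The idea is to keep track of the highest row already covered and only draw the part of a farther sample that rises above it.

```c
#include <stdint.h>

#define SCREEN_H 200

/* Hypothetical terrain sample at one depth step along a view ray. */
typedef struct {
    uint8_t  height;
    uint16_t colour;
} Sample;

/*
 * Draw one screen column front to back (samples[0] is nearest).
 * Row 0 is the top of the screen. Returns the number of pixels written.
 * Each pixel is written at most once, so there is no overdraw.
 */
int draw_column(const Sample *samples, int depth_count,
                int camera_height, int horizon, int scale,
                uint16_t column[SCREEN_H])
{
    int y_min = SCREEN_H;  /* topmost row covered so far; nothing drawn yet */
    int written = 0;

    /* Stop early once the column is full. */
    for (int i = 0; i < depth_count && y_min > 0; i++) {
        int z = i + 1;
        /* Perspective projection of the sample height to a screen row. */
        int y = horizon + (camera_height - samples[i].height) * scale / z;
        if (y < 0)
            y = 0;
        if (y >= y_min)
            continue;  /* hidden behind nearer terrain */

        /* Only the newly visible span is drawn. */
        for (int row = y; row < y_min; row++) {
            column[row] = samples[i].colour;
            written++;
        }
        y_min = y;
    }
    return written;
}
```

On real hardware the inner `for` over rows is the part one would hand to a fill unit such as the blitter; the bookkeeping around it stays the same.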
agradeneu, posted May 25:
Any video of how it moves?
SainT (Author), posted May 25:
7 hours ago, agradeneu said: "Any video of how it moves?"
Not yet, but I will do before long! A little bit more optimisation (and bug fixing where blits go across map boundaries) and I'll show it in action!
Sporadic, posted May 25:
1 hour ago, SainT said: "Not yet, but I will do before long! A little bit more optimisation (and bug fixing where blits go across map boundaries) and I'll show it in action!"
I've never tried it, but would the blitter mask register be a cheap way to allow the map boundaries to wrap?
42bs, posted May 26:
9 hours ago, Sporadic said: "I've never tried it, but would the blitter mask register be a cheap way to allow the map boundaries to wrap?"
The blitter is used to draw the spans. The map (at least in my case) is in main RAM and "wrapping" is just an "and" on the map pointer.
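A minimal sketch of that kind of wrapping, assuming a square map whose side is a power of two. The 1024 size and the `map_index` name are assumptions for illustration, not taken from 42bs's code.

```c
#include <stdint.h>

/* Assumed map dimensions: must be a power of two for the mask to work. */
#define MAP_SIZE 1024u
#define MAP_MASK (MAP_SIZE - 1u)

/*
 * Turn any (x, y), including negative or out-of-range values,
 * into an index inside the map. The AND replaces a compare and
 * branch (or a modulo) on every map read.
 */
static inline uint32_t map_index(int32_t x, int32_t y)
{
    uint32_t wx = (uint32_t)x & MAP_MASK;
    uint32_t wy = (uint32_t)y & MAP_MASK;
    return wy * MAP_SIZE + wx;
}
```

Stepping off the right edge (`x == 1024`) lands on column 0, and `x == -1` lands on column 1023, so the landscape tiles endlessly in both directions.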
Sporadic, posted May 26:
2 hours ago, 42bs said: "The blitter is used to draw the spans. The map (at least in my case) is in main RAM and "wrapping" is just an "and" on the map pointer."
I thought SainT said he was using the blitter to traverse the height map.
SainT (Author), posted May 26:
7 minutes ago, Sporadic said: "I thought SainT said he was using the blitter to traverse the height map."
Yep, in my code the blitter is used in the opposite way to what you'd perhaps expect! 😄 Unfortunately the AND mask is on A2 and I'm using A1 for the map traversal. Otherwise that would have done the job. 😞 I should be able to use the step update in the outer blitter loop to subtract the map width / height at the appropriate place to get it to wrap, although it wouldn't work for wrapping in both x and y without multiple blits. That is the one downside of this method.
42bs, posted May 26:
So you copy the visible part of the map rotated into the GPU RAM?
SainT (Author), posted May 26:
5 hours ago, 42bs said: "So you copy the visible part of the map rotated into the GPU RAM?"
Basically, yes. The blitter replaces the map position increment and read in the inner loop such that the GPU just has a local linear buffer to work with. It makes the inner loop much quicker.
42bs, posted May 26:
Oh, I found waiting for the blitter drawing the spans is the most time-consuming part. Even the "div" for the z axis does not have much of an impact. Do you have a separate color map or do you derive the color from the height?
SainT (Author), posted May 26 (edited May 26):
9 minutes ago, 42bs said: "Oh, I found waiting for the blitter drawing the spans is the most time-consuming part. Even the "div" for the z axis does not have much of an impact. Do you have a separate color map or do you derive the color from the height?"
I use one over z and a multiply in the inner loop for perspective projection. When you're doing 16,000 iterations in your inner loop (160 wide with 100 depth samples) then every last cycle helps. I've not even started using the blitter for rendering yet, but I will do. As the height map is in local RAM, the blitter can fill in parallel quite nicely. There is a separate colour and height map, but they're interleaved such that a single 32-bit map entry has 8-bit height and 16-bit colour.
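A minimal sketch of the "one over z and multiply" idea, together with unpacking an interleaved map entry. Everything here is an assumption for illustration: the 16.16 fixed-point format, the `inv_z` table, the function names, and above all the bit layout of the 32-bit entry, which the post does not specify.

```c
#include <stdint.h>

#define DEPTH_SAMPLES 100
#define FIX_ONE       65536  /* 1.0 in 16.16 fixed point (assumed format) */

/* inv_z[z] holds 1/z in 16.16 fixed point; index 0 is unused. */
static uint32_t inv_z[DEPTH_SAMPLES + 1];

/* Pay for the divisions once, outside the 160 x 100 inner loop. */
void build_inv_z(void)
{
    for (int z = 1; z <= DEPTH_SAMPLES; z++)
        inv_z[z] = (uint32_t)FIX_ONE / (uint32_t)z;
}

/* Hypothetical entry layout: colour in bits 16..31, height in bits 0..7. */
static inline uint8_t entry_height(uint32_t entry)
{
    return (uint8_t)(entry & 0xFFu);
}

static inline uint16_t entry_colour(uint32_t entry)
{
    return (uint16_t)(entry >> 16);
}

/*
 * Approximates rel_height * scale / z with a multiply by the
 * precomputed reciprocal. The final division by a power of two is
 * written as a division (not a shift) so negative values are
 * handled portably; a compiler will still reduce it to cheap code.
 */
int32_t project(int32_t rel_height, int32_t scale, int z)
{
    int64_t product = (int64_t)rel_height * scale * (int64_t)inv_z[z];
    return (int32_t)(product / FIX_ONE);
}
```

Because the reciprocal is truncated, results can be off by one for depths that do not divide 65536 exactly; for a landscape renderer that error is normally invisible.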
42bs, posted May 26:
1 hour ago, SainT said: "I use one over z and a multiply in the inner loop for perspective projection. When you're doing 16,000 iterations in your inner loop (160 wide with 100 depth samples) then every last cycle helps. I've not even started using the blitter for rendering yet, but I will do. As the height map is in local RAM, the blitter can fill in parallel quite nicely. There is a separate colour and height map, but they're interleaved such that a single 32-bit map entry has 8-bit height and 16-bit colour."
Ouch, so you're drawing "by hand"? I made a 192x200x112 and a 320x200x112 version. The difference is massive. Keen to see your stuff in action.
SainT (Author), posted May 26:
Just now, 42bs said: "Ouch, so you're drawing "by hand"? I made a 192x200x112 and a 320x200x112 version. The difference is massive. Keen to see your stuff in action."
Yep, all drawn on the GPU! 😆 If I just draw one pixel per span it goes from 30fps to 38fps, so I'm expecting similar gains by switching to the blitter for span rendering. A bit more tweaking and I'll dig the video capture device out!
42bs, posted May 26:
30FPS?! Z depth of 100? I guess I should try reading a map row with the blitter. 🙂
agradeneu, posted May 26:
On 5/25/2024 at 8:02 PM, SainT said: "Not yet, but I will do before long! A little bit more optimisation (and bug fixing where blits go across map boundaries) and I'll show it in action!"
Nice, looking forward to it!
SainT (Author), posted May 30 (edited May 30):
With some reasonable (lower) quality settings I can get a pretty consistent 50fps, so it's possible you could get a game running at about 25fps with this kind of landscape.
I got around the clipping / wrapping of map data into GPU RAM by just implementing a sliding window on the GD instead. There is a virtual 512x512 window in memory whose top left corner you can set to a position from 0-1023, and it reads that subsection of a 1024x1024 map. This way the blitter is always rendering from 256,256 and I just alter the map read position on the GD. This also means it's reading the map data directly from the cart. This cost a couple of fps, but it makes things much more flexible, and the clipping would have cost a bit of time too, so in general it's a good tradeoff.
The voxel map is now 8-bit colour and 8-bit height, so a 1024x1024 map is 2MB. It could go bigger, as you can access the whole 16MB cart space on the GD and load from memory card if you wanted.
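To make the window arithmetic concrete, here is a minimal sketch of how a coordinate inside the 512x512 window could map to a byte offset in the 1024x1024 map. The GD's actual interface is not described in the post, so the function names are hypothetical, and the wrap at the map edge is an assumption made for this sketch.

```c
#include <stdint.h>

#define MAP_DIM         1024u  /* full map is 1024 x 1024 entries    */
#define WINDOW_DIM      512u   /* visible window is 512 x 512        */
#define BYTES_PER_ENTRY 2u     /* 8-bit colour + 8-bit height        */

/*
 * (left, top) is the window's top left corner in map coordinates,
 * 0..1023. (wx, wy) is a position inside the window, 0..511.
 * Assumption: positions past the map edge wrap to the other side.
 */
uint32_t window_to_map_offset(uint32_t left, uint32_t top,
                              uint32_t wx, uint32_t wy)
{
    uint32_t mx = (left + wx) & (MAP_DIM - 1u);
    uint32_t my = (top + wy) & (MAP_DIM - 1u);
    return (my * MAP_DIM + mx) * BYTES_PER_ENTRY;
}

/* Total map size in bytes: 1024 * 1024 * 2 = 2 MB. */
uint32_t map_size_bytes(void)
{
    return MAP_DIM * MAP_DIM * BYTES_PER_ENTRY;
}
```

With this arrangement the renderer always reads around the fixed window centre (256,256) and movement is expressed purely by changing `left` and `top`, which is why no clipping is needed on the Jaguar side.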
42bs, posted May 30:
16MB cardspace? I need to read some docs I guess. So what is the resolution now? 160x200 or less? Or even more?
SainT (Author), posted May 30 (edited May 30):
31 minutes ago, 42bs said: "16MB cardspace? I need to read some docs I guess. So what is the resolution now? 160x200 or less? Or even more?"
The GD has 16MB of RAM onboard, and you can page it in and out of the 6MB physical space the Jag provides. This allows you to access it via a sliding window as well. The rendered resolution is 160x200; it's rendering into a 320x200 image so that it's then easy to composite sprites over the top.
alucardX, posted May 30:
Your work looks great @SainT!
SainT (Author), posted May 31:
Hmmm, this may be a daft question, but how the hell do you get the blitter to write Z data? I just want to write some Z data with the column. The image is set up with a pitch of 2 and a z offset of 1, so it's interleaved pixel and z data. No matter what settings I've tried (PIXEL16|WID320|XADDPIX|ZOFFS1|PITCH2|DSTWRZ for example), nothing gets written to the Z data. I've tried pixel mode, phrase mode, reading z data, not reading z data, and it all ends up writing nothing. Is there something I'm missing, like it only writes Z using A1 as dest or something?
42bs, posted May 31:
I use this in the jag_ball demo:
    movei #(1<<18)|BLIT_DSTENZ|BLIT_DSTWRZ|BLIT_PATDSEL
for CMD, and set BLIT_SRCZ1.
In the demo I interleave screen0, screen1 and Z (no need to waste the Z buffer twice) and set A1 to
    BLIT_XADDPIX|BLIT_WID320|BLIT_ZOFFS1|BLIT_PIXEL16|BLIT_PITCH3
or
    BLIT_XADDPIX|BLIT_WID320|BLIT_ZOFFS2|BLIT_PIXEL16|BLIT_PITCH3
https://github.com/42Bastian/JaguarDemos/tree/main/jag_ball
SainT (Author), posted May 31:
I've been a complete idiot, I was using the DSTWRZ flag in the wrong register! Now I've put it in CMD, it's doing what I'd expect! 🙄
SainT (Author), posted May 31:
Ouch, the Z buffer completely kills performance. Even just having the video memory interleaved seems to kill it; it must be the number of page misses increasing. I was hoping it wouldn't be this bad.
42bs, posted May 31:
30 minutes ago, SainT said: "Ouch, the Z buffer completely kills performance. Even just having the video memory interleaved seems to kill it; it must be the number of page misses increasing. I was hoping it wouldn't be this bad."
For the vertical stripes? But yes, you now have only half the pixels per page, plus the blitter has to read. All this is a performance killer.