
Any 3D game with flatshading on A800 ?


VladR


 

Yes. Only a couple of cycles are taken. You can see the CPU timing diagrams in this text file:

Antic_Timings.txt

 

"AA" is for lms address change so I guess it only fetches lo+hi byte of new address.

ps. I don't remember whether there were some corrections to this file, but 99% of it is correct and you can safely use it.

 

I didn't propose making such a complex game on A8 ;)

Maybe start with something simpler like version of Star Raiders with 3d ships ?

 

Source code is available and is full of nice info about 3d structures and math involved in making it all work on 6502:

https://github.com/lwiest/StarRaiders

 

Enjoy :)

 

1. Antic Timings : Thanks. A week ago this did not make much sense, but having read DeRe Atari and Mapping the Atari, it's making much more sense now. Looks like a quick look today answered my own question on Narrow mode impact on 6502.

2. Complex game : I understand. But there's an option - if we could use a highlevel C for top-level non-time critical game loop (menus, dialogs, trading, screens, maps, RPG elements) and only certain areas would be in ASM (e.g. 3D space combat). I suspect the C code would take gobs of RAM on A800, though. C would however allow a very quick iteration of those additional gameplay elements (that would take forever / never in ASM).

3. Star Raiders: Yes, that is the idea right now - a minimum playable thing is something like SR, but I don't want 6 DOF, just rotation (e.g. like in recent Rebel Galaxy on PS4, which showed we absolutely do not need 6 DOF in this genre)

4. Source code : Thanks - I saw the SRII src thread about a year ago. I'm however planning on porting my integer 3D math from Jaguar. It's mostly add/sub/shift/compare/load/store/jump anyway. Initially, I could use 28 bytes in page 0 and replace all of the Jaguar's register accesses with LDA/STA to page 0. A triangle drawing routine should be doable to port within one weekend. Also, I am not going to use the star generation method that SR does - it does an expensive projection transformation with division on each star. I've got a few ideas how to do a starfield without division, and for way fewer cycles (in theory).

 

All P/M and display list DMA is one cycle too early in that document, but that's minor. Virtual reads are also incorrect for modes 2-5 but there are very few circumstances in which that matters.

Thanks. Should not need these precise timings in near future as I want to avoid DLIs as they butcher framerate too much (for few additional colors).

What is a virtual read ?


DLIs are usually worth the tradeoff for cycles lost vs extra colour, objects, effects they can provide.

 

Really, if you're just doing general graphics rendering the cycle counting exercise doesn't need to be spot on. In any case, 3D rendering is a highly variable demand task.

 

Virtual reads as I understand it are where the graphics memory scan counter is updated in Antic but no DMA takes place, no cycle lost. You can see such instances when HScrolling is being used with larger values, the graphics DMA stops but the memory scan still gets incremented.


Using DLIs plus PMg shapes can save a lot of CPU cycles. Particularly if the moving object has to be updated every frame, or even more so if just some blocky overlay is needed.

 

 

Also: If "3d rendering" is only used for path and tunnels, and PMg is used for the "protagonist", things don't only need "flat fills" to have playable speeds.

 

Edited by emkay

Also, I am not going to use the star generation method that SR does - it does an expensive projection transformation with division on each star. I've got a few ideas how to do a starfield without division, and for way fewer cycles (in theory).

As long as you keep them looking fantastic like in original it's fine ;)

 

Imho those stars rotating around and flying by are responsible for half of immersion into game effect. Other part is sound of bullets fired... That game is crazy good with couple of those "simple" things done sooooo good.

 

I guess calculating projection of "stars" (they are actually simple points in space) is simpler in your case with no full freedom (like you said in that new game).

Reminds me of kind of 3d engine that game Moonfall uses (on c64, Amiga and Atari ST). It's based on a planet and only freedom of movement is up-down and left-right rotation. But it has trading, navigation, warp mode, action with lasers and missiles.

 

I didn't find video of C64 version but you can check it out yourself:

http://csdb.dk/search/?seinsel=all&search=moonfall&Go.x=0&Go.y=0&Go=Go

 

In any case perspective does not require division. It can be done using tables like here:

http://codebase64.org/doku.php?id=base:perspective
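
To make the table idea concrete, here is a minimal sketch in Python (just for prototyping, not 6502 code). The focal length, the Z range and the 8.8 fixed-point format are my own assumptions and are not taken from the linked page. The division happens once, when the table is built, and the per-point work is one lookup plus two multiplies (which on a 6502 would themselves come from tables).

```python
# Perspective projection via a reciprocal lookup table (no run-time division).
# FOCAL, Z_MIN, Z_MAX and the 8.8 fixed-point scale are illustrative assumptions.
FOCAL = 64
Z_MIN, Z_MAX = 1, 255

# Built once at start-up: RECIP[z] = round(FOCAL * 256 / z), an 8.8 fixed-point factor.
# Index 0 is a dummy entry so that RECIP can be indexed directly by z.
RECIP = [0] + [round(FOCAL * 256 / z) for z in range(Z_MIN, Z_MAX + 1)]

def project(x, y, z):
    """Project a 3D point to 2D using one table lookup and two multiplies."""
    scale = RECIP[z]                      # replaces the division by z
    return (x * scale) >> 8, (y * scale) >> 8

if __name__ == "__main__":
    print(project(100, -50, 128))
```

With these assumed values a point at z = 128 is scaled by 64/128, so (100, -50) lands at (50, -25), the same result the division would give.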


Hm.... I wonder if this thread also ends up in some nirvana ;)

 

Else, it would be very nice to have at least ONE 3D game for the Atari.

Exactly, for a STOCK Atari. Which means : ALL features of a STOCK Atari were used... even the undocumented opcodes and hardware features.

 

Could it be possible after 40 years ?


DLIs are usually worth the tradeoff for cycles lost vs extra colour, objects, effects they can provide.

 

Really, if you're just doing general graphics rendering the cycle counting exercise doesn't need to be spot on. In any case, 3D rendering is a highly variable demand task.

 

Virtual reads as I understand it are where the graphics memory scan counter is updated in Antic but no DMA takes place, no cycle lost. You can see such instances when HScrolling is being used with larger values, the graphics DMA stops but the memory scan still gets incremented.

For 2D games, the butchered CPU time via STA WSYNC is certainly not a big problem. Although, one has the option of exact cycle timing (which I luckily don't have to do for 3D engine) and avoiding STA WSYNC, if I am reading the documentation right.

 

Yes, the framerate will vary greatly - from 60 fps when spaceship is in distance, down to - uhm, no real idea right now (when polygons fill the full viewport) :)

 

@Virtual Read - thanks!

 

Also: If "3d rendering" is only used for path and tunnels, and PMg is used for the "protagonist", things don't only need "flat fills" to have playable speeds.

Very nice framerate on the Maze demo!

 

Hm.... I wonder if this thread also ends up in some nirvana ;)

 

Else, it would be very nice to have at least ONE 3D game for the Atari.

Exactly, for a STOCK Atari. Which means : ALL features of a STOCK Atari were used... even the undocumented opcodes and hardware features.

 

Could it be possible after 40 years ?

Don't know about stock. I'm not thinking of using 64KB at all. Perhaps stock 130 XE, but even that's pushing it. It's extremely easy to eat up all memory via Lookup tables and unrolled code (granted, such code can be generated at runtime). Let alone storing few full-screen gfx....

 

Atari has a LOT of features. I don't intend to spend a decade trying to learn all 2D tricks there are.

 

I do, however, intend to try to come up with some new 3D ones, especially with texturing. You never know what research&development brings, until you spend the effort. Looks like the research effort I spent on Jaguar is reusable without much butchering even on 1.79 MHz A800 - because on C64, if the 3D game falls down to 1 fps (60 * ~25,000 cycles = 1.5 Million cycles - which should be enough for some interesting 3D scenes), nobody seems to complain much, so we should be able to extend same courtesy to our Atari too :)

 

Yesterday, I was doing a quick port of my road texturing routines from the Jag (just in a text editor, with Excel for counting cycles), and with a specially created road texture, the drawing routine shouldn't take more than 7,000 cycles (128x32). That seems crazy to me, that the A800 could do that at 60 fps and still have about 15,000 cycles left for everything else (input/audio). Can't wait to configure my Altirra dev env, actually...


Don't know about stock. I'm not thinking of using 64KB at all. Perhaps stock 130 XE, but even that's pushing it. It's extremely easy to eat up all memory via Lookup tables and unrolled code (granted, such code can be generated at runtime). Let alone storing few full-screen gfx....

A stock machine allows ROM and RAM enhancements.


uj.... I am late at the party... but too much distracted at the moment with VBXE stuff and now PICO-8... ;)

 

well... I can only advise looking at codebase64 first to see the 6502 tricks in 3d stuff... I always run into accuracy issues but found the following points crucial

 

- fastmul table approach or LOG tables (Oxyron uses them sometimes) for your 3d-2d trans

- optimised rotation matrix approach for rotating around x,y,z

- 2d culling (with 4 EORs then standard 2d dot product...)

- persptrans

- poly rasteriser (I am just leaving bresenham for DDA for using scanedge poly filler)

- EOR filler unrolled

- subpixel rendering (not sure if handy in game scenario)
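
As an illustration of the "fastmul table" item in the list above, here is a small Python sketch of the well-known quarter-square trick: a*b = f(a+b) - f(a-b) with f(n) = n*n/4 (rounded down). The table name and the unsigned 8-bit operand range are my assumptions. A real 6502 version would split the table into lo/hi byte pages and handle signs separately.

```python
# Quarter-square multiply: a*b = f(a+b) - f(a-b), where f(n) = n*n // 4.
# The rounding errors cancel because a+b and a-b always have the same parity.
SQ4 = [n * n // 4 for n in range(511)]   # covers a+b for unsigned 8-bit a and b

def fastmul(a, b):
    """Multiply two unsigned 8-bit values using table lookups, one add and one sub."""
    return SQ4[a + b] - SQ4[abs(a - b)]

if __name__ == "__main__":
    print(fastmul(7, 4), fastmul(255, 255))
```

The point of the design is that the inner loop never multiplies: it only adds, subtracts and indexes, which is all a 6502 does cheaply.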

 

have a look not only on Star Raiders but more interesting are the Star Raiders 2 (unreleased sequel, not the commercial game!) with a lot of commented source code for midline 17 point line routines, cylindric coords, etc.

 

thanks to A8 the next bitch in 3d can be "minimized" - 2d clipping (3d clipping is more tricky). you can setup screen with "dirty buffers" top/bottom/left/right

 

@emkay... Rainbow Walker is not 3d like Wolf/Doom is not 3d... and Pole Position is not - Capture the Flag is ;) 3d here means "real 3d" not how the game is looking... and yes... Yoomp is not 3d either... Ballblazer is... not while RoF, Eidolon, Koronis Rift are...

 

The real bitch on 6502 is all that signed 16-bit or 24-bit math... it took me months/years to develop my 3d engine so far and when I moved, for fun, to PICO-8... it took less than 1 week to achieve better results than in 3 years of 6502 assembly coding :D

 

so coming from 68k or RISC environment can frustrate you when you do a e.g. -4 LSR and you realise that result is not -2 ;)
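
A tiny Python model of that pitfall, assuming 8-bit two's complement values: LSR always shifts a 0 into bit 7, so the sign is lost, while an arithmetic shift has to copy the sign bit back in (one common 6502 idiom for that is CMP #$80 followed by ROR A). The helper names are made up for this sketch.

```python
def lsr8(v):
    """Model of 6502 LSR on an 8-bit value: bit 7 is filled with 0."""
    return (v & 0xFF) >> 1

def asr8(v):
    """Arithmetic shift right: bit 7 (the sign) is copied back into the result."""
    v &= 0xFF
    return (v >> 1) | (v & 0x80)

def to_signed(v):
    """Interpret an 8-bit pattern as a two's complement number."""
    return v - 256 if v & 0x80 else v

if __name__ == "__main__":
    minus_four = -4 & 0xFF                                   # $FC
    print(to_signed(lsr8(minus_four)), to_signed(asr8(minus_four)))
```

With -4 ($FC) as input, the plain LSR gives $7E (126), while the arithmetic version gives $FE (-2).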

 

ah another advice... stick to triangles and not quads... makes life a lot easier again...


ah... lightsourced faces...

 

in a perfect world you could use "low-res" mode10 which gives you 9 color registers (i used some of them in the cube) so fading might be easier... otherwise your poly filler could take "dithered" colors.

 

C64 sometimes uses the so-called "dirty char" method... means you dynamically assign chars and draw segments into them... which might help (needs more explanation) but here you could set/clear/fill at a rate of 8x8 or 4x8 pixels (1 char). only limitation... the damned 128 chars vs 256.

 

and I often used 256 byte scanlines as Popmilo mentioned. thanks to Antic, it costs some DMA but pixel addressing is damned easy.

 

or Arsantica 2 used Rescue on Fractalus "double buffer"... left/right screen interleaved... so scanline is 96 bytes (or 128 bytes with some "dirty buffer left/right for clipping"). but you can have your unrolled code do a STA buffer,X where X can be put to screen1 (at 0-47) or X=48 to point to screen2 (48-95).


@emkay... Rainbow Walker is not 3d like Wolf/Doom is not 3d... and Pole Position is not - Capture the Flag is ;) 3d here means "real 3d" not how the game is looking... and yes... Yoomp is not 3d either... Ballblazer is... not while RoF, Eidolon, Koronis Rift are...

Since we have just small 8-bit computers in this forum, it is fully nuts to talk of "real 3D". The 3D should fit to get a game visible and playable , to get great 1st person look and feel.

The projection never can get "full 3d" as even today, a graphics card handles all 3d calculations, because newest CPUs were not able to handle fluent 3D scenes. Even more , not the newest PC is able to show real 3D fluent.

 

So, if at least the controls and gameplay is "3D" depending, everything would be fine on such a small 8 bit computer.


yeah. but I was referring to your Rainbow Walker "is 3d" as it is not... but look at RoF... it's "full 3d". or Star Raiders. So it's possible. 3d in my definition is having 2 or 3 controllable axis/camera, polygons, 3d points, perspective transformation.... if that's somehow in the "game" it's 3d... in my own definition my Arsantica 3 starfield is "not 3d" as it's only "animation" ;) precalculated...

 

But here discussion was about "3d engines" from Jaguar to A8.


As long as you keep them looking fantastic like in original it's fine ;)

Well :)

There's always going to be some trade-off between realism and speed, especially on 1.79 MHz.

But, I'm doing high-level experimenting with the algorithm / lookup tables, and the feel, in C from Visual Studio and keep all versions active, so it's always easy to switch to any of them (and tweak any of them).

 

 

Imho those stars rotating around and flying by are responsible for half of immersion into game effect. Other part is sound of bullets fired... That game is crazy good with couple of those "simple" things done sooooo good.

More like 95%, as 95% of time you are staring just at those stars :)

So, in a way, it's justified it's using perspective projection on the stars. I just disagree with using division at run-time ;)

Then again, I have all the time in the world, there's no commercial deadline/pressure...

 

 

I guess calculating projection of "stars" (they are actually simple points in space) is simpler in your case with no full freedom (like you said in that new game).

 

In any case perspective does not require division. It can be done using tables like here:

http://codebase64.org/doku.php?id=base:perspective

 

Very good link! Will come in handy for sure in transition from RISC to 6502!

 

I've been using LUTs to remove division, initially even on jag, while I was working from C (and just 68000). The division was so incredibly slow from C on 68000, that tables were practically a requirement from day one. I usually only kept the first version of any code with division for reference, as it was unusably slow at run-time.

 

When I switched to GPU RISC ASM 6 months ago, even though div there takes only 16 cycles on GPU, it was still brutally slowed down, when I used one div per scanline. So, right now, I only use division for perspective projection there and all flatshading works with bitshifting.


ah... lightsourced faces...

 

in a perfect world you could use "low-res" mode10 which gives you 9 color registers (i used some of them in the cube) so fading might be easier... otherwise your poly filler could take "dithered" colors.

 

C64 sometimes uses the so-called "dirty char" method... means you dynamically assign chars and draw segments into them... which might help (needs more explanation) but here you could set/clear/fill at a rate of 8x8 or 4x8 pixels (1 char). only limitation... the damned 128 chars vs 256.

 

and I often used 256 byte scanlines as Popmilo mentioned. thanks to Antic, it costs some DMA but pixel addressing is damned easy.

 

or Arsantica 2 used Rescue on Fractalus "double buffer"... left/right screen interleaved... so scanline is 96 bytes (or 128 bytes with some "dirty buffer left/right for clipping"). but you can have your unrolled code do a STA buffer,X where X can be put to screen1 (at 0-47) or X=48 to point to screen2 (48-95).

- mode 10 : I'll consider it for in-game sequences, but for gameplay, it's just too lowres.

 

- char mode : I don't understand how the benefit of breaking polygon rasterizing into a charset outweighs the complexity ? That does not seem like an easy feature to do efficiently each frame. I mean, I could probably reimplement the scanline fill function so that it would "draw" scanlines into charsets, but that's far from efficient. What exactly does charset mode give us, other than fast clearing ? I mean, we have to use a DLI to switch to a second charset (or perhaps even a third). There's the 5th color, though - that would be nice, for sure...

 

- 256-Byte scanline : the Antic impact should be just 3 bytes (=cycles) / scanline, right ? So, that's 3*192 = 576 ? Or wait, that's each frame, damn - so we're down ~35k cycles per second on this. On the other hand, using LUT to get screen address is not free either, so I suspect this is one of these thing that are just worth it for the coder's comfort, right ?

 

- Fractalus : double buffer - I don't think I understand - What's the advantage doing it this way ? Reusing same unrolled code (just with different X reg), I guess ?
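
For the 256-byte scanline point above, a rough Python sketch of both sides of the trade-off. The 3 cycles per scanline figure is simply the assumption from my estimate above (not a measured value), and the function names are made up. With 256-byte rows and a page-aligned base, the row number is just the high byte of the address, so no multiply or address LUT is needed.

```python
ROWS = 192
FPS = 60
EXTRA_CYCLES_PER_ROW = 3   # assumption from the post: per-scanline LMS cost

def pixel_address(base, x_byte, y):
    """With 256-byte rows, y goes straight into the high byte of the address."""
    return base + (y << 8) + x_byte

def lms_overhead_per_second():
    """Cycles per second spent on the extra per-line LMS fetches (assumed cost)."""
    return EXTRA_CYCLES_PER_ROW * ROWS * FPS

if __name__ == "__main__":
    print(hex(pixel_address(0x4000, 5, 3)), lms_overhead_per_second())
```

Under that assumption the cost is 3 * 192 * 60 = 34,560 cycles per second, which matches the ~35k estimate.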

 

Re Starfield....

 

can be simplified to death - see Arsantica 3 loader... just "few" tables and all stars are a simple combination of those tables.

Yeah. I've got currently 3 LUTs, and am only using add/sub as operations. I have yet to introduce perspective, though. That will be another LUT.

 

Nice frame rate, but the view distance is the pits.

Sure, but the technique could be reused for a slightly angled, top-down 3D game, which would automatically reduce the complexity of frustum culling down to traversing a simple 2D grid with pretransformed vertex coords. Floor would use the same texturing routine as the vertical walls. The only problem would be the walls, where a simplified texturing would have to be implemented (or in worst case only the floor would be textured, and walls would be flatshaded). At 128x96, this could run at around 15 fps.


regarding chars:

 

http://codebase64.org/doku.php?id=base:filling_the_vectors

 

check out:

 

https://youtu.be/9SRRTgo-LWA?t=357

 

can not find right now at codebase bitbreakers explanation of that vector "dirty char" approach...

 

but it is simple... you assign each block you hit with your line to a new char in charset...


re: RoF double buffer...

 

think of having 2 screens side by side

 

screen1 - screen 2 line 0

screen 1- screen 2 line 1

...

 

so instead of 2 separate buffers you have 2 but interleaved...

 

advantage is that you need only 1 clear, EOR filler, etc. routine, as you address the unrolled code chunks with an X or Y offset register...
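
A small Python model of the interleaved layout, to show why one routine is enough. The 96-byte row with screen 1 in bytes 0-47 and screen 2 in bytes 48-95 follows the description in this thread; the function names and the row count are made up for the sketch. The `x_offset` argument plays the role of the X register in `STA buffer,X`.

```python
ROW_BYTES = 96       # one interleaved row: screen 1 = bytes 0-47, screen 2 = bytes 48-95
SCREEN_BYTES = 48
ROWS = 96            # assumed height, only for this demo

def make_buffer(fill=0xFF):
    """Allocate the shared buffer, pre-filled so that clearing is visible."""
    return bytearray([fill]) * (ROW_BYTES * ROWS)

def clear_screen(buf, x_offset):
    """Single clear routine for both screens: x_offset is 0 or 48 (the 'X register')."""
    for row in range(ROWS):
        start = row * ROW_BYTES + x_offset
        buf[start:start + SCREEN_BYTES] = bytes(SCREEN_BYTES)

if __name__ == "__main__":
    buf = make_buffer()
    clear_screen(buf, 48)          # clears screen 2 only, screen 1 stays intact
    print(buf[0], buf[48])
```

The same idea applies to the filler and the line routine: the unrolled code is generated once, and only the index register decides which of the two screens it touches.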


Dirty char does not mean mimicking bitmap mode... that would not make sense, but think of vector gfx... most of the time it can be built from a few chars.... and as I have written, you calc where your line hits the screen... now you take a new char "from" the stack if one is not already on screen at that spot... put it on screen and draw into that char...

 

Speed gain in big filled vectors, clear screen etc... it's not so complex as it might look at the moment...
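
Roughly, the allocation step described above could look like this in Python. This is only a sketch under my own assumptions: 1 bit per pixel, 8x8 cells, a 40x24 character screen, char 0 reserved as the shared blank cell, and the 128-char limit mentioned earlier in the thread. The class and method names are hypothetical.

```python
COLS, ROWS = 40, 24      # assumed character screen size
MAX_CHARS = 128          # charset limit mentioned earlier in the thread
CHAR_BYTES = 8           # 8x8 cell, 1 bit per pixel (assumption)

class DirtyChars:
    def __init__(self):
        self.screen = [[0] * COLS for _ in range(ROWS)]   # char 0 = shared blank cell
        self.charset = bytearray(MAX_CHARS * CHAR_BYTES)
        self.next_free = 1                                # top of the "char stack"

    def char_at(self, col, row):
        """Return the char used by this cell, allocating a fresh one on first touch."""
        ch = self.screen[row][col]
        if ch == 0:
            if self.next_free >= MAX_CHARS:
                raise RuntimeError("out of characters")
            ch = self.next_free
            self.next_free += 1
            self.screen[row][col] = ch
        return ch

    def plot(self, x, y):
        """Set one pixel by drawing into the char that covers it."""
        ch = self.char_at(x // 8, y // 8)
        self.charset[ch * CHAR_BYTES + (y % 8)] |= 0x80 >> (x % 8)
```

Untouched cells keep pointing at the blank char, so clearing the frame only means resetting the screen map and the chars that were actually handed out.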


yeah. but I was referring to your Rainbow Walker "is 3d" as it is not...

You read my text correctly?

 

I wrote something like Rainbow Walker is more 3D, than people might realize. The Rainbow is not just some moving screen FX. The tiles were calculated in depth. You know how that would look like without the additional depth calculation?


regarding chars:

 

http://codebase64.org/doku.php?id=base:filling_the_vectors

 

check out:

 

https://youtu.be/9SRRTgo-LWA?t=357

 

can not find right now at codebase bitbreakers explanation of that vector "dirty char" approach...

 

but it is simple... you assign each block you hit with your line to a new char in charset...

Damn, that demo is awesome ! They must have some really fast line drawing routine for that terrain.

I suppose the StarRaider midpoint idea (of the 17-point line) could be quickly implemented as a first version...

 

re: RoF double buffer...

 

think of having 2 screens side by side

 

screen1 - screen 2 line 0

screen 1- screen 2 line 1

...

 

so instead of 2 separate buffers you have 2 but interleaved...

 

advantage is that you need only 1 clear, EOR filler, etc. routine, as you address the unrolled code chunks with an X or Y offset register...

Thanks for confirming the code size savings.

 

Dirty char does not mean mimicking bitmap mode... that would not make sense, but think of vector gfx... most of the time it can be built from a few chars.... and as I have written, you calc where your line hits the screen... now you take a new char "from" the stack if one is not already on screen at that spot... put it on screen and draw into that char...

 

Speed gain in big filled vectors, clear screen etc... it's not so complex as it might look at the moment...

Yeah, when the polygon is close to camera and spans many full scanlines, it must be, like, 10x faster (or more) to just draw the char, instead of 8x8 pixels (16 Bytes).

I'll keep thinking about rendering to chars...

 

On a8 I would vote for mode D (gr. 7) 160x96x4

 

Best DMA vs speed ratio.... 1:1 pixel aspect... that's why Star raiders and RoF use that mode...

 

3,5k full screen vram.

Isn't this mode frowned upon by users too much (as being lowres) ? I can see these advantages from my point of view:

- half bandwidth cost of clearing, filling

- half Antic cost of cycles lost

- obviously faster framerate

- in 3D the aspect ratio is not a big deal (unlike 2D)

- 3.5k more RAM than 160x192

 

I didn't know RoF used that mode ! If it was good enough for them...


Sure, but the technique could be reused for a slightly angled, top-down 3D game, which would automatically reduce the complexity of frustum culling down to traversing a simple 2D grid with pretransformed vertex coords. Floor would use the same texturing routine as the vertical walls. The only problem would be the walls, where a simplified texturing would have to be implemented (or in worst case only the floor would be textured, and walls would be flatshaded). At 128x96, this could run at around 15 fps.

 

Fair enough, but that stuff is coming in mighty tight, regardless.

 

 

It's sufficient enough for any game.

 

...if you don't mind your enemies popping up 8 feet from your face. Most 3D games employ distance shooting as one of the main challenges, even most dungeon-type shooters.

 

I'd say insufficient for most, marginally sufficient for something stuck in tight dungeon corridors.

Edited by MrFish
