Asmusr Posted January 8, 2015

Here's my first attempt at some 3D vector graphics using the F18A: a spinning polyhedron. You use the joystick to spin: left/right to spin around the y-axis, up/down to spin around the x-axis, and the fire button to spin around the z-axis (one way only).

The demo uses the F18A 4-color bitmap layer to draw the graphics, and the F18A GPU for the 3D calculations and the polygon rendering routine. For each frame the GPU performs the following steps:

1. Clear the bitmap layer
2. Draw the (up to 5) polygons that are currently visible
3. Rotate the vertices to prepare for the next frame
4. Translate the vertices to screen coordinates

Calculations are done using fixed-point math, and sine and cosine values are looked up in a table. The main CPU is only responsible for waiting for vertical retrace and activating the GPU at the right moment to draw a frame. It then reads the joystick (which the GPU cannot do) and stores the result in VDP RAM for the GPU to read. This goes on in an endless loop.

This all looks fine in the demo, but drawing on the visible screen would not be feasible in a game. With 6 polygons I can just manage to clear the screen and draw the polygons before the beam catches up. (I even had to change my scanline routine from one that used the F18A GPU PIX instruction to draw a pixel to a faster one that uses direct memory access to draw 8 pixels at a time before it looked OK.)

So we need double buffering, which is fortunately very easy to do but is limited by the amount of VDP RAM. The demo uses a bitmap size of 192x192 pixels, which at 2 bits per pixel (4 colors) takes up 9216 bytes of VDP RAM, so two of those would not fit in the 16K. We also need room for the GPU code (which might fit into the additional 2K of VDP RAM only accessible to the GPU), and we need room for the standard VDP tables. So this is already pretty crowded, but...

Perhaps we also need room for a third screen buffer: a depth buffer? In the demo I'm using a simple algorithm (back-face culling) that removes polygons facing away from the 'camera', which hides the back side of the polyhedron. In another algorithm (Painter's) you sort polygons by their depth and draw them back to front, but this too only works for relatively simple scenes. So a standard solution is to calculate a depth value for every pixel you write, compare it to the value in the buffer for any pixel already drawn at the same place, and only draw the new pixel if it's closer to the camera than any pixel already written. Perhaps this is overkill for our level of graphics; I'm not sure what algorithms 8-bit games were using.

One question that perhaps Matthew can answer: Why is there a pixel at the bottom right corner of the bitmap on my hardware? It doesn't show up in emulation (js99er.net). Does anyone else see it?

Attachment: Poly3D.zip
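As a rough illustration of two of the ideas in this post (table-driven fixed-point rotation and dropping polygons that face away from the camera), here is a minimal Python sketch. It is not the demo's GPU code: the 8.8 fixed-point format, the 256-entry sine table, the names `rotate_y` and `is_front_facing`, and the winding convention (counter-clockwise means visible) are all assumptions made for the example.

```python
import math

FRAC_BITS = 8                 # assumed 8.8 fixed-point format
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0

# Assumed 256-entry sine table covering one full turn, scaled to 8.8.
SIN_TABLE = [round(math.sin(2 * math.pi * i / 256) * ONE) for i in range(256)]


def fsin(angle):
    """Sine of an angle given in 1/256ths of a turn, as 8.8 fixed point."""
    return SIN_TABLE[angle & 255]


def fcos(angle):
    """Cosine via the same table, shifted by a quarter turn (64 steps)."""
    return SIN_TABLE[(angle + 64) & 255]


def rotate_y(v, angle):
    """Rotate an integer vertex (x, y, z) around the y-axis."""
    x, y, z = v
    s, c = fsin(angle), fcos(angle)
    # Multiply by the fixed-point factors, then shift back to integers.
    return ((x * c + z * s) >> FRAC_BITS, y, (z * c - x * s) >> FRAC_BITS)


def is_front_facing(p0, p1, p2):
    """Back-face test on the x/y (screen) coordinates of three vertices.

    The sign of the 2D cross product tells whether the projected polygon
    is wound counter-clockwise (treated as visible here) or clockwise.
    """
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
    return cross > 0


triangle = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
flipped = [rotate_y(v, 128) for v in triangle]   # half a turn around y
print(is_front_facing(*triangle), is_front_facing(*flipped))
```

With a parallel projection the test only needs the rotated x and y values, so no per-pixel depth information is required; it only works on its own for a single convex object like the polyhedron.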
Vorticon Posted January 8, 2015

Very nice indeed! Perhaps a more reasonable approach for a game, however, would be to just use wireframe graphics in order to avoid the delays in filling and hiding polygons. Coupled with a smaller view window (say only 2/3 of the bitmap screen) it might just be possible to get it to work. I have attempted such work within the confines of TI FORTH using the excellent book Flights of Fantasy as a reference for the necessary calculations, but it was very slow despite using trig lookup tables as you did.
TheMole Posted January 9, 2015 (edited)

Very nice!

Asmusr wrote:
Perhaps we also need room for a third screen buffer: a depth buffer? In the demo I'm using a simple algorithm to remove polygons that are facing away from the 'camera' to remove the backside of the polyhedron. In another algorithm (Painter's) you sort polygons by their depth and draw them back to front, but this too only works for relatively simple scenes. So a standard solution is to calculate a depth value for every pixel you write, compare it to the value from the buffer of any pixel already drawn at the same place, and only draw the new pixel if it's closer to the camera than any pixel already written. Perhaps this is overkill for our level of graphics, I'm not sure what algorithm 8-bit games were using?

Well, it really depends on what type of 3D graphics you want to do... But a software z-buffer will almost never be the right answer. If you're trying to make a 1st or 3rd person view indoor shooter, there are basically two common approaches you can follow:

- A "portal" rendering engine (à la Unreal): you divide the world into convex rooms (called "sectors") that are connected via "portals" (typically doorways or windows in levels). First you determine the sector you are in, then you determine the visible polygons in that sector via frustum culling (simply testing whether polys are at least partially within your field of view). For each of the polygons that are flagged as being a portal, you adjust your frustum and recursively render the sector it links to. Everything else within a sector is rendered using the painter's algorithm, which we know will always work since we have a convex space.
- A BSP rendering engine (à la Quake): instead of storing your polygons in a flat list, you store them in a binary tree. You build the binary tree (offline) by taking a random first polygon and using that to divide your space into two subspaces (hence the binary nature of the tree): a subspace consisting of the polygons that are in front of the plane defined by your first polygon, and a subspace consisting of the polygons that are behind that plane. When rendering your BSP tree, you start at the root node, test whether your viewpoint is in front of or behind the polygon stored in that node, and recursively follow the same procedure down the side of the tree you've just determined. Once you hit a leaf node, you render the polygon there and move back up through your recursion, rendering the polygon at each node, and so on...

For resource-constrained devices, the BSP tree is by far the fastest approach (linear, and no sorting required, just a simple back/front test). However, it was never used in 8-bit games since Doom was the first game to actually use the technique (although in pseudo-3D, 'cause Doom used lines instead of polys).

For outdoor rendering, or other non-first-person setups, there's less room for aggressively optimized rendering algorithms. Most efficient would probably be to simply do higher-level culling and occlusion tests (e.g. bounding box tests, octrees, ...) combined with the BSP approach for what are typically called "brushes".

Having said that, most 8-bit 3D that I've seen is not really 3D, but uses 2D raycasting to create a 3D effect like in my Wolfie3D demo.

Edited January 9, 2015 by TheMole
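To make the BSP idea above a little more concrete, here is a small Python sketch of one common way to walk such a tree so that polygons come out back to front (far side first, then the node's own polygon, then the near side). The `BspNode` class, the plane representation (`normal`, `d`) and the function names are hypothetical, and building the tree (including splitting polygons that straddle a plane) is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class BspNode:
    """One polygon plus the plane it lies in: dot(normal, p) == d."""
    name: str
    normal: Vec3
    d: float
    front: Optional["BspNode"] = None   # polygons in front of the plane
    back: Optional["BspNode"] = None    # polygons behind the plane


def side_of_plane(node: BspNode, eye: Vec3) -> float:
    """Positive if the eye is in front of the node's plane, negative if behind."""
    return sum(n * e for n, e in zip(node.normal, eye)) - node.d


def back_to_front(node: Optional[BspNode], eye: Vec3, out: List[str]) -> None:
    """Append polygon names in painter's order for the given eye position."""
    if node is None:
        return
    if side_of_plane(node, eye) >= 0:
        # Eye is in front: everything behind the plane is farther away.
        back_to_front(node.back, eye, out)
        out.append(node.name)
        back_to_front(node.front, eye, out)
    else:
        # Eye is behind: the front subtree is now the far side.
        back_to_front(node.front, eye, out)
        out.append(node.name)
        back_to_front(node.back, eye, out)


# Three parallel walls facing +x, at x = -5, 0 and 5.
tree = BspNode("wall_0", (1.0, 0.0, 0.0), 0.0,
               front=BspNode("wall_5", (1.0, 0.0, 0.0), 5.0),
               back=BspNode("wall_-5", (1.0, 0.0, 0.0), -5.0))

order: List[str] = []
back_to_front(tree, (10.0, 0.0, 0.0), order)
print(order)
```

The per-node work is a single dot product and a comparison, which is where the "no sorting required" property comes from; the cost is paid once, offline, when the tree is built.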
Imperious Posted January 9, 2015

Awesome, Elite here we come!! Only kidding of course, but would something like Tempest be a possibility?
ti99iuc Posted January 9, 2015

Very nice, Rasmus! ... always nice work.
am1933 Posted January 9, 2015

Imperious wrote:
Awesome, Elite here we come!! Only kidding of course, but would something like Tempest be a possibility?

What the hell do you mean "Only Kidding"? :mad:
Asmusr Posted January 9, 2015 (Author)

TheMole wrote:
For resource-constrained devices, the BSP tree is by far the fastest approach (linear, and no sorting required, just a simple back/front test). However, it was never used in 8-bit games since Doom was the first game to actually use the technique (although in pseudo-3D, 'cause Doom used lines instead of polys).

Yeah, I think you're right, BSP would be the best approach. A Z-buffer would take up too many resources unless you made it very low-res. But the next thing I will try is probably double-buffering with more polygons.

BTW, I have now added the demo as one of the resident items in js99er.net, but be aware that the GPU timing is far from realistic.
matthew180 Posted January 10, 2015

I have not had a chance to run the code yet, but as always it looks like it will be great. Keep in mind that the GPU has 16-bit access to the VRAM and you can clear two bytes at a time as long as you stick to even addresses. The blitter in the upcoming V6 firmware should also help with clearing the screen between frames, as well as horizontal and vertical fills. I will check on that pixel you mentioned.
Asmusr Posted January 10, 2015 (Author)

As a curiosity I have made a version of the 3D demo that's not dependent on the F18A and uses standard bitmap mode. I'm drawing to a buffer in CPU RAM and copying the full 6K to VDP RAM each frame.

Run: E/A#3 P3D2 or E/A#5 POLY2

The frame rate is actually a little better than I expected (2-3 FPS), but adding colors would slow it down significantly since this would add another 6K to be copied. And colors would also create problems on the boundaries between polygons.

Attachment: POLY3D.dsk
Vorticon Posted January 11, 2015 (edited)

Do you think we could gain at least 1 fps if this were done in wireframe only? Would you mind trying it as a quick demo in standard bitmap mode without the F18A GPU?

Edited January 11, 2015 by Vorticon
xjas Posted January 11, 2015

Impressive work on this! I'd be really surprised if there isn't more speed to be found. I've seen a spinning cube on just about every other 8-bit-era system so far. It's kind of a demoscene in-joke as to who will be the first to implement it on any given platform (Rasmus, I believe you've just claimed yourself a title here...) Is the TI really this sluggish?

Here's one on Atari VCS for example: http://youtu.be/wcCJM7b9EMU?t=2m58s

I know I've seen VIC-20 & ZX Spectrum examples before too. And of course approx. a million on C64.
Asmusr Posted January 12, 2015 (Author)

xjas wrote:
Impressive work on this! I'd be really surprised if there isn't more speed to be found. I've seen a spinning cube on just about every other 8-bit-era system so far. It's kind of a demoscene in-joke as to who will be the first to implement it on any given platform (Rasmus, I believe you've just claimed yourself a title here...) Is the TI really this sluggish? Here's one on Atari VCS for example: http://youtu.be/wcCJM7b9EMU?t=2m58s I know I've seen VIC-20 & ZX Spectrum examples before too. And of course approx. a million on C64.

I'm sure it could be optimized if your only objective was to make a spinning cube. Then you could take advantage of the symmetric properties of the cube, for instance, and try to calculate as much as possible in advance. But my polyhedron is not actually a cube, and I have tried to make my routines general rather than optimizing them for a specific purpose.

The biggest speed gain could probably be obtained by reducing the part of the screen that is updated. I'm clearing 6K of CPU RAM and copying it to VDP RAM every frame, and that will always take some time, but it could be optimized a bit by running the code from scratch pad.
TheMole Posted January 12, 2015

Now that I'm looking at this again, it seems like you are doing parallel instead of perspective projection. Is that an artistic choice, or are you doing it to save on a division? Any idea if the additional division per vertex would pull down the framerate by much (on the F18A version, that is...)?
Asmusr Posted January 13, 2015 (Author)

TheMole wrote:
Now that I'm looking at this again, it seems like you are doing parallel instead of perspective projection. Is that an artistic choice, or are you doing it to save on a division? Any idea if the additional division per vertex would pull down the framerate by much (on the F18A version, that is...)?

I don't think a parallel projection would pull down the frame rate. The F18A should have capacity for displaying several more polygons provided you switch to double buffering. But I don't see much point in a parallel projection for a single, simple object.
TheMole Posted January 13, 2015

I might be wrong, but aren't you doing parallel projection now (it is the cheaper of the two operations)? It's a bit difficult to see 'cause of the odd shape of the polyhedron you're using, but if in the demo I rotate over one axis I can clearly see the projected line segments remaining the same on-screen size as they rotate into the screen (towards the back). If you were doing a perspective projection, the line segment would become smaller the further away from the viewer it goes.

Typically, perspective projection is used in 3D games since it gives a better impression of depth, whereas parallel projection is mostly only used in certain CAD applications and views where the length of a line segment needs to be visually comparable to others regardless of z-position.

Note, parallel projection is often called orthogonal or orthographic projection, like in the picture below, so my terminology might be causing some confusion. Just to be clear, from the looks of it you are not doing the second type of projection, right?
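The difference between the two projections discussed here can be sketched in a few lines of Python. This is only an illustration under assumed values: the screen center (96, 96) follows from the 192x192 bitmap mentioned earlier, while `focal`, `cam_dist` and the function names are made up for the example and are not taken from the demo.

```python
def project_parallel(v, cx=96, cy=96):
    """Parallel (orthographic) projection: z is simply ignored."""
    x, y, _z = v
    return (cx + x, cy - y)


def project_perspective(v, focal=256, cam_dist=256, cx=96, cy=96):
    """Perspective projection: scale x and y by focal / depth.

    focal and cam_dist are assumed values; depth must stay positive,
    i.e. the vertex must be in front of the camera.
    """
    x, y, z = v
    depth = z + cam_dist
    return (cx + (x * focal) // depth, cy - (y * focal) // depth)


near = (64, 0, 0)      # same x offset, close to the viewer
far = (64, 0, 256)     # same x offset, twice as deep

print(project_parallel(near), project_parallel(far))
print(project_perspective(near), project_perspective(far))
```

With the parallel version both points land on the same screen column, which matches the observation that segments keep their on-screen size as they rotate into the screen. The perspective version pulls the far point toward the center at the cost of the extra division per vertex mentioned above (written as two divisions here for clarity).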
Asmusr Posted January 13, 2015 (Author)

TheMole wrote:
I might be wrong, but aren't you doing parallel projection now (it is the cheaper of the two operations)? It's a bit difficult to see 'cause of the odd shape of the polyhedron you're using, but if in the demo I rotate over one axis I can clearly see the projected line segments remaining the same on-screen size as they rotate into the screen (towards the back). If you were doing a perspective projection the line segment would become smaller the further away from the viewer it goes. Typically, perspective projection is used in 3D games since it gives a better impression of depth, whereas parallel projection is mostly only used in certain CAD applications and views where the length of a line segment needs to be visually comparable to others regardless of z-position. Note, parallel projection is often called orthogonal or orthographic projection, like in the picture below, so my terminology might be causing some confusion. Just to be clear, from the looks of it you are not doing the second type of projection, right?

Sorry, I was writing in a hurry. What I meant to say was that I don't see much point in doing perspective projection for a single object. You're right, I am doing an orthographic/parallel projection.
TheMole Posted January 13, 2015

Asmusr wrote:
Sorry, I was writing in a hurry. What I meant to say was that I don't see much point in doing perspective projection for a single object. You're right, I am doing an orthographic/parallel projection.

Ok, makes sense. Just wanted to make sure I understood it right.
matthew180 Posted January 20, 2015

I can't help with more VRAM at this point, but maybe I can give you back some cycles. I'm finally getting back to testing the changes I made for the V1.6 F18A firmware update, and the blitter will be pretty fast. Here are some capabilities:

- constant -> destination: 1 byte every 10ns, or 163.84us to clear all 16K of VRAM, or 100 million bytes/sec.
- source -> destination: 1 byte every 20ns, or 50 million bytes/sec.

Clearing 9K would take about 92.16us, or approx 1.4 scan lines (TI scan lines @ ~64us per scan line). In one TI scan line you could copy (src -> dst) ~3200 bytes, or fill (constant -> dst) ~6400 bytes.

The blitter also has an 8.8 signed scale factor for both the source and destination addresses.
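The timing figures in this post follow directly from the per-byte rates, and the arithmetic can be checked with a few lines of Python. The constants are the ones quoted above (10ns per byte for a fill, 20ns per byte for a copy, roughly 64us per scan line); the function names are just for this example.

```python
FILL_NS_PER_BYTE = 10     # constant -> destination
COPY_NS_PER_BYTE = 20     # source -> destination
SCANLINE_US = 64          # approximate TI scan line time


def fill_us(num_bytes):
    """Time in microseconds to fill num_bytes with a constant."""
    return num_bytes * FILL_NS_PER_BYTE / 1000


def bytes_per_scanline(ns_per_byte):
    """How many bytes can be processed during one scan line."""
    return SCANLINE_US * 1000 // ns_per_byte


print(fill_us(16 * 1024))                    # clear all 16K of VRAM
print(fill_us(9216))                         # clear one 192x192 4-color bitmap
print(fill_us(9216) / SCANLINE_US)           # the same, in scan lines
print(bytes_per_scanline(FILL_NS_PER_BYTE))  # fill budget per scan line
print(bytes_per_scanline(COPY_NS_PER_BYTE))  # copy budget per scan line
```

Clearing the 9216-byte bitmap works out to 92.16us, or about 1.44 scan lines at 64us per line, which is small next to the time the polygon drawing itself needs.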
TheMole Posted January 20, 2015

matthew180 wrote:
The blitter also has an 8.8 signed scale factor for both the source and destination addresses.

As in hardware-assisted scaling for software sprites? Very, very cool!
matthew180 Posted January 22, 2015

Actually I should call the new feature a "DMA", not a "blitter", since it works on bytes not pixels. If there was an 8-bit per pixel mode (and enough RAM) then it could be considered a blitter, but that is not the case with the F18A.
matthew180 Posted February 5, 2015

I was just thinking, the GPU can easily test the current scan line so you could clear-behind during the active display. With the DMA, clearing half of the screen should take less than one scan line, and you could even then render the top half of the display. That would leave you the vertical refresh and top half of the display to clear and draw the bottom. A bit of a pain I know, but possible.
Omega-TI Posted February 8, 2015

Vector graphics, if you could pull it off for more than one item on an F18A, would open up the field for some classics for sure:

- Battle Zone
- Omega Race
- Tempest
- Lunar Lander

I'd be amazed if it could be pulled off. Now Qix, from what I'm told, is probably quite possible...
Iwantgames:) Posted February 9, 2015

Battlezone would be pretty sweet
Wildstar Posted February 21, 2015 (edited)

matthew180 wrote:
Actually I should call the new feature a "DMA", not a "blitter", since it works on bytes not pixels. If there was an 8-bit per pixel mode (and enough RAM) then it could be considered a blitter, but that is not the case with the F18A.

Blitters always work with bytes or blocks of bytes, which is what graphical data is. DMA is a different matter. The blitter gets its name from BLIT (BLock Image Transfer) operations. You might call it a BLock Data Transfer (BLDT), or a BLDTer if you like, if it describes a block data transfer similar to an image transfer but more general purpose. But DMA adopted the use of blitter technology in what is known as block mode DMA, so we can effectively have a block mode DMA, and in fact the Commodore 128 had this in its MMU features. BL-DMA (BLock mode DMA) might be an appropriate term if that is what describes the DMA operation style.

In addition to using this F18A VDP, I'm thinking of using GreenArrays' EVB001 (a dual GA144A12 equipped board) with my TI-99/4A and possibly also the Amiga 1200. This would be an interesting and powerful addition to the TI-99/4A, especially in the Forth programming realm.

Edited February 21, 2015 by Wildstar
mäsäxi Posted March 3, 2015

Smoothly scrolling 3D maze Pac-Man clone would be nice.