VladR Posted October 28, 2015
"could you imagine how fast the division would go?"
Yes, I can. It's just that I personally find 256 KB LUTs obscene on a Jaguar that has 2 MB of RAM. An 8 KB game with a single-purpose 1 MB LUT seems like an insane waste of RAM, but yes - it's a quick experiment that can be done in about 5-10 minutes, once you've got the code that generates the LUT. Pardon my ignorance - but is that 1 MB mapped and accessible at any time without any bank switching? I'm just wondering how one would access that on an 8-bit CPU - e.g. do you have access, within the same clock cycle, to data in the first and last 64 KB of that 1 MB?
Rybags Posted October 28, 2015
It's only wasteful if the owner is expected to go out and pay for it. This isn't really the case. Plenty of people have 1 MB flashcarts and you can keep reusing one if you can't be bothered keeping the game. Practically any such expansion is banked. Cartridges usually switch banks via a store to a $D5xx address, and the typical banked window is 8K. The nature of the beast is that, e.g. for a 16-bit value that indexes a lookup table, you might take some of the high bits to form a bank address, then OR the low bits with the base (high) address of the cart, which gives you the lookup address. Generally it takes under 25 cycles to set up, read from your lookup table and store a result. Versus what we've got, which might be a situation of using the best part of a frame to perform a single maths operation. Potential savings of thousands of cycles at the cost of using a buttload of either RAM or a flashcart - most serious owners around here own at least one of either, some multiples of both. This so-called wastefulness is why we keep using our computers: games and demos using brute-force techniques which give results that we didn't think possible from the hardware.
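The bank-plus-window addressing Rybags describes can be sketched in Python. This is only an illustration under stated assumptions: an 8K window based at $A000, and bank selection by a store to $D500 plus the bank number. Real carts differ in their banking schemes, and the names `locate`, `build_banks` and `lookup` are made up for this example.

```python
BANK_SIZE = 0x2000      # 8K banked window, as mentioned in the post
WINDOW_BASE = 0xA000    # assumption: where the cart window appears in memory
BANK_REG_BASE = 0xD500  # assumption: a store to $D500+bank selects that bank

def locate(index):
    """Split a table index into (bank-select address, address inside the window)."""
    bank = index >> 13                # high bits pick the 8K bank
    offset = index & (BANK_SIZE - 1)  # low 13 bits are the offset within the bank
    return BANK_REG_BASE + bank, WINDOW_BASE | offset

def build_banks(table):
    """Cut a flat table into 8K banks, as it would be laid out on the cart."""
    return [table[i:i + BANK_SIZE] for i in range(0, len(table), BANK_SIZE)]

def lookup(banks, index):
    """Emulate 'select bank, then read through the window'."""
    reg, addr = locate(index)
    return banks[reg - BANK_REG_BASE][addr - WINDOW_BASE]
```

For a 16-bit index, `locate(0x2345)` gives `(0xD501, 0xA345)`: bank 1, offset $345 in the window. On the 6502 this amounts to a few shifts, a store to the bank register and an indirect load, which is where the cycle estimate in the post comes from.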
phaeron Posted October 28, 2015
Lookup tables can be both magic and fast, but they're being overestimated here. A lookup table bigger than 256 bytes takes longer to use and one that requires bank switching is even worse. That's not the worst issue, though. Let's assume that by sleight of hand, the time for the perspective projection routine could be reduced to zero -- no time at all. Guess how much of a difference it would make? 15%. That's the total amount of time that the divisions take in the version that I posted, if you look at the profile (CPU clocks ILOG + IDIV1 + IDIV2). The maximum theoretical possible speedup, which is not actually attainable, is therefore 1.17x. Other routines like the rotation routines need attention and aren't necessarily as easy to optimize with lookup tables as the divide.
That having been said, it's not like there's an epidemic of people actually working on Star Raiders, so if someone can make the game awesome with 1MB strapped to an Atari, who are we to argue....
As for precision, the vast majority of the stars are within 12 bits for all coordinates (|XYZ| < $1000). Reducing precision down to 8 bits ($FFF0 mask) is about as far as can be tolerated; below that the motion fluidity starts being affected such that the game might as well be running at 30 fps, which kind of ruins some of the magic. You can test this in the version I posted by masking off low bits from A on input to the ILOG function. Therefore, when it comes to a direct division table, the minimum needed is 64K, and it would be better to go to 256K to fold in the sign bits, as the abs calculations in CALCVH and DIVIDE/IDIV1 are fairly expensive.
Using Veronica for math acceleration would be silly. It has a 65C816 accelerator running at eight times the 6502's speed with no DMA overhead and a full 128K of local memory. You'd run the whole game on the 65C816, including framebuffer rendering, and use the 6502 for I/O and sound.
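The speedup ceiling in this post is an instance of Amdahl's law. A minimal sketch, assuming the divide routines take a fraction of about 0.15 of the frame time (the 15% figure is from the post; the function name is made up for illustration):

```python
def max_speedup(fraction, factor=float("inf")):
    """Amdahl's law: overall speedup when `fraction` of the runtime
    is made `factor` times faster (infinite factor = reduced to zero time)."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)

# With the divides removed entirely, only the other 85% of the work remains.
ceiling = max_speedup(0.15)
```

For a fraction of exactly 0.15 the ceiling is 1/0.85, roughly 1.18, in line with the figure quoted in the post given that the 15% is itself rounded. A divide that is merely twice as fast gives `max_speedup(0.15, 2)`, about 1.08.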
Joey Z Posted October 28, 2015
phaeron wrote: "Using Veronica for math acceleration would be silly. It has a 65C816 accelerator running at eight times the 6502's speed with no DMA overhead and a full 128K of local memory. You'd run the whole game on the 65C816, including framebuffer rendering, and use the 6502 for I/O and sound."
I hadn't looked into the Veronica cart at all, so I didn't know you could generate the framebuffer itself on the Veronica cart. I guess that would make sense, if it has a shared memory window in the cartridge region.
JamesD Posted October 28, 2015
Dropping to 8 bits would save a lot of time throughout the code, and at speed I'm sure it would look reasonably good, but when you are flying slow I think you'd notice. I haven't tested it so I'm just guessing. But if the only time it's a problem is during explosions, maybe just create some pre-calculated sprite images and use player missile graphics for a lot of the explosion instead of overloading the 3D engine?
*edit* I'm not suggesting totally replacing the additional stars, just reducing them to where there's no slowdown and using the PM graphics for the initial explosion. You could even throw in some additional color that way.
Edited October 28, 2015 by JamesD
fujidude Posted October 28, 2015
After reading more here and thinking a bit more about it... I think it would be best if whatever optimizations or enhancements are made would still allow it to run on a 130XE (128KB), or at most a RamboXL (256KB). If it is made to not require more specs than that, it should be usable by almost everyone who still uses the real iron. I'm also finding it more and more amazing how good it is inside of only 8KB.
Heaven/TQA Posted October 28, 2015
Cool to see people doing optimisations... Did anyone ever run Star Raiders on a 65816 card? I remember the patched version of Rescue on Fractalus on YouTube, where you see in fluid motion how inaccurate the calculations are... so, speeding up SR and getting it near to 50 or 30 fps, we might find it "ugly".
And for using RAM... I would go for a 130XE (128K) or maybe 320 KB machines. I cannot see why the LUTs would be so large, as normal mul routines tend to have 4 KB tables... so plenty more tables would fit into 64 KB.
Next... as Phaeron mentioned, look at the profiler and optimise the bottlenecks there...
Now... has anyone tested whether the game itself breaks at higher speed? For example, I remember the hyper jump... if it rendered faster, would the motion of the crosshair automatically be faster too? Etc. etc. I have not looked at the code itself yet, but of course we need to keep the basic game mechanics OK. So what else runs in the "main loop" without being tied to VBL sync?
Edited October 28, 2015 by Heaven/TQA
Rybags Posted October 28, 2015
With the explosions, would it be possible to have half the particles with positioning that's dependent on the other half - so that only half the 3D calculations are done, and the mirrored particles are positioned onscreen with only bounds checking required? I.e. within an imaginary grid with X and Y +ve and -ve values, have diagonally opposite particles mapped together.
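The mirroring idea can be sketched as a point reflection in screen space. This is an illustration only, not the game's code: the 160x100 playfield size and the function name are assumptions, and mirroring already-projected positions is only exact when both particles of a pair sit at the same depth.

```python
WIDTH, HEIGHT = 160, 100  # assumption: playfield size in pixels

def mirror(points, center):
    """Reflect projected (x, y) positions through the explosion center,
    keeping only the mirrored points that land on screen."""
    cx, cy = center
    mirrored = []
    for x, y in points:
        mx, my = 2 * cx - x, 2 * cy - y  # diagonally opposite position
        if 0 <= mx < WIDTH and 0 <= my < HEIGHT:  # the only check needed
            mirrored.append((mx, my))
    return mirrored
```

Here `mirror([(90, 40)], (80, 50))` places the partner particle at `(70, 60)`, while a partner that would fall off screen is simply dropped.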
phaeron Posted October 28, 2015
Heaven/TQA wrote: "Now... has anyone tested whether the game itself breaks at higher speed? For example, I remember the hyper jump... if it rendered faster, would the motion of the crosshair automatically be faster too? Etc. etc. I have not looked at the code itself yet, but of course we need to keep the basic game mechanics OK. So what else runs in the "main loop" without being tied to VBL sync?"
Yes, hyperspace warping gets faster and thus theoretically harder, if you're playing on a difficulty where you have to keep it aligned. I have the debugger rigged to display the number of vblanks per frame on screen, and in the original the hyperwarp out sequence runs at 2-3 vblanks, whereas with the fast divide it runs at 1-2. Explosion sequences run 3-4 vblanks; fastdiv gets it down to 2-3. I'm attacking the rotation routine to see if I can get the explosions down to a solid 2 (30 fps). This would potentially make the dogfights harder since you wouldn't have the benefit of slow-mo after an explosion to realign your next shot. Having the game run too fast would be a good problem to have, though.
To be honest, it's a bit premature to be talking about hardware expansions when the game is still under 32K for code+data. Pretty sure we can get it faster on stock hardware just with some finessing of the code, and we haven't even started using dirty tricks yet.
Heaven/TQA Posted October 28, 2015
Phaeron, how did you trigger the debugger? I always wanted to have a frame counter without implementing debug code myself, like a frame counter, CPU color bars, etc.
Thorsten Günther Posted October 28, 2015
Here comes Centron 4D!!!
phaeron Posted October 28, 2015
Heaven/TQA wrote: "Phaeron, how did you trigger the debugger? I always wanted to have a frame counter without implementing debug code myself, like a frame counter, CPU color bars, etc."
One non-blocking conditional breakpoint on the VBI to establish a frame counter in a temporary variable, another in the main loop to record the delta in the frame counter since the last tick, and then a watch expression to display it on screen:
    bp -n main1 "r @t2 @t0-@t1; r @t1 @t0"
    bp -n vbnmi "r @t0 @t0+1"
    wx @t2
Unfortunately, there is a bug in current versions of Altirra that causes the debugger windows to update constantly when executing non-blocking breakpoints, which I need to fix. It works, but it can slow down the emulator.
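The logic behind those three debugger commands can be modelled in a few lines of Python. This is a sketch of how the counters interact as I read the commands, not a description of Altirra's internals; the class and method names are made up, and `t0`, `t1`, `t2` stand in for the debugger temporaries `@t0`, `@t1`, `@t2`.

```python
class FrameMeter:
    """Models the two non-blocking breakpoints and the watched value."""

    def __init__(self):
        self.t0 = 0  # running count of vertical blanks
        self.t1 = 0  # value of t0 at the previous main-loop pass
        self.t2 = 0  # vblanks consumed by the last main-loop pass (watched)

    def on_vblank(self):
        # models: bp -n vbnmi "r @t0 @t0+1"
        self.t0 += 1

    def on_main_loop(self):
        # models: bp -n main1 "r @t2 @t0-@t1; r @t1 @t0"
        self.t2 = self.t0 - self.t1
        self.t1 = self.t0
```

If three vertical blanks fire between two passes of the main loop, `t2` reads 3, i.e. the game is drawing one frame per three vblanks. The watch expression (`wx @t2`) just keeps that number on screen.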
Heaven/TQA Posted October 28, 2015
Ah... OK... Any further explanation, please?
VladR Posted October 28, 2015
Out of curiosity - how much work is it to set up and configure the ASM dev environment for A800 on Win7 these days? Last time I worked with A800 ASM was with ATMAS II (some quarter century ago). I reckon it should be easier with Altirra (et al). Just something that would allow me to quickly deploy builds of Star Raiders into an emulator. I am thinking of reserving some time for this in a few weeks...
+Stephen Posted October 28, 2015
VladR wrote: "Out of curiosity - how much work is it to set up and configure the ASM dev environment for A800 on Win7 these days? Last time I worked with A800 ASM was with ATMAS II (some quarter century ago). I reckon it should be easier with Altirra (et al). Just something that would allow me to quickly deploy builds of Star Raiders into an emulator. I am thinking of reserving some time for this in a few weeks..."
You'll probably want to go with the WUDSN plug-in for Eclipse. You can choose from 3 different assemblers (MADS will be the best), and have a one-button build & launch emulator command, with integrated debugging, etc.
Heaven/TQA Posted October 28, 2015
WUDSN IDE for sure, plus MADS, plus Altirra... The best dev toolchain, even beating everything on C64/Amiga and ST: http://www.wudsn.com/index.php/ide
Edited October 28, 2015 by Heaven/TQA
phaeron Posted October 29, 2015
Optimized the projection routine (CALCVH) further, and took a crack at the rotation and forward motion routines. This version can almost hold 30 fps (2 frames) on explosions. Also, a few of the object tables were crossing page boundaries and are now rearranged above $2000 to avoid that.
I am starting to wonder whether the dithering in the rotation code is actually needed. The rotations don't seem to be sensitive to LSB rounding, at least. CALCVH is getting harder to optimize as the sign handling and STHPOS/STVPOS subroutines are starting to take a disproportionate amount of time compared to the divide.
Note that there is a rare crash bug in this version that I need to track down. Occasionally, the explosion code will spawn a particle with an initial VPOS out of range (~$F0), which can stomp code. I moved ORG down to $8000 temporarily in this version, which is now triggering the crash because the stomp is in the $8xxx range. Haven't figured out yet whether this was a bug in the original that just wasn't triggered because there was no writable code there or if I introduced it (the latter being more likely).
Attachment: StarRaiders-FastDiv2.zip
JamesD Posted October 29, 2015
The warp doesn't seem to be working right.
ClausB Posted October 29, 2015
Years ago I tried modding Star Raiders for mode E resolution but got stuck without source code to study. The thread was called Star Raiders HD or some such. Later someone shared a .txt of the original source in confidence, but I haven't got back to the resolution mod. I did work on a divide speedup using quarter-square LUT multiplication and a reciprocal LUT scaled by 80, but the rounding was too coarse and the stars jittered a little, so I never posted it. I'll dig it out again. Finally the source is out and mods are flying! I agree they should be runnable on original hardware, and 16K ROM and 32K RAM should be plenty.
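Quarter-square multiplication replaces a multiply with two table lookups and a subtraction, using the identity a*b = floor((a+b)^2/4) - floor((a-b)^2/4). The sketch below shows the general technique for unsigned 8-bit operands; it is not ClausB's routine, and it leaves out the reciprocal table he mentions. The names are made up for illustration.

```python
# floor(n*n/4) for n = 0..510, enough for the sum of two 8-bit operands.
QSQ = [(n * n) // 4 for n in range(511)]

def qs_mul(a, b):
    """Multiply two unsigned 8-bit values via the quarter-square identity.

    a+b and a-b always have the same parity, so the fractions dropped by
    the two floor operations cancel and the result is exact.
    """
    return QSQ[a + b] - QSQ[abs(a - b)]
```

For instance, `qs_mul(200, 123)` looks up entries 323 and 77 and returns 24600. On a 6502 the table is normally split into low-byte and high-byte pages so that each lookup is an indexed load.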
Heaven/TQA Posted October 29, 2015
HD is nice, but the 2:1 aspect ratio is not good in PAL land... I guess it's not as strong on NTSC, but on PAL, well...
ClausB Posted October 29, 2015
Agreed. On an analog TV the stars would look more pinpoint though. The plan was also to try mode F.
phaeron Posted November 1, 2015
More optimizations... slightly faster divide, slightly faster rotation (EORing another value into RANDOM is pointless), faster motion routine, faster star brightness routine.
I also extended the VCONL/VCONH tables to full pages to fix the intermittent crash. The problem occurs because the explosion routine presets HPOS/VPOS for the new explosion particles, which results in a weird grid pattern for the first frame. STVPOS is bypassed when VPOS is set, so the vertical positions are not clipped and can wrap. The out of bounds VPOS values are now clamped.
The game can now hold 30 fps during an explosion, but it'll be hard to get it to 60 fps locked. The divide can possibly be sped up a bit more by adding direct tables for the very common $0000-0FFF range, and it looks like the star store/clear routines could be sped up by adding LMS lines to the display list so no mode line crosses a page boundary. The rotation routines, however, are going to be hard to speed up more as there's only so fast that four shear matrices can be applied to 47 points every tick. Hyperwarp is just barely short of 60 fps because of the extra overhead of all of the new stars that it generates, but arguably that is already running too fast as it is.
One other thing that's apparent is that the constant use of RANDOM everywhere would complicate hoisting Star Raiders into Veronica. A possible approach would be to have the 6502 generate a new page of random numbers every frame, since it wouldn't have much else to do.
JamesD wrote: "The warp doesn't seem to be working right."
Uh, mind being a little bit more specific? I don't see anything wrong myself, although on higher difficulties it can be hard to stay on target and avoid miswarping.
Attachment: StarRaiders-FastDiv3.zip
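Rotating by shears is a common way to avoid multiplies: each shear adds a small fraction of one coordinate to the other, and a fraction of 1/2^n needs only shifts. The sketch below shows one small rotation step built from two shears. It illustrates the general technique under assumptions: the step size of 1/64 and the function name are hypothetical, floats are used instead of the game's fixed-point values, and reading "four shear matrices" as two shears for each of two rotation axes is my guess rather than something the post states.

```python
import math

def rotate_step(x, y, shift=6):
    """One small rotation in the x/y plane made of two shears.

    eps = 2**-shift plays the role of the rotation angle in radians.
    """
    eps = 1.0 / (1 << shift)
    x = x - y * eps  # first shear: x picks up a fraction of y
    y = y + x * eps  # second shear: uses the already-updated x
    return x, y

# Repeated steps carry a point around a near-circular orbit.
px, py = 100.0, 0.0
for _ in range(1000):
    px, py = rotate_step(px, py)
radius = math.hypot(px, py)
```

Because the second shear uses the updated x, the pair preserves area and the quantity x^2 + y^2 - eps*x*y, so the radius stays within about one percent of its starting value instead of spiralling outward.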
Mclaneinc Posted November 1, 2015
Phaeron, thanks for working on one of the most important games in Atari and possibly gaming history to make it better... You are the man.... And of course thanks to all the others putting forward ideas etc. This is what makes me so proud of our group here, always pushing for better... Thanks guys..
Edited November 1, 2015 by Mclaneinc
JamesD Posted November 1, 2015
phaeron wrote: "Uh, mind being a little bit more specific? I don't see anything wrong myself, although on higher difficulties it can be hard to stay on target and avoid miswarping."
It's warping to the wrong quadrant when you stay on the galactic map while you warp. Or is this a feature? It works when I return to the front view.
*edit* Feature. I guess it was just dumb luck that I ended up in the right quadrant the few times I had warped and left the galactic map on.
Edited November 1, 2015 by JamesD
Grevle Posted November 1, 2015
JamesD wrote: "It's warping to the wrong quadrant when you stay on the galactic map while you warp. Or is this a feature? It works when I return to the front view."
Well, that would happen if the difficulty is set higher than Novice, because then you need to use the front view to control the crosshairs manually when you are in hyperspace, or you might end up in a different quadrant than the one you selected on the map. This is normal behaviour in the original Atari Star Raiders.
Edited November 1, 2015 by BioFreeze