Propane13 Posted July 1, 2008 (edited)

Hi!

I'm having a bit of trouble with some code I need to write. For Steam Tunnel Bob, "game 2" will involve having 2 sprites on-screen at the same time. It happens over a period of 93 scanlines or so.

I have this dummy code, currently. It works, but the P0 sprite can't move up or down, which is something I want to change. I'll explain my dilemmas after the codebox.

    ;========================
    ; scanline 20 to 113
    ;========================
        LDX #93
        LDY Player1YPosition
    NoP1TopLoop
        STA WSYNC
        DEY
        BEQ WeHitTheTopOfP1Gfx
        LDA Player0GraphicsColors,X
        STA COLUP0
        LDA Player0Graphics,X
        STA GRP0
        DEX
        JMP NoP1TopLoop
    WeHitTheTopOfP1Gfx
        LDA Player0GraphicsColors,X
        STA COLUP0
        LDA Player0Graphics,X
        STA GRP0
        DEX
        LDY Player1HeightGame2
    DrawingP1Loop
        STA WSYNC
        LDA Player0GraphicsColors,X
        STA COLUP0
        LDA Player0Graphics,X
        STA GRP0
        LDA (Temp3),Y ; P1 colors
        STA COLUP1
        LDA (Temp2),Y ; P1 gfx
        STA GRP1
        DEX
        DEY
        BPL DrawingP1Loop
    NoP1BottomLoop
        STA WSYNC
        LDA Player0GraphicsColors,X
        STA COLUP0
        LDA Player0Graphics,X
        STA GRP0
        DEX
        BPL NoP1BottomLoop
    ;========================

So, here are the technical problems I can't seem to grasp.

1) There's a space vs. cycles issue right away, with regard to "zero-padding". I'm sure anyone who has written a 2600 game before knows about this. Basically, for the P0 sprite, I can put zero-padding around the main graphic, as I don't anticipate it changing much (maybe between 2 graphics, at the end). But for the P1 graphic, zero-padding may take up too much space (as I'm planning on having a different P1 graphic per screen). In light of this, I tried to use an index counter that would determine the start position to draw the P1 graphic, and one that would determine its height. With these values, you can set the X or Y registers appropriately, and therefore store $00 into GRP0 under the appropriate conditions, thereby saving space.

However, I'm torn. Maybe it makes sense just to zero-pad everything (P0 and P1 graphics alike), and store the graphics and this small routine in an isolated 4K bank. I'm just not sure what the best way to do that is. If I zero-pad, I get the advantage of more cycles, which means I have more horizontal position availability for the sprites. One solution is to get extra hardware (256 bytes) that I could dump graphics in, and zero out manually, but for now, I'm trying to avoid "extra hardware", and stay a purist for as long as I can.

2) I'm having issues with indirect indexing. As you can see in the example above, the P1 position is dynamic, by indirect fetch. However, the P0 position is not. I'd like to fix this, but it sounds like I'd have to use an indirect mode to do this. This is OK, but indirect modes based on the X register have never worked for me, as they end up being pre-indexed indirect instead of post-indexed indirect, which is what I typically use and need for graphics fetches. So, even if I figure out a solution for problem 1, I'm not quite sure how to proceed with this issue.

Can anyone give me some pointers? I think I need a swift programming kick to the head; this is something that has eluded me for a long time, yet I think it wouldn't be too hard to implement.

Thanks in advance!
-John

Edited July 1, 2008 by Propane13

Ben_Larson Posted July 2, 2008

Hey John,

Can't give an in-depth answer right now since I'm at work, but in a nutshell: yes, you cannot use the X register to do indirect addressing the way you'd like to. The 6502 doesn't support it. If you want to use indirect addressing with an offset (i.e. LDA (Address),Offset), you have to use the Y register for the offset. Which means that if you're doing two player graphics, you have to save the Y register and reuse it.

As to the zero-padding method, yes, that is bar none the fastest way to draw two players on the screen, but as you mentioned it is extremely wasteful ROM-wise. Other methods like skipdraw are much more ROM-efficient but obviously use more CPU cycles...

Later,
Ben

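One common way to live with the Y-only restriction is to let Y be the scanline counter for both players and to move each sprite vertically by adjusting its pointer rather than its index. The fragment below is a minimal sketch in DASM syntax, assuming zero-padded graphics tables stored bottom row first; the names P0GfxPtr, P1GfxPtr, P0Bottom and Sprite0Data are hypothetical and do not come from the code in this thread.

```asm
; Sketch only. Assumes "processor 6502" and include "vcs.h" (WSYNC, GRP0, GRP1).
; P0GfxPtr, P1GfxPtr (2-byte zero-page pointers), P0Bottom and Sprite0Data
; are hypothetical names. Sprite0Data must be surrounded by zero padding.

; --- during vertical blank: aim the pointer so that ---
; --- (P0GfxPtr),Y reads Sprite0Data[Y - P0Bottom]   ---
        sec
        lda #<Sprite0Data
        sbc P0Bottom            ; P0Bottom = line-counter value of the sprite's lowest row
        sta P0GfxPtr
        lda #>Sprite0Data
        sbc #0                  ; propagate the borrow into the high byte
        sta P0GfxPtr+1
        ; (repeat for P1GfxPtr with the second player's data and position)

; --- kernel: one decrementing Y serves both players ---
        ldy #93
KernelLoop
        sta WSYNC
        lda (P0GfxPtr),y        ; post-indexed indirect: 5 cycles, +1 on a page crossing
        sta GRP0
        lda (P1GfxPtr),y
        sta GRP1
        dey
        bne KernelLoop
```

Because the vertical position lives in the pointer, changing P0Bottom between frames moves the sprite without touching the kernel. The cost is the zero padding around each table, which is the ROM trade-off discussed in the thread.
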
vdub_bobby Posted July 2, 2008 (edited)

You need to use a variant of SkipDraw!

Here are some methods for drawing sprites. Assume Y is a decrementing line-counter in all cases:

Simplest case:

        lda (GfxPtr),Y
        sta GRP0
    ;+8 cycles

Pros: Very fast.
Cons: Must pad with zeroes.

Next:

        lda (GfxPtr),Y
        and (MaskPtr),Y
        sta GRP0
    ;+13 cycles

Pros: Still pretty fast. Only need to pad your Mask with zeroes, so doesn't use as much space.
Cons: Even padding one table with zeroes is a lot of wasted ROM.

Now, the fancy ones:

SkipDraw:

        lda #SPRITEHEIGHT
        dcp SpriteTemp
        bcc SkipDraw
        lda (GfxPtr),Y
        sta GRP0
    ReturnFromSkipDraw
    ;+17 cycles

Pros: Still pretty fast, and runs in constant time if you set up the SkipDraw branch correctly. Requires (almost) no zero padding.
Cons: Doesn't write to GRP0 every line (can be necessary if you are using VDEL); for this reason you do need at least one zero of padding on all sprite graphics. Not as fast as the other methods. A minor pain to set up the variables. Can be a huge hassle to set up the SkipDraw branch if your kernel is complicated.
Note: If you didn't notice, it uses an illegal opcode.

A variant of SkipDraw that I call DoDraw:

        lda #SPRITEHEIGHT
        dcp SpriteTemp
        bcs DoDraw
        lda #0
        .byte $2C
    DoDraw
        lda (GfxPtr),Y
        sta GRP0
    ;+18 cycles

Pros: Runs in constant time, relatively fast, and doesn't require any branches in/out of your kernel. (Just make sure that short branch doesn't cross a page boundary!) No padding required in your graphics. Writes to GRP0 every time - sometimes required when you want to use VDEL.
Cons: The slowest method so far. The '.byte $2C' opcode-skip won't work on the SuperCharger and maybe other exotic bankswitch schemes. Still a minor pain to set up your variables.
Note: This guy uses an illegal opcode also.

Finally, the fanciest of the fancy:

SwitchDraw:

        cpy SpriteTop
        beq SwitchDraw
        bmi WaitDraw
        lda (GfxPtr),Y
        sta GRP0
    ReturnFromSwitchDraw
    ;+15 cycles

Elsewhere, you need...

    SwitchDraw
        lda SpriteBottom
        sta SpriteTop
        jmp ReturnFromSwitchDraw
    WaitDraw
        SLEEP 4
        bmi ReturnFromSwitchDraw ;branch always

Note that SpriteBottom = the bottom of the sprite ORed with $80.

Pros: The fastest non-padding method. Again, runs in constant time.
Cons: *Only works for Y < 128!* So if you want to use it over the whole screen you have to do some tricky, and generally very painful, kernel setup and graphics interlacing.

Edited July 2, 2008 by vdub_bobby

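The variable setup that SkipDraw and DoDraw depend on is not shown above. One way it might look, as a sketch in DASM syntax: assume Y counts down from KERNEL_LINES to 1, SPRITEHEIGHT is the sprite's height minus one, and the graphics are stored bottom row first. KERNEL_LINES, P0Bottom (the line-counter value of the sprite's lowest row) and Sprite0Data are hypothetical names introduced here for illustration.

```asm
; Sketch only. Assumes "processor 6502" and include "vcs.h" (WSYNC, GRP0).
; KERNEL_LINES, P0Bottom and Sprite0Data are hypothetical names.
KERNEL_LINES = 93
SPRITEHEIGHT = 7                ; height - 1, for an assumed 8-line sprite

; --- during vertical blank ---
        lda #KERNEL_LINES+1
        sec
        sbc P0Bottom
        sta SpriteTemp          ; after the DCP on line Y, SpriteTemp = Y - P0Bottom

        sec
        lda #<Sprite0Data
        sbc P0Bottom
        sta GfxPtr
        lda #>Sprite0Data
        sbc #0                  ; propagate the borrow into the high byte
        sta GfxPtr+1            ; (GfxPtr),Y now reads Sprite0Data[Y - P0Bottom]

; --- kernel, using the DoDraw form ---
        ldy #KERNEL_LINES
KernelLoop
        sta WSYNC
        lda #SPRITEHEIGHT
        dcp SpriteTemp          ; illegal opcode: DEC SpriteTemp, then CMP SpriteTemp
        bcs DoDraw              ; carry set while 0 <= SpriteTemp <= SPRITEHEIGHT (unsigned)
        lda #0                  ; outside the sprite: draw a blank line
        .byte $2C               ; BIT abs opcode swallows the 2-byte LDA that follows
DoDraw
        lda (GfxPtr),y
        sta GRP0
        dey
        bne KernelLoop
```

With this setup the counter reaches SPRITEHEIGHT on the sprite's top row and 0 on its bottom row, then wraps to $FF, so the unsigned compare fails again for the rest of the kernel. The same two setup steps would be repeated with a second counter and pointer for P1.
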
Propane13 Posted July 3, 2008 Author

THIS IS AWESOME!

I really think the masking method is innovative-- that never would have crossed my mind. I'm going to study these a little more in detail. I can see that using these, I can set up P0 and P1 graphics with no worries.

I was thinking I needed a bunch of branches that handled all case statements and ran dynamically, i.e.

1) If nothing this line, branch
2) If drawing P0 and not P1, jump to that routine
3) If drawing P0 and P1, jump to that routine
etc...

This takes a problem I was making way too complicated and makes it really nice and easy.

Thanks a bunch!
-John

supercat Posted July 11, 2008 (edited)

Another drawing method, not listed, is to do something like this. Assume sprites have at least one line of padding available on top and bottom, so that the two sprites will never begin or end on the same scan line. Assume further that sprites don't quite go to the top and bottom of the screen.

    toploop_early:
        SLEEP 7
    toploop:
        ; .. 52 cycles of other stuff
        lda #0
        sta GRP0
        sta GRP1
        iny
        zzz ; nop 255 or something similar
        cpy sprite1top
        beq sprite1start_early
        cpy sprite0top
        bne sprite0start
    sprite0loop: ; ** STARTS ONE CYCLE EARLIER THAN THE OTHERS!
        ; .. 53 cycles of other stuff
        lda (sprite0),y
        sta GRP0
        lda #0
        sta GRP1
        iny
        zzz
        cpy sprite1bot
        beq toploop_early
        cpy sprite1top
        bne sprite0loop
        nop
    bothloop:
        ; .. 52 cycles of other stuff
        lda (sprite0),y
        sta GRP0
        lda (sprite1),y
        sta GRP1
        iny
        cpy sprite0bot ; Sprites must be chosen so zero ends first!
        bne bothloop
        etc...

Cycle counts may not be right above, but a key feature of the code is that sprite decision-making time is minimized in the loop when both sprites are displayed (both sprites are handled in 23 cycles, including the INY and looping branch). Hammering the code into shape may be a pain, but it's faster even than maskdraw.

BTW, while interleaving sprite data (for a two-line unrolled loop) may seem an ugly approach, it has a lot of advantages and I'd recommend it if one is trying to optimize for speed. Among other things, having a loop counter go from 0 to 95 instead of 0 to 191 greatly increases the portion of each page that can be accessed without page crossings.

Edited July 11, 2008 by supercat

Propane13 Posted July 14, 2008 Author (edited)

Hey, I was thinking of something.

As SkipDraw takes some cycles, do people usually do something like this (excuse the pseudo-code):

    if (Player0's horiz position = right of screen) and (Player1's horiz position = right of screen)
    {
        Kernel 1: do SkipDraw immediately after WSYNC for both P0 and P1
    }
    else if (Player0's horiz position = left of screen) and (Player1's horiz position = right of screen)
    {
        Kernel 2: do SkipDraw with P0's update immediately after WSYNC, followed by P1's update
    }
    else if (Player0's horiz position = right of screen) and (Player1's horiz position = left of screen)
    {
        Kernel 3: do SkipDraw with P1's update immediately after WSYNC, followed by P0's update
    }
    else // (Player0's horiz position = left of screen) and (Player1's horiz position = left of screen)
    {
        Kernel 4: do SkipDraw with wasted cycles to pass the gfx, then update P0 and P1 with next line's data, instead of current
    }

I think to give full-range motion, this would be the only effective way to do it. Is this how others handle it? Or do they limit the horizontal range/position of their sprites?

Thanks!
-John

Edited July 14, 2008 by Propane13

batari Posted July 14, 2008

"Hey I was thinking of something. As SkipDraw takes some cycles, do people usually do something like this: (excuse the pseudo-code)"

Just use VDEL, then store P1 in the first 21 cycles of the scanline and store P0 wherever you want. No reason to make it more complicated than necessary, unless you've got some ideological aversion to VDEL or something.

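One reading of this suggestion, sketched in DASM syntax: with VDELP0 enabled, a write to GRP0 is held back and only becomes visible when GRP1 is next written, so the GRP0 store can land anywhere in the scanline without tearing the sprite. The pointer names P0GfxPtr and P1GfxPtr are hypothetical, and the sketch assumes zero-padded tables for brevity; a SkipDraw or DoDraw fetch could stand in for either load.

```asm
; Sketch only. Assumes "processor 6502" and include "vcs.h"
; (WSYNC, GRP0, GRP1, VDELP0). P0GfxPtr and P1GfxPtr are hypothetical names.

; --- before the kernel ---
        lda #1
        sta VDELP0              ; bit 0 set: P0 shows its delayed copy

; --- kernel ---
        ldy #93
KernelLoop
        lda (P1GfxPtr),y        ; fetch P1 before the line starts
        sta WSYNC
        sta GRP1                ; early in the line: updates P1 and releases the pending GRP0 value
        lda (P0GfxPtr),y
        sta GRP0                ; timing is free: not displayed until the next GRP1 write
        dey
        bne KernelLoop
```

In this arrangement the P0 data written on one line appears on the following line, so P0's pointer or vertical position would need a one-line adjustment to compensate.
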
Propane13 Posted July 15, 2008 Author

Thanks batari!

That's so simple, yet solves my problems. I had completely forgotten about VDEL, and initially had thought it something to make sprites with less granularity. In thinking about the 6-digit score routine, what you say makes perfect sense.

Ah... this community is so cool. I'm starting to understand things that long escaped my grasp these years. Thanks again for everyone's help. I hope I can use this knowledge to "build a better homebrew".

-John