MaPa Posted November 11, 2010
Just posting my test of using the MWP technique in a game scenario. It works, more or less; it is not ideally coded and it gave me too many headaches to write. One file uses a hardware sprite, one uses a software sprite, and one uses a software sprite with an indication of how much CPU time it takes to draw that single sprite. The yellow parts show the drawing of the character definitions (4x4 chars); the purple parts show overhead, which takes quite a lot of CPU time because of the MWP video RAM wrap etc. Yes, it was coded with the aim of making a finished game, but the project is frozen now (and has been for almost two years).
Attachments: dino.xex dino_soft.xex dino_soft_bars.xex
Rybags Posted November 11, 2010
Looks good... though you should know that now you'll be inundated with requests to finish it... or turn it into something else.
MaPa Posted November 11, 2010
Quote (Rybags): Looks good... though you should know that now you'll be inundated with requests to finish it... or turn it into something else.
I'm not worried. I'm a stoic person, so either I'll ignore it successfully, or it will push me to do something with it and finish it in some form, which would be good too.
Edited November 11, 2010 by MaPa
mimo Posted November 11, 2010
Looks very nice
Heaven/TQA Posted November 11, 2010
I just went yesterday into Boinxx... so I am a stoic person, too...
Heaven/TQA Posted November 11, 2010
I think MWP is useful for 5200 or 400 games, but I still don't see any advantage in real game situations?
popmilo Posted November 11, 2010
It looks like MWP is working... The soft sprite routine is rather slow, but I bet there must be a reason.
PS: I collected all the fruit and nothing happened. You really should do something about it.
MaPa Posted November 12, 2010
Quote (Heaven/TQA): I think MWP is useful for 5200 or 400 games, but I still don't see any advantage in real game situations?
IMHO it's for saving memory: in ANTIC mode 4 you need only about 1 kB for an 8-way scrolling screen instead of 4 kB (AFAIK that is what 8-way scrolling needs when done the normal way). But for a saving of just 3 kB it brings other complications.
Quote (popmilo): It looks like MWP is working... The soft sprite routine is rather slow, but I bet there must be a reason. PS: I collected all the fruit and nothing happened. You really should do something about it.
Hehe, the item collecting was added relatively recently. I must have been bored to add something new to it. The soft sprite routine can surely be made faster, but IMHO not by much unless you use totally unrolled loops with hardcoded data, which IMHO is unrealistic for a good game in 64 kB of RAM. Just the pure copying and masking of data into the 16 chars that the sprite occupies takes around 2400 cycles (the sprite is only 26 lines high, so the full 4 char rows are not copied), which is around 40 scanlines, and that's around 7 char lines (due to badlines).
You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset, and prepare a pointer to its definition. In MWP the screen is not linear: it can wrap back to the beginning of the video memory area anywhere, so you need to check for that condition etc., and it all sums up to some "nice" overhead. Pure copying takes 23 cycles per byte (indirect indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, where you have multiple lda, sta, adc, sbc, cmp, bne etc., and you easily get another 50+ cycles per character.
Edited November 12, 2010 by MaPa
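The cycle budget in the post above can be checked with a few lines of arithmetic. This is only a sketch of the numbers quoted in the thread, not MaPa's code: the function names are made up for illustration, the per-instruction counts are the standard 6502 timings for `(zp),y` addressing without page crossing, and the 50-cycle overhead per character is the rough figure from the post.

```python
# 6502 cycle counts for indirect indexed (zp),Y addressing, no page crossing.
CYCLES = {"lda (zp),y": 5, "and (zp),y": 5, "ora (zp),y": 5, "sta (zp),y": 6, "iny": 2}

def cycles_per_byte():
    """Cost of one masked byte copy: lda/and/ora/sta plus iny."""
    return sum(CYCLES.values())

def cycles_per_char(bytes_per_char=8):
    """Cost of copying one full 8-byte character definition."""
    return cycles_per_byte() * bytes_per_char

def sprite_copy_cycles(columns=4, lines=26):
    """Pure copy cost for a sprite 4 chars wide and 26 lines high."""
    return columns * lines * cycles_per_byte()

def with_overhead(chars=16, overhead_per_char=50):
    """Copy cost plus the assumed ~50 cycles of pointer overhead per char."""
    return sprite_copy_cycles() + chars * overhead_per_char

print(cycles_per_byte(), cycles_per_char(), sprite_copy_cycles(), with_overhead())
```

With these assumptions the pure copy comes to 2392 cycles, which matches the "around 2400" quoted above, and the per-character overhead adds roughly another 800.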
Heaven/TQA Posted November 12, 2010
MaPa... yeah, that's what I'm thinking, too... it makes things more complicated for a simple 3 kB... but on the 5200/400, where RAM is tight, it can be an advantage...
snicklin Posted November 12, 2010
MaPa, you are a seriously talented coder. This demo looks a little like a game on the Amiga several years ago. By the way, what is "MWP", which I keep seeing on this board?
sack-c0s Posted November 12, 2010
That's looking pretty damn good actually. I'd like to see it finished as well, but seeing as I'm sitting in the shadow of a massive pile of work I need to finish *yesterday*, I can sympathise on the finding-time front.
popmilo Posted November 12, 2010
Quote (MaPa): ... You also need to adjust the pointer used to read video RAM to get the char under the sprite, get its offset, and prepare a pointer to its definition. In MWP the screen is not linear: it can wrap back to the beginning of the video memory area anywhere, so you need to check for that condition etc., and it all sums up to some "nice" overhead. Pure copying takes 23 cycles per byte (indirect indexed lda, and, ora, sta = 21 cycles without page crossing, + iny = 23 cycles total) * 8 bytes per char = 184 cycles to copy one char. Then comes the overhead, where you have multiple lda, sta, adc, sbc, cmp, bne etc., and you easily get another 50+ cycles per character.
I have done soft sprites on the C64, and never thought about nonlinear screens... But it's the same core principle as yours, I guess: lda, and, ora, sta (all absolute, x or y indexed). In my routine I separated it this way:
"restore screen chars under previous sprite position"
"new background to buffer"
"sprite to buffer"
"buffer chars to screen"
I have a rough skeleton of a new routine that would combine the second and third steps, assuming random chars under the sprite. Calculations show it would be significantly faster. And on the A8, with its faster CPU, it should be possible to make it faster still... All this uses lookup tables for shifting and a lot of self-modifying code.
On the C64 it makes sense because 8 sprites (4x4 chars in size) take 128 chars and 128 are left for the background... On the A8 this is not the case, so I'm thinking of using some "bitmap"-like mode... maybe made of chars to get that one extra color, but who knows...
Are you using preshifted sprite data?
popmilo Posted November 12, 2010
Quote (snicklin): MaPa, you are a seriously talented coder. This demo looks a little like a game on the Amiga several years ago. By the way, what is "MWP", which I keep seeing on this board?
I don't know if you know about "AtariWiki", but this is great info: http://atariwiki.strotmann.de/wiki/Wiki.jsp?page=Ironman%20Atari#section-Ironman+Atari-MWP
MaPa Posted November 12, 2010
Quote (snicklin): By the way, what is "MWP", which I keep seeing on this board?
MWP stands for Minimum Wrapping Principle or something like that. AFAIK analmux "invented" it some years ago. This scrolling technique allows 8-way scrolling and uses about one screen's worth of memory. It uses 2 LMS commands in the display list (DLIST): the first points to the first displayed line (of course), and the second is positioned so that, at the end of the memory area, it points back to its beginning. When scrolling, the position of the second LMS command varies.
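To make the "second LMS" idea a bit more concrete, here is a very simplified model of where that second LMS would have to sit for a given scroll position. It is an illustration only, not the code from the demo: the function name, the 48-byte line width, the 24 rows and the buffer size are all assumptions, and the model further assumes that a line straddling the end of the buffer can be read linearly from a shadow copy placed right after it. Real ANTIC constraints such as the 4K boundary are ignored.

```python
def second_lms(start, line_width, rows, buf_size):
    """Find the display row that needs the second LMS.

    start      -- offset of the top-left screen byte inside the buffer
    line_width -- bytes fetched per display line (assumed value)
    rows       -- number of displayed character rows (assumed value)
    buf_size   -- size of the circular video memory area in bytes

    Returns (row, wrapped_offset) for the first row whose start lies at or
    beyond the end of the buffer, or None if no wrap is visible.
    """
    for row in range(1, rows):
        offset = start + row * line_width
        if offset >= buf_size:
            return row, offset - buf_size
    return None

# Hypothetical numbers: 24 rows of 48 bytes in a 1152-byte buffer.
print(second_lms(100, 48, 24, 1152))
print(second_lms(0, 48, 24, 1152))
```

As the scroll position (`start`) changes, the returned row changes too, which is the "position of the second LMS command varies" part of the description.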
Heaven/TQA Posted November 12, 2010
MWP somehow reminds me of sync scrolling on the ST... weird cycle-exact Shifter manipulation at the top of the screen and a lookup table...
MaPa Posted November 12, 2010
Quote (popmilo): I have done soft sprites on the C64, and never thought about nonlinear screens... But it's the same core principle as yours, I guess: lda, and, ora, sta (all absolute, x or y indexed). In my routine I separated it this way:
I don't know how it is on the C64, but on the ATARI I can't imagine right now how I could use absolute,x or absolute,y indexing if 256 bytes are not enough to cover all sprite definitions, character definitions etc. I would need to self-modify the code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop):

    lda ($fa),y  ; load char definition data under soft sprite
    and ($f8),y  ; AND mask
    ora ($fc),y  ; ORA sprite data
    sta ($f6),y  ; save to sprite char definition
    iny
    lda ($fa),y  ; 2. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 3. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 4. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 5. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 6. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 7. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny
    lda ($fa),y  ; 8. byte
    and ($f8),y
    ora ($fc),y
    sta ($f6),y
    iny

If I replaced the indirect (),y addressing with absolute,x, I would need to change each address in 8 places. Or I could leave the loop rolled up, which adds 3 cycles per byte for the BNE (or whatever) and saves 4 cycles per byte by changing (),y into $abs,x or $abs,y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?
Quote (popmilo): Are you using preshifted sprite data?
Yes, I am.
Edited November 12, 2010 by MaPa
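The trade-off described above (indirect `(),y` unrolled versus self-modified absolute addressing) can be put into rough numbers. The sketch below is an estimate under stated assumptions, not a measurement: the instruction timings are standard 6502 counts without page crossing, but the setup costs are hypothetical and assume the new address is already sitting in A (low byte) and X (high byte). The looped variant also counts the final branch as taken, so it overestimates by one cycle.

```python
# 6502 cycle counts, no page crossing.
INDIRECT_Y = {"lda": 5, "and": 5, "ora": 5, "sta": 6}   # op (zp),y
ABSOLUTE_Y = {"lda": 4, "and": 4, "ora": 4, "sta": 5}   # op abs,y
INDEX_STEP = 2    # iny or dey
BNE_TAKEN = 3

# Assumed setup costs (hypothetical): address already in A (low) and X (high).
ZP_POINTER_SETUP = 3 + 3   # sta zp + stx zp
ABS_PATCH_SETUP = 4 + 4    # sta abs + stx abs into an instruction operand

def unrolled_indirect(n_bytes=8, pointers=4):
    """Unrolled (zp),y copy: four zero-page pointers are set up once."""
    per_byte = sum(INDIRECT_Y.values()) + INDEX_STEP
    return n_bytes * per_byte + pointers * ZP_POINTER_SETUP

def looped_absolute(n_bytes=8, pointers=4):
    """Rolled-up abs,y loop: four operands patched once, BNE paid per byte."""
    per_byte = sum(ABSOLUTE_Y.values()) + INDEX_STEP + BNE_TAKEN
    return n_bytes * per_byte + pointers * ABS_PATCH_SETUP

def unrolled_absolute(n_bytes=8, pointers=4):
    """Unrolled abs,y copy: every operand is patched in every unrolled copy."""
    per_byte = sum(ABSOLUTE_Y.values()) + INDEX_STEP
    return n_bytes * per_byte + n_bytes * pointers * ABS_PATCH_SETUP

print(unrolled_indirect(), looped_absolute(), unrolled_absolute())
```

Under these assumptions the rolled-up absolute loop lands at about the same cost per character as the unrolled indirect version, while unrolling the absolute version loses heavily to the patching, which is in line with the doubt expressed in the post.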
snicklin Posted November 12, 2010
Thanks PopMilo and MaPa, I can see how this works now. Hmm, it's a powerful technique to say the least. I guess though that it starts to get very complicated when you're using tiles, that'll be another "layer" on top of your calculations.
Edited November 12, 2010 by snicklin
analmux Posted November 12, 2010
Quote (MaPa): MWP stands for Minimum Wrapping Principle or something like that.
(edited one time) Exactly (not "minimal warp principal").
But respect for trying to combine software sprites and MWP scrolling. That was never my intention. The SMB3 stuff was supposed to work only with PM gfx, with no software sprites, or at least not in the scrolling zones, only in the Bowser zones.
OK, all problems can be solved of course, but you need to keep track of the 2nd copy (2nd LMS) screen line. Then, if you're doing software sprites, the first problem will be the shadow copies, which sometimes occur and sometimes don't.
To combine scrolling and software sprites, I'd use a different scheme: use only vertical wrapping with 24 LMSes and lines 80 characters long, then do double buffering only in the horizontal direction. OK, then it will need twice as much screen memory.
Edited November 12, 2010 by analmux
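A small model of the alternative scheme suggested above (one LMS per line, vertical wrapping only, 80-byte lines) may help to see where the "twice as much screen memory" comes from. This is one possible reading of the suggestion, sketched with invented names and an arbitrary base address; the 40-byte visible line used for comparison is an assumption as well.

```python
def screen_bytes(rows, line_width):
    """Memory needed for a screen of `rows` lines of `line_width` bytes."""
    return rows * line_width

def lms_addresses(base, top_row, x_offset, rows=24, line_width=80):
    """One LMS address per displayed line, wrapping vertically only.

    base     -- start of screen memory (hypothetical address)
    top_row  -- buffer row shown on the first display line
    x_offset -- horizontal byte offset inside each 80-byte line
    """
    return [base + ((top_row + i) % rows) * line_width + x_offset
            for i in range(rows)]

print(screen_bytes(24, 80), screen_bytes(24, 40))
print([hex(a) for a in lms_addresses(0x4000, 22, 5)[:3]])
```

With 24 lines of 80 bytes the buffer is 1920 bytes, twice the 960 bytes of a plain 40-byte-wide screen. Because every line has its own LMS, no line is ever split by a wrap, so a software sprite routine would not have to check for a wrap inside a line.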
Heaven/TQA Posted November 12, 2010
Analmux... but then where is the advantage of the updated MWP method? I am still a fan of the NES/GB method.
popmilo Posted November 12, 2010
Quote (MaPa): I don't know how it is on the C64, but on the ATARI I can't imagine right now how I could use absolute,x or absolute,y indexing if 256 bytes are not enough to cover all sprite definitions, character definitions etc. I would need to self-modify the code and prepare several addresses in several places, or have kilobytes of code for the various situations and combinations. For example, I have something like this (sped up a little by unrolling the loop): ... If I replaced the indirect (),y addressing with absolute,x, I would need to change each address in 8 places. Or I could leave the loop rolled up, which adds 3 cycles per byte for the BNE (or whatever) and saves 4 cycles per byte by changing (),y into $abs,x or $abs,y. On the other hand, preparing ZP pointers takes fewer cycles than preparing $abs addresses somewhere in RAM. Or am I missing something?
Nothing wrong with your approach.
On the C64 I used 4 vertically placed chars to get a "column" of 32 bytes. 4 columns are 128 bytes, so any byte of the sprite can be reached with indexed addressing.
This is what my code for sprites 21 lines high and 3 bytes wide looks like:

    ;main sprite routine (masked)
    ;the $1000 operands are placeholders that are set before the main loop
    spriteplot_main
                  ldy #20
    spriteplot_01 ldx $1000,y ;x=m0
    spriteplot_02 lda $1000,x ;a=shl_mask(m0)
    spriteplot_03 and $1000,y ;a=m0 and ch0
    spriteplot_04 ldx $1000,y ;x=s0
    spriteplot_05 ora $1000,x ;a=a or shl(s0)
    spriteplot_06 sta $1000,y ;left byte

    spriteplot_11 ldx $1000,y ;x=m0
    spriteplot_12 lda $1000,x ;a=shr_mask(m0)
    spriteplot_13 ldx $1000,y ;x=m1
    spriteplot_14 ora $1000,x ;a=a or shl_mask(m1)
    spriteplot_15 and $1000,y ;a=a and ch1
    spriteplot_16 ldx $1000,y ;x=s0
    spriteplot_17 ora $1000,x ;a=a or shr(s0)
    spriteplot_18 ldx $1000,y ;x=s1
    spriteplot_19 ora $1000,x ;a=a or shl(s1)
    spriteplot_10 sta $1000,y ;middle byte 1

    spriteplot_21 ldx $1000,y ;x=m1
    spriteplot_22 lda $1000,x ;a=shr_mask(m1)
    spriteplot_23 ldx $1000,y ;x=m2
    spriteplot_24 ora $1000,x ;a=a or shl_mask(m2)
    spriteplot_25 and $1000,y ;a=a and ch2
    spriteplot_26 ldx $1000,y ;x=s1
    spriteplot_27 ora $1000,x ;a=a or shr(s1)
    spriteplot_28 ldx $1000,y ;x=s2
    spriteplot_29 ora $1000,x ;a=a or shl(s2)
    spriteplot_20 sta $1000,y ;middle byte 2

    spriteplot_31 ldx $1000,y ;x=m2
    spriteplot_32 lda $1000,x ;a=shr_mask(m2)
    spriteplot_33 and $1000,y ;a=m2 and ch3
    spriteplot_34 ldx $1000,y ;x=s2
    spriteplot_35 ora $1000,x ;a=a or shr(s2)
    spriteplot_36 sta $1000,y ;right byte
                  dey
                  bpl spriteplot_01
                  rts

It does require a lot of addresses to be set before the main loop, but if I remember correctly, it was well worth it. The saving from absolute vs. indirect addressing over the 21 iterations of the main loop was good enough to make it work faster. I will have to do those calculations again.. it does look fishy... And I wouldn't be able to do the lookup-table masking and shifting without ",x"...
I will try to refine this code and combine it with a direct background read in the main loop and see how it goes.
If it's slow -> out goes the masking.
If it's still slow -> out goes shifting online, in goes preshifted data.
If it's still slow -> reduce vertical resolution.
If it's still slow -> reduce size of sprites.
If it's still slow -> go to the bar and get drunk.
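The lookup-table shifting and masking mentioned above can be illustrated with a short model. This is a generic sketch of the idea, not a transcription of the routine in the post: the table and function names are invented, the mask convention (a 1 bit keeps the background) is an assumption, and the edge bytes are handled by padding with transparent data rather than by separate code paths.

```python
def build_shift_tables(k):
    """Tables for shifting a byte right by k bits (k in 0..7).

    right[b] -- the part of b that stays in the same screen byte
    carry[b] -- the part of b that spills into the next screen byte
    """
    right = [b >> k for b in range(256)]
    carry = [(b << (8 - k)) & 0xFF for b in range(256)]
    return right, carry

def plot_row(background, sprite, mask, k):
    """Composite one row of a 3-byte-wide sprite onto 4 background bytes.

    mask bits set to 1 keep the background, bits set to 0 make room for
    the sprite (assumed convention).
    """
    right, carry = build_shift_tables(k)
    # Transparent padding on both sides so edge bytes need no special case.
    s = [0x00] + list(sprite) + [0x00]
    m = [0xFF] + list(mask) + [0xFF]
    out = []
    for i in range(4):
        shifted_sprite = carry[s[i]] | right[s[i + 1]]
        shifted_mask = carry[m[i]] | right[m[i + 1]]
        out.append((background[i] & shifted_mask) | shifted_sprite)
    return out

row = plot_row([0xAA] * 4, [0xFF, 0x00, 0xFF], [0x00, 0xFF, 0x00], 4)
print([hex(b) for b in row])
```

On a 6502 each table lookup replaces a run of shift instructions with a single indexed load, which is why the routine needs the `,x` addressing. The price is 256 bytes per table and shift amount, so preshifted sprite data remains the alternative when memory allows.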
emkay Posted November 13, 2010
This thing is really great. Hopefully it will turn into a game sometime? It already looks very good, and seeing the protagonist, heck, back in the 80s this could have become the "known A8 hero", just like Mario on the Nintendo.
I'm not sure whether the CPU usage is such a big problem. Reducing the height of the game screen and putting a 32-byte-wide info panel beneath it gives some additional CPU cycles. .... whatever.
In this state, all PM is free? Placing additional moving elements anywhere on the screen gets easier now than multiplexing one player into 2. What a nice view of what's possible.
MaPa Posted November 13, 2010
Quote (emkay): This thing is really great. Hopefully it will turn into a game sometime?
Because of all the responses here I'm thinking about returning to it and continuing, probably after a complete rewrite.
Quote (emkay): It already looks very good, and seeing the protagonist, heck, back in the 80s this could have become the "known A8 hero", just like Mario on the Nintendo.
If I remember right, the protagonist was drawn by PG and Ooz did its animation.
Quote (emkay): In this state, all PM is free? Placing additional moving elements anywhere on the screen gets easier now than multiplexing one player into 2.
In the file with the soft sprite, yes, but in the other one, with hardware sprites, the protagonist uses 2 out of 4.
Heaven/TQA Posted November 13, 2010
PG "loves" doing animation... he is more into static gfx...