
A Game and SoftSprites (PMs. sprites overlay)


José Pereira


Hi, I'm probably coming back to an old topic, but...

I have another game that might be possible on the A8.

A remake of GREEN BERET, but based on the NES version, Rush'n Attack.

It uses ANTIC 4 in 48-wide mode for scrolling.

-> Playing area 21 char lines high, with the charsets assigned per char line like this:

0, 1, 2, 3, 0, 1, 2, 3, ... (and so on until the 21st line)

One more charset and 2 char lines for a status area at the bottom that uses only chars/PFs.

-> 10 DLIs that only change PF2 & PF3.

-> Our player is PM0, the flames are PM1.

-> Enemies use:

-> P2 and M2 are 2 enemies.

-> P3 and M3 are another 2 enemies.

-> 2 multiplexed on top of the bottom PM2 (and two more from PM3 if cycles allow).

Enemies mostly walk only horizontally (3 chars + 1 for shifting x 32 pixels vertical)

-> Only our guy and one other enemy move vertically:

3 chars + 1 char for shifting x 32 vertical (plus 8 pixels for vertical shifting)

post-6517-129286127198_thumb.png

 

 

More information if you want, but please give me your opinions...

(Everything moving at 50 Hz?

OR

Our Green Beret and the flying/jumping guy at 50 Hz, plus 3+3 enemies at 25+25 Hz?)

 

 

Thanks.

Greetings.

José Pereira.


Hi Jose,

 

It really depends on the situation.

 

• Option 1: Using a standard general multipurpose engine. It does include full background masking and sprite-on-sprite masking. No self-modifying code, no unrolled loops.

 

• Option 2: Using, e.g., cartridge space for code. Then you can use unrolled loops of machine language. This might speed up the softsprite routines by 30%, maybe 40%.

• Option 3: Using self-modifying code. This might help win back some CPU cycles, as we can use 'direct addressing' instructions instead of 'indexed addressing'.

• Option 4: Using a smart engine which drops the AND/ORA part of the masking procedure where masking is not needed (e.g. on an empty background, or if one part of the sprite already fills the full 8*4 pixel area of one character).

 

• A smart combination of the previous 4 options, optimized for your specific situation.

 

 

For option 1, a software sprite of 3*3 characters/tiles in Antic 4 already needs 12% of CPU time in the ideal case, but 25% in the worst case (i.e. it covers a 4*4 area because it crosses character boundaries). This is with respect to PAL timing.

 

A fast & sloppy calculation says: In the worst case situation, an 8*8 char block already eats all CPU time of a 50Hz frame.
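
As a rough illustration of that estimate, here is a minimal Python sketch. It assumes the usual PAL figures of 312 scanlines per frame and 114 machine cycles per scanline, ignores the cycles lost to ANTIC DMA, and assumes the cost grows linearly with the number of characters touched. The function names are made up for this example.

```python
# PAL Atari 8-bit: 312 scanlines per frame, 114 machine cycles per scanline.
PAL_SCANLINES = 312
CYCLES_PER_SCANLINE = 114
FRAME_CYCLES = PAL_SCANLINES * CYCLES_PER_SCANLINE  # raw cycles, before any DMA loss


def cycles_for_share(percent):
    """Convert a percentage of one 50 Hz frame into raw machine cycles."""
    return FRAME_CYCLES * percent // 100


def worst_case_percent(chars_wide, chars_high, ref_chars=16, ref_percent=25.0):
    """Scale the quoted worst case (4*4 chars = 25%) linearly to another block size."""
    return ref_percent * (chars_wide * chars_high) / ref_chars


print(FRAME_CYCLES)               # raw cycles in one PAL frame
print(cycles_for_share(12))       # ideal case for a 3*3 sprite
print(cycles_for_share(25))       # worst case for a 3*3 sprite (4*4 chars touched)
print(worst_case_percent(8, 8))   # an 8*8 char block under the same scaling
```

Under these assumptions an 8*8 block (64 chars) is four times the 16 chars of the worst case, which is where the "whole frame" figure comes from.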

 

 

...but, on the other hand, in the world of precalculated gfx and cheating, much more should be possible. Especially in combination with option 4...


...thinking of another solution...

 

There are multiple copies of the same 'guys' in some areas of the screen.

 

Now, if we take a better look, it's striking that the difference in their positions always seems a multiple of 4 pixels. Now also supposing these guys are walking in sync, and take advantage of the free space in the font memory, we could do the following.

 

If one 'bad guy' walks on screen (from left border), then we could reuse the graphics if a second (or even a third) bad guy enters the screen 12 or 16 frames later. Thus, don't overwrite the reserved area in the font space, but 'keep it' for some frames longer. Use some 'buffer swapping' techniques, and only overwrite again if font space is totally filled, or if 'new bad guys' are needed in a different setting. This could especially be a nice solution if we also need to take account of the background gfx, as in this case.
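
One way to picture this reuse is a small cache keyed by what has to be drawn. The sketch below is only a model in Python, not engine code: the class name, the key layout (animation frame plus the screen column the guy stands on) and the "flush everything when full" policy are all assumptions made for the illustration.

```python
class FontSlotCache:
    """Toy model: rendered sprite+background cells kept in font memory."""

    def __init__(self, capacity):
        self.capacity = capacity  # how many rendered cells fit in the reserved font area
        self.slots = {}           # key -> slot number in the reserved area
        self.renders = 0          # how often the expensive masking had to run

    def get(self, key):
        if key in self.slots:
            return self.slots[key]   # already rendered for an earlier guy: reuse it
        if len(self.slots) >= self.capacity:
            self.slots.clear()       # font space is full: start overwriting again
        slot = len(self.slots)
        self.slots[key] = slot
        self.renders += 1
        return slot


def walk(cache, columns, frames_in_cycle=4):
    """One guy walking across the given screen columns, one cell lookup per step."""
    return [cache.get((step % frames_in_cycle, col)) for step, col in enumerate(columns)]


cache = FontSlotCache(capacity=16)
first = walk(cache, range(8))    # first bad guy: every cell has to be rendered
second = walk(cache, range(8))   # second guy on the same path, in step: all reused
print(cache.renders)
```

In this model the second guy costs no masking at all, which is the saving described above. It only holds while both guys share the same animation phase and pass over the same background.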

 

This might be a very specific 'way of cheating' to get much faster routines to do what we want to reach here.

Thanks Analmux, some of these things I already know and others I don't...

For example, I've already learned the softsprite masking: LDA, AND, ORA, STA is 21 cycles/byte (but only 16 cycles on the lines where the freely shifting softsprite lands on clean bytes).

I also understand that if the enemies always move in the same constant way in x,y, we could just replace the screen chars: we could already have softsprite chars that simply replace the background gfx (take the background char off, put the softsprite char on, and then the background gfx again...). Will I get more free cycles this way?

Also, Rybags said some time ago (in that Gameboy Gradius thread) that we must also remember the 2:1 ratio in the horizontal colour clocks, but I don't see how I would win cycles from that...

I also optimized the 2-3 pixels of background colour at the bottom floor: 16 cycles only.
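
For readers following the cycle counts, here is a small Python sketch of what the masking does to one byte and of one breakdown that adds up to the 21 and 16 cycles mentioned. It assumes every instruction uses the (zp),Y indirect indexed mode without page-crossing penalties. The thread does not state the exact operands, so treat the breakdown as an assumption.

```python
# 6502 cycle counts for the (zp),Y addressing mode, no page crossing.
CYCLES_INDIRECT_Y = {"LDA": 5, "AND": 5, "ORA": 5, "STA": 6}


def masked_byte(background, mask, sprite):
    """LDA background / AND mask / ORA sprite / STA: keep background where mask bits are 1."""
    return (background & mask) | sprite


def cycles_per_byte(ops):
    """Sum the cycles of one instruction sequence applied to a single byte."""
    return sum(CYCLES_INDIRECT_Y[op] for op in ops)


full_masking = cycles_per_byte(["LDA", "AND", "ORA", "STA"])
no_and_step = cycles_per_byte(["LDA", "ORA", "STA"])
print(full_masking, no_and_step)
print(bin(masked_byte(0b11100100, 0b11000011, 0b00011000)))
```

Dropping the AND step is only safe where the sprite cannot leave stale pixels behind, for example over bytes known to be empty.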

This means that for simple, normal horizontal movement of softsprites this size I will need (including 1 char horizontally for shifting):

4 bytes (x) 30 lines (x) 21 cycles (=) 2520 cycles
4 bytes (x)  2 lines (x) 16 cycles (=)  128 cycles
--------------------------------------------------
TOTAL CYCLES / 1 sprite horiz.     (=) 2648 cycles

Normal way: Green Beret (+) 4 enemies (=) 5 (x) 2648 cycles (=) 13240

But this seems a little bit too much, doesn't it?

My screen uses something like (+/-) 11500 cycles (charsets, 48 wide, DLIs, PMs...).

With these values, how much is left for the other things?

If, for example, 18000 are free for all code, how much could be reserved for the softsprites?

Thanks.

José Pereira.

Even though I'm not a coder, I am trying to understand the basics and the cycle counting, so that I can work out all my thoughts and game/screen engines in a way that a coder can pick up without problems.

I must learn once and for all how many softsprites I can get on a screen...

And, by the way, could you explain what "unrolled loops", "self-modifying code", "'direct addressing' instead of 'indexed addressing'" and "4*4 chars crossing char boundaries" mean?

Thanks again.

José Pereira.

Normal way: Green Beret (+) 4 enemies (=) 5 (x) 2648 cycles (=) 13240

But this seems a little bit too much, doesn't it?

Yes it seems too much, at least using standard mode and not using any smarter engines. In reality it's even worse. The real problem is, you're only counting the bare LDA/AND/ORA/STA routine, which is indeed 21 cycles. But, this is only 50% or 60% of the total job, as you also need to compute, increment and recompute the (indirect) address pointers, involved in this process. So, I think 20000 cycles is somewhat more realistic.

 

 

My screen uses something like (+/-) 11500 cycles (charsets, 48 wide, DLIs, PMs...). With these values, how much is left for the other things?

If, for example, 18000 are free for all code, how much could be reserved for the softsprites?

Yes, the point is: We shouldn't use ALL cpu-time, just running a softsprite engine. I would say, 60% of CPU-time might be a real maximum, as we would also need time for the music player, PM underlays and game logic. Anyway, it's hard to say.
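
To put numbers on this, here is a hedged Python sketch. It assumes a PAL frame of 312 * 114 raw machine cycles and treats the 60% cap and the "50% or 60% of the total job" remark as simple percentages. Whether the cap applies to the whole frame or only to what is left after the display overhead is not stated, so the overhead is a parameter. The function names are invented for this example.

```python
FRAME_CYCLES = 312 * 114  # PAL: scanlines per frame * machine cycles per scanline


def softsprite_budget(share_percent=60, display_overhead=0):
    """Cycles available to soft sprites if they may use share_percent of what remains."""
    return (FRAME_CYCLES - display_overhead) * share_percent // 100


def realistic_cost(bare_cycles, useful_percent):
    """Scale a bare LDA/AND/ORA/STA count up when it is only useful_percent of the job."""
    return bare_cycles * 100 // useful_percent


print(softsprite_budget())            # 60% of the raw frame
print(softsprite_budget(60, 11500))   # 60% of what is left after the quoted display cost
print(realistic_cost(12000, 60), realistic_cost(12000, 50))
```

The 12000 in the last line is just a round example value. Feeding in the bare total from the post above shows why the real cost lands well above the bare count.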

 

However, one solution might be changing from antic 4 to antic 5 gfx mode, if you get into trouble when you're halfway during development. Antic 5 only needs one half of the sprite data, and it also gives you back some DMA.

 

 

And, by the way, could you explain what "unrolled loops", "self-modifying code", "'direct addressing' instead of 'indexed addressing'" and "4*4 chars crossing char boundaries" mean?

• Unrolled loops: By definition a loop restarts a number (let's say number N) of times. But, instead of CHECKING a counter, we can repeat the same subroutine of instructions N times in the source code itself.

 

Instead of:

FOR X=1 TO 3:PRINT "HELLO":NEXT X

 

...we will write

PRINT "HELLO":PRINT "HELLO":PRINT "HELLO"

One advantage is that one of the index registers (X and Y) in Machine Code will be free again. Another advantage, the CPU doesn't need to execute the "FOR" & "NEXT" commands, so less CPU-cycles are needed. However, we need more memory. That's why it's a good solution only if we have a large amount of RAM or ROM. Anyway, it has advantages and disadvantages.
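
The trade-off can be put into numbers with a small Python sketch that generates an unrolled 6502 copy routine as text and compares it with the looped version. The labels src and dst are placeholders, the loop is assumed to be LDA src,X / STA dst,X / DEX / BNE with no page crossings, and the LDX setup is left out.

```python
def looped_copy_cycles(n):
    """LDA abs,X (4) + STA abs,X (5) + DEX (2) + BNE (3 taken, 2 on the last pass)."""
    return n * (4 + 5 + 2 + 3) - 1


def looped_copy_bytes():
    """Code size of the loop body: 3 + 3 + 1 + 2 bytes, independent of n."""
    return 3 + 3 + 1 + 2


def unrolled_copy_source(n, src="src", dst="dst"):
    """Emit n LDA/STA pairs with absolute addressing, no index register needed."""
    lines = []
    for i in range(n):
        lines.append(f"    LDA {src}+{i}")
        lines.append(f"    STA {dst}+{i}")
    return "\n".join(lines)


def unrolled_copy_cycles(n):
    """LDA abs (4) + STA abs (4) per byte."""
    return n * (4 + 4)


def unrolled_copy_bytes(n):
    """Each LDA abs / STA abs pair takes 3 + 3 bytes of code."""
    return n * 6


print(unrolled_copy_source(3))
print(looped_copy_cycles(3), unrolled_copy_cycles(3))
print(looped_copy_bytes(), unrolled_copy_bytes(3))
```

The unrolled version is faster per byte but its code size grows with every byte copied, which is the memory cost mentioned above.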

 

 

• Self-modifying code: Instructions in indirect indexed addressing mode need more cpu-cycles. Dynamically changing the code itself, we could do a direct "POKE" or "PEEK" instead. This also has advantages and disadvantages. But, it could be hard to explain in a few sentences. One disadvantage is that it's extremely useless when combining this with unrolled code. In case of an unrolled loop of N steps, we'd also need to "self-modify" N addresses.
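
A rough cycle model in Python shows why patching pays off in a loop but not in unrolled code. Everything here is an assumption for illustration: four operands per masked byte, a patch cost of LDA zp + STA abs for each of the two address bytes, no page crossings, and loop overhead left out of the first two cases.

```python
INDIRECT_Y = {"LDA": 5, "AND": 5, "ORA": 5, "STA": 6}  # (zp),Y
ABSOLUTE_Y = {"LDA": 4, "AND": 4, "ORA": 4, "STA": 5}  # abs,Y with patched operands
ABSOLUTE = {"LDA": 4, "AND": 4, "ORA": 4, "STA": 4}    # abs with patched operands

# Assumed cost to rewrite one 16-bit operand: (LDA zp + STA abs) for low and high byte.
PATCH_CYCLES = 2 * (3 + 4)


def indirect_cost(n_bytes):
    """Plain indirect indexed masking, no code patching."""
    return n_bytes * sum(INDIRECT_Y.values())


def patched_loop_cost(n_bytes, operands=4):
    """Patch the operands once, then loop over n_bytes with abs,Y addressing."""
    return operands * PATCH_CYCLES + n_bytes * sum(ABSOLUTE_Y.values())


def patched_unrolled_cost(n_bytes, operands=4):
    """Unrolled code: every copy of the routine has its own operands to patch."""
    return n_bytes * (operands * PATCH_CYCLES + sum(ABSOLUTE.values()))


print(indirect_cost(32), patched_loop_cost(32), patched_unrolled_cost(32))
```

In this model the patched loop is somewhat cheaper than the indirect version, while patching every unrolled copy costs far more than it saves.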

 

 

• About 4*4Chars...I suppose it means the same as the following, which you already wrote in the 1st post:

 

...3Chars+1Char Shifting x 32vertical (more 8pixels vertical shifting)

This would be 3(X)*4(Y) chars in the minimal (or "ideal") case, and 4(X)*5(Y) in the worst case.


Hi,

There is nothing faster than unrolling code. Actually the fastest thing is to unroll the sprite data into the code as well, because then you can omit unnecessary instructions (like ORA #$00 or AND #$ff). Of course this comes at the expense of memory. Here's something I did many years ago (and which will be part of one of those very-very-long lasting projects/ideas ;-) ). It creates 8 soft-sprites of 9x16 pixels with PM underlay at full frame rate in Graphics 15 with wide display. Full masking and background restoration. Sorry, the underlays are only squares and will show some artifacts, but that's just because I didn't paint them yet.

post-17404-129292889301_thumb.png

ShipsMultiplexDLI.avi.zip

Edited by peter.dell

Thanks for sharing, Peter.

Very interesting... but just some questions:

-> You are using 4 players + 4 multiplexed, without any missiles?

-> If so, could you add the missile to each sprite? That would make them 10 pixels wide, and many C64 sprites (like the ones you have here) could probably convert well to 10 pixels...

Would adding the 4+4 missiles change anything in the speed/cycles?

 

 

Greets.

José Pereira.

 

Just one more question:

"Using ANTIC 4 to get PF3 is out of the question because you would run out of cycles, right?"

(and would ANTIC 4 also be a bad choice because of the charset cycles and the 128 chars?)


-> You are using 4 players + 4 multiplexed, without any missiles?

Yes

-> If so, could you add the missile to each sprite? That would make them 10 pixels wide, and many C64 sprites (like the ones you have here) could probably convert well to 10 pixels... Would adding the 4+4 missiles change anything in the speed/cycles?

The sprites are based on C64 sprites, but completely drawn from scratch to match the memory layout. The major trick is to make them 9 pixels wide, because then you always have 3 bytes/12 pixels to copy/mask.

 

 

Byte   0   1   2    0   1   2    0   1   2    0   1   2    
Pixel  0123456789AB 0123456789AB 0123456789AB 0123456789AB 
Bitmap AAAAAAAAA... .AAAAAAAAA.. ..AAAAAAAAA. ...AAAAAAAAA
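
The diagram above can be reproduced with a few lines of Python. This is only a sketch, assuming 4 pixels per byte as in the diagram. The helper names are invented for the example.

```python
PIXELS_PER_BYTE = 4  # 2 bits per pixel, as in the diagram above


def bytes_needed(width, shift):
    """Bytes touched by a sprite row of `width` pixels shifted right by `shift` pixels."""
    return -(-(shift + width) // PIXELS_PER_BYTE)  # ceiling division


def preshifted_rows(width):
    """One text row per pixel shift 0..3, padded to the widest byte span."""
    span = max(bytes_needed(width, s) for s in range(PIXELS_PER_BYTE)) * PIXELS_PER_BYTE
    return ["." * s + "A" * width + "." * (span - s - width) for s in range(PIXELS_PER_BYTE)]


for row in preshifted_rows(9):
    print(row)
print([bytes_needed(9, s) for s in range(4)])
print([bytes_needed(10, s) for s in range(4)])
```

With 9 pixels every shift fits in 3 bytes. With 10 pixels the last shift spills into a fourth byte, so the copy/mask loop would have to handle one more byte per line.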

 

The missiles would be quite expensive. Probably 1 sprite less, because you double the number of PM bytes that must be cleared and you have to mask all missiles together using LDA ORA STA instead of LDA STA. In addition, DLI time might run out due to the additional "CLC:ADC #8:STA HPOSMx".

 

-> Using ANTIC 4 to get PF3 is out of the question because you would run out of cycles, right?

That depends. If you have few, big sprites and/or animated backgrounds it can be an advantage because then you can use unrolled code with absolute addressing "LDA spritedata,x STA chrset+$100" instead of "STA (P1),Y". Of course masking can become more complex if the sprite has many "holes".

 

To sum it up: It completely depends on the gameplay/design.

Edited by peter.dell

Me again...

You use 48-wide and bitmap mode.

You are moving the 8 sprites over (00), (01), (10) and (11).

-> The same, but if you were moving the sprites only over the background colour (00), could you get 2 or more extra sprites on the screen (same size... forget the 2 extra players needed for now)?

-> In this bitmap-mode screen you move 8 sprites; if you had this screen in ANTIC 4 with 1 charset on each char line, how many sprites of this size could you move?

 

 

Thanks.

Greetings.

José Pereira.


-> The same, but if you were moving the sprites only over the background colour (00), could you get 2 or more extra sprites on the screen (same size... forget the 2 extra players needed for now)?

Without players and with double buffering, 15 are possible, 14 are realistic (you need some time for game logic/music).

post-17404-12929576569_thumb.png

 

-> In this bitmap-mode screen you move 8 sprites; if you had this screen in ANTIC 4 with 1 charset on each char line, how many sprites of this size could you move?

That's hard to tell as I never implemented it that way. You have additional DMA, and you need #lines*2*1K of memory for the double-buffered charsets. I guess 7-8 should be possible.
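
The memory side of that answer is easy to work out. A minimal Python sketch, assuming an ANTIC 4 charset of 128 characters at 8 bytes each (1K) and one charset per char line, with two copies of each for double buffering:

```python
CHARS_PER_SET = 128   # characters in one ANTIC 4 charset
BYTES_PER_CHAR = 8    # 8 rows of one byte each
CHARSET_BYTES = CHARS_PER_SET * BYTES_PER_CHAR  # 1K per charset


def double_buffered_charset_bytes(char_lines):
    """#lines * 2 * 1K: one charset per char line, two copies for double buffering."""
    return char_lines * 2 * CHARSET_BYTES


print(CHARSET_BYTES)
print(double_buffered_charset_bytes(3))    # only three distinct charsets
print(double_buffered_charset_bytes(24))   # a separate charset on each of 24 char lines
```

The 3 and 24 are example values only. The point is that a separate charset on every char line eats most of a 64K machine, while a small repeating set of charsets stays cheap.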


Thanks.

Man, you're amazing!... You are making my day!

-> First question:

Bitmap on an empty background with multiplexed players (one player in each sprite): instead of 14, how many would we get?

-> 2nd question:

I was talking about ANTIC 4 with sprites moving over PFs, like on your screen.

All of that in ANTIC 4 with 1 charset for each char line, but moving over the background (00) colour only and, like you did, with 1 multiplexed player on each: how many would you get?

 

---------------------------------------------------------------------------------------------------------------------

A new question:

-> Have your 2 ships move over PFs (like Armalyte), but the enemies on background colour only, in ANTIC 4 with 1 charset for each char line.

P0 & P1: our 2 ships.

P2 & P3: multiplexed for the enemies.

How many enemies could you get?

_____________________________________________________________________________________________________________________

 

It would be great if you could answer just these last ones; I think something is starting to run in my mind ;)

Thanks.

Greetings.

José Pereira.


I think something is starting to run in my mind ;)

José Pereira.

Yes, I was thinking of something!...

I use Peter Dell's sprites for the enemies, but I can say that I am starting to remake the C64 Armalyte sprites in 9x16 pixels and they look very good (for now, you can see our 2 player ships here).

-> First: a clean screen using ANTIC 4 with only 4 colours (PF2):

post-6517-129303258856_thumb.png

.g2f File:armalyte_azul no sprites_g2f.zip

 

And here we start to get some things:

post-6517-129303273517_thumb.png

.g2f&.xex Files:2Ships&8Enemys.zip

 

What I've done:

-> ANTIC4

-> PRIOR0

-> Moving over a clean background (I'll discuss the star points later; I also have some ideas...)

-> Charsets like 0,1,2,0,1,2 (per char line)

(this way only 3 chars are needed for each sprite in each charset: 3x10 = 30 chars, leaving 98 free for gfx & fire)

-> Colours:

Backgr.-always Black

PF0- always White

PF1- Dark Blue here (Dark colour, change to any on other Levels possible)

PF2- Light Blue here (Light colour, change to any other colour on other Levels possible)

PF3- a DarkGray-(0,6) - Always this.

-> PF3 is used only on the enemy sprites

-> Our players are P0 for one ship and P1 for the other

(only M0 & M1 are used, with 5th player enabled, in front of our two ships; as 5th player the missiles go above all the background gfx with no colour clash)

-> P2 & P3 for enemies: I just multiplex one more.

(My idea is to have something like you see here, with the enemies interleaved:

softsprite with player underlay / softsprite only / softsprite with player underlay / ...

Other attack waves can have different combinations...)

10 softsprites, 4+2 multiplexed players and just 2 missiles... on ANTIC 4. Peter Dell, is this possible?

(or even 10 enemies...)

 

Thanks.

Greetings.

José Pereira.


Hi Jose, the concept is ok, following what we will be doing with Pacmania using the player overlays over the software sprites provides the best options. Ten software sprites at that size along with the other game procedures would be tight but should be do-able.


Hi Jose, the concept is ok, following what we will be doing with Pacmania using the player overlays over the software sprites provides the best options.

 

:ponder: Oops, now you said... no problem :P

José Pereira.

 

 

P.S. I was thinking about speed, as you would sometimes need to move the ships very fast...

Probably faster than in Pacmania, or not?

José Pereira.


First question: Bitmap on an empty background with multiplexed players (one player in each sprite): instead of 14, how many would we get?

Not much difference here since the major work per sprite is multiplexing/PM updates and they are independent of that.

Hence instead of 8 with background 10 are possible without, but only 9 are realistic.

post-17404-129303930017_thumb.png

 

All Antic 4 questions:

Sorry, José, but that is beyond estimation without implementation (and I cannot do it now ;-) ).

The idea with 3 charsets sounds reasonable. Using PF3 as grey for the softsprites is also a good idea.

But I really cannot give any estimate here. Also, multiplexing becomes very hard in char mode because I will not have scanline-exact DLIs. And having only 2 players statically available for multiplexing is hardly enough.

Apart from that, Armalyte uses many more sprites (for example for the shots) and they often move above the background.


It's best to think first about the available machine cycles we physically have to play with for all the procedures ... although it'd be a best guesstimate until it is actually put into practice, an experienced coder will know roughly what's what and set everything out with prior planning. Remember, it doesn't all have to be done in one frame.

 

EDIT.. Peter beat me to it with the reply :)

Edited by Tezz

Tezz, you can download the clean sprites .g2f screen.

It has the 0,1,2 charsets, 48-wide mode and 25 char lines in the playing area.

You also need 3 more char lines for the status area at the bottom, probably as charset 3.

How many cycles does this screen use? And how many are free (in total, and just for the softsprites)?

---------------------------------------------------------------------------------------------------------------------

 

Peter Dell:

About the fire/shooting, I was thinking of doing the shots as chars.

Like Turrican on the C64, where the white hi-res shots are chars. Here in Armalyte we would have PF0 always white and, like on the C64, PF1 & PF2 (changing colour according to the screen/level colours)

OR

Chars: PF0 always white and PF3 always dark gray.

 

 

 

Greets.

José Pereira.
