Most efficient code for sprite animation


Is there efficient assembler library or template code for displaying an animated sprite? A few options come to mind: compute each frame's offset with a multiplication, or use a fixed mapping table. Also, the data layout should be compact, and I think you might waste some data space on alignment depending on the line height.
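
For the fixed mapping table option, a minimal sketch in DASM syntax might look like the following. The names `frame`, `FrameOffset` and `GetFrameOffset` are made up for illustration, and the 4 frames of 22 lines each are only an assumption. The table costs one ROM byte per frame but works for any frame height.

```asm
    processor 6502

    seg.u Variables
    org $80
frame   ds 1                ; hypothetical: current animation frame, 0..3

    seg Code
    org $f000

; Returns in X the offset of the current frame's first byte.
GetFrameOffset:
    ldx frame               ; 3 cycles
    lda FrameOffset,x       ; 4 cycles, no arithmetic needed
    tax                     ; 2 cycles
    rts

; One entry per animation frame (assumed: 4 frames of 22 lines each).
FrameOffset:
    .byte 0,22,44,66
```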


Just a comment, but at least to some degree the efficiency of the code would depend upon the height of the sprite. Very efficient code could be written for 16-line heights, for example. For this reason, this type of code is normally purpose-built to optimize for the particular situation.
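
To illustrate the 16-line case, here is a small sketch (the names `frame` and `GetFrameOffset16` are hypothetical). With a power-of-two height the multiplication collapses into shifts, and the result fits in one byte as long as the frame number stays in 0..15.

```asm
    processor 6502

    seg.u Variables
    org $80
frame   ds 1                ; hypothetical: current animation frame, 0..15

    seg Code
    org $f000

; Returns in X the offset frame * 16.
GetFrameOffset16:
    lda frame               ; 3 cycles
    asl                     ; 2 cycles, A = frame * 2
    asl                     ; 2 cycles, A = frame * 4
    asl                     ; 2 cycles, A = frame * 8
    asl                     ; 2 cycles, A = frame * 16
    tax                     ; 2 cycles
    rts
```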

 


    processor 6502
    include "vcs.h"
    include "macro.h"

char_height equ 9

    seg.u Variables
    org $80

temp 		.byte
digit0 		.byte
frame_cnt	.byte

    seg Code
    org $f000

Num 
    .byte #%01110000	; Zero
    .byte #%10001000
    .byte #%10001000
    .byte #%11001000
    .byte #%10101000
    .byte #%10011000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000	; 8
    
    .byte #%11111000	; One
    .byte #%00100000
    .byte #%00100000
    .byte #%00100000
    .byte #%00100000
    .byte #%00100000
    .byte #%10100000
    .byte #%01100000
    .byte #%00100000	; 17
    
    .byte #%11111000	; Two
    .byte #%10000000
    .byte #%01000000
    .byte #%00100000
    .byte #%00010000
    .byte #%00001000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000	; 26
    
    .byte #%01110000	; Three
    .byte #%10001000
    .byte #%10001000
    .byte #%00001000
    .byte #%01110000
    .byte #%00001000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000
    
    .byte #%00001000	; Four
    .byte #%00001000
    .byte #%00001000
    .byte #%11111000
    .byte #%10001000
    .byte #%01001000
    .byte #%00101000
    .byte #%00011000
    .byte #%00001000
    
    .byte #%01110000	; Five
    .byte #%10001000
    .byte #%10001000
    .byte #%00001000
    .byte #%00001000
    .byte #%11110000
    .byte #%10000000
    .byte #%10000000
    .byte #%11111000
    
    .byte #%01110000	; Six
    .byte #%10001000
    .byte #%10001000
    .byte #%10001000
    .byte #%11110000
    .byte #%10000000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000
    
    .byte #%00010000	; Seven
    .byte #%00010000
    .byte #%00010000
    .byte #%00010000
    .byte #%00010000
    .byte #%00010000
    .byte #%00001000
    .byte #%00001000
    .byte #%11111000
    
    .byte #%01110000	; Eight
    .byte #%10001000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000
    .byte #%10001000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000
    
    .byte #%00001000	; Nine
    .byte #%00001000
    .byte #%00001000
    .byte #%00001000
    .byte #%01111000
    .byte #%10001000
    .byte #%10001000
    .byte #%10001000
    .byte #%01110000
    .byte #0

Start:
    CLEAN_START

    lda #0
    sta digit0
    lda #0
    sta frame_cnt
    lda #$0E
    sta COLUP0
    sta COLUP1

Frame:
    ; VSYNC
    lda #2
    sta VSYNC
    sta WSYNC
    sta WSYNC
    sta WSYNC
    lda #0
    sta VSYNC

    ldx #36
VSYNC_Loop:
    sta WSYNC
    dex
    bne VSYNC_Loop

    ; load digit value
    lda digit0
    clc
    ; multiply by 9 (8+1)
    asl
    asl
    asl
    adc digit0
    ; add digit height offset
    adc #char_height
    tax
    sta WSYNC
	
    ; x indexes the offset from the beginning address of the digit bitmap table
    ; y counts the number of lines until the digit is finished
    ldy #char_height
Disp_num:
    dex
    lda Num,x
    sta GRP0
    sta WSYNC
    dey
    bne Disp_num
	
    lda #0
    sta GRP0


    ldx #183
VIS_Loop:
    sta WSYNC
    dex
    bne VIS_Loop
	
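    ; advance the digit once every 32 frames
    inc frame_cnt
    lda frame_cnt
    and #%00011111
    bne No_tick
    inc digit0
    ldx digit0
    cpx #10
    bne No_tick
    ldx #0              ; wrap from 9 back to 0
    stx digit0

No_tick:
    sta WSYNC           ; executed on every path so each frame has the same number of scanlines

    ldx #29             ; 1 + 29 = 30 overscan lines, 262 lines per frame in total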
Overscan_loop:
    sta WSYNC
    dex
    bne Overscan_loop

    jmp Frame

; Jump Table
    org $fffc
    .word Start
    .word Start

Here is a half-decent example, highly simplified.


Thanks very much. That's basically option 1, computing the offset on the fly, which works fine. Looking at Pitfall, they use a RAM pointer, and I'm not sure how it's computed or why it's needed. Also, weirdly enough, the player is 22 pixels high, just like in Pitfall 2, and the frames are padded to make them 22, which is an awkward number for computing addresses.
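
For what it's worth, here is a generic sketch of the RAM pointer approach. It is not taken from Pitfall; every name (`sprite_ptr`, `FrameLo`, `FrameHi`, `Frame0`, `Frame1`) and the placeholder bitmaps are assumptions. The frame address is looked up once per frame and stored in zero page, so the drawing loop needs no multiplication, and frames can be any height and sit anywhere in ROM.

```asm
    processor 6502

WSYNC   equ $02             ; TIA registers, same values as in vcs.h
GRP0    equ $1b

SPRITE_HEIGHT equ 22

    seg.u Variables
    org $80
frame       ds 1            ; hypothetical: current animation frame, 0 or 1
sprite_ptr  ds 2            ; zero-page pointer to the frame's bitmap

    seg Code
    org $f000

; Call during vertical blank: point sprite_ptr at the current frame.
SetSpritePtr:
    ldx frame
    lda FrameLo,x
    sta sprite_ptr          ; low byte of the frame address
    lda FrameHi,x
    sta sprite_ptr+1        ; high byte of the frame address
    rts

; Call in the visible area: draw one bitmap line per scanline.
DrawSprite:
    ldy #SPRITE_HEIGHT-1    ; bitmaps are stored bottom-up
DrawLoop:
    sta WSYNC
    lda (sprite_ptr),y      ; Y counts lines, the pointer selects the frame
    sta GRP0
    dey
    bpl DrawLoop
    lda #0
    sta GRP0                ; blank the player below the sprite
    rts

FrameLo:
    .byte <Frame0,<Frame1
FrameHi:
    .byte >Frame0,>Frame1

; Placeholder bitmaps, 22 bytes each.
Frame0:
    ds SPRITE_HEIGHT,$18
Frame1:
    ds SPRITE_HEIGHT,$3c
```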


1 hour ago, lucienEn said:

Also weird enough player is 22 pixel height just like Pitfall 2 and filled up to make it 22 which is odd number to compute addresses.

Height doesn't matter too much, though it can make things a bit faster. My digit example uses a character height of 9 and multiplies by 9. One way to multiply by 22:

load value (1) (t3)
clear carry (1) (t5)
multiply by 2 (2) (t7)
store temp value (2) (t10)
multiply by 2 (4) (t12)
multiply by 2 (8) (t14)
add temp value (10) (t17)
add original value (11) (t20)
multiply by 2 (22) (t22)
a to x (t24)

 

Overall, not too long of an operation, 24 cpu clocks.
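
Written out as DASM source, the steps above might look like this sketch. The names `value`, `temp` and `Times22` are hypothetical, and zero-page addressing is assumed, which is what gives the listed cycle totals. The result only fits in one byte for values up to 11 (11 * 22 = 242); in that range none of the shifts carries out, so the carry is still clear at the first add.

```asm
    processor 6502

    seg.u Variables
    org $80
value   ds 1                ; hypothetical: frame number, 0..11
temp    ds 1

    seg Code
    org $f000

; Returns in X the offset value * 22.
Times22:
    lda value               ; A = 1x     (t3)
    clc                     ;            (t5)
    asl                     ; A = 2x     (t7)
    sta temp                ; temp = 2x  (t10)
    asl                     ; A = 4x     (t12)
    asl                     ; A = 8x     (t14)
    adc temp                ; A = 10x    (t17)
    adc value               ; A = 11x    (t20)
    asl                     ; A = 22x    (t22)
    tax                     ;            (t24)
    rts
```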

 

Got it, probably not a magic number for performance.


4 minutes ago, lucienEn said:

Got it, probably not a magic number for performance.

Meh. You have 67 lines to set up your frame, and multiplying by 22 takes less than 1/3 of one of those lines, about 0.47% of that time. That's not too costly, and it's a constant-time operation. You could speed it up by wasting 2 bytes per picture and multiplying by 24, but you won't gain much and you'd lose ROM space.
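
A sketch of the multiply-by-24 variant, assuming each frame is padded from 22 to 24 bytes (`value` and `Times24` are hypothetical names). It comes to 16 cycles with zero-page addressing and stays within one byte for values up to 10 (10 * 24 = 240).

```asm
    processor 6502

    seg.u Variables
    org $80
value   ds 1                ; hypothetical: frame number, 0..10

    seg Code
    org $f000

; Returns in X the offset value * 24.
Times24:
    lda value               ; 3 cycles, A = 1x
    asl                     ; 2 cycles, A = 2x, the shift leaves carry clear
    adc value               ; 3 cycles, A = 3x
    asl                     ; 2 cycles, A = 6x
    asl                     ; 2 cycles, A = 12x
    asl                     ; 2 cycles, A = 24x
    tax                     ; 2 cycles
    rts
```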

Nice solution! But you need to move the "clear carry" to just before the first add, otherwise the shifts will clobber it. Alternatively, if we can be sure that the value being multiplied never exceeds 31 (the results break for anything over 11 anyway), we can dispense with the "clear carry" altogether, since the shifts will then always leave the carry flag at zero.

 

Just for fun I tried a small change:

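lda start_value   ; 3 cycles, A = 1x
asl               ; 2 cycles, A = 2x
asl               ; 2 cycles, A = 4x
asl               ; 2 cycles, A = 8x
(clc)             ; 2 cycles, optional as discussed above
adc start_value   ; 3 cycles, A = 9x
adc start_value   ; 3 cycles, A = 10x
adc start_value   ; 3 cycles, A = 11x
asl               ; 2 cycles, A = 22x
tax               ; 2 cycles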

As counterintuitive as it looks, I believe that it takes the same amount of ROM space (14 bytes with the clc, 13 without) and the same number of cycles (24 with the clc, 22 without), but now we don't need to use a temp value in RAM (or clobber the original value).

 

EDIT: Looking at this again, I realized that if we require a start value of less than 12, we can simply do:

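lda start_value   ; 3 cycles, A = 1x
asl               ; 2 cycles, A = 2x
asl               ; 2 cycles, A = 4x
adc start_value   ; 3 cycles, A = 5x
asl               ; 2 cycles, A = 10x
adc start_value   ; 3 cycles, A = 11x
asl               ; 2 cycles, A = 22x
tax               ; 2 cycles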

I think that this trims it down to 19 cycles and 11 bytes.

Nice.

That's some quality optimization. I hadn't had a chance to approach this graphwise.


@bent_pin It took me a little while to understand your graph, but it looks very interesting. I think I figured out how to use it, but could you confirm it?

  1. Find your desired multiplier on the graph.
  2. To traverse the graph at each number, move up one step if you can do so, otherwise move to the next number to the left.
  3. Work backwards toward one and make note of each operation (half = double the A register (asl), and one less = add the multiplicand again (adc)).
  4. Once you arrive at one, you'll have the fastest list of operations (in reverse order) to perform the multiplication.
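
As a worked example of these steps, assuming the graph (not reproduced in the thread text) encodes "halve when even, subtract one when odd": for a multiplier of 13 the path is 13, 12, 6, 3, 2, 1, which reversed gives asl, adc, asl, asl, adc. The names `value` and `Times13` are hypothetical, and the result fits in one byte for values up to 19 (19 * 13 = 247).

```asm
    processor 6502

    seg.u Variables
    org $80
value   ds 1                ; hypothetical: number to multiply, 0..19

    seg Code
    org $f000

; Returns in X the product value * 13.
Times13:
    lda value               ; A = 1x
    asl                     ; A = 2x, the shift leaves carry clear
    adc value               ; A = 3x
    asl                     ; A = 6x
    asl                     ; A = 12x
    adc value               ; A = 13x
    tax
    rts
```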

Pretty much.

I start by searching for my target so it's highlighted.

Always traveling right before going down, I write down the path's numbers.

Then I decide the edge weight as you described: +1 = adc, x2 = asl.

The key to speed is to realize that you will never do two adcs in a row. Then the rest of the graph wrote itself. It took an hour or two, but it was a fun puzzle.

 

Challenge me and see if I made any mistakes.

 

I'll make a demo of its use on Thursday because I'm off work then.
