
Can I load columns/rows faster than this?



Is this the fastest way I can load a column or row after a WAIT?

 

DRAW: PROCEDURE
    IF coarseScroll = 1 THEN

        #offset = (levelWidth*#topRow)+(#rightColumn-19)
        WAIT

        #BACKTAB(  0    ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  20   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  40   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  60   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  80   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  100  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  120  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  140  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  160  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  180  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  200  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  220  ) = #levelRAM(  #offset   )

    ELSEIF coarseScroll = 2 THEN

        #offset = (levelWidth*#topRow)+#rightColumn
        WAIT

        #BACKTAB(  19   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  39   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  59   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  79   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  99   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  119  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  139  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  159  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  179  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  199  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  219  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #BACKTAB(  239  ) = #levelRAM(  #offset   )

    ELSEIF coarseScroll = 3 THEN

        #offset = ((levelWidth*#topRow)+#rightColumn-19)
        WAIT

        #BACKTAB(  0   ) = #levelRAM(  #offset     )
        #BACKTAB(  1   ) = #levelRAM(  #offset+1   )
        #BACKTAB(  2   ) = #levelRAM(  #offset+2   )
        #BACKTAB(  3   ) = #levelRAM(  #offset+3   )
        #BACKTAB(  4   ) = #levelRAM(  #offset+4   )
        #BACKTAB(  5   ) = #levelRAM(  #offset+5   )
        #BACKTAB(  6   ) = #levelRAM(  #offset+6   )
        #BACKTAB(  7   ) = #levelRAM(  #offset+7   )
        #BACKTAB(  8   ) = #levelRAM(  #offset+8   )
        #BACKTAB(  9   ) = #levelRAM(  #offset+9   )
        #BACKTAB(  10  ) = #levelRAM(  #offset+10  )
        #BACKTAB(  11  ) = #levelRAM(  #offset+11  )
        #BACKTAB(  12  ) = #levelRAM(  #offset+12  )
        #BACKTAB(  13  ) = #levelRAM(  #offset+13  )
        #BACKTAB(  14  ) = #levelRAM(  #offset+14  )
        #BACKTAB(  15  ) = #levelRAM(  #offset+15  )
        #BACKTAB(  16  ) = #levelRAM(  #offset+16  )
        #BACKTAB(  17  ) = #levelRAM(  #offset+17  )
        #BACKTAB(  18  ) = #levelRAM(  #offset+18  )
        #BACKTAB(  19  ) = #levelRAM(  #offset+19  )

    ELSEIF coarseScroll = 4 THEN

        #offset = ((levelWidth*(#topRow+11))+#rightColumn-19)
        WAIT

        #BACKTAB(  220   ) = #levelRAM(  #offset     )
        #BACKTAB(  221   ) = #levelRAM(  #offset+1   )
        #BACKTAB(  222   ) = #levelRAM(  #offset+2   )
        #BACKTAB(  223   ) = #levelRAM(  #offset+3   )
        #BACKTAB(  224   ) = #levelRAM(  #offset+4   )
        #BACKTAB(  225   ) = #levelRAM(  #offset+5   )
        #BACKTAB(  226   ) = #levelRAM(  #offset+6   )
        #BACKTAB(  227   ) = #levelRAM(  #offset+7   )
        #BACKTAB(  228   ) = #levelRAM(  #offset+8   )
        #BACKTAB(  229   ) = #levelRAM(  #offset+9   )
        #BACKTAB(  230   ) = #levelRAM(  #offset+10  )
        #BACKTAB(  231   ) = #levelRAM(  #offset+11  )
        #BACKTAB(  232   ) = #levelRAM(  #offset+12  )
        #BACKTAB(  233   ) = #levelRAM(  #offset+13  )
        #BACKTAB(  234   ) = #levelRAM(  #offset+14  )
        #BACKTAB(  235   ) = #levelRAM(  #offset+15  )
        #BACKTAB(  236   ) = #levelRAM(  #offset+16  )
        #BACKTAB(  237   ) = #levelRAM(  #offset+17  )
        #BACKTAB(  238   ) = #levelRAM(  #offset+18  )
        #BACKTAB(  239   ) = #levelRAM(  #offset+19  )

    ELSE
        WAIT
    END IF


    coarseScroll = 0
END

 


No.  The fastest way would be using ASM. 😄

 

In IntyBASIC, I suppose that is the fastest way if you are loading from RAM.  If you were loading from ROM, you could use RESTORE and READ, which use an auto-increment register to advance the pointer.  Unfortunately, IntyBASIC requires a label for RESTORE, so it can't be used on variables or arrays.  :(

 

The way you have it still requires round trips to RAM to read, increment, and write the counter, then read again to assign the value to the BACKTAB.

 

If you need faster than what you have, you'll probably need to implement it in Assembly Language to remove the redundancies in read/write to memory.

 

      -dZ.


One other thing:  you would be better off returning directly from each IF/ELSEIF block rather than letting execution fall through to the end, which costs an extra branch just to reach the bottom of the procedure, clear coarseScroll, and return.

 

So something like:

IF coarseScroll = 1 THEN
  ' ...
  coarseScroll = 0
  RETURN

ELSEIF coarseScroll = 2 THEN
  ' ...
  coarseScroll = 0
  RETURN

' ...

 

It's not much, but every little helps.

 

    -dZ.

56 minutes ago, carlsson said:

Isn't there any combination of SCREEN and VARPTR you could use here? Or maybe design a custom routine in inline assembly which replicates part of SCREEN but with dynamic entry point.

Since SCREEN only supports an 8-bit offset I tried this just to see if it updated backtab faster, but it didn't seem like it, same top row glitch when scrolling downwards. I'm updating one too many animated GRAM cards which delays the backtab updates if I understand it correctly, but I think DZ-Jay will be able to help me with some custom ASM when he has time, in the meantime I'll stare at the assembly until I have an aneurysm

 

    IF coarseScroll = 1 THEN

        #offset = (levelWidth*#topRow)+(#rightColumn-19)

        #columnBuffer(  0   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  1   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  2   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  3   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  4   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  5   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  6   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  7   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  8   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  9   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  10  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  11  ) = #levelRAM(  #offset   )

        WAIT
        SCREEN #columnBuffer, 0, 0, 1, 12, 1

 

Edited by Lillapojkenpåön

1 hour ago, carlsson said:

Isn't there any combination of SCREEN and VARPTR you could use here? Or maybe design a custom routine in inline assembly which replicates part of SCREEN but with dynamic entry point.

 

SCREEN is the fastest way to copy a large block of data into BACKTAB, but it has a considerable overhead in setting itself up for the copy.  I would only recommend it for large blocks, like the whole screen or a large chunk of it.

 

I suppose you could profile a SCREEN copy for a single column against the above code to see if there is any improvement in performance.  Personally, I would opt for an Assembly Language routine (no surprise there, since that's my default mode of execution).  It doesn't even have to be a dedicated special-purpose routine -- just something that copies source to target using registers.  The biggest cost in the above code is in the round-trip incurred by reading, updating, writing, then reading again the variables that track the array indices.  If you replace that with auto-increment registers and indirect mode memory accesses, you gain a considerable speed boost.

 

I'd be willing to help with the assembly routine if you'd like.

 

    -dZ.


44 minutes ago, Lillapojkenpåön said:

Since SCREEN only supports an 8-bit offset I tried this just to see if it updated backtab faster, but it didn't seem like it, same top row glitch when scrolling downwards. I'm updating one too many animated GRAM cards which delays the backtab updates if I understand it correctly,

 

Yes, SCREEN takes some effort to set itself up for the copy loop.  Once it starts copying, it is as fast as it can be, but that setup is a killer.  I do not knock it for that -- it is really good for what it is intended to do:  block copy of large chunks of the screen in the fastest way possible, while supporting arbitrary regions and dimensions.  That flexibility requires it to compute its source and target pointers at runtime, and set them up in memory before starting.

 

44 minutes ago, Lillapojkenpåön said:

but I think DZ-Jay will be able to help me with some custom ASM when he has time, in the meantime I'll stare at the assembly until I have an aneurysm

 

You betcha!  Let me know what you need, and I'll try to get something out soon. :)

 

44 minutes ago, Lillapojkenpåön said:

 

    IF coarseScroll = 1 THEN

        #offset = (levelWidth*#topRow)+(#rightColumn-19)

        #columnBuffer(  0   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  1   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  2   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  3   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  4   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  5   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  6   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  7   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  8   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  9   ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  10  ) = #levelRAM(  #offset   )
        #offset = #offset + levelWidth
        #columnBuffer(  11  ) = #levelRAM(  #offset   )

        WAIT
        SCREEN #columnBuffer, 0, 0, 1, 12, 1

 

 

OMG!  My eyes!!!  You are making things worse! 😱

 

You are now doing the same copy as before, but to an intermediate buffer instead of directly to the BACKTAB -- and right after that, doing the block-copy to the BACKTAB anyway!  😆

 

    -dZ.


By the way, if you have a finite number of levels, and your offsets are not dynamically based on the current state of game-play, you should be able to pre-compute the offset in a table.  If you do that, you can save the cost of the offset computation -- multiplications by non-powers-of-two are costly.
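
The table idea can be sketched in a few lines. This is Python rather than IntyBASIC, purely to show the arithmetic; the level dimensions and the names `ROW_OFFSET` and `column_offset` are made up for the illustration, and in a real game the table would presumably be constant data in ROM.

```python
LEVEL_WIDTH = 44    # hypothetical level width in cards, deliberately not a power of two
LEVEL_HEIGHT = 24   # hypothetical number of rows in the level

# Built once, ahead of time: ROW_OFFSET[row] == LEVEL_WIDTH * row
ROW_OFFSET = [LEVEL_WIDTH * row for row in range(LEVEL_HEIGHT)]


def column_offset(top_row, right_column):
    """Offset of the left-hand screen column, with no multiplication at run time."""
    return ROW_OFFSET[top_row] + (right_column - 19)


# Same result as (levelWidth * topRow) + (rightColumn - 19), but via a lookup.
example = column_offset(3, 25)
```

The lookup replaces one multiply per scroll step with one indexed read, which is the saving being described.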

 

     -dZ.


Absolutely not the fastest way.

 

This is way better. Only the target offset for the screen is 0-239. The source offset is 16-bit and the stride width is 16-bit.

 

DRAW: PROCEDURE
    IF coarseScroll = 1 THEN

        #offset = (levelWidth*#topRow)+(#rightColumn-19)
        WAIT
        SCREEN #levelRAM, #offset, 0, 1, 12, levelWidth

    ELSEIF coarseScroll = 2 THEN

        #offset = (levelWidth*#topRow)+#rightColumn
        WAIT
        SCREEN #levelRAM, #offset, 19, 1, 12, levelWidth

    ELSEIF coarseScroll = 3 THEN

        #offset = ((levelWidth*#topRow)+#rightColumn-19)
        WAIT
        SCREEN #levelRAM, #offset, 0, 20, 1

    ELSEIF coarseScroll = 4 THEN

        #offset = ((levelWidth*(#topRow+11))+#rightColumn-19)
        WAIT

        SCREEN #levelRAM, #offset, 220, 20, 1

    ELSE
        WAIT
    END IF


    coarseScroll = 0
END
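
For readers trying to picture what the six-argument form does, here is a rough Python model of the copy (not IntyBASIC, and not the actual library routine). It assumes the semantics described above: the source is walked with its own row stride, the target is walked 20 cards per row. The function name `screen_copy` and the toy level are hypothetical.

```python
BACKTAB_WIDTH = 20   # cards per screen row
BACKTAB_SIZE = 240   # 20 columns x 12 rows


def screen_copy(backtab, source, origin, target, cols, rows, source_width):
    """Copy a cols x rows block; the source advances by source_width per row."""
    for row in range(rows):
        src = origin + row * source_width
        dst = target + row * BACKTAB_WIDTH
        backtab[dst:dst + cols] = source[src:src + cols]
    return backtab


LEVEL_WIDTH = 64                          # hypothetical level width in cards
level = list(range(LEVEL_WIDTH * 12))     # toy level: each card holds its own index
backtab = [-1] * BACKTAB_SIZE             # -1 marks cards that were never written

offset = LEVEL_WIDTH * 0 + 30             # top row 0, source column 30
screen_copy(backtab, level, offset, 19, 1, 12, LEVEL_WIDTH)   # fill the right-hand column

right_column = [backtab[19 + BACKTAB_WIDTH * r] for r in range(12)]
```

One call replaces the twelve assignments and twelve offset updates of the unrolled version, and only the target offset has to fit the 0-239 screen range.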

 


7 hours ago, nanochess said:

Absolutely not the fastest way.

 

This is way better. Only the target offset for the screen is 0-239. The source offset is 16-bit and the stride width is 16-bit.

 

Oh! Then I misunderstood the manual, that's awesome! THANKS!

 

14 hours ago, DZ-Jay said:

OMG!  My eyes!!!  You are making things worse! 😱

 

You are now doing the same copy as before, but to an intermediate buffer instead of directly to the BACKTAB -- and right after that, doing the block-copy to the BACKTAB anyway!  😆

 

    -dZ.

Stop bullying me! 😆 I knew that was bad but I didn't think SCREEN supported a 16-bit source offset


7 minutes ago, cmadruga said:

How about:

 

ON coarseScroll GOTO …

 

instead of  IF ELSEIF ELSEIF…

 

Yes, that will improve it when you have more than two cases and they are all contiguous.  It only needs to compute the jump table address once, as opposed to having to compare the value for each ELSEIF.

 

If you can guarantee that every value the expression can take has an entry in the target list, then you can make it even faster with "ON x FAST GOTO," which dispenses with bounds checking.
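
As a loose analogy in Python (not IntyBASIC), a jump table is a list indexed by the value, instead of a chain of comparisons. The handler names are invented, the mapping of 1-4 to left/right/top/bottom follows the DRAW procedure earlier in the thread, and the "falls through when out of range" behaviour of the checked version is my assumption about what the bounds check buys you.

```python
def no_scroll():
    return "wait only"

def left_column():
    return "left column"

def right_column():
    return "right column"

def top_row():
    return "top row"

def bottom_row():
    return "bottom row"

# One entry per possible coarseScroll value, 0 through 4.
HANDLERS = [no_scroll, left_column, right_column, top_row, bottom_row]


def dispatch_checked(value):
    """Like ON x GOTO: check the range once, then jump through the table."""
    if 0 <= value < len(HANDLERS):
        return HANDLERS[value]()
    return "fall through"


def dispatch_fast(value):
    """Like ON x FAST GOTO: no range check, so the caller must guarantee it."""
    return HANDLERS[value]()
```

Either way the cost is one table index rather than up to four comparisons, and the fast form is only safe when the value can never leave the table.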

 

      -dZ.


Thanks guys, I was wondering at what point switching to ON x FAST GOTO would be faster..

I have another question, I don't know if this is even possible at all but I'm trying to use just a lerp function to return values with 8.8 fixed point precision, like 0.5 instead of just integers

First I'm trying to flip the bytes of my fixed point #playerX variable so the integer is in the MSB,
then I try to flip the bytes of the 16-bit lerp result so I can continue using it like a fixed point variable

 

DEF FN lerp(a, b, t) = a + (t * (b-a))


    #int = (#playerX AND $00FF) * 256
    #dec = (#playerX AND $FF00) / 256

    '#playerX flipped
    #temp16 = #dec + #int

    'lerp between 88.0 and #playerX with a t value of 0.5
    #scrollSpeed = lerp(0.88, #temp16, 5.0)  - 0.88

    #int = (#scrollSpeed AND $00FF) * 256
    #dec = (#scrollSpeed AND $FF00) / 256

    'result flipped
    #scrollSpeed = #dec + #int



This gives crazy results, am I doing something wrong or is it just not possible to do this way?

Edited by Lillapojkenpåön

I haven't tested it (I don't have access to my PC right now), but I think it is because you have a mess of formats.

 

I believe that using fractional constants in IntyBASIC will automatically convert them to Q8.8 format in reverse order (with the integer in the low byte).  So, it seems that your Lerp function will use that format.

 

So, why are you flipping the #playerX and #scrollSpeed variables?  You should maintain them in their Q8.8 format.  You can then use "dot" arithmetic on them to keep them that way.

 

My recommendation is to pick one format: if you want to use fractional constants (0.88, 0.5) then you should make sure you keep the variables in reverse format using "dot" arithmetic (because IntyBASIC automatically converts fractional constants like that).  If you want to use the normal format (integer in MSB), then avoid using fractional constants and convert your numbers yourself -- then make sure you keep the values in that format by avoiding "dot" arithmetic, etc.
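
To make the two layouts concrete, here is a small Python sketch (not IntyBASIC) that encodes the same numbers both ways. The function names are made up; the resulting words agree with the constants that appear in this thread ($0200 and $0040 for 2.0 and 0.25 with the integer on top, $5800 and $0080 for 88.0 and 0.5).

```python
def to_normal(value):
    """Q8.8 with the integer in the high byte and the fraction in the low byte."""
    return round(value * 256) & 0xFFFF


def swap_bytes(word):
    """Exchange the high and low bytes of a 16-bit word."""
    return ((word << 8) | (word >> 8)) & 0xFFFF


def to_flipped(value):
    """Same number, but with the fraction in the high byte and the integer in the low byte."""
    return swap_bytes(to_normal(value))


for number in (0.25, 0.5, 2.0, 88.0):
    print(number, hex(to_normal(number)), hex(to_flipped(number)))
```

Mixing words from the two columns in one expression is what produces nonsense, so whichever layout you pick, every operand has to be in it.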

 

In any case, the flipping is crazy when all you want is to operate arithmetically on the numbers.

 

   dZ.


Because I couldn't imagine that *. works, and it seems like it doesn't, I'm terrible at math but I think the multiplication would mess everything up since it would treat the values as regular un-flipped 16-bit values? That's the only reason,
notice that I also flipped 88.0 and 0.5 to get unflipped values, which I exactly just now realised is not how that works 🤦‍♂️
I'm gonna try replacing it with the actual values I want there..

EDIT:
Nope
#scrollSpeed = lerp($5800, #temp16, $0080) - $5800
still crazy scrolling
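
One likely reason, sketched in Python rather than IntyBASIC: multiplying two Q8.8 words gives a result scaled by 256 twice, so it has to be scaled back down by 256, and the intermediate product no longer fits in 16 bits. The model below assumes a plain multiply that wraps at 16 bits; `lerp_naive16` and `lerp_q88` are hypothetical names, and 120.0 is just a sample player position.

```python
MASK16 = 0xFFFF


def lerp_naive16(a, b, t):
    """a + t*(b-a) with no rescale, everything truncated to 16 bits."""
    return (a + t * ((b - a) & MASK16)) & MASK16


def lerp_q88(a, b, t):
    """Q8.8 lerp: the wide product is divided by 256 before it is added back."""
    return (a + ((t * (b - a)) >> 8)) & MASK16


a = 0x5800   # 88.0
b = 0x7800   # 120.0, a sample player X
t = 0x0080   # 0.5

wrapped = lerp_naive16(a, b, t)   # the product overflows and its low 16 bits are zero
correct = lerp_q88(a, b, t)       # 0x6800, i.e. 104.0, halfway between 88 and 120
```

On a 16-bit machine the wide product is the hard part, so in practice t and (b-a) need to be kept small enough (for example an 8-bit t times an 8-bit pixel difference) for the multiply to fit before dividing by 256.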

Edited by Lillapojkenpåön

4 hours ago, Lillapojkenpåön said:

Because I couldn't imagine that *. works, and it seems like it doesn't, I'm terrible at math but I think the multiplication would mess everything up since it would treat the values as regular un-flipped 16-bit values? That's the only reason,
notice that I also flipped 88.0 and 0.5 to get unflipped values, which I exactly just now realised is not how that works 🤦‍♂️
I'm gonna try replacing it with the actual values I want there..

EDIT:
Nope
#scrollSpeed = lerp($5800, #temp16, $0080) - $5800
still crazy scrolling


What do you mean that the “.” does not work?

 

Is there no support for dot-multiplication? (I am out of town right now, so I can’t check the manual.)

 

I thought it did addition, subtraction, and multiplication, but I could be wrong.

 

    dZ.

1 hour ago, DZ-Jay said:


What do you mean that the “.” does not work?

 

Is there no support for dot-multiplication? (I am out of town right now, so I can’t check the manual.)

 

I thought it did addition, subtraction, and multiplication, but I could be wrong.

 

    dZ.

Nope, you can only do simple multiplication

Another unrelated thing I noticed you can't do with 8.8 are comparisons like this..

CONST ACC = 0.25
CONST MAX_SPEED = 2.0
IF #playerVelocityX +. ACC > MAX_SPEED THEN..




This was my workaround, I would love to know if it can be done in a better way?

CONST ACC = 0.25
CONST MAX_SPEED = 2.0

DEF FN frac(val)  = ((val) AND $FF00)
DEF FN int(val) = ((val) AND $00FF)


    #temp16 = #playerVelocityX +. ACC

    IF int(#temp16) = int(MAX_SPEED) THEN
        IF frac(#temp16) > frac(MAX_SPEED) THEN..

 

Edited by Lillapojkenpåön

7 hours ago, Lillapojkenpåön said:

Nope, you can only do simple multiplication
 

 

I see.

 

7 hours ago, Lillapojkenpåön said:

Another unrelated thing I noticed you can't do with 8.8 are comparisons like this..

CONST ACC = 0.25
CONST MAX_SPEED = 2.0
IF #playerVelocityX +. ACC > MAX_SPEED THEN..

 

 

Well, true:  since the fraction is in the MSB, a larger fractional part makes the whole word compare as the "larger" number, whatever the integer part is.
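
A two-value example in Python (not IntyBASIC) shows the effect. The helper names are invented, and the fraction is kept below 0.5 so that the top bit is clear and signed versus unsigned comparison makes no difference.

```python
def flipped(integer, fraction):
    """16-bit word with the fraction byte on top and the integer byte below."""
    return ((fraction & 0xFF) << 8) | (integer & 0xFF)


def value_of(word):
    """The real number a flipped word stands for."""
    return (word & 0xFF) + (word >> 8) / 256


speed = flipped(1, 0x40)       # 1.25 -> $4001
max_speed = flipped(2, 0x00)   # 2.0  -> $0002

raw_says_faster = speed > max_speed                        # compares the raw words
really_faster = value_of(speed) > value_of(max_speed)      # compares the numbers they stand for
```

The raw comparison reports 1.25 as greater than 2.0 because $40 in the top byte outweighs anything in the bottom byte.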

 

7 hours ago, Lillapojkenpåön said:

This was my workaround, I would love to know if it can be done in a better way?

CONST ACC = 0.25
CONST MAX_SPEED = 2.0

DEF FN frac(val)  = ((val) AND $FF00)
DEF FN int(val) = ((val) AND $00FF)


    #temp16 = #playerVelocityX +. ACC

    IF int(#temp16) = int(MAX_SPEED) THEN
        IF frac(#temp16) > frac(MAX_SPEED) THEN..

 

 

Maybe you should not use "+." at all and format your Q8.8 fractions with the integer in the upper part -- like normal people do. ;)

 

Then, when you need the integer portion, just divide by 256 and Bob's your uncle.

 

It seems that would alleviate many of the problems you are encountering.

 

The original point of flipping the integer to the lower byte was as an optimization when updating sprite velocities linearly:  you can then just mask and copy the value directly into the lower-byte of a sprite register without having to swap it first.

 

However, seeing that you have to go through so much trouble to compensate for that, that it limits the operations you can do on the values, and that it needs to add additional code on every arithmetic operation to account for the carry bit -- it seems that it is much more trouble than it is worth.

 

The benefits of formatting your fractions with the integer in the MSB include that all fixed-point arithmetic operations work normally, the sign is propagated correctly, and logical comparisons work as expected.  The drawback is that when you need to extract the integer portion, it requires shifting it down 8 bits.  The good news is that this is simply a SWAP plus AND operation, and is only needed in the exceptional case.

 

     -dZ.


23 hours ago, Lillapojkenpåön said:

Nope, you can only do simple multiplication

Another unrelated thing I noticed you can't do with 8.8 are comparisons like this..

CONST ACC = 0.25
CONST MAX_SPEED = 2.0
IF #playerVelocityX +. ACC > MAX_SPEED THEN..




This was my workaround, I would love to know if it can be done in a better way?

CONST ACC = 0.25
CONST MAX_SPEED = 2.0

DEF FN frac(val)  = ((val) AND $FF00)
DEF FN int(val) = ((val) AND $00FF)


    #temp16 = #playerVelocityX +. ACC

    IF int(#temp16) = int(MAX_SPEED) THEN
        IF frac(#temp16) > frac(MAX_SPEED) THEN..

 


One quick hack to try is to swap high and low bytes before comparisons.  You can do this in ASM:

ASM MVI var_&TEMP16, R0
ASM SWAP R0
ASM MVO R0, var_&TEMP16_FIX

IF (#Temp16_Fix = $200) Then
   ...


That swaps the bytes in #Temp16 and saves it to #Temp16_Fix.  You can then do comparisons normally with other values formatted with the integer in the upper byte.
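
In Python terms (a model of the idea, not of the generated code), the three ASM lines amount to the `swap_bytes` function below; `below_max` is a made-up helper standing in for the IF that follows.

```python
def swap_bytes(word):
    """What SWAP does to a 16-bit register: exchange the high and low bytes."""
    return ((word << 8) | (word >> 8)) & 0xFFFF


MAX_SPEED = 0x0200   # 2.0 with the integer in the upper byte


def below_max(flipped_word):
    """Swap a fraction-on-top word into integer-on-top form, then compare normally."""
    return swap_bytes(flipped_word) < MAX_SPEED


checks = {
    "1.25": below_max(0x4001),   # swaps to $0140
    "2.0": below_max(0x0002),    # swaps to $0200
    "2.25": below_max(0x4002),   # swaps to $0240
}
```

After the swap the integer byte dominates the comparison, so ordinary `<` and `>` order the values the way the numbers themselves are ordered.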

 

That said, I still recommend avoiding the built-in fraction support and formatting the values yourself with the integer in the higher byte.

 

    dZ.


14 hours ago, DZ-Jay said:


That said, I still recommend avoiding the built-in fraction support and formatting the values yourself with the integer in the higher byte.

 

    dZ.

I used to do that

but I switched it because personally I prefer this way,
I've done most of the game logic and there's only these two places I could use a swap, my lerp experiment doesn't count.
With the ASM you provided it's really clean now, thanks!
 

    CONST ACC = 0.25
    CONST ACC_SWAPPED = $0040

    CONST MAX_SPEED = 2.0
    CONST MAX_SPEED_SWAPPED = $0200


accelerate: PROCEDURE
    ASM MVI var_&PLAYERVELOCITYX, R0
    ASM SWAP R0
    ASM MVO R0, var_&TEMP16

    IF #temp16 + ACC_SWAPPED < MAX_SPEED_SWAPPED THEN
        #playerVelocityX = #playerVelocityX +. ACC
        RETURN
    ELSE
        #playerVelocityX = MAX_SPEED
    END IF
END


decelerate: PROCEDURE
    ASM MVI var_&PLAYERVELOCITYX, R0
    ASM SWAP R0
    ASM MVO R0, var_&TEMP16

    IF #temp16 > ACC_SWAPPED THEN
        #playerVelocityX = #playerVelocityX -. ACC
        RETURN
    ELSE
        #playerVelocityX = 0.0
    END IF
END

 


8 hours ago, Lillapojkenpåön said:

I used to do that

but I switched it because personally I prefer this way,


That’s fine.  Just know that it is more costly in almost every way.

 

 

    dZ.


3 hours ago, Lillapojkenpåön said:

It is?? 😮 What if I only have like six +./-. operations in my loop but ALOT of extracting the integer, still more costly?


The compiler needs to account for the Carry when adding and subtracting.  This is done with a cheap instruction, but it is required on every arithmetic operation.
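
To make the carry concrete, here is a small Python model of adding two values stored with the fraction in the high byte. It is only the arithmetic that the layout calls for, not a transcription of what the compiler emits, and `add_flipped` is a made-up name. It assumes the integer bytes themselves do not overflow past 255.

```python
def add_flipped(a, b):
    """Add two fraction-on-top words, wrapping the carry around into the integer byte."""
    total = a + b
    carry = total >> 16              # 1 when the two fraction bytes overflowed
    return (total + carry) & 0xFFFF  # the carry lands in bit 0, i.e. the integer byte


# 1.75 ($C001) + 0.5 ($8000): the fractions overflow, so the integer gains one.
a_sum = add_flipped(0xC001, 0x8000)

# 1.25 ($4001) + 0.25 ($4000): no overflow, no carry to add.
b_sum = add_flipped(0x4001, 0x4000)
```

With the integer in the upper byte, the same addition is a single plain add, because the carry out of the fraction already flows into the integer.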

 

Why would you be extracting the integer so much?  All logical operations should be done on the fixed point value and the integer is only needed when converting to physical space, i.e., to sprite or screen coordinates.

 

Obviously, I may be missing something, but it doesn’t seem you are saving so much with that “+./-.”  — especially if you need to compensate by swapping to compare and multiply.

 

    dZ.


4 hours ago, DZ-Jay said:


The compiler needs to account for the Carry when adding and subtracting.  This is done with a cheap instruction, but it is required on every arithmetic operation.

 

Why would you be extracting the integer so much?  All logical operations should be done on the fixed point value and the integer is only needed when converting to physical space, i.e., to sprite or screen coordinates.

Yup, obviously sprite position, and screen coordinates and overlap

    DEF FN top    =  ((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP)    / TILE_HEIGHT)
    DEF FN bottom =  ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) / TILE_HEIGHT)
    DEF FN above  = (((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP)    - 1) / TILE_HEIGHT)
    DEF FN bellow = (((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1) / TILE_HEIGHT)


    DEF FN left        =  (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT)  / TILE_WIDTH)
    DEF FN right       =  (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) / TILE_WIDTH)
    DEF FN besideLeft  = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT)  - 1) / TILE_WIDTH)
    DEF FN besideRight = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) + 1) / TILE_WIDTH)


    DEF FN middleX       =  ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT) + (HITBOX_RIGHT-HITBOX_LEFT)/2) / TILE_WIDTH)
    DEF FN middleY       =  ((((((#playerY AND 255) - PF_DIFF) - fineScrollY) + HITBOX_TOP)  + (HITBOX_BOTTOM-HITBOX_TOP)/2) / TILE_HEIGHT)


    DEF FN overlapRight  = (1           + (((#playerX AND 255) + HITBOX_RIGHT)  - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapLeft   = (TILE_WIDTH  - (((#playerX AND 255) + HITBOX_LEFT)   - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapDown   = (1           + (((#playerY AND 255) + HITBOX_BOTTOM) - fineScrollY) AND TILE_HEIGHT_MASK)
    DEF FN overlapUp     = (TILE_HEIGHT - (((#playerY AND 255) + HITBOX_TOP)    - fineScrollY) AND TILE_HEIGHT_MASK)

And I don't need to multiply any of them by 256 to get them to the high byte,
also I get the integer for animation frame offsets, and when to end or restart the animation, and some other things

IF (#swordFrame AND 255) < 3 THEN..
IF (#playerFrame AND 255) < playerStateFrames(playerState) THEN..
IF (#explosionFrame AND 255) < 6 THEN..

VARPTR playerAnimationIndex(playerState) + ((#playerFrame AND 255) * 4)
VARPTR swordGFX((#swordFrame AND 255) * 4)
VARPTR explosionGFX((#explosionFrame AND 255) * 4)

'where to place the sword
IF #playerDirection = FLIPX THEN
    temp1 = (#playerX AND 255) - 6
ELSE
    temp1 = (#playerX AND 255) + 6
END IF

'needed for semi-solids to work correctly
previousBellow = ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1)

IF (#playerVelocityY AND 255) > 128 THEN
    GOSUB moveUp
ELSE
    GOSUB moveDown
END IF

IF (#playerX AND 255) > 88 THEN
    temp1 = (#playerX AND 255) - 88
ELSEIF (#playerX AND 255) < 88 THEN
    temp1 = 88 - (#playerX AND 255)
END IF

IF (#playerX AND 255) <> 88 THEN..

A couple can be pre-calculated and are not needed, or could just be compared to a 16-bit value instead, but others I check twice, both in update() and draw(), to keep things simple,
so that's quite a lot. I think it pays off in my case since I only have seven +. and one -. in my entire code, and a lot of them won't happen on the same frame, just like only one of my two asm byte swaps can happen in one frame.


3 hours ago, Lillapojkenpåön said:

Yup, obviously sprite position, and screen coordinates and overlap

    DEF FN top    =  ((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP)    / TILE_HEIGHT)
    DEF FN bottom =  ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) / TILE_HEIGHT)
    DEF FN above  = (((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP)    - 1) / TILE_HEIGHT)
    DEF FN bellow = (((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1) / TILE_HEIGHT)


    DEF FN left        =  (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT)  / TILE_WIDTH)
    DEF FN right       =  (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) / TILE_WIDTH)
    DEF FN besideLeft  = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT)  - 1) / TILE_WIDTH)
    DEF FN besideRight = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) + 1) / TILE_WIDTH)


    DEF FN middleX       =  ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT) + (HITBOX_RIGHT-HITBOX_LEFT)/2) / TILE_WIDTH)
    DEF FN middleY       =  ((((((#playerY AND 255) - PF_DIFF) - fineScrollY) + HITBOX_TOP)  + (HITBOX_BOTTOM-HITBOX_TOP)/2) / TILE_HEIGHT)


    DEF FN overlapRight  = (1           + (((#playerX AND 255) + HITBOX_RIGHT)  - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapLeft   = (TILE_WIDTH  - (((#playerX AND 255) + HITBOX_LEFT)   - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapDown   = (1           + (((#playerY AND 255) + HITBOX_BOTTOM) - fineScrollY) AND TILE_HEIGHT_MASK)
    DEF FN overlapUp     = (TILE_HEIGHT - (((#playerY AND 255) + HITBOX_TOP)    - fineScrollY) AND TILE_HEIGHT_MASK)

 

Those constants can easily be formatted in Q8.8 format, with the integer portion in the upper byte.  ;)  As a matter of fact, it would be more accurate.

 

3 hours ago, Lillapojkenpåön said:

And I don't need to multiply any of them by 256 to get them to the high byte,
also I get the integer for animation frame offsets, and when to end or restart the animation, and some other things

 

If you count down animation timers, you only need to compare against zero.  You could also separate the frame counter (which is an integer) from the animation timer (which is a fractional value).
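
A sketch of that split, in Python rather than IntyBASIC. For simplicity the timer here counts whole game ticks instead of a fractional value, and the dictionary keys and function names are invented for the example.

```python
def make_animation(frame_count, ticks_per_frame):
    """Frame counter and timer kept as two separate count-down values."""
    return {"frame": frame_count - 1, "timer": ticks_per_frame, "reload": ticks_per_frame}


def tick(anim):
    """Advance one game tick; returns False once the last frame has run its course."""
    anim["timer"] -= 1
    if anim["timer"] != 0:      # only ever compared against zero
        return True
    anim["timer"] = anim["reload"]
    if anim["frame"] == 0:      # likewise: zero means the animation is over
        return False
    anim["frame"] -= 1
    return True


explosion = make_animation(frame_count=3, ticks_per_frame=2)
calls = 0
while True:
    calls += 1
    if not tick(explosion):
        break
```

Because both counters run down to zero, no per-animation limit has to be masked out of a fixed-point word and compared on every frame.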

 

3 hours ago, Lillapojkenpåön said:
IF (#swordFrame AND 255) < 3 THEN..
IF (#playerFrame AND 255) < playerStateFrames(playerState) THEN..
IF (#explosionFrame AND 255) < 6 THEN..

VARPTR playerAnimationIndex(playerState) + ((#playerFrame AND 255) * 4)
VARPTR swordGFX((#swordFrame AND 255) * 4)
VARPTR explosionGFX((#explosionFrame AND 255) * 4)

'where to place the sword
IF #playerDirection = FLIPX THEN
    temp1 = (#playerX AND 255) - 6
ELSE
    temp1 = (#playerX AND 255) + 6
END IF

'needed for semi-solids to work correctly
previousBellow = ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1)

IF (#playerVelocityY AND 255) > 128 THEN
    GOSUB moveUp
ELSE
    GOSUB moveDown
END IF

IF (#playerX AND 255) > 88 THEN
    temp1 = (#playerX AND 255) - 88
ELSEIF (#playerX AND 255) < 88 THEN
    temp1 = 88 - (#playerX AND 255)
END IF

IF (#playerX AND 255) <> 88 THEN..

A couple can be pre-calculated and are not needed, or could just be compared to a 16-bit value instead, but others I check twice, both in update() and draw(), to keep things simple,
so that's quite a lot. I think it pays off in my case since I only have seven +. and one -. in my entire code, and a lot of them won't happen on the same frame, just like only one of my two asm byte swaps can happen in one frame.

 

I think that's mostly overdone, but it's your code, so you do you. :)

 

  -dZ.
