Lillapojkenpåön, posted August 20:

Is this the fastest way I can load a column or row after a WAIT?

    DRAW: PROCEDURE
        IF coarseScroll = 1 THEN
            #offset = (levelWidth*#topRow)+(#rightColumn-19)
            WAIT
            #BACKTAB( 0 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 20 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 40 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 60 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 80 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 100 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 120 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 140 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 160 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 180 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 200 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 220 ) = #levelRAM( #offset )
        ELSEIF coarseScroll = 2 THEN
            #offset = (levelWidth*#topRow)+#rightColumn
            WAIT
            #BACKTAB( 19 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 39 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 59 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 79 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 99 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 119 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 139 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 159 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 179 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 199 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 219 ) = #levelRAM( #offset )
            #offset = #offset + levelWidth
            #BACKTAB( 239 ) = #levelRAM( #offset )
        ELSEIF coarseScroll = 3 THEN
            #offset = ((levelWidth*#topRow)+#rightColumn-19)
            WAIT
            #BACKTAB( 0 ) = #levelRAM( #offset )
            #BACKTAB( 1 ) = #levelRAM( #offset+1 )
            #BACKTAB( 2 ) = #levelRAM( #offset+2 )
            #BACKTAB( 3 ) = #levelRAM( #offset+3 )
            #BACKTAB( 4 ) = #levelRAM( #offset+4 )
            #BACKTAB( 5 ) = #levelRAM( #offset+5 )
            #BACKTAB( 6 ) = #levelRAM( #offset+6 )
            #BACKTAB( 7 ) = #levelRAM( #offset+7 )
            #BACKTAB( 8 ) = #levelRAM( #offset+8 )
            #BACKTAB( 9 ) = #levelRAM( #offset+9 )
            #BACKTAB( 10 ) = #levelRAM( #offset+10 )
            #BACKTAB( 11 ) = #levelRAM( #offset+11 )
            #BACKTAB( 12 ) = #levelRAM( #offset+12 )
            #BACKTAB( 13 ) = #levelRAM( #offset+13 )
            #BACKTAB( 14 ) = #levelRAM( #offset+14 )
            #BACKTAB( 15 ) = #levelRAM( #offset+15 )
            #BACKTAB( 16 ) = #levelRAM( #offset+16 )
            #BACKTAB( 17 ) = #levelRAM( #offset+17 )
            #BACKTAB( 18 ) = #levelRAM( #offset+18 )
            #BACKTAB( 19 ) = #levelRAM( #offset+19 )
        ELSEIF coarseScroll = 4 THEN
            #offset = ((levelWidth*(#topRow+11))+#rightColumn-19)
            WAIT
            #BACKTAB( 220 ) = #levelRAM( #offset )
            #BACKTAB( 221 ) = #levelRAM( #offset+1 )
            #BACKTAB( 222 ) = #levelRAM( #offset+2 )
            #BACKTAB( 223 ) = #levelRAM( #offset+3 )
            #BACKTAB( 224 ) = #levelRAM( #offset+4 )
            #BACKTAB( 225 ) = #levelRAM( #offset+5 )
            #BACKTAB( 226 ) = #levelRAM( #offset+6 )
            #BACKTAB( 227 ) = #levelRAM( #offset+7 )
            #BACKTAB( 228 ) = #levelRAM( #offset+8 )
            #BACKTAB( 229 ) = #levelRAM( #offset+9 )
            #BACKTAB( 230 ) = #levelRAM( #offset+10 )
            #BACKTAB( 231 ) = #levelRAM( #offset+11 )
            #BACKTAB( 232 ) = #levelRAM( #offset+12 )
            #BACKTAB( 233 ) = #levelRAM( #offset+13 )
            #BACKTAB( 234 ) = #levelRAM( #offset+14 )
            #BACKTAB( 235 ) = #levelRAM( #offset+15 )
            #BACKTAB( 236 ) = #levelRAM( #offset+16 )
            #BACKTAB( 237 ) = #levelRAM( #offset+17 )
            #BACKTAB( 238 ) = #levelRAM( #offset+18 )
            #BACKTAB( 239 ) = #levelRAM( #offset+19 )
        ELSE
            WAIT
        END IF
        coarseScroll = 0
    END
DZ-Jay, posted August 21:

No. The fastest way would be using ASM. 😄

In IntyBASIC, I suppose that is the fastest way if you are loading from RAM. If you were loading from ROM, you could use RESTORE and READ, which use an auto-increment register to advance the pointer. Unfortunately, IntyBASIC requires a label for RESTORE, so it can't be used on variables or arrays.

The way you have it still requires round trips to RAM to read, increment, and write the counter, then read again to assign the value to the BACKTAB. If you need something faster than what you have, you'll probably need to implement it in Assembly Language to remove the redundant reads and writes to memory.

-dZ.
DZ-Jay, posted August 21:

One other thing: you would be better off returning directly from every IF/ELSE block rather than letting execution run to the end, which costs an extra branch to the bottom of the procedure just to clear coarseScroll and return. So something like:

    IF coarseScroll = 1 THEN
        ' ...
        coarseScroll = 0
        RETURN
    ELSEIF coarseScroll = 2 THEN
        ' ...
        coarseScroll = 0
        RETURN
    ' ...

It's not much, but every little helps.

-dZ.
carlsson, posted August 21:

Isn't there any combination of SCREEN and VARPTR you could use here? Or maybe design a custom routine in inline assembly which replicates part of SCREEN but with a dynamic entry point.
Lillapojkenpåön, posted August 21 (edited):

> 56 minutes ago, carlsson said:
> Isn't there any combination of SCREEN and VARPTR you could use here? Or maybe design a custom routine in inline assembly which replicates part of SCREEN but with dynamic entry point.

Since SCREEN only supports an 8-bit offset, I tried this just to see if it updated BACKTAB faster, but it didn't seem to: I get the same top-row glitch when scrolling downwards. If I understand it correctly, I'm updating one too many animated GRAM cards, which delays the BACKTAB updates. I think DZ-Jay will be able to help me with some custom ASM when he has time; in the meantime I'll stare at the assembly until I have an aneurysm.

    IF coarseScroll = 1 THEN
        #offset = (levelWidth*#topRow)+(#rightColumn-19)
        #columnBuffer( 0 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 1 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 2 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 3 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 4 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 5 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 6 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 7 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 8 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 9 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 10 ) = #levelRAM( #offset )
        #offset = #offset + levelWidth
        #columnBuffer( 11 ) = #levelRAM( #offset )
        WAIT
        SCREEN #columnBuffer, 0, 0, 1, 12, 1
DZ-Jay, posted August 21:

> 1 hour ago, carlsson said:
> Isn't there any combination of SCREEN and VARPTR you could use here? Or maybe design a custom routine in inline assembly which replicates part of SCREEN but with dynamic entry point.

SCREEN is the fastest way to copy a large block of data into BACKTAB, but it has a considerable overhead in setting itself up for the copy. I would only recommend it for large blocks, like the whole screen or a large chunk of it.

I suppose you could profile a SCREEN copy for a single column against the above code to see if there is any improvement in performance.

Personally, I would opt for an Assembly Language routine (no surprise there, since that's my default mode of execution). It doesn't even have to be a dedicated special-purpose routine -- just something that copies source to target using registers.

The biggest cost in the above code is in the round trip incurred by reading, updating, writing, then reading again the variables that track the array indices. If you replace that with auto-increment registers and indirect-mode memory accesses, you gain a considerable speed boost.

I'd be willing to help with the assembly routine if you'd like.

-dZ.
DZ-Jay, posted August 21:

> 44 minutes ago, Lillapojkenpåön said:
> Since SCREEN only supports an 8-bit offset I tried this just to see if it updated backtab faster, but it didn't seem like it, same top row glitch when scrolling downwards. I'm updating one too many animated GRAM cards which delays the backtab updates if I understand it correctly,

Yes, SCREEN takes some effort to set itself up for the copy loop. Once it starts copying, it is as fast as it can be, but that setup is a killer.

I do not knock it for that -- it is really good for what it is intended to do: block copy of large chunks of the screen in the fastest way possible, while supporting arbitrary regions and dimensions. That flexibility requires it to compute its source and target pointers at runtime, and set them up in memory before starting.

> 44 minutes ago, Lillapojkenpåön said:
> but I think DZ-Jay will be able to help me with some custom ASM when he has time, in the meantime I'll stare at the assembly until I have an aneurysm

You betcha! Let me know what you need, and I'll try to get something out soon. :)

> 44 minutes ago, Lillapojkenpåön said:
> [code snipped: the twelve #columnBuffer assignments, followed by WAIT and SCREEN #columnBuffer, 0, 0, 1, 12, 1]

OMG! My eyes!!! You are making things worse! 😱

You are now doing the same copy as before, but to an intermediate buffer instead of directly to the BACKTAB -- and right after that, doing the block copy to the BACKTAB anyway! 😆

-dZ.
DZ-Jay, posted August 21:

By the way, if you have a finite number of levels, and your offsets do not depend dynamically on the current state of gameplay, you should be able to pre-compute the offsets in a table. If you do that, you save the cost of the offset computation -- multiplications by non-powers-of-two are costly.

-dZ.
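A minimal sketch of that table idea, assuming a level that is 40 cards wide and 24 rows tall (both numbers are invented for illustration, and `rowStart` is a hypothetical label that does not appear in the original code). The multiplication `levelWidth*#topRow` is replaced by a single ROM lookup:

```basic
' Hypothetical example: levelWidth = 40, level height = 24 rows.
' rowStart(n) holds 40 * n, worked out once when the table is written.
#offset = rowStart(#topRow) + #rightColumn - 19

' ... rest of the DRAW procedure as before ...

' Keep the table outside the flow of executing code.
rowStart:
    DATA 0, 40, 80, 120, 160, 200, 240, 280
    DATA 320, 360, 400, 440, 480, 520, 560, 600
    DATA 640, 680, 720, 760, 800, 840, 880, 920
```

If the level width could instead be a power of two, the multiplication is already cheap and a table like this would probably not be worth the ROM.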
nanochess, posted August 21:

Absolutely not the fastest way. This is way better. Only the target offset for the screen is limited to 0-239. The source offset is 16-bit and the stride width is 16-bit.

    DRAW: PROCEDURE
        IF coarseScroll = 1 THEN
            #offset = (levelWidth*#topRow)+(#rightColumn-19)
            WAIT
            SCREEN #levelRAM, #offset, 0, 1, 12, levelWidth
        ELSEIF coarseScroll = 2 THEN
            #offset = (levelWidth*#topRow)+#rightColumn
            WAIT
            SCREEN #levelRAM, #offset, 19, 1, 12, levelWidth
        ELSEIF coarseScroll = 3 THEN
            #offset = ((levelWidth*#topRow)+#rightColumn-19)
            WAIT
            SCREEN #levelRAM, #offset, 0, 20, 1
        ELSEIF coarseScroll = 4 THEN
            #offset = ((levelWidth*(#topRow+11))+#rightColumn-19)
            WAIT
            SCREEN #levelRAM, #offset, 220, 20, 1
        ELSE
            WAIT
        END IF
        coarseScroll = 0
    END
Lillapojkenpåön, posted August 22:

> 7 hours ago, nanochess said:
> Absolutely not the fastest way. This is way better. Only the target offset for the screen is 0-239. The source offset is 16-bit and the stride width is 16-bit.

Oh! Then I misunderstood the manual, that's awesome! THANKS!

> 14 hours ago, DZ-Jay said:
> OMG! My eyes!!! You are making things worse! 😱 You are now doing the same copy as before, but to an intermediate buffer instead of directly to the BACKTAB -- and right after that, doing the block-copy to the BACKTAB anyway! 😆 -dZ.

Stop bullying me! 😆 I knew that was bad, but I didn't think SCREEN supported a 16-bit source offset.
cmadruga, posted August 26:

How about: ON coarseScroll GOTO … instead of IF ELSEIF ELSEIF…
DZ-Jay, posted August 26:

> 7 minutes ago, cmadruga said:
> How about: ON coarseScroll GOTO … instead of IF ELSEIF ELSEIF…

Yes, that will improve it when you have more than two cases and they are all contiguous. It only needs to compute the jump table address once, as opposed to comparing the value for each ELSEIF.

If you can guarantee that every possible value has an entry in the target list, then you can make it even faster with "ON x FAST GOTO," which dispenses with bounds checking.

-dZ.
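A rough sketch of how the DRAW procedure might look with ON ... GOTO, reusing the SCREEN calls from nanochess's version. The label names are made up here, and two things are assumptions based on my reading of the IntyBASIC manual rather than something confirmed in this thread: that a value of 0 selects the first label, and that an out-of-range value simply continues with the next statement.

```basic
DRAW: PROCEDURE
    ' Assumption: 0 selects the first label, 1 the second, and so on.
    ON coarseScroll GOTO drawNone, drawLeft, drawRight, drawTop, drawBottom
    ' Assumption: out-of-range values fall through to drawNone.
drawNone:
    WAIT
    RETURN
drawLeft:
    #offset = (levelWidth*#topRow)+(#rightColumn-19)
    WAIT
    SCREEN #levelRAM, #offset, 0, 1, 12, levelWidth
    coarseScroll = 0
    RETURN
drawRight:
    #offset = (levelWidth*#topRow)+#rightColumn
    WAIT
    SCREEN #levelRAM, #offset, 19, 1, 12, levelWidth
    coarseScroll = 0
    RETURN
drawTop:
    #offset = ((levelWidth*#topRow)+#rightColumn-19)
    WAIT
    SCREEN #levelRAM, #offset, 0, 20, 1
    coarseScroll = 0
    RETURN
drawBottom:
    #offset = ((levelWidth*(#topRow+11))+#rightColumn-19)
    WAIT
    SCREEN #levelRAM, #offset, 220, 20, 1
    coarseScroll = 0
    RETURN
END
```

Each branch returns directly, which also follows the earlier advice about not branching to the bottom of the procedure. Switching to ON coarseScroll FAST GOTO would only be safe if coarseScroll can never leave the range 0 to 4.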
Lillapojkenpåön, posted August 27 (edited):

Thanks guys, I was wondering at what point switching to ON x FAST GOTO would be faster.

I have another question. I don't know if this is possible at all, but I'm trying to use a lerp function to return values with 8.8 fixed-point precision, like 0.5, instead of just integers.

First I flip the bytes of my fixed-point #playerX variable so the integer is in the MSB, then I flip the bytes of the 16-bit lerp result so I can continue using it like a fixed-point variable.

    DEF FN lerp(a, b, t) = a + (t * (b-a))

    #int = (#playerX AND $00FF) * 256
    #dec = (#playerX AND $FF00) / 256
    '#playerX flipped
    #temp16 = #dec + #int
    'lerp between 88.0 and #playerX with a t value of 0.5
    #scrollSpeed = lerp(0.88, #temp16, 5.0) - 0.88
    #int = (#scrollSpeed AND $00FF) * 256
    #dec = (#scrollSpeed AND $FF00) / 256
    'result flipped
    #scrollSpeed = #dec + #int

This gives crazy results. Am I doing something wrong, or is it just not possible to do it this way?
DZ-Jay, posted August 27:

I haven't tested it (I don't have access to my PC right now), but I think it is because you have a mess of formats.

I believe that fractional constants in IntyBASIC are automatically converted to Q8.8 format in reverse order (with the integer in the low byte). So it seems that your lerp function will use that format.

So, why are you flipping the #playerX and #scrollSpeed variables? You should maintain them in their Q8.8 format. You can then use "dot" arithmetic on them to keep them that way.

My recommendation is to pick one format. If you want to use fractional constants (0.88, 0.5), then make sure you keep the variables in reverse format using "dot" arithmetic (because IntyBASIC automatically converts fractional constants that way). If you want to use the normal format (integer in the MSB), then avoid fractional constants and convert your numbers yourself -- then keep the values in that format by avoiding "dot" arithmetic, etc.

In any case, the flipping is crazy when all you want is to operate arithmetically on the numbers.

dZ.
Lillapojkenpåön, posted August 27 (edited):

Because I couldn't imagine that *. works, and it seems like it doesn't. I'm terrible at math, but I think the multiplication would mess everything up, since it would treat the values as regular un-flipped 16-bit values? That's the only reason. Notice that I also flipped 88.0 and 0.5 to get un-flipped values, which I only just now realised is not how that works 🤦♂️ I'm going to try replacing them with the actual values I want there.

EDIT: Nope,

    #scrollSpeed = lerp($5800, #temp16, $0080) - $5800

still gives crazy scrolling.
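One likely reason the $5800 / $0080 attempt still misbehaves, offered as a hedged explanation rather than something confirmed in the thread: with the integer in the high byte, a value v is stored as 256·v, so multiplying two stored values yields a result scaled by 256 twice, which both overflows 16 bits and is never divided back down. A worked example, assuming a difference b − a of 4.0 (a value picked for illustration, not taken from the game):

```latex
\begin{align*}
\mathrm{raw}(v) &= 256\,v \\
\mathrm{raw}(t) &= 256 \times 0.5 = 128 = \texttt{\$0080} \\
\mathrm{raw}(b-a) &= 256 \times 4.0 = 1024 = \texttt{\$0400} \\
\mathrm{raw}(t)\cdot\mathrm{raw}(b-a) &= 131072 = 2^{17} \equiv 0 \pmod{2^{16}} \\
\frac{\mathrm{raw}(t)\cdot\mathrm{raw}(b-a)}{256} &= 512 = \texttt{\$0200} = \mathrm{raw}(2.0)
\end{align*}
```

So the 16-bit product wraps to 0 here, while the intended result is $0200. For the specific case t = 0.5, halving the difference avoids the multiplication entirely.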
DZ-Jay, posted August 27:

> 4 hours ago, Lillapojkenpåön said:
> Because I couldn't imagine that *. works, and it seems like it doesn't, I'm terrible at math but I think the multiplication would mess everything up since it would treat the values as regular un-flipped 16-bit values? That's the only reason, notice that I also flipped 88.0 and 0.5 to get unflipped values, which I exactly just now realised is not how that works 🤦♂️ I'm gonna try replacing it with the actual values I want there.. EDIT: Nope #scrollSpeed = lerp($5800, #temp16, $0080) - $5800 still crazy scrolling

What do you mean that the "." does not work? Is there no support for dot-multiplication? (I am out of town right now, so I can't check the manual.)

I thought it did addition, subtraction, and multiplication, but I could be wrong.

dZ.
Lillapojkenpåön, posted August 27 (edited):

> 1 hour ago, DZ-Jay said:
> What do you mean that the "." does not work? Is there no support for dot-multiplication? (I am out of town right now, so I can't check the manual.) I thought it did addition subtraction and multiplication, but I could be wrong. dZ.

Nope, you can only do simple multiplication.

Another unrelated thing I noticed you can't do with 8.8 values is comparisons like this:

    CONST ACC = 0.25
    CONST MAX_SPEED = 2.0

    IF #playerVelocityX +. ACC > MAX_SPEED THEN..

This was my workaround. I would love to know if it can be done in a better way.

    CONST ACC = 0.25
    CONST MAX_SPEED = 2.0

    DEF FN frac(val) = ((val) AND $FF00)
    DEF FN int(val) = ((val) AND $00FF)

    #temp16 = #playerVelocityX +. ACC
    IF int(#temp16) = int(MAX_SPEED) THEN
        IF frac(#temp16) > frac(MAX_SPEED) THEN..
DZ-Jay, posted August 27:

> 7 hours ago, Lillapojkenpåön said:
> Nope, you can only do simple multiplication

I see.

> 7 hours ago, Lillapojkenpåön said:
> Another unrelated thing I noticed you can't do with 8.8 are comparisons like this..
> [code snipped: IF #playerVelocityX +. ACC > MAX_SPEED THEN..]

Well, true: since the fraction is in the MSB, a larger fractional part will indicate the "larger" number.

> 7 hours ago, Lillapojkenpåön said:
> This was my workaround, I would love to know if it can be done in a better way?
> [code snipped: the frac() / int() workaround]

Maybe you should not use "+." at all and format your Q8.8 fractions with the integer in the upper part -- like normal people do. Then, when you need the integer portion, just divide by 256 and Bob's your uncle. It seems that would alleviate many of the problems you are encountering.

The original point of flipping the integer to the lower byte was as an optimization when updating sprite velocities linearly: you can then just mask and copy the value directly into the lower byte of a sprite register without having to swap it first.

However, you have to go to a lot of trouble to compensate for it, it limits the operations you can do on the values, and it adds code to every arithmetic operation to account for the carry bit. It seems that it is much more trouble than it is worth.

The benefits of formatting your fractions with the integer in the MSB are that all fixed-point arithmetic operations work normally, the sign is propagated correctly, and logical comparisons work as expected.

The drawback is that extracting the integer portion requires shifting it down 8 bits. The good news is that this is simply a SWAP plus AND operation, and is only needed in the exceptional case.

-dZ.
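A small sketch of the integer-in-the-high-byte approach described above. All names (`ACC_FX`, `MAX_SPEED_FX`, `#velX`, `#posX`, `spriteX`) are hypothetical, and the constants are converted by hand as value × 256 instead of using IntyBASIC's fractional constants:

```basic
' Q8.8 with the integer in the HIGH byte; constants converted by hand.
CONST ACC_FX       = $0040   ' 0.25 * 256
CONST MAX_SPEED_FX = $0200   ' 2.0  * 256

' Plain 16-bit addition and comparison, no "+." involved.
#velX = #velX + ACC_FX
IF #velX > MAX_SPEED_FX THEN #velX = MAX_SPEED_FX

#posX = #posX + #velX

' Integer part is only extracted when talking to the hardware.
spriteX = #posX / 256
```

The comparison against MAX_SPEED_FX works on the whole 16-bit value because the integer part now carries the most weight. This sketch only covers small positive velocities; negative values and large positions may need extra care with how comparisons treat the sign of 16-bit variables.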
DZ-Jay, posted August 28:

> 23 hours ago, Lillapojkenpåön said:
> Nope, you can only do simple multiplication Another unrelated thing I noticed you can't do with 8.8 are comparisons like this.. This was my workaround, I would love to know if it can be done in a better way?
> [code snipped: the comparison and the frac() / int() workaround]

One quick hack to try is to swap the high and low bytes before comparisons. You can do this in ASM:

    ASM MVI var_&TEMP16, R0
    ASM SWAP R0
    ASM MVO R0, var_&TEMP16_FIX

    IF (#Temp16_Fix = $200) Then ...

That swaps the bytes in #Temp16 and saves the result to #Temp16_Fix. You can then do comparisons normally with other values formatted with the integer in the upper byte.

That said, I still recommend avoiding the built-in fraction support and formatting the values yourself with the integer in the higher byte.

dZ.
Lillapojkenpåön, posted August 29:

> 14 hours ago, DZ-Jay said:
> That said, I still recommend avoiding the built-in fraction support and formatting the values yourself with the integer in the higher byte. dZ.

I used to do that, but I switched because personally I prefer this way. I've done most of the game logic and there are only these two places where I could use a swap (my lerp experiment doesn't count). With the ASM you provided it's really clean now, thanks!

    CONST ACC = 0.25
    CONST ACC_SWAPPED = $0040
    CONST MAX_SPEED = 2.0
    CONST MAX_SPEED_SWAPPED = $0200

    accelerate: PROCEDURE
        ASM MVI var_&PLAYERVELOCITYX, R0
        ASM SWAP R0
        ASM MVO R0, var_&TEMP16
        IF #temp16 + ACC_SWAPPED < MAX_SPEED_SWAPPED THEN
            #playerVelocityX = #playerVelocityX +. ACC
            RETURN
        ELSE
            #playerVelocityX = MAX_SPEED
        END IF
    END

    decelerate: PROCEDURE
        ASM MVI var_&PLAYERVELOCITYX, R0
        ASM SWAP R0
        ASM MVO R0, var_&TEMP16
        IF #temp16 > ACC_SWAPPED THEN
            #playerVelocityX = #playerVelocityX -. ACC
            RETURN
        ELSE
            #playerVelocityX = 0.0
        END IF
    END
DZ-Jay, posted August 29:

> 8 hours ago, Lillapojkenpåön said:
> I used to do that but I switched it because personally I prefer this way,

That's fine. Just know that it is more costly in almost every way.

dZ.
Lillapojkenpåön, posted August 29:

> 1 hour ago, DZ-Jay said:
> That's fine. Just know that it is more costly in almost every way. dZ.

It is?? 😮 What if I only have around six +./-. operations in my loop but A LOT of integer extractions? Is it still more costly?
DZ-Jay, posted August 29:

> 3 hours ago, Lillapojkenpåön said:
> It is?? 😮 What if I only have like six +./-. operations in my loop but ALOT of extracting the integer, still more costly?

The compiler needs to account for the Carry when adding and subtracting. This is done with a cheap instruction, but it is required on every arithmetic operation.

Why would you be extracting the integer so much? All logical operations should be done on the fixed-point value, and the integer is only needed when converting to physical space, i.e., to sprite or screen coordinates.

Obviously, I may be missing something, but it doesn't seem you are saving that much with "+./-.", especially if you need to compensate by swapping to compare and multiply.

dZ.
Lillapojkenpåön, posted August 29:

> 4 hours ago, DZ-Jay said:
> The compiler needs to account for the Carry when adding and subtracting. This is done with a cheap instruction, but it is required on every arithmetic operation. Why would you be extracting the integer so much? All logical operations should be done on the fixed point value and the integer is only needed when converting to physical space, i.e., to sprite or screen coordinates.

Yup, obviously for sprite position, and for screen coordinates and overlap:

    DEF FN top = ((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP) / TILE_HEIGHT)
    DEF FN bottom = ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) / TILE_HEIGHT)
    DEF FN above = (((((#playerY AND 255) - PF_DIFF) + HITBOX_TOP) - 1) / TILE_HEIGHT)
    DEF FN bellow = (((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1) / TILE_HEIGHT)
    DEF FN left = (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT) / TILE_WIDTH)
    DEF FN right = (((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) / TILE_WIDTH)
    DEF FN besideLeft = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT) - 1) / TILE_WIDTH)
    DEF FN besideRight = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_RIGHT) + 1) / TILE_WIDTH)
    DEF FN middleX = ((((((#playerX AND 255) - PF_DIFF) - fineScrollX) + HITBOX_LEFT) + (HITBOX_RIGHT-HITBOX_LEFT)/2) / TILE_WIDTH)
    DEF FN middleY = ((((((#playerY AND 255) - PF_DIFF) - fineScrollY) + HITBOX_TOP) + (HITBOX_BOTTOM-HITBOX_TOP)/2) / TILE_HEIGHT)
    DEF FN overlapRight = (1 + (((#playerX AND 255) + HITBOX_RIGHT) - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapLeft = (TILE_WIDTH - (((#playerX AND 255) + HITBOX_LEFT) - fineScrollX) AND TILE_WIDTH_MASK)
    DEF FN overlapDown = (1 + (((#playerY AND 255) + HITBOX_BOTTOM) - fineScrollY) AND TILE_HEIGHT_MASK)
    DEF FN overlapUp = (TILE_HEIGHT - (((#playerY AND 255) + HITBOX_TOP) - fineScrollY) AND TILE_HEIGHT_MASK)

And I don't need to multiply any of the constants by 256 to get them into the high byte. I also use the integer for animation frame offsets, for deciding when to end or restart an animation, and for some other things:

    IF (#swordFrame AND 255) < 3 THEN..
    IF (#playerFrame AND 255) < playerStateFrames(playerState) THEN..
    IF (#explosionFrame AND 255) < 6 THEN..

    VARPTR playerAnimationIndex(playerState) + ((#playerFrame AND 255) * 4)
    VARPTR swordGFX((#swordFrame AND 255) * 4)
    VARPTR explosionGFX((#explosionFrame AND 255) * 4)

    'where to place the sword
    IF #playerDirection = FLIPX THEN
        temp1 = (#playerX AND 255) - 6
    ELSE
        temp1 = (#playerX AND 255) + 6
    END IF

    'needed for semi-solids to work correctly
    previousBellow = ((((#playerY AND 255) - PF_DIFF) + HITBOX_BOTTOM) + 1)

    IF (#playerVelocityY AND 255) > 128 THEN
        GOSUB moveUp
    ELSE
        GOSUB moveDown
    END IF

    IF (#playerX AND 255) > 88 THEN
        temp1 = (#playerX AND 255) - 88
    ELSEIF (#playerX AND 255) < 88 THEN
        temp1 = 88 - (#playerX AND 255)
    END IF

    IF (#playerX AND 255) <> 88 THEN..

A couple of these can be pre-calculated and are not needed, or could just be compared to a 16-bit value instead. Others I check twice, in both update() and draw(), to keep things simple, so that adds up to quite a lot. I think it pays off in my case, since I only have seven +. and one -. in my entire code, and a lot of them won't happen on the same frame, just as only one of my two ASM byte swaps can happen in any one frame.
DZ-Jay, posted August 30:

> 3 hours ago, Lillapojkenpåön said:
> Yup, obviously sprite position, and screen coordinates and overlap
> [code snipped: the DEF FN top / bottom / ... / overlapUp definitions]

Those constants can easily be formatted in Q8.8 format, with the integer portion in the upper byte. As a matter of fact, it would be more accurate.

> 3 hours ago, Lillapojkenpåön said:
> And I don't need to multiply any of them by 256 to get them to the high byte, also I get the integer for animation frame offsets, and when to end or restart the animation, and some other things

If you count down animation timers, you only need to compare against zero. You could also separate the frame counter (which is an integer) from the animation timer (which is a fractional value).

> 3 hours ago, Lillapojkenpåön said:
> [code snipped: the frame comparisons, VARPTR lookups, and #playerX / #playerY checks]
> A couple can be pre-calculated and are not needed, or could just be compared to a 16-bit value instead, but others I check two times, both in update() and draw(), to keep things simple, so that's quite alot, I think it pays off in my case since I only have seven +. and one -. in my entire code, and alot of them won't happen on the same frame, just like only one of my two asm byte swaps can happen in one frame.

I think that's mostly overdone, but it's your code, so you do you. :)

-dZ.