DZ-Jay Posted July 11, 2022
30 minutes ago, artrag said: Spawning and despawning for enemies is in place TimePilotTest.cfg TimePilotTest.bin
Does that mean bullet collisions? (I’ll check it later, when I get home.)
dZ.
artrag Posted July 11, 2022 Author
No, it simply means that enemies that go off screen release their resources, while new enemies are randomly generated according to the fighter's direction if there are free resources.
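A minimal C sketch of the slot-recycling scheme artrag describes here, not his IntyBASIC code. The names `Enemy`, `despawn_offscreen`, and `spawn_enemy` are hypothetical, the limit of 7 slots is an assumption taken from the sprite budget mentioned later in the thread, and the random choice of spawn position is left to the caller.

```c
#include <stdbool.h>

#define MAX_ENEMIES 7  /* assumption: one slot per free sprite */

typedef struct {
    int x, y;
    bool alive;
} Enemy;

/* Release the slot of every enemy that has left the w x h visible area. */
void despawn_offscreen(Enemy e[], int w, int h)
{
    for (int i = 0; i < MAX_ENEMIES; i++) {
        if (e[i].alive &&
            (e[i].x < 0 || e[i].x >= w || e[i].y < 0 || e[i].y >= h))
            e[i].alive = false;
    }
}

/* Place a new enemy in the first free slot.
   Returns the slot index, or -1 when no resources are free. */
int spawn_enemy(Enemy e[], int x, int y)
{
    for (int i = 0; i < MAX_ENEMIES; i++) {
        if (!e[i].alive) {
            e[i].x = x;
            e[i].y = y;
            e[i].alive = true;
            return i;
        }
    }
    return -1;
}
```

Calling `despawn_offscreen` once per frame before `spawn_enemy` means a slot freed this frame can be reused immediately.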
DZ-Jay Posted July 11, 2022
44 minutes ago, artrag said: No, simply enemies that go out of the screen release their resources, while new enemies are randomly generated according to the fighter direction if there are free resources
Ah. Is that how it works in the original? (It’s been a while since I played it.) I thought there was persistence outside the view, so that when the player flew back to them, they would still be in position.
dZ.
artrag Posted July 11, 2022 Author
Just for fun, a draft implementation of enemy bullets. This is only to give an idea of what the result can be: the real enemy bullets will be affected by the scrolling direction, will be slower than those from the fighter, and maybe yellow instead of white. Enemies should fire bullets only when actually aiming at the fighter; for now it is just random mayhem.
TimePilotTest.cfg TimePilotTest.bin
DZ-Jay Posted July 11, 2022
56 minutes ago, artrag said: Just for fun, a draft implementation for enemy bullets. This is only for having the idea of what the result can be, the real enemy bullets will be affected by the scrolling direction, will be slower than those from the fighter and maybe yellow instead of white. Enemies have to cast bullets only when actually aiming to the fighter, now it is just a random mayhem TimePilotTest.cfg TimePilotTest.bin
Looks neat! Although the random nature of the enemy bullets makes it look too chaotic at the moment.
Anyway, here's a thought ... I was just playing the original Time Pilot in MAME earlier today and I wanted to ask you ... would you want to try to replicate the parallax scrolling in your framework? I think if you manage to do that, it would look soooo awesome. :)
-dZ.
artrag Posted July 12, 2022 Author
Parallax would make the ROM space for clouds explode. I would avoid it.
Edited July 12, 2022 by artrag
DZ-Jay Posted July 12, 2022
5 hours ago, artrag said: Parallax would make the rom space for clouds explode. I would avoid it
I understand. Maybe you could explore bank-switching? No worries, though. It will still look great without it. But maybe when everything is done, you could look into it, if you feel like it.
dZ.
artrag Posted July 12, 2022 Author
Slower yellow bullets for enemies. Enemy bullets are now affected by the relative movement of the fighter.
TimePilotTest.bin TimePilotTest.bin
artrag Posted July 12, 2022 Author
4 hours ago, DZ-Jay said: I understand. Maybe you could explore bank-switching? No worries, though. It will still look great without it. But maybe when everything is done, you could look into it, if you feel like it. dZ.
From my calculations I would have to store about 49512 tiles, not counting the backtab layouts. Thus, no, I am not going to add parallax with the current bitmap implementation.
Using sprites for the smaller clouds would be possible, but it seems quite a waste of resources, given that we have only 7 free sprites for enemies and bombs (and, in later stages, missiles).
DZ-Jay Posted July 12, 2022
1 hour ago, artrag said: From my calculations I should store about 49512 tiles, to not count the backtab layouts. Thus, no, I am not going to add parallax with the current bitmap implementation.
Yikes! Not really worth it, of course.
1 hour ago, artrag said: Using sprites for the smaller clouds would be possible but it seems quite a waste of resources, provided that we have only 7 free sprites for enemies and bombs (and in later stages missiles)
No, I wouldn't even consider that.
dZ.
artrag Posted July 12, 2022 Author
Early attempt at collisions: when the player is hit by enemy bullets, it briefly flashes red. All the other collisions (enemy vs player, player's bullets vs enemies) are WIP.
TimePilotTest.cfg TimePilotTest.bin
artrag Posted July 13, 2022 Author
Added collisions of player vs enemies and player vs enemy bullets. Enemies vs my bullets is still missing, but spare CPU time is now very scarce. WIP
TimePilotTest.cfg TimePilotTest.bin
artrag Posted July 13, 2022 Author
@DZ-Jay I'm implementing in ASM the collisions between bullets and enemies. Would this function work?

    ASM TESTBOX: PROC       ; TESTBOX(x1,y1,x2,y2)
    ASM                     ; R0 = x1,R1 = y1
    ASM                     ; R2 = x2,R3 = y2
    ASM
    ASM ; return R0 = ((x1>x2) and (x1<x2+8) and (y1>y2) and (y1<y2+8))
    ASM
    ASM     CMPR R2,R0      ;x1>x2
    ASM     BLT __ELSE
    ASM
    ASM     CMPR R3,R1      ;y1>y2
    ASM     BLT __ELSE
    ASM
    ASM     ADDI #8,R2
    ASM     CMPR R2,R0      ;x1<x2+8
    ASM     BGT __ELSE
    ASM
    ASM     ADDI #8,R3
    ASM     CMPR R3,R1      ;y1<y2+8
    ASM     BGT __ELSE
    ASM
    ASM     MVII #1,R0
    ASM     JR R5
    ASM
    ASM __ELSE
    ASM
    ASM     CLRR R0
    ASM     JR R5
DZ-Jay Posted July 13, 2022
Hi, @artrag,
It looks good, except for two things: if your comparison is non-inclusive (< instead of <=), then your “else” branch needs to account for the values being equal, so use “BLE” (Branch on less than or equal) instead of “BLT”. The same goes for “BGE” (Branch on greater than or equal) on the other comparison.
The second issue is the missing “ENDP” at the end of the procedure, but I assume this was just a copy+paste error when inserting it here.
All that being said, it occurs to me that if you are comparing two values, first against each other, and then against each other with one term plus 8, then perhaps there is an opportunity to optimize this by actually subtracting the values and comparing the result against 8.
CMPR performs a subtraction but discards the result, so you could subtract the values instead with SUBR, which stores the result in the second argument, then do the same thing as before (BLE etc.) to jump in case of ELSE, then on fall-through use CMPI with #8 and BGE to see if the difference is out of range.
Does this make sense? I can provide some code when I get home.
dZ.
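The subtract-then-compare idea above can be sketched in C; this is an illustration, not the CP1610 code dZ. offers to post. The function name `in_box_8` is hypothetical, coordinates are assumed to be plain integers, and the bounds are the strict ones from the TESTBOX comment (x2 < x1 < x2+8, y2 < y1 < y2+8).

```c
#include <stdbool.h>

/* Hypothetical helper: true when (x1,y1) lies strictly inside the 8x8
   box anchored at (x2,y2). Each axis needs one subtraction; the
   difference is then checked against 0 and against 8. */
bool in_box_8(int x1, int y1, int x2, int y2)
{
    int dx = x1 - x2;               /* SUBR keeps this result; CMPR would discard it */
    if (dx <= 0 || dx >= 8)         /* BLE-style exit, then the compare against 8 */
        return false;

    int dy = y1 - y2;
    if (dy <= 0 || dy >= 8)
        return false;

    return true;
}
```

Note that the equal cases (a difference of exactly 0 or exactly 8) are rejected, which is the point about BLE/BGE made above.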
artrag Posted July 13, 2022 Author
In the end I optimised the coordinate test into the loop that evaluates each enemy against the coordinates of the I-th bullet. This is the result; do you see any margin for improvement?
array_ES is an array of flags 0/1 telling if the enemy is alive.
array_&EX and array_&EY are the enemy coordinates in fixed point 8:8 (we have 7 enemies).
array_BULS is an array of flags 0/1 telling if the bullet is alive.
array_&BX and array_&BY are the arrays of x,y coordinates for bullets.
We are evaluating the I-th bullet in var_I.
Note that coordinates for bullets are in 0-159, while those for enemies are in 0-167, accounting for the 8 pixels of offset for sprites.

    TESTMYBUL: PROC
        BEGIN
        MVII #array_&BX,R4
        ADD var_I,R4
        MVI@ R4,R1                      ;R1 = x1-8
        MOVR R1,R2
        ADDI #8*256,R2                  ;R2 = x1
        ADDI #array_&BY-array_&BX-1,R4
        MVI@ R4,R3                      ;R3 = y1-8
        MOVR R3,R4
        ADDI #8*256,R4                  ;R4 = y1
    indx QSET 0
        REPEAT 7
        MVI array_ES+indx,R0            ; test es(i)
        TSTR R0
        BEQ _Next[indx]
        CMP array_&EX+indx,R2           ;x1>=x2(i)
        BLT _Next[indx]
        CMP array_&EY+indx,R4           ;y1>=y2(i)
        BLT _Next[indx]
        CMP array_&EX+indx,R1           ;x1<x2(i)+8
        BGT _Next[indx]
        CMP array_&EY+indx,R3           ;y1<y2(i)+8
        BGT _Next[indx]
        ; COLLISION FOUND
        ;[278] sound 2,330,48
        MVII #330,R0
        MVO R0,498
        SWAP R0
        MVO R0,502
        MVII #48,R0
        MVO R0,509
        ;[279] sound 3,500,9
        MVII #500,R0
        MVO R0,499
        SWAP R0
        MVO R0,503
        MVII #9,R0
        MVO R0,506
        ; remove the enemy
        CLRR R0
        MVO R0,array_ES+indx
        ; remove the bullet
        MVII #array_BULS,R1
        ADD var_I,R1
        MVO@ R0,R1
        MVII #76,R0
        MVO R0,var_BLX
        MVII #44,R0
        MVO R0,var_BLY
        RETURN
    _Next[indx]:
    indx QSET indx + 1
        ENDR
        RETURN
        ENDP

The current code works fine in PAL, but it exceeds the frame time in NTSC.
TimePilotTest.cfg TimePilotTest.bin
Edited July 13, 2022 by artrag
DZ-Jay Posted July 14, 2022
2 hours ago, artrag said: In the end I optimised the test on coordinates into the loop that is evaluating each enemy against the coordinates of the I-th bullet. This is the result, do you see any margin to improve ? [...]
I'll take a more thorough look later, but off the top of my head, would it be possible to store the flags not in an array but in a bit vector? That is, a 16-bit variable with one bit as the flag for each enemy.
That way, you could just shift right on each iteration and test the carry status flag. That's what I do in my own programs for the same functionality. Shifting into carry costs only 6 cycles per iteration, as opposed to 16 for the "MVI + TSTR" combo.
The rest I'll analyze later tonight.
dZ.
Edited July 14, 2022 by DZ-Jay
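A small C sketch of the bit-vector idea, under the assumption that bit i of a 16-bit word is the "alive" flag of enemy i. The function name `active_indices` is hypothetical; the low bit tested before each shift plays the role of the carry flag after a shift right.

```c
#include <stdint.h>

/* Walk the flag word one bit per iteration and record which enemies
   are active. Returns how many indices were written to out[]. */
int active_indices(uint16_t flags, int n_enemies, int out[])
{
    int count = 0;
    for (int i = 0; i < n_enemies; i++) {
        unsigned carry = flags & 1u;  /* the bit a shift right moves into carry */
        flags >>= 1;
        if (!carry)
            continue;                 /* inactive enemy: skip all coordinate tests */
        out[count++] = i;
    }
    return count;
}
```

The flag word is loaded once before the loop, so an inactive enemy costs only the shift and the branch.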
artrag Posted July 14, 2022 Author
The plan was to use the status byte to also encode the kind of object (explosions, bonuses, score labels, missiles, bombs, bosses...). Moreover, elsewhere I have to manage that status byte from IntyBASIC, which is quite limited when it comes to bit shifting/testing.
I will first try to move other code segments to ASM to gain cycles; if that is not enough, I will return to your proposal.
Probably the best way would be to add an extra byte of flags devoted only to bullet testing and keep the status byte.
Edited July 14, 2022 by artrag
DZ-Jay Posted July 14, 2022
2 hours ago, artrag said: The plan was to use the status byte to code also the kind of object (explosions, bonuses, score labels, missiles, bombs, bosses...). Moreover elsewhere I have to manage that status byte from Intybasic and this latter is quite limited when we go to bit shifting/testing. I will try to move to asm other code segments to try to gain cycles first, in case I will return to your proposal. Probably the best way could be to add an extra byte of flags only devoted to bullet testing and keep the status byte
Sounds good.
DZ-Jay Posted July 14, 2022
Still looking into this but here are a few more quick thoughts:
Your comparisons are signed (BGT/BLT), but you're using unsigned 16-bit values for Q8.8 fixed point!!!
You can save about 13 cycles by avoiding the "BEGIN/RETURN" pattern (20 cycles). It seems you do not need R5, so just don't touch that register and jump out at the end with "JR R5" (7 cycles).
The biggest cost in your procedure is the repeated comparisons, so we should look into optimizing those. (The "Collision Found" section only runs once, after a collision is found.)
One quick way would be to store at least one of the coordinate addresses in a register to use indirect memory access on each iteration. You'll probably need a non-increment register (R0..R3), so we'll have to re-shuffle the registers. We'll probably need to conscript R5 into the mix, in which case ignore point #1.
I wonder if we could reduce the 4 comparisons to two by leveraging the fact that for axis-aligned rectangles, when a collision occurs, the distance between their centers along each axis is less than half the sum of their sizes on that axis:

    bool DoBoxesIntersect(Box a, Box b) {
      return (abs((a.x + a.width/2) - (b.x + b.width/2)) * 2 < (a.width + b.width)) &&
             (abs((a.y + a.height/2) - (b.y + b.height/2)) * 2 < (a.height + b.height));
    }

If the bounding boxes are constants across all objects, then their width and height (and their respective halves) can be pre-computed. The multiplication by two is merely a shift left, and the ABS() just means unsigned comparisons. Reducing it to two comparisons should free some registers to avoid direct memory accesses. (Still, I'm not sure if this will be cheaper right now, but I will try later today and provide a cycle-count profile for both approaches to decide.)
Apart from total cycle counting, we need to account for the most common execution paths. In other words, it's not enough to reduce the total cost of the instructions in the procedure -- we need to make sure that the most common cases are as cheap as possible. That means, for instance, avoiding unnecessary calculations if a bullet or enemy is likely to be inactive, etc.
I'll try to get some code out later today.
Cheers!
-dZ.
Edited July 14, 2022 by DZ-Jay
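The signed-versus-unsigned point can be illustrated with a short C sketch. The helper names are hypothetical, and the cast to `int16_t` assumes the usual two's-complement wraparound, which is how the same 16 bits look to a signed branch such as BLT or BGT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Q8.8 fixed point: high byte is the integer pixel, low byte the fraction. */
uint16_t to_q88(unsigned pixel)
{
    return (uint16_t)(pixel << 8);
}

/* The comparison a signed branch makes: any coordinate of 128.0 or more
   has its top bit set and is read as a negative number. */
bool less_signed(uint16_t a, uint16_t b)
{
    return (int16_t)a < (int16_t)b;
}

/* The comparison an unsigned (carry-based) branch makes. */
bool less_unsigned(uint16_t a, uint16_t b)
{
    return a < b;
}
```

For example, with x = 100 and x = 150 the unsigned test reports 100 < 150, while the signed test does not, because 150.0 in Q8.8 is $9600 and reads as negative.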
DZ-Jay Posted July 14, 2022
@artrag, I just realized something: If you can treat your bullets as pixels, rather than as boxes, then your collision detection becomes simply the intersection of a point on an axis-aligned line, once per axis. I use this in my current project:

    if (( unsigned(bullet.x - hitbox.x) < hitbox.width) &&
        ( unsigned(bullet.y - hitbox.y) < hitbox.height))

This is actually quite efficient with unsigned comparisons: when the values are out of range, they "look" negative, which (treated as unsigned) will make them bigger than the range, and thus fail the comparison.
If you pre-load all values into registers (and avoid direct memory accesses), then all you need is a "SUBR, CMPR, BC" combo once for each axis, and move on to the next object.
I'll try that one too later on and profile the various approaches.
-dZ.
Edited July 14, 2022 by DZ-Jay
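The unsigned range trick above can be written as a self-contained C function. This is a sketch: `point_in_box` and its parameter names are hypothetical, and 16-bit unsigned coordinates are assumed so that the subtraction wraps the same way it would in a 16-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when point (px,py) lies inside the box with corner (bx,by),
   width w and height h. One subtraction and one unsigned comparison
   per axis cover both the lower and the upper bound. */
bool point_in_box(uint16_t px, uint16_t py,
                  uint16_t bx, uint16_t by,
                  uint16_t w, uint16_t h)
{
    uint16_t dx = (uint16_t)(px - bx);  /* wraps to a huge value when px < bx */
    uint16_t dy = (uint16_t)(py - by);  /* wraps to a huge value when py < by */
    return dx < w && dy < h;
}
```

A point one pixel to the left of the box gives dx = 65535, which fails `dx < w` just as a point to the right of the box does.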
artrag Posted July 14, 2022 Author
I've started to port to ASM some of the code that moves the bullets and tests for on-screen/off-screen conditions (and replaced the BLT with BC).
Now my ASM segment does almost all the processing for active enemies, but the CPU time still exceeds the frame time on NTSC machines.
Any suggestion is welcome. I will also move the enemy bullets to ASM, in the hope of gaining some more cycles.

    MVTSTBUL: PROC
        ; move bullet i-th
        MVII #array_BULDIR,R3
        ADD var_I,R3
        MVII #label_&SIN_TABLE,R1
        ADD@ R3,R1
        MVI@ R1,R2                      ; R2 = sin()
        ADDR R2,R2                      ; R2 = 2*sin()
        ADDI #(label_&COS_TABLE-label_&SIN_TABLE) AND $FFFF,R1
        MVI@ R1,R0                      ; R0 = cos()
        ADDR R0,R0                      ; R0 = 2*cos()
        ADDI #(array_&BX-array_BULDIR) AND $FFFF,R3
        ADD@ R3,R0
        MVO@ R0,R3                      ; R0 = bx(i) = bx(i) +2*cos()
        ; test bullet i-th vs x offscreen
        CMPI #(151*256),R0
        BC OffScreen
        ; here bx<151
        MOVR R0,R1                      ; save #bx in R1 for later
        SWAP R0
        MVO R0,var_BLX
        ADDI #(array_&BY-array_&BX) AND $FFFF,R3
        ADD@ R3,R2
        MVO@ R2,R3                      ; by(i) = by(i) +2*sin()
        ; test bullet i-th vs y offscreen
        CMPI #(79*256),R2
        BC OffScreen
        ; here by<79
        MOVR R2,R3                      ; save #by in R3 for later
        SWAP R2
        MVO R2,var_BLY
    OnScreen:
        ; test bullet i-th vs enemies
        ;R1 = x1-8
        MOVR R1,R2
        ADDI #8*256,R2                  ;R2 = x1
        ;R3 = y1-8
        MOVR R3,R4
        ADDI #8*256,R4                  ;R4 = y1
        MVI var_TESTFLAGS,R0
    indx QSET 0
        REPEAT 7
        RRC R0,1                        ; same as test es(i)
        BNC _Next[indx]
        CMP array_&EX+indx,R2           ;x1>=x2(i)
        BNC _Next[indx]
        CMP array_&EY+indx,R4           ;y1>=y2(i)
        BNC _Next[indx]
        CMP array_&EX+indx,R1           ;x1<x2(i)+8
        BC _Next[indx]
        CMP array_&EY+indx,R3           ;y1<y2(i)+8
        BC _Next[indx]
        ; COLLISION FOUND
        ; remove the enemy
        CLRR R0
        MVO R0,array_ES+indx
        B ManageCollision
    _Next[indx]:
    indx QSET indx + 1
        ENDR
        JR R5
    ManageCollision:
        ;[278] sound 2,330,48
        MVII #330,R0
        MVO R0,498
        SWAP R0
        MVO R0,502
        MVII #48,R0
        MVO R0,509
        ;[279] sound 3,500,9
        MVII #500,R0
        MVO R0,499
        SWAP R0
        MVO R0,503
        MVII #9,R0
        MVO R0,506
        ; remove the bullet i-th
    OffScreen:
        CLRR R0
        MVII #array_BULS,R1
        ADD var_I,R1
        MVO@ R0,R1
        MVII #76,R0
        MVO R0,var_BLX
        MVII #44,R0
        MVO R0,var_BLY
        JR R5
        ENDP

Edited July 14, 2022 by artrag
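The move-and-clip part of the routine above can be paraphrased in C. This is a sketch under assumptions: the `Bullet` struct and `move_bullet` name are hypothetical, the per-frame steps `dx` and `dy` stand in for the doubled table values (2*cos, 2*sin), and the limits 151 and 79 are copied from the two CMPI instructions.

```c
#include <stdbool.h>
#include <stdint.h>

#define X_LIMIT (151u * 256u)   /* Q8.8 limit from CMPI #(151*256) */
#define Y_LIMIT (79u * 256u)    /* Q8.8 limit from CMPI #(79*256)  */

typedef struct {
    uint16_t x, y;   /* Q8.8 position */
    bool alive;
} Bullet;            /* hypothetical layout, for illustration only */

/* Advance the bullet by (dx,dy) in Q8.8 and kill it when it leaves the
   playfield. Returns true while the bullet is still on screen. */
bool move_bullet(Bullet *b, int16_t dx, int16_t dy)
{
    b->x = (uint16_t)(b->x + dx);
    /* One unsigned test covers both edges: moving past 0 wraps to a
       large value, which is also >= the limit (the BC branch). */
    if (b->x >= X_LIMIT) { b->alive = false; return false; }

    b->y = (uint16_t)(b->y + dy);
    if (b->y >= Y_LIMIT) { b->alive = false; return false; }

    return true;
}
```

As in the assembly, the y coordinate is only updated when the x test has already passed.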
artrag Posted July 14, 2022 Author
6 hours ago, DZ-Jay said: Still looking into this but here are a few more quick thoughts: [...]
- BGT/BLT replaced
- R5 is safe, BEGIN/RETURN replaced
- yes, any cycle saved there is multiplied by 28 (4 bullets x 7 enemies)
- you should elaborate on using auto-increment registers
- reducing the 4 comparisons to 2 seems possible using the fact that negative values, as unsigned, are larger than $7FFF ... Any proposal where you do not need to reload bx(i) and by(i) into registers at every iteration?
- the critical path is the one passing through all 4 comparisons
Edited July 14, 2022 by artrag
artrag Posted July 14, 2022 Author
Having also ported the enemy bullets to ASM, the CPU time is now sufficient. This should work without frame drops on both NTSC and PAL machines.
Now enemies aim their bullets at the fighter.
TimePilotTest.cfg TimePilotTest.bin
Edited July 14, 2022 by artrag
DZ-Jay Posted July 14, 2022
I’m glad you managed. If you still want to try additional micro-optimizations, just send me the source of the ASM procedures in a PM. I’d be happy to go through them. I could try the ideas I mentioned before.
I’ll also respond more completely later to the questions you posted in the previous messages.
dZ.
artrag Posted July 15, 2022 Author
Now all bullets are entirely managed in assembly and the frame rate is stable in NTSC.
@DZ-Jay do not worry, there is no need for micro-optimisations at the moment; the code is provisional, as I have to add other game elements.
TimePilotTest.cfg TimePilotTest.bin