SebRmv, posted July 15, 2009 (edited)

Quote (SebRmv): They are all aligned on 4 bytes.
Quote (Gorf): Which instruction? JUMP, JR or both? Look at the example carefully before you answer.
Quote (SebRmv): Target labels are also aligned on 4 bytes.
Quote (Gorf): When are they aligned on a long address? And are they always? Again, look carefully.

Sorry, I meant absolute jumps (i.e. JUMP) in both cases. As I said, I haven't yet figured out a rule for relative jumps (i.e. JR).

Quote (SebRmv): My guess was that, for intra-page jumps, jump and target are aligned on 4 bytes.
Quote (Gorf): What do you mean by intra-page jumps? Either you are local to the page (256 bytes) or you are not local. Alignments to local or external are different.
Quote (SebRmv): For inter-page jumps, the same rule applies, but the pipeline should be emptied up to the next phrase boundary before the jump occurs (i.e. add enough NOPs to prevent unexpected results).
Quote (Gorf): It's not so much you making sure the pipeline gets emptied. It's more a question of when the instruction WILL empty the pipeline. But this is still over-analyzing. The pipeline plays an important part but not an overly complicated part. You really do not need to know that to get the alignments right.

Yes: intra-page jumps = local to the page, inter-page jumps = the contrary. Once again, I was just talking about JUMP, but apparently this is not a correct guess.
SebRmv, posted July 15, 2009

OK, so let's recapitulate the rules:
1 -
2 -
3 -
4 -
5 - Two NOPs after JUMP or JR
SebRmv, posted July 15, 2009

Regarding rule 5, are the two NOPs always executed? Or does it depend on whether the jump is absolute or relative, local to the page or not local?
Gorf (author), posted July 16, 2009

Quote (SebRmv): Sorry, I meant absolute jumps (ie JUMP) in both cases.

Ah OK, then now you have TWO RULES figured out.

Quote (SebRmv): As I said, I haven't yet figured out a rule for relative jumps (ie JR)

You are close though. I left you a hint below.

Quote (SebRmv): but apparently this is not a correct guess

Actually it was! It just was not clear you only meant JUMPs. That is why I asked you to clarify. Excellent work so far, Seb.

The rules so far:
1 - The placement of a JUMP must always be long aligned. This is to say the address ends in 0, 4, 8 or C hex.
2 - HINT: JR placement differs from JUMP ....
3 - HINT: All external page jumps must be ....
4 - HINT: All internal page jumps must be ....
5 - Two NOPs must follow each JUMP/JR.

Nice, dude! Three left.
Gorf (author), posted July 16, 2009 (edited)

Quote (SebRmv): Ok, so let's recapitulate the rules 1- 2- 3- 4- 5-Two nops after JUMP or JR

Update... you now have two.
1 - JUMP placement must ALWAYS be long aligned.
2 - ???
3 - ???
4 - ???
5 - Two NOPs must ALWAYS come after a JUMP or a JR instruction.
Gorf (author), posted July 16, 2009

Quote (SebRmv): Regarding rule 5, are the two nops always executed? Or does it depend on the fact that it is absolute/relative, local to the page/not local ?

I have been told that at the very least the first one is executed. Since they both need to be there, I have to believe that either they are both being executed or, at the very least, they are ensuring a clear pipeline for wherever the jump goes. I've never personally tried to put a useful instruction in place of the first or second NOP (or both), but my guess is that it may be possible. I'm quite happy it works with the NOPs... for now.
Atari_Owl, posted July 16, 2009

Quote (SebRmv): Regarding rule 5, are the two nops always executed? Or does it depend on the fact that it is absolute/relative, local to the page/not local ?

Quote (Gorf): I have been told that at very least the first one does but I have to believe by the fact that they both need to be there they are both being executed or at very least they are assuring a clear pipeline for where the Jump will go to. I've never personally tried to put a useful instruction at the first(or second) NOP(or both) but my guess is it may be possible. Im quite happy it works with the NOP's ...for now.

I've been putting a useful instruction in place of the first NOP. These have been single-register instructions, and they seem to work just fine so far.
Gorf (author), posted July 16, 2009

Quote (Atari_Owl): I've been putting a useful instruction in place of the first nop - these have been single register instructions and they seem to work just fine so far

I'm going to have to play around with that. I'd like to know for sure which instructions will and won't work there.
SebRmv, posted July 16, 2009 (edited)

OK, another try:
3 - Non-local jumps: phrase aligned
4 - Local jumps: long aligned
2 - JR: does the rule depend on the length of the jump?
Gorf (author), posted July 16, 2009

Quote (SebRmv): 3 - non local jumps : phrase aligned

No phrase aligns are necessary.

Quote (SebRmv): 4 - local jumps : long aligned

Try figuring out rule #2 first. I will say the alignment is offset.

Quote (SebRmv): 2 - JR: does the rule depends on the length of the jump ?

No, just whether it is in the same page or not. Again, JR's placement is not so critical.
Gorf (author), posted July 16, 2009 (edited)

Let's review, in a little more detail, the label and symbol table provided in the example.

Label table:

GoMovecannon   = 00040084   the JUMP instruction to 'call' MoveCannon() in another PAGE
AimCannonRET   = 0004008C   the 'return' address for MoveCannon() to return TO when finished FROM another PAGE

Now what TWO things do you notice about these two addresses?

MoveCannon     = 00041000   this is the entry point (JUMP FROM another PAGE)
MoveCannonJ1   = 0004101C   this is nothing more than a label to make sure the placement for this JUMP is correct
MoveCannonJ2   = 00041038   same purpose as 'MoveCannonJ1' for this JUMP as well
DoneMoveCannon = 00041070   placement label again, for the last JUMP going back TO the caller (in another PAGE)

What do all of these JUMP placement and destination addresses have in common?

CannonMove     = 00041022   these are all local DESTINATION addresses in the same page being JUMP/JR'ed FROM
SubCannon      = 0004103E
IncHotSpot     = 00041056
DecHotSpot     = 0004105A

What do all of these JUMP/JR destination addresses have in common?

I hope I did not OFFSET your way of thinking about this... WORD up, yo!
Gorf (author), posted July 16, 2009 (edited)

Here is a more detailed JR example... it really says just about all there is to say for JR. Examine the placement of the JRs, the destinations and the pages.

00040000    jr      EQ,nogoodreason
00040002    nop
00040004    nop
00040006    jr      CS,nobetterreason
00040008    nop
0004000A    nop
0004000C    jr      CC,whocares
0004000E    nop
00040010    nop
00040012    jr      nooftheabove
00040014    nop
00040016    nop
00040018    nop
nogoodreason:
0004001A    moveq   #NOGOODREASON,r0
0004001C    jr      nooftheabove
0004001E    nop
00040020    nop
nobetterreason:
00040022    moveq   #NOBETTERREASON,r0
00040024    jr      nooftheabove
00040026    nop
00040028    nop
whocares:
0004002A    moveq   #WHOCARES,r0
0004002C    jr      nooftheabove
0004002E    nop
00040030    nop
nooftheabove:
00040032    jr      nooftheabove
00040034    nop
00040036    nop
mcjakeqcool, posted July 16, 2009

Nice word, Gorf.
SebRmv, posted July 16, 2009

I'll just try to answer your questions. Maybe this will get clearer to me once it is written down.

Quote (Gorf):
GoMovecannon   = 00040084   the JUMP instruction to 'call' MoveCannon() in another PAGE
AimCannonRET   = 0004008C   the 'return' address for MoveCannon() to return TO when finished FROM another PAGE
Now what TWO things do you notice about these two addresses?

Are these addresses long aligned but not phrase aligned (i.e. of the form 8n + 4)?
They end with 4 or C.

Quote (Gorf):
MoveCannon     = 00041000   this is the entry point (JUMP FROM another PAGE)
MoveCannonJ1   = 0004101C   this is nothing more than a label to make sure the placement for this JUMP is correct
MoveCannonJ2   = 00041038   same purpose as 'MoveCannonJ1' for this JUMP as well
DoneMoveCannon = 00041070   placement label again, for the last JUMP going back TO the caller (in another PAGE)
What do all of these JUMP placement and destination addresses have in common?

They end with 0, 8 or C.

Quote (Gorf):
CannonMove     = 00041022   these are all local DESTINATION addresses in the same page being JUMP/JR'ed FROM
SubCannon      = 0004103E
IncHotSpot     = 00041056
DecHotSpot     = 0004105A
What do all of these JUMP/JR destination addresses have in common? I hope I did not OFFSET your way of thinking about this... WORD up, yo!

They end with 2, 6, A or E.
SebRmv, posted July 16, 2009

Quote (Gorf): Here is a more detailed JR example......it really says just about all for JR. Examine the placement of the JR's the destinations and the pages.

JR addresses end with 0, 2, 4, 6 or C.
Target labels end with 2 or A.
Gorf (author), posted July 16, 2009

Quote (Gorf):
GoMovecannon   = 00040084   the JUMP instruction to 'call' MoveCannon() in another PAGE
AimCannonRET   = 0004008C   the 'return' address for MoveCannon() to return TO when finished FROM another PAGE
Quote (SebRmv): Ends with 4 or C

OK... so 4 and C are on what size boundary? Excellent that you see that they are NOT on phrase aligns... no need. There is only one alignment necessary for JUMP placements. You already got this rule, by the way.

Quote (Gorf):
MoveCannon     = 00041000   this is the entry point (JUMP FROM another PAGE)
MoveCannonJ1   = 0004101C   this is nothing more than a label to make sure the placement for this JUMP is correct
MoveCannonJ2   = 00041038   same purpose as 'MoveCannonJ1' for this JUMP as well
DoneMoveCannon = 00041070   placement label again, for the last JUMP going back TO the caller (in another PAGE)
Quote (SebRmv): Ends with 0, 8 or C

OK then... as above, what size boundary are they aligned on? This is a must for all JUMP instruction placements, only and always.

Quote (Gorf):
CannonMove     = 00041022   these are all local DESTINATION addresses in the same page being JUMP/JR'ed FROM
SubCannon      = 0004103E
IncHotSpot     = 00041056
DecHotSpot     = 0004105A
Quote (SebRmv): Ends with 2, 6, A or E

Ooooh, you're getting warm!!!

OK, so if 0, 4, 8, C are on a specific size boundary outside a PAGE, I will repeat....

And if 2, 6, A, E are on a specific size boundary inside a PAGE, I will repeat....
(For crying out loud, Seb, I bolded it for you last time! Maybe this will help.)

I hope I did not OFFSET your way of thinking about this... WORD up, yo!
viMaster, posted July 16, 2009

Quote (Gorf): I hope I did not OFFSET your way of thinking about this... WORD up, yo!

You nut!
SebRmv, posted July 16, 2009

Quote (Gorf): Ooooh you're getting warm!!! Ok so if 0,4,8,C are on a specific size boundary outside a PAGE I will repeat....

OK, I think I get it.
Outside a PAGE: 0 + 4n (i.e. simply long aligned)

Quote (Gorf): And if 2,6,A,E are on a specific size boundary inside a PAGE I will repeat.... ( For crying out loud Seb, I bolded it for you last time! Maybe this will help ) I hope I did not OFFSET your way of thinking about this... WORD up, yo!

Inside a PAGE: 2 + 4n

Is that correct?
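The "0 + 4n" versus "2 + 4n" observation can be checked mechanically against the label table posted earlier in the thread. The following is only a minimal sketch in Python (not part of the original posts): the addresses are copied from that table, while the names `LABELS`, `alignment_class` and `classify` are hypothetical helpers introduced here for illustration.

```python
# Addresses copied from the label table posted earlier in the thread.
LABELS = {
    "GoMovecannon":   0x00040084,
    "AimCannonRET":   0x0004008C,
    "MoveCannon":     0x00041000,
    "MoveCannonJ1":   0x0004101C,
    "MoveCannonJ2":   0x00041038,
    "DoneMoveCannon": 0x00041070,
    "CannonMove":     0x00041022,
    "SubCannon":      0x0004103E,
    "IncHotSpot":     0x00041056,
    "DecHotSpot":     0x0004105A,
}


def alignment_class(addr):
    """Return 'long' for 0 + 4n, 'long+word' for 2 + 4n, 'odd' otherwise."""
    remainder = addr % 4
    if remainder == 0:
        return "long"
    if remainder == 2:
        return "long+word"
    return "odd"


def classify(labels):
    """Map each label name to its alignment class."""
    return {name: alignment_class(addr) for name, addr in labels.items()}


if __name__ == "__main__":
    for name, kind in classify(LABELS).items():
        print(f"{name:15s} {LABELS[name]:08X}  {kind}")
```

Under these assumptions, the first six labels (JUMP placements and cross-page destinations) come out as 'long', and the four local destinations come out as 'long+word'.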
Gorf (author), posted July 17, 2009 (edited)

Quote (SebRmv): Outside a PAGE: 0 + 4n (ie simply long aligned) Inside a PAGE: 2 + 4n Is it correct?

Yes, both are correct, but to be more clear:

An external page destination must be long aligned, i.e. the address ends in 0, 4, 8 or C.
An internal page destination must be long aligned with a word offset, i.e. the address ends in 2, 6, A or E.

Let's review what we have so far then....

1 - The placement of a JUMP must always be long aligned. This is to say the address ends in 0, 4, 8 or C hex.
2 - HINT: JR placement differs from JUMP ....
3 - All external page jumps must be LONG ALIGNED!!!
4 - All internal page jumps must be word offset from long aligned!
5 - Two NOPs must follow each JUMP/JR.

Excellent job, Seb! You are just one more rule away from error-free main code execution.

#2 is the JR placement. I already gave it to you, but I will summarize by showing you the placement addresses one more time....

00040000    jr      EQ,nogoodreason
00040006    jr      CS,nobetterreason
0004000C    jr      CC,whocares
00040012    jr      nooftheabove
0004001C    jr      nooftheabove
00040024    jr      nooftheabove
0004002C    jr      nooftheabove
00040032    jr      nooftheabove

Tell me now... look carefully at the addresses... what do you see? Get this and the whole world knows all the rules to main code execution!
Gorf (author), posted July 17, 2009

Quote (Gorf): I hope I did not OFFSET your way of thinking about this... WORD up, yo!
Quote (viMaster): you nut!

I'm glad someone got it.
viMaster, posted July 17, 2009

Quote (Gorf): I hope I did not OFFSET your way of thinking about this... WORD up, yo!
Quote (viMaster): you nut!
Quote (Gorf): I'm glad some one got it .

But then I cheated... I saw Robert's posts.

I do like your hint system, though.
SebRmv, posted July 17, 2009

Quote (Gorf): I hope I did not OFFSET your way of thinking about this... WORD up, yo!
Quote (viMaster): you nut!
Quote (Gorf): I'm glad some one got it .
Quote (viMaster): But then I cheated... I saw Robert's posts I do like your hint system, though

Unfortunately, my English is not good enough to fully get your pun. But it was a great hint indeed, and now I've got the answer.
SebRmv, posted July 17, 2009

Quote (Gorf):
00040000    jr      EQ,nogoodreason
00040006    jr      CS,nobetterreason
0004000C    jr      CC,whocares
00040012    jr      nooftheabove
0004001C    jr      nooftheabove
00040024    jr      nooftheabove
0004002C    jr      nooftheabove
00040032    jr      nooftheabove
tell me now....look carefully at the addresses...... what do you see? Get this and the whole world knows all the rules to main code execution!

This is where I do not see anything special.
They end with 0, 2, 4, 6 or C.
If I take these MODULO 4, this means no special alignment is needed.
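The modulo-4 observation can be reproduced directly from the JR listing. This is a small Python sketch (not from the original posts) using the placement and target addresses as posted; the names `JR_PLACEMENTS`, `JR_TARGETS` and `residues` are hypothetical, and the 256-byte page size is the figure Gorf mentions earlier in the thread.

```python
# JR placement addresses, copied from the posted listing.
JR_PLACEMENTS = [
    0x00040000, 0x00040006, 0x0004000C, 0x00040012,
    0x0004001C, 0x00040024, 0x0004002C, 0x00040032,
]

# JR target labels, copied from the same listing.
JR_TARGETS = {
    "nogoodreason":   0x0004001A,
    "nobetterreason": 0x00040022,
    "whocares":       0x0004002A,
    "nooftheabove":   0x00040032,
}

PAGE_SIZE = 256  # page size as stated in the thread


def residues(addresses):
    """Return the sorted set of address remainders modulo 4."""
    return sorted({addr % 4 for addr in addresses})


if __name__ == "__main__":
    print("placement residues mod 4:", residues(JR_PLACEMENTS))
    print("target residues mod 4:   ", residues(JR_TARGETS.values()))
    pages = {addr // PAGE_SIZE for addr in JR_PLACEMENTS}
    pages |= {addr // PAGE_SIZE for addr in JR_TARGETS.values()}
    print("pages touched:", [hex(p) for p in sorted(pages)])
```

The placements land on both 0 and 2 modulo 4, which is consistent with "no special alignment" for where a JR sits, while every target in this same-page example is 2 modulo 4.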
SebRmv, posted July 17, 2009 (edited)

Finally, I just had a look at the SMAC source code. It seems that the five rules are:

1 - JUMP placement must ALWAYS be long aligned.
2 - JR placement is FREE.
3 - All external page jumps must be LONG ALIGNED!!!
4 - All internal page jumps must be word offset from long aligned!
5 - Two NOPs must ALWAYS come after a JUMP or a JR instruction.

It also seems that it is forbidden to make a JR from main memory to local memory and vice versa. But what about JUMP?
Is that correct, Gorf?

--

First, I have to say that I am impressed that you found this on your own. Rule 4 especially seems to come from nowhere, but I guess it is somehow related to rule 5. How did you figure it out? By looking at the NET files?

--

Let's continue the technical discussion.

What happens when interrupts are enabled? How does it behave?
What happens if each rule is individually broken?
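The address-based rules (1 to 4) in the list above lend themselves to a small checker. The following Python sketch is an illustration only, not SMAC's actual logic: `check_branch` and `page_of` are hypothetical names, "internal" is assumed to mean that placement and destination fall in the same 256-byte page, and rule 5 (the two NOPs) is not checked because it cannot be seen from addresses alone.

```python
PAGE_SIZE = 256  # page size as stated earlier in the thread


def page_of(addr):
    """Return the index of the 256-byte page containing addr."""
    return addr // PAGE_SIZE


def check_branch(kind, placement, target):
    """Return a list of violated rules for one JUMP or JR.

    kind      -- "JUMP" or "JR"
    placement -- address of the branch instruction itself
    target    -- address of the destination
    """
    if kind not in ("JUMP", "JR"):
        raise ValueError("kind must be 'JUMP' or 'JR'")

    problems = []

    # Rule 1: JUMP placement must be long aligned.
    # Rule 2: JR placement is free, so there is nothing to check for JR.
    if kind == "JUMP" and placement % 4 != 0:
        problems.append("rule 1: JUMP placement is not long aligned")

    if page_of(placement) == page_of(target):
        # Rule 4: internal destinations are long aligned plus a word offset.
        if target % 4 != 2:
            problems.append("rule 4: internal destination is not 2 + 4n")
    else:
        # Rule 3: external destinations are long aligned.
        if target % 4 != 0:
            problems.append("rule 3: external destination is not long aligned")

    return problems


if __name__ == "__main__":
    # Cross-page call from the label table: GoMovecannon -> MoveCannon.
    print(check_branch("JUMP", 0x00040084, 0x00041000))
    # Same-page JR from the posted listing: 00040000 -> nogoodreason.
    print(check_branch("JR", 0x00040000, 0x0004001A))
    # A deliberately misplaced JUMP with a misaligned external target.
    print(check_branch("JUMP", 0x00040086, 0x00041002))
```

The first two calls use addresses from the thread and report no problems; the third uses made-up addresses to show both a rule 1 and a rule 3 violation.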
Gorf (author), posted July 17, 2009

Quote (SebRmv):
1 - JUMP placement must ALWAYS be long aligned.
2 - JR placement is FREE
3 - All external page jumps must be LONG ALIGNED!!!
4 - All internal page jumps must be word offset from long aligned!
5 - Two NOP's must ALWAYS come after a JUMP or a JR instruction.

DING! DING! DING! DING! DING! DING! DING! DING!

You win the 64,000 BIT question contest!! Your prize? You figured it out on your own!!! Now isn't that better?

Quote (SebRmv): It seems also that it is forbidden to make a JR from main memory to local memory and vice versa. But what about JUMP? Is that correct Gorf?

Since there are no pages of main RAM before or after $00F03000 close enough for a JR to reach, it is impossible, but not because it is forbidden. SMAC does not allow it, however, which is proper of course.

Quote (SebRmv): First I have to say that I am impressed that you found this on your own. Especially rule 4 seems to come from nowhere. But I guess it is somehow related to rule 5. How did you figure out? By looking at the NET files?

Honest? You really want to know? I simply got tired of watching my frames drop in what was formerly 'Gorf 3D'. This was due not only to the inefficient renderer but also to the fact that I was flipping 6 modules into the local RAM. That is a few thousand cycles per frame for each flip, so you can easily eat up 4,000+ cycles just moving code to the GPU... and you can't run the GPU in local while you are loading its local RAM (unless you are real clever). So you not only eat up roughly 4,000+ cycles blitting the code, you also lose all that time during which the GPU is sitting stopped. Effectively you are losing 8,000+ cycles per module. 8,000 x 6 modules = 48,000 cycles per frame wasted. This does not include the cumbersome blitter code setup every frame or any bus contention with the DSP and OPL either.

So with that, I was determined to find a way to get around the bug. I already knew that one of the problems was that jumps were not reliable across a page. This is all in the docs. Reading the bug reports reveals much about the flaws in T&J. So I just wrote code and tried to get a jump from one page to another, where it simply added to a counter variable and signaled the 68k to print the value. It was running inside a page that screwed me up. I could not get anything to work on long aligns and just simply said... "FUCK IT", I'm going to offset them by a word. BINGO! It worked, and then I whipped up Surrounded in a couple of weeks to be sure.

The NETs confirmed what I had discovered, after the fact. Downix, who has been consuming the NETs for a few years now, probably knows more about the chips than even the designers did. Every time I speak with him, I'm left scratching my head. He essentially said he found the hardware issue that causes the bug. If he wants to explain it, he may. Every time I try, I screw it up, so...

Quote (SebRmv): What happens when interruptions are enabled? How does it behave?

Scott (JagMod) is now running tests to see exactly how this works. He has told me that he has stable interrupt code running out in main. Still working on this one.

Remember, when the GPU is interrupted, it will jump to $00F03000 + (interrupt number * 16 bytes) as its vector. So if you are indeed out in main, the GPU will certainly manage to jump back to local, as all those vector addresses land on a long align in local. Our only concern is where in main the interrupt occurs. If the resume address is misaligned it could spell disaster, but so far Scott says it seems to be working. It's very small test code, though, and he might just be getting lucky. OR perhaps the pipeline and prefetch ensure that the last instruction executed leaves the resume address on a long align (pure conjecture on my part, though).

Quote (SebRmv): What happens if each rule is individually broken?

It will fail every time. There is no room for error. Follow the rules and the code never fails. Misalign one thing and it's 'game over, man'. Again, the only exception is to rule #5. Owl says you can use certain single-register instructions in place of the first NOP. I think you need to make sure you are not using a register that was recently used before the jump. Owl can clarify that, though. I have never tried this yet, so I can't honestly comment.

Welp, there you have it, Seb. Now think for a moment how difficult it would be to track every address by hand without a tool like SMAC. I can tell you, if I had had SMAC, I could have written Surrounded! in a day or two. Instead I had to globalize EVERY jump placement and destination and add NOPs to pad them to the proper alignment. I then would have ALN output a list of symbols to track those addresses. Nightmare indeed.
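Two pieces of arithmetic from the post above can be sketched in Python: the interrupt vector formula, and the by-hand NOP padding needed to move a label onto the required alignment. This is a rough illustration under stated assumptions, not a description of how SMAC works. `interrupt_vector` and `nops_to_pad` are hypothetical helper names; the base address and the 16-byte vector spacing come from the post, and the 2-byte NOP size is read off the posted listing, where consecutive NOPs sit 2 bytes apart.

```python
GPU_LOCAL_BASE = 0x00F03000  # base address given in the post
VECTOR_SPACING = 16          # bytes between vectors, as given in the post
NOP_BYTES = 2                # one NOP occupies one word in the posted listing


def interrupt_vector(number):
    """Return the vector address for a given interrupt number."""
    return GPU_LOCAL_BASE + number * VECTOR_SPACING


def nops_to_pad(addr, wanted_residue):
    """Return how many NOPs move addr to the wanted residue modulo 4.

    wanted_residue is 0 for a long-aligned address (JUMP placement or
    external destination) and 2 for a word-offset address (internal
    destination). addr is assumed to be word aligned.
    """
    if addr % 2 != 0:
        raise ValueError("addr must be word aligned")
    if wanted_residue not in (0, 2):
        raise ValueError("wanted_residue must be 0 or 2")
    return ((wanted_residue - addr) % 4) // NOP_BYTES


if __name__ == "__main__":
    # Every vector is a multiple of 16 past the base, hence long aligned.
    for number in range(4):  # the count of 4 is arbitrary, for illustration
        print(f"interrupt {number}: {interrupt_vector(number):08X}")

    # In the posted listing a third NOP sits at 00040018, which pushes
    # the label 'nogoodreason' to 0004001A (2 + 4n).
    print(nops_to_pad(0x00040018, 2))
```

With word-aligned code the answer is always zero or one NOP, which matches the single extra NOP visible at 00040018 in the JR listing.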