djmips Posted June 12, 2005

As I was growing up, I kept a notebook full of cool code snippets and ideas. The notebook had been misplaced, but I ran across it recently, and here is one of the pages, taken from a 1987 Dr. Dobb's Journal article by Mark S. Ackerman, "6502 Killer Hacks". Post your own 6502 Killer Hacks and share them with the rest of us!

I also looked up Mark S. Ackerman with our trusty tool Google and found his 'vita'. Pretty sure it's the same guy, as he worked at GCC from 1982 to 1984 and was the lead on Ms. PacMan, Galaxian and Moon Patrol. Time to update the AtariAge database, as these games are empty when it comes to staff. He has a patent on the Galaxian kernel.

Well, here is the killer hack. This one is to scrimp on RAM.

Incrementing only the lower 4 bits of a byte (with wrap)

    ...
    lda word        ; original byte
    and #$0f        ; retrieve lower nybble
    tay             ; index
    lda word
    clc             ; might not be needed
    adc nextinc,y   ; could be ora or sbc
    sta word
    ...
nextinc
    .byte 1,2,3,4,5,6,7,8
    .byte 9,10,11,12,13,14,15,0

Well, funny thing is, maybe I didn't transcribe it properly back in '87, because it doesn't seem like it would work. Seems like it needs an AND #$F0 after the second LDA word. So I thought I'd take a shot at a working version...

    ...
    lda word        ; original byte
    and #$0f        ; retrieve lower nybble
    tay             ; index
    lda word
    clc
    adc nextinc,y
    sta word
    ...
nextinc
    .byte 1,1,1,1,1,1,1,1
    .byte 1,1,1,1,1,1,1,-15

Who knows if that one works either? If someone has the original article from the Feb 1987 Dr. Dobb's Journal, I'd be curious to see the code. Also, post your own 6502 Killer Hacks and share them with the rest of us!

- David

Updated 2017: Just came across the original PDF of the article by Mark S. Ackerman and confirmed that I did transcribe it incorrectly, but my fixed version is the same as the published version.

http://archive.6502.org/publications/dr_dobbs_journal_selected_articles/6502_hacks.pdf

See the following post for a better version of this hack.

(Edited May 25, 2021 by djmips: improvement.)
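For anyone who wants to check the arithmetic without an assembler, the two tables above can be modelled in a few lines of Python. This is only a sketch: `inc_low_nibble`, `FIXED` and `TRANSCRIBED` are names made up for the illustration, and the model assumes binary (non-decimal) mode and simply drops the carry out of the ADC.

```python
# Python model of the table-driven nibble increment (illustrative names only).
# Assumes binary mode; the carry out of ADC is discarded.

FIXED = [1] * 15 + [-15 & 0xFF]          # .byte 1,...,1,-15  (-15 assembles to $F1)
TRANSCRIBED = list(range(1, 16)) + [0]   # .byte 1,2,...,15,0 (the notebook version)


def inc_low_nibble(word, table):
    """Mimic: lda word / and #$0f / tay / lda word / clc / adc table,y / sta word."""
    y = word & 0x0F                    # low nybble selects the table entry
    return (word + table[y]) & 0xFF    # 8-bit add, carry out dropped


if __name__ == "__main__":
    for w in (0x30, 0x3E, 0x3F):
        print(f"${w:02X} -> ${inc_low_nibble(w, FIXED):02X}")
```

With the table of 1s ending in -15, the upper nybble is left alone and the lower one wraps from $F to $0. With the 1,2,...,15,0 table the same ADC turns $31 into $33, which fits the remark that the transcribed version needs an AND #$F0 (and then an ORA rather than an ADC).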
Cybergoth Posted June 12, 2005

Hi there!

Yours should work a lot better than the other version. So you're going for speed here? A version without a table would certainly waste less ROM space:

    LAX word
    INX
    AND #$F0
    STA temp
    TXA
    AND #$0F
    ORA temp
    STA word

As many cycles, but 14 bytes saved... (Also, this can count n bits; it's not fixed to 4.)

Greetings, Manuel
djmips Posted June 12, 2005

I like that version, nice use of LAX, but you gotta be brave and not use the temp:

    LAX word
    INX
    AND #$F0
    STA word
    TXA
    AND #$0F
    ORA word
    STA word
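The no-temp variant is easy to check by stepping through it in a small Python model. This is a sketch under stated assumptions: `inc_low_bits` and its `mask` parameter are hypothetical names, and LAX is modelled as the undocumented "load A and X together" opcode described in the thread.

```python
# Python model of the LAX/INX version that reuses `word` as its own scratch byte.
# `mask` selects the counter bits (0x0F for the low nybble); names are illustrative.

def inc_low_bits(word, mask=0x0F):
    a = x = word & 0xFF        # LAX word  (A and X both get the byte)
    x = (x + 1) & 0xFF         # INX
    a &= ~mask & 0xFF          # AND #$F0  (keep the bits that must not change)
    word = a                   # STA word  (word briefly holds only the upper bits)
    a = x                      # TXA
    a &= mask                  # AND #$0F  (keep only the incremented counter bits)
    a |= word                  # ORA word
    return a                   # STA word


if __name__ == "__main__":
    print(hex(inc_low_bits(0x3F)))        # low nybble wraps, upper nybble kept
    print(hex(inc_low_bits(0x37, 0x07)))  # same idea with a 3-bit counter
```

Changing the two masks is all it takes to count a different number of low bits, which matches the remark that the routine is not fixed to 4 bits. The one thing to keep in mind is that `word` holds a value with the counter bits cleared between the first and the last store.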
Cybergoth Posted June 12, 2005

Hi there!

Uihjah... so my kung fu is ok, just needs some work on the finishing move...

Greetings, Manuel
Alex H Posted June 12, 2005

Here's one I like:

    ; unsigned divide by 3
    sta temp
    lsr
    lsr
    clc
    adc temp
    ror
    lsr
    clc
    adc temp
    ror
    lsr
    clc
    adc temp
    ror
    lsr
    clc
    adc temp
    ror
    lsr
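The divide routine sums the series n/4 + n/16 + n/64 + ..., which converges to n/3. A small Python model makes it easy to see how close the truncated 8-bit version gets. This is a sketch, assuming the instruction sequence is read as shown above; `div3_approx` is a name invented for the illustration.

```python
# Python model of the shift-and-add "divide by 3" sequence (illustrative name).
# Each round computes (a + n) >> 2; ADC's carry supplies the ninth bit that ROR
# rotates back in, so the intermediate sum is never truncated to 8 bits.

def div3_approx(n):
    temp = n & 0xFF            # sta temp
    a = temp >> 2              # lsr / lsr
    for _ in range(4):         # four identical rounds in the listing
        total = a + temp       # clc / adc temp  (9-bit sum, bit 8 lands in carry)
        a = total >> 2         # ror (carry comes back in as bit 7) / lsr
    return a


if __name__ == "__main__":
    for n in (3, 10, 100, 255):
        print(n, div3_approx(n), n // 3)
```

If the model is faithful, the result is never too high but can be one too low, because every shift discards a fraction: 10 and 100 come out exact, while 3 gives 0 and 255 gives 84. That is fine for many uses, but worth knowing before relying on it for exact quotients.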
batari Posted June 12, 2005

I think I can save a byte... I am not totally sure if this will work though, I've been wrong before.

    inc word
    lax word
    and #$0f
    bne no
    txa
    sbx #$10
    stx word
no
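A Python model suggests this version does work. It is a sketch that takes SBX to be the undocumented opcode computing X = (A AND X) - immediate with the carry ignored; `inc_low_nibble_sbx` is a made-up name.

```python
# Python model of the INC / LAX / SBX version (illustrative name).
# SBX is modelled as X = (A & X) - imm, ignoring the carry flag.

def inc_low_nibble_sbx(word):
    word = (word + 1) & 0xFF            # inc word  (may carry into the upper nybble)
    a = x = word                        # lax word
    a &= 0x0F                           # and #$0f
    if a == 0:                          # bne no    (skip the fix-up unless we wrapped)
        a = x                           # txa       (A = X, so A & X is just the byte)
        x = ((a & x) - 0x10) & 0xFF     # sbx #$10  (undo the carry into the upper nybble)
        word = x                        # stx word
    return word                         # no


if __name__ == "__main__":
    print(hex(inc_low_nibble_sbx(0x3E)), hex(inc_low_nibble_sbx(0x3F)))
```

Counting bytes with zero-page addressing, this comes to 13 against 14 for the LAX/INX version, so the one-byte saving holds.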
Cybergoth Posted June 13, 2005

Hi there!

The last 2 instructions should rather be SBC and STA, I think. And, to make it totally failproof, you'd probably need to add a SEC before the subtraction.

Greetings, Manuel
batari Posted June 13, 2005

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!). For this reason I've found it useful to save a byte here and there.
Cybergoth Posted June 13, 2005

Hi there!

Oh... I thought that opcode was called AXS. Anyway, clever usage!

Greetings, Manuel
batari Posted June 13, 2005

I've noticed that different documents use different mnemonics for the illegals, so maybe SBX=AXS. SBX works in dasm; maybe AXS does too.
Cybergoth Posted June 13, 2005

Hi there!

To contribute something myself, here's an IMO very useful bit I wrote for Gunfight back then, and which I'm using one way or another in Star Fire and Crazy Balloon as well. It checks whether a point is within a rectangle (software collision detection!):

    LDA rect.right
    SBC point.x
    BMI NoHit
    SBC rect.width
    BPL NoHit
    LDA rect.top
    SBC point.y
    BMI NoHit
    SBC rect.height
    BPL NoHit
    ;BANG!
NoHit

Greetings, Manuel
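Because the routine never touches the carry explicitly, its exact boundaries depend on the carry it inherits. The Python model below is a sketch that threads the carry through all four SBCs; `sbc`, `point_in_rect` and the `carry_in` parameter are hypothetical names, and the model assumes binary mode and coordinates small enough that bit 7 of each difference acts as a sign.

```python
# Python model of the point-in-rectangle test, with the carry flag carried along.
# All names are illustrative; binary mode is assumed.

def sbc(a, m, carry):
    """8-bit SBC: A - M - (1 - C). Returns (result, carry_out); carry set = no borrow."""
    raw = a - m - (1 - carry)
    return raw & 0xFF, 1 if raw >= 0 else 0


def point_in_rect(right, top, width, height, x, y, carry_in=1):
    carry = carry_in                              # whatever the caller left in C
    for edge, coord, size in ((right, x, width), (top, y, height)):
        diff, carry = sbc(edge, coord, carry)     # LDA edge / SBC coord
        if diff & 0x80:                           # BMI NoHit (point beyond the edge)
            return False
        diff, carry = sbc(diff, size, carry)      # SBC size
        if not diff & 0x80:                       # BPL NoHit (point before the far side)
            return False
    return True                                   # ;BANG!


if __name__ == "__main__":
    print(point_in_rect(50, 80, 10, 10, x=45, y=75))
    print(point_in_rect(50, 80, 10, 10, x=50, y=80))
```

If this model is right, a hit on the X axis always leaves the carry clear (the second SBC borrows), so the Y test subtracts one extra: with the carry set on entry, x equal to rect.right counts as a hit but y equal to rect.top does not. Whether that matters depends on how the rectangle edges are defined in the game.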
vdub_bobby Posted June 13, 2005

Here's the software collision routine I worked up for Go Fish! A little different - I use it to check if two boxes overlap. Call it once for X values, then call it again with Y values.

CheckBoundaries
    lda rect1.leftortop
    cmp rect2.leftortop
    bmi Check2
    cmp rect2.rightorbottom
    bmi InsideBoundingBox
Check2
    lda rect2.leftortop
    cmp rect1.leftortop
    bmi NotInsideBoundingBox
    cmp rect1.rightorbottom
    bpl NotInsideBoundingBox
InsideBoundingBox
    sec
    rts
NotInsideBoundingBox
    clc
    rts
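The branch structure of CheckBoundaries is a one-axis interval overlap test, which a Python model can compare against the textbook condition. This is a sketch: `check_boundaries`, `boxes_overlap` and `_minus` are invented names, and the model assumes coordinates differ by less than 128 so that BMI/BPL after CMP behave like a signed comparison.

```python
# Python model of CheckBoundaries for one axis, plus the two-call usage.
# Names are illustrative; differences are assumed to stay below 128.

def _minus(a, b):
    """True when CMP leaves N set, i.e. bit 7 of the 8-bit difference a - b."""
    return bool((a - b) & 0x80)


def check_boundaries(r1_lo, r1_hi, r2_lo, r2_hi):
    a = r1_lo                          # lda rect1.leftortop
    if not _minus(a, r2_lo):           # cmp rect2.leftortop / bmi Check2
        if _minus(a, r2_hi):           # cmp rect2.rightorbottom / bmi InsideBoundingBox
            return True                # sec / rts
    a = r2_lo                          # Check2: lda rect2.leftortop
    if _minus(a, r1_lo):               # cmp rect1.leftortop / bmi NotInsideBoundingBox
        return False                   # clc / rts
    if not _minus(a, r1_hi):           # cmp rect1.rightorbottom / bpl NotInsideBoundingBox
        return False                   # clc / rts
    return True                        # sec / rts


def boxes_overlap(box1, box2):
    """Each box is (left, right, top, bottom); one call per axis, as in the post."""
    return (check_boundaries(box1[0], box1[1], box2[0], box2[1])
            and check_boundaries(box1[2], box1[3], box2[2], box2[3]))


if __name__ == "__main__":
    print(boxes_overlap((10, 20, 10, 20), (15, 25, 5, 12)))
    print(boxes_overlap((10, 20, 10, 20), (20, 30, 10, 20)))
```

In this model the intervals are half-open: boxes that merely touch (one box's right equal to the other's left) do not count as overlapping.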
Cybergoth Posted June 13, 2005

Hi there!

Do you use it for fish:fish collisions? Probably not, as the hardware detection should be good enough for that, right?

Greetings, Manuel
vdub_bobby Posted June 13, 2005

I use it for playerLEFT-controlled-fish to playerRIGHT-controlled-fish collisions in the two-player game. Flicker.
batari Posted June 13, 2005

What this does is set a number of random bits in a memory location. X is defined before calling this routine. Actually, it's not "killer" yet - I think this can be improved for cycles as well as space... but I'm at a loss for ideas... Anyone?

EDIT: oops, "bits" should be zero page, not immediate.

makemines
    lda bits
    sta TEMPVAR
loop
    JSR randomize       ; returns random value in accumulator
    AND #7
    TAY
    LDA maskbit,y
    ORA minefield-1,x   ; x is defined outside this routine
    STA minefield-1,x
    dec TEMPVAR
    BPL loop
    rts
maskbit
    .byte %00000001
    .byte %00000010
    .byte %00000100
    .byte %00001000
    .byte %00010000
    .byte %00100000
    .byte %01000000
    .byte %10000000

(Edited June 13, 2005 by batari)
djmips Posted June 14, 2005

Hmmm. So let me see: bits + 1 is the maximum number of bits you want set. So if bits is 2, for example, a legitimate output has anywhere from 1 to 3 bits set, because your random routine could return the same result each time, for instance. Is that what you really want, or does it matter whether the routine always sets the full number of bits?

I thought about something where you generate a bit per loop, and the following is an idea for the inner loop.

loop:
    jsr random
    cmp threshold
    rol temp

I don't think this approach will result in an improvement over your version, but maybe it sparks an idea.
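The point about repeated random values can be made concrete with a Python model of makemines that is fed a fixed list of "random" numbers. This is a sketch: `makemines` here takes the memory cell and the random values as arguments purely for illustration, and it assumes `bits` is below 128 so that the DEC/BPL loop runs bits + 1 times.

```python
# Python model of the makemines loop with a scripted random source.
# Names and the argument list are illustrative; `bits` is assumed to be < 128.

MASKBIT = [1 << i for i in range(8)]        # the maskbit table, %00000001 .. %10000000


def makemines(cell, bits, rand_values):
    rnd = iter(rand_values)
    temp = bits                             # lda bits / sta TEMPVAR
    while True:
        y = next(rnd) & 7                   # jsr randomize / and #7 / tay
        cell |= MASKBIT[y]                  # lda maskbit,y / ora minefield-1,x / sta ...
        temp = (temp - 1) & 0xFF            # dec TEMPVAR
        if temp & 0x80:                     # bpl loop falls through once TEMPVAR goes negative
            return cell                     # rts


if __name__ == "__main__":
    print(bin(makemines(0, 2, [0, 1, 2])))  # three different picks
    print(bin(makemines(0, 2, [5, 5, 5])))  # the same pick three times
```

With bits set to 2 the loop body runs three times, and the number of bits actually set ranges from one to three depending on how many picks coincide, which is exactly the behaviour being questioned.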
batari Posted June 15, 2005

I thought of doing something similar; it would save bytes but add cycles... But it does spark an idea. This routine is only called every 12 frames, so I should use something like the above and spread it out over several frames; then I'll have cycles to spare.
batari Posted June 15, 2005

I like little code snippets like the ones posted here... I don't want this thread to die, so I'll post another one of my hacks that saved a few bytes.

In 2600 games, there are often tons of STA WSYNCs, so I wondered if there was a way to get basically the same effect while saving space. So I came up with this, which works in cases where you have some kernel timing to spare and the stack pointer is constant (let's assume $FF). Basically, you replace all STA WSYNCs with BRK, but don't add the extra byte after the BRK, by using this short BRK routine:

brkroutine
    DEC $FE     ; correct return address to eliminate the byte after the BRK
                ; ONLY works when low byte of return address is not zero!
    STA WSYNC
    RTI

If you replace 6 or more STA WSYNCs, you start saving space...

(Edited June 15, 2005 by batari)
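The break-even point follows from simple byte counting, sketched below in Python. The constants assume zero-page addressing for DEC $FE and STA WSYNC, and they assume the BRK/IRQ vector is already present in the ROM, so it is not charged to the trick; `bytes_saved` is a made-up helper name. The DEC $FE works because the 2600's RAM at $80-$FF also appears at $0180-$01FF, so with the stack pointer at $FF the low byte of the pushed return address sits at $FE.

```python
# Byte accounting for replacing STA WSYNC with BRK (illustrative names).
# Assumes zero-page addressing and that the BRK/IRQ vector is in the ROM anyway.

STA_WSYNC_BYTES = 2            # sta WSYNC
BRK_BYTES = 1                  # brk, with the usual padding byte skipped by the handler
HANDLER_BYTES = 2 + 2 + 1      # dec $FE / sta WSYNC / rti


def bytes_saved(replaced):
    """Net ROM bytes saved after replacing `replaced` STA WSYNCs with BRK."""
    return replaced * (STA_WSYNC_BYTES - BRK_BYTES) - HANDLER_BYTES


if __name__ == "__main__":
    for n in (1, 5, 6, 20):
        print(n, bytes_saved(n))
```

Five replacements only pay for the handler; the sixth is the first one that actually saves a byte, which agrees with the figure in the post.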
Bruce Tomlin Posted June 15, 2005

...but you better not be tight for cycles! That adds 6 cycles after the STA WSYNC, and 10 before it.

In Red Box/Blue Box there were only a few places that I could JSR to a copy of my DoSound macro, and it saved a LOT of bytes when I did. There were only a few places because I only had 15 cycles to spare on each scan line after the DoSound macro. It would have only been 11 except that I saved 4 cycles from using LAX.
batari Posted June 15, 2005

Yeah, there were lots of places where I couldn't use this trick. But if you're already using this trick, I think you could use it to create an 8 byte VSYNC!

Now, this assumes that BRK/RTI will restore flags on returning. It does, right? If so, then this should work when SP=$FF and you use it right after your INTIM loop, like this:

.1  LDX INTIM
    BNE .1

so you are certain X=0 and the Z flag is 1. Anyway:

    ; 8 byte VSYNC!
    BRK         ; STA WSYNC, plus restore flags (?)
    TXS         ; stack pointer = 0, which is VSYNC
    PHP         ; Z=1, which writes a 1 to bit 1 of VSYNC
    BRK
    BRK
    BRK
    STX VSYNC
djmips Posted June 16, 2005

Here's a killer hack from the stella archives. It's near and dear to me because it is a key component used in most of the modern moving 48-wide sprite code (like my Amiga Boing demo 2.0, derived from R. Kudla / E. Stolberg; it is also used in the various Fu Kung demos from A. Davies). Also, it is very cool. Definitely a killer hack.

It was originally posted by the late Jim Nitchals on Mar 18 1998:

Hi,

Here's a way to implement single cycle resolution without the use of the carry flag (which adds overhead in the setup and at the end):

; A is assumed to hold the delay value plus the offset address of JumpTable.
; Or, you can align JumpTable to a page boundary.

    sta indjmp
    jmp (indjmp)    ; point indjmp+1 to JumpTable somewhere in your init code

JumpTable:
    dc.b $C9
    dc.b $C9
; repeat as many $C9's as you need for the maximum number of cycles
; you need to delay by.
    dc.b $C9        ; opcode: CMP immediate (4 cycles: uses the $C5, executes
                    ; the NOP below.)
    dc.b $C5        ; opcode: CMP zero page (3 cycles, uses up the NOP as a
                    ; destination address of $EA)
    nop             ; opcode: NOP (2 cycles by itself)

You may find the reduced overhead of this technique useful.
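Why every entry point differs by exactly one cycle is easiest to see by decoding the table from each possible starting byte. The Python sketch below does that with the documented sizes and timings of the three opcodes involved; `make_table` and `delay_cycles` are invented names, and the fixed cost of the `sta indjmp` / `jmp (indjmp)` pair is deliberately left out.

```python
# Python model of the overlapping-opcode delay table (illustrative names).
# Only the cycles spent inside the table are counted.

OPCODES = {
    0xC9: (2, 2),   # CMP #imm : 2 bytes, 2 cycles
    0xC5: (2, 3),   # CMP zp   : 2 bytes, 3 cycles
    0xEA: (1, 2),   # NOP      : 1 byte,  2 cycles
}


def make_table(extra):
    """`extra` leading $C9 bytes followed by the fixed $C9, $C5, $EA tail."""
    return [0xC9] * extra + [0xC9, 0xC5, 0xEA]


def delay_cycles(table, entry):
    """Cycles burned when execution enters the table at index `entry`."""
    pc, cycles = entry, 0
    while pc < len(table):
        size, cost = OPCODES[table[pc]]   # whatever byte we land on is the opcode
        pc += size                        # its operand swallows the next byte, if any
        cycles += cost
    return cycles


if __name__ == "__main__":
    table = make_table(4)
    for entry in range(len(table)):
        print(entry, delay_cycles(table, entry))
```

Entering at the NOP costs 2 cycles, at the $C5 costs 3, and each byte further back adds one more, so a table of N bytes covers every delay from 2 to N + 1 cycles. Entry points of one parity end on the NOP and the other parity end on the CMP zero page, which is where the odd and even cycle counts come from. Note that the CMPs change the flags.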
cd-w Posted June 19, 2005

This is a fairly obvious hack, but here goes ...

If you want to display more than 2 sprites, you can use the missile and ball graphics to construct pseudo-sprites. Obviously you can only display a limited number of shapes, but if you are clever, you can obtain the appearance of extra flicker-free sprites. The following code fragment illustrates how to draw a man sprite using only missile 0:

Kernel
    sta WSYNC
    sta HMOVE           ; [0] + 3

    ; Draw Sprite (SwitchDraw Variant)
    cpy PSWITCH         ; [3] + 3
    bpl PSwitch         ; [6] + 2/3
    lda (PPTR),Y        ; [8] + 5
    sta ENAM0           ; [13] + 3
    sta HMM0            ; [16] + 3
    asl                 ; [19] + 2
    asl                 ; [21] + 2
    sta NUSIZ0          ; [23] + 3
PContinue
    dey
    bpl Kernel

    ; SwitchDraw Routines
PSwitch
    bne PWait           ; [9] + 2/3
    lda PEND            ; [11] + 3
    sta PSWITCH         ; [14] + 3
    SLEEP 6             ; [17] + 6
    bcs PContinue       ; [23] + 3
PWait
    sta HMCLR           ; [12] + 3
    SLEEP 8             ; [15] + 8
    bpl PContinue       ; [23] + 3

    ; Player Data
    ; Bit 7-4 = HMove
    ; Bit 3-2 = Missile Width (1, 2, 4, or 8 pixels)
    ; Bit 1-0 = Missile Enable
Player1
    DC.B %00000000
    DC.B %00000110
    DC.B %00000010
    DC.B %00000110
    DC.B %11111010
    DC.B %00001010
    DC.B %00001010
    DC.B %00001010
    DC.B %00001010
    DC.B %00010010
    DC.B %00000110
    DC.B %00000010
    DC.B %00000110
    DC.B %11111010
    DC.B %00001010
    DC.B %00010110

I have attached the full code to this message, which allows you to move the sprite around the screen.

Chris

(Attachment: msprite.zip)
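Each data byte packs three fields that the kernel fans out to ENAM0, HMM0 and (after two ASLs) NUSIZ0. A small Python decoder makes the table easier to read. This is a sketch: `decode_row` is a made-up helper, the motion nybble is simply shown as a signed 4-bit value, and the enable flag is taken from bit 1, the bit ENAM0 looks at.

```python
# Python decoder for one row of the packed missile-sprite data (illustrative name).
# Layout follows the comments in the listing: bits 7-4 motion, 3-2 width, 1-0 enable.

def decode_row(value):
    motion = value >> 4                   # bits 7-4 go to HMM0 unchanged
    if motion & 0x8:
        motion -= 16                      # show the nybble as a signed 4-bit number
    width = 1 << ((value >> 2) & 0x3)     # bits 3-2 become NUSIZ0 bits 5-4 after asl/asl
    enabled = bool(value & 0x02)          # ENAM0 uses bit 1
    return motion, width, enabled


if __name__ == "__main__":
    rows = [0b00000000, 0b00000110, 0b11111010, 0b00010010]
    for row in rows:
        print(f"{row:08b}", decode_row(row))
```

Read this way, a row such as %11111010 is an enabled, 4-pixel-wide missile with a motion value of -1, and %00000000 is a blank row.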
supercat Posted June 20, 2005

batari wrote: "Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!) For this reason I've found it useful to save a byte here and there."

What's the best resource for finding a list of such opcodes, DASM's preferred mnemonics for them, and any side-effects or weirdnesses?
batari Posted June 21, 2005

Emulator source code is the most comprehensive resource. Second best is probably this.
djmips Posted July 14, 2005

Not a killer hack, but I'd like to share a link to an interview with William Mensch, 6502 design team member.

Real format video of interview

(Edited February 25, 2021 by djmips)