Thomas Jentzsch Posted March 9, 2007 Share Posted March 9, 2007 This is how I would do it: lda v ; 2 ldx #-1 ; 2 .count: inx ; 2 .loop: lsr ; 2 bcs .count ; 2³ bne .loop ; 2³ No someone do the math. Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 9, 2007 Share Posted March 9, 2007 This is how I would do it: lda v ; 2 ldx #-1; 2 .count: inx ; 2 .loop: lsr ; 2 bcs .count; 2³ bne .loop ; 2³ No someone do the math. Holy crap, *that* is a killer hack! 10 cycles if v was 0, and 63 cycles if it was $FF (worst case)... in 10 bytes of code! Also, your version is optimized for lower numbers... the execution time depends on where the most significant set bit is, instead of the number of set bits like the Kernighan version (so your version is faster for $0F than it would be for $F0, even though both have 4 bits set). If I were wearing a hat, I'd take it off to you Quote Link to comment Share on other sites More sharing options...
vdub_bobby Posted March 9, 2007 Share Posted March 9, 2007 Thanks Urchlay and TJ, lots of fun reading. Now tell me what in the world "parity" is. Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 9, 2007 Share Posted March 9, 2007 Thanks Urchlay and TJ, lots of fun reading. Now tell me what in the world "parity" is. Long description: http://en.wikipedia.org/wiki/Parity_%28telecommunication%29 Short version: An 8th bit that's added to a 7-bit byte, so that the number of 1 bits comes out to an even number (even parity) or to an odd number (odd parity). Used to detect errors on a serial line: the sender sends bytes with correct parity, and the receiver calculates what the parity bit should be... and if they're different, the receiver knows the data got garbled during transmission. Quote Link to comment Share on other sites More sharing options...
vdub_bobby Posted March 9, 2007 Share Posted March 9, 2007 (edited) Thanks. EDIT: How about this one: Determining if an integer is a power of 2 unsigned int v; // we want to see if v is a power of 2 bool f; // the result goes here f = (v & (v - 1)) == 0; Note that 0 is incorrectly considered a power of 2 here. To remedy this, use: f = !(v & (v - 1)) && v; What's the killer 6502 hack for this? I've come up with this: lda v dec v and v inc v Which returns zero in the accumulator if v is a power of two and anything else if it isn't. And preserves v. Except it doesn't handle v==0 and I'd rather have it return 0 if false and anything else if true. Edited March 9, 2007 by vdub_bobby Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 10, 2007 Share Posted March 10, 2007 (edited) lda v dec v and v inc v Which returns zero in the accumulator if v is a power of two and anything else if it isn't. And preserves v. Except it doesn't handle v==0 and I'd rather have it return 0 if false and anything else if true. The return value should really be in a flag, so you can immediately use a branch instruction afterwards... otherwise you'll need something like "cmp #0" in the caller. How about: ; Warning: untested code! is_x_power_of_2:; convenience entry point: checks X instead of location v stx v .byte $2C; aka BIT absolute opcode, skips the "sta v" below is_a_power_of_2:; convenience entry point: checks accumulator instead of location v sta v ; Main entry point: is_v_power_of_2: subroutine; { lda v bne .main lda #1; Special case: return with Z flag false, if v==0 rts .main; v is known to be non-zero, here dec v and v; accumulator now contains 0 if v is a power of 2 (Z flag is set correctly, but "inc v" is about to destroy it) ; If you don't care about preserving v, you can comment out both these lines: inc v cmp #0; set Z flag true if a==0, or false otherwise (needed because "inc v" trashes the Z flag) .done rts ; } It's not so "killer" any more, but it's easy to use: ldx $whatever; get a byte from somewhere stx v jsr is_v_power_of_2; Was it a power of 2? beq yes_it_was; yep! ; no, do whatever here... ;; Alternatively: lda $whatever; or, A could be the result of some calculations jsr is_a_power_of_2 beq yes_it_was Also, since you're only dealing with an 8-bit byte, you can just brute-force it: ; Return Z=true if A is a power of two, or Z=false if it isn't ; Uses no RAM, preserves A/X/Y, trashes flags is_a_power_of_2: subroutine; { cmp #1 beq .yes cmp #2 beq .yes cmp #4 beq .yes cmp #8 beq .yes cmp #16 beq .yes cmp #32 beq .yes cmp #64 beq .yes cmp #128 beq .yes ; if none of the compares are true, Z ends up being false .yes; if any of the compares are true, Z is true when we branch here ; Either way, just return it. rts ; } Converting the brute-force method to a loop... which will be slower than the original: is_a_power_of_2: subroutine; { ldx #1 stx tmp .loop; { cmp tmp beq .done; return with Z=true asl tmp bne .loop ; } ; original: lda #1; return with Z=false (too bad there's no CLZ in 6502-speak) ; EDIT: ldx #1; return with Z=false, preserving the value of A .done rts ; } However, TJ is probably about to post something that blows all this code away Edited March 10, 2007 by Urchlay Quote Link to comment Share on other sites More sharing options...
+batari Posted March 10, 2007 Share Posted March 10, 2007 Thanks. EDIT: How about this one: Determining if an integer is a power of 2 unsigned int v; // we want to see if v is a power of 2 bool f; // the result goes here f = (v & (v - 1)) == 0; Note that 0 is incorrectly considered a power of 2 here. To remedy this, use: f = !(v & (v - 1)) && v; What's the killer 6502 hack for this? I've come up with this: lda v dec v and v inc v Which returns zero in the accumulator if v is a power of two and anything else if it isn't. And preserves v. Except it doesn't handle v==0 and I'd rather have it return 0 if false and anything else if true. Nice! Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 10, 2007 Share Posted March 10, 2007 However, TJ is probably about to post something that blows all this code away This isn't up to TJ's standards, but I just thought of another solution in the shower: ; destructively test to see if the accumulator is a power of two ; on exit, A is trashed, X and Y are preserved, and Z is set if A was a power of two is_a_power_of_2: subroutine; { cmp #0; Test for special case A==0 bne .loop adc #1; another "CLZ" rts ; All powers of two have only one bit set! .loop; { ; This loop terminates when we find a set bit lsr bcc .loop ; } rts; If we found a set bit, the Z flag will be set if that was the *only* set bit ; } That would actually be kind of elegant, if it weren't for the need to test for the special A==0 case Quote Link to comment Share on other sites More sharing options...
supercat Posted March 10, 2007 Share Posted March 10, 2007 That would actually be kind of elegant, if it weren't for the need to test for the special A==0 case Hmm... cmp #1 adc #255 lp: lsr bcc lp Should do an early exit (NZ) in the A=0 case. Quote Link to comment Share on other sites More sharing options...
Thomas Jentzsch Posted March 10, 2007 Share Posted March 10, 2007 lda v dec v and v inc v Which returns zero in the accumulator if v is a power of two and anything else if it isn't. And preserves v. Except it doesn't handle v==0 and I'd rather have it return 0 if false and anything else if true. How about dec v lda v inc v beq .exit and v .exit: ? Not perfect, since you still need a compare. Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 11, 2007 Share Posted March 11, 2007 That would actually be kind of elegant, if it weren't for the need to test for the special A==0 case Hmm... cmp #1 adc #255 lp: lsr bcc lp Should do an early exit (NZ) in the A=0 case. Nice! Now it's elegant. Quote Link to comment Share on other sites More sharing options...
supercat Posted March 11, 2007 Share Posted March 11, 2007 Now it's elegant. Maybe. Still a bit slow, though. cmp #1 ; Optional first part to make zero yield a non-zero result adc #255 tax dex sbx #0 If for some reason it's better to have the result in the carry flag, replace the last instruction with "sbx #1". In that case, the zero flag will not be meaningful. Quote Link to comment Share on other sites More sharing options...
djmips Posted March 11, 2007 Author Share Posted March 11, 2007 Now it's elegant. Maybe. Still a bit slow, though. cmp #1 ; Optional first part to make zero yield a non-zero result adc #255 tax dex sbx #0 If for some reason it's better to have the result in the carry flag, replace the last instruction with "sbx #1". In that case, the zero flag will not be meaningful. Ahhh... I wasn't going to say anything about your previous routine but I was wondering... I mean it looped! But this one ties it all together. Quote Link to comment Share on other sites More sharing options...
+batari Posted March 11, 2007 Share Posted March 11, 2007 To detect zero, just do this: tax dex bmi done sbx #0 done Also, why not load the value to be tested with lax, so you don't need the tax above? Quote Link to comment Share on other sites More sharing options...
vdub_bobby Posted March 12, 2007 Share Posted March 12, 2007 (edited) Cool! Here's another one: Modulus. I need to find n mod 20. Well, to be specific, I need to find if n mod 20 is zero or otherwise. Here's how I do it: lda n sec Loop sbc #20 bcc Remainder bne Loop ;--remainder is zero Which is slow. Is there a better way? EDIT: And another: I have often wanted to find the absolute value of a signed byte, usually I do something like this: lda v bpl FoundAbs lda #0 sec sbc v FoundAbs Which is ok, I guess. The website djmips posted has this non-branching solution: Compute the integer absolute value (abs) without branching int v; // we want to find the absolute value of v int r; // the result goes here int const mask = v >> sizeof(int) * CHAR_BIT - 1; r = (v + mask) ^ mask; Patented variation: r = (v ^ mask) - mask; Which I translated into 6502 assembly; seems a bit messy: lda v eor #$80 asl sta temp sbc temp sta temp clc adc v eor temp Any improvements on either routine? Edited March 12, 2007 by vdub_bobby Quote Link to comment Share on other sites More sharing options...
Urchlay Posted March 12, 2007 Share Posted March 12, 2007 I have often wanted to find the absolute value of a signed byte, usually I do something like this: lda v bpl FoundAbs lda #0 sec sbc v FoundAbs How about: lda v bpl .foundabs eor #$ff clc; not needed if the state of C is known up front adc #1; would be "adc #0", if C is known to be set .foundabs Not much of an improvement, but it does run 1 cycle faster if v is negative (or 3, if you can leave out the CLC). Reading over this thread, I can see that I need to learn how to use the illegal/undocumented opcodes. About the only ones I've ever used are DCP and some of the extended NOPs (e.g. to sleep for 3 cycles). Quote Link to comment Share on other sites More sharing options...
Thomas Jentzsch Posted March 13, 2007 Share Posted March 13, 2007 Reading over this thread, I can see that I need to learn how to use the illegal/undocumented opcodes. About the only ones I've ever used are DCP and some of the extended NOPs (e.g. to sleep for 3 cycles). LAX is also often very useful (especially LAX (),y). And the various NOPs can also be used to skip instructions (like BIT does) but without changing the flag and sometimes even slightly faster (NOP #imm). Quote Link to comment Share on other sites More sharing options...
Thomas Jentzsch Posted March 13, 2007 Share Posted March 13, 2007 I need to find n mod 20. Well, to be specific, I need to find if n mod 20 is zero or otherwise. Using prime factors might help here. 20 = 2*2*5, so first we could SHR twice and check the carries. But then we still need an effective solution for 5. Not sure, but this may work: while (n > 0) { n = n/4 - n%4 } n has to be zero. Quote Link to comment Share on other sites More sharing options...
Cybergoth Posted March 13, 2007 Share Posted March 13, 2007 But then we still need an effective solution for 5. A 60 byte lookup table? Quote Link to comment Share on other sites More sharing options...
Thomas Jentzsch Posted March 13, 2007 Share Posted March 13, 2007 A 60 byte lookup table? Nah, that's cheating. But I just had another idea: Add the two nibbles and check if the result can be divided by 5 (== 5, A, F). For results above F you have to repeat. Quote Link to comment Share on other sites More sharing options...
Bruce Tomlin Posted March 13, 2007 Share Posted March 13, 2007 (edited) Here's a quick mod 5 I just came up with: LDX #5*32; start with 5 shifted left as far as possible STX temp .10 CMP temp BCS .20; branch if A < temp SEC; may or not be needed, I can never remember how carry works 6502 subtract SBC temp; subtract temp from A ; BNE .30; can short-cut exit here if exact multiple .20 LSR temp; shift divisor right BCC .10; loop until all divisors tried .30 ; remainder is in A Completely untested of course. This should work for a mod by any odd number, using the low bit as an end condition. If you shift out and save the two low bits, then shift them back in, you should have a mod 20. You can then use LDX #5*8 instead of LDX #5*32. It might even be possible to completely unroll this: LSRA ROR temp LSRA ROR temp CMP #5*8 BCS .10 SEC SBC #5*8 .10 CMP #5*4 BCS .20 SEC SBC #5*4 .20 CMP #5*2 BCS .30 SEC SBC #5*2 .30 CMP #5*1 BCS .40 SEC SBC #5*1 .40 ASL temp ROLA ASL temp ROLA .50 BNE notdiv20 Fix the carry flag stuff to work the right way of course. If all you need to do is test for divide by 20, then just BCS notdiv20 instead of ROR temp. Edited March 13, 2007 by Bruce Tomlin Quote Link to comment Share on other sites More sharing options...
supercat Posted March 13, 2007 Share Posted March 13, 2007 Here's a quick mod 5 I just came up with: Decent approach, though I don't see the need for temp: ldx #8 lp: cmp #80 bcc nosub sbc #160 ; Note carry set by cmp bcs nosub adc #80 ; Note carry clear by sbc nosub: asl dex bne lp Quote Link to comment Share on other sites More sharing options...
supercat Posted March 13, 2007 Share Posted March 13, 2007 Here's another interesting approach: ; Enter at s0 s4: asl bcs s4 beq done_bad s3: asl bcs s2 s1: asl bcs s3 s2: asl bcc s4 s0: asl bcs s1 bne s0 done_good: Quote Link to comment Share on other sites More sharing options...
djmips Posted May 21, 2007 Author Share Posted May 21, 2007 (edited) Arithmetic Shifts ------------------- For signed shifting you need to use an arithmetic shift. The consensus on how to do a single arithmetic shift right is. cmp #$80 ror And here is the code for when you are doing 4 or more bits at once. 4 in this example. lsr lsr lsr lsr clc adc #$F8 eor #$F8 For an arithmetic shift left, you can just use ASL but it doesn't correctly set the overflow flag. If you need to check the overflow flag then use the following code sequence. clc sta temp adc temp Credits - Lee Davison, TJ Edited May 24, 2007 by djmips Quote Link to comment Share on other sites More sharing options...
supercat Posted May 21, 2007 Share Posted May 21, 2007 If you need to check the overflow flag then use the following code sequence. Or you could do the shift the normal way and then do something like: bcs shifted_out_1 bpl overflow_nope overflow_yep: ... shifted_out_1: bpl overflow_yep overflow_nope: That takes 5 cycles if no overflow; 4 or 6 cycles if overflow. The accumulator is not disturbed. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.