vdub_bobby Posted July 22, 2009 Share Posted July 22, 2009 (edited) The Daily WTF has an article up today asking for the best implementation (in any non-esoteric language) of Russian Peasant Multiplication. See the post for the algorithm; it's pretty simple. I contributed a 6502 assembly algorithm (I wasn't first with a 6502 version, but mine gets points for handling 8-bit operands, not trashing X, not having unnecessary code (I think!), and being harder to understand ) ;--$00 holds operand 1 ;--$01 (low byte) and $02 (high byte) hold operand 2 ;--$03 (low byte) and $04 (high byte) hold product cld Loop lsr $00 bcc NoAdd clc lda $03 adc $01 sta $03 lda $04 adc $02 sta $04 .byte $2C NoAdd beq Done asl $01 rol $02 bcc Loop ;this assumes that operand 2 is 8-bit Done Anyone come up with a faster/smaller version? Or do any of our Z80 programmers want to contribute a version? Edited July 22, 2009 by vdub_bobby Quote Link to comment Share on other sites More sharing options...
roland p Posted February 1, 2010 Share Posted February 1, 2010 Not the same, but this was mine russian-peasant-like multiplication. It multiplies two eight-bit values and returns the high byte as a result: MAC A_MUL_B_DIV256 ;{1} * {2} / 256 result in Accumulator LDA #0 LDX #8 .LOOP ROL {1} BCC .cont0 LSR {2} ADC {2} DEX BNE .LOOP jmp .end .cont0 LSR {2} DEX BNE .LOOP .end ENDM I'm now replacing it with a routine that uses squares. Quote Link to comment Share on other sites More sharing options...
Thomas Jentzsch Posted February 1, 2010 Share Posted February 1, 2010 I'm now replacing it with a routine that uses squares. In C=Hacking Steve Judd did a provide a brilliant code for this: * A*Y -> A Signed, 8-bit result STA ZP1 ;ZP1 -- zero page pointer to table of g(x) EOR #$FF CLC ADC #$01 STA ZP2 ;ZP2 also points to g(x) LDA (ZP1),Y ;g(Y+A) SEC SBC (ZP2),Y ;g(Y-A) Have a look at issue 9 for details. Quote Link to comment Share on other sites More sharing options...
roland p Posted February 1, 2010 Share Posted February 1, 2010 In C=Hacking Steve Judd did a provide a brilliant code for this: I tried to use it, but I had to change it to meet my requirements. I had most problems with the g(Y-A) part, but maybe it is made for signed numbers, which I didn't need. Anyway, this is my new code for Y*A/256: ; ;f(x) = x*x/1024 ;put this in table ;f(y+a) - f(y-a) = y*a/256 ;f(64+32) - f(64-32) = ;9-1 = 8 ZP1 = $00 *= $0600 LDA #>SQUARES STA ZP1+1 LDA #64 ;number1 LDY #32 ;number2 STA ZP1 TYA SEC SBC #64 ;number1 y-a BCS CONT ;result of substraction negative? EOR #$FF CLC ADC #$01 CONT TAX LDA (ZP1),Y ;g(Y+A) SEC SBC SQUARES,X ;g(Y-A) ;DONE! *= $0700 SQUARES ;x*x/1024 .BYTE $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00 .BYTE $00,$00,$00,$00,$00,$00,$00,$01,$01,$01,$01,$01,$01,$01,$01,$01 .BYTE $01,$01,$01,$01,$01,$01,$01,$01,$02,$02,$02,$02,$02,$02,$02,$02 .BYTE $02,$02,$02,$03,$03,$03,$03,$03,$03,$03,$03,$03,$04,$04,$04,$04 .BYTE $04,$04,$04,$04,$05,$05,$05,$05,$05,$05,$05,$05,$06,$06,$06,$06 .BYTE $06,$06,$07,$07,$07,$07,$07,$07,$08,$08,$08,$08,$08,$08,$09,$09 .BYTE $09,$09,$09,$0A,$0A,$0A,$0A,$0A,$0B,$0B,$0B,$0B,$0B,$0C,$0C,$0C .BYTE $0C,$0C,$0D,$0D,$0D,$0D,$0E,$0E,$0E,$0E,$0F,$0F,$0F,$0F,$10,$10 .BYTE $10,$10,$11,$11,$11,$11,$12,$12,$12,$12,$13,$13,$13,$13,$14,$14 .BYTE $14,$15,$15,$15,$15,$16,$16,$16,$17,$17,$17,$17,$18,$18,$18,$19 .BYTE $19,$19,$1A,$1A,$1A,$1B,$1B,$1B,$1C,$1C,$1C,$1D,$1D,$1D,$1E,$1E .BYTE $1E,$1F,$1F,$1F,$20,$20,$20,$21,$21,$21,$22,$22,$23,$23,$23,$24 .BYTE $24,$24,$25,$25,$26,$26,$26,$27,$27,$27,$28,$28,$29,$29,$29,$2A .BYTE $2A,$2B,$2B,$2B,$2C,$2C,$2D,$2D,$2E,$2E,$2E,$2F,$2F,$30,$30,$31 .BYTE $31,$31,$32,$32,$33,$33,$34,$34,$35,$35,$35,$36,$36,$37,$37,$38 .BYTE $38,$39,$39,$3A,$3A,$3B,$3B,$3C,$3C,$3D,$3D,$3E,$3E,$3F,$3F,$40 .BYTE $40,$41,$41,$42,$42,$43,$43,$44,$44,$45,$45,$46,$46,$47,$47,$48 .BYTE $48,$49,$49,$4A,$4A,$4B,$4B,$4C,$4D,$4D,$4E,$4E,$4F,$4F,$50,$50 .BYTE $51,$52,$52,$53,$53,$54,$54,$55,$56,$56,$57,$57,$58,$58,$59,$5A .BYTE $5A,$5B,$5B,$5C,$5D,$5D,$5E,$5E,$5F,$60,$60,$61,$62,$62,$63,$63 .BYTE $64,$65,$65,$66,$67,$67,$68,$68,$69,$6A,$6A,$6B,$6C,$6C,$6D,$6E .BYTE $6E,$6F,$70,$70,$71,$72,$72,$73,$74,$74,$75,$76,$76,$77,$78,$78 .BYTE $79,$7A,$7A,$7B,$7C,$7C,$7D,$7E,$7F,$7F,$80,$81,$81,$82,$83,$84 .BYTE $84,$85,$86,$86,$87,$88,$89,$89,$8A,$8B,$8C,$8C,$8D,$8E,$8F,$8F .BYTE $90,$91,$92,$92,$93,$94,$95,$95,$96,$97,$98,$98,$99,$9A,$9B,$9B .BYTE $9C,$9D,$9E,$9F,$9F,$A0,$A1,$A2,$A3,$A3,$A4,$A5,$A6,$A7,$A7,$A8 .BYTE $A9,$AA,$AB,$AB,$AC,$AD,$AE,$AF,$B0,$B0,$B1,$B2,$B3,$B4,$B5,$B5 .BYTE $B6,$B7,$B8,$B9,$BA,$BA,$BB,$BC,$BD,$BE,$BF,$C0,$C1,$C1,$C2,$C3 .BYTE $C4,$C5,$C6,$C7,$C8,$C8,$C9,$CA,$CB,$CC,$CD,$CE,$CF,$D0,$D0,$D1 .BYTE $D2,$D3,$D4,$D5,$D6,$D7,$D8,$D9,$DA,$DA,$DB,$DC,$DD,$DE,$DF,$E0 .BYTE $E1,$E2,$E3,$E4,$E5,$E6,$E7,$E8,$E9,$EA,$EA,$EB,$EC,$ED,$EE,$EF .BYTE $F0,$F1,$F2,$F3,$F4,$F5,$F6,$F7,$F8,$F9,$FA,$FB,$FC,$FD,$FE,$FF Quote Link to comment Share on other sites More sharing options...
roland p Posted February 7, 2010 Share Posted February 7, 2010 I somehow get inaccuracies when using squares. f(x) = x*x/1024 f(y+a) - f(y-a) = y*a/256 When using numbers like 96 and 2 then the equation will be: f(96+2) - f(96-2) f(98) - f(94) Now use f(x) = x*x/1024 When calculated with a calculator: 98*98/1024 - 94*94/1024 = 9.37 - 8.62 = 0.75 But the squares in the table are not floating point, therefore by using tables the equation will be: 9 - 9 = 0 ! So there is an inaccuracy when using tables. This will be visible in my ballblazer in somewhat jittery sprite position. Does somebody know if this can be fixed? I now improved the russian peasant multiplication somewhat (inspired by vdub's routine). This is still a bit slow, but provides the accuracy I want: ;$0 = number 1 ;$1 = number 2 ;result in A LDA #0 .LOOP ROL $0 BCC .dontAdd LSR $1 ADC $1 jmp .LOOP .dontAdd LSR $1 BNE .LOOP end Quote Link to comment Share on other sites More sharing options...
Crazyace Posted February 7, 2010 Share Posted February 7, 2010 It's truncated - so the answer is correct ..96*2/256 is 0. You need 16 bit tables for full accuracy - probably 9 bit to handle this rounding correctly Quote Link to comment Share on other sites More sharing options...
roland p Posted February 9, 2010 Share Posted February 9, 2010 It's truncated - so the answer is correct ..96*2/256 is 0. You need 16 bit tables for full accuracy - probably 9 bit to handle this rounding correctly I was hoping the problem could be fixed by altering the tables. I tried to use 'floor' so 9.37 - 8.62 would be 9 - 8, and would give a correct result, but incorrect results with other values. Quote Link to comment Share on other sites More sharing options...
roland p Posted July 4, 2015 Share Posted July 4, 2015 (edited) I now improved the russian peasant multiplication somewhat (inspired by vdub's routine). This is still a bit slow, but provides the accuracy I want: ;$0 = number 1 ;$1 = number 2 ;result in A LDA #0 .LOOP ROL $0 BCC .dontAdd LSR $1 ADC $1 jmp .LOOP .dontAdd LSR $1 BNE .LOOP end I shaved off a few cycles: A reminder of what it does: 8-bit x 8-bit, high byte result in A LDA #0 ;2 BEQ LOOP ;3 doAdd LSR MULTIPLICANT2 ;5 ADC MULTIPLICANT2 ;3 LOOP ASL MULTIPLICANT1 ;5 BCS doAdd ;3 dontAdd LSR MULTIPLICANT2 ;5 BNE LOOP Edited July 4, 2015 by roland p Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.