Interlacing Tables, Table Reduction
Sometimes you end up with routines that use a lot of tables. While writing my hex to decimal routine (for values 0-65535) I ended up with six 16-byte tables. I realized that I could optimize the routine by interlacing the tables. Normally I would do something like this:
    lda hexValue    ;3  @3
    lsr             ;2  @5
    lsr             ;2  @7
    lsr             ;2  @9
    lsr             ;2  @11
    tay             ;2  @13
    lda hexValue    ;3  @16
    and #$0F        ;2  @18
    tax             ;2  @20
    clc             ;2  @22
    lda HighTab,X   ;4  @26
    adc LowTab,Y    ;4  @30
I took 4 of the tables and interlaced them. Interlacing 4 tables forces the index into them to be a multiple of 4. Multiplying the index by 4 was as simple as cutting out two of the LSRs and adding an AND instruction to mask off the low nibble, which the two dropped shifts would otherwise have thrown away.
    lda hexValue    ;3  @3
    and #$F0        ;2  @5
    lsr             ;2  @7
    lsr             ;2  @9
    tay             ;2  @11
    lda hexValue    ;3  @14
    and #$0F        ;2  @16
    tax             ;2  @18
    lda HighTab,X   ;4  @22
    adc LowTab,Y    ;4  @26
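In the above routine the carry is now cleared automatically: the AND zeroes the low bits, so the LSRs can only shift a 0 into the carry, and nothing after them touches it before the ADC. That lets the CLC go. I saved 1 byte and 4 cycles, and the savings were free, just from interlacing a few tables.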
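To make the index trick concrete, here is a rough Python model of it, not the original 6502 code. The table contents are placeholders invented for the illustration, and the helper names `index_separate` and `index_interlaced` are hypothetical. It only checks that AND #$F0 followed by two LSRs gives the high nibble times 4 with the carry clear, and that this offset lands on the right entry of each interlaced table.

```python
# Four hypothetical 16-entry tables. The values are placeholders,
# not the data from the real hex to decimal routine.
tables = [[k * 16 + i for i in range(16)] for k in range(4)]

# Interlace them: entry i of table k lands at offset i*4 + k.
interlaced = [tables[k][i] for i in range(16) for k in range(4)]


def index_separate(value):
    """Four LSRs: the high nibble as a 0-15 index into a stand-alone table."""
    return value >> 4


def index_interlaced(value):
    """AND #$F0 then two LSRs: returns (index, carry) as the 6502 would leave them."""
    a = value & 0xF0
    carry = 0
    for _ in range(2):
        carry = a & 1   # LSR moves bit 0 into the carry
        a >>= 1
    return a, carry


for value in range(256):
    y, carry = index_interlaced(value)
    assert y == index_separate(value) * 4   # index is already scaled by 4
    assert carry == 0                       # so no CLC is needed before ADC
    for k in range(4):
        # Table k is read at (base + k) + y in the interlaced layout.
        assert interlaced[y + k] == tables[k][index_separate(value)]
```

In assembly terms, each of the four tables would be addressed as the interlaced base plus its own offset 0-3, still indexed by Y.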
My hex to decimal routine first converts the number into bytes of value 0-99, and then uses a lookup table to find the BCD value for a quick conversion. A 100-byte table is a pig, though. I realized that an odd number is an odd number in both hex and BCD, so bit 0 is set for odd numbers and cleared otherwise in either form. That means the BCD values for 2k and 2k+1 differ only in bit 0, and one table entry holding the BCD value shifted right once can serve both. A simple LSR before the lookup, followed by a ROL after it, is all that is needed to cut that table in half. This is the initial code:
    tay             ;2  @2   HEX value 0-99
    lda BcdTab,Y    ;4  @6   BCD value 0-99
    tay             ;2  @8
    and #$0F        ;2  @10
    sta decOnes     ;3  @13
    tya             ;2  @15
    lsr             ;2  @17
    lsr             ;2  @19
    lsr             ;2  @21
    lsr             ;2  @23
    sta decTens     ;3  @26
Now the optimized code:
    lsr                   ;2  @2   Hex value 0-99 >> 1, keep bit 0 (odd/even) in the carry
    tay                   ;2  @4
    lda ShiftedBcdTab,Y   ;4  @8   BCD value 0-99 >> 1
    tay                   ;2  @10
    rol                   ;2  @12  BCD value 0-99, as odd/even bit is returned...
    and #$0F              ;2  @14
    sta decOnes           ;3  @17
    tya                   ;2  @19
    lsr                   ;2  @21
    lsr                   ;2  @23
    lsr                   ;2  @25
    sta decTens           ;3  @28
The real kicker is that since my table is already shifted right by one place, I only need three shifts to get decTens, not four. So I pay 2 bytes more at the beginning of the routine (the LSR and the ROL) and take one back later. The new routine takes 2 cycles longer than before, but saves 49 bytes! That's 50 bytes saved by cutting the 100-byte table in half, and a net 1 byte lost to the extra code. I can live with that.
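As a sanity check on the halved table, here is a small Python sketch that mimics both routines step by step and compares them for every input 0-99. It is a model under assumptions, not the original code: `to_bcd`, `convert_full` and `convert_halved` are hypothetical names, and the tables are generated here on the assumption that the shifted table stores the BCD value of each even number shifted right once.

```python
def to_bcd(n):
    """Pack a value 0-99 as BCD: tens in the high nibble, ones in the low."""
    return ((n // 10) << 4) | (n % 10)


# Assumed table contents, generated rather than copied from the routine.
BCD_TAB = [to_bcd(n) for n in range(100)]                       # 100 bytes
SHIFTED_BCD_TAB = [to_bcd(n) >> 1 for n in range(0, 100, 2)]    # 50 bytes


def convert_full(a):
    """Model of the initial code: one lookup, mask for ones, 4 shifts for tens."""
    bcd = BCD_TAB[a]
    return bcd >> 4, bcd & 0x0F          # (decTens, decOnes)


def convert_halved(a):
    """Model of the optimized code using the 50-byte shifted table."""
    carry = a & 1                        # LSR: odd/even bit goes to the carry
    y = a >> 1                           # TAY: index into the halved table
    y = SHIFTED_BCD_TAB[y]               # LDA ShiftedBcdTab,Y / TAY
    a = ((y << 1) | carry) & 0xFF        # ROL: carry comes back as bit 0
    dec_ones = a & 0x0F                  # AND #$0F
    dec_tens = y >> 3                    # TYA + three LSRs on the shifted value
    return dec_tens, dec_ones


# Both models agree for every possible input.
for n in range(100):
    assert convert_halved(n) == convert_full(n) == (n // 10, n % 10)
```

The loop passing for all 100 inputs is what justifies dropping the fourth LSR: the value kept in Y is already one shift ahead.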