Lee Stewart Posted June 9, 2014

Wow! I now know more about multicolor mode than I ever wanted to know! MCHAR is done, finally. Now, it's on to the last graphics primitive, LINE, which draws a bitmapped line between 2 sets of pixel coordinates. After that, I can focus on the VDP mode words: TEXT, TEXT80, GRAPHICS, MULTI, GRAPHICS2, SPLIT and SPLIT2.

...lee
Lee Stewart Posted June 11, 2014 (edited)

I have finished my first pass at LINE! Before assembling and testing, it is 69 ALC instructions! I still need to change a few scratchpad RAM storage locations (used to help me keep things straight) to registers, especially those in the pixel-plotting loop. At least I don't have a DIV instruction inside the loop as the TI programmer did. I do have two SRL 8 instructions to effectively divide by 256; but, those are much faster than DIV (I think!). I won't be able to assemble and test the code until I finish with the VDP mode words, which are next on the list, though I actually may translate the ALC to Forth Assembler to test sooner if it will fit in one block.

For those champing at the bit to look over my shoulder, here's my current code for LINE (the original TI Forth definition follows the ALC as ALC comments):

    ;[*** LINE *** ( x1 y1 x2 y2 --- )
    * LINE does the following:
    *   1) Computes dy = y2-y1 and dx = x2-x1
    *   2) Adds sign (+1|-1|0) to differences
    *   3) Determines which direction, x or y, has slope <= 1 to avoid
    *      DIV overflow
    *   4) Forces plotting direction to be positive
    *   5) Sets starting y|x accumulator as acc = (y|x)*256
    *   6) Computes accumulator increment as inc = slope*256
    *   7) Each time through dot plotting loop:
    *      a) x|y = x|y + 1
    *      b) y|x = acc/256 <--truncates fraction
    *      c) Plot dot
    *      d) acc = acc+inc
           DATA DOT__N
    LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
    LINE   DATA @+2
    LINEP  BL   @BLF2A
           DATA _LINE->6000+BANK1
    X1     EQU  FAC
    X2     EQU  FAC+2
    Y1     EQU  FAC+4
    Y2     EQU  FAC+6
    DX     EQU  ARG
    DY     EQU  ARG+2
    ACC    EQU  ARG+4
    INC    EQU  ARG+6
    DOTCNT EQU  FREEPD
    COORD  EQU  FREEPD+2
    _LINE  MOV  *SP+,@Y2      ; pop coordinates
           MOV  *SP+,@X2
           MOV  *SP+,@Y1
           MOV  *SP+,@X1
           MOV  @Y2,R0        ; calculate dy
           S    @Y1,R0
           BL   @_SNW         ; add sign
           MOV  R0,@DY
           MOV  @X2,R0        ; calculate dx
           S    @X1,R0
           BL   @_SNW         ; add sign
           MOV  R0,@DX
           ABS  R0
           MOV  @DY,R1
           ABS  R1
           C    R1,R0         ; compare |dy| to |dx|
           JLT  LINE01
           INC  R1            ; increment for point count
           MOV  R1,@DOTCNT    ; store point count
           MOV  @Y1,@COORD    ; assume starting with y1
           MOV  @X1,R4        ; and x1 (to R4 temporarily)
           C    @Y1,@Y2       ; should we switch?
           JGT  LINE03        ; yes
           JMP  LINE04        ; no
    LINE03 MOV  @Y2,@COORD    ; we're starting with y2
           MOV  @X2,R4        ; and x2 (to R4 temporarily)
    LINE04 LI   CRU,LNYAX     ; load CRU (R12) with LNYAX to indicate y-axis processing
           MOV  @DX,R1        ; load dx
           SLA  R1,8          ; multiply dx by 256
           CLR  R0
           DIV  @DY,R1        ; 256*dx/dy
           MOV  R0,@INC       ; load increment
           JMP  LINE02
    LINE01 INC  R0            ; increment for point count
           MOV  R0,@DOTCNT    ; store point count
           MOV  @X1,@COORD    ; assume starting with x1
           MOV  @Y1,R4        ; and y1 (to R4 temporarily)
           C    @X1,@X2       ; should we switch?
           JGT  LINE05        ; yes
           JMP  LINE06        ; no
    LINE05 MOV  @X2,@COORD    ; we're starting with x2
           MOV  @Y2,R4        ; and y2 (to R4 temporarily)
    LINE06 LI   CRU,LNXAX     ; load CRU (R12) with LNXAX to indicate x-axis processing
           MOV  @DY,R1        ; load dy
           SLA  R1,8          ; multiply dy by 256
           CLR  R0
           DIV  @DX,R1        ; 256*dy/dx
           MOV  R0,@INC       ; load increment
    LINE02 SLA  R4,8          ; * 256
           MOV  R4,@ACC       ; store it
    LNLOOP B    *CRU          ; branch to x-axis or y-axis plotting
    LNXAX  MOV  @COORD,R3     ; get next x for DOT
           MOV  @ACC,R0       ; get accumulator contents to y for DOT
           SRL  R0,8          ; divide by 256 for proper position
           JMP  LNPLOT        ; go to plot
    LNYAX  MOV  @COORD,R0     ; get next y for DOT
           MOV  @ACC,R3       ; get accumulator contents to x for DOT
           SRL  R3,8          ; divide by 256 for proper position
    LNPLOT BL   @__DOT        ; plot the dot (R0 = y, R3 = x)
           A    @INC,@ACC     ; increment accumulator
           INC  @COORD        ; increment principal coordinate
           DEC  @DOTCNT       ; decrement counter
           JLT  LNLOOP        ; are we done?
           BL   @RTNEXT       ; yup, back to bank 0 and the inner interpreter
    * ( High-level Forth for LINE from TI Forth---comments are mine)
    * : LINE ( x1 y1 x2 y2--- )
    * >R R ROT >R R ( copy y1, y2 to return stack)
    * - SNW ( add +1|-1|0 to row difference [dy] per sign)
    * SWAP >R R ROT >R R ( copy x1, x2 to return stack)
    * - SNW ( add +1|-1|0 to column difference [dx] per sign)
    * OVER ABS ( dup dy; get |dy|)
    * OVER ABS ( dup dx; get |dx|)
    * < ( |dy| < |dx|)
    * >R R ( copy [|dy|<|dx|] to return stack)
    * 0= ( not [|dy|<|dx|])
    * ( insure we use slope <= 1)
    * IF ( [|dy| >= |dx|]?)
    * SWAP ( yes; swap dy and dx)
    * THEN
    * 100 ROT ROT */ ( slope [<= 1] * 256)
    * R> ( get |dy|<|dx| from return stack)
    * ( work on X axis?)
    * IF ( |dy|<|dx|?)
    * R> R> ( get x1 & x2 from return stack)
    * OVER OVER ( dup them)
    * ( insure drawing left to right)
    * > ( x1 > x2?)
    * IF ( yes)
    * SWAP ( start at x2)
    * R> DROP R> ( get y1 & y2 from return stack; discard y1)
    * ELSE ( no; we're starting at x1)
    * R> R> DROP ( get y1 & y2 from return stack; discard y2)
    * THEN
    * 100 * ( y1*256 [initial value of y for DO])
    * ROT ROT 1+ SWAP ( x2+1 x1 for DO)
    * DO
    * I OVER ( I; y = y1*256, 1st time through)
    * 0 100 M/ SWAP DROP ( y/256)
    * DOT ( plot dot [I,y/256])
    * OVER + ( y = y+256*dy/dx)
    * LOOP
    * ELSE ( work on Y axis instead)
    * R> R> R> R> ( get x1 x2 y1 y2 from return stack)
    * ROT >R ROT >R ( put x2 x1 back to return stack)
    * OVER OVER ( dup y1 y2 for comparison)
    * ( insure drawing top to bottom )
    * > ( y1 > y2?)
    * IF ( yes)
    * SWAP ( start at y2)
    * R> DROP R> ( get x1 & x2 from return stack; discard x1)
    * ELSE ( no; we're starting at y1)
    * R> R> DROP ( get x1 & x2 from return stack; discard x2)
    * THEN
    * 100 * ( x1*256 [initial value of x for DO])
    * ROT ROT 1+ SWAP ( y2+1 y1 for DO)
    * DO
    * DUP ( x = x1*256, 1st time through)
    * 0 100 M/ SWAP DROP ( x/256)
    * I
    * DOT ( plot dot [x/256,I])
    * OVER + ( x = x+256*dx/dy)
    * LOOP
    * THEN
    * DROP DROP ( clean up)
    ;]

I welcome any suggestions for improving the code.

...lee

EDIT: Corrected lines 51 & 67: @DOTCNT

Edited June 11, 2014 by Lee Stewart
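The numbered steps in the LINE comments above can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not a translation of the ALC: `line_points` is a made-up name, the SNW sign adjustment of step 2 is deliberately left out, and Python's `//` and `>>` stand in for DIV and SRL 8.

```python
def line_points(x1, y1, x2, y2):
    """Model of the 8.8 fixed-point stepping described in the LINE comments.

    Hypothetical helper for illustration; it omits the SNW adjustment.
    """
    x_major = abs(y2 - y1) < abs(x2 - x1)
    if not x_major:
        # Swap roles so the independent axis is always the first coordinate.
        x1, y1, x2, y2 = y1, x1, y2, x2
    if x1 > x2:
        # Step 4: always plot in the positive direction.
        x1, y1, x2, y2 = x2, y2, x1, y1
    du, dv = x2 - x1, y2 - y1          # du >= |dv| by construction
    # Step 6: increment = slope * 256, truncated like an integer divide.
    inc = 0 if du == 0 else (abs(dv) << 8) // du
    if dv < 0:
        inc = -inc
    acc = y1 << 8                      # step 5: accumulator = start * 256
    points = []
    for u in range(x1, x2 + 1):        # step 7: one dot per independent step
        v = acc >> 8                   # divide by 256, dropping the fraction
        points.append((u, v) if x_major else (v, u))
        acc += inc
    return points


if __name__ == "__main__":
    print(line_points(0, 0, 4, 2))
    print(line_points(0, 2, 4, 0))
```

Because the division happens once, before the loop, each dot costs only an add and a shift, which is the point of keeping DIV out of the plotting loop.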
Lee Stewart Posted June 11, 2014

OK...Here is LINE using mostly registers (still not tested, however):

    ;[*** LINE *** ( x1 y1 x2 y2 --- )
    * LINE does the following:
    *   1) Computes dy = y2-y1 and dx = x2-x1
    *   2) Adds sign (+1|-1|0) to differences
    *   3) Determines which direction, x or y, has slope <= 1 to avoid
    *      DIV overflow
    *   4) Forces plotting direction to be positive
    *   5) Sets starting y|x accumulator as acc = (y|x)*256
    *   6) Computes accumulator increment as inc = slope*256
    *   7) Each time through dot plotting loop:
    *      a) x|y = x|y + 1
    *      b) y|x = acc/256 <--truncates fraction
    *      c) Plot dot
    *      d) acc = acc+inc
           DATA DOT__N
    LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
    LINE   DATA @+2
    LINEP  BL   @BLF2A
           DATA _LINE->6000+BANK1
    DOTCNT EQU  FAC           ; temporary storage for point (dot) count for line
    DX     EQU  FAC+2
    DY     EQU  FAC+4
    * Register usage---
    *   R0: varies
    *   R1: varies
    *   R2: y2
    *   R3: x2
    *   R4: y1, then, point (dot) count for line (DOTCNT)
    *   R5: x1, then, increment for dependent coordinate (INC)
    *   R6: accumulator for dependent coordinate (ACC)
    *   R7: current independent coordinate (COORD)
    *   R12: contains label for principal axis (LNXAX or LNYAX)
    _LINE  MOV  *SP+,R2       ; pop coordinates
           MOV  *SP+,R3
           MOV  *SP+,R4
           MOV  *SP+,R5
           MOV  R2,R0         ; calculate dy
           S    R4,R0
           BL   @_SNW         ; add sign
           MOV  R0,@DY
           MOV  R3,R0         ; calculate dx
           S    R5,R0
           BL   @_SNW         ; add sign
           MOV  R0,@DX
           ABS  R0
           MOV  @DY,R1
           ABS  R1
           C    R1,R0         ; compare |dy| to |dx|
           JLT  LINE01
           INC  R1            ; increment for point count
           MOV  R1,@DOTCNT    ; store point count
           MOV  R4,R7         ; assume starting with y1
           MOV  R5,R6         ; and x1 (to ACC)
           C    R4,R2         ; should we switch?
           JGT  LINE03        ; yes
           JMP  LINE04        ; no
    LINE03 MOV  R2,R7         ; we're starting with y2
           MOV  R3,R6         ; and x2 (to ACC)
    LINE04 LI   CRU,LNYAX     ; load CRU (R12) with LNYAX to indicate y-axis processing
           MOV  @DX,R1        ; load dx
           SLA  R1,8          ; multiply dx by 256
           CLR  R0
           DIV  @DY,R1        ; 256*dx/dy
           MOV  R0,R5         ; load increment
           JMP  LINE02
    LINE01 INC  R0            ; increment for point count
           MOV  R0,@DOTCNT    ; store point count
           MOV  R5,R7         ; assume starting with x1
           MOV  R4,R6         ; and y1 (to ACC)
           C    R5,R3         ; should we switch?
           JGT  LINE05        ; yes
           JMP  LINE06        ; no
    LINE05 MOV  R3,R7         ; we're starting with x2
           MOV  R2,R6         ; and y2 (to ACC)
    LINE06 LI   CRU,LNXAX     ; load CRU (R12) with LNXAX to indicate x-axis processing
           MOV  @DY,R1        ; load dy
           SLA  R1,8          ; multiply dy by 256
           CLR  R0
           DIV  @DX,R1        ; 256*dy/dx
           MOV  R0,R5         ; load increment
    LINE02 SLA  R6,8          ; * 256 (adjust ACC)
           MOV  @DOTCNT,R4    ; load point counter
    LNLOOP B    *CRU          ; branch to x-axis or y-axis plotting
    LNXAX  MOV  R7,R1         ; get next x for DOT
           MOV  R6,R0         ; get accumulator contents to y for DOT
           SRL  R0,8          ; divide by 256 for proper position
           JMP  LNPLOT        ; go to plot
    LNYAX  MOV  R7,R0         ; get next y for DOT
           MOV  R6,R1         ; get accumulator contents to x for DOT
           SRL  R1,8          ; divide by 256 for proper position
    LNPLOT BL   @__DOT        ; plot the dot (R0 = y, R1 = x)
           A    R5,R6         ; increment accumulator
           INC  R7            ; increment principal coordinate
           DEC  R4            ; decrement counter
           JLT  LNLOOP        ; are we done?
           BL   @RTNEXT       ; yup, back to bank 0 and the inner interpreter
    * ( High-level Forth for LINE from TI Forth---comments are mine)
    * : LINE ( x1 y1 x2 y2--- )
    * >R R ROT >R R ( copy y1, y2 to return stack)
    * - SNW ( add +1|-1|0 to row difference [dy] per sign)
    * SWAP >R R ROT >R R ( copy x1, x2 to return stack)
    * - SNW ( add +1|-1|0 to column difference [dx] per sign)
    * OVER ABS ( dup dy; get |dy|)
    * OVER ABS ( dup dx; get |dx|)
    * < ( |dy| < |dx|)
    * >R R ( copy [|dy|<|dx|] to return stack)
    * 0= ( not [|dy|<|dx|])
    * ( insure we use slope <= 1)
    * IF ( [|dy| >= |dx|]?)
    * SWAP ( yes; swap dy and dx)
    * THEN
    * 100 ROT ROT */ ( slope [<= 1] * 256)
    * R> ( get |dy|<|dx| from return stack)
    * ( work on X axis?)
    * IF ( |dy|<|dx|?)
    * R> R> ( get x1 & x2 from return stack)
    * OVER OVER ( dup them)
    * ( insure drawing left to right)
    * > ( x1 > x2?)
    * IF ( yes)
    * SWAP ( start at x2)
    * R> DROP R> ( get y1 & y2 from return stack; discard y1)
    * ELSE ( no; we're starting at x1)
    * R> R> DROP ( get y1 & y2 from return stack; discard y2)
    * THEN
    * 100 * ( y1*256 [initial value of y for DO])
    * ROT ROT 1+ SWAP ( x2+1 x1 for DO)
    * DO
    * I OVER ( I; y = y1*256, 1st time through)
    * 0 100 M/ SWAP DROP ( y/256)
    * DOT ( plot dot [I,y/256])
    * OVER + ( y = y+256*dy/dx)
    * LOOP
    * ELSE ( work on Y axis instead)
    * R> R> R> R> ( get x1 x2 y1 y2 from return stack)
    * ROT >R ROT >R ( put x2 x1 back to return stack)
    * OVER OVER ( dup y1 y2 for comparison)
    * ( insure drawing top to bottom )
    * > ( y1 > y2?)
    * IF ( yes)
    * SWAP ( start at y2)
    * R> DROP R> ( get x1 & x2 from return stack; discard x1)
    * ELSE ( no; we're starting at y1)
    * R> R> DROP ( get x1 & x2 from return stack; discard x2)
    * THEN
    * 100 * ( x1*256 [initial value of x for DO])
    * ROT ROT 1+ SWAP ( y2+1 y1 for DO)
    * DO
    * DUP ( x = x1*256, 1st time through)
    * 0 100 M/ SWAP DROP ( x/256)
    * I
    * DOT ( plot dot [x/256,I])
    * OVER + ( x = x+256*dx/dy)
    * LOOP
    * THEN
    * DROP DROP ( clean up)
    ;]

...lee
Lee Stewart Posted June 12, 2014

Hacking away at the VDP mode words

I am setting these up to use a table of default values for locating the various VDP tables (PDT, SPDTAB, SATR, SMTN, SIT, etc.), setting screen and character colors, etc. I am thinking of allowing the user to provide different values for these defaults; but, perhaps, all I should do is to include instructions for changing them programmatically after the defaults are invoked by a mode change. What do you think?

...lee
Lee Stewart Posted June 12, 2014

Sprites in Multicolor Mode

I have never used sprites in multicolor mode, so I am unclear why the TI programmer who wrote the VDP mode words for TI Forth set VR01 = EBh. That setting includes double-sizing and magnifying sprites. This does happen to be the default setting shown in the TI book, Video Display Processors: Programmer's Guide (with no explanation!?!); however, that same book shows bitmap graphics mode with a default setting of double-sized sprites with no magnification. The same TI programmer set the bitmap sprite default to single-size with no magnification. Can anyone shed some light here?

...lee
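For readers who want to check the register value by hand, here is a small Python sketch that splits a VDP register 1 byte into its flags. It assumes the commonly documented TMS9918A layout for that register (16K select, display enable, interrupt enable, M1, M2, one reserved bit, sprite size, sprite magnification, from most to least significant bit); `decode_vr1` and the key names are made up for illustration.

```python
# Assumed TMS9918A VR1 bit layout, most significant bit first.
VR1_BITS = [
    (0x80, "vram_16k"),
    (0x40, "display_enabled"),
    (0x20, "interrupt_enabled"),
    (0x10, "text_mode_m1"),
    (0x08, "multicolor_mode_m2"),
    (0x02, "sprites_16x16"),       # the "double-size" setting in the post
    (0x01, "sprites_magnified"),
]


def decode_vr1(value):
    """Return a dict of flag name -> bool for a VR1 byte (hypothetical helper)."""
    return {name: bool(value & mask) for mask, name in VR1_BITS}


if __name__ == "__main__":
    for name, is_set in decode_vr1(0xEB).items():
        print(f"{name:20s} {is_set}")
```

Under that layout, EBh has both low bits set, which agrees with the post's reading of double-sized and magnified sprites; a value with only bit 02h set would give double size without magnification.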
Lee Stewart Posted June 17, 2014

I am done translating the words for graphics modes and primitives to ALC. Before testing, I have the following room left in the 4 banks of a 32KB ROM:

BANK0: 2138 bytes
BANK1: 1096 bytes
BANK2: 2662 bytes
BANK3: 8106 bytes

This is pretty good, but BANK1 is getting tight. I may not have enough room to put the 40/80-column editor there without moving some other code to another bank. I still want to put the file words in ROM, but I want to stay away from BANK3 if I can. I am trying to reserve BANK3 for the floating point math library I converted for TurboForth a while back, and that will consume most of the space in one bank. We'll see. Right now I've got a lot of testing to do!

...lee
Lee Stewart Posted June 19, 2014

VDP mode words done & working!

This is getting a little tedious. I may have to stop for a while. I will try to finish testing the graphics primitives first, however: 40 words!

...lee
Lee Stewart Posted June 20, 2014

I thought I'd test LINE first. I'm in a bit of a quandary over dy and dx. Lines 60 and 66 increment the magnitude of each by 1. I don't think it's necessary. Perhaps lines with very narrow slopes are marginally cleaner, but they seem to be OK. The problem with the increased size is that drawing a line from the top near the left to the bottom left will spill onto the right side of the screen near the bottom. I suspect the same thing happens at the bottom of the screen, but it would not matter except in SPLIT mode, where it would run into the text part of the screen. The problem does not arise when dy and dx are left alone. Furthermore, I can reduce the code quite a bit if I can leave them alone. Ideas?

    ;[*** routine to add sign to number ***
    *++ R0 must contain number under consideration.
    _SNW   MOV  R0,R0         ; check number in R0 for sign
           JEQ  SNWXIT        ; if 0, we're done
           JGT  SNW01         ; if positive, increment it
           DEC  R0            ; decrement negative number
           JMP  SNWXIT        ; we're outta here
    SNW01  INC  R0            ; increment positive number
    SNWXIT RT                 ; return to caller
    * HEX
    * : SNW ( n --- n-1|0|n+1 )
    * ( decrement if -, 0 if 0, increment if +)
    * DUP SGN +
    ;]
    ;[*** LINE *** ( x1 y1 x2 y2 --- )
    * LINE does the following:
    *   1) Computes dy = y2-y1 and dx = x2-x1
    *   2) Adds sign (+1|-1|0) to differences
    *   3) Determines which direction, x or y, has slope <= 1 to avoid
    *      DIV overflow
    *   4) Forces plotting direction to be positive
    *   5) Sets starting y|x accumulator as acc = (y|x)*256
    *   6) Computes accumulator increment as inc = slope*256
    *   7) Each time through dot plotting loop:
    *      a) x|y = x|y + 1
    *      b) y|x = acc/256 <--truncates fraction
    *      c) Plot dot
    *      d) acc = acc+inc
    * DATA DTBM_N
    * LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
    * LINE DATA $+2
    * LINEP BL @BLF2A
    * DATA _LINE->6000+BANK1
    DOTCNT EQU  FAC           ; temporary storage for point (dot) count for line
    DX     EQU  FAC+2
    DY     EQU  FAC+4
    DYXSN  EQU  FAC+6         ; sign of dy/dx or dx/dy
    * Register usage---
    *   R0: varies
    *   R1: varies
    *   R2: y2
    *   R3: x2
    *   R4: y1, then, point (dot) count for line (DOTCNT)
    *   R5: x1, then, increment for dependent coordinate (INC)
    *   R6: accumulator for dependent coordinate (ACC)
    *   R7: current independent coordinate (COORD)
    *   R12: contains label for principal axis (LNXAX or LNYAX)
    _LINE  MOV  *SP+,R2       ; pop coordinates
           MOV  *SP+,R3
           MOV  *SP+,R4
           MOV  *SP+,R5
           MOV  R2,R0         ; calculate dy
           S    R4,R0
    *++--++ BL @_SNW ; add sign
           MOV  R0,R1         ; prepare for sign calculation
           ABS  R0
           MOV  R0,@DY
           MOV  R3,R0         ; calculate dx
           S    R5,R0
    *++--++ BL @_SNW ; add sign
           XOR  R0,R1         ; calculate sign of slope (dy/dx|dx/dy)
           MOV  R1,@DYXSN     ; store sign of slope
           ABS  R0
           MOV  R0,@DX
           MOV  @DY,R1
           C    R1,R0         ; compare |dy| to |dx|
           JLT  LINE01
           INC  R1            ; increment for point count
           MOV  R1,@DOTCNT    ; store point count
           MOV  R4,R7         ; assume starting with y1
           MOV  R5,R6         ; and x1 (to ACC)
           C    R4,R2         ; should we switch?
           JGT  LINE03        ; yes
           JMP  LINE04        ; no
    LINE03 MOV  R2,R7         ; we're starting with y2
           MOV  R3,R6         ; and x2 (to ACC)
    LINE04 LI   CRU,LNYAX     ; load CRU (R12) with LNYAX to indicate y-axis processing
           MOV  @DX,R1        ; load dx
           SLA  R1,8          ; multiply dx by 256
           CLR  R0
           DIV  @DY,R0        ; 256*dx/dy
           JMP  LINE02
    LINE01 INC  R0            ; increment for point count
           MOV  R0,@DOTCNT    ; store point count
           MOV  R5,R7         ; assume starting with x1
           MOV  R4,R6         ; and y1 (to ACC)
           C    R5,R3         ; should we switch?
           JGT  LINE05        ; yes
           JMP  LINE06        ; no
    LINE05 MOV  R3,R7         ; we're starting with x2
           MOV  R2,R6         ; and y2 (to ACC)
    LINE06 LI   CRU,LNXAX     ; load CRU (R12) with LNXAX to indicate x-axis processing
           MOV  @DY,R1        ; load dy
           SLA  R1,8          ; multiply dy by 256
           CLR  R0
           DIV  @DX,R0        ; 256*dy/dx
    LINE02 MOV  R0,R5         ; load increment
           SLA  R6,8          ; * 256 (adjust ACC)
           MOV  @DYXSN,R0     ; get sign
           JGT  LINE07        ; is sign
           JEQ  LINE07        ; negative?
           NEG  R5            ; yes, negate increment
    LINE07 MOV  @DOTCNT,R4    ; load point counter
    LNLOOP B    *CRU          ; branch to x-axis or y-axis plotting
    LNXAX  MOV  R7,R1         ; get next x for DOT
           MOV  R6,R0         ; get accumulator contents to y for DOT
           SRL  R0,8          ; divide by 256 for proper position
           JMP  LNPLOT        ; go to plot
    LNYAX  MOV  R7,R0         ; get next y for DOT
           MOV  R6,R1         ; get accumulator contents to x for DOT
           SRL  R1,8          ; divide by 256 for proper position
    LNPLOT BL   @__DTBM       ; plot the dot (R0 = y, R1 = x)
           A    R5,R6         ; increment accumulator
           INC  R7            ; increment principal coordinate
           DEC  R4            ; decrement counter
           JNE  LNLOOP        ; are we done?
           BL   @RTNEXT       ; yup, back to bank 0 and the inner interpreter

...lee
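One plausible reading of the spill described in this post can be reproduced with a small Python model of the 16-bit accumulator. This is a sketch under stated assumptions, not the ALC itself: `dependent_columns` is a hypothetical helper, it only handles a y-major line drawn top to bottom, and the example endpoints (10,0) to (0,191) are chosen for illustration.

```python
def sgn(n):
    """Return -1, 0 or +1 according to the sign of n."""
    return (n > 0) - (n < 0)


def dependent_columns(x1, y1, x2, y2, pad):
    """x column plotted on each row of a y-major line (assumes y1 < y2).

    pad=True applies the SNW adjustment (each difference pushed one
    further from zero); pad=False leaves dy and dx alone.
    """
    dx, dy = x2 - x1, y2 - y1
    if pad:
        dx, dy = dx + sgn(dx), dy + sgn(dy)
    inc = (abs(dx) << 8) // abs(dy)      # slope * 256, truncated
    if dx < 0:
        inc = -inc
    acc = x1 << 8
    columns = []
    for _ in range(abs(dy) + 1):          # point count = |dy| + 1
        # Keep 16 bits, then shift right logically by 8, like SRL on a register.
        columns.append((acc & 0xFFFF) >> 8)
        acc += inc
    return columns


if __name__ == "__main__":
    padded = dependent_columns(10, 0, 0, 191, pad=True)
    plain = dependent_columns(10, 0, 0, 191, pad=False)
    print(max(padded))   # 255: the accumulator went negative and wrapped
    print(max(plain))    # 10: the line stays on the left side
```

In this model the padded increment is slightly too steep and the loop runs one row longer, so the accumulator drops below zero before the last row; as an unsigned 16-bit value shifted right by 8 it becomes column 255. With the differences left alone, the line ends at column 0 as expected.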
Lee Stewart Posted June 21, 2014

New LINE Routine

I decided to try an integer, no-divide, no-multiply (except shifts, i.e., multiply by powers of 2) Bresenham type of algorithm. It actually seems to run slower than the old LINE; but, I'm sure I can tighten up the new one a bit. Then I can compare the two for speed. Anyway, here's the new LINE for anyone interested in helping tweak it, or just to look over my shoulder:

    ;[*** LINE *** ( x1 y1 x2 y2 --- ) ( alternative LINE---one or the other)
    *++ This is an integer, no-divide version of the Bresenham line algorithm
    * LINE does the following:
    *   1) Computes dy = y2-y1 and dx = x2-x1
    *   2) Determines which direction, x or y, has slope <= 1
    *      x) Flips dx and dy
    *      y) Leaves dx and dy alone
    *   3) sets DOTCNT = dx in R4
    *   4) Computes D = 2*dy-dx
    *   5) Forces plotting direction to be positive for independent variable
    *   6) Sets starting y|x accumulator as acc = (y|x)
    *   7) Finds accumulator increment as inc = +1|-1
    *   8) Plots first dot
    *   9) Each time through dot plotting loop:
    *      a) Loop counter check
    *      b) x|y = x|y + 1
    *      c) D > 0?
    *         yes)
    *            y1) acc = acc + inc
    *            y2) D = D+2*(dy-dx)
    *         no) D = D+2*dy
    *      d) y|x = acc
    *      e) Plot dot
    *      f) Decrement point counter
    * DATA DTBM_N
    * LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
    * LINE DATA $+2
    * LINEP BL @BLF2A
    * DATA _LINE->6000+BANK1
    DX     EQU  FAC
    DY     EQU  FAC+2
    DYXSN  EQU  FAC+4         ; sign of dy/dx or dx/dy, then, D
    * Register usage---
    *   R0: varies
    *   R1: varies
    *   R2: y2
    *   R3: x2
    *   R4: y1, then, point (dot) count for line (DOTCNT)
    *   R5: x1, then, increment for dependent coordinate (INC) (+1|-1)
    *   R6: accumulator for dependent coordinate (ACC)
    *   R7: current independent coordinate (COORD)
    *   R12: contains label for principal axis (LNXAX or LNYAX)
    _LINE  MOV  *SP+,R2       ; pop coordinates
           MOV  *SP+,R3
           MOV  *SP+,R4
           MOV  *SP+,R5
           SETO @DYXSN        ; initially, store -1 as sign of slope
           MOV  R2,R0         ; calculate dy
           S    R4,R0
           MOV  R0,R1         ; prepare for sign calculation
           ABS  R0
           MOV  R0,@DY
           MOV  R3,R0         ; calculate dx
           S    R5,R0
           XOR  R0,R1         ; calculate sign of slope (dy/dx|dx/dy)
           JLT  LINE08        ; negative slope?
           NEG  @DYXSN        ; change sign to +1
    LINE08 ABS  R0
           MOV  R0,@DX
           MOV  @DY,R1
           C    R1,R0         ; compare |dy| to |dx|
           JLT  LINE01        ; dy < dx?
           MOV  R0,@DY        ; no, flip dy
           MOV  R1,@DX        ; and dx
           MOV  R4,R7         ; assume starting with y1
           MOV  R5,R6         ; and x1 (to ACC)
           C    R4,R2         ; should we switch?
           JGT  LINE03        ; yes
           JMP  LINE04        ; no
    LINE03 MOV  R2,R7         ; we're starting with y2
           MOV  R3,R6         ; and x2 (to ACC)
    LINE04 LI   CRU,LNYAX     ; load CRU (R12) with LNYAX to indicate y-axis processing
           JMP  LINE02
    LINE01 MOV  R5,R7         ; assume starting with x1
           MOV  R4,R6         ; and y1 (to ACC)
           C    R5,R3         ; should we switch?
           JGT  LINE05        ; yes
           JMP  LINE06        ; no
    LINE05 MOV  R3,R7         ; we're starting with x2
           MOV  R2,R6         ; and y2 (to ACC)
    LINE06 LI   CRU,LNXAX     ; load CRU (R12) with LNXAX to indicate x-axis processing
    LINE02 MOV  @DYXSN,R5     ; get sign to INC register before we destroy it!
           MOV  @DY,R0        ; calculate D
           SLA  R0,1          ; D = 2*dy
           S    @DX,R0        ; D = 2*dy-dx
           MOV  R0,@DYXSN     ; store D in DYXSN
           MOV  @DX,R4        ; load point counter
           CI   CRU,LNXAX     ; x or y axis?
           JEQ  LINE07        ; x-axis
           MOV  R7,R0         ; y-axis, COORD to y for DOT
           MOV  R6,R1         ; ACC to x for DOT
           JMP  LINE09        ; to first plot
    LINE07 MOV  R7,R1         ; x-axis, COORD to x for DOT
           MOV  R6,R0         ; ACC to y for DOT
    LINE09 BL   @__DTBM       ; plot first dot (R0 = y, R1 = x)
    LNLOOP MOV  R4,R4         ; are we done?
           JEQ  LINEX         ; yup!
           DEC  R4            ; decrement counter
           INC  R7            ; increment principal coordinate
    *++ calculate D
           MOV  @DY,R1        ; get dy
           MOV  @DYXSN,R0     ; D > 0?
           JGT  LINE10        ; yup
           JMP  LINE11        ; nope
    LINE10 A    R5,R6         ; inc/dec dependent variable
           S    @DX,R1        ; dy-dx
    LINE11 SLA  R1,1          ; 2*dy or 2*(dy-dx)
           A    R1,@DYXSN     ; D = D+[2*dy or 2*(dy-dx)]
           B    *CRU          ; branch to x-axis or y-axis plotting
    LNXAX  MOV  R7,R1         ; get next x for DOT
           MOV  R6,R0         ; get accumulator contents to y for DOT
           JMP  LNPLOT        ; go to plot
    LNYAX  MOV  R7,R0         ; get next y for DOT
           MOV  R6,R1         ; get accumulator contents to x for DOT
    LNPLOT BL   @__DTBM       ; plot the dot (R0 = y, R1 = x)
           JMP  LNLOOP        ; next point
    LINEX  BL   @RTNEXT       ; yup, back to bank 0 and the inner interpreter
    ;]

...lee
eck Posted June 21, 2014

Sorry, Lee, I have no idea what magic is going on here. So this might be crap - and it saves only 4 bytes, if it fits: (New Line routine) change line 116 from JMP LNPLOT to JMP LINE09 and line 120 from JMP LNLOOP to JMP LINE09 and delete line 119.
Lee Stewart Posted June 22, 2014 (edited)

"Sorry, Lee, I have no idea what magic is going on here. So this might be crap - and it saves only 4 bytes, if it fits: (New Line routine) change line 116 from JMP LNPLOT to JMP LINE09 and line 120 from JMP LNLOOP to JMP LINE09 and delete line 119."

Thanks for looking it over. You are absolutely right. Your fix is much better. I had found 1 or 2 similar fixes, but missed that one. I expect there are other places this routine can be tightened up, as well.

Regarding the "magic" going on, the idea here is to use Bresenham's line algorithm for plotting a rasterized line in bitmap mode on the TI, given the coordinates of the line ends. I based my LINE routine on the pseudocode near the bottom of the "Derivation" discussion here. My code is more complicated than the referenced pseudocode because I need to handle more than just a line with a positive slope no greater than 1 (0 <= dy/dx <= 1). The solution is to convert all lines encountered to just such conditions, but tracking a negative slope by adding a negative value (-1) to "increment" the dependent variable in a negative direction. If the slope is >1, it is still listed in the ALC as dy/dx, even though it is now the inverse (dx/dy), with x as the dependent variable.

Getting into and out of this code is a little convoluted because it is part of fbForth's threaded interpretive language. I have made it more convoluted because I split the Forth word headers from the rest of the threaded dictionary in ROM to make more room in ROM bank0. I'm not sure that is the best way to handle it, but it is working pretty well so far.

Thanks again for looking over my shoulder.

...lee

[EDIT: The URL for Bresenham's line algorithm was mangled by AtariAge's editor.]

Edited June 22, 2014 by Lee Stewart
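The reduction described in this post (fold every line into the shallow, positive-stepping case and carry the slope's sign as a +1 or -1 increment) can be sketched in Python. This is a model of the idea rather than a transcription of the ALC; `bresenham` is a hypothetical name, and the XOR sign test is borrowed from the assembly comments.

```python
def bresenham(x1, y1, x2, y2):
    """All-direction integer Bresenham line; returns a list of (x, y) dots."""
    dy, dx = y2 - y1, x2 - x1
    # The slope is negative exactly when dy and dx have opposite signs,
    # which shows up as a negative XOR of the two differences.
    inc = -1 if (dx ^ dy) < 0 else 1
    dy, dx = abs(dy), abs(dx)
    x_major = dy < dx
    if x_major:
        # Step along x from the smaller x; y is the dependent coordinate.
        coord, acc = (x2, y2) if x1 > x2 else (x1, y1)
    else:
        # Flip roles: step along y and treat dx/dy as the slope.
        dx, dy = dy, dx
        coord, acc = (y2, x2) if y1 > y2 else (y1, x1)
    d = 2 * dy - dx                       # initial decision value
    points = [(coord, acc) if x_major else (acc, coord)]
    for _ in range(dx):                   # dx more dots after the first
        coord += 1
        if d > 0:
            acc += inc                    # move the dependent coordinate
            d += 2 * (dy - dx)
        else:
            d += 2 * dy
        points.append((coord, acc) if x_major else (acc, coord))
    return points


if __name__ == "__main__":
    print(bresenham(0, 0, 5, 2))
    print(bresenham(2, 0, 0, 5))
```

Only additions, subtractions and doublings appear inside the loop, which is why the approach needs no DIV at all; whether that makes it faster on a given machine depends on how many instructions the loop ends up with.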
Lee Stewart Posted June 22, 2014 (edited)

Cleaned-up LINE routine

Here is the LINE routine with @eck's improvements, serialized labels in order and B *CRU changed to a conditional jump (line 114):

    ;[*** LINE *** ( x1 y1 x2 y2 --- ) ( alternative LINE---one or the other)
    *++ This is an integer, no-divide version of the Bresenham line algorithm
    * LINE does the following:
    *   1) Computes dy = y2-y1 and dx = x2-x1
    *   2) Determines which direction, x or y, has slope <= 1
    *      x) Flips dx and dy
    *      y) Leaves dx and dy alone
    *   3) sets DOTCNT = dx in R4
    *   4) Computes D = 2*dy-dx
    *   5) Forces plotting direction to be positive for independent variable
    *   6) Sets starting y|x accumulator as acc = (y|x)
    *   7) Finds accumulator increment as inc = +1|-1
    *   8) Plots first dot
    *   9) Each time through dot plotting loop:
    *      a) Loop counter check
    *      b) x|y = x|y + 1
    *      c) D > 0?
    *         yes)
    *            y1) acc = acc + inc
    *            y2) D = D+2*(dy-dx)
    *         no) D = D+2*dy
    *      d) y|x = acc
    *      e) Plot dot
    *      f) Decrement point counter
    * DATA DTBM_N
    * LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
    * LINE DATA $+2
    * LINEP BL @BLF2A
    * DATA _LINE->6000+BANK1
    DX     EQU  FAC
    DY     EQU  FAC+2
    DYXSN  EQU  FAC+4         ; sign of dy/dx or dx/dy, then, D
    * Register usage---
    *   R0: varies
    *   R1: varies
    *   R2: y2
    *   R3: x2
    *   R4: y1, then, point (dot) count for line (DOTCNT)
    *   R5: x1, then, increment for dependent coordinate (INC) (+1|-1)
    *   R6: accumulator for dependent coordinate (ACC)
    *   R7: current independent coordinate (COORD)
    *   R12: contains flag for principal axis (1 = x axis, 0 = y axis)
    _LINE  MOV  *SP+,R2       ; pop coordinates
           MOV  *SP+,R3
           MOV  *SP+,R4
           MOV  *SP+,R5
           SETO @DYXSN        ; initially, store -1 as sign of slope
           MOV  R2,R0         ; calculate dy
           S    R4,R0
           MOV  R0,R1         ; prepare for sign calculation
           ABS  R0
           MOV  R0,@DY
           MOV  R3,R0         ; calculate dx
           S    R5,R0
           XOR  R0,R1         ; calculate sign of slope (dy/dx|dx/dy)
           JLT  LINE01        ; negative slope?
           NEG  @DYXSN        ; change sign to +1
    LINE01 ABS  R0
           MOV  R0,@DX
           MOV  @DY,R1
           C    R1,R0         ; compare |dy| to |dx|
           JLT  LINE04        ; dy < dx?
           MOV  R0,@DY        ; no, flip dy
           MOV  R1,@DX        ; and dx
           MOV  R4,R7         ; assume starting with y1
           MOV  R5,R6         ; and x1 (to ACC)
           C    R4,R2         ; should we switch?
           JGT  LINE02        ; yes
           JMP  LINE03        ; no
    LINE02 MOV  R2,R7         ; we're starting with y2
           MOV  R3,R6         ; and x2 (to ACC)
    LINE03 CLR  CRU           ; 0 to CRU (R12) to indicate y-axis processing
           JMP  LINE07
    LINE04 MOV  R5,R7         ; assume starting with x1
           MOV  R4,R6         ; and y1 (to ACC)
           C    R5,R3         ; should we switch?
           JGT  LINE05        ; yes
           JMP  LINE06        ; no
    LINE05 MOV  R3,R7         ; we're starting with x2
           MOV  R2,R6         ; and y2 (to ACC)
    LINE06 LI   CRU,1         ; 1 to CRU (R12) to indicate x-axis processing
    LINE07 MOV  @DYXSN,R5     ; get sign to INC register before we destroy it!
           MOV  @DY,R0        ; calculate D
           SLA  R0,1          ; D = 2*dy
           S    @DX,R0        ; D = 2*dy-dx
           MOV  R0,@DYXSN     ; store D in DYXSN
           MOV  @DX,R4        ; load point counter
           MOV  CRU,CRU       ; x or y axis?
           JNE  LINE08        ; x-axis
           MOV  R7,R0         ; y-axis, COORD to y for DOT
           MOV  R6,R1         ; ACC to x for DOT
           JMP  LNLOOP        ; to first plot
    LINE08 MOV  R7,R1         ; x-axis, COORD to x for DOT
           MOV  R6,R0         ; ACC to y for DOT
    LNLOOP BL   @__DTBM       ; plot first dot (R0 = y, R1 = x)
           MOV  R4,R4         ; are we done?
           JEQ  LINEX         ; yup!
           DEC  R4            ; decrement counter
           INC  R7            ; increment principal coordinate
    *++ calculate D
           MOV  @DY,R1        ; get dy
           MOV  @DYXSN,R0     ; D > 0?
           JGT  LINE09        ; yup
           JMP  LINE10        ; nope
    LINE09 A    R5,R6         ; inc/dec dependent variable
           S    @DX,R1        ; dy-dx
    LINE10 SLA  R1,1          ; 2*dy or 2*(dy-dx)
           A    R1,@DYXSN     ; D = D+[2*dy or 2*(dy-dx)]
           MOV  CRU,CRU       ; x-axis or y-axis?
           JEQ  LNYAX         ; y-axis
           MOV  R7,R1         ; x-axis, get next x for DOT
           MOV  R6,R0         ; get accumulator contents to y for DOT
           JMP  LNLOOP        ; go to plot
    LNYAX  MOV  R7,R0         ; y-axis, get next y for DOT
           MOV  R6,R1         ; get accumulator contents to x for DOT
           JMP  LNLOOP        ; plot the dot (R0 = y, R1 = x) & on to next point
    LINEX  BL   @RTNEXT       ; yup, back to bank 0 and the inner interpreter
    ;]

Unexpectedly (at least, to me), this version is no faster than TI's version. I like it better; but, when I used it to fill the SPLIT2 graphics mode screen with lines radiating from the upper left corner, the current, Bresenham-derived LINE took 52 seconds in Classic99 and the TI version took 49 seconds. I took a look at the plotting loop in each version and found (without accounting for different instruction times) that the Bresenham-derived LINE has 16 instructions in both paths through the loop while the TI version has only 9 or 10 instructions! Part of the problem is that I have 3 variables in scratchpad RAM for which I have no available register space. I could trade that space for a BLWP to the DOT routine; but, I don't think that would help. I could also use variables in scratchpad RAM for loop variables that do not need to be put in registers each time through the loop as do D, dy and dx. That would reduce the number of instructions. Perhaps, I'll try that and report back. I'm glad we had this little talk.

For those who want to follow this code through the call to DOT (BL @__DTBM) in line 99, here is that code:

    *++ put the following DATA statements in ROM---they do not change
    *++ <<< also, it looks like only the first 8 bytes ever get used!!!! >>>
    DTAB   DATA >8040,>2010,>0804,>0201   ; array used only in bitmap modes (4,5,6)
           DATA >7FBF,>DFEF,>F7FB,>FDFE   ; ..[ALC only]
           DATA >8040,>2010,>0804,>0201
    ;[*** DOT *** ( x y --- )
    * Plot a dot at dotcolumn x and dotrow y in bitmap mode.
    * DATA DTOG_N
    * DTBM_N DATA 3+TERMBT*LSHFT8+'D','OT'+TERMBT
    * DTBM DATA @+2
    * DTBMP BL @BLF2A
    * DATA _DTBM->6000+BANK1
    _DTBM
    *++ get bit-set byte and vaddr to stack (from old DDOT code)
           MOV  *SP+,R0       ; pop y from stack
           MOV  *SP+,R1       ; pop x from stack
           BL   @__DTBM       ; branch to body of this routine
           BL   @RTNEXT
    *++ body of DOT routine to allow call by other routines in this bank
    *++ Registers passed must contain
    *++ R0: y coordinate
    *++ R1: x coordinate
    __DTBM MOV  R0,R2         ; y to R2
           MOV  R1,R3         ; x to R3
           ANDI R0,>0007      ; R0 = 3 right bits of y = char dotrow
           ANDI R1,>0007      ; R1 = 3 right bits of x = char dotcolumn
           ANDI R2,>00F8      ; R2 = 5 left bits of y = start dot row of char pattern
           ANDI R3,>00F8      ; R3 = 5 left bits of x = start dot column of char pattern
           SLA  R2,>0005      ; R2 * 32 = PDT offset of char pattern row's 1st row
           A    R2,R0         ; R0 = PDT offset of char pattern's row byte
           A    R3,R0         ; R0 = PDT offset of char pattern byte of dot
           AI   R0,>2000      ; convert to actual location in VRAM (vaddr) of pattern byte
           CLR  R2
           MOVB @DTAB(R1),R2  ; bit mask (b) of dot to high byte of R2
    *++ beginning of old DOT code
           CLR  R1
           LIMI 0
           BLWP @VSBR         ; get byte to operate on to high byte of R1
           MOV  @$DMODE,R3    ; get DMODE
           DEC  R3            ; make it -1, 0 or +1
           JEQ  DUNDR         ; undraw?
           JLT  DRW           ; draw?
           XOR  R2,R1         ; toggle bit to be drawn|undrawn
           JMP  DOTCOL
    DUNDR  SZC  R2,R1         ; clear bit to be undrawn
           JMP  DOTCOL
    DRW    SOC  R2,R1         ; set bit to be drawn
    DOTCOL BLWP @VSBW         ; write result back to VRAM <<< ensure R0 preserved!!! >>>
           MOVB @$DCOL+1,R1   ; dcolor to high byte (low byte should still = 0)
           JLT  DOTEX
           S    @H2000,R0     ; adjust to point to color table <<< ensure R0 preserved!!! >>>
           BLWP @VSBW         ; write new colors to color table
    DOTEX  LIMI 2
           RT
    * : DOT ( x y --- )
    * DDOT DUP 2000 - >R DMODE @ ( PS: b vaddr dmode RS: vaddr-2000)
    * CASE 0 OF VOR ENDOF ( draw ) ( PS: RS: vaddr-2000)
    * 1 OF SWAP FF XOR SWAP VAND ENDOF ( undraw ) ( PS: b.xor.FFh vaddr RS: vaddr-2000)
    * 2 OF VXOR ENDOF ( toggle )
    * DROP DROP ENDCASE R>
    * DCOLOR @ 0 < IF DROP ELSE DCOLOR @ SWAP VSBW THEN ;
    ;]*

...lee

Edited June 23, 2014 by Lee Stewart
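The address arithmetic at the top of the DOT body can be checked with a short Python model. It follows the register comments in the listing (pattern table assumed at >2000, colour table >2000 below it); `dot_address` and `apply_dmode` are hypothetical names, and the mask expression stands in for the first eight bytes of DTAB.

```python
PDT_BASE = 0x2000    # pattern descriptor table base, per AI R0,>2000


def dot_address(x, y):
    """Return (vaddr, mask) for pixel (x, y) in bitmap mode."""
    # Character row start (y with low 3 bits cleared) times 32, plus the
    # character column start (x with low 3 bits cleared), plus the pixel
    # row inside the character cell.
    offset = ((y & 0xF8) << 5) + (x & 0xF8) + (y & 0x07)
    mask = 0x80 >> (x & 0x07)     # same values as DTAB: 80h, 40h, ... 01h
    return PDT_BASE + offset, mask


def apply_dmode(byte, mask, dmode):
    """Combine a pattern byte with the dot mask the way the ALC branches do."""
    if dmode == 0:
        return byte | mask            # draw (SOC)
    if dmode == 1:
        return byte & ~mask & 0xFF    # undraw (SZC)
    return byte ^ mask                # anything above 1 toggles (XOR)


if __name__ == "__main__":
    vaddr, mask = dot_address(9, 10)
    print(hex(vaddr), hex(mask))
    print(hex(vaddr - PDT_BASE))      # matching colour-table byte
```

Each call to DOT therefore touches one pattern byte (read, modify, write) and, when a dot colour is set, one colour byte at the same offset in the other table, which is where the three VDP accesses per pixel come from.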
Lee Stewart Posted June 22, 2014

No change in speed. I just traded one set of MOVes for another. Aside from more clever programming than I have so far managed, I suppose the only increase in speed I'm going to get is by changing the one occurrence of BLWP @VSBR and the two of BLWP @VSBW in DOT to inline code. Any ideas?

...lee
+Lee Stewart Posted June 23, 2014
Almost done with the graphics primitives! I have the joystick words (simple) and multicolor characters yet to test before I let out the next beta. Soon, I promise!
...lee
Willsy Posted June 23, 2014
Lee Stewart said:
No change in speed. I just traded one set of MOVes for another. Aside from more clever programming than I have so far managed, I suppose the only increase in speed I'm going to get is by changing the one occurrence of BLWP @VSBR and the two of BLWP @VSBW in DOT to inline code. Any ideas? ...lee

That would make a big difference, I think. BLWP is a slow instruction (though it's doing quite a lot of work).
eck Posted June 23, 2014
Too late. Hi, ho, Lee! Thank you for your explanation and the link. BLWP is good for 26 machine cycles plus 6 memory cycles. If you are able to save the 'pops' in the subprograms too, then you could save about 300 cycles per pixel. On the one hand, you will lose the beauty of your code if you take the 'pasta route'; on the other hand, if you have enough room left in your page, I think it is worth a try. Few people will ever look inside your code and complain about this sort of programming, but many more will enjoy the functionality of your work. That is my opinion, which is not proven. I hope I find the time to port your code to my computer to watch it run.
+Lee Stewart Posted June 24, 2014
Multicolor mode words (MULTI, MINIT, MCHAR) are working! I just need to verify the joystick words (JMODE, JOYST, JKBD, JCRU) and update FBLOCKS. I think I will also play with inlining VSBR and VSBW in DOT before I release the next beta.
...lee
Tursi Posted June 24, 2014
VSBR has a lot more overhead than just the BLWP, it jumps around all over the place to do that single byte write. In addition, its registers are in 8-bit RAM. Doing it inline is orders of magnitude faster. I did take up the challenge and start working on a faster line draw to see if I could beat what you had, but, it's unlikely I'll be able to test it before the weekend. Also.. I wasn't sure if I was restricted to the registers you used or if I have any others (one more would be very nice!) And finally, didn't know how much ROM space was available. No big deal if you can't use it, I'm having fun trying a new approach.
+Lee Stewart Posted June 24, 2014 (edited)
Tursi said:
VSBR has a lot more overhead than just the BLWP, it jumps around all over the place to do that single byte write. In addition, its registers are in 8-bit RAM. Doing it inline is orders of magnitude faster. I did take up the challenge and start working on a faster line draw to see if I could beat what you had, but, it's unlikely I'll be able to test it before the weekend. Also.. I wasn't sure if I was restricted to the registers you used or if I have any others (one more would be very nice!) And finally, didn't know how much ROM space was available. No big deal if you can't use it, I'm having fun trying a new approach.

Most fbForth words use the main fbForth workspace in scratchpad RAM at 8300h – 831Fh. R0 – R7 are expected to be free for any word to use in ALC. R11 (LINK) has its usual function as does R12 (CRU). The rest are committed to fbForth and should not be touched unless you know how and when the inner interpreter uses and needs them. You may have noticed that I have been using R12 with abandon when the CRU is not involved in a given fbForth word. KSCAN is using its own registers, so I can still use it. In a few cases, I have boxed myself in; but, most of the time, I'm OK. Because I am using the fbForth workspace for both LINE and DOT (called by LINE), I have limited myself there. With inline code for VSBR and VSBW, I can probably get away with using the main workspace, which would be faster.

Regarding VSBR and VSBW (and all the other VDP access routines, for that matter), I am not using the E/A code (not that you implied such). I am using code based on what the original TI Forth programmers wrote. I haven't compared the code, so I can't really say much more than that, except that it does use 8-bit RAM for registers.
There's very little room in scratchpad RAM for additional workspaces, though I may start using the space that includes FAC and ARG from 8348h – 836Dh (38 bytes) when it's not in use by DSRs or floating point math. There's also a 14-byte space at 8320h immediately following the main workspace, though not enough for a full workspace. As you can see, there are some options, and probably more I haven't imagined yet.
...lee
Edited June 24, 2014 by Lee Stewart
+Lee Stewart Posted June 24, 2014
I added a VDP mode-changing word, VMODE, to change VDP modes by the mode number. It adds only 24 bytes, including a 14-byte jump table. This will allow restoration to a previous mode simply by saving and restoring the mode number. I am still working on inlining VSBR and VSBW. Maybe I will get to posting the next beta later today.
...lee
Tursi Posted June 24, 2014
Odds are that your VDP utilities are much faster, then. I did in fact imply the EA versions... I was not sure. Thanks for the breakdown on Forth space... if I steal your idea to use FAC and ARG as a workspace I think that will help my code (I'm already using all the registers you listed -- including R12 /and/ R11 -- for storing needed data, and I needed one more. The workspace switch should be worth the reduced work inside the loop). No promises, don't wait for me, etc, etc, but it's a fun challenge and I didn't try drawing lines with the bitmap mode layout in mind before this.
+Lee Stewart Posted June 24, 2014
Tursi said:
Odds are that your VDP utilities are much faster, then. I did in fact imply the EA versions... I was not sure. Thanks for the breakdown on Forth space... if I steal your idea to use FAC and ARG as a workspace I think that will help my code (I'm already using all the registers you listed -- including R12 /and/ R11 -- for storing needed data, and I needed one more. The workspace switch should be worth the reduced work inside the loop). No promises, don't wait for me, etc, etc, but it's a fun challenge and I didn't try drawing lines with the bitmap mode layout in mind before this.

I just looked at the E/A code for VSBR and VSBW. TI Forth/fbForth's is a straight-up copy. It looks like the only savings will be avoiding the BLWP/RTWP, unless I can also avoid BL/RTs. We'll see. One thing I might do, if simply inlining the code doesn't look promising, is to do a BLWP/RTWP for LINE, which would give me 13 registers between LINE and DOT. I want to avoid BLWP/RTWP for DOT because of the repeated calls to it.
...lee
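For readers who have not looked inside VSBR and VSBW, the port sequence such a routine has to perform is short, which is why inlining it is practical. The Python model below is a simplified sketch of the TMS9918A address-port protocol (low address byte first, then the high byte, with >40 OR'd into the high byte for a write). It is not the E/A or fbForth code; the class and function names are hypothetical, and the real chip's read prefetch and register writes are left out.

```python
class VDPModel:
    """Simplified model of the VDP address and data ports (16K VRAM)."""

    def __init__(self):
        self.vram = bytearray(0x4000)
        self._latch = None        # holds the first (low) address byte
        self.addr = 0
        self.write_mode = False

    def write_address_port(self, byte):
        if self._latch is None:
            self._latch = byte & 0xFF          # first byte: low 8 address bits
            return
        high = byte & 0xFF                     # second byte: mode bits + high 6 bits
        self.write_mode = (high & 0xC0) == 0x40
        self.addr = ((high & 0x3F) << 8) | self._latch
        self._latch = None

    def read_data_port(self):
        value = self.vram[self.addr]
        self.addr = (self.addr + 1) & 0x3FFF   # address auto-increments
        return value

    def write_data_port(self, byte):
        self.vram[self.addr] = byte & 0xFF
        self.addr = (self.addr + 1) & 0x3FFF


def vsbr(vdp, vaddr):
    """Single-byte read: set the address with the write bit clear, then read."""
    vdp.write_address_port(vaddr & 0xFF)
    vdp.write_address_port((vaddr >> 8) & 0x3F)
    return vdp.read_data_port()


def vsbw(vdp, vaddr, value):
    """Single-byte write: set the address with >40 in the high byte, then write."""
    vdp.write_address_port(vaddr & 0xFF)
    vdp.write_address_port(((vaddr >> 8) & 0x3F) | 0x40)
    vdp.write_data_port(value)
```

Each access is three port operations; everything else in a BLWP-based utility is calling overhead.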
+Lee Stewart Posted June 24, 2014
Wow! Inlining the code for VSBR and VSBW into DOT only improved it by 20 seconds! It is now 32 seconds, a 60+% improvement. I guess that's something, but not exactly what I expected.
...lee
+Lee Stewart Posted June 25, 2014
Changing the code for LINE to use a context switch to its own registers starting at FAC and using only registers, except for a one-time instruction at the beginning, is 1 second faster (~3%) and 26 bytes less code (170 bytes). Recall that the routine I am timing is discussed above in post #712. I might add that it is plotting 85,807 dots and toggling the color at every plot. Here is the current code for LINE followed by the current code for DOT, with VSBR and two VSBWs inlined:

LINE

;[### LINE ***     ( x1 y1 x2 y2 --- )     ( alternative LINE---one or the other)
*++ This is an integer, no-divide version of the Bresenham line algorithm
* LINE does the following:
*   1) Computes dy = y2-y1 and dx = x2-x1
*   2) Determines which direction, x or y, has slope <= 1
*      x) Flips dx and dy
*      y) Leaves dx and dy alone
*   3) sets DOTCNT = dx in R4
*   4) Computes D = 2*dy-dx
*   5) Forces plotting direction to be positive for independent variable
*   6) Sets starting y|x accumulator as acc = (y|x)
*   7) Finds accumulator increment as inc = +1|-1
*   8) Plots first dot
*   9) Each time through dot plotting loop:
*      a) Loop counter check
*      b) x|y = x|y + 1
*      c) D > 0?
*         yes)
*            y1) acc = acc + inc
*            y2) D = D+2*(dy-dx)
*         no) D = D+2*dy
*      d) y|x = acc
*      e) Plot dot
*      f) Decrement point counter
*        DATA DTBM_N
* LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
* LINE   DATA $+2
* LINEP  BL   @BLF2A
*        DATA _LINE->6000+BANK1
* Register usage---
*    R0: varies
*    R1: varies
*    R2: y2
*    R3: x2
*    R4: y1, then, point (dot) count for line (DOTCNT)
*    R5: x1, then, increment for dependent coordinate (INC) (+1|-1)
*    R6: accumulator for dependent coordinate (ACC)
*    R7: current independent coordinate (COORD)
*    R8: dx, then, 2*dx
*    R9: dy, then, 2*dy
*   R10: sign of dy/dx or dx/dy, then, D
*   R12: contains flag for principal axis (1 = x axis, 0 = y axis)
_LINE
STPTR  EQU  MAINWS+18      ; stack pointer (fbForth's R9)
       LWPI FAC            ; let's use our ws
       MOV  @STPTR,R0      ; get stack pointer to R0
       MOV  *R0+,R2        ; pop coordinates (won't actually change Forth SP)
       MOV  *R0+,R3
       MOV  *R0+,R4
       MOV  *R0+,R5
       SETO R10            ; initially, store -1 as sign of slope
       MOV  R2,R0          ; calculate dy
       S    R4,R0
       MOV  R0,R1          ; prepare for sign calculation
       ABS  R0
       MOV  R0,R9
       MOV  R3,R0          ; calculate dx
       S    R5,R0
       XOR  R0,R1          ; calculate sign of slope (dy/dx|dx/dy)
       JLT  LINE01         ; negative slope?
       NEG  R10            ; change sign to +1
LINE01 ABS  R0
       MOV  R0,R8
       MOV  R9,R1
       C    R1,R0          ; compare |dy| to |dx|
       JLT  LINE04         ; dy < dx?
       MOV  R0,R9          ; no, flip dy
       MOV  R1,R8          ; and dx
       MOV  R4,R7          ; assume starting with y1
       MOV  R5,R6          ; and x1 (to ACC)
       C    R4,R2          ; should we switch?
       JGT  LINE02         ; yes
       JMP  LINE03         ; no
LINE02 MOV  R2,R7          ; we're starting with y2
       MOV  R3,R6          ; and x2 (to ACC)
LINE03 CLR  CRU            ; 0 to CRU (R12) to indicate y-axis processing
       JMP  LINE07
LINE04 MOV  R5,R7          ; assume starting with x1
       MOV  R4,R6          ; and y1 (to ACC)
       C    R5,R3          ; should we switch?
       JGT  LINE05         ; yes
       JMP  LINE06         ; no
LINE05 MOV  R3,R7          ; we're starting with x2
       MOV  R2,R6          ; and y2 (to ACC)
LINE06 LI   CRU,1          ; 1 to CRU (R12) to indicate x-axis processing
LINE07 MOV  R10,R5         ; get sign to INC register before we destroy it!
       SLA  R9,1           ; dy = 2*dy (we don't need dy by itself any more)
       MOV  R9,R0          ; calculate D
       S    R8,R0          ; D = 2*dy-dx
       MOV  R0,R10         ; store D in DYXSN
       MOV  R8,R4          ; load point counter
       SLA  R8,1           ; 2*dx (we don't need dx by itself any more)
       MOV  CRU,CRU        ; x or y axis?
       JNE  LINE08         ; x-axis
       MOV  R7,R0          ; y-axis, COORD to y for DOT
       MOV  R6,R1          ; ACC to x for DOT
       JMP  LNLOOP         ; to first plot
LINE08 MOV  R7,R1          ; x-axis, COORD to x for DOT
       MOV  R6,R0          ; ACC to y for DOT
LNLOOP BL   @__DTBM        ; plot first dot (R0 = y, R1 = x)
       MOV  R4,R4          ; are we done?
       JEQ  LINEX          ; yup!
       DEC  R4             ; decrement counter
       INC  R7             ; increment principal coordinate
*++ Calculate D
       MOV  R9,R1          ; get 2*dy
       MOV  R10,R0         ; D > 0?
       JGT  LINE09         ; yup
       JMP  LINE10         ; nope
LINE09 A    R5,R6          ; inc/dec dependent variable
       S    R8,R1          ; 2*dy-2*dx
LINE10 A    R1,R10         ; D = D+[2*dy or 2*dy-2*dx]
       MOV  CRU,CRU        ; x-axis or y-axis?
       JEQ  LNYAX          ; y-axis
       MOV  R7,R1          ; x-axis, get next x for DOT
       MOV  R6,R0          ; get accumulator contents to y for DOT
       JMP  LNLOOP         ; go to plot
LNYAX  MOV  R7,R0          ; y-axis, get next y for DOT
       MOV  R6,R1          ; get accumulator contents to x for DOT
       JMP  LNLOOP         ; plot the dot (R0 = y, R1 = x) & on to next point
LINEX  LWPI MAINWS         ; RESTORE MAIN WS
       AI   SP,8           ; REDUCE STACK BY 4 CELLS
       BL   @RTNEXT        ; back to bank 0 and the inner interpreter
;]

DOT

;[### DOT ***     ( x y --- )
* Plot a dot at dotcolumn x and dotrow y in bitmap mode.
*        DATA DTOG_N
* DTBM_N DATA 3+TERMBT*LSHFT8+'D','OT'+TERMBT
* DTBM   DATA @+2
* DTBMP  BL   @BLF2A
*        DATA _DTBM->6000+BANK1
_DTBM
*++ get bit-set byte and vaddr to stack (from old DDOT code)
       MOV  *SP+,R0        ; pop y from stack
       MOV  *SP+,R1        ; pop x from stack
       BL   @__DTBM        ; branch to body of this routine
       BL   @RTNEXT
*++ body of DOT routine to allow call by other routines in this bank
*++ Registers passed must contain
*++    R0: y coordinate
*++    R1: x coordinate
__DTBM MOV  R0,R2          ; y to R2
       MOV  R1,R3          ; x to R3
       ANDI R0,>0007       ; R0 = 3 right bits of y = char dotrow
       ANDI R1,>0007       ; R1 = 3 right bits of x = char dotcolumn
       ANDI R2,>00F8       ; R2 = 5 left bits of y = start dot row of char pattern
       ANDI R3,>00F8       ; R3 = 5 left bits of x = start dot column of char pattern
       SLA  R2,>0005       ; R2 * 32 = PDT offset of char pattern row's 1st row
       A    R2,R0          ; R0 = PDT offset of char pattern's row byte
       A    R3,R0          ; R0 = PDT offset of char pattern byte of dot
       AI   R0,>2000       ; convert to actual location in VRAM (vaddr) of pattern byte
       CLR  R2
       MOVB @DTAB(R1),R2   ; bit mask (b) of dot to high byte of R2
*++ beginning of old DOT code
       CLR  R1
       LIMI 0
*++ vsbr --- get byte to operate on to high byte of R1
       STWP R3             ; get this ws address
       MOVB @1(R3),@VDPWA  ; Write low byte of address
       MOVB R0,@VDPWA      ; Write high byte of address
       MOVB @VDPRD,R1      ; Read data
       MOV  @$DMODE,R3     ; get DMODE
       DEC  R3             ; make it -1, 0 or +1
       JEQ  DUNDR          ; undraw?
       JLT  DRW            ; draw?
       XOR  R2,R1          ; toggle bit to be drawn|undrawn
       JMP  DOTCOL
DUNDR  SZC  R2,R1          ; clear bit to be undrawn
       JMP  DOTCOL
DRW    SOC  R2,R1          ; set bit to be drawn
*++ vsbw --- write result back to VRAM <<< ensure R0 preserved!!! >>>
DOTCOL STWP R3             ; get this ws address (address of r0)
       MOVB @1(R3),@VDPWA  ; Write low byte of address
       ORI  R0,>4000       ; Properly adjust VDP write bit
       MOVB R0,@VDPWA      ; Write high byte of address
       ANDI R0,>3FFF       ; restore R0
       MOVB R1,@VDPWD      ; Write data
       MOVB @$DCOL+1,R1    ; dcolor to high byte (low byte should still = 0)
       JLT  DOTEX
       S    @H2000,R0      ; adjust to point to color table <<< ensure R0 preserved!!! >>>
*++ vsbw --- write new colors to color table
       MOVB @1(R3),@VDPWA  ; Write low byte of address (r3 should still have address of r0)
       ORI  R0,>4000       ; Properly adjust VDP write bit
       MOVB R0,@VDPWA      ; Write high byte of address
       MOVB R1,@VDPWD      ; Write data
DOTEX  LIMI 2
       RT
* : DOT ( x y --- )
*     DDOT DUP 2000 - >R DMODE @   ( PS: b vaddr dmode  RS: vaddr-2000)
*     CASE 0 OF VOR ENDOF ( draw )   ( PS:  RS: vaddr-2000)
*          1 OF SWAP FF XOR SWAP VAND ENDOF ( undraw )   ( PS: b.xor.FFh vaddr  RS: vaddr-2000)
*          2 OF VXOR ENDOF ( toggle )
*          DROP DROP ENDCASE R>
*     DCOLOR @ 0 < IF DROP ELSE DCOLOR @ SWAP VSBW THEN ;
;]*

Note that LINE branches at line 104 (the BL @__DTBM at LNLOOP) into DOT at its line 23 (the __DTBM label). In a little while, just for shits and giggles, I'm going to time the same routine using Forth code inherited from TI Forth. I expect it to be miserably slow; we'll see.
...lee
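The numbered steps in the LINE header comment can be checked against a small model. The following Python sketch follows the same sequence as the listing (sign of slope from the sign bits of dx and dy, swap to the major axis, start from the end with the smaller major coordinate, decision variable D = 2*minor - major). It is an illustration only, not a translation of the ALC; the function and variable names are made up for this sketch, and it returns the points instead of calling DOT.

```python
def line_points(x1, y1, x2, y2):
    """Integer, no-divide Bresenham line; returns a list of (x, y) points."""
    sdx, sdy = x2 - x1, y2 - y1
    # Slope sign: -1 when dx and dy have different sign bits (XOR R0,R1 / JLT).
    inc = -1 if (sdx < 0) != (sdy < 0) else 1
    major, minor = abs(sdx), abs(sdy)
    x_major = minor < major              # C R1,R0 / JLT LINE04
    if x_major:
        # Step along x; always plot in the +x direction.
        coord, acc = (x1, y1) if x1 <= x2 else (x2, y2)
    else:
        # Step along y; flip the roles of dx and dy.
        major, minor = minor, major
        coord, acc = (y1, x1) if y1 <= y2 else (y2, x2)
    d = 2 * minor - major                # D = 2*dy - dx
    points = []
    for _ in range(major + 1):           # dot count is major + 1
        points.append((coord, acc) if x_major else (acc, coord))
        coord += 1                       # advance independent coordinate
        if d > 0:
            acc += inc                   # step dependent coordinate
            d += 2 * (minor - major)
        else:
            d += 2 * minor
    return points
```

A line from (3, 0) to (0, 2) is plotted from the x = 0 end with a negative increment, giving (0, 2), (1, 1), (2, 1), (3, 0).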
+Lee Stewart Posted June 25, 2014
OMG, this is just wrong! It takes the Forth source coded graphics primitives 347 seconds to do the plot. That's more than 11 times slower! I am very glad I took the time to rewrite all of that code. I think I will post the next beta later tonight, but no later than tomorrow morning.
...lee
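To put the timings in per-dot terms, here is a rough back-of-the-envelope calculation in Python. The dot count and the 347-second figure come straight from the posts; the 31-second ALC figure is inferred from "32 seconds" followed by "1 second faster", and the 3 MHz figure is the TI-99/4A's CPU clock. The test loop's own Forth overhead is included in both times, so the per-dot numbers are upper bounds, not measurements of LINE and DOT alone.

```python
DOTS = 85_807            # dots plotted by the timed routine
ALC_SECONDS = 31         # inferred: 32 s after inlining, minus 1 s from the LINE rewrite
FORTH_SECONDS = 347      # Forth-source graphics primitives
CLOCK_HZ = 3_000_000     # TI-99/4A CPU clock

ratio = FORTH_SECONDS / ALC_SECONDS                 # how much slower the Forth version is
us_per_dot_alc = ALC_SECONDS / DOTS * 1e6           # microseconds per dot, ALC version
us_per_dot_forth = FORTH_SECONDS / DOTS * 1e6       # microseconds per dot, Forth version
cycles_per_dot_alc = ALC_SECONDS * CLOCK_HZ / DOTS  # clock cycles per dot, ALC version

print(f"Forth/ALC ratio: {ratio:.1f}x")
print(f"ALC:   {us_per_dot_alc:.0f} us per dot (~{cycles_per_dot_alc:.0f} cycles)")
print(f"Forth: {us_per_dot_forth:.0f} us per dot")
```

The ratio works out to a little over 11, which matches the "more than 11 times slower" remark.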