fbForth: TI Forth with File-based Block I/O [Post #1 UPDATED: 06/09/2023]

+Lee Stewart · June 9, 2014

Wow! I now know more about multicolor mode than I ever wanted to know! :-o MCHAR is done—finally.

Now, it's on to the last graphics primitive, LINE, which draws a bitmapped line between 2 sets of pixel coordinates. After that, I can focus on the VDP mode words: TEXT, TEXT80, GRAPHICS, MULTI, GRAPHICS2, SPLIT and SPLIT2.

...lee

+Lee Stewart · June 11, 2014

I have finished my first pass at LINE! Before assembling and testing, it is 69 ALC instructions! I still need to change a few scratchpad RAM storage locations (used to help me keep things straight) to registers, especially those in the pixel-plotting loop. At least I don't have a DIV instruction inside the loop as the TI programmer did. I do have two SRL 8 instructions to effectively divide by 256, but those are much faster than DIV (I think!). I won't be able to assemble and test the code until I finish with the VDP mode words, which are next on the list, though I may actually translate the ALC to Forth Assembler to test sooner if it will fit in one block.

For those champing at the bit to look over my shoulder, here's my current code for LINE (the original TI Forth definition follows the ALC as ALC comments):

;[*** LINE ***      ( x1 y1 x2 y2 --- )
* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Adds sign (+1|-1|0) to differences
*     3) Determines which direction, x or y, has slope <= 1 to avoid
*         DIV overflow
*     4) Forces plotting direction to be positive
*     5) Sets starting y|x accumulator as acc = (y|x)*256
*     6) Computes accumulator increment as inc = slope*256
*     7) Each time through dot plotting loop:
*         a) x|y = x|y + 1
*         b) y|x = acc/256  <--truncates fraction
*         c) Plot dot
*         d) acc = acc+inc

       DATA DOT__N
LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
LINE   DATA $+2
LINEP  BL   @BLF2A
       DATA _LINE->6000+BANK1

X1     EQU  FAC
X2     EQU  FAC+2
Y1     EQU  FAC+4
Y2     EQU  FAC+6
DX     EQU  ARG
DY     EQU  ARG+2
ACC    EQU  ARG+4
INC    EQU  ARG+6
DOTCNT EQU  FREEPD
COORD  EQU  FREEPD+2

_LINE  MOV  *SP+,@Y2         ; pop coordinates
       MOV  *SP+,@X2
       MOV  *SP+,@Y1
       MOV  *SP+,@X1
       MOV  @Y2,R0          ; calculate dy
       S    @Y1,R0
       BL   @_SNW           ; add sign
       MOV  R0,@DY
       MOV  @X2,R0          ; calculate dx
       S    @X1,R0
       BL   @_SNW           ; add sign
       MOV  R0,@DX
       ABS  R0
       MOV  @DY,R1
       ABS  R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE01
       INC  R1              ; increment for point count
       MOV  R1,@DOTCNT      ; store point count
       MOV  @Y1,@COORD      ; assume starting with y1
       MOV  @X1,R4          ;   and x1 (to R4 temporarily)
       C    @Y1,@Y2         ; should we switch?
       JGT  LINE03          ; yes
       JMP  LINE04          ; no
LINE03 MOV  @Y2,@COORD      ; we're starting with y2
       MOV  @X2,R4          ;   and x2 (to R4 temporarily)
LINE04 LI   CRU,LNYAX       ; load CRU (R12) with LNYAX to indicate y-axis processing
       MOV  @DX,R1          ; load dx
       SLA  R1,8            ; multiply dx by 256
       CLR  R0  
       DIV  @DY,R0          ; 256*dx/dy (dividend in R0:R1, quotient to R0)
       MOV  R0,@INC         ; load increment
       JMP  LINE02
LINE01 INC  R0              ; increment for point count
       MOV  R0,@DOTCNT      ; store point count
       MOV  @X1,@COORD      ; assume starting with x1
       MOV  @Y1,R4          ;   and y1 (to R4 temporarily)
       C    @X1,@X2         ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  @X2,@COORD      ; we're starting with x2
       MOV  @Y2,R4          ;   and y2 (to R4 temporarily)
LINE06 LI   CRU,LNXAX       ; load CRU (R12) with LNXAX to indicate x-axis processing
       MOV  @DY,R1          ; load dy
       SLA  R1,8            ; multiply dy by 256
       CLR  R0  
       DIV  @DX,R0          ; 256*dy/dx (dividend in R0:R1, quotient to R0)
       MOV  R0,@INC         ; load increment
LINE02 SLA  R4,8            ; * 256
       MOV  R4,@ACC         ; store it
LNLOOP B    *CRU            ; branch to x-axis or y-axis plotting
LNXAX  MOV  @COORD,R3       ; get next x for DOT
       MOV  @ACC,R0         ; get accumulator contents to y for DOT
       SRL  R0,8            ; divide by 256 for proper position
       JMP  LNPLOT          ; go to plot
LNYAX  MOV  @COORD,R0       ; get next y for DOT
       MOV  @ACC,R3         ; get accumulator contents to x for DOT
       SRL  R3,8            ; divide by 256 for proper position
LNPLOT BL   @__DOT          ; plot the dot (R0 = y, R3 = x)
       A    @INC,@ACC       ; increment accumulator 
       INC  @COORD          ; increment principal coordinate
       DEC  @DOTCNT         ; decrement counter
       JNE  LNLOOP          ; are we done? (loop until count = 0)
       BL   @RTNEXT         ; yup, back to bank 0 and the inner interpreter

* ( High-level Forth for LINE from TI Forth---comments are mine)

* : LINE   ( x1 y1 x2 y2--- )
*     >R R ROT >R R       ( copy y1, y2 to return stack)
*     - SNW               ( add +1|-1|0 to row difference [dy] per sign)
*     SWAP >R R ROT >R R  ( copy x1, x2 to return stack)
*     - SNW               ( add +1|-1|0 to column difference [dx] per sign)
*     OVER ABS            ( dup dy; get |dy|)
*     OVER ABS            ( dup dx; get |dx|)
*     <                   ( |dy| < |dx|)
*     >R R                ( copy [|dy|<|dx|] to return stack)
*     0=                  ( not [|dy|<|dx|])
*     ( insure we use slope <= 1)
*     IF                  ( [|dy| >= |dx|]?)
*         SWAP            ( yes; swap dy and dx)
*     THEN 
*     100 ROT ROT */      ( slope [<= 1] * 256)
*     R>                  ( get |dy|<|dx| from return stack)
*     ( work on X axis?)
*     IF                      ( |dy|<|dx|?)
*         R> R>               ( get x1 & x2 from return stack)
*         OVER OVER           ( dup them)
*         ( insure drawing left to right)
*         >                   ( x1 > x2?)
*         IF                  ( yes) 
*             SWAP            ( start at x2)
*             R> DROP R>      ( get y1 & y2 from return stack; discard y1)
*         ELSE                ( no; we're starting at x1)
*             R> R> DROP      ( get y1 & y2 from return stack; discard y2)
*         THEN 
*         100 *               ( y1*256 [initial value of y for DO])
*         ROT ROT 1+ SWAP     ( x2+1 x1 for DO)
*         DO 
*             I OVER          ( I; y = y1*256, 1st time through)
*             0 100 M/ SWAP DROP  ( y/256)
*             DOT             ( plot dot [I,y/256])
*             OVER +          ( y = y+256*dy/dx)
*         LOOP           
*     ELSE                    ( work on Y axis instead) 
*         R> R> R> R>         ( get x1 x2 y1 y2 from return stack)
*         ROT >R ROT >R       ( put x2 x1 back to return stack)
*         OVER OVER           ( dup y1 y2 for comparison)
*         ( insure drawing top to bottom )
*         >                   ( y1 > y2?)
*         IF                  ( yes) 
*             SWAP            ( start at y2)
*             R> DROP R>      ( get x1 & x2 from return stack; discard x1)               
*         ELSE                ( no; we're starting at y1)
*             R> R> DROP      ( get x1 & x2 from return stack; discard x2)                                  
*         THEN               
*         100 *               ( x1*256 [initial value of x for DO])
*         ROT ROT 1+ SWAP     ( y2+1 y1 for DO)                       
*         DO 
*             DUP             ( x = x1*256, 1st time through)
*             0 100 M/ SWAP DROP  ( x/256)
*             I 
*             DOT             ( plot dot [x/256,I])
*             OVER +          ( x = x+256*dx/dy)
*         LOOP            
*     THEN 
*     DROP DROP               ( clean up)
;]
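
For readers following the header comments of the listing above (steps 1 to 7), here is a minimal Python sketch of the same idea: step one pixel at a time along the longer axis and track the other coordinate in an accumulator scaled by 256. It is only an illustration, not part of fbForth. The function name and the list of returned dots are hypothetical, and it works on magnitudes with a signed increment rather than on the sign-adjusted differences that SNW produces.

```python
def fixed_point_line(x1, y1, x2, y2):
    """Return the (x, y) dots of a line using an accumulator scaled by 256."""
    dx, dy = x2 - x1, y2 - y1
    # Step along whichever axis has the larger extent, so |slope| <= 1.
    steep = abs(dy) >= abs(dx)
    if steep:
        # Treat y as the independent coordinate by swapping roles.
        x1, y1, x2, y2 = y1, x1, y2, x2
    # Force the independent coordinate to run in the positive direction.
    if x1 > x2:
        x1, y1, x2, y2 = x2, y2, x1, y1
    run, rise = x2 - x1, y2 - y1
    # inc = slope * 256, computed on magnitudes and then given a sign.
    inc = 0 if run == 0 else (abs(rise) * 256) // run
    if rise < 0:
        inc = -inc
    acc = y1 * 256                      # starting accumulator
    dots = []
    for coord in range(x1, x2 + 1):
        dep = (acc & 0xFFFF) >> 8       # like SRL 8 on a 16-bit register
        dots.append((dep, coord) if steep else (coord, dep))
        acc += inc
    return dots


print(fixed_point_line(0, 0, 4, 2))
```

Because both the increment and the shift truncate, the last dot can land one pixel short of the requested end point in this model: a line from (0,0) to (3,1) ends at (3,0).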

I welcome any suggestions for improving the code.

...lee

EDIT: Corrected lines 51 & 67: @DOTCNT

Edited June 11, 2014 by Lee Stewart

+Lee Stewart · June 11, 2014

OK...Here is LINE using mostly registers (still not tested, however)—

;[*** LINE ***      ( x1 y1 x2 y2 --- )
* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Adds sign (+1|-1|0) to differences
*     3) Determines which direction, x or y, has slope <= 1 to avoid
*         DIV overflow
*     4) Forces plotting direction to be positive
*     5) Sets starting y|x accumulator as acc = (y|x)*256
*     6) Computes accumulator increment as inc = slope*256
*     7) Each time through dot plotting loop:
*         a) x|y = x|y + 1
*         b) y|x = acc/256  <--truncates fraction
*         c) Plot dot
*         d) acc = acc+inc

       DATA DOT__N
LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
LINE   DATA $+2
LINEP  BL   @BLF2A
       DATA _LINE->6000+BANK1

DOTCNT EQU  FAC         ; temporary storage for point (dot) count for line
DX     EQU  FAC+2
DY     EQU  FAC+4

* Register usage---
*       R0:  varies
*       R1:  varies
*       R2:  y2
*       R3:  x2
*       R4:  y1, then, point (dot) count for line (DOTCNT)
*       R5:  x1, then, increment for dependent coordinate (INC)
*       R6:  accumulator for dependent coordinate (ACC)
*       R7:  current independent coordinate       (COORD)
*      R12:  contains label for principal axis (LNXAX or LNYAX)

_LINE  MOV  *SP+,R2         ; pop coordinates
       MOV  *SP+,R3
       MOV  *SP+,R4
       MOV  *SP+,R5
       MOV  R2,R0           ; calculate dy
       S    R4,R0
       BL   @_SNW           ; add sign
       MOV  R0,@DY
       MOV  R3,R0           ; calculate dx
       S    R5,R0
       BL   @_SNW           ; add sign
       MOV  R0,@DX
       ABS  R0
       MOV  @DY,R1
       ABS  R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE01
       INC  R1              ; increment for point count
       MOV  R1,@DOTCNT      ; store point count
       MOV  R4,R7           ; assume starting with y1
       MOV  R5,R6           ;   and x1 (to ACC)
       C    R4,R2           ; should we switch?
       JGT  LINE03          ; yes
       JMP  LINE04          ; no
LINE03 MOV  R2,R7           ; we're starting with y2
       MOV  R3,R6           ;   and x2 (to ACC)
LINE04 LI   CRU,LNYAX       ; load CRU (R12) with LNYAX to indicate y-axis processing
       MOV  @DX,R1          ; load dx
       SLA  R1,8            ; multiply dx by 256
       CLR  R0  
       DIV  @DY,R0          ; 256*dx/dy (dividend in R0:R1, quotient to R0)
       MOV  R0,R5           ; load increment
       JMP  LINE02
LINE01 INC  R0              ; increment for point count
       MOV  R0,@DOTCNT      ; store point count
       MOV  R5,R7           ; assume starting with x1
       MOV  R4,R6           ;   and y1 (to ACC)
       C    R5,R3           ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  R3,R7           ; we're starting with x2
       MOV  R2,R6           ;   and y2 (to ACC)
LINE06 LI   CRU,LNXAX       ; load CRU (R12) with LNXAX to indicate x-axis processing
       MOV  @DY,R1          ; load dy
       SLA  R1,8            ; multiply dy by 256
       CLR  R0  
       DIV  @DX,R0          ; 256*dy/dx (dividend in R0:R1, quotient to R0)
       MOV  R0,R5           ; load increment
LINE02 SLA  R6,8            ; * 256 (adjust ACC)
       MOV  @DOTCNT,R4      ; load point counter
LNLOOP B    *CRU            ; branch to x-axis or y-axis plotting
LNXAX  MOV  R7,R1           ; get next x for DOT
       MOV  R6,R0           ; get accumulator contents to y for DOT
       SRL  R0,8            ; divide by 256 for proper position
       JMP  LNPLOT          ; go to plot
LNYAX  MOV  R7,R0           ; get next y for DOT
       MOV  R6,R1           ; get accumulator contents to x for DOT
       SRL  R1,8            ; divide by 256 for proper position
LNPLOT BL   @__DOT          ; plot the dot (R0 = y, R1 = x)
       A    R5,R6           ; increment accumulator 
       INC  R7              ; increment principal coordinate
       DEC  R4              ; decrement counter
       JNE  LNLOOP          ; are we done? (loop until count = 0)
       BL   @RTNEXT         ; yup, back to bank 0 and the inner interpreter

* ( High-level Forth for LINE from TI Forth---comments are mine)

* : LINE   ( x1 y1 x2 y2--- )
*     >R R ROT >R R       ( copy y1, y2 to return stack)
*     - SNW               ( add +1|-1|0 to row difference [dy] per sign)
*     SWAP >R R ROT >R R  ( copy x1, x2 to return stack)
*     - SNW               ( add +1|-1|0 to column difference [dx] per sign)
*     OVER ABS            ( dup dy; get |dy|)
*     OVER ABS            ( dup dx; get |dx|)
*     <                   ( |dy| < |dx|)
*     >R R                ( copy [|dy|<|dx|] to return stack)
*     0=                  ( not [|dy|<|dx|])
*     ( insure we use slope <= 1)
*     IF                  ( [|dy| >= |dx|]?)
*         SWAP            ( yes; swap dy and dx)
*     THEN 
*     100 ROT ROT */      ( slope [<= 1] * 256)
*     R>                  ( get |dy|<|dx| from return stack)
*     ( work on X axis?)
*     IF                      ( |dy|<|dx|?)
*         R> R>               ( get x1 & x2 from return stack)
*         OVER OVER           ( dup them)
*         ( insure drawing left to right)
*         >                   ( x1 > x2?)
*         IF                  ( yes) 
*             SWAP            ( start at x2)
*             R> DROP R>      ( get y1 & y2 from return stack; discard y1)
*         ELSE                ( no; we're starting at x1)
*             R> R> DROP      ( get y1 & y2 from return stack; discard y2)
*         THEN 
*         100 *               ( y1*256 [initial value of y for DO])
*         ROT ROT 1+ SWAP     ( x2+1 x1 for DO)
*         DO 
*             I OVER          ( I; y = y1*256, 1st time through)
*             0 100 M/ SWAP DROP  ( y/256)
*             DOT             ( plot dot [I,y/256])
*             OVER +          ( y = y+256*dy/dx)
*         LOOP           
*     ELSE                    ( work on Y axis instead) 
*         R> R> R> R>         ( get x1 x2 y1 y2 from return stack)
*         ROT >R ROT >R       ( put x2 x1 back to return stack)
*         OVER OVER           ( dup y1 y2 for comparison)
*         ( insure drawing top to bottom )
*         >                   ( y1 > y2?)
*         IF                  ( yes) 
*             SWAP            ( start at y2)
*             R> DROP R>      ( get x1 & x2 from return stack; discard x1)               
*         ELSE                ( no; we're starting at y1)
*             R> R> DROP      ( get x1 & x2 from return stack; discard x2)                                  
*         THEN               
*         100 *               ( x1*256 [initial value of x for DO])
*         ROT ROT 1+ SWAP     ( y2+1 y1 for DO)                       
*         DO 
*             DUP             ( x = x1*256, 1st time through)
*             0 100 M/ SWAP DROP  ( x/256)
*             I 
*             DOT             ( plot dot [x/256,I])
*             OVER +          ( x = x+256*dx/dy)
*         LOOP            
*     THEN 
*     DROP DROP               ( clean up)
;]

...lee

+Lee Stewart · June 12, 2014

Hacking away at the VDP mode words—

I am setting these up to use a table of default values for locating the various VDP tables (PDT, SPDTAB, SATR, SMTN, SIT, etc.), setting screen and character colors, etc. I am thinking of allowing the user to provide different values for these defaults; but, perhaps, all I should do is to include instructions for changing them programmatically after the defaults are invoked by a mode change. What do you think?

...lee

+Lee Stewart · June 12, 2014

Sprites in Multicolor Mode—

I have never used sprites in multicolor mode, so I am unclear why the TI programmer, who wrote the VDP mode words for TI Forth, set VR01 = EBh. That setting includes double-sizing and magnifying sprites. This does happen to be the default setting shown in the TI book, Video Display Processors: Programmer's Guide (with no explanation!?!); however, that same book shows bitmap graphics mode with a default setting of double-sized sprites with no magnification. The same TI programmer set the bitmap sprite default to single-size with no magnification. Can anyone shed some light here?
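
For reference, the bit fields of VDP register 1 on the TMS9918A can be unpacked as in the sketch below, which shows why EBh selects multicolor mode with both the sprite size and magnification bits set. This is only an illustration; the function name and dictionary keys are made up for this example, and bit 04h is treated as reserved.

```python
def decode_vr1(value):
    """Unpack the TMS9918A VDP register 1 bit fields from an 8-bit value."""
    return {
        "vram_16k":         bool(value & 0x80),  # 4/16K VRAM select
        "display_on":       bool(value & 0x40),  # BLANK bit (1 = display active)
        "interrupt_enable": bool(value & 0x20),  # IE
        "m1_text":          bool(value & 0x10),  # M1: text mode
        "m2_multicolor":    bool(value & 0x08),  # M2: multicolor mode
        "sprite_16x16":     bool(value & 0x02),  # SIZE: 16x16 sprites
        "sprite_magnified": bool(value & 0x01),  # MAG: 2x magnification
    }


for name, flag in decode_vr1(0xEB).items():
    print(f"{name:18} {flag}")
```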

...lee

+Lee Stewart · June 17, 2014

I am done translating the words for graphics modes and primitives to ALC. Before testing, I have the following room left in the 4 banks of a 32KB ROM:

BANK0:  2138 bytes
BANK1:  1096 bytes
BANK2:  2662 bytes
BANK3:  8106 bytes

This is pretty good, but BANK1 is getting tight. I may not have enough room to put the 40/80-column editor there without moving some other code to another bank. I still want to put the file words in ROM, but I want to stay away from BANK3 if I can. I am trying to reserve BANK3 for the floating point math library I converted for TurboForth a while back, which will consume most of the space in one bank. We'll see. Right now I've got a lot of testing to do! :-o

...lee

+Lee Stewart · June 19, 2014

VDP mode words done & working!—

This is getting a little tedious. I may have to stop for awhile. I will try to finish testing the graphics primitives first, however—40 words! :skull:

...lee

+Lee Stewart · June 20, 2014

I thought I'd test LINE first. I'm in a bit of a quandary over dy and dx. Lines 60 and 66 increment the magnitude of each by 1. I don't think it's necessary. Perhaps lines with very narrow slopes are marginally cleaner, but they seem to be OK. The problem with the increased size is that drawing a line from the top near the left to the bottom left will spill onto the right side of the screen near the bottom. I suspect the same thing happens at the bottom of the screen, but it would not matter except in SPLIT mode, where it would run into the text part of the screen. The problem does not arise when dy and dx are left alone. Furthermore, I can reduce the code quite a bit if I can leave them alone. Ideas?

;[*** routine to add sign to number ***    
*++ R0 must contain number under consideration.

_SNW   MOV  R0,R0               ; check number in R0 for sign
       JEQ  SNWXIT              ; if 0, we're done
       JGT  SNW01               ; if positive, increment it
       DEC  R0                  ; decrement negative number
       JMP  SNWXIT              ; we're outta here
SNW01  INC  R0                  ; increment positive number
SNWXIT RT                       ; return to caller

* HEX
* : SNW       ( n --- n-1|0|n+1 )
*     ( decrement if -, 0 if 0, increment if +)
*     DUP SGN + 
;]                                               
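
As a quick model of what the _SNW routine above does to a difference, here is a minimal Python sketch. The function name is hypothetical, and Python integers stand in for the 16-bit register. It shows how each nonzero dy or dx grows by one in magnitude, which is the lengthening discussed in this post.

```python
def snw(n):
    """Add the sign of n to n: n-1 if negative, 0 if zero, n+1 if positive."""
    if n > 0:
        return n + 1
    if n < 0:
        return n - 1
    return 0


# A span of 191 pixels (for example y = 0 to y = 191) becomes 192 after SNW.
print(snw(191 - 0), snw(0 - 191), snw(0))
```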
;[*** LINE ***      ( x1 y1 x2 y2 --- )
* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Adds sign (+1|-1|0) to differences
*     3) Determines which direction, x or y, has slope <= 1 to avoid
*         DIV overflow
*     4) Forces plotting direction to be positive
*     5) Sets starting y|x accumulator as acc = (y|x)*256
*     6) Computes accumulator increment as inc = slope*256
*     7) Each time through dot plotting loop:
*         a) x|y = x|y + 1
*         b) y|x = acc/256  <--truncates fraction
*         c) Plot dot
*         d) acc = acc+inc

*        DATA DTBM_N
* LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
* LINE   DATA $+2
* LINEP  BL   @BLF2A
*        DATA _LINE->6000+BANK1

DOTCNT EQU  FAC         ; temporary storage for point (dot) count for line
DX     EQU  FAC+2
DY     EQU  FAC+4
DYXSN  EQU  FAC+6       ; sign of dy/dx or dx/dy

* Register usage---
*       R0:  varies
*       R1:  varies
*       R2:  y2
*       R3:  x2
*       R4:  y1, then, point (dot) count for line (DOTCNT)
*       R5:  x1, then, increment for dependent coordinate (INC)
*       R6:  accumulator for dependent coordinate (ACC)
*       R7:  current independent coordinate       (COORD)
*      R12:  contains label for principal axis (LNXAX or LNYAX)

_LINE  MOV  *SP+,R2         ; pop coordinates
       MOV  *SP+,R3
       MOV  *SP+,R4
       MOV  *SP+,R5
       MOV  R2,R0           ; calculate dy
       S    R4,R0
*++--++       BL   @_SNW           ; add sign
       MOV  R0,R1           ; prepare for sign calculation
       ABS  R0
       MOV  R0,@DY
       MOV  R3,R0           ; calculate dx
       S    R5,R0
*++--++       BL   @_SNW           ; add sign
       XOR  R0,R1           ; calculate sign of slope (dy/dx|dx/dy)
       MOV  R1,@DYXSN       ; store sign of slope
       ABS  R0
       MOV  R0,@DX
       MOV  @DY,R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE01
       INC  R1              ; increment for point count
       MOV  R1,@DOTCNT      ; store point count
       MOV  R4,R7           ; assume starting with y1
       MOV  R5,R6           ;   and x1 (to ACC)
       C    R4,R2           ; should we switch?
       JGT  LINE03          ; yes
       JMP  LINE04          ; no
LINE03 MOV  R2,R7           ; we're starting with y2
       MOV  R3,R6           ;   and x2 (to ACC)
LINE04 LI   CRU,LNYAX       ; load CRU (R12) with LNYAX to indicate y-axis processing
       MOV  @DX,R1          ; load dx
       SLA  R1,8            ; multiply dx by 256
       CLR  R0  
       DIV  @DY,R0          ; 256*dx/dy
       JMP  LINE02
LINE01 INC  R0              ; increment for point count
       MOV  R0,@DOTCNT      ; store point count
       MOV  R5,R7           ; assume starting with x1
       MOV  R4,R6           ;   and y1 (to ACC)
       C    R5,R3           ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  R3,R7           ; we're starting with x2
       MOV  R2,R6           ;   and y2 (to ACC)
LINE06 LI   CRU,LNXAX       ; load CRU (R12) with LNXAX to indicate x-axis processing
       MOV  @DY,R1          ; load dy
       SLA  R1,8            ; multiply dy by 256
       CLR  R0  
       DIV  @DX,R0          ; 256*dy/dx
LINE02 MOV  R0,R5           ; load increment
       SLA  R6,8            ; * 256 (adjust ACC)
       MOV  @DYXSN,R0       ; get sign
       JGT  LINE07          ; is sign
       JEQ  LINE07          ;       negative?
       NEG  R5              ; yes, negate increment
LINE07 MOV  @DOTCNT,R4      ; load point counter
LNLOOP B    *CRU            ; branch to x-axis or y-axis plotting
LNXAX  MOV  R7,R1           ; get next x for DOT
       MOV  R6,R0           ; get accumulator contents to y for DOT
       SRL  R0,8            ; divide by 256 for proper position
       JMP  LNPLOT          ; go to plot
LNYAX  MOV  R7,R0           ; get next y for DOT
       MOV  R6,R1           ; get accumulator contents to x for DOT
       SRL  R1,8            ; divide by 256 for proper position
LNPLOT BL   @__DTBM         ; plot the dot (R0 = y, R1 = x)
       A    R5,R6           ; increment accumulator 
       INC  R7              ; increment principal coordinate
       DEC  R4              ; decrement counter
       JNE  LNLOOP          ; are we done?
       BL   @RTNEXT         ; yup, back to bank 0 and the inner interpreter
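
The TMS9900 DIV instruction used in the listing above takes an unsigned 32-bit dividend from a register pair (here R0:R1, with R0 cleared) and leaves the quotient in the first register and the remainder in the second. If the divisor is less than or equal to the high word, the overflow bit is set and the registers are left unchanged. The Python sketch below models that behavior under the assumption of 16-bit inputs. The function names are hypothetical and it is only meant to show how `DIV @DY,R0` yields the increment scaled by 256.

```python
def tms9900_div(high, low, divisor):
    """Model unsigned DIV: (high:low) / divisor -> (first_reg, second_reg, overflow)."""
    if divisor <= high:
        # Overflow: the quotient would not fit in 16 bits; operands unchanged.
        return high, low, True
    dividend = (high << 16) | low
    return dividend // divisor, dividend % divisor, False


def slope_increment(minor, major):
    """Increment = 256*minor/major, as after CLR R0 / SLA R1,8 / DIV."""
    low = (minor << 8) & 0xFFFF         # SLA R1,8 keeps only 16 bits
    quotient, _remainder, _overflow = tms9900_div(0, low, major)
    # On overflow (major == 0) the first register still holds 0.
    return quotient


print(slope_increment(2, 4))   # half a pixel per step, scaled by 256
```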

...lee

+Lee Stewart · June 21, 2014

New LINE Routine—

I decided to try an integer, no-divide, no-multiply (except shifts, i.e., multiply by powers of 2) Bresenham type of algorithm. It actually seems to run slower than the old LINE ; but, I'm sure I can tighten up the new one a bit. Then I can compare the two for speed.

Anyway, here's the new LINE for anyone interested in helping tweak it—or, just to look over my shoulder:

;[*** LINE ***      ( x1 y1 x2 y2 --- )     ( alternative LINE---one or the other)
*++ This is an integer, no-divide version of the Bresenham line algorithm

* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Determines which direction, x or y, has slope <= 1
*         x) Flips dx and dy
*         y) Leaves dx and dy alone
*     3) sets DOTCNT = dx in R4
*     4) Computes D = 2*dy-dx
*     5) Forces plotting direction to be positive for independent variable
*     6) Sets starting y|x accumulator as acc = (y|x)
*     7) Finds accumulator increment as inc = +1|-1
*     8) Plots first dot
*     9) Each time through dot plotting loop:
*         a) Loop counter check
*         b) x|y = x|y + 1
*         c) D > 0?
*             yes)
*                 y1) acc = acc + inc
*                 y2) D = D+2*(dy-dx)
*             no) D = D+2*dy
*         d) y|x = acc
*         e) Plot dot
*         f) Decrement point counter

*        DATA DTBM_N
* LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
* LINE   DATA $+2
* LINEP  BL   @BLF2A
*        DATA _LINE->6000+BANK1

DX     EQU  FAC
DY     EQU  FAC+2
DYXSN  EQU  FAC+4       ; sign of dy/dx or dx/dy, then, D
* Register usage---
*       R0:  varies
*       R1:  varies
*       R2:  y2
*       R3:  x2
*       R4:  y1, then, point (dot) count for line (DOTCNT)
*       R5:  x1, then, increment for dependent coordinate (INC) (+1|-1)
*       R6:  accumulator for dependent coordinate (ACC)
*       R7:  current independent coordinate       (COORD)
*      R12:  contains label for principal axis (LNXAX or LNYAX)

_LINE  MOV  *SP+,R2         ; pop coordinates
       MOV  *SP+,R3
       MOV  *SP+,R4
       MOV  *SP+,R5
       SETO @DYXSN          ; initially, store -1 as sign of slope
       MOV  R2,R0           ; calculate dy
       S    R4,R0
       MOV  R0,R1           ; prepare for sign calculation
       ABS  R0
       MOV  R0,@DY
       MOV  R3,R0           ; calculate dx
       S    R5,R0
       XOR  R0,R1           ; calculate sign of slope (dy/dx|dx/dy)
       JLT  LINE08          ; negative slope?
       NEG  @DYXSN          ; change sign to +1
LINE08 ABS  R0
       MOV  R0,@DX
       MOV  @DY,R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE01          ; dy < dx?
       MOV  R0,@DY          ; no, flip dy
       MOV  R1,@DX          ;        and dx
       MOV  R4,R7           ; assume starting with y1
       MOV  R5,R6           ;   and x1 (to ACC)
       C    R4,R2           ; should we switch?
       JGT  LINE03          ; yes
       JMP  LINE04          ; no
LINE03 MOV  R2,R7           ; we're starting with y2
       MOV  R3,R6           ;   and x2 (to ACC)
LINE04 LI   CRU,LNYAX       ; load CRU (R12) with LNYAX to indicate y-axis processing
       JMP  LINE02
LINE01 MOV  R5,R7           ; assume starting with x1
       MOV  R4,R6           ;   and y1 (to ACC)
       C    R5,R3           ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  R3,R7           ; we're starting with x2
       MOV  R2,R6           ;   and y2 (to ACC)
LINE06 LI   CRU,LNXAX       ; load CRU (R12) with LNXAX to indicate x-axis processing
LINE02 MOV  @DYXSN,R5       ; get sign to INC register before we destroy it!
       MOV  @DY,R0          ; calculate D
       SLA  R0,1            ; D = 2*dy
       S    @DX,R0          ; D = 2*dy-dx
       MOV  R0,@DYXSN       ; store D in DYXSN
       MOV  @DX,R4          ; load point counter
       CI   CRU,LNXAX       ; x or y axis?
       JEQ  LINE07          ; x-axis
       MOV  R7,R0           ; y-axis, COORD to y for DOT
       MOV  R6,R1           ; ACC to x for DOT
       JMP  LINE09          ; to first plot
LINE07 MOV  R7,R1           ; x-axis, COORD to x for DOT
       MOV  R6,R0           ; ACC to y for DOT
LINE09 BL   @__DTBM         ; plot first dot (R0 = y, R1 = x)
LNLOOP MOV  R4,R4           ; are we done?
       JEQ  LINEX           ; yup!
       DEC  R4              ; decrement counter
       INC  R7              ; increment principal coordinate
*++ calculate D
       MOV  @DY,R1          ; get dy
       MOV  @DYXSN,R0       ; D > 0?
       JGT  LINE10          ; yup
       JMP  LINE11          ; nope
LINE10 A    R5,R6           ; inc/dec dependent variable
       S    @DX,R1          ; dy-dx
LINE11 SLA  R1,1            ; 2*dy or 2*(dy-dx)
       A    R1,@DYXSN       ; D = D+[2*dy or 2*(dy-dx)]
       B    *CRU            ; branch to x-axis or y-axis plotting
LNXAX  MOV  R7,R1           ; get next x for DOT
       MOV  R6,R0           ; get accumulator contents to y for DOT
       JMP  LNPLOT          ; go to plot
LNYAX  MOV  R7,R0           ; get next y for DOT
       MOV  R6,R1           ; get accumulator contents to x for DOT
LNPLOT BL   @__DTBM         ; plot the dot (R0 = y, R1 = x)
       JMP  LNLOOP          ; next point
LINEX  BL   @RTNEXT         ; yup, back to bank 0 and the inner interpreter

;]

...lee

eck · June 21, 2014

Sorry, Lee, I have no idea what magic is going on here. So this might be crap, and it saves only 4 bytes, if it fits: (New Line routine) change line 116 from JMP LNPLOT to JMP LINE09 and line 120 from JMP LNLOOP to JMP LINE09 and delete line 119.

+Lee Stewart · June 22, 2014

eck said: "Sorry, Lee, I have no idea what magic is going on here. So this might be crap - and it saves only 4 bytes, if it fits:(New Line routine) change line 116 from JMP LNPLOT to JMP LINE09 and line 120 from JMP LNLOOP to JMP LINE09 and delete line 119."

Thanks for looking it over. You are absolutely right. Your fix is much better. I had found 1 or 2 similar fixes, but missed that one. I expect there are other places this routine can be tightened up, as well.

Regarding the "magic" going on, the idea here is to use Bresenham's line algorithm for plotting a rasterized line in bitmap mode on the TI, given the coordinates of the line ends. I based my LINE routine on the pseudocode near the bottom of the "Derivation" discussion here. My code is more complicated than the referenced pseudocode because I need to handle more than just a line with a positive slope no greater than 1 (0 <= dy/dx <= 1). The solution is to convert every line to that case, tracking a negative slope by adding a negative value (-1) to "increment" the dependent variable in the negative direction. If the slope is > 1, it is still listed in the ALC as dy/dx, even though it is now the inverse (dx/dy), with x as the dependent variable.
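
The approach described above can be modeled in a few lines of Python. This is a hedged sketch rather than a translation of the ALC: the function name and the returned list of dots are hypothetical, but the decision variable D, the flip of dx and dy for steep lines, and the +1|-1 increment follow the description in this thread.

```python
def bresenham_line(x1, y1, x2, y2):
    """Integer-only line: returns the (x, y) dots from one end to the other."""
    sdx, sdy = x2 - x1, y2 - y1
    dx, dy = abs(sdx), abs(sdy)
    # The dependent coordinate moves by -1 when the slope is negative.
    inc = -1 if sdx * sdy < 0 else 1
    steep = dy >= dx
    if steep:
        dx, dy = dy, dx                 # y becomes the independent axis
        coord, acc = (y1, x1) if y1 <= y2 else (y2, x2)
    else:
        coord, acc = (x1, y1) if x1 <= x2 else (x2, y2)
    d = 2 * dy - dx                     # initial decision value
    dots = []
    for _ in range(dx + 1):             # first dot plus dx more
        dots.append((acc, coord) if steep else (coord, acc))
        coord += 1                      # independent axis always steps +1
        if d > 0:
            acc += inc                  # step the dependent coordinate
            d += 2 * (dy - dx)
        else:
            d += 2 * dy
    return dots


print(bresenham_line(0, 0, 3, 1))
```

In this model both end points are always plotted, since the dependent coordinate is only ever moved in whole steps.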

Getting into and out of this code is a little convoluted because it is part of fbForth's threaded interpretive language. I have made it more convoluted because I split the Forth word headers from the rest of the threaded dictionary in ROM to make more room in ROM bank0. I'm not sure that is the best way to handle it, but it is working pretty well so far.

Thanks again for looking over my shoulder.

...lee

[EDIT: The URL for Bresenham's line algorithm was mangled by AtariAge's editor. :mad: ]

Edited June 22, 2014 by Lee Stewart

+Lee Stewart · June 22, 2014

Cleaned-up LINE routine—

Here is the LINE routine with @eck's improvements, serialized labels in order and B *CRU changed to a conditional jump (line 114):

;[*** LINE ***      ( x1 y1 x2 y2 --- )     ( alternative LINE---one or the other)
*++ This is an integer, no-divide version of the Bresenham line algorithm

* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Determines which direction, x or y, has slope <= 1
*         x) Flips dx and dy
*         y) Leaves dx and dy alone
*     3) sets DOTCNT = dx in R4
*     4) Computes D = 2*dy-dx
*     5) Forces plotting direction to be positive for independent variable
*     6) Sets starting y|x accumulator as acc = (y|x)
*     7) Finds accumulator increment as inc = +1|-1
*     8) Plots first dot
*     9) Each time through dot plotting loop:
*         a) Loop counter check
*         b) x|y = x|y + 1
*         c) D > 0?
*             yes)
*                 y1) acc = acc + inc
*                 y2) D = D+2*(dy-dx)
*             no) D = D+2*dy
*         d) y|x = acc
*         e) Plot dot
*         f) Decrement point counter

*        DATA DTBM_N
* LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
* LINE   DATA $+2
* LINEP  BL   @BLF2A
*        DATA _LINE->6000+BANK1

DX     EQU  FAC
DY     EQU  FAC+2
DYXSN  EQU  FAC+4       ; sign of dy/dx or dx/dy, then, D
* Register usage---
*       R0:  varies
*       R1:  varies
*       R2:  y2
*       R3:  x2
*       R4:  y1, then, point (dot) count for line (DOTCNT)
*       R5:  x1, then, increment for dependent coordinate (INC) (+1|-1)
*       R6:  accumulator for dependent coordinate (ACC)
*       R7:  current independent coordinate       (COORD)
*      R12:  contains flag for principal axis (1 = x axis, 0 = y axis)

_LINE  MOV  *SP+,R2         ; pop coordinates
       MOV  *SP+,R3
       MOV  *SP+,R4
       MOV  *SP+,R5
       SETO @DYXSN          ; initially, store -1 as sign of slope
       MOV  R2,R0           ; calculate dy
       S    R4,R0
       MOV  R0,R1           ; prepare for sign calculation
       ABS  R0
       MOV  R0,@DY
       MOV  R3,R0           ; calculate dx
       S    R5,R0
       XOR  R0,R1           ; calculate sign of slope (dy/dx|dx/dy)
       JLT  LINE01          ; negative slope?
       NEG  @DYXSN          ; change sign to +1
LINE01 ABS  R0
       MOV  R0,@DX
       MOV  @DY,R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE04          ; dy < dx?
       MOV  R0,@DY          ; no, flip dy
       MOV  R1,@DX          ;        and dx
       MOV  R4,R7           ; assume starting with y1
       MOV  R5,R6           ;   and x1 (to ACC)
       C    R4,R2           ; should we switch?
       JGT  LINE02          ; yes
       JMP  LINE03          ; no
LINE02 MOV  R2,R7           ; we're starting with y2
       MOV  R3,R6           ;   and x2 (to ACC)
LINE03 CLR  CRU             ; 0 to CRU (R12) to indicate y-axis processing
       JMP  LINE07
LINE04 MOV  R5,R7           ; assume starting with x1
       MOV  R4,R6           ;   and y1 (to ACC)
       C    R5,R3           ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  R3,R7           ; we're starting with x2
       MOV  R2,R6           ;   and y2 (to ACC)
LINE06 LI   CRU,1           ; 1 to CRU (R12) to indicate x-axis processing
LINE07 MOV  @DYXSN,R5       ; get sign to INC register before we destroy it!
       MOV  @DY,R0          ; calculate D
       SLA  R0,1            ; D = 2*dy
       S    @DX,R0          ; D = 2*dy-dx
       MOV  R0,@DYXSN       ; store D in DYXSN
       MOV  @DX,R4          ; load point counter
       MOV  CRU,CRU         ; x or y axis?
       JNE  LINE08          ; x-axis
       MOV  R7,R0           ; y-axis, COORD to y for DOT
       MOV  R6,R1           ; ACC to x for DOT
       JMP  LNLOOP          ; to first plot
LINE08 MOV  R7,R1           ; x-axis, COORD to x for DOT
       MOV  R6,R0           ; ACC to y for DOT
LNLOOP BL   @__DTBM         ; plot first dot (R0 = y, R1 = x)
       MOV  R4,R4           ; are we done?
       JEQ  LINEX           ; yup!
       DEC  R4              ; decrement counter
       INC  R7              ; increment principal coordinate
*++ calculate D
       MOV  @DY,R1          ; get dy
       MOV  @DYXSN,R0       ; D > 0?
       JGT  LINE09          ; yup
       JMP  LINE10          ; nope
LINE09 A    R5,R6           ; inc/dec dependent variable
       S    @DX,R1          ; dy-dx
LINE10 SLA  R1,1            ; 2*dy or 2*(dy-dx)
       A    R1,@DYXSN       ; D = D+[2*dy or 2*(dy-dx)]
       MOV  CRU,CRU         ; x-axis or y-axis?
       JEQ  LNYAX           ; y-axis
       MOV  R7,R1           ; x-axis, get next x for DOT
       MOV  R6,R0           ; get accumulator contents to y for DOT
       JMP  LNLOOP          ; go to plot
LNYAX  MOV  R7,R0           ; y-axis, get next y for DOT
       MOV  R6,R1           ; get accumulator contents to x for DOT
       JMP  LNLOOP          ; plot the dot (R0 = y, R1 = x) & on to next point
LINEX  BL   @RTNEXT         ; yup, back to bank 0 and the inner interpreter
;]
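
For readers who do not read 9900 assembly, the setup and plotting loop above can be modeled roughly as follows. This is only a Python sketch of the same no-divide Bresenham steps, not the fbForth code itself: the function name `line_points` and the choice to return a list of (x, y) tuples instead of calling DOT are assumptions made for illustration.

```python
def line_points(x1, y1, x2, y2):
    """Return the (x, y) dots a no-divide Bresenham line would plot.

    Hypothetical model of the LINE listing: 'coord' plays the role of R7
    (COORD), 'acc' of R6 (ACC), 'inc' of R5 (INC) and 'd' of DYXSN.
    """
    dy = y2 - y1
    dx = x2 - x1
    # Sign of the slope: +1 if dx and dy have the same sign, else -1
    # (mirrors XOR R0,R1 / JLT / NEG in the listing).
    inc = -1 if (dx ^ dy) < 0 else 1
    dx, dy = abs(dx), abs(dy)
    x_major = dy < dx                 # mirrors C R1,R0 / JLT LINE04
    if x_major:
        # Always step in +x: start from the endpoint with the smaller x.
        coord, acc = (x2, y2) if x1 > x2 else (x1, y1)
    else:
        dx, dy = dy, dx               # flip so dx is the long (principal) axis
        # Always step in +y: start from the endpoint with the smaller y.
        coord, acc = (y2, x2) if y1 > y2 else (y1, x1)
    d = 2 * dy - dx                   # initial decision value D
    points = []
    for _ in range(dx + 1):           # DOTCNT = dx, plus the first dot
        points.append((coord, acc) if x_major else (acc, coord))
        coord += 1                    # step the principal coordinate
        if d > 0:
            acc += inc                # step the dependent coordinate
            d += 2 * (dy - dx)
        else:
            d += 2 * dy
    return points


if __name__ == "__main__":
    print(line_points(0, 0, 5, 2))
    print(line_points(0, 4, 2, 0))
```

Because the plotting direction is forced positive along the principal axis, swapping the two endpoints yields the same list of dots in this model.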

Unexpectedly (at least, to me), this version is no faster than TI's version. I like it better; but, when I used it to fill the SPLIT2 graphics mode screen with lines radiating from the upper left corner, the current, Bresenham-derived LINE took 52 seconds in Classic99 and the TI version took 49 seconds. I took a look at the plotting loop in each version and found (without accounting for different instruction times) that the Bresenham-derived LINE has 16 instructions in both paths through the loop while the TI version has only 9 or 10 instructions! Part of the problem is that I have 3 variables in scratchpad RAM for which I have no available register space. I could trade that space for a BLWP to the DOT routine; but, I don't think that would help. I could also use variables in scratchpad RAM for loop variables that do not need to be put in registers each time through the loop as do D, dy and dx. That would reduce the number of instructions. Perhaps, I'll try that and report back. I'm glad we had this little talk.

For those who want to follow this code through the call to DOT (BL @__DTBM at label LNLOOP, line 99 of the listing above), here is that code:

*++ put the following DATA statements in ROM---they do not change
*++ <<< also, it looks like only the first 8 bytes ever get used!!!! >>>
DTAB   DATA >8040,>2010,>0804,>0201     ; array used only in bitmap modes (4,5,6)
       DATA >7FBF,>DFEF,>F7FB,>FDFE     ; ..[ALC only]
       DATA >8040,>2010,>0804,>0201

;[*** DOT ***       ( x y --- )
*       Plot a dot at dotcolumn x and dotrow y in bitmap mode.

*        DATA DTOG_N
* DTBM_N DATA 3+TERMBT*LSHFT8+'D','OT'+TERMBT
* DTBM   DATA $+2
* DTBMP  BL   @BLF2A
*        DATA _DTBM->6000+BANK1

_DTBM   
*++ get bit-set byte and vaddr to stack (from old DDOT code)
       MOV  *SP+,R0             ; pop y from stack
       MOV  *SP+,R1             ; pop x from stack
       BL   @__DTBM             ; branch to body of this routine
       BL   @RTNEXT

*++ body of DOT routine to allow call by other routines in this bank
*++ Registers passed must contain
*++     R0: y coordinate
*++     R1: x coordinate
                            
                            
__DTBM MOV  R0,R2               ; y to R2
       MOV  R1,R3               ; x to R3
       ANDI R0,>0007            ; R0 = 3 right bits of y = char dotrow
       ANDI R1,>0007            ; R1 = 3 right bits of x = char dotcolumn
       ANDI R2,>00F8            ; R2 = 5 left bits of y = start dot row of char pattern
       ANDI R3,>00F8            ; R3 = 5 left bits of x = start dot column of char pattern
       SLA  R2,>0005            ; R2 * 32 = PDT offset of char pattern row's 1st row
       A    R2,R0               ; R0 = PDT offset of char pattern's row byte
       A    R3,R0               ; R0 = PDT offset of char pattern byte of dot
       AI   R0,>2000            ; convert to actual location in VRAM (vaddr) of pattern byte 
       CLR  R2
       MOVB @DTAB(R1),R2        ; bit mask (b) of dot to high byte of R2
*++ beginning of old DOT code       
       CLR  R1
       LIMI 0
       BLWP @VSBR               ; get byte to operate on to high byte of R1
       MOV  @$DMODE,R3          ; get DMODE
       DEC  R3                  ; make it -1, 0 or +1
       JEQ  DUNDR               ; undraw?
       JLT  DRW                 ; draw?
       XOR  R2,R1               ; toggle bit to be drawn|undrawn
       JMP  DOTCOL
DUNDR  SZC  R2,R1               ; clear bit to be undrawn
       JMP  DOTCOL
DRW    SOC  R2,R1               ; set bit to be drawn
DOTCOL BLWP @VSBW               ; write result back to VRAM <<< ensure R0 preserved!!! >>>
       MOVB @$DCOL+1,R1         ; dcolor to high byte (low byte should still = 0)
       JLT  DOTEX
       S    @H2000,R0           ; adjust to point to color table <<< ensure R0 preserved!!! >>>
       BLWP @VSBW               ; write new colors to color table
DOTEX  LIMI 2
       RT

* : DOT ( x y --- )                                               
*   DDOT DUP 2000 - >R DMODE @                          ( PS: b vaddr dmode RS: vaddr-2000)
*   CASE  0 OF VOR  ENDOF                   ( draw )    ( PS: RS: vaddr-2000)
*         1 OF SWAP FF XOR SWAP VAND ENDOF  ( undraw )  ( PS: b.xor.FFh vaddr RS: vaddr-2000)
*         2 OF VXOR ENDOF                   ( toggle )                             
*   DROP DROP ENDCASE R>                                          
*   DCOLOR @ 0 < IF DROP ELSE DCOLOR @ SWAP VSBW THEN ;          
;]*
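
The address arithmetic at the top of __DTBM and the three DMODE cases can be checked with a small model. This is a sketch under assumptions: the names `dot_address` and `apply_dmode` are made up for illustration, and the pattern table base of >2000 is taken from the AI R0,>2000 instruction in the listing.

```python
PDT_BASE = 0x2000   # pattern table base used by the listing (AI R0,>2000)


def dot_address(x, y):
    """Return (vaddr, mask) for pixel (x, y), following __DTBM's steps."""
    # (y & >F8) * 32 selects the character row, (x & >F8) the character
    # cell within that row, and (y & 7) the byte row inside the cell.
    offset = ((y & 0xF8) << 5) + (x & 0xF8) + (y & 0x07)
    # Same values as the first 8 bytes of DTAB: >80, >40, ... >01.
    mask = 0x80 >> (x & 0x07)
    return PDT_BASE + offset, mask


def apply_dmode(byte, mask, dmode):
    """DMODE 0 = draw (SOC), 1 = undraw (SZC), 2 = toggle (XOR)."""
    if dmode == 0:
        return byte | mask
    if dmode == 1:
        return byte & ~mask & 0xFF
    return byte ^ mask


if __name__ == "__main__":
    vaddr, mask = dot_address(9, 10)
    print(hex(vaddr), hex(mask))
    # The color byte for the same dot sits >2000 lower, per S @H2000,R0.
    print(hex(vaddr - 0x2000))
```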

...lee

Edited June 23, 2014 by Lee Stewart

+Lee Stewart · June 22, 2014

No change in speed :_( —I just traded one set of MOVes for another. Aside from more clever programming than I have so far managed, I suppose the only increase in speed I'm going to get is by changing the one occurrence of BLWP @VSBR and the two of BLWP @VSBW in DOT to inline code. Any ideas?

...lee

+Lee Stewart · June 23, 2014

Almost done with the graphics primitives! :-o I have the joystick words (simple) and multicolor characters :ponder: yet to test before I let out the next beta. Soon, I promise!

...lee

Willsy · June 23, 2014

Lee Stewart said: "No change in speed: I just traded one set of MOVes for another. Aside from more clever programming than I have so far managed, I suppose the only increase in speed I'm going to get is by changing the one occurrence of BLWP @VSBR and the two of BLWP @VSBW in DOT to inline code. Any ideas?"

That would make a big difference I think. BLWP is a slow instruction (though it's doing quite a lot of work).

eck · June 23, 2014

Too late.

Hi, ho, Lee!

Thank you for your explanation and the link.

BLWP is good for 26 machine cycles plus 6 memory cycles. If you are able to save the 'pops' in the subprograms too, then you could save about 300 cycles per pixel. On the one hand, you will lose the beauty of your code if you take the 'pasta route'; on the other hand, if you have enough room left in your page, then I think it is worth a try. Few people will ever look inside your code and complain about this sort of programming, but many more will enjoy the functionality of your work. That is my opinion, which is unproven.

Hope I find the time to port your code to my computer to watch it going.

+Lee Stewart · June 24, 2014

Multicolor mode words ( MULTI , MINIT , MCHAR ) are working! I just need to verify the joystick words ( JMODE , JOYST , JKBD , JCRU ) and update FBLOCKS. I think I will also play with inlining VSBR and VSBW in DOT before I release the next beta.

...lee

Tursi · June 24, 2014

VSBR has a lot more overhead than just the BLWP; it jumps around all over the place to do that single-byte read. In addition, its registers are in 8-bit RAM. Doing it inline is orders of magnitude faster.

I did take up the challenge and start working on a faster line draw to see if I could beat what you had, but it's unlikely I'll be able to test it before the weekend. Also, I wasn't sure if I was restricted to the registers you used or if I have any others (one more would be very nice!). And finally, I didn't know how much ROM space was available.

No big deal if you can't use it, I'm having fun trying a new approach.

+Lee Stewart · June 24, 2014

Tursi said:

"VSBR has a lot more overhead than just the BLWP; it jumps around all over the place to do that single-byte read. In addition, its registers are in 8-bit RAM. Doing it inline is orders of magnitude faster.

"I did take up the challenge and start working on a faster line draw to see if I could beat what you had, but it's unlikely I'll be able to test it before the weekend. Also, I wasn't sure if I was restricted to the registers you used or if I have any others (one more would be very nice!). And finally, I didn't know how much ROM space was available.

"No big deal if you can't use it, I'm having fun trying a new approach."

Most fbForth words use the main fbForth workspace in scratchpad RAM at 8300h – 831Fh. R0 – R7 are expected to be free for any word to use in ALC. R11 (LINK) has its usual function as does R12 (CRU). The rest are committed to fbForth and should not be touched unless you know how and when the inner interpreter uses and needs them. You may have noticed that I have been using R12 with abandon when the CRU is not involved in a given fbForth word—KSCAN is using its own registers, so I can still use it. In a few cases, I have boxed myself in; but, most of the time, I'm OK. Because I am using the fbForth workspace for both LINE and DOT (called by LINE ), I have limited myself there. With inline code for VSBR and VSBW, I can probably get away with using the main workspace, which would be faster.

Regarding VSBR and VSBW (and all the other VDP access routines, for that matter), I am not using the E/A code (not that you implied such). I am using code based on what the original TI Forth programmers wrote. I haven't compared the code, so I can't really say much more than that, except that it does use 8-bit RAM for registers. There's very little room in scratchpad RAM for additional workspaces, though I may start using the space that includes FAC and ARG from 8348h – 836Dh (38 bytes) when it's not in use by DSRs or floating point math. There's also a 14-byte space at 8320h immediately following the main workspace, though not enough for a full workspace. As you can see, there are some options—and, probably more I haven't imagined yet.
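
The scratchpad arithmetic in the two paragraphs above is easy to check. The sketch below only restates the addresses given in this post. The helper names `reg_addr` and `fits_workspace` are hypothetical, and the only outside fact used is that a TMS9900 workspace is 16 two-byte registers.

```python
MAINWS = 0x8300     # main fbForth workspace, per the post
WS_BYTES = 16 * 2   # a TMS9900 workspace is 16 registers of 2 bytes each


def reg_addr(wp, n):
    """Address of register Rn for a workspace that starts at wp."""
    return wp + 2 * n


def fits_workspace(first, last):
    """True if the inclusive byte range first..last can hold a full workspace."""
    return last - first + 1 >= WS_BYTES


if __name__ == "__main__":
    # Main workspace: R0 at >8300, last byte of R15 at >831F.
    print(hex(reg_addr(MAINWS, 0)), hex(reg_addr(MAINWS, 15) + 1))
    # FAC/ARG region, >8348 through >836D: 38 bytes, enough for 16 registers.
    print(0x836D - 0x8348 + 1, fits_workspace(0x8348, 0x836D))
    # The 14-byte gap at >8320 can hold only 7 registers.
    print(fits_workspace(0x8320, 0x8320 + 14 - 1))
```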

...lee

Edited June 24, 2014 by Lee Stewart

+Lee Stewart · June 24, 2014

I added a VDP mode-changing word, VMODE , to change VDP modes by the mode number. It adds only 24 bytes, including a 14-byte jump table. This will allow restoration to a previous mode simply by saving and restoring the mode number.

I am still working on inlining VSBR and VSBW. Maybe I will get to posting the next beta later today.

...lee

Tursi · June 24, 2014

Odds are that your VDP utilities are much faster, then. I did in fact imply the EA versions... I was not sure. Thanks for the breakdown on Forth space... if I steal your idea to use FAC and ARG as a workspace, I think that will help my code (I'm already using all the registers you listed, including R12 and R11, for storing needed data, and I needed one more). The workspace switch should be worth the reduced work inside the loop.

No promises, don't wait for me, etc, etc, but it's a fun challenge and I didn't try drawing lines with the bitmap mode layout in mind before this.

+Lee Stewart · June 24, 2014

Tursi said:

"Odds are that your VDP utilities are much faster, then. I did in fact imply the EA versions... I was not sure. Thanks for the breakdown on Forth space... if I steal your idea to use FAC and ARG as a workspace, I think that will help my code (I'm already using all the registers you listed, including R12 and R11, for storing needed data, and I needed one more). The workspace switch should be worth the reduced work inside the loop.

"No promises, don't wait for me, etc, etc, but it's a fun challenge and I didn't try drawing lines with the bitmap mode layout in mind before this."

I just looked at the E/A code for VSBR and VSBW. TI Forth/fbForth's is a straight-up copy. It looks like the only savings will be avoiding the BLWP/RTWP, unless I can also avoid BL/RTs. We'll see.

One thing I might do, if simply inlining the code doesn't look promising, is to do a BLWP/RTWP for LINE , which would give me 13 registers between LINE and DOT . I want to avoid BLWP/RTWP for DOT because of the repeated calls to it.

...lee

+Lee Stewart · June 24, 2014

Wow! Inlining the code for VSBR and VSBW into DOT improved the time by only 20 seconds! It is now 32 seconds (down from 52), a 60+% improvement in speed. I guess that's something, but not exactly what I expected.

...lee

+Lee Stewart · June 25, 2014

Changing the code for LINE to use a context switch to its own registers starting at FAC and using only registers, except for a one-time instruction at the beginning, is 1 second faster (~3%) and 26 bytes less code (170 bytes). Recall that the routine I am timing is discussed above in post #712. I might add that it is plotting 85,807 dots and toggling the color at every plot. Here is the current code for LINE followed by the current code for DOT , with VSBR and two VSBWs inlined:

LINE

;[### LINE ***      ( x1 y1 x2 y2 --- )     ( alternative LINE---one or the other)
*++ This is an integer, no-divide version of the Bresenham line algorithm

* LINE does the following:
*     1) Computes dy = y2-y1 and dx =  x2-x1
*     2) Determines which direction, x or y, has slope <= 1
*         x) Flips dx and dy
*         y) Leaves dx and dy alone
*     3) sets DOTCNT = dx in R4
*     4) Computes D = 2*dy-dx
*     5) Forces plotting direction to be positive for independent variable
*     6) Sets starting y|x accumulator as acc = (y|x)
*     7) Finds accumulator increment as inc = +1|-1
*     8) Plots first dot
*     9) Each time through dot plotting loop:
*         a) Loop counter check
*         b) x|y = x|y + 1
*         c) D > 0?
*             yes)
*                 y1) acc = acc + inc
*                 y2) D = D+2*(dy-dx)
*             no) D = D+2*dy
*         d) y|x = acc
*         e) Plot dot
*         f) Decrement point counter

*        DATA DTBM_N
* LINE_N DATA 4+TERMBT*LSHFT8+'L','IN','E '+TERMBT
* LINE   DATA $+2
* LINEP  BL   @BLF2A
*        DATA _LINE->6000+BANK1

* Register usage---
*       R0:  varies
*       R1:  varies
*       R2:  y2
*       R3:  x2
*       R4:  y1, then, point (dot) count for line (DOTCNT)
*       R5:  x1, then, increment for dependent coordinate (INC) (+1|-1)
*       R6:  accumulator for dependent coordinate (ACC)
*       R7:  current independent coordinate       (COORD)
*       R8:  dx, then, 2*dx
*       R9:  dy, then, 2*dy
*      R10:  sign of dy/dx or dx/dy, then, D
*      R12:  contains flag for principal axis (1 = x axis, 0 = y axis)

_LINE  
STPTR  EQU  MAINWS+18       ; stack pointer (fbForth's R9)
       LWPI FAC             ; let's use our ws
       MOV  @STPTR,R0       ; get stack pointer to R0
       MOV  *R0+,R2         ; pop coordinates (won't actually change Forth SP)
       MOV  *R0+,R3
       MOV  *R0+,R4
       MOV  *R0+,R5
       SETO R10             ; initially, store -1 as sign of slope
       MOV  R2,R0           ; calculate dy
       S    R4,R0
       MOV  R0,R1           ; prepare for sign calculation
       ABS  R0
       MOV  R0,R9
       MOV  R3,R0           ; calculate dx
       S    R5,R0
       XOR  R0,R1           ; calculate sign of slope (dy/dx|dx/dy)
       JLT  LINE01          ; negative slope?
       NEG  R10             ; change sign to +1
LINE01 ABS  R0
       MOV  R0,R8
       MOV  R9,R1
       C    R1,R0           ; compare |dy| to |dx|
       JLT  LINE04          ; dy < dx?
       MOV  R0,R9           ; no, flip dy
       MOV  R1,R8           ;        and dx
       MOV  R4,R7           ; assume starting with y1
       MOV  R5,R6           ;   and x1 (to ACC)
       C    R4,R2           ; should we switch?
       JGT  LINE02          ; yes
       JMP  LINE03          ; no
LINE02 MOV  R2,R7           ; we're starting with y2
       MOV  R3,R6           ;   and x2 (to ACC)
LINE03 CLR  CRU             ; 0 to CRU (R12) to indicate y-axis processing
       JMP  LINE07
LINE04 MOV  R5,R7           ; assume starting with x1
       MOV  R4,R6           ;   and y1 (to ACC)
       C    R5,R3           ; should we switch?
       JGT  LINE05          ; yes
       JMP  LINE06          ; no
LINE05 MOV  R3,R7           ; we're starting with x2
       MOV  R2,R6           ;   and y2 (to ACC)
LINE06 LI   CRU,1           ; 1 to CRU (R12) to indicate x-axis processing
LINE07 MOV  R10,R5          ; get sign to INC register before we destroy it!
       SLA  R9,1            ; dy = 2*dy (we don't need dy by itself any more)
       MOV  R9,R0           ; calculate D
       S    R8,R0           ; D = 2*dy-dx
       MOV  R0,R10          ; store D in R10 (was DYXSN)
       MOV  R8,R4           ; load point counter
       SLA  R8,1            ; 2*dx (we don't need dx by itself any more)
       MOV  CRU,CRU         ; x or y axis?
       JNE  LINE08          ; x-axis
       MOV  R7,R0           ; y-axis, COORD to y for DOT
       MOV  R6,R1           ; ACC to x for DOT
       JMP  LNLOOP          ; to first plot
LINE08 MOV  R7,R1           ; x-axis, COORD to x for DOT
       MOV  R6,R0           ; ACC to y for DOT
LNLOOP BL   @__DTBM         ; plot first dot (R0 = y, R1 = x)
       MOV  R4,R4           ; are we done?
       JEQ  LINEX           ; yup!
       DEC  R4              ; decrement counter
       INC  R7              ; increment principal coordinate
*++ Calculate D
       MOV  R9,R1           ; get 2*dy
       MOV  R10,R0          ; D > 0?
       JGT  LINE09          ; yup
       JMP  LINE10          ; nope
LINE09 A    R5,R6           ; inc/dec dependent variable
       S    R8,R1           ; 2*dy-2*dx
LINE10 A    R1,R10          ; D = D+[2*dy or 2*dy-2*dx]
       MOV  CRU,CRU         ; x-axis or y-axis?
       JEQ  LNYAX           ; y-axis
       MOV  R7,R1           ; x-axis, get next x for DOT
       MOV  R6,R0           ; get accumulator contents to y for DOT
       JMP  LNLOOP          ; go to plot
LNYAX  MOV  R7,R0           ; y-axis, get next y for DOT
       MOV  R6,R1           ; get accumulator contents to x for DOT
       JMP  LNLOOP          ; plot the dot (R0 = y, R1 = x) & on to next point
LINEX  LWPI MAINWS          ; RESTORE MAIN WS
       AI   SP,8            ; REDUCE STACK BY 4 CELLS
       BL   @RTNEXT         ; back to bank 0 and the inner interpreter
;]

DOT

;[### DOT ***       ( x y --- )
*       Plot a dot at dotcolumn x and dotrow y in bitmap mode.

*        DATA DTOG_N
* DTBM_N DATA 3+TERMBT*LSHFT8+'D','OT'+TERMBT
* DTBM   DATA $+2
* DTBMP  BL   @BLF2A
*        DATA _DTBM->6000+BANK1

_DTBM   
*++ get bit-set byte and vaddr to stack (from old DDOT code)
       MOV  *SP+,R0             ; pop y from stack
       MOV  *SP+,R1             ; pop x from stack
       BL   @__DTBM             ; branch to body of this routine
       BL   @RTNEXT

*++ body of DOT routine to allow call by other routines in this bank
*++ Registers passed must contain
*++     R0: y coordinate
*++     R1: x coordinate
                            
                            
__DTBM MOV  R0,R2               ; y to R2
       MOV  R1,R3               ; x to R3
       ANDI R0,>0007            ; R0 = 3 right bits of y = char dotrow
       ANDI R1,>0007            ; R1 = 3 right bits of x = char dotcolumn
       ANDI R2,>00F8            ; R2 = 5 left bits of y = start dot row of char pattern
       ANDI R3,>00F8            ; R3 = 5 left bits of x = start dot column of char pattern
       SLA  R2,>0005            ; R2 * 32 = PDT offset of char pattern row's 1st row
       A    R2,R0               ; R0 = PDT offset of char pattern's row byte
       A    R3,R0               ; R0 = PDT offset of char pattern byte of dot
       AI   R0,>2000            ; convert to actual location in VRAM (vaddr) of pattern byte 
       CLR  R2
       MOVB @DTAB(R1),R2        ; bit mask (b) of dot to high byte of R2
*++ beginning of old DOT code       
       CLR  R1
       LIMI 0

*++ vsbr --- get byte to operate on to high byte of R1
       STWP R3                  ; get this ws address
       MOVB @1(R3),@VDPWA       ; Write low byte of address
       MOVB R0,@VDPWA           ; Write high byte of address
       MOVB @VDPRD,R1           ; Read data
       
       MOV  @$DMODE,R3          ; get DMODE
       DEC  R3                  ; make it -1, 0 or +1
       JEQ  DUNDR               ; undraw?
       JLT  DRW                 ; draw?
       XOR  R2,R1               ; toggle bit to be drawn|undrawn
       JMP  DOTCOL
DUNDR  SZC  R2,R1               ; clear bit to be undrawn
       JMP  DOTCOL
DRW    SOC  R2,R1               ; set bit to be drawn

*++ vsbw --- write result back to VRAM <<< ensure R0 preserved!!! >>>
DOTCOL STWP R3                  ; get this ws address (address of r0)
       MOVB @1(R3),@VDPWA       ; Write low byte of address
       ORI  R0,>4000            ; Properly adjust VDP write bit
       MOVB R0,@VDPWA           ; Write high byte of address
       ANDI R0,>3FFF            ; restore R0
       MOVB R1,@VDPWD           ; Write data

       MOVB @$DCOL+1,R1         ; dcolor to high byte (low byte should still = 0)
       JLT  DOTEX
       S    @H2000,R0           ; adjust to point to color table <<< ensure R0 preserved!!! >>>

*++ vsbw --- write new colors to color table
       MOVB @1(R3),@VDPWA       ; Write low byte of address (R3 should still have address of R0)
       ORI  R0,>4000            ; Properly adjust VDP write bit
       MOVB R0,@VDPWA           ; Write high byte of address
       MOVB R1,@VDPWD           ; Write data

DOTEX  LIMI 2
       RT

* : DOT ( x y --- )                                               
*   DDOT DUP 2000 - >R DMODE @                          ( PS: b vaddr dmode RS: vaddr-2000)
*   CASE  0 OF VOR  ENDOF                   ( draw )    ( PS: RS: vaddr-2000)
*         1 OF SWAP FF XOR SWAP VAND ENDOF  ( undraw )  ( PS: b.xor.FFh vaddr RS: vaddr-2000)
*         2 OF VXOR ENDOF                   ( toggle )                             
*   DROP DROP ENDCASE R>                                          
*   DCOLOR @ 0 < IF DROP ELSE DCOLOR @ SWAP VSBW THEN ;          
;]*
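
The inlined VSBR and VSBW sequences above follow the VDP's usual address setup: low address byte first, then the high byte, with >40 ORed into the high byte for a write. Here is a toy Python model of that handshake. It is a sketch only: the class `Vdp` and the helpers `vsbw` and `vsbr` are illustrative names, and the model ignores the real chip's read prefetch, register writes and timing.

```python
class Vdp:
    """Toy model of the VDP address port and 16 KB of VRAM."""

    def __init__(self):
        self.vram = bytearray(0x4000)
        self.addr = 0
        self.latch = None                 # pending low address byte, if any

    def write_address_port(self, byte):   # stands in for MOVB ...,@VDPWA
        if self.latch is None:
            self.latch = byte & 0xFF      # first byte: low 8 address bits
        else:
            # Second byte: low 6 bits are the high address bits; bit >40
            # marks a write setup and is not part of the address.
            self.addr = ((byte & 0x3F) << 8) | self.latch
            self.latch = None

    def write_data(self, byte):           # stands in for MOVB R1,@VDPWD
        self.vram[self.addr] = byte & 0xFF
        self.addr = (self.addr + 1) & 0x3FFF

    def read_data(self):                  # stands in for MOVB @VDPRD,R1
        value = self.vram[self.addr]
        self.addr = (self.addr + 1) & 0x3FFF
        return value


def vsbw(vdp, vaddr, byte):
    """Single-byte write: low byte, then high byte with >40 set, then data."""
    vdp.write_address_port(vaddr & 0xFF)
    vdp.write_address_port((vaddr >> 8) | 0x40)
    vdp.write_data(byte)


def vsbr(vdp, vaddr):
    """Single-byte read: low byte, then high byte as-is, then read data."""
    vdp.write_address_port(vaddr & 0xFF)
    vdp.write_address_port(vaddr >> 8)
    return vdp.read_data()


if __name__ == "__main__":
    vdp = Vdp()
    vsbw(vdp, 0x210A, 0x40)
    print(hex(vsbr(vdp, 0x210A)))
```

In the listing, the ORI R0,>4000 / ANDI R0,>3FFF pair does the same job as the `| 0x40` here while leaving R0 usable for the color table write that follows.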

Note that LINE branches at line 104 (BL @__DTBM at label LNLOOP) into DOT at its line 23 (label __DTBM).

In a little while, just for shits and giggles, I'm going to time the same routine using Forth code inherited from TI Forth. I expect it to be miserably slow—we'll see.

...lee

+Lee Stewart · June 25, 2014

OMG—this is just wrong! :-o It takes the Forth source coded graphics primitives 347 seconds to do the plot—that's more than 11 times slower! I am very glad I took the time to rewrite all of that code.
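
To put the timings quoted in this thread side by side, here is a small Python sketch that converts each one to time per dot. It assumes all five figures refer to the same 85,807-dot test in Classic99. The 31-second entry is derived from "1 second faster" than 32 seconds, and the per-dot figures include whatever loop overhead the test itself adds, so they are upper bounds for LINE and DOT alone.

```python
DOTS = 85_807  # dots plotted by the test, per the June 25 post

# Elapsed seconds quoted in the thread (31 is derived: 32 minus 1).
timings = {
    "Bresenham LINE, BLWP VSBR/VSBW": 52,
    "TI version of LINE": 49,
    "Bresenham LINE, VSBR/VSBW inlined": 32,
    "LINE in its own workspace": 31,
    "Forth-source primitives": 347,
}


def per_dot_us(seconds, dots=DOTS):
    """Average microseconds per plotted dot."""
    return seconds / dots * 1_000_000


for name, secs in timings.items():
    print(f"{name:36s} {secs:4d} s  {per_dot_us(secs):7.1f} us/dot")

slowdown = timings["Forth-source primitives"] / timings["LINE in its own workspace"]
print(f"Forth source vs. fastest ALC: {slowdown:.1f}x")
```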

I think I will post the next beta later tonight, but no later than tomorrow morning.

...lee
