
20 minutes ago, apersson850 said:

The drawback of the flexible stack method is that the TMS 9900 doesn't have "deferred indirect". If it did, you could have returned with a mechanism like B @*SP+, which should be read as branch to the address stored in the memory position pointed to by the stack pointer and increment the stack pointer by two. Such instructions do exist, but typically in 32-bit architectures.

We need to do as BF did above, MOV *SP+,R11 and then B *R11.

<off topic>

The 6809 can do something like this, which makes direct threaded code, as used in the BASIC compiler, very efficient.


The 6809 "thread interpreter" (NEXT) was one instruction, 9 clocks.


From Moving Forth by Brad Rodriguez




versus something like this on the 9900:

         MOV  *IP+,R5     22
         B    *R5         12


</off topic>


2 hours ago, apersson850 said:

Most simpler processors that can indeed do that trick usually have it as some oddball instruction. The double indirection isn't another general addressing mode, as it was where I saw it first.

It's a feature of the 6809, if I understand this passage correctly. It is from:

MC6809-MC6809E 8-Bit Microprocessor Programming Manual [M6809PM/AD]
© Motorola Inc., 1981

The indexed addressing modes have been expanded to include:
0-, 5-, 8-, 16-bit constant offsets,
8- or 16-bit accumulator offsets,
autoincrement/decrement (stack operation).

In addition, most indexed addressing modes may have an additional level of indirection added.



4 hours ago, TheBF said:

There is a way around that. It is done this way on the ARM processor and other modern machines as well.

You allocate one register to be a stack pointer.


Then your call sequence is as @apersson850  showed previously.

RP   EQU  R10

* call *
     DECT RP              10
     MOV  R11,*RP         18
     BL   @ABCD           20

     MOV  *RP+,R11        22
     B    *R11            12


If you use a macro assembler, each of these sequences can be turned into a single line.





Nope. It's not faster than BLWP/RTWP, but instead of needing a workspace for every sub-routine you just reserve a few bytes as your return stack.

With 20 bytes you can nest sub-routines 10 deep.
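As a rough sketch of what reserving such a return stack could look like: the labels RSTACK and RSTEND are made up for this illustration and do not come from the thread, RP is the stack pointer register defined above, and the program is assumed to be loaded in RAM so that BSS reserves writable memory.

```asm
RSTACK BSS  20            * hypothetical: 20 bytes = room for 10 return addresses
RSTEND EQU  $             * hypothetical: first address past the stack area

       LI   RP,RSTEND     * the stack grows downward, so start at its end
```

Each DECT RP / MOV R11,*RP pair then uses two bytes, so an eleventh nested call would write below RSTACK. Nothing in this scheme checks for that.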


If you make macros for PUSH and POP, you can also use the return stack for temp storage anytime you need it.


       PUSH R1
       PUSH R2
       LI   R1,>1234
       LI   R2,>5678
       A    R1,R2
       MOV  R2,@TOTAL
       POP  R2
       POP  R1


With a stack you are using a small memory space over and over for multiple purposes. 
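The PUSH and POP macros are not spelled out here, and macro definition syntax differs between assemblers, so this sketch only shows what each one would presumably expand to, assuming RP is the return stack pointer from the call sequence above:

```asm
* PUSH R1 would expand to:
       DECT RP            * RP = RP - 2, making room for one word
       MOV  R1,*RP        * store R1 at the new top of the stack

* POP R1 would expand to:
       MOV  *RP+,R1       * load the top word into R1, then RP = RP + 2
```

Because the stack is last-in, first-out, registers have to be popped in the reverse order of the pushes, which is why the example pushes R1 then R2 and pops R2 then R1.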


Hmmm, are we talking about the TI-99/4A CPU or an ARM CPU? As far as I know, they do not use the same code.


44 minutes ago, RXB said:

Hmmm, are we talking about the TI-99/4A CPU or an ARM CPU? As far as I know, they do not use the same code.

No, of course not. But many of the modern RISC machines take the same approach as the 9900.

They allow the programmer to decide how to implement a stack with a general purpose register. 



45 minutes ago, Asmusr said:

You're saving the value of R11 before you branch? How does that work at the top level?

At the top level you don't have anything to return to. You are the master, literally.


@TheBF My comment about various 8-bit processors having various useful instructions was supposed to say that these processors typically have a large, but not very uniform, set of instructions and addressing modes, with a lot of "this goes only with that" and "that goes only with this". The approach of the TMS 9900 (or really the minicomputer it's supposed to implement) is more the opposite: not too many instructions, but instead you have the general addressing modes, which can be used with many instructions, and sometimes twice (as for Add, Subtract, MOVe and so on), without restrictions. Which register you use with most of these addressing modes is also almost entirely up to you.

If you need a stack, use a register as the stack pointer. If you need two, then pick another register too. Or yet another, if you need three. Ran out of registers completely, in spite of having 16 of them? Just create a new set.

So although execution speed is pretty slow, the TMS 9900 architecture does provide the programmer with an environment that's pretty simple to use. In spite of supporting 69 instructions, the approach with many of them is still RISC-inspired, in that a two-operand instruction can have five addressing modes per operand, which in effect gives 25 different combinations. And then you can specify that the operands are bytes or words, so you kind of have 50 versions of each such instruction.
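To make the count concrete, here is one instruction, MOV, written with each of the five general addressing modes on the source side. TABLE is a made-up label for this sketch. Any of the five forms is equally valid as the destination, which is where the 5 x 5 = 25 combinations come from, and MOVB doubles that to 50.

```asm
       MOV  R1,R2            * workspace register
       MOV  *R1,R2           * register indirect
       MOV  *R1+,R2          * register indirect with auto-increment
       MOV  @TABLE,R2        * symbolic (direct) memory address
       MOV  @TABLE(R1),R2    * indexed: TABLE plus the contents of R1
       MOVB @TABLE(R1),*R2+  * byte version, with the modes mixed freely
```

One detail to keep in mind: R0 cannot serve as the index register, because that encoding is the one used for the plain symbolic form.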

The obvious deviation is the immediate mode. That's more like 8-bit style, in that it's a special instruction and can only use a register.

You have to do LI R3,>ABDC and then MOV R3,@TABLE(R6). You can't do MOV #>ABDC,@TABLE(R6) as a single instruction. This is the price you have to pay for the six instructions with two general addressing modes and byte/word capability. They eat up a large portion of the available opcodes in the CPU.


We have said it before here, and now I'm coming back to the core question (BL vs. BLWP): when programming the TMS 9900, the general rule is that if you can do something with fewer instructions, it's more efficient. Although the most complicated instructions consume quite a lot of CPU cycles, the simplest ones use a somewhat disproportionate number of clock cycles too. So even if MPY takes a lot of time to execute, doing something equivalent with shift and add instructions typically takes even more. The same goes for the BLWP/RTWP pair compared to BL/B. The first is slower to execute, but with BL you only need to do a few extra maneuvers to save some data (R11 or some other register), which you wouldn't have to do if you used BLWP, for the two to come out even. Then add that your programming task may be easier with a fresh set of registers in a subroutine, and the choice is obvious.
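To put rough numbers on that, here are base cycle counts for the three calling styles discussed in this thread, in the same notation as the earlier listings. The figures for BLWP and RTWP come from the TMS 9900 data manual's timing table rather than from this thread, and all of them ignore the extra wait states the 99/4A adds for most memory outside the console ROM and scratchpad, so treat them as an approximation. SUB, SUBVEC and RP are placeholder names, and these are separate fragments, not one program.

```asm
* 1) Plain BL to a leaf subroutine
       BL   @SUB            20
*        (SUB returns with B *R11: 12)      total 32

* 2) The same call from a routine that must preserve its own R11
       DECT RP              10
       MOV  R11,*RP         18
       BL   @SUB            20
*        (SUB returns with B *R11: 12)
       MOV  *RP+,R11        22              total 82

* 3) BLWP into a fresh workspace
       BLWP @SUBVEC         34
*        (subroutine returns with RTWP: 14) total 48
```

Saving R11 in a spare register instead of on a stack (MOV R11,R10, 14 cycles, with no restore needed if the routine later returns through R10) brings the second case down to 46, which is about even with BLWP/RTWP.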


It's different if you are writing code that's supposed to execute in a console only, from ROM in a cartridge. Then you have only those 256 bytes of RAM to play with. Using another 32 bytes for a register file may be less optimal in such a case.


2 hours ago, apersson850 said:

At the top level you don't have anything to return to. You are the master, literally.

I mean when you use * call * at the top level. R11 is undefined and you put that on the stack. When you return from the second level the value on the stack is undefined.

Edited by Asmusr

No. When you call the subroutine, you can save R11, which at that time doesn't contain anything relevant, on the stack. Then you do the call and R11 gets updated with the return address. Now if your subroutine calls another subroutine, it will save R11 prior to that call. At this time R11 contains the return address to the top level, so it's relevant.

When returning from the second level, R11 has the return address already. No stack pop needed to get that.

When returning from the first level, you pop the return address from the stack and return correctly.

Now you come back to the top level, where you should pop the stack for no other reason but to get rid of the unnecessary value there. Or you can let it stay, but then you have consumed one stack level for no good reason.


A better practice is to do a simple BL from top level and only do calls with return address push on the lower levels.
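A minimal sketch of that practice, assuming RP is a register already set up as a return stack pointer as in the earlier posts. TOP, LEVEL1 and LEVEL2 are made-up labels.

```asm
TOP    BL   @LEVEL1       * top level: plain BL, R11 holds nothing worth saving
*      ... rest of the top level (must not fall through into LEVEL1) ...

LEVEL1 DECT RP            * about to call further down, so push the
       MOV  R11,*RP       *   return address into TOP first
       BL   @LEVEL2       * R11 now holds the return address into LEVEL1
       MOV  *RP+,R11      * pop the return address into TOP
       B    *R11          * return to TOP

LEVEL2 B    *R11          * leaf routine: R11 is still intact, just return
```

Only LEVEL1 touches the stack. The top level and the leaf routine each pay for nothing beyond the BL and the B *R11.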


None of this changes what I said.

BL makes you run out of registers pretty fast, and thus you are forced to use memory addresses in scratchpad instead.

As I have been stuck with console only and scratchpad, with only the GPL registers to work with, I am forced to use MOV instructions from memory to register or register to memory so I can free up that register.

Or use a stack, but the stack is not in scratchpad, so it uses slower VDP memory.

If you guys programmed in console only, you would see BL sucks compared to having more registers with BLWP.

Look over the XB ROMs or the console ROM and tell me I am wrong.


This is why you guys just ADDED A CPU to take the place of the 9900!


Oh yes it does.

It's already been shown that you don't run out of registers that fast, since a large number of subroutine levels is pretty uncommon. Two or three registers occupied caters for a lot of situations.

On the other hand, if you are programming console only, then you'll run out of suitable memory for new register files, needed by the BLWP concept, literally before you start. The RAM available is almost all used up already, if you want to preserve the GPL environment.

The use of BLWP on the 99/4A only makes sense when you have some kind of RAM expansion, in almost all cases.

A lot of my own assembly was written as support for Pascal programs, which of course implies that the memory expansion is available, or they would not run. I've used BLWP inside some of my own programs, typically when a subroutine needs to reference a few parameters from the caller but also requires several registers of its own to be efficient.

No, we haven't replaced the TMS 9900 either. Where did you get that from?


@Asmusr Even at the top level, R11 is usually relevant. At least my own assembly programs have almost always been support for some high-level language, frequently Pascal, sometimes Extended BASIC. In both cases you want to be able to return to the caller, and then R11 is your ticket there. Even if you write a standalone assembly program, you may want to be able to return to the operating system, i.e. the GPL code that invoked your program. In that case you may want to push your return link onto the stack as the first thing you do, then do all calls from your top level with BL, not the CALL macro, and eventually leave your program via a link popped off the stack: the one you saved at the beginning.
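As a sketch of that last pattern, assuming the program was entered with BL-style linkage so that R11 holds a valid return address, and that it keeps running in the workspace it was entered with. START, WORK and STKTOP are hypothetical names, and loading RP overwrites whatever the invoking code had in that register.

```asm
START  LI   RP,STKTOP     * hypothetical label: top of a reserved stack area
       DECT RP            * save the link back to whoever invoked us
       MOV  R11,*RP
       BL   @WORK         * top-level calls are plain BL
       BL   @WORK
       MOV  *RP+,R11      * pop the link saved at the beginning
       B    *R11          * and leave the program through it

WORK   B    *R11          * placeholder subroutine
```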

Edited by apersson850

I think there is something to be said (when RAM expansion is present) for the usefulness of BLWP for building reusable code modules that can be used anywhere. With parameters passed via labeled memory locations, a fresh workspace and the automatic storage of the previous workspace pointer and return address, just plunk in the code module and BLWP it without concern for trashing previous register contents or worry over which workspace the calling routine was using.
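A rough sketch of such a module, with all names (MODWS, MODVEC, MODENT, PARAM1, RESULT) invented for the illustration. It assumes the code and its workspace sit in RAM, for example in the memory expansion.

```asm
* --- the reusable module ---
MODWS  BSS  32             * the module's private workspace (16 registers)
PARAM1 DATA 0              * input, written by the caller
RESULT DATA 0              * output, read by the caller
MODVEC DATA MODWS,MODENT   * BLWP vector: new workspace, entry address

MODENT MOV  @PARAM1,R0     * R0-R12 here are the module's own registers
       A    R0,R0          * placeholder work: double the parameter
       MOV  R0,@RESULT
       RTWP                * restore the caller's WP, PC and status

* --- any caller, in any workspace ---
       MOV  R5,@PARAM1     * pass the parameter through memory
       BLWP @MODVEC        * the caller's registers are left untouched
       MOV  @RESULT,R5     * pick up the result
```

The price is the 32 bytes for MODWS, which is why this fits better when RAM expansion is present.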


Mostly I use BL.  But now and then...


I thought I would show the ROM3 that Lee Stewart helped me with.

       AORG >6000
       TITL 'RXB ROM3'
BIAS   EQU  >6000
PAD    EQU  >8300       * TEMP
PAD1   EQU  >8301       * TEMP
PAD2   EQU  >8302       * VDP ADDRESS
PAD3   EQU  >8303       * TEMP
PAD4   EQU  >8304       * TEMP
PAD5   EQU  >8305       * TEMP
PAD8   EQU  >8308       * SPRITE 1
PADA   EQU  >830A       * SPRITE 2
FAC    EQU  >834A       * RAM line buffer
FAC1   EQU  >834B       * GCHAR buffer
FAC4   EQU  >834E       * String Address
FAC6   EQU  >8350       * String Length
GR0LB  EQU  >83E1       * GPLWS R0 LSB
GR4LB  EQU  >83E9       * GPLWS R4 LSB
VDPRD  EQU  >8800       * VDP Read Data address
VDPWD  EQU  >8C00       * VDP Write Data address
VBUFF  EQU  >03C0       * line buffer in VRAM
UNUSED DATA >0000,>0000,>0000,>0000
       DATA >0000,>0000,>0000,>0000
* XML table number 7 for RXB ROM3 - must have                    *
*     its origin at >6010                                        *
*             0     1     2     3     4     5     6      7   
*             8    9      A    B    C     D     E     F
* XML table number 8 for RXB ROM3 - must have                   *
*     its origin at >6030                                       *
*             0      1     2     3     4     5     6     7
       DATA COLLSP,>0000,>0000,>0000,>0000,>0000,>0000,>0000
*             8     9     A     B     C      D       E     F
       DATA >0000,>0000,EAINIT,CINIT,XISRON,XISROF,>0000,>0000
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* Write VRAM address
*     Expects address in R0
* BL here for writing data
VWADD  ORI  R0,>4000    * set to write VRAM data
* BL here for reading data
VWADDA MOVB @GR0LB,*R15 * write LSB of R0 to VDPWA
       MOVB R0,*R15     * write MSB of R0 to VDPWA
       ANDI R0,>3FFF    * ensure R0 returned intact
       B    *R11        * return to caller
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* The following utilities expect 
*     R0 = VRAM address of row
*     R1 = RAM buffer address
* R2 and R10 will be destroyed
* Copy 1 row of 32 bytes from VDP (R0) to RAM (R1)
VRROW  MOV  R11,R10   * save return
       BL   @VWADDA   * write out VDP read address
       LI   R2,32     * read 1 row
       LI   R8,VDPRD  * Register faster than address
VRROW1 MOVB *R8,*R1+  * read next VDP byte to RAM
       MOVB *R8,*R1+  * read next VDP byte to RAM
       DECT R2        * dec count by 2
       JNE  VRROW1    * repeat if not done
       B    *R10      * return to caller
* Copy 1 row of 32 bytes from RAM (R1) to VDP (R0)
VWROW  MOV  R11,R10   * save return
       BL   @VWADD    * write out VDP write address
       LI   R2,32     * write one row
       LI   R8,VDPWD  * Register faster than address
VWROW1 MOVB *R1+,*R8  * write next VDP byte from RAM
       MOVB *R1+,*R8  * write next VDP byte from RAM
       DECT R2        * dec count by 2
       JNE  VWROW1    * repeat if not done
       B    *R10      * return to caller
* CALL ROLLRIGHT(repetition)                            *
RROLL  MOV  R11,R9    * save return address
       CLR  R0        * set to screen start
       LI   R3,24     * rows to roll
* Write row to RAM buffer
RROLLP LI   R1,FAC+1  * RAM buffer+1 for roll-right positions
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy last column before first in RAM buffer
       MOVB @FAC+32,@FAC * copy roll-out byte to roll-in position
* Copy rolled row back to screen (R0 still has correct location)
       LI   R1,FAC    * reset RAM buffer pointer
       BL   @VWROW    * copy rolled line          
* Process next row
       AI   R0,32     * next row                   
       DEC  R3        * dec row count
       JNE  RROLLP    * roll next row if not done
       B    @PAGER     * return to XB
* CALL ROLLLEFT(repetition)                             *
LROLL  MOV  R11,R9    * save return address
       CLR  R0        * set to screen start
       LI   R3,24     * rows to roll
* Write row to RAM buffer
LROLLP LI   R1,FAC    * RAM buffer start for roll-left positions
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy first column after last in RAM buffer
       MOVB @FAC,@FAC+32 * copy roll-out byte to roll-in position
* Copy rolled row back to screen (R0 still has correct location)
       LI   R1,FAC+1  * reset RAM buffer pointer
       BL   @VWROW    * copy rolled line 2 bytes at a time
* Process next row
       AI   R0,32     * next row                   
       DEC  R3        * dec row count
       JNE  LROLLP    * roll next row if not done
       B    @PAGER     * return to XB
* CALL ROLLUP(repetition)                               *
UROLL  MOV  R11,R9    * save return address
       CLR  R0        * set to screen start
       LI   R3,23     * rows to roll (all but 1st)
* Write first row to RAM buffer
       LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy RAM buffer to VRAM buffer
       LI   R0,VBUFF  * set VRAM dest to VBUFF
       LI   R1,FAC    * set RAM buffer
       BL   @VWROW    * copy row to VBUFF 2 bytes at a time
* Start copy loop at 2nd row
       LI   R0,32     * point to 2nd row
* Write row to RAM buffer
UROLLP LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy to previous row
       AI   R0,-32    * back up 1 row
       LI   R1,FAC    * reset RAM buffer pointer
       BL   @VWROW    * copy to previous row 2 bytes at a time
* Process next row
       AI   R0,64     * next row
       DEC  R3        * dec row count
       JNE  UROLLP    * roll next row if not done
* Copy saved row to RAM
       LI   R0,VBUFF  * set VRAM source
       LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy saved row to last row
       LI   R0,736    * point to last row
       LI   R1,FAC    * reset RAM buffer pointer
       BL   @VWROW    * copy to last row 2 bytes at a time
       B    @PAGER     * return to XB
* CALL ROLLDOWN(repetition)                             *
DROLL  MOV  R11,R9    * save return address
       LI   R0,736    * set to last row
       LI   R3,23     * rows to roll (all but last)
* Write last row to RAM buffer
       LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy RAM buffer to VRAM buffer
       LI   R0,VBUFF  * set VRAM dest to VBUFF
       LI   R1,FAC    * set RAM buffer
       BL   @VWROW    * copy row to VBUFF 2 bytes at a time
* Start copy loop at 2nd-to-last row
       LI   R0,704    *  point to row 22
* Write row to RAM buffer
DROLLP LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 2 bytes at a time
* Copy to next row
       AI   R0,32     * down 1 row
       LI   R1,FAC    * reset RAM buffer pointer
       BL   @VWROW    * copy to next row 2 bytes at a time
* Process next row
       AI   R0,-64    * back up 2 rows
       DEC  R3        * dec row count
       JNE  DROLLP    * roll next row if not done
* Copy saved row to RAM
       LI   R0,VBUFF  * set VRAM source
       LI   R1,FAC    * set RAM buffer
       BL   @VRROW    * copy row to RAM buffer 
*                     *  2 bytes at a time
* Copy saved row to first row
       CLR  R0        * point to first row
       LI   R1,FAC    * reset RAM buffer pointer
       BL   @VWROW    * copy to first row 2 bytes at a time
       B    @PAGER    * return to caller
* CALL HCHAR(row,column,character#,repetition[,...])      *
* R3 COUNTER     - FAC
HCHAR  MOV  R11,R9      * save return address
       LI   R8,VDPWD    * put VDPWD in R8 for faster loop
       MOV  @PAD2,R0    * VRAM start address for HCHAR4
       MOV  @PAD,R1     * ASCII char code is in MSB
       MOV  @FAC,R3     * repetition to R3..
       MOV  R3,R7       * .. and to R7 for manipulation
       LI   R5,768      * get screen end = 768 to a register..
       MOV  R5,R6       * ..and to R6 for screen size
       C    R6,R7       * scrn_size > cnt, i.e., cnt OK?
       JGT  HCHAR1      * yes; jump
       MOV  R6,R7       * no; cnt = scrn_size
HCHAR1 C    R0,R5       * VRAM address outside screen?
       JHE  HCHARX      * error if so..just exit
       S    R0,R5       * bytes to end of screen
HCHAR2 MOV  R7,R3       * put cnt in R3 for HCHAR4
       JGT  HCHAR3      * are we done?
       JMP  HCHARX      * yup; we're outta here!
HCHAR3 S    R5,R7       * no; do we wrap to screen start?
       JLT  HCHAR4      * no
       MOV  R5,R3       * yes, just go to screen end
HCHAR4 BL   @VWADD      * write out VRAM write address
       LI   R8,VDPWD    * put VDPWD in R8 for faster loop
HCHAR5 MOVB R1,*R8      * Write a byte to next VRAM location
       DEC  R3          * decrement count
       JNE  HCHAR5      * Not done, fill another
       CLR  R0          * wrap for next round
       MOV  R6,R5       * scrn_size to bytes-to-end-of-screen
       JMP  HCHAR2      * see if more
HCHARX B    @PAGER      * return to caller
* CALL VCHAR(row,column,character#,repetition[,...])      *
* CALL VCHAR(row,column,character#,repetition)
* R3 COUNTER     = FAC
VCHAR   MOV  R11,R9     * save return address
        MOV  @PAD2,R0   * VDP ADDRESS 
        MOV  R0,R7      * Copy VDP ADDRESS
VCHART  CI   R7,31      * VDP ADDRESS>=31 top?
        JLE  VCHARD     * column<=31 top found
        AI   R7,-32     * VDP ADDRESS-32
        JMP  VCHART     * Loop
VCHARD  CI   R7,31      * column=31?
        JNE  VCHARR     * 0 to 30
        CLR  R7         * Reset column to 0
        JMP  VCHARZ
VCHARR  INC  R7         * column+1
VCHARZ  MOV  @PAD,R1    * Character to display 
        MOV  @FAC,R3    * Repetition  
VCHAR1  BL   @VWADD     * write out VRAM write address
        LI   R8,VDPWD   * Register faster than @
        MOVB R1,*R8     * write next VRAM byte from R1
        CI   R0,768     * End of screen?
        JEQ  VCHARE     * Yes
        CI   R0,735     * next to last ROW?
        JLE  VCHAR3     * Yes
        INC  R7         * column+1 
        CI   R7,31      * Next row past last column? 
        JLE  VCHAR2     * No
        CLR  R7         * Wrap Column back
VCHAR2  DEC  R3         * repetition-1
        JNE  VCHAR1     * Not done yet
        JMP  VCHAR4     * Exit
VCHAR3  AI   R0,32      * ROW+1
        CI   R0,768     * Off screen?
        JHE  VCHARE     * Yes reset
        DEC  R3         * REPETITION-1
        JNE  VCHAR1     * No loop
* CALL HPUT(row,column,$variable,...)                    *
* CALL HPUT(row,column,number-variable,...)              *
* CALL HPUT(row,column,string or number)
* R3 COUNTER            = FAC6 R3
HPUT    MOV  R11,R9     * save return address
        MOV  @FAC4,R4   * String address or number
        MOV  @FAC6,R3   * Length
        LI   R7,BIAS    * Get Screen bias off set 
        LI   R1,VDPWD   * Register faster than @
        LI   R8,VDPRD   * Register faster than @
        CI   R3,0       * Length=0? 
        JEQ  HPUT2      * Yes
HPUT0   MOV  R4,R0      * Get String/number address
        BL   @VWADDA    * read out VDP address R4
        MOVB *R8,R6     * Get $/# from R4 byte into R6
        AB   R7,R6      * Add bias
        MOV  R0,R4      * Get new update into R4
        INC  R4         * STRING ADDRESS+1
        MOV  R5,R0      * Get SCREEN ADDRESS
        BL   @VWADD     * write out VDP write address R5
        MOVB R6,*R1     * Put R6 onto screen address R0 
        MOV  R0,R5      * Get new update into R5
        INC  R5         * SCREEN ADDRESS+1
        CI   R5,768     * Last row:col?
        JNE  HPUT1      * No, so continue loop
        CLR  R5         * Reset back to top row:col
HPUT1   DEC  R3         * count by -1
        JNE  HPUT0      * count=0? Restart at top row:col
HPUT2   B    @PAGER     * return to caller
* CALL VPUT(row,column,$variable,...)                    *
* CALL VPUT(row,column,number-variable,...)              *
* CALL VPUT(row,column,string or number)
* R3 COUNTER            = FAC6 R3
VPUT    MOV  R11,R9     * save return address
        MOV  @PAD2,R5   * VDP ADDRESS
        MOV  @FAC4,R4   * String address or number
        MOV  @FAC6,R3   * Length
        LI   R7,BIAS    * Get Screen bias off set 
        LI   R1,VDPWD   * Register faster than @
        LI   R8,VDPRD   * Register faster than @
        CI   R3,0       * Length=0? 
        JEQ  VPUT2      * Yes
VPUT0   MOV  R4,R0      * Get String/number address
        BL   @VWADDA    * read out VDP address R4
        MOVB *R8,R6     * Get $/# from R4 byte into R6
        AB   R7,R6      * Add bias
        MOV  R0,R4      * Get new update into R4
        INC  R4         * STRING ADDRESS+1
        MOV  R5,R0      * Get SCREEN ADDRESS
        BL   @VWADD     * write out VDP write address R5
        MOVB R6,*R1     * Put R6 onto screen address R0 
        MOV  R0,R5      * Get new update into R5
        CI   R5,767     * OFF SCREEN?
        JEQ  VPUT3      * Yes
        CI   R5,735     * Last ROW?
        JGT  VPUT4      * Yes
        AI   R5,32      * ROW+1
VPUT1   DEC  R3         * Length-1
        JNE  VPUT0      * No loop
VPUT2   B    @PAGER     * return to XB
        JMP  VPUT1      * Always loop
VPUT4   AI   R5,-735    * Reset back to top ROW
        JMP  VPUT1      * Always loop
* CALL HGET(row,column,length,$variable)                 *
* R3 Length             = PAD6  R3
HGET    MOV  R11,R9     * save return address
        MOV  @PAD4,R4   * String address
        MOV  @PAD6,R3   * Length
        LI   R7,BIAS    * Get Screen bias off set 
        LI   R1,VDPWD   * Register faster than @
        LI   R8,VDPRD   * Register faster than @
HGET0   MOV  R5,R0      * Get Screen Address
        BL   @VWADDA    * read out VDP address R5
        MOVB *R8,R6     * Get Screen byte into R6
        SB   R7,R6      * Subtract bias
        MOV  R0,R5      * Get new update into R5 
        INC  R5         * SCREEN ADDRESS+1
        CI   R5,768     * Last row:col?
        JNE  HGET1      * No, so continue loop
        CLR  R5         * Reset back to top row:col
        BL   @VWADD     * write out VDP write address R4
        MOVB R6,*R1     * Put R6 onto String address R0 
        MOV  R0,R4      * Get new String update into R4
        INC  R4         * STRING ADDRESS+1
        DEC  R3         * Length-1
        JNE  HGET0      * count=0? Restart at top row:col
HGET2   B    @PAGER     * return to caller
* CALL VGET(row,column,length,$variable)                 *
* R3 Length             = PAD6  R3
VGET    MOV  R11,R9     * save return address
        MOV  @PAD2,R5   * VDP ADDRESS
        MOV  @PAD4,R4   * String address or number
        MOV  @PAD6,R3   * Length
        LI   R7,BIAS    * Get Screen bias off set 
        LI   R1,VDPWD   * Register faster than @
        LI   R8,VDPRD   * Register faster than @
VGET0   MOV  R5,R0      * Get Screen address
        BL   @VWADDA    * read out VDP address R5
        MOVB *R8,R6     * Get Screen from R5 into R6
        SB   R7,R6      * Subtract bias
        MOV  R0,R5      * Get new Screen update into R5
        CI   R5,735     * Last ROW?
        JLE  VGET2      * Yes, ROW+1
        CI   R5,767     * OFF SCREEN?
        JHE  VGET1      * Yes, reset ROW:COL to Zero
        INC  R5         * COL+1
        S    @PAD2,R5   * Original address-screen address 
        JMP  VGET3      * Save to String
VGET1   CLR  R5         * Reset back to top row:col  
        JMP  VGET3      * Save to String
VGET2   AI   R5,32      * ROW+1  
        BL   @VWADD     * write out VDP write address R4
        MOVB R6,*R1     * Put R6 onto screen address R0 
        MOV  R0,R4      * Get new update into R4
        INC  R4         * STRING ADDRESS+1     
        DEC  R3         * Length-1
        JNE  VGET0      * No loop
        B    @PAGER     * return to XB
* CALL INVERSE(chr#,...)                                 *
* CALL INVERSE(ALL,...)                                  *
* R0 TEMP VDP   
INVERS  MOV  R11,R9    * save return address
        CI   R1,0      * ALL flag?
        JNE  INV1      * No, single character definition
        LI   R1,>03F0  * Cursor first character
        LI   R2,129    * Load number of characters
        JMP  INV2      * Go do ALL
INV1    LI   R2,1      * Load 1 character 
* Get 8 bytes of character definition
INV2    LI   R3,>83E8 * BUFFER in R4 to R7
        MOV  R1,R0    * Copy VDP Char Address
        BL   @VWADDA  * read out VDP address
        LI   R8,VDPRD * Register faster than address
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        MOVB *R8,*R3+ * read next VDP byte to RAM
        INV  R4       * INVERT BITS
        INV  R5       * INVERT BITS
        INV  R6       * INVERT BITS
        INV  R7       * INVERT BITS
        LI   R3,>83E8 * BUFFER in R4 to R7 
        MOV  R1,R0    * Copy VDP Char Address
        BL   @VWADD   * write out VDP address
        LI   R8,VDPWD * Register faster than address
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        MOVB *R3+,*R8 * write next VDP byte from RAM
        AI   R1,8     * Next Character Definition  
        DEC  R2       * Character counter -1
        JNE  INV2     * 0? No keep looping
        B    @PAGER     * return to XB
COLLSP MOV  R11,R9     * save return address
COLL   LI   R8,PAD     * PAD
       CLR  R0         * ZERO OUT
       MOVB *R8+,R0    * Sprite #1 ROW in high byte
       CLR  R4         * ZERO OUT
       MOVB *R8+,R4    * Sprite #1 COL in high byte
       CLR  R1         * ZERO OUT 
       MOVB *R8+,R1    * Sprite #2 ROW in high byte
       CLR  R5         * ZERO OUT
       MOVB *R8+,R5    * Sprite #2 COL in high byte
       MOV  @FAC,R7    * TOLERANCE
       SWPB R7         * Put into high byte
       CLR  @PAD       * zero out 
       CLR  @PAD2      * zero out 
       LI   R6,>C000   * Off screen value
       C    R0,R6      * Is Sprite #1 ROW too high?
       JHE  COLL       * Yes, default zero
       C    R1,R6      * Is Sprite #2 ROW too high?
       JHE  COLL       * Yes, default zero
*** Row comparison
       MOV  R1,R8
       S    R0,R8      * Sprite #2 ROW-Sprite #1 ROW
       ABS  R8         * No negative value
       C    R8,R7      * Within tolerance?
       JGT  COLL       * No, default zero
*** Column comparison 
       MOV  R5,R8
       S    R4,R8      * Sprite #2 COL-Sprite #1 COL
       ABS  R8         * No negative value
       C    R8,R7      * Within tolerance?
       JGT  COLLO      * No, default zero
       SWPB R0         * Sprite #1 ROW in low byte
       MOV  R0,@PAD    * Save Sprite #1 ROW to XB
       SWPB R4         * Sprite #1 COL in low byte
       MOV  R4,@PAD2   * Save Sprite #1 COL to XB
COLLO  B    @PAGER     * return to XB  
* CALL CLEARPRINT                                         *
CLEARP MOV  R11,R9     * save return address
       LI   R0,2       * Screen address start COL 3
       LI   R1,>8000   * Space Character
       LI   R3,24      * ROW counter
       LI   R4,2       * COL copy
       LI   R8,VDPWD   * put VDPWD in R8 for faster loop
CLEARL BL   @VWADD     * write out VRAM write address
       LI   R2,28      * Count COL 28
CLEARR MOVB R1,*R8     * Write a byte to next VRAM location
       DEC  R2         * COUNT-1
       JNE  CLEARR     * No loop
       AI   R4,32      * Start COL copy +32
       MOV  R4,R0      * Get new ROW:COL
       DEC  R3         * ROW-1
       JNE  CLEARL     * Not zero continue
       B    @PAGER     * return to XB 
CINIT   MOV  R11,R9    * save return address
        LI   R0,>2000  * RAM destination address
        LI   R1,ALCEND * ROM source address 
        LI   R2,>0274  * COUNT
INITLP  MOV  *R1+,*R0+ * Write next RAM word 1
        MOV  *R1+,*R0+ * Write next RAM word 2
        MOV  *R1+,*R0+ * Write next RAM word 3
        MOV  *R1+,*R0+ * Write next RAM word 4 
        MOV  *R1+,*R0+ * Write next RAM word 5
        MOV  *R1+,*R0+ * Write next RAM word 6
        MOV  *R1+,*R0+ * Write next RAM word 7
        MOV  *R1+,*R0+ * Write next RAM word 8
        DEC  R2        * COUNT-1
        JNE  INITLP    * Repeat if not done
        MOV  *R1+,*R0+ * Last word to load
        B    @PAGER       * DONE RETURN TO XB 
ALCEND  DATA >205A,>24F4,>4000,>AA55
        DATA >2038,>2096,>2038,>217E
        DATA >2038,>21E2,>2038,>234C
        DATA >2038,>2432,>2038,>246E
        DATA >2038,>2484,>2038,>2490
        DATA >2038,>249E,>2038,>24AA
        DATA >2038,>24B8,>2038,>2090
        DATA >0000,>0000,>0000,>0000
        DATA >0000,>0000,>0000,>0000
        DATA >0000,>0000,>0000,>0000
        DATA >0000,>0000,>0000,>0000
        DATA >6520,>C060,>2004,>0281
        DATA >4000,>130E,>C001,>0202
        DATA >834A,>8CB0,>1606,>8CB0
        DATA >1604,>8CB0,>1602,>C030
        DATA >0450,>0221,>0008,>10EF
        DATA >0200,>2500,>C800,>8322
        DATA >02E0,>83E0,>0460,>00CE
        DATA >C81D,>8322,>10F9,>C01D
        DATA >C06D,>0002,>06A0,>20DC
        DATA >C0C1,>0603,>0223,>8300
        DATA >D0D3,>1361,>0983,>0643
        DATA >1612,>C000,>165C,>C0C5
        DATA >05C3,>06A0,>2406,>1653
        DATA >05C3,>06A0,>23CA,>0204
        DATA >834A,>0202,>0008,>DC74
        DATA >0602,>15FD,>0380,>06A0
        DATA >20F8,>10F5,>C041,>1347
        DATA >0A81,>9060,>8312,>1143
        DATA >0981,>C141,>0A35,>0225
        DATA >0008,>A160,>8310,>045B
        DATA >C24B,>0643,>1634,>C0C5
        DATA >06A0,>23CA,>C0C1,>06A0
        DATA >2406,>112D,>06A0,>211C
        DATA >06A0,>23CA,>6004,>0A30
        DATA >A040,>0459,>C28B,>0A51
        DATA >09D1,>C201,>D120,>8343
        DATA >0984,>1303,>0600,>1123
        DATA >0580,>0206,>0001,>C0C5
        DATA >0223,>0004,>06A0,>23CA
        DATA >C0C1,>0643,>05C3,>06A0
        DATA >23CA,>0581,>6044,>3981
        DATA >C186,>1611,>C187,>0608
        DATA >15F5,>0606,>A184,>8180
        DATA >150A,>05C3,>045A,>0200
        DATA >0700,>0460,>2084,>0200
        DATA >1C00,>0460,>2084,>0200
        DATA >1400,>0460,>2084,>C01D
        DATA >C06D,>0002,>06A0,>20DC
        DATA >C0C1,>0603,>0223,>8300
        DATA >D0D3,>0983,>160E,>C000
        DATA >1622,>0202,>0008,>0204
        DATA >834A,>C0C5,>06A0,>23CA
        DATA >CD01,>05C3,>0642,>15FA
        DATA >0380,>0643,>160F,>C000
        DATA >1612,>C0C5,>05C3,>06A0
        DATA >2406,>160B,>05C3,>06A0
        DATA >23CA,>C101,>0201,>834A
        DATA >0460,>20CA,>06A0,>20F8
        DATA >10F8,>0460,>2166,>0460
        DATA >216E,>C81D,>2038,>C82D
        DATA >0002,>83E2,>C82D,>0004
        DATA >2044,>02E0,>83E0,>C80B
        DATA >2040,>C020,>2044,>06A0
        DATA >20DC,>C0C1,>0603,>0223
        DATA >8300,>D0D3,>0983,>0603
        DATA >1332,>0643,>164A,>C2A0
        DATA >2038,>162D,>C0C5,>05C3
        DATA >06A0,>2406,>9801,>2058
        DATA >1620,>0206,>0008,>0204
        DATA >834A,>C0C5,>06A0,>23CA
        DATA >CD01,>05C3,>0646,>15FA
        DATA >06A0,>22DA,>0225,>0004
        DATA >C105,>C046,>06A0,>23E6
        DATA >05C4,>D050,>0981,>06A0
        DATA >23E6,>C2E0,>2040,>C820
        DATA >203E,>830C,>02E0,>2038
        DATA >0380,>0200,>0700,>C2E0
        DATA >2040,>0460,>2084,>0200
        DATA >1C00,>0460,>226E,>C08B
        DATA >0643,>16F3,>C0C5,>06A0
        DATA >23CA,>C0C1,>06A0,>2406
        DATA >1102,>0460,>226A,>C020
        DATA >2038,>06A0,>211C,>6004
        DATA >0A10,>A0C0,>06A0,>23CA
        DATA >0452,>06A0,>227E,>0206
        DATA >834A,>CD83,>DDA0,>2058
        DATA >DD84,>CD81,>C0C1,>1602
        DATA >04D6,>1005,>0603,>06A0
        DATA >2406,>0981,>C581,>C020
        DATA >2044,>06A0,>22DA,>0460
        DATA >225A,>C80B,>203A,>C805
        DATA >203C,>C2E0,>601E,>069B
        DATA >C020,>2044,>C160,>203C
        DATA >D190,>0986,>C820,>830C
        DATA >203E,>C806,>830C,>C806
        DATA >8350,>C2E0,>6012,>069B
        DATA >C020,>2044,>0206,>834A
        DATA >0204,>001C,>CD84,>DDA0
        DATA >2058,>DD84,>C5A0,>831C
        DATA >C0A0,>830C,>1309,>C116
        DATA >C0C0,>0583,>D073,>06A0
        DATA >241A,>0584,>0602,>15FA
        DATA >C2E0,>6028,>069B,>C020
        DATA >2044,>C160,>203C,>C2E0
        DATA >203A,>045B,>C01D,>C06D
        DATA >0002,>06A0,>20DC,>C0C1
        DATA >0603,>0223,>8300,>D0D3
        DATA >0983,>0603,>1302,>0643
        DATA >1623,>C000,>1628,>C02D
        DATA >0004,>C0C5,>05C3,>06A0
        DATA >2406,>9801,>2058,>161D
        DATA >05C3,>06A0,>23CA,>C041
        DATA >1307,>C181,>0601,>C0C1
        DATA >06A0,>2406,>9050,>1A15
        DATA >DC01,>1309,>C0C6,>0981
        DATA >C141,>06A0,>2406,>DC01
        DATA >0583,>0605,>15FA,>0380
        DATA >06A0,>227E,>C02D,>0004
        DATA >10E6,>0460,>2166,>0460
        DATA >216E,>0200,>1300,>0460
        DATA >2084,>06C3,>D803,>8C02
        DATA >06C3,>D803,>8C02,>1000
        DATA >D060,>8800,>06C1,>D060
        DATA >8800,>06C1,>045B,>06C4
        DATA >D804,>8C02,>06C4,>0264
        DATA >4000,>D804,>8C02,>1000
        DATA >D801,>8C00,>06C1,>D801
        DATA >8C00,>06C1,>045B,>06C3
        DATA >D803,>8C02,>06C3,>D803
        DATA >8C02,>1000,>D060,>8800
        DATA >045B,>06C4,>D804,>8C02
        DATA >06C4,>0264,>4000,>D804
        DATA >8C02,>1000,>D801,>8C00
        DATA >045B,>C83E,>83E2,>02E0
        DATA >83E0,>C80B,>204E,>C081
        DATA >0281,>0040,>1B0A,>C0A1
        DATA >6010,>0281,>0004,>1605
        DATA >C0A2,>0002,>0692,>2466
        DATA >1001,>0692,>02E0,>2038
        DATA >C80B,>83F6,>0380,>0200
        DATA >0B00,>0460,>2084,>02E0
        DATA >83E0,>C80B,>204E,>06A0
        DATA >000E,>02E0,>2038,>C80B
        DATA >83F6,>0380,>06A0,>24CA
        DATA >D82D,>0002,>8C00,>0380
        DATA >06A0,>24CA,>D831,>8C00
        DATA >0602,>16FC,>0380,>06A0
        DATA >24D0,>DB60,>8800,>0002
        DATA >0380,>06A0,>24D0,>DC60
        DATA >8800,>0602,>16FC,>0380
        DATA >C05D,>D82D,>0001,>8C02
        DATA >0261,>8000,>D801,>8C02
        DATA >0380,>0201,>4000,>1001
        DATA >04C1,>C09D,>D820,>203D
        DATA >8C02,>E081,>D802,>8C02
        DATA >C06D,>0002,>C0AD,>0004
        DATA >045B
EAINIT  MOV  R11,R9    * save return address
        CLR  R0        * ZERO OUT R0
        LI   R1,>2000  * Start address
        LI   R2,8192   * Counter
CLRINT  MOV  R0,*R1+   * Clear next RAM word
        DECT R2        * Counter-2
        JNE  CLRINT    * ZERO?
        LI   R0,LOW1   * FOUR WORDS
        LI   R1,>2000  * Set up init
        BL   @FOURWS   * LOAD IT
        LI   R0,LOW2   * FOUR WORDS
        LI   R1,>2024  * Set up list 
        BL   @FOURWS   * LOAD IT
        LI   R0,LOW3   * Routines
        LI   R1,>20FA  * Set up routines
        LI   R2,1404   * Counter
SLOW3   MOV  *R0+,*R1+ * LOAD IT
        DECT R2        * Counter-2
        JNE  SLOW3     * ZERO? 
        LI   R0,LOW4   * Name List
        LI   R1,>3F38  * NAMES & Address
        LI   R2,200    * Counter
SLOW4   MOV  *R0+,*R1+ * LOAD IT
        DECT R2        * Counter-2
        JNE  SLOW4     * ZERO?
        B    @PAGER     * return to XB
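FOURWS  MOV  *R0+,*R1+ * Copy word 1
        MOV  *R0+,*R1+ * Copy word 2
        MOV  *R0+,*R1+ * Copy word 3
        MOV  *R0+,*R1+ * Copy word 4
        B    *R11      * return to caller (BL left the address in R11)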
* Data for Initialization of
* Memory Expansion
LOW1  DATA  >A55A,>2128,>2398,>225A
LOW2  DATA  >A000,>FFD7,>2676,>3F38
LOW3  DATA  >0064,>2000,>2EAA,>2094
      DATA  >21C4,>2094,>2196,>2094,>21DE,>2094,>21F4
      DATA  >2094,>2200,>2094,>220E,>2094,>221A,>2094,>2228
      DATA  >209A,>22B2,>20DA,>23BA,>C80B,>2030,>D060
      DATA  >8349,>2060,>20FC,>132A,>C020,>8350,>1311,>06A0
      DATA  >2646,>101E,>0281,>3F38,>1319,>C001,>0202
      DATA  >834A,>8CB0,>1611,>8CB0,>160F,>8CB0,>160D,>C810
      DATA  >2022,>02E0,>20BA,>C020,>2022,>1309,>0690
      DATA  >02E0,>83E0,>C2E0,>2030,>045B,>0221,>0008,>10E4
      DATA  >0200,>0F00,>D800,>8322,>02E0,>83E0,>0460
      DATA  >00CE,>5820,>20FC,>8349,>02E0,>2094,>0380,>C83E
      DATA  >83E2,>02E0,>83E0,>C80B,>20AA,>C081,>0281
      DATA  >8000,>1B07,>09C1,>0A11,>0A42,>09B2,>A0A1,>0CFA
      DATA  >C092,>0692,>02E0,>2094,>C80B,>83F6,>0380
      DATA  >D060,>8373,>0981,>C87E,>8304,>F820,>20FC,>8349
      DATA  >02E0,>83E0,>C2E0,>2030,>045B,>02E0,>83E0
      DATA  >C80B,>20AA,>06A0,>000E,>02E0,>2094,>C80B,>83F6
      DATA  >0380,>06A0,>223A,>D82D,>0002,>8C00,>0380
      DATA  >06A0,>223A,>D831,>8C00,>0602,>16FC,>0380,>06A0
      DATA  >2240,>DB60,>8800,>0002,>0380,>06A0,>2240
      DATA  >DC60,>8800,>0602,>16FC,>0380,>C05D,>D82D,>0001
      DATA  >8C02,>0261,>8000,>D801,>8C02,>0380,>0201
      DATA  >4000,>1001,>04C1,>C09D,>D820,>2099,>8C02,>E081
      DATA  >D802,>8C02,>C06D,>0002,>C0AD,>0004,>045B
      DATA  >0204,>834A,>C014,>C184,>04F6,>04F6,>C140,>1323
      DATA  >0740,>0203,>0040,>04F6,>04D6,>0280,>0064
      DATA  >1A13,>0280,>2710,>1A08,>0583,>C040,>04C0,>3C20
      DATA  >20FA,>D920,>83E3,>0003,>0583,>C040,>04C0
      DATA  >3C20,>20FA,>D920,>83E3,>0002,>D920,>83E1,>0001
      DATA  >D520,>83E7,>0545,>1101,>0514,>045B,>C17E
      DATA  >53E0,>20FC,>C020,>8356,>C240,>0229,>FFF8,>0420
      DATA  >2114,>D0C1,>0983,>0704,>0202,>208C,>0580
      DATA  >0584,>80C4,>1306,>0420,>2114,>DC81,>9801,>20FE
      DATA  >16F6,>C104,>1352,>0284,>0007,>154F,>04E0
      DATA  >83D0,>C804,>8354,>C804,>2036,>0584,>A804,>8356
      DATA  >C820,>8356,>2038,>02E0,>83E0,>04C1,>020C
      DATA  >0F00,>C30C,>1301,>1E00,>022C,>0100,>04E0,>83D0
      DATA  >028C,>2000,>1332,>C80C,>83D0,>1D00,>0202
      DATA  >4000,>9812,>20FF,>16EE,>A0A0,>20A4,>1003,>C0A0
      DATA  >83D2,>1D00,>C092,>13E6,>C802,>83D2,>05C2
      DATA  >C272,>D160,>8355,>1309,>9C85,>16F2,>0985,>0206
      DATA  >208C,>9CB6,>16ED,>0605,>16FC,>0581,>C801
      DATA  >203A,>C809,>2034,>C80C,>2032,>0699,>10E2,>1E00
      DATA  >02E0,>209A,>C009,>0420,>2114,>09D1,>1604
      DATA  >0380,>02E0,>209A,>04C1,>06C1,>D741,>F3E0,>20FC
      DATA  >0380,>C80B,>2030,>02E0,>20BA,>0420,>2124
      DATA  >02E0,>83E0,>1303,>C2E0,>2030,>045B,>D820,>20BA
      DATA  >8322,>0460,>00CE,>04E0,>2022,>53E0,>20FC
      DATA  >C020,>8356,>0420,>2120,>0008,>1332,>0220,>FFF7
      DATA  >0201,>0200,>0420,>210C,>0580,>C800,>202E
      DATA  >C1E0,>2024,>C147,>04CC,>06A0,>25E0,>0283,>0001
      DATA  >1624,>058C,>04C3,>1023,>0283,>0046,>161E
      DATA  >04C2,>06A0,>262E,>0283,>003A,>16F7,>C020,>202E
      DATA  >0600,>0201,>0100,>0420,>210C,>06A0,>25E0
      DATA  >C020,>2022,>1307,>06A0,>2646,>1005,>CB4E,>0016
      DATA  >C3A0,>2022,>0380,>D740,>F3E0,>20FC,>0380
      DATA  >06A0,>25C2,>04C4,>D123,>2662,>0974,>C808,>202C
      DATA  >06A0,>2594,>0464,>23F8,>0580,>0240,>FFFE
      DATA  >C120,>2024,>A100,>1808,>8804,>2026,>1B05,>C160
      DATA  >2024,>C804,>2024,>100A,>C120,>2028,>A100
      DATA  >8804,>202A,>140C,>C160,>2028,>C804,>2028,>C1C5
      DATA  >0209,>0008,>06A0,>262E,>0609,>16FC,>10B6
      DATA  >0200,>0800,>10CC,>A005,>C800,>2022,>10AF,>A800
      DATA  >202C,>13AC,>0200,>0B00,>10C2,>A005,>C1C0
      DATA  >10A6,>A005,>DDC0,>DDE0,>20DB,>10A1,>A005,>06A0
      DATA  >2566,>C000,>1316,>0226,>FFF8,>8106,>1B02
      DATA  >0514,>1096,>8594,>16F8,>89A4,>0002,>0002,>16F4
      DATA  >89A4,>0004,>0004,>16F0,>C0E6,>0006,>C250
      DATA  >C403,>C009,>16FC,>0224,>0008,>C804,>202A,>10EA
      DATA  >A005,>06A0,>2566,>0226,>FFF8,>8106,>13E3
      DATA  >C296,>1501,>050A,>8294,>16F7,>89A4,>0002,>0002
      DATA  >16F3,>89A4,>0004,>0004,>16EF,>C296,>1516
      DATA  >C0E6,>0006,>C253,>C4C0,>C0C9,>16FC,>C246,>6244
      DATA  >C286,>022A,>0008,>C0C6,>0643,>064A,>C693
      DATA  >0649,>16FB,>0224,>0008,>C804,>202A,>10D9,>CB44
      DATA  >0002,>0200,>0C00,>0460,>2432,>0460,>2494
      DATA  >C28B,>0209,>0006,>C1A0,>202A,>0226,>FFF8,>C106
      DATA  >8806,>2028,>1AF3,>C806,>202A,>06A0,>262E
      DATA  >DDA0,>20E1,>0609,>16FA,>C580,>0206,>4000,>045A
      DATA  >C28B,>04C0,>C30C,>1308,>06A0,>262E,>D020
      DATA  >20E1,>06A0,>262E,>A003,>045A,>0209,>0004,>06A0
      DATA  >262E,>06A0,>25C2,>0A40,>A003,>0609,>16F8
      DATA  >045A,>0223,>FFD0,>0283,>000A,>1A05,>0223,>FFF9
      DATA  >0283,>0019,>1B01,>045B,>0200,>0A00,>0460
      DATA  >2432,>02E0,>83E0,>0200,>2032,>C330,>C270,>C830
      DATA  >8354,>C830,>8356,>C050,>1D00,>9820,>4000
      DATA  >20FF,>161D,>0699,>101B,>1E00,>02E0,>20DA,>C020
      DATA  >202E,>0201,>20DB,>0202,>0004,>0420,>2118
      DATA  >7000,>0950,>1610,>0982,>C001,>0201,>203C,>0420
      DATA  >2118,>04C8,>0602,>11D7,>D0F1,>0983,>A203
      DATA  >045B,>02E0,>20DA,>04C0,>06C0,>0460,>2432,>0201
      DATA  >3F40,>0221,>FFF8,>C011,>1105,>8060,>202A
      DATA  >16F9,>05CB,>045B,>0200,>0D00,>045B,>2D52,>5163
      DATA  >6483,>8455,>045C,>5B5F,>5EF0,>F003,>F0F0
      DATA  >4700,>00C8,>3F38
LOW4  DATA  >5554,>4C54,>4142,>2022,>5041,>4420,>2020,>8300
      DATA  >4750,>4C57,>5320,>83E0,>534F,>554E,>4420
      DATA  >8400,>5644,>5052,>4420,>8800,>5644,>5053,>5441
      DATA  >8802,>5644,>5057,>4420,>8C00,>5644,>5057
      DATA  >4120,>8C02,>5350,>4348,>5244,>9000,>5350,>4348
      DATA  >5754,>9400,>4752,>4D52,>4420,>9800,>4752
      DATA  >4D52,>4120,>9802,>4752,>4D57,>4420,>9C00,>4752
      DATA  >4D57,>4120,>9C02,>5343,>414E,>2020,>000E
      DATA  >584D,>4C4C,>4E4B,>2104,>4B53,>4341,>4E20,>2108
      DATA  >5653,>4257,>2020,>210C,>564D,>4257,>2020
      DATA  >2110,>5653,>4252,>2020,>2114,>564D,>4252,>2020
      DATA  >2118,>5657,>5452,>2020,>211C,>4453,>524C
      DATA  >4E4B,>2120,>4C4F,>4144,>4552,>2124,>4750,>4C4C
      DATA  >4E4B,>2100
* RXB Character set 
CHRLDR MOV  R11,R9    * save return address
       LI   R0,>03F8  * VDP destination address
       BL   @VWADD    * write out VDP write address
       LI   R1,CHARS  * ROM source address
       LI   R2,96     * COUNT
       LI   R8,VDPWD  * Register faster than address
CHRLP  MOVB *R1+,*R8  * write next VDP byte from ROM 1
       MOVB *R1+,*R8  * write next VDP byte from ROM 2
       MOVB *R1+,*R8  * write next VDP byte from ROM 3
       MOVB *R1+,*R8  * write next VDP byte from ROM 4
       MOVB *R1+,*R8  * write next VDP byte from ROM 5
       MOVB *R1+,*R8  * write next VDP byte from ROM 6
       MOVB *R1+,*R8  * write next VDP byte from ROM 7
       MOVB *R1+,*R8  * write next VDP byte from ROM 8
       DEC  R2        * COUNT-1
       JNE  CHRLP     * Repeat if not done 
       B    @PAGER    * DONE
CHARS  BYTE >00,>00,>00,>00,>00,>00,>00,>00 *   31
       BYTE >00,>00,>00,>00,>00,>00,>00,>00 *   32
       BYTE >00,>10,>10,>10,>10,>00,>10,>00 * ! 33 
       BYTE >00,>28,>28,>28,>00,>00,>00,>00 * " 34
       BYTE >00,>28,>7C,>28,>28,>7C,>28,>00 * # 35
       BYTE >00,>38,>54,>30,>18,>54,>38,>00 * $ 36
       BYTE >00,>44,>4C,>18,>30,>64,>44,>00 * % 37
       BYTE >00,>20,>50,>20,>54,>48,>34,>00 * & 38
       BYTE >00,>08,>10,>20,>00,>00,>00,>00 * ' 39
       BYTE >00,>08,>10,>10,>10,>10,>08,>00 * ( 40
       BYTE >00,>20,>10,>10,>10,>10,>20,>00 * ) 41
       BYTE >00,>00,>28,>10,>7C,>10,>28,>00 * * 42
       BYTE >00,>10,>10,>7C,>10,>10,>00,>00 * + 43
       BYTE >00,>00,>00,>00,>00,>30,>10,>20 * , 44
       BYTE >00,>00,>00,>7C,>00,>00,>00,>00 * - 45
       BYTE >00,>00,>00,>00,>00,>30,>30,>00 * . 46
       BYTE >00,>04,>08,>10,>20,>40,>00,>00 * / 47
       BYTE >00,>3C,>4C,>54,>64,>44,>38,>00 * 0 48
       BYTE >00,>10,>30,>10,>10,>10,>38,>00 * 1 49
       BYTE >00,>38,>44,>08,>10,>20,>7C,>00 * 2 50
       BYTE >00,>38,>44,>18,>04,>44,>38,>00 * 3 51
       BYTE >00,>08,>18,>28,>48,>7C,>08,>00 * 4 52
       BYTE >00,>78,>40,>78,>04,>44,>38,>00 * 5 53
       BYTE >00,>38,>40,>78,>44,>44,>38,>00 * 6 54
       BYTE >00,>7C,>04,>08,>10,>20,>20,>00 * 7 55
       BYTE >00,>38,>44,>38,>44,>44,>38,>00 * 8 56
       BYTE >00,>38,>44,>44,>3C,>04,>78,>00 * 9 57
       BYTE >00,>00,>30,>30,>00,>30,>30,>00 * : 58
       BYTE >00,>00,>30,>30,>00,>30,>10,>20 * ; 59
       BYTE >00,>00,>10,>20,>40,>20,>10,>00 * < 60
       BYTE >00,>00,>00,>7C,>00,>7C,>00,>00 * = 61
       BYTE >00,>00,>10,>08,>04,>08,>10,>00 * > 62
       BYTE >00,>38,>44,>08,>10,>00,>10,>00 * ? 63
       BYTE >00,>38,>44,>54,>58,>40,>3C,>00 * @ 64
       BYTE >00,>38,>44,>44,>7C,>44,>44,>00 * A 65
       BYTE >00,>78,>24,>38,>24,>24,>78,>00 * B 66
       BYTE >00,>38,>44,>40,>40,>44,>38,>00 * C 67
       BYTE >00,>78,>24,>24,>24,>24,>78,>00 * D 68
       BYTE >00,>7C,>40,>78,>40,>40,>7C,>00 * E 69
       BYTE >00,>7C,>40,>78,>40,>40,>40,>00 * F 70
       BYTE >00,>38,>44,>40,>4C,>44,>38,>00 * G 71
       BYTE >00,>44,>44,>7C,>44,>44,>44,>00 * H 72
       BYTE >00,>38,>10,>10,>10,>10,>38,>00 * I 73
       BYTE >00,>04,>04,>04,>04,>44,>38,>00 * J 74
       BYTE >00,>44,>48,>50,>70,>48,>44,>00 * K 75
       BYTE >00,>40,>40,>40,>40,>40,>7C,>00 * L 76
       BYTE >00,>44,>6C,>54,>44,>44,>44,>00 * M 77
       BYTE >00,>44,>64,>54,>54,>4C,>44,>00 * N 78
       BYTE >00,>38,>44,>44,>44,>44,>38,>00 * O 79 
       BYTE >00,>78,>44,>44,>78,>40,>40,>00 * P 80
       BYTE >00,>38,>44,>44,>54,>4C,>3C,>00 * Q 81
       BYTE >00,>78,>44,>44,>78,>48,>44,>00 * R 82
       BYTE >00,>38,>44,>30,>08,>44,>38,>00 * S 83
       BYTE >00,>7C,>10,>10,>10,>10,>10,>00 * T 84
       BYTE >00,>44,>44,>44,>44,>44,>38,>00 * U 85
       BYTE >00,>44,>44,>44,>44,>28,>10,>00 * V 86
       BYTE >00,>44,>44,>44,>54,>54,>28,>00 * W 87
       BYTE >00,>44,>28,>10,>10,>28,>44,>00 * X 88
       BYTE >00,>44,>44,>28,>10,>10,>10,>00 * Y 89
       BYTE >00,>7C,>08,>10,>20,>40,>7C,>00 * Z 90
       BYTE >00,>38,>20,>20,>20,>20,>38,>00 * [ 91
       BYTE >00,>00,>40,>20,>10,>08,>04,>00 * \ 92
       BYTE >00,>38,>08,>08,>08,>08,>38,>00 * ] 93
       BYTE >00,>10,>38,>54,>10,>10,>10,>00 * ^ 94
       BYTE >00,>00,>00,>00,>00,>00,>7C,>00 * _ 95
       BYTE >00,>20,>10,>08,>00,>00,>00,>00 * ` 96
       BYTE >00,>00,>00,>38,>48,>48,>3C,>00 * a 97
       BYTE >00,>20,>20,>38,>24,>24,>38,>00 * b 98
       BYTE >00,>00,>00,>1C,>20,>20,>1C,>00 * c 99
       BYTE >00,>04,>04,>1C,>24,>24,>1C,>00 * d 100
       BYTE >00,>00,>00,>1C,>28,>30,>1C,>00 * e 101
       BYTE >00,>0C,>10,>38,>10,>10,>10,>00 * f 102
       BYTE >00,>00,>00,>1C,>24,>1C,>04,>38 * g 103
       BYTE >00,>20,>20,>38,>24,>24,>24,>00 * h 104
       BYTE >00,>10,>00,>30,>10,>10,>38,>00 * i 105
       BYTE >00,>08,>00,>08,>08,>08,>48,>30 * j 106
       BYTE >00,>20,>20,>24,>38,>28,>24,>00 * k 107
       BYTE >00,>30,>10,>10,>10,>10,>38,>00 * l 108
       BYTE >00,>00,>00,>78,>54,>54,>54,>00 * m 109
       BYTE >00,>00,>00,>38,>24,>24,>24,>00 * n 110
       BYTE >00,>00,>00,>18,>24,>24,>18,>00 * o 111
       BYTE >00,>00,>00,>38,>24,>38,>20,>20 * p 112
       BYTE >00,>00,>00,>1C,>24,>1C,>04,>04 * q 113
       BYTE >00,>00,>00,>28,>34,>20,>20,>00 * r 114
       BYTE >00,>00,>00,>1C,>30,>0C,>38,>00 * s 115
       BYTE >00,>10,>10,>38,>10,>10,>0C,>00 * t 116
       BYTE >00,>00,>00,>24,>24,>24,>1C,>00 * u 117 
       BYTE >00,>00,>00,>44,>28,>28,>10,>00 * v 118
       BYTE >00,>00,>00,>44,>54,>54,>28,>00 * w 119
       BYTE >00,>00,>00,>24,>18,>18,>24,>00 * x 120
       BYTE >00,>00,>00,>24,>24,>1C,>04,>38 * y 121
       BYTE >00,>00,>00,>3C,>08,>10,>3C,>00 * z 122
       BYTE >00,>0C,>10,>10,>20,>10,>10,>0C * { 123
       BYTE >00,>10,>10,>10,>00,>10,>10,>10 * | 124
       BYTE >00,>60,>10,>10,>08,>10,>10,>60 * } 125
       BYTE >00,>00,>20,>54,>08,>00,>00,>00 * ~ 126
* CALL HEX($variable,variable,...)                       *
* CALL HEX(variable,$variable,...)                       *
* CALL HEX(">####",variable,...)                         *
* HEX VDP to RAM (String to Address) LES version
* R1 HEX ADDRESS        = FAC
* R2 STRING             = PAD 
* R3 COUNTER            = R3
ASCHEX MOV  R11,R9      * save return address
       CLR  R1          * clear result reg
       LI   R2,PAD      * assume 4-bytes in PAD-PAD3
       LI   R3,4        * load counter
HEX01  CLR  R0          * zero out work reg
       MOVB *R2+,R0     * Get ASC byte
       SWPB R0          * get byte to LSB
       CI   R0,103      * g ?
       CI   R0,97       * a ?
       JHE  HEX02       * Valid
       CI   R0,71       * G ?
       CI   R0,65       * A ?
       JHE  HEX02       * Valid
       CI   R0,58       * : ?
       CI   R0,47       * / ?
* Convert ASC to value
HEX02  AI   R0,>FFD0    * correct for 0-9 in LSB
       CI   R0,>000A    * LSB < 10?
       JLT  HEX03       * we're good
       AI   R0,>FFF9    * correct LSB for A-F
* Add digit and shift if not done
HEX03  A    R0,R1       * add hex digit to result reg LSB
       DEC  R3          * decrement counter
       JEQ  HEX04       * return if done
       SLA  R1,4        * shift hex digit left 1 nybble
       JMP  HEX01       * get another hex digit
HEX04  MOV  R1,@FAC     * get result to FAC for CIF
       B    @PAGER      * return to XB
       B    @PAGER      * return to XB

* CALL ALPHALOCK(numeric-variable)                        *
ALPHA  MOV  R11,R9     * save return address
       MOV  R12,R8     * save R12 value for later
       CLR  R12        * ZERO OUT R12
       SBZ  21         * PUT ALPHA LOCK STROBE
       SRC  R12,14     * WAIT
       TB   7          * ALPHA LOCK DOWN?
       SBO  21         * RESET ALPHA LOCK STROBE
       LI   R1,>994A   * ALPHA LOCK ON     
ALPHAO MOV  R8,R12     * Restore R12 
       MOV  R1,@FAC    * Save value to FAC
       B    @PAGER     * return to XB
* CALL ISRON(variable)                                    *
XISRON MOV  R11,R9       * save return address
       MOV  @FAC,@ISR    * Put FAC into ISR Interrupt hook
       JMP  NEXIT        * exit
* CALL ISROFF(variable)                                   *  
XISROF MOV  R11,R9       * save return address
       MOV  @ISR,@ISR    * Test ISR hook (sets EQ if zero)
       JEQ  NHOOK        * No 
       MOV  @ISR,@FAC    * Put ISR Hook into FAC
NHOOK  CLR  @ISR         * Clear ISR Hook
NEXIT  CLR  @STATUS      * Clear GPL status byte
       B    @PAGER       * return to XB
SPRLP  MOVB R1,*R8     * Write a byte to next VRAM location
       DEC  R2         * COUNT-1  
       JNE  SPRLP      * LOOP
       AORG >7FFA
       B    *R9      * return to caller   



I also looked at Rich's code. Firstly, it's nice code. But the code itself doesn't really support the argument I think Rich was making (unless I misunderstood him). The subroutines presented all save their return address (usually into R9) and then only BL one level deep. Sometimes there are a couple of BLs in the same routine, but they're not nested. In this scenario, the approach Rich has chosen (saving the return address into a register, taking a BL, and then returning to the parent via the address in the saved register) is absolutely the correct one. Also, as Rich and @apersson850 pointed out up-thread, when you're working in a stock console environment you have almost no RAM (just the 256-byte scratchpad), so a BLWP, which needs a 32-byte workspace, is a major luxury. It's frankly a miracle that TI/Microsoft/Whoever-the-hell-wrote-the-TI-OS managed to make it work at all.


Just as an aside, I noticed that the last three or four subroutines in Rich's code save the return address, but don't take a BL. In those cases the MOV R11,R9 (or whatever they were) could be removed. 


I'm more interested in knowing where the tipping point is in selecting BLWP over BL, assuming you have expansion memory in which to host a workspace. My current hunch is: if you're saving more than two registers elsewhere and having to restore them, you might as well take a BLWP and have the luxury of hosting your subroutine in its own register space. Or, as I might be inclined to do, have workspaces for each logical layer of my code. TurboForth (the only major assembly language application I have written for the TI) does not do this; however, there are places in the code where I wish I had. The block editor is the first area of the code-space that springs to mind. It's totally unrelated to the rest of the Forth system. It's basically a stand-alone program/utility, and it would have made much more sense to host it in its own workspace.
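
For reference, a BLWP call with a dedicated workspace might look roughly like the following. It's only a sketch: MYWS, MYVEC and DOUBLE are names invented for illustration, the workspace is assumed to sit in RAM, and the three parts are separate fragments rather than one linear program.

```asm
* Sketch only: MYWS, MYVEC and DOUBLE are hypothetical names.
MYWS   BSS  32            * 16 words of RAM for the subroutine's registers
MYVEC  DATA MYWS,DOUBLE   * BLWP vector: new workspace, entry point

* Caller fragment: argument goes in the caller's R1
       LI   R1,21
       BLWP @MYVEC        * caller's WP, PC, ST land in the new R13, R14, R15
*      ... caller's R1 is now 42, its other registers are untouched

* Subroutine fragment: runs in MYWS, reaches the caller through R13
DOUBLE MOV  @2(R13),R0    * fetch caller's R1 (byte offset 2 from old WP)
       A    R0,R0         * work freely in our own registers
       MOV  R0,@2(R13)    * hand the result back in the caller's R1
       RTWP               * restore caller's WP, PC, ST
```

Nothing has to be saved or restored by hand here; the cost is the 32 bytes of workspace and the slower BLWP/RTWP pair.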


Another way of describing a tipping point: if the number of registers needed to hold the arguments for the subroutine you call, plus the number of working registers the subroutine needs for its internal processing, exceeds what you have available in the workspace (remember that you may have some global variables there too), then you have two options:

  • Replace the BL call by a BLWP call.
  • Consider if a simple LWPI is sufficient to get a new workspace without actually calling a subroutine as well. Note that you can also do this inside the subroutine called by BL, not only in inline code. Then you don't need to save R11. It will still be there, in the previous workspace.

The advantage of LWPI is that R13-R15 are free to use as well, since they aren't needed for return linkage.

The disadvantage is that you don't automatically get a pointer to your previous workspace injected into R13. You have to provide access to the previous workspace in a different way, if your arguments are there. It's simple enough: LI R13,OLDWORKSPACE will fix it, but that's another instruction to execute.
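
A minimal sketch of the inline LWPI approach, assuming the old workspace address is known at assembly time. OLDWS and NEWWS are made-up names, and both addresses are assumptions chosen for illustration.

```asm
* Sketch only: OLDWS and NEWWS are hypothetical names and addresses.
OLDWS  EQU  >8300         * assumed workspace of the calling code
NEWWS  EQU  >2F00         * assumed free 32 bytes in expansion RAM

       LWPI NEWWS         * switch workspace; R0-R15 are all free here
       LI   R13,OLDWS     * our own pointer to the previous workspace
       MOV  @2(R13),R0    * read the argument left in the old R1
*      ... work ...
       MOV  R0,@2(R13)    * write a result back into the old R1
       LWPI OLDWS         * back to the old registers, R11 still intact
```

If the old workspace is a fixed address you could also skip the LI and address it directly, e.g. MOV @OLDWS+2,R0.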

Another disadvantage, if you use the BL instruction to call a subroutine and do LWPI inside, is that to be able to reset the workspace you need to know which workspace was used by the caller. The TMS 9900 does have STWP (STore Workspace Pointer, i.e. save what it is now to a register), but not LWP (Load Workspace Pointer, i.e. set the new value from a register). Instead you have to move the stored value to the word after the LWPI instruction where you restore the caller's registers. That requires self-modifying code, which can't run from ROM, if that's a requirement you have.
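
The self-modifying restore could be sketched like this. SUB, NEWWS and RESTWS are names invented for the example, the code is assumed to run from RAM, and the caller's R0 is assumed to be scratch because STWP can only store into a workspace register.

```asm
* Sketch only: SUB, NEWWS and RESTWS are hypothetical names.
SUB    STWP R0            * caller's WP into caller's R0 (assumed scratch)
       MOV  R0,@RESTWS+2  * patch it into the operand word of the last LWPI
       LWPI NEWWS         * switch to our own registers
*      ... work ...
RESTWS LWPI 0             * operand word is overwritten at run time
       B    *R11          * caller's R11 was never touched, so this returns
```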


If you make the subroutine for a specific program, then you probably do know the caller's workspace, but if you try to write a more general subroutine, then BLWP is a better option, since you never know what the caller is up to. There you go for a different tipping point, not based on the number of registers you save, but the intended use case for the subroutine.

Edited by apersson850

2 hours ago, Willsy said:

Just as an aside, I noticed that the last three or four subroutines in Rich's code save the return address, but don't take a BL. In those cases the MOV R11,R9 (or whatever they were) could be removed. 

It's for consistency. The return to BASIC goes through the PAGER routine, which always returns via R9, regardless of whether the routine that ends there called any subroutine.


2 hours ago, Willsy said:

I'm more interested in knowing where the tipping point is in selecting BLWP over BL, assuming you have expansion memory in which to host a workspace. My current hunch is: if you're saving more than two registers elsewhere and having to restore them, you might as well take a BLWP and have the luxury of hosting your subroutine in its own register space. Or, as I might be inclined to do, have workspaces for each logical layer of my code. TurboForth (the only major assembly language application I have written for the TI) does not do this; however, there are places in the code where I wish I had. The block editor is the first area of the code-space that springs to mind. It's totally unrelated to the rest of the Forth system. It's basically a stand-alone program/utility, and it would have made much more sense to host it in its own workspace.

My opinion is that never :) 

To be more precise, in my opinion BLWP-type activities are fine if you're calling an operating system function (an XOP would be better, but that requires proper ROM vectors) or doing a context switch (perhaps for an interrupt, but also for other things). Otherwise I'd prefer BL. In fact the only cases where I've used BLWP have been for debug or I/O purposes, to output a value in a register without messing up any of the other registers. For this, BLWP is very convenient. In those cases the workspace might even be in slow external RAM, since for debug purposes you typically don't care about performance.


When you use BL and "manually" stack the registers you need, you create reentrant code, which can support recursion. It doesn't have to be immediate recursion, i.e. the function calling itself; it can also be indirect: you're in function A and you call function B, which then calls back into function A, however deep this might go. The point is that with reentrant code you don't need to know whether a routine you're calling eventually calls the very same routine again before returning. This topic is also closely related to calling conventions. A C language calling convention could use R10 as a stack pointer (and could use the same register for the frame pointer). In a C calling convention you'd typically pass the first four arguments to the function in R0-R3, permit the called function to destroy the values in R0-R3 but require it to preserve the other registers, and provide the return value in R0, for example. If this is standardised it becomes easy to know how a function is supposed to behave.
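
As a small illustration of the reentrant style, here is a sketch of a directly recursive routine. SP and SUMN are made-up names, the stack is assumed to grow downward, and R10 is assumed to have been pointed at the top of a free RAM area beforehand.

```asm
* Sketch only: SP and SUMN are hypothetical names.
SP     EQU  R10           * stack pointer, assumed initialised by the caller

* SUMN: in  R0 = n (n >= 0), out R1 = n + (n-1) + ... + 1
* Every activation keeps its own return address and n on the stack.
SUMN   MOV  R0,R0         * n = 0?
       JNE  SUMN1
       CLR  R1            * base case: sum is 0
       B    *R11
SUMN1  DECT SP
       MOV  R11,*SP       * push return address
       DECT SP
       MOV  R0,*SP        * push this level's n
       DEC  R0
       BL   @SUMN         * recursive call: R1 = sum of 1..n-1
       A    *SP+,R1       * pop n and add it to the result
       MOV  *SP+,R11      * pop return address
       B    *R11
```

R0 is destroyed along the way, which fits the convention described above where the argument registers belong to the callee.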


1 hour ago, speccery said:

My opinion is that never :) 

Then you are concerned about execution performance, I presume?

In my programming life I also care about programming performance and reliability, not only execution speed. A BLWP can be handy for isolating the caller from the called routine, and isolation is usually good for preventing one stupid mistake from killing more than itself. The exception is real-time programming, where performance is crucial or it simply doesn't work.


But the stacking concept is good for recursion, direct or indirect.

Using R0-R3 as arguments could work, but not allowing the called routine any work registers (except destroy the arguments) doesn't sound efficient to me. I would probably prefer to pass arguments on the stack in that case, and make a local copy if I want to access it frequently before the subroutine has finished its job.

Having the stack pointer also be the frame pointer/environment pointer, isn't that a bit confusing if you need to push/pop temporaries on the stack? I've always preferred a separate frame pointer.
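
To make that concrete, here is a rough sketch of stack-passed arguments with a separate frame pointer. SP, FP and ADD2 are names invented for the example, R9 is assumed to be free for use as the frame pointer, the stack grows downward, and the caller and callee are separate fragments.

```asm
* Sketch only: SP, FP and ADD2 are hypothetical names.
SP     EQU  R10           * stack pointer
FP     EQU  R9            * frame pointer (assumed free in this program)

* Caller fragment: push two word arguments, result comes back in R0
       DECT SP
       MOV  R1,*SP        * push argument A
       DECT SP
       MOV  R2,*SP        * push argument B
       BL   @ADD2
       AI   SP,4          * caller drops both arguments
*      ...

* Callee fragment
ADD2   DECT SP
       MOV  R11,*SP       * push return address
       DECT SP
       MOV  FP,*SP        * push caller's frame pointer
       MOV  SP,FP         * FP stays fixed for this activation
* SP may now move for temporaries; offsets from FP stay valid:
* @0(FP)=old FP, @2(FP)=return address, @4(FP)=B, @6(FP)=A
       MOV  @6(FP),R0     * R0 = A
       A    @4(FP),R0     * R0 = A + B
       MOV  FP,SP         * discard any temporaries
       MOV  *SP+,FP       * restore caller's frame pointer
       MOV  *SP+,R11      * pop return address
       B    *R11
```

With only SP, every push or pop of a temporary would shift the offsets of A and B; with FP they stay at +6 and +4 for the whole routine.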
