Assembly on the 99/4A

Tursi · January 8, 2023

On 1/6/2023 at 10:38 AM, SteveB said:

I was surprised to see that a shift operation can be just as expensive as a MPY multiplication, though I do not really know how to read this table...

Basically...

First row: if the count is not zero, the instruction takes 12+2*C cycles and makes 3 memory accesses (read instruction, read data, write data).
Second row: if the count is zero, R0 supplies the count. If R0 is /also/ 0, this means 16 shifts. The number of cycles is 52, and there is one more memory access (to read R0).
Third row: if R0 is not zero, the cycle count is 20+2*N (N being the value in R0), plus the same extra memory access. They said N instead of C to avoid confusion about which count to use.

It's worth noting that the second and third row actually say the same thing (20+2*16 = 52 cycles!). I think they just wanted to clarify that shift count of 0 with R0 equal to 0 still has meaning.

(Also worth noting that the R0 count is taken from the 4 least significant bits, with the rest of R0 ignored.)
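For anyone who finds the table easier to read as code, here is a small Python model of the three rows as described above. It is only a sketch: the function name `shift_cycles` and the shape of its return value are made up for illustration, and the numbers are the ones quoted from the table.

```python
def shift_cycles(count_field, r0=0):
    """Return (shifts, cycles, memory_accesses) for a shift instruction.

    count_field is the 4-bit count encoded in the instruction (0..15);
    r0 is the 16-bit value of workspace register 0.
    """
    if not 0 <= count_field <= 15:
        raise ValueError("count field is 4 bits")
    if count_field != 0:
        # Row 1: the count comes from the instruction itself.
        return count_field, 12 + 2 * count_field, 3
    n = r0 & 0x000F  # only the 4 least significant bits of R0 are used
    if n == 0:
        # Row 2: count 0 and R0 low nibble 0 means 16 shifts.
        return 16, 52, 4
    # Row 3: count taken from R0, one extra access to read it.
    return n, 20 + 2 * n, 4
```

Reading N = 0 as "16 shifts" and plugging 16 into the row-three formula gives 20 + 2*16 = 52, which is the row-two figure; plugging in a literal 0 gives 20, which is why the table lists that case separately.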

apersson850 · January 8, 2023

4 hours ago, Tursi said:

It's worth noting that the second and third row actually say the same thing (20+2*16 = 52 cycles!). I think they just wanted to clarify that shift count of 0 with R0 equal to 0 still has meaning.

No, they don't say the same thing, since if N=0 then the bit count is 16 and the cycle count is 52. But 20+2*0=20, so the formula in the third row isn't valid if N=0.

Switch1995 · January 8, 2023

Apologies if addressed and I missed it. Does use of KSCAN risk destroying anything in scratch pad (like the built in E/A VDP routines would)?

Tursi · January 9, 2023

20 hours ago, apersson850 said:

No, they don't say the same thing, since if N=0 then the bit count is 16 and the cycle count is 52. But 20+2*0=20, so the formula in the third row isn't valid if N=0.

Edited. But you can keep the internet point I awarded.

Edited January 9, 2023 by Tursi

Tursi · January 9, 2023

14 hours ago, Switch1995 said:

Apologies if addressed and I missed it. Does use of KSCAN risk destroying anything in scratch pad (like the built in E/A VDP routines would)?

I think it only needs GPLWS (>83E0 onwards). It also changes the GROM address if a key is pressed.

+OLD CS1 · January 9, 2023

1 hour ago, Tursi said:

You win an internet point.

Hey, those things add up.

Tursi · January 9, 2023

Yeah, I'm gonna retract that post. Just getting grumpy from lack of retro coding time again

apersson850 · January 9, 2023

7 hours ago, Tursi said:

Edited. But you can keep the internet point I awarded.

Ouch! I missed the original version. How sad... 🤪

Sergioz82 · January 14, 2023

On 1/4/2023 at 8:36 PM, retrodroid said:

So here's another newbie to AL question.

I note that AL has multiple seemingly useful and powerful logical instructions to perform binary math operations. Coming from higher level languages, it's a bit of a mystery to me exactly *when* each of these might be useful. I've annotated the ones below that I've identified uses for, top of mind:

ANDI

CLR - reset word to zeros

COC

CZC

INV - Create a "masked"/selected effect for a char pattern?

ORI

SETO - Set word to "FFFF" / max value.

SLA - Multiply by 2 for each bit shifted, move LSB byte to the MSB position.

SOC

SRA - Divide by 2 for each bit shifted, move MSB byte to the LSB position.

SRC

SRL

SZC

XOR

Can someone provide references or examples when the others would come into play? So far the books I've seen all explain in one line and a simple example WHAT these do, but give no clue as to WHEN you might find them useful.

I feel like the "secret" to AL programming is knowing when to leverage these instructions. Thus far, I can kind of get where I need to go by using a small set of AL instructions pieced together in what is probably a horrifically clunky and inelegant manner, and I'd like to find resources to explain/show how/when to leverage these to create elegant AL code.

Hi

I don't have practical examples for all the instructions you ask but I hope this can help:

CLR: it's useful when you want to clear a register or a memory location. For example, say you want to clear the first 32 bytes of VDP memory:

       CLR  R0         ; VDP address to write to. Since it's >0000, CLR is perfect.
       CLR  R1         ; clear the whole word in R1 (CLR works on words only, so it can't clear a single byte).
;                        VSBW only needs the most significant byte, but CLR is a fast way to set it to 00
J1     BLWP @VSBW      ; VDP SINGLE BYTE WRITE: write the most significant byte of R1 to the VDP address held in R0
       INC  R0         ; go to the next byte
       CI   R0,32      ; have I cleared 32 bytes?
       JNE  J1         ; no: jump and repeat. yes: fall through to the next line of code.

CZC: in Spac Man I use this instruction to make the character accept the player's input only every 8 pixels, that is, when it's aligned with a passage and not halfway between a passage and a wall, where the input would be wasted:

       MOV  @PACYX,R0   ; high byte of PACYX is the Y value (row), low byte is X (column)
       AI   R0,>0100    ; add one pixel to Y: sprite rows start at >FF while character rows start at 0, and I need them aligned
;                         for the CZC check and, later, to determine which 8x8 character the sprite is on
       LI   R1,>0707    ; mask for a double CZC check on row and column: 7 is 00000111 in binary (8 is 00001000),
;                         so R1 holds 00000111 00000111
       CZC  R1,R0       ; if the bits set to 1 in R1 are all 0 in R0, both bytes of R0 are multiples of 8, so I can accept player input
       JNE  NOINPUT     ; otherwise skip the KSCAN routine. In my case that means keep moving in the same direction.
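The alignment test above can be modelled in Python to see which positions pass. This is a sketch only; `czc_equal` and `aligned_to_grid` are illustrative names, and the model ignores every status bit except EQ.

```python
def czc_equal(mask, value):
    """Model CZC: EQ is set when every bit that is 1 in mask is 0 in value."""
    return (value & mask) == 0


def aligned_to_grid(pacyx):
    """pacyx packs Y in the high byte and X in the low byte."""
    r0 = (pacyx + 0x0100) & 0xFFFF  # AI R0,>0100 with 16-bit wraparound
    return czc_equal(0x0707, r0)    # LI R1,>0707 then CZC R1,R0
```

A sprite at Y = >FF, X = >00 passes (Y becomes 0 after the adjustment), while moving one pixel in either direction makes the check fail until the next multiple of 8.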

SETO: I use it together with CLR to set a flag in a subroutine for detecting the character under the sprite. I reserved a word in RAM labeled DBLCHK.

I SETO @DBLCHK when I want to use the subroutine for reading two characters at a time (I use magnified sprites 16x16), otherwise I CLR @DBLCHK when I need to read a single character

SOC and SZC are quite handy when you want to manipulate words/bytes. For example:

Here I use SOCB (the same instruction, but for bytes) to change the color of the trap to the color of the ghost that has fallen into it.

       LI   R0,>390      ; VDP address of the color table entry for the trap charset
       LI   R1,DBLUE     ; dark blue on transparent, >40 (DBLUE EQU >4000), which is the trap's default color
       MOVB @3(R9),R10   ; R9 points to the row-column word of the captured ghost's sprite attribute list entry;
;                          the byte 3 bytes past that address (the sprite color, say >07, cyan) goes into the high byte of R10
       SOCB R10,R1       ; SOCB simply merges the two: R1 now holds >4700 *
       BLWP @VSBW        ; write the new color byte >47 to the trap charset

* If both colors sat in the same nibble, an SLA instruction would come in handy: suppose dark blue were >04; then before the SOCB I could add SLA R10,4 so that the high byte becomes >70 (provided the low byte of R10 is zero, since SLA shifts the whole word) and the SOCB result would be >74.
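Both the merge and the shifted variant from the footnote can be checked with a short Python model. The helper names `socb` and `sla` are illustrative; they model register-to-register SOCB (which works on the high byte of each register) and a 16-bit left shift, with status bits ignored.

```python
def socb(src_word, dst_word):
    """Model SOCB Rs,Rd: OR the high bytes, leave the low byte of Rd alone."""
    high = ((src_word | dst_word) >> 8) & 0xFF
    return (high << 8) | (dst_word & 0x00FF)


def sla(word, count):
    """Model SLA on a 16-bit word (bits shifted out the top are lost)."""
    return (word << count) & 0xFFFF


DBLUE = 0x4000
merged = socb(0x0700, DBLUE)              # ghost color >07 into >40
shifted = socb(sla(0x0700, 4), 0x0400)    # footnote case: both in the low nibble
```

Note that SLA shifts the whole word, so any bits left in the low byte of R10 would move up into the high byte; the second test case below shows that effect.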

Edited January 15, 2023 by Sergioz82

Stuart · January 15, 2023

2 hours ago, Sergioz82 said:

CLR is word only, you can't clear a byte without clearing the next one

You can clear a byte using ANDI. ANDI R1,>00FF will clear the upper byte of R1. ANDI R1,>FF00 will clear the lower byte of R1. Not that it makes the slightest difference to your example code. ;-)
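The two masks are easy to verify in Python; `andi` here is just an illustrative stand-in for the instruction, and the starting value is arbitrary.

```python
def andi(word, mask):
    """Model ANDI: AND a 16-bit register with an immediate value."""
    return word & mask & 0xFFFF


r1 = 0x12AB
upper_cleared = andi(r1, 0x00FF)  # upper byte becomes 00
lower_cleared = andi(r1, 0xFF00)  # lower byte becomes 00
```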

+Lee Stewart · January 15, 2023

5 hours ago, Sergioz82 said:

       CLR  R0         ; here goes the address you want to reach in VDP memory. Since it's >0000 then CLR is perfect.
       CLR  R1         ; clear the word value in R1 (CLR is word only, you can't clear a byte without clearing the next one).
;                        Only most significative byte is required by VSBW routine, but CLR is a fast way to set it 00
J1     BLWP @VSBW      ; VDP SINGLE BYTE WRITE routine: write the most significative byte in R1 in VDP memory at address contained in R0
       INC  R0         ; go to next byte in the table
       CI   R0,32      ; have I cleared 32 bytes?
       JNE  J1         ; no: jump and repeat. yes: go to the next line of code.

Perhaps not germane to your point, but calling VSBW in a loop is slow because it writes the VRAM address for every byte transferred. You might consider inlining the VRAM writes to take advantage of the automatic address-incrementing of VRAM access:

VDPWA  EQU  >8C02       ; VDP Write Address register
VDPWD  EQU  >8C00       ; VDP Write Data register

; ***Be sure interrupts are off for the following code***
; Write VRAM starting address
       LI   R0,>4000    ; Address of screen start + write-data flag
       CLR  R1          ; writing 0s to VRAM..also the LSB of the VDP write address
       LI   R2,VDPWA    ; writing to VDP write address register
       MOVB R1,*R2      ; write LSB of VRAM start address
       MOVB R0,*R2      ; write MSB of VRAM start address

; Write 32 0-bytes to beginning of screen
       DECT R2          ; correct R2 to point to VDP write data register
       LI   R3,32       ; load byte count
J1     MOVB R1,*R2      ; write 0-byte to next VRAM address
       DEC  R3          ; decrement byte count
       JNE  J1          ; repeat if count not expired..else next line of code
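The saving can be illustrated with a toy Python model of the two VDP ports. Everything here is an assumption made for illustration: the `Vdp` class, its counters, and the >FF fill are invented, and the model only captures the two-byte address setup (low byte first, then high byte with the >40 write flag) and the automatic address increment described above.

```python
class Vdp:
    """Toy model of the VDP address port and data port (16 KB of VRAM)."""

    def __init__(self):
        self.vram = bytearray([0xFF] * 0x4000)
        self.addr = 0
        self.pending = None       # first (low) address byte, if one was written
        self.address_writes = 0   # how many bytes went to the address port

    def write_address_port(self, byte):
        self.address_writes += 1
        if self.pending is None:
            self.pending = byte   # first byte: LSB of the address
        else:
            # second byte: MSB, with the >40 write flag masked off
            self.addr = ((byte & 0x3F) << 8) | self.pending
            self.pending = None

    def write_data_port(self, byte):
        self.vram[self.addr] = byte
        self.addr = (self.addr + 1) & 0x3FFF  # automatic address increment


def clear_like_vsbw_loop(vdp, start, count):
    """Set the address again before every byte, as a VSBW call does."""
    for offset in range(count):
        address = start + offset
        vdp.write_address_port(address & 0xFF)
        vdp.write_address_port((address >> 8) | 0x40)
        vdp.write_data_port(0)


def clear_inline(vdp, start, count):
    """Set the address once and let the VDP increment it."""
    vdp.write_address_port(start & 0xFF)
    vdp.write_address_port((start >> 8) | 0x40)
    for _ in range(count):
        vdp.write_data_port(0)
```

Both routines leave the same 32 zero bytes in VRAM, but the first sends 64 bytes to the address port and the second only 2. On the real machine the BLWP overhead of each VSBW call comes on top of that.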

...lee

Sergioz82 · January 15, 2023

9 hours ago, Stuart said:

You can clear a byte using ANDI. ANDI R1,>00FF will clear the upper byte of R1. ANDI R1,>FF00 will clear the lower byte of R1. Not that it makes the slightest difference to your example code.

Indeed. That's a good example for @retrodroid. Btw I edited my comment in that part, "you can't clear a byte without clearing the next one" on a second read it seemed too generic to me, like, it's not possible at all.

@Lee Stewart

Quote

Perhaps not germane to your point, but calling VSBW in a loop is slow because it writes the VRAM address for every byte transferred. You might consider inlining the VRAM writes to take advantage of the automatic address-incrementing of VRAM access:

That's an interesting example. These days I'm experimenting with bitmap mode so I'm looking for different takes on VDP access routines. In my case I wanted to make a "software sprite" and move it around the screen. The horizontal displacement is doable, the vertical.. I wonder if there are some methods to trick VDP internal counter to do something like MOVB R1,@8(R0)

Edited January 15, 2023 by Sergioz82

Asmusr · January 15, 2023

55 minutes ago, Sergioz82 said:

That's an interesting example. These days I'm experimenting with bitmap mode so I'm looking for different takes on VDP access routines. In my case I wanted to make a "software sprite" and move it around the screen. The horizontal displacement is doable, the vertical.. I wonder if there are some methods to trick VDP internal counter to do something like MOVB R1,@8(R0)

I suggest splitting the software sprite routine into 3 parts: first handle the lines belonging to the top incomplete character (if any), then the full characters, and finally the lines belonging to the bottom incomplete character (if any).
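As a rough illustration of that split, the following Python helper works out how many pixel rows fall into each of the three parts. It is a sketch under the assumption that each character cell covers 8 pixel rows; the name `split_rows` is invented here.

```python
def split_rows(y, height):
    """Split `height` pixel rows starting at row `y` into (top, full, bottom).

    top    = rows in the first, partially covered character
    full   = number of completely covered characters
    bottom = rows in the last, partially covered character
    """
    top = min((8 - y % 8) % 8, height)
    remaining = height - top
    return top, remaining // 8, remaining % 8
```

A 16-row sprite at y = 3, for example, gives 5 rows in the top character, 1 full character, and 3 rows in the bottom one. Within each part the rows are consecutive bytes, so the VDP address only needs to be set again when crossing into the next character.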

retrodroid · January 15, 2023

20 hours ago, Sergioz82 said:

Hi

I don't have practical examples for all the instructions you ask but I hope this can help:

CLR: it's useful when you want to clean a register or a memory location(s). For example let's say for some reasons you want to clear the first 32 bytes in VDP memory:

CLR R0 ; here goes the address you want to reach in VDP memory. Since it's >0000 then CLR is perfect.
CLR R1 ;clear the word value in R1 (CLR is for words only, you can't use it to clear a single byte). Only the most significant byte is required by the VSBW routine, but CLR is a fast way to set it to 00
J1 BLWP @VSBW ;VDP SINGLE BYTE WRITE routine: write the most significative byte in R1 in VDP memory at address contained in R0
INC R0 ;go to next byte in the table
CI R0,32 ;have I cleared 32 bytes?
JNE J1 ;no: jump and repeat. yes: go to the next line of code.

CZC: in Spac Man I use this instruction to make the character accept player's input every 8 pixels interval, that is when it's aligned with a passage and not in between a passage and a wall where the input would go wasted:

MOV @PACYX,R0 ;high byte of PACYX is Y value (row), low byte X (column)
AI R0,>0100 ;add one pixel in Y as row for sprites starts at FF while for characters at 0 but I need them aligned for CZC check and later to determine on which 8x8 character the sprite is on.
LI R1,>0707 ;prepare R1 for double CZC check in both row and column. In a single byte, 7 means 00000111 in binary. 8 is 00001000. R1 content in binary is 00000111 00000111
CZC R1,R0 ;if the bits set to 1 in R1 are 0 in R0 it means inside R0 I have a multiple of 8 so I can accept input player
JNE NOINPUT ;If the correspondence is false then jump (skip) the KSCAN routine. In my case it means keep moving in the same direction.

SETO: I use it together with CLR to set a flag in a subroutine for detecting the character under the sprite. I reserved a word in RAM labeled DBLCHK.

I SETO @DBLCHK when I want to use the subroutine for reading two characters at a time (I use magnified sprites 16x16), otherwise I CLR @DBLCHK when I need to read a single character

SOC and SZC are quite handy when you want to manipulate words/bytes. For example:

Here I use SOCB (same command but for bytes) to change the color of the trap to the color of the ghost that has fallen into.

LI R0,>390 ;VDP address for color table of trap charset
LI R1,DBLUE ;I prepare R1 with dark blue on transparent value 40 (DBLUE EQU >4000) which is the trap default color
MOVB @3(R9),R10 ;R9 points to the row-column word at sprite attribute list entry for the captured ghost. I then use the pointer to move what's 3 bytes ahead that address (the sprite color, let's say 07 cyan) in R10 high byte
SOCB R10,R1 ;SOCB in this case simply merges the two: now in R1 I have >4700 *
BLWP @VSBW ;write the new color byte 47 in trap charset

* if color nibbles were the same for both then it could come handy a SLA instruction: let's suppose dark blue was 04 then before SOCB I could add SLA R10,4 so the result was >70 and SOCB result would have been 74

Thanks for this, these are exactly the type of real-world examples I was looking for.

As I develop my "rendering prototype", basically a test app where I figure out how I can render various aspects of my game, and the required techniques, etc. in AL, I've been caught up numerous times already by the 16-bit word vs. 1-byte nature of many of the VDP operations. Usually when something doesn't work I'll come back to it later and instantly see that I'm setting values in the low nibble that are expected to be in the high, etc. I think I'm starting to get the hang of it.

Edited January 15, 2023 by retrodroid

retrodroid · January 15, 2023

I have a game-technique question I'd like to pose to the experienced wizards on this forum.

Thus far, I've been using the vsync interrupt (via polling the status bit) to provide a counter that I can use to determine the relative timings for animations and object movements in my game. It's pretty simple to animate some object by updating the charset pattern every 4 or 8 frames (0.067 s or 0.133 s at 60 Hz), for example.

However, in my game certain enemies not only have their own inherent relative speed to say each other, or the player character, but also I need to be able to increase/multiply their speeds as the difficulty increases over time. So enemy X might move at 1/2 the speed of P1 at the start of the game (e.g. 1 pixel every 4 frames), but over time their speed could increase by 25% per level. This means that in order to increase their apparent movement across the screen accordingly, I not only need to figure out some ratio of frames to movement to apply (say, move 1 px every 3/5 frames), but also eventually be able to also move them more than 1px per frame for faster speeds.

I'm guessing I need to figure out a way to express their speed as "pixels-per-second", and then, using that and the 1/60th (or 1/50th for PAL) vsync interrupt, determine how many pixels they need to move each frame, over time. For example, if the desired speed is "60 px per second" then simply moving the object 1 px per vsync cycle meets that requirement. But if the desired speed is 45 px per second, I need some fractional math to tell me if the object should be moved during each vsync cycle, and by how many pixels.

This all seems like a complex math problem, and not something I'm very comfortable implementing in AL at this point. Any suggestions are appreciated.

apersson850 · January 15, 2023

9 hours ago, Sergioz82 said:

That's an interesting example. These days I'm experimenting with bitmap mode so I'm looking for different takes on VDP access routines. In my case I wanted to make a "software sprite" and move it around the screen. The horizontal displacement is doable, the vertical.. I wonder if there are some methods to trick VDP internal counter to do something like MOVB R1,@8(R0)

I recognize that from when I developed routines for printing text in bitmap mode, where the text may be placed anywhere (not just in the block it would normally reside). You get four quadrants to handle.

@retrodroid Don't forget that the slowest instruction of them all, DIV, will calculate both the quotient and the remainder at the same time.

speed div 50 = how many pixels to move at the interrupt.

speed mod 50 = how many of them you should move one pixel more.

So you move the number of pixels from the first formula on every interrupt, and from the second you get how many out of 50 interrupts you should add one more pixel.

For example if your speed is 52, then you get 1 and 2. So for each interrupt you move one pixel, but for two out of 50 you move one more.

Depending on how accurate you want it to be, you can just count integer number of interrupts, or you can create bit masks with bits indicating at which interrupt to add one more pixel. That table doesn't get too large and will be pretty fast to handle, if you do a lookup style access.
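A quick Python model of the div/mod idea, for illustration only. The function name `pixels_per_interrupt` is invented, and spreading the extra pixels evenly with a running sum is just one possible choice; a bit-mask lookup table as described above would serve the same purpose.

```python
def pixels_per_interrupt(speed, rate=50):
    """Return the pixels to move on each of `rate` consecutive interrupts."""
    base, extra = divmod(speed, rate)  # quotient and remainder, as DIV delivers
    steps = []
    acc = 0
    for _ in range(rate):
        acc += extra
        if acc >= rate:
            # one of the `extra` interrupts that gets one more pixel
            acc -= rate
            steps.append(base + 1)
        else:
            steps.append(base)
    return steps
```

With a speed of 52, `divmod(52, 50)` gives 1 and 2, so 48 interrupts move one pixel and 2 move two, for a total of 52 pixels per second.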

PeteE · January 15, 2023

I think it could be expressed using three variables: speed, counter and a threshold. The speed is added to the counter, and while the counter is at or above the threshold, move the object and subtract the threshold from the counter. If the object moves 1px every frame, then the speed will be equal to the threshold. If the object moves more than 1px every frame, the speed is greater than the threshold, etc.

With threshold = 100:

Speed = 25, move 1px every 4 frames

Speed = 33, move 1px every 3 frames (close enough)

Speed = 20, move 1px every 5 frames

Speed = 200, move 2px every frame

Speed = 400, move 4px every frame

Something like this code:

  ; initialization
  LI  R0,10      ; speed
  CLR R1         ; counter

  ; game loop
GAMELP

  A   R0,R1      ; add speed to counter
CHKMOV ; check if object can move this frame
  CI  R1,100     ; counter < threshold?
  JL  NOMOVE
  AI  R1,-100    ; subtract threshold from counter
  BL  @MOVOBJ    ; move object 1px
  JMP CHKMOV
NOMOVE
  ; game loop continues

  ; wait for vsync
  BL  @VSYNC

  JMP GAMELP
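The same loop, modelled in Python so the speed table above can be checked. This is a sketch; `simulate` is an illustrative name, and the comments map each step to the corresponding instruction in the listing.

```python
def simulate(speed, frames, threshold=100):
    """Return the number of 1 px moves made on each frame."""
    counter = 0
    moves_per_frame = []
    for _ in range(frames):
        counter += speed             # A R0,R1
        moves = 0
        while counter >= threshold:  # CI R1,100 / JL NOMOVE
            counter -= threshold     # AI R1,-100
            moves += 1               # BL @MOVOBJ
        moves_per_frame.append(moves)
    return moves_per_frame
```

A speed of 25 moves on every 4th frame, 200 moves twice per frame, and 33 makes 99 moves in 300 frames, which is the "close enough" case.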

Asmusr · January 15, 2023

It sounds like you need to look into fixed point arithmetic, which is much simpler than it sounds. For instance, you can store positions and velocities as 8.8 fixed point numbers where the 8 most significant bits are the integer part and the 8 least significant bits are the fraction part. In this representation 256 is 1, 512 is 2, 128 is 0.5, 64 is 0.25, 192 is 0.75, and so on. It's like storing your numbers multiplied by 256. You can add and subtract the numbers as usual, and when you need to display the sprites you just use the most significant byte. You can also use fewer bits for the fraction, in which case you need to shift to get the integer part.
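Here is a small Python sketch of the 8.8 representation. The helper names (`to_fixed`, `integer_part`, `track`) are made up for illustration; the point is only that the per-frame update is a plain 16-bit add and the screen coordinate is the high byte.

```python
ONE = 256  # 1.0 in 8.8 fixed point


def to_fixed(value):
    """Convert a number to 8.8 fixed point (stored in a 16-bit word)."""
    return int(round(value * ONE)) & 0xFFFF


def integer_part(fixed):
    """The most significant byte is the on-screen pixel coordinate."""
    return (fixed >> 8) & 0xFF


def track(start_px, velocity, frames):
    """Add a fixed-point velocity once per frame and collect pixel positions."""
    x = to_fixed(start_px)
    positions = []
    for _ in range(frames):
        x = (x + velocity) & 0xFFFF  # plain 16-bit add
        positions.append(integer_part(x))
    return positions
```

Starting at pixel 10 with a velocity of 192 (0.75), four frames give the positions 10, 11, 12, 13: three pixels of movement in four frames, with no division anywhere.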

retrodroid · January 16, 2023

7 hours ago, apersson850 said:

I recognize that from when I developed routines for printing text in bitmap mode, where the text may be placed anywhere (not just in the block it would normally reside). You get four quadrants to handle.

@retrodroid Don't forget that the slowest instruction of them all, DIV, will calculate both the quotient and the remainder at the same time.

speed div 50 = how many pixels to move at the interrupt.

speed mod 50 = how many of them you should move one pixel more.

So you move the number of pixels from the first formula on every interrupt, and from the second you get how many out of 50 interrupts you should add one more pixel.

For example if your speed is 52, then you get 1 and 2. So for each interrupt you move one pixel, but for two out of 50 you move one more.

Depending on how accurate you want it to be, you can just count integer number of interrupts, or you can create bit masks with bits indicating at which interrupt to add one more pixel. That table doesn't get too large and will be pretty fast to handle, if you do a lookup style access.

Your post resembles my own thinking on how to approach this, but "clarified".

The speed calculations will only be necessary at the start of each of new screen, so are not especially time-critical (so DIVs should be fine).

Here is my amateurish code for how I control how often a sprite should be moved or animated, using the vsync interrupt:

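*      Only do what follows every 6 frames. @FRCTR holds the frame number.
       CLR  R4             ; high word of the 32-bit dividend R4:R5
       MOV  @FRCTR,R5      ; low word of the dividend = frame number
       LI   R0,6           ; divisor
       DIV  R0,R4          ; quotient in R4, remainder in R5
       CI   R5,0           ; remainder zero?
       JNE  SKIPIT         ; no: not a 6th frame
*      Put stuff here that you want done every 6 frames.

SKIPIT
*      Put stuff here that you want done every frame.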

This seems to work but I'm using it all over the place. I was thinking I could build a data table with the list of binary mask values, each binary digit representing the corresponding frame count, then change the FRAMECTR to a simple index into the list of values, and use bitmask AND to check if the current FRAMECTR value was a match for the required frame number, instead of the current DIV logic.


BYTE >01 ; 0000 0001  Matches every frame (60 FPS)
BYTE >03 ; 0000 0011  Matches every 2nd frame and every frame.
BYTE >05 ; 0000 0101  Matches every 3rd frame and every frame.
BYTE >0B ; 0000 1011  Matches every 4th, every 2nd, and every frame.
BYTE >11 ; 0001 0001  Matches every 5th frame and every frame.
BYTE >27 ; 0010 0111  Matches every 6th, every 3rd, every 2nd, and every frame.
BYTE >41 ; 0100 0001  Matches every 7th frame and every frame.
BYTE >8B ; 1000 1011  Matches every 8th, every 4th, every 2nd, and every frame.
ETC...

That kind of approach seems like it would be much more efficient since I would be referencing this code in many places each frame.
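One way to sanity-check such a table is to generate it. The Python sketch below assumes the rule "bit k-1 of entry n is set when k divides n" for frames 1 to 8; the names `divisor_mask`, `TABLE` and `due` are invented for illustration.

```python
def divisor_mask(n):
    """Bit k-1 is set when k divides n (k = 1..8)."""
    mask = 0
    for k in range(1, 9):
        if n % k == 0:
            mask |= 1 << (k - 1)
    return mask


TABLE = [divisor_mask(n) for n in range(1, 9)]


def due(frame, interval):
    """True when an 'every interval-th frame' task runs on `frame` (1..8)."""
    return bool(TABLE[frame - 1] & (1 << (interval - 1)))
```

Under this rule the entry for frame 6 is >27 (bits for 1, 2, 3 and 6), since the 6th frame is also a 2nd frame. At run time the check is a single table lookup and a mask test, with no DIV.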

Edited January 16, 2023 by retrodroid

retrodroid · January 16, 2023

5 hours ago, Asmusr said:

It sounds like you need to look into fixed point arithmetic, which is much simpler than it sounds. For instance, you can store positions and velocities as 8.8 fixed point numbers where the 8 most significant bits are the integer part and the 8 least significant bits are the fraction part. In this representation 256 is 1, 512 is 2, 128 is 0.5, 64 is 0.25, 192 is 0.75, and so on. It's like storing your numbers multiplied by 256. You can add and subtract the numbers as usual, and when you need to display the sprites you just use the most significant byte. You can also use fewer bits for the fraction, in which case you need to shift to get the integer part.

Yes, I see how that could work. The "speed" tracking is what I'll need for my actual game, currently my playground code is just counting frames (as noted in the post above) as a simple but limited way to control execution timing for things.

Thanks!

apersson850 · January 17, 2023

On 1/15/2023 at 9:01 PM, Asmusr said:

It sounds like you need to look into fixed point arithmetic, which is much simpler than it sounds. For instance, you can store positions and velocities as 8.8 fixed point numbers where the 8 most significant bits are the integer part and the 8 least significant bits are the fraction part. In this representation 256 is 1, 512 is 2, 128 is 0.5, 64 is 0.25, 192 is 0.75, and so on. It's like storing your numbers multiplied by 256. You can add and subtract the numbers as usual, and when you need to display the sprites you just use the most significant byte. You can also use fewer bits for the fraction, in which case you need to shift to get the integer part.

That may be the single most efficient way of handling it.

retrodroid · January 23, 2023

On 1/15/2023 at 12:09 PM, PeteE said:
I think it could be expressed using three variables: speed, counter and a threshold. The speed is added to the counter, and while the counter exceeds the threshold, move the object and subtract the threshold from the counter. If the object moves 1px every frame, then the speed will be equal to the threshold. If the object moves more than 1px every frame, the speed is greater than the threshold, etc.

With threshold = 100:

Speed = 25, move 1px every 4 frames

Speed = 33, move 1px every 3 frames (close enough)

Speed = 20, move 1px every 5 frames

Speed = 200, move 2px every frame

Speed = 400, move 4px every frame

Something like this code:
  ; initialization
  LI  R0, 10     ; speed
  CLR R1         ; counter

  ; game loop
GAMELP

  A   R0, R1     ; add speed to counter
CHKMOV ; check if object can move this frame
  CI  R1, 100    ; counter < threshold?
  JL  NOMOVE
  AI  R1,-100    ; subtract threshold from counter
  BL  @MOVOBJ    ; move object 1px
  JMP CHKMOV
NOMOVE
  ; game loop continues

  ; wait for vsync
  BL  @VSYNC

  JMP GAMELP

After working it and trying different things I've come to realize your approach above works just fine and has the benefit of simplicity (no division required at all). This is working well in my game now, thanks.

Now I have a new problem, my main game has exploded in size due to all the different variables/states that must be dealt with - e.g. Player mode (standing, walking, jumping, falling, etc.), input direction (Up, Down, Right, Left), and different navigation, boundary and intersection checks that must be used, etc. etc. This means that I'm starting to see the very convenient "JEQ" opcodes start to fail due to being too far away from their targets.
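For reference, the jump instructions hold a signed 8-bit displacement counted in words from the word after the jump, so the reach is limited. The Python sketch below checks whether a target is reachable; the function name and the addresses used with it are hypothetical.

```python
def jump_displacement(jump_addr, target_addr):
    """Return the word displacement a jump at jump_addr needs, or None if out of range."""
    delta = target_addr - (jump_addr + 2)  # relative to the word after the jump
    if delta % 2:
        raise ValueError("jump targets are word aligned")
    disp = delta // 2
    return disp if -128 <= disp <= 127 else None
```

When a target falls outside that range, a common workaround is to invert the condition and jump over an unconditional `B @TARGET`, which can reach any address.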

So I'm making an effort to encapsulate some of the main/commonly used algorithms into more abstract routines that can be called from multiple modes/places with different configuring parameters (set in registers and variables), in order to reduce code duplication and overall code size.

A problem I have is that I need to be able to branch to a routine stored in a register, sort of a way for each mode to define its own intersect detect behaviour, and then have the common code invoke the routine that was passed in. I'm having a bear of a time with this, and nothing seems to work.

Here's an example:

DOSTUFF
	MOV  R11,*R10+                  * Push return address onto the stack

	; Stuff goes here

	DECT R10                        * Pop return address off the stack
	MOV  *R10,R11
	B    *R11
*// EDOSTUFF

	...	
	MOV  @DOSTUFF,R3
	BL   *R3                     ; Branch to char collision routine

I've tried flailing around with the exact syntax but no matter what I try I can't get my "DOSTUFF" routine to be invoked by the branch, instead I get a colourful system crash/lockup.

What's the best-practice for this type of operation?

Asmusr · January 23, 2023

59 minutes ago, retrodroid said:

What's the best-practice for this type of operation?

You need:

BL @DOSTUFF

or

LI R3,DOSTUFF

BL *R3

or

MOV @DOSTUFFADDR,R3

BL *R3

DOSTUFFADDR: DATA DOSTUFF

+Lee Stewart · January 23, 2023

1 hour ago, retrodroid said:

I've tried flailing around with the exact syntax but no matter what I try I can't get my "DOSTUFF" routine to be invoked by the branch, instead I get a colourful system crash/lockup.

What's the best-practice for this type of operation?

What @Asmusr said. What you were doing was branching to the contents of DOSTUFF, which is a MOV instruction that evaluates to >CE8B—not the address (DOSTUFF) you intended.

That said, it is not clear to me why you go to the extra trouble of loading a register with the address of your routine instead of @Asmusr’s first suggestion of simply BLing to the address.
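The difference between loading the contents and loading the address can be shown with a tiny Python model. The load address >A000 is purely hypothetical, and `encode_mov` is an illustrative helper that packs the fields of a word MOV (opcode, Td, D, Ts, S) to reproduce the >CE8B value mentioned above.

```python
def encode_mov(ts, s, td, d):
    """Encode a word MOV: opcode 1100, then the Td, D, Ts and S fields."""
    return (0b1100 << 12) | (td << 10) | (d << 6) | (ts << 4) | s


DOSTUFF = 0xA000  # hypothetical address where the routine was loaded

# First word of the routine: MOV R11,*R10+
# source = R11 (Ts = 0), destination = *R10+ (Td = 3, D = 10)
memory = {DOSTUFF: encode_mov(ts=0, s=11, td=3, d=10)}

r3_contents = memory[DOSTUFF]  # MOV @DOSTUFF,R3 copies the word stored there
r3_address = DOSTUFF           # LI R3,DOSTUFF loads the address itself
```

With the first form, `BL *R3` heads for >CE8B instead of the routine, which matches the crash described.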

...lee

retrodroid · January 23, 2023

2 hours ago, Asmusr said:

You need:

BL @DOSTUFF

or

LI R3,DOSTUFF

BL *R3

or

MOV @DOSTUFFADDR,R3

BL *R3

DOSTUFFADDR: DATA DOSTUFF

Thank you!

1 hour ago, Lee Stewart said:

What @Asmusr said. What you were doing was branching to the contents of DOSTUFF, which is a MOV instruction that evaluates to >CE8B—not the address (DOSTUFF) you intended.

That said, it is not clear to me why you go to the extra trouble of loading a register with the address of your routine instead of @Asmusr’s first suggestion of simply BLing to the address.

...lee

I'm creating some more generic routines to do things like player sprite movement with different parameters depending on which mode they are in (walk, fall, climb, etc.) (to reduce code bloat). It's data-driven, using data lists I pass in in other registers. So I can configure the routine in the calling routine by setting up some registers to different lists of values, and even in this case other routines to be optionally called.
