Assembly on the 99/4A


matthew180

21 hours ago, GDMike said:

 Msg: #100

I'm thinking to just add my own post # at tos into my header.

Does adding a number to the post take up so much space it had to be removed?

It is a stupid change that makes the reference to a post much harder to find and is a disservice to the users, and I do pay them money each month.

I do not like to pay for less service and more work on my part. Since they are getting my money, it should not have been changed.

Again, it was a stupid change.


12 minutes ago, RXB said:

Does adding a number to the post take up so much space it had to be removed?

It is a stupid change that makes the reference to a post much harder to find and is a disservice to the users, and I do pay them money each month.

I do not like to pay for less service and more work on my part. Since they are getting my money, it should not have been changed.

Again, it was a stupid change.

Msg# 102

It's a recession... getting less these days..

Sorry, just being tupid.

Maybe the other software wasn't secure 

Edited by GDMike

On 11/1/2022 at 9:20 PM, RXB said:

Pretty much limits you to Assembly only.

I've used the ISR hook in a program written in Extended BASIC. Supported by assembly, of course.

 

Regarding traditional vs. optimized assembly: The thread that list comes from was about a benchmark, where "traditional" was the first approach that came to somebody's mind, and "optimized" was the best of a number of attempts by various people to improve execution speed.

 

I've used this division of programs in a project once:

  • Extended BASIC for main program and a rudimentary assembly program loader which could load a memory image file. Either named file on disk or the first found on the tape.
  • Assembly program which is loaded by the Extended BASIC program and will load the main assembly language support program (to be called from Extended BASIC) and will also load the character definitions for true descenders from a file. This program is written in such a way that the part of it in 8 K RAM that's only used at startup is at the end, and is then overwritten by data buffers when the program is running. Other parts of it are loaded into 24 K RAM, in an area Extended BASIC doesn't need.
  • An Extended BASIC program used to load the assembly files in their correct places in memory and then save them as memory image files. This program must run on a disk system, but can create files on a cassette, so the final program can be used with a cassette system.
  • An assembly language program which supports memory image file handling for the creation of the software package.

This was a two-man project, where a friend wrote the first program and I did the last three.

I write about this just to show that the concept of having a program that creates a program to be used by a program is not at all a novel idea.


  • 1 month later...

Howdy folks.  As described in the 

 ...thread, I'm working on a platform game, now to be pure AL. Also, I'm an AL newbie, so please bear with my ignorance.

 

I've successfully ported my "rendering prototype" from XB256 to AL, incl. one test screen, char definitions, and the colour table. This is all working well, and I have evidence that my sprite definitions are also being loaded successfully. However, I seem to have hit a roadblock trying to get any sprites on the screen, at least ones under my control ;).

 

Here is my sprite-related code:

 

Defines some labels:

SATRTB EQU  >0380                      * Sprite attribute table base
SPGTB  EQU  >2800                      * Sprite pattern generator table base

 

Inits the Graphics Mode:

 

       LI   R0, >01E2                  * 11100010 Graphics I
       BL   @VWTR                      * 16K,No Blank,Enable Int,M1,M2,0,16x16,1x Mag
       LI   R0,>0200          * Name Base Table to >0000 - >02FF (768 bytes)
       BL   @VWTR
       LI   R0,>030C          * Color Table to >0300 - >031F (32 bytes)
       BL   @VWTR
       LI   R0,>0404          * Pattern Generator Table
       BL   @VWTR             * >2000 - >27FF (2048 bytes)
       LI   R0,>0507          * Sprite Attribute Table
       BL   @VWTR             * >0380 - >03FF (128 bytes)
       LI   R0,>0605          * Sprite Pattern Table
       BL   @VWTR             * >2800 - >2BFF (1024 bytes)

       LI   R0,SATRTB         * Disable all sprite processing by writing
       LI   R1,>D000          * >D0 (208) to the vertical position of the
       BL   @VSBW             * first sprite entry

 

Loads the sprite patterns into the Sprite Pattern Table:

*      Load Sprite Patterns
       LI   R0, SPGTB
       LI   R1, SPRDAT
       LI   R2, SPREND - SPRDAT
       BL   @VMBW

 

Then populates the Sprite Attribute Table with 1 sprite via:

	LI  R0,SATRTB        ;Address of Sprite Attribute List.
	LI  R1,SL1           ;Pointer to the list.
	LI  R2,4	         ;Copy 1st sprite only
	BLWP @VMBW       	 ;Move the list

 

Sprite Attribute table:

****************************************
* Sprite Locations
****************************************
SL1    DATA >60A0, >800F

 

Sprite Pattern table:

****************************************
* Sprite Patterns
****************************************
SPRDAT
       DATA >0000, >001F, >3F7F, >67C7 ; Color 15
       DATA >FDFD, >7F0A, >201A, >1F0F ;
       DATA >0000, >00C0, >E0F0, >3018 ;
       DATA >F8F8, >F080, >20C0, >C080 ;

 

 

...and of course, nothing appears on the screen! 🤨

 

One thing I cannot figure out is how the sprite pattern mapping is supposed to work in AL. The E/A manual states:

Quote

In the Editor/Assembler, the Sprite Descriptor Table starts at address >0000 for pattern code >00. However, addresses >0400 and above are usually used for the block because the lower addresses are used for the Screen Image Table, Color Table, and Sprite Attribute List. The pattern defined starting at address >0400 is referred to as pattern code >80 in the Sprite Attribute Table.

Huh?

 

How do you reference the sprite pattern that you want to use in the Sprite Attribute Table?  

 

I thought since the Sprite Pattern Table starts at >2800 maybe it should be >2800 + (16*(Sprite#-1)), but you only get one byte to specify it.  Simply using >00 or >01 didn't work either.

 

 

 

Edited by retrodroid

2 hours ago, retrodroid said:

How do you reference the sprite pattern that you want to use in the Sprite Attribute Table?  

 

I thought since the Sprite Pattern Table starts at >2800 maybe it should be >2800 + (16*(Sprite#-1)), but you only get one byte to specify it.  Simply using >00 or >01 didn't work either.

 

First off, I would use the defaults for all of the tables except, possibly, the Sprite Descriptor Table. For that, I make it coincident with the Pattern Descriptor Table and use codes >00 – >1E (0 – 30) for up to 31 sprites and some higher code (>FF, 255) for sprite #31. Or, just use all upper codes (>E0 – >FF, 224 – 255). If you really must use a space different from the PDT, I would use a space starting just after the PDT at >1000. This way, you don’t split up VRAM space. But whatever floats your boat.

 

That said and regarding your question above, patterns for characters (and standard-size sprites) are 8 bytes each. The patterns are referenced from the start of the respective pattern table beginning at character >00. In your code, the pattern for character >00 starts at VRAM address SPGTB. The pattern for each character can be found by multiplying the character number by 8. This is only important for changing the pattern for a given character. When you associate a sprite with a character (in the Sprite Attribute Table), you only need the character number (>00 – >FF, 0 – 255) for any given pattern.

 

Regarding why your sprite is not showing, you have specified character >80 when you should have specified character >00 because you have only defined the first 4 character patterns.
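
To make the pattern-code arithmetic concrete, here is a minimal sketch. It assumes the SPGTB value from the post above and 16x16 sprites; PATNUM and SL1FIX are hypothetical names used only for illustration. With 16x16 sprites, each sprite takes four consecutive 8-byte patterns (32 bytes), so pattern codes are normally multiples of 4.

```asm
SPGTB  EQU  >2800             * Sprite pattern table base (from the post)
PATNUM EQU  >00               * Hypothetical: pattern code to locate

* VRAM address of a pattern = table base + 8 * pattern code.
* Only needed when writing pattern data, not when filling the
* Sprite Attribute Table.
       LI   R0,PATNUM
       SLA  R0,3              * R0 = pattern code * 8
       AI   R0,SPGTB          * R0 = VRAM address of that pattern

* Data section (keep it out of the execution path).
* Attribute entry layout: Y, X, pattern code, color.
SL1FIX DATA >60A0,>000F       * Y=>60, X=>A0, pattern >00, color >0F
```

The only difference from the original SL1 entry is the third byte: >00 instead of >80, so the sprite points at the first pattern that was actually loaded at SPGTB.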

 

...lee 


1 hour ago, Lee Stewart said:

 

First off, I would use the defaults for all of the tables except, possibly, the Sprite Descriptor Table. For that, I make it coincident with the Pattern Descriptor Table and use codes >00 – >1E (0 – 30) for up to 31 sprites and some higher code (>FF, 255) for sprite #31. Or, just use all upper codes (>E0 – >FF, 224 – 255). If you really must use a space different from the PDT, I would use a space starting just after the PDT at >1000. This way, you don’t split up VRAM space. But whatever floats your boat.

 

That said and regarding your question above, patterns for characters (and standard-size sprites) are 8 bytes each. The patterns are referenced from the start of the respective pattern table beginning at character >00. In your code, the pattern for character >00 starts at VRAM address SPGTB. The pattern for each character can be found by multiplying the character number by 8. This is only important for changing the pattern for a given character. When you associate a sprite with a character (in the Sprite Attribute Table), you only need the character number (>00 – >FF, 0 – 255) for any given pattern.

 

Regarding why your sprite is not showing, you have specified character >80 when you should have specified character >00 because you have only defined the first 4 character patterns.

 

...lee 

Thank-you!  I was getting lost in the woods mentally with this one.

 

It turns out that the source of the problem was that I had copied in some sample code from a couple of different sources, the primary one being the "Viewport" example from this very thread, and the other the Sprite example from the E/A manual.   The "Viewport" example disables the interrupt and replaces all the standard libs with faster versions of its own, which is the model I plan to imitate. However, the E/A example uses the standard library calls with the interrupt enabled.  At the end of the day, the reason my Sprite Attribute Table failed to load was that I was inadvertently using "BLWP @VMBW" instead of just "BL @VMBW".  Your note about following the standard locations reminded me that I was using code from two different approaches, and I found my problem minutes later.  👍
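
For anyone hitting the same thing, a rough sketch of why the two calls are not interchangeable. BL jumps straight to the operand address and leaves the return address in R11. BLWP instead treats the operand as the address of a two-word vector (new workspace pointer, then new program counter). So a routine written to be entered with BL cannot be entered with BLWP: the first two words of its code would be taken as the vector. The routine below is only an illustration of a BL-callable block write, not the actual code from the "Viewport" example; VMBWLP is a made-up label.

```asm
VDPWD  EQU  >8C00             * VDP write data port
VDPWA  EQU  >8C02             * VDP address port

* BL-callable VDP multiple byte write (illustrative sketch).
* In:  R0 = VRAM address, R1 = CPU source address, R2 = byte count (> 0)
* Out: R0 has the write bit set, R1 and R2 are modified.
VMBW   ORI  R0,>4000          * set the "write" flag in the address
       SWPB R0
       MOVB R0,@VDPWA         * send low address byte first
       SWPB R0
       MOVB R0,@VDPWA         * then high byte with write flag
VMBWLP MOVB *R1+,@VDPWD       * copy one byte to VRAM
       DEC  R2
       JNE  VMBWLP
       RT                     * return via R11

* Caller: must use BL, because the routine returns with RT.
*      LI   R0,>0380
*      LI   R1,SL1
*      LI   R2,4
*      BL   @VMBW
```

The E/A utility of the same name is the opposite case: it is reached through a vector and has its own workspace, so it is called with BLWP @VMBW.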

 

I'm embarrassed to admit that I messed around for more than a day trying to get that code to do something.  

 

Once I got my sprites loading, I immediately added a bunch more to my test list, and then pondered for a little too long when it kept only showing the first 4 sprites in the list...    :)    ...got that resolved too. haha.

 

Edited by retrodroid

14 minutes ago, retrodroid said:

I messed around for more than a day trying to get that code to do something.  

That's all? Wow, that's not bad. When you read the E/A manual, you have to look up the definition of every word, because most of the time it assumes you already know it. Even if things take a month, you will then know them forever if you keep good notes. You're on the right path.

Edited by GDMike
Link to comment
Share on other sites

Okay, here's a bit of a devil's advocate question.  

 

I have come to understand that it is a best practice to not access the VDP memory from the CPU side except immediately after a vertical sync interrupt (vertical retrace signal), and that there is a pocket of time during this event where the VDP is otherwise idle and the CPU can access the VDP RAM much faster than at other times, since at other times the VDP is busy painting the screen.  Let me know if I have that correct or not.

 

In my own test programs I've noticed that if I wait for the vsync interrupt before I do anything I get a nice solid 60hz cycle/update frequency.  I've also noticed that if I ignore the vsync interrupt and let my program run at full-speed, everything still seems to work, but much much faster.  Maybe I notice a little screen-tearing if I'm updating the screen each cycle.

 

Is there a risk of data-corruption or some other failure scenario by not waiting for the vsync? If not why wouldn't I want my program to run 10x faster? Why is waiting for the vsync a thing?

 


8 hours ago, retrodroid said:

Okay, here's a bit of a devil's advocate question.  

 

I have come to understand that it is a best practice to not access the VDP memory from the CPU side except immediately after a vertical sync interrupt (vertical retrace signal), and that there is a pocket of time during this event where the VDP is otherwise idle and the CPU can access the VDP RAM much faster than at other times, since at other times the VDP is busy painting the screen.  Let me know if I have that correct or not.

 

In my own test programs I've noticed that if I wait for the vsync interrupt before I do anything I get a nice solid 60hz cycle/update frequency.  I've also noticed that if I ignore the vsync interrupt and let my program run at full-speed, everything still seems to work, but much much faster.  Maybe I notice a little screen-tearing if I'm updating the screen each cycle.

 

Is there a risk of data-corruption or some other failure scenario by not waiting for the vsync? If not why wouldn't I want my program to run 10x faster? Why is waiting for the vsync a thing?

 

It depends what you're doing.

 

On the TI-99/4A, only relatively special sequences of code can overrun the VDP, so the usual concerns about access speed causing lost data on other platforms are much less of a concern on the TI. (Generally, again unless you are writing especially high-performance code, the only situation to be wary of is reading after setting the VDP read address: there needs to be 8 µs between those steps, and if you are running in scratchpad with scratchpad registers and the address of the VDP in a register (yes, all that), then it's possible to sometimes read before the data is ready.) The CPU/hardware combination is otherwise too slow (especially for writes, due to the multiplexer injecting wait states even on the read-before-write cycle).

 

Anyway, nothing about accessing the VDP affects the CPU speed. There are no extra wait states of any kind (which is why VDP overrun is a thing at all). So when you access the VDP doesn't affect CPU performance.

 

On the TI, you /normally/ operate in LIMI 0 state - this means the VDP can not interrupt your code. The VDP interrupt is the only way that corruption of data can occur - for instance, if the interrupt occurs in the middle of a copy loop, it might change the VDP address without your knowledge and corrupt the rest of the loop. This is a big issue on systems like the ColecoVision where the VDP interrupt is non-maskable and must always be considered - but not so much on the TI as long as you aren't leaving interrupts on (LIMI 2) during your main loop.

 

So ultimately, the best reason to wait for vsync on the TI is that you are either using a high performance copy loop to update tables and want to guarantee no corruption, or you want to improve your odds of no screen tearing due to updating tables mid-screen. The other reason is simply to control timing of your code - for instance games can do their work, then wait for vsync to guarantee a standard framerate.

 

Link to comment
Share on other sites

Since my console has 64 K RAM on a 16-bit bus, it does happen now and then, when I try to run a program written for the stock console, that the screen gets corrupted, obviously because the programmer realized he could remove the recommended NOP between address setup and data read.

If you do run your code with interrupts enabled (LIMI 2), then you should bracket your VDP access to make sure it runs without interruption.

 

LIMI 0

Set VDP address

Read or write all data you need

LIMI 2
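
In actual code the bracket could look something like the sketch below. It reuses the SATRTB and SL1 names from earlier in the thread and assumes a BL-callable VMBW routine like the one being discussed; with the E/A utilities the call would be BLWP @VMBW instead.

```asm
SATRTB EQU  >0380             * Sprite attribute table base (from the thread)

       LIMI 0                 * mask interrupts: ISR cannot touch the VDP address
       LI   R0,SATRTB         * VRAM destination
       LI   R1,SL1            * CPU source (sprite list from the thread)
       LI   R2,4              * one sprite entry = 4 bytes
       BL   @VMBW             * assumed BL-callable: sets address, writes data
       LIMI 2                 * allow interrupts again
```

The point is simply that the address setup and every data byte that depends on it sit between the LIMI 0 and the LIMI 2.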


12 hours ago, retrodroid said:

Why is waiting for the vsync a thing?

In addition to what Tursi said, if your main loop executes faster than 1/60 s, you want to wait for vsync because otherwise you would end up drawing graphics that are never shown. Unless you can find something useful to do in the remaining time before vsync, e.g. preparing for the next frame, you might as well burn it off and ensure a regular update speed.

 

If your main loop is slower than 1/60 s, it's not so obvious that you want to wait for vsync, but maybe you're using a double buffer technique, e.g. flipping between drawing to alternate name tables, in which case waiting for vsync before making the flip ensures no screen tearing. 
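
One way to burn off the remaining time with interrupts masked is to poll the VDP interrupt line through the CRU, roughly as sketched below. This assumes the VDP interrupt enable bit is set in VDP register 1 (as in the >01E2 setup earlier in the thread) and that the CPU is running under LIMI 0. VSYNC, VSYNC1, FRAME and UPDATE are hypothetical names.

```asm
VDPSTA EQU  >8802             * VDP status register (read)

* Wait until the VDP signals end of frame, then acknowledge it.
VSYNC  CLR  R12               * CRU base 0 (TMS9901)
VSYNC1 TB   2                 * CRU bit 2 = VDP interrupt line, active low
       JEQ  VSYNC1            * bit is 1: nothing pending yet, keep polling
       MOVB @VDPSTA,R12       * reading status clears the interrupt flag
       RT

* Hypothetical per-frame work; replace with real game logic.
UPDATE RT

MAIN   LIMI 0                 * the VDP cannot interrupt the main loop
FRAME  BL   @VSYNC            * returns once per frame
       BL   @UPDATE
       JMP  FRAME
```

If UPDATE finishes within one frame, the loop settles at the display rate. If it takes longer, the next call to VSYNC returns immediately because the flag is already set.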

 

Link to comment
Share on other sites

Great answers, thanks all!

 

So for a game/program that can complete all of its required processing in under 1/60th (or 1/50th) of a second, you might as well just have it execute after each vsync using the "single-buffer" of the main screen.

 

But for a more complex program that requires more than 1/60th of a second to complete processing, you can use a double buffered approach to paint the background buffer in your own good time and use the vsync interval to swap buffers (when ready, not necessarily every interval). Makes sense.

 

So should I be adding a NOP between setting the VDP read address and performing the read?

 

Edited by retrodroid

1 hour ago, retrodroid said:

So should I be adding a NOP between setting the VDP read address and performing the read?

 

Unless your code and source register are in Scratchpad RAM (>8300 – >83FF), you are probably safe. That said, a good way to get the extra wait is to interleave an instruction that can be run in parallel. However, that will likely only be practical if you are in-lining your VRAM reading code.

 

...lee


1 hour ago, retrodroid said:

So should I be adding a NOP between setting the VDP read address and performing the read?

If you want to be really nice to accelerated consoles, you should put something. But nine times out of ten you can put a more useful instruction there than a NOP, meaning it's not wasted time anyway ;)

 


1 hour ago, Tursi said:

If you want to be really nice to accelerated consoles, you should put something. But nine times out of ten you can put a more useful instruction there than a NOP, meaning it's not wasted time anyway ;)

 

Yeah, the XB ROMs use SWPB R1 there, because R1 needs to be swapped later anyway for another use that expects the byte in the left side of the word.

Another one is SOC, which is slower than SWPB but again prepares the word for a later use.


14 hours ago, Tursi said:

But nine times out of ten you can put a more useful instruction there than a NOP, meaning it's not wasted time anyway ;)

That was my thought too. Look ahead in your code. If it looks like this:

  • Set up address.
  • Wait.
  • Read data.
  • Instruction unrelated to the data that's read.

Then you can just as well write it like this:

  • Set up address.
  • Instruction unrelated to the data that's read.
  • Read data.

It doesn't matter if the unrelated instruction comes a few lines down. As long as it's unrelated, it will still work.
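
As a small illustration of the two orderings, here is a sketch of a single-byte VRAM read written both ways. RDNOP and RDFILL are made-up names, and the LI R2,32 stands in for whatever unrelated instruction your own code happens to need next.

```asm
VDPRD  EQU  >8800             * VDP read data port
VDPWA  EQU  >8C02             * VDP address port

* Version 1: a NOP provides the delay.
* In: R0 = VRAM address. Out: byte in high half of R1, R2 = 32.
RDNOP  SWPB R0
       MOVB R0,@VDPWA         * low address byte
       SWPB R0
       MOVB R0,@VDPWA         * high address byte (top bits 00 = read)
       NOP                    * pure delay
       MOVB @VDPRD,R1         * read the byte
       LI   R2,32             * unrelated setup, done after the read
       RT

* Version 2: same result, the unrelated instruction fills the gap.
RDFILL SWPB R0
       MOVB R0,@VDPWA
       SWPB R0
       MOVB R0,@VDPWA
       LI   R2,32             * does not depend on the byte being read
       MOVB @VDPRD,R1
       RT
```

Both leave the same values in R1 and R2; the second just does not spend an instruction on waiting.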
