
CC65 - writing a DLI tutorial



Hi All,

As part of my contribution to cc65 development for the Atari 8-bit community,
I've decided to write a short tutorial that might help you if you wish to learn how to set up a DLI service in your app.

 

I assume you are already familiar with terms such as Display List, WSYNC, VBLANK, NMIEN, raster beam, scanline, clock cycles, HW registers, shadow registers and DMA.

If you're not, you can refer to the book 'De Re Atari', which has good explanations and tutorials for these terms.

 

What is a DLI?

DLI simply stands for Display List Interrupt. When the raster beam reaches the right edge of the current scanline, it is turned off while it returns to the left (the horizontal blank). During that time there is an opportunity to make graphics changes that enhance the display. This opportunity is implemented through a display list interrupt.

 

What can I do with a DLI?

With a DLI you can achieve some really cool graphics enhancements, including:

- A colorful screen. You set different colors on different scanlines. For example, you can draw a sky made up of different shades of blue.

- Multiple character sets. You can have several character sets appearing together on the same screen. For example, a top menu bar that displays the remaining lives, the time left to finish the level and the score uses one character set, and the level screen below the menu bar uses a different character set.

- Player/Missile horizontal position. The PMG horizontal position can be changed between scanlines. For example, in your game you collect objects which are made out of players. The objects can be drawn several times on screen (not on the same scanline though), so one player can be shown as multiple objects at different horizontal positions.

- Player width and priority. Mostly used with priority masking to blend with the background and get additional colors. For example, the background is overlaid with player 0 and, because of the priority register, the background takes the color of the player.

 

What actually happens when a DLI is executed?

A DLI execution is broken into 3 phases:

Phase 1 - Covers the time from the beginning of the DLI routine until the horizontal sync store (WSYNC) is reached. During this phase, the raster beam is drawing the last scanline of the mode line that carries the interrupt. It is not recommended to make graphics changes in this phase.

Phase 2 - Covers the time from the WSYNC until the raster beam reappears on screen. This phase corresponds to the horizontal blank. It is the most important phase of the DLI execution: as soon as the raster beam is shut off, all the graphics changes should be made.

Phase 3 - Covers the time from the reappearance of the raster beam on screen until the end of the DLI routine.

Note: Graphics changes must be done in phase 2 of the routine. If not, your DLI timing is off and you can observe unwanted graphics on screen. For example, the color changes in the middle or towards the end of the scanline instead of covering the entire scanline. The graphics changes should be done quickly, before the raster beam "wakes up" again. That is why the DLI routine must be short and fast.

 

What does the DLI routine code look like?

The DLI routine is a simple C routine that returns void and accepts void as a parameter. For example:

void dli_routine(void)

DLI routine rules of thumb

There are a few rules of thumb for implementing the DLI routine:

1. It has to be short and fast. If you need to do several operations on screen, it is best to break the DLI service into several DLI routines that run in sequence.

2. As the OS NMI handler saves no registers before it jumps through the DLI vector, you must save the registers you use by pushing them onto the stack at the beginning of the routine, and restore them by pulling them from the stack at the end of the routine. Only the processor status register (together with the return address) is pushed automatically, by the 6502 itself when it takes the interrupt, and rti restores it.

3. You must work with the graphics HW registers rather than the shadow registers. For example, the background color HW register is 53274 ($D01A), whereas its shadow register is 712 ($2C8).

It is important to understand that on each vertical blank (VBLANK), the OS copies the shadow registers into the HW registers, overwriting whatever your DLI wrote there.

4. You must return control by exiting the DLI routine properly. This must be done in cc65 inline asm using the "rti" instruction.

 

You will have to write your DLI routine either entirely or partially in cc65 inline asm. Saving and restoring registers and returning control must be done in asm, so the beginning and end of the routine are inline asm. The middle part can be inline asm (preferred) but can also be C.

Once your DLI routine is in place, it is best to look at the asm code generated from it and check that there are no mistakes. (This can be done by compiling your C file individually and inspecting the resulting *.s file, which is basically an asm file.)

 

DLI routine written entirely in cc65 inline asm

Here is an example of a dli routine that alters Player 0 (PMG) color:

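void dli_routine(void)
{
	asm("pha");         /* save A on the stack */
	asm("lda #33");     /* new color value */
	asm("sta $D40A");   /* WSYNC: wait for horizontal sync */
	asm("sta $D012");   /* player 0 color, HW register */
	asm("pla");         /* restore A */
	asm("rti");         /* return from the interrupt */
}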

The routine above is written entirely in cc65 inline asm. It is short and fast.

Let's break the DLI routine down:

- 1st line saves the accumulator register by pushing it onto the stack. Why do we save only the accumulator and not the X and Y registers? The simple answer is that only the accumulator takes part in this DLI routine (lda, sta instructions). Had I used the X and Y registers in the routine, I would definitely have saved and restored them as well.

- 2nd line prepares the new color of the player. It loads it into the accumulator register.

- 3rd line is the wait for the horizontal sync, normally called the WSYNC event. It waits until the raster beam has finished the entire scanline and shuts off. That is the time to do the graphics changes.

- 4th line does the graphics change. The accumulator is already loaded with the new color value, so now we store it in the player 0 HW register ($D012 = 53266). Remember: the HW register and not the shadow one ($2C0 = 704).

- 5th line restores the accumulator register from the stack.

- 6th line returns from the DLI routine. It must be included in all DLI routines.

 

Same code written in C

As said earlier, part of the code in the DLI routine can be written in C, but part of it must remain in cc65 inline asm as you can't write it in C.

Here is the same code from above, but now with the middle part in C:

void dli_routine(void)
{
	asm("pha");

        *(unsigned char*)0xD012 = *(unsigned char*)0xD40A  = 33;

        asm("pla");
	asm("rti");
}

Here is the asm file produced after compiling the DLI routine:

; ---------------------------------------------------------------
; void __near__ dli_routine (void)
; ---------------------------------------------------------------
.proc	_dli_routine: near
	pha
	ldx     #$00
	lda     #$21
	sta     $D40A
	sta     $D012
	pla
	rti
.endproc

If you look carefully at the generated code above you will notice that a new instruction was added: ldx #$00. Why is that, do we need it, and what is it used for?

Clearly it is not used and not needed, but the compiler emitted it anyway: cc65 evaluates C expressions as (at least) 16-bit ints held in the A/X register pair, so it loads X with the high byte of the value, and the optimizer did not remove it.

So in order to get rid of this ldx #$00 instruction (and to be safe if X gets clobbered), we will need to save and restore the X register, to and from the stack, as well.

 

The revised routine will now look like this:

void dli_routine(void)
{
	asm("pha");
	asm("txa");
	asm("pha");

	*(unsigned char*)0xD012 = *(unsigned char*)0xD40A  = 33;

	asm("pla");
	asm("tax");
	asm("pla");

	asm("rti");
}

And the produced asm file will now be right and look like this:

; ---------------------------------------------------------------
; void __near__ dli_routine (void)
; ---------------------------------------------------------------

.proc	_dli_routine: near
        pha
	txa
	pha
	lda     #$21
	sta     $D40A
	sta     $D012
	pla
	tax
	pla
	rti
.endproc

Now the ldx #$00 is gone.

That is why it is always important to observe the generated asm code produced from the C code and make sure everything is in the right place.

 

Now that we have the dli routine in place, we need to set the dli vector to point to it.

To point to our dli routine simply write the following code:

OS.vdslst = &dli_routine;

This is how the vdslst is defined in the file _atarios.h:

    void (*vdslst)(void);                   // = $0200/$0201    DISPLAY LIST NMI VECTOR

It corresponds to addresses $0200/$0201 (512/513), which hold the address of your display list interrupt routine.

 

The next step is to set the vertical position on screen where you want your routine to take effect. For example, if you wish to have a colorful sky that starts at screenline number 7, you need to set the DLI flag on the matching instruction in the screen display list.

 

DLI screenline minus 1

Remember that phase 2 of the DLI execution is where you do all the graphics changes. This is after a WSYNC has occurred. It means that the changes take effect when the raster beam reappears on screen, and that happens only on the next scanline (as the current scanline has finished being drawn).

If you work in ANTIC mode 4 (a character mode), every screen line is 8 scanlines tall, so each DLI flag you set in your display list refers to a block of 8 scanlines of the raster beam on screen.

If you work in ANTIC mode E (a bitmap mode), every screen line is 1 scanline tall, so each DLI flag you set in your display list corresponds to 1 scanline of the raster beam on screen.

It is important to understand which screen mode you are working in, and what the relationship is between a screen line and the scanlines of that screen.

 

Back to the colorful sky example: if your colorful sky begins at screen line number 7, you will have to set the DLI flag on screen line number 6.

 

Here is an example of a display list where the DLI flag is set on the 8th display list instruction (three blank-line instructions followed by five text lines); the screen line actually affected is the next one, number 9:

unsigned char DisplayList[] = 
{
	DL_BLK8,
	DL_BLK8,
	DL_BLK8,
	DL_LMS(DL_CHR40x8x4),
	0,
	0,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_DLI(DL_CHR40x8x4),
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,
	DL_CHR40x8x4,	
	DL_CHR40x8x4,	
	DL_CHR40x8x4,	
	DL_CHR40x8x4,	
	DL_JVB,
	0,
	0
};

In the above example, the DLI is set using the "DL_DLI" macro. The DLI routine is called while the raster beam draws the last scanline of screenline number 8, and the changes made after WSYNC show up from screenline number 9 onward.
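To make the screen line / scanline relationship concrete, here is a small host-side sketch (plain C, not meant to run on the Atari) that walks a display list and reports the scanline on which the first DLI fires. The function name and the DL_* names are made up for this sketch; only the instruction encodings and mode heights are ANTIC facts. It ignores vertical scrolling and simply stops at any jump instruction.

```c
#include <assert.h>
#include <stddef.h>

/* ANTIC instruction bits; the macro names are local to this sketch. */
#define DL_MODE_MASK 0x0F
#define DL_LMS_BIT   0x40  /* load memory scan: two address bytes follow */
#define DL_DLI_BIT   0x80  /* display list interrupt flag */
#define DL_JUMP      0x01  /* JMP ($01) and JVB ($41) share this low nibble */

/* Scanlines per mode line for ANTIC modes 2..F; entries 0 and 1 are unused. */
static const unsigned char mode_height[16] = {
    0, 0, 8, 10, 8, 16, 8, 16, 8, 4, 4, 2, 1, 2, 1, 1
};

/* Returns the scanline (0 = first scanline produced by the display list)
   on which the first DLI fires, or -1 if no instruction has the DLI flag.
   The DLI fires on the LAST scanline of the flagged instruction. */
int first_dli_scanline(const unsigned char *dl, size_t len)
{
    size_t i = 0;
    int line = 0;

    assert(dl != NULL);
    while (i < len) {
        unsigned char op = dl[i++];
        unsigned char mode = op & DL_MODE_MASK;

        if (mode == DL_JUMP) {
            break;                              /* jumps are not followed here */
        }
        if (mode == 0) {
            line += ((op >> 4) & 0x07) + 1;     /* 1..8 blank scanlines */
        } else {
            line += mode_height[mode];
            if (op & DL_LMS_BIT) {
                i += 2;                         /* skip the screen address */
            }
        }
        if (op & DL_DLI_BIT) {
            return line - 1;
        }
    }
    return -1;
}
```

For the display list of this tutorial (assuming the usual encodings: BLK8 = $70, mode 4 = $04, LMS adds $40, DLI adds $80) the function returns 63: 24 blank scanlines plus five text lines of 8 scanlines each, minus one. A color stored after WSYNC then shows from scanline 64, which is the top of the next text line.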

 

Your DLI is now set up and ready to go; all you need to do is enable it.

To enable your DLI you need to set bit D7 of the NMIEN register at address $D40E (54286).

NMIEN is the ANTIC register that enables the NMIs, which is short for Non-Maskable Interrupts.

These interrupts cannot be "masked" (disabled) at the 6502 level. NMI interrupts (except SYSTEM RESET) can be disabled by ANTIC.

The OS NMI handler is what dispatches the display list interrupt.

 

Here is sample C code to enable the DLI:

*(unsigned char*) 0xD40E = 192;

192 ($C0) sets bit D7 (DLI) together with bit D6 (VBI), so the vertical blank interrupt stays enabled.

Alternatively you can use the predefined NMIEN flags in the _antic.h file as follows:

ANTIC.nmien = NMIEN_DLI | NMIEN_VBI; 

The NMIEN_DLI and NMIEN_VBI are defined in _antic.h as follows:

...
#define NMIEN_DLI   0x80
...
#define NMIEN_VBI   0x40 

and 0x40 + 0x80 = 0xC0 = 192, i.e. bits D6 and D7 together.

 

As soon as NMIEN bit D7 is set, the DLI routine starts to work, and whenever the raster beam reaches screenline number 9 it changes the color of player 0.

This goes on and on until the DLI is disabled.

 

When you no longer need the DLI routine, for example when the game is over and you want to switch from the level screen (where the DLI routine works) to the game over screen (where it should be disabled), you will need to disable the DLI.

To disable the DLI routine you need to clear bit D7 of the NMIEN register.

 

NMIEN is a write-only register and can't be read back. This means you can't just clear the bit with a read-modify-write bitwise operator.

Instead, write NMIEN with only the NMIEN_VBI bit set; bit D7 is cleared as a result.

 

Here is an example code to disable the dli:

ANTIC.nmien = NMIEN_VBI;
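Because NMIEN cannot be read back, one common way around it is to keep your own copy of the last value written. The sketch below is only an illustration: the helper names and the nmien_copy variable are made up, the two bit values are repeated locally so the snippet stands alone, and it assumes VBI was the only interrupt enabled before your code took over. The register is passed in as a pointer so the same code can be tried on a PC.

```c
#include <assert.h>

#define NMIEN_DLI 0x80  /* bit D7: display list interrupt */
#define NMIEN_VBI 0x40  /* bit D6: vertical blank interrupt */

/* Software copy of the last value written to NMIEN (hypothetical name).
   Assumption: only VBI is enabled when the program starts. */
static unsigned char nmien_copy = NMIEN_VBI;

/* Write a value to the write-only register and remember it. */
static void nmien_set(volatile unsigned char *reg, unsigned char value)
{
    assert(reg != 0);
    nmien_copy = value;
    *reg = value;
}

/* Turn the DLI on without touching the other enable bits. */
void dli_enable(volatile unsigned char *reg)
{
    nmien_set(reg, (unsigned char)(nmien_copy | NMIEN_DLI));
}

/* Turn the DLI off without touching the other enable bits. */
void dli_disable(volatile unsigned char *reg)
{
    nmien_set(reg, (unsigned char)(nmien_copy & ~NMIEN_DLI));
}
```

On the Atari you would call these with (volatile unsigned char *)0xD40E. For a program that only ever uses VBI and DLI, writing the two constants directly as shown above is simpler and just as good.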

When do I enable/disable my DLI routine?

You have to be very careful when you enable your DLI. Once it is enabled the routine starts to work immediately, and if you didn't time it properly you will start to see things on screen you did not expect, as the raster beam keeps working continuously.

Good practice is to enable your DLI routine only after an entire screen frame has been processed.

To tell that a frame has been processed, you need to wait for VBLANK to occur. A VBLANK (a.k.a. vertical blank) occurs after the raster beam has finished scanning the entire screen from the top left to the bottom right. The VBLANK is normally used by the OS to do housekeeping, but the programmer can also use it to do some graphics calculations or to set handlers in place.

One such handler is our DLI routine.

To enable the DLI routine after VBLANK you can either:

- Set up a VBI interrupt handler, or

- Use a trick: wait on the Atari clock at location 20, which changes when a VBLANK has occurred.

 

As this tutorial doesn't cover VBI (I will write a new tutorial dedicated to VBI in the near future), for this example I will use the second approach (the address 20 clock trick).

 

Here is example code that enables your DLI routine:

void wait_for_vblank(void)
{
    asm("lda $14");
wvb:
    asm("cmp $14");
    asm("beq %g", wvb);
}

void main(void)
{
   ...
   ...
   wait_for_vblank();
   ANTIC.nmien = NMIEN_DLI| NMIEN_VBI;

   ...
   wait_for_vblank();
   ANTIC.nmien = NMIEN_VBI;   
}

The wait_for_vblank above is written entirely in inline asm. It could have been written in C with the same result:

unsigned char currClockFrame = 0;

void waitForVBLANK(void)
{
    currClockFrame = OS.rtclok[2];
    while (OS.rtclok[2] == currClockFrame);
}

Multiple DLI routines

A good practice is to use multiple DLI routines.

This is especially powerful when there are several points on screen where you want to make graphics enhancements. Each DLI routine handles a different point of enhancement individually.

 

These routines are chained through the display list interrupt vector, which determines the sequence in which the routines operate.

You can control the order of the routines in the sequence.

Let's take an example of 2 DLI routines, where each routine changes a graphics HW register separately.

To process the 2 DLI routines in sequence you need to enter the proper sequence into the DLI vector.

It is done by pointing the vector to the 2nd routine inside the 1st routine, and pointing it to the 1st routine inside the 2nd routine.

 

Here is sample code that uses 2 DLI routines in a sequence:

void dli_routine2(void);   /* forward declaration, needed inside dli_routine1 */

void dli_routine1(void)
{
	...
        ...
	OS.vdslst = &dli_routine2;
	...
        ...
}

void dli_routine2(void)
{
	...
        ...
	OS.vdslst = &dli_routine1;
	...
        ...
}
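The chaining idea can be tried out on a PC with ordinary function pointers. This is a simulation sketch only: fake_vdslst stands in for OS.vdslst, the handlers just record that they ran, and real DLI routines would still need the register saving and the rti described earlier. The run_frame helper and its reset of the vector at the start of each frame are my own assumption of how you would keep the chain in step (on the Atari a VBI would be the natural place for that reset).

```c
#include <assert.h>

typedef void (*dli_handler)(void);

static dli_handler fake_vdslst;       /* stand-in for OS.vdslst */
static unsigned char call_log[8];     /* which handler ran, in order */
static unsigned char call_count;

/* Remember that handler 'id' ran. */
static void record(unsigned char id)
{
    assert(call_count < sizeof call_log);
    call_log[call_count++] = id;
}

static void dli_second(void);         /* forward declaration */

/* Each handler does its work, then points the vector at the next one. */
static void dli_first(void)
{
    record(1);
    fake_vdslst = dli_second;
}

static void dli_second(void)
{
    record(2);
    fake_vdslst = dli_first;
}

/* Simulate one frame: reset the chain to the first handler, then let
   'dli_count' display list interrupts fire one after another. */
void run_frame(unsigned char dli_count)
{
    fake_vdslst = dli_first;
    while (dli_count--) {
        fake_vdslst();
    }
}
```

With an odd number of DLI flags per frame and no reset, the two routines would swap places on every frame; resetting the vector once per frame avoids that.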

A working example

Following feedback I got after posting this tutorial, I have decided to work on an example that shows how powerful a DLI really is, what it can be used for, and how.

 

The example below creates 2 PMGs on screen (Player 0 and Player 1).

Out of the 2 players (which together give a 3rd color), 3 objects are created: a star, a bird, and an ant-eater.

Each has its own horizontal position and color.

The star is static; the bird and ant-eater move horizontally.

 

The example uses 3 DLI routines (one routine for each object).

Each routine controls the HPOS and color of its object.

 

dli.rar

 

Feel  free to leave any feedback or comments.

 

cheers,

Yaron

 


Edited by Yaron Nir

 

ANTIC.nmien &= ~NMIEN_DLI;

 

NMIEN is a write-only register and can't be read back. This only happens to work because there is no read register at that address and the read gets $FF. This should just write NMIEN_VBI instead.

 


 

The LDX #$00 is there because evaluation in C is done at least in ints, which is 16-bit here, hence the A/X register pair. If you compile with -O or -Os, it should be optimized away, but apparently it is not.

 

Edit: but it's never a good idea to rely on specific compiler behaviour, which might change in the future. Better to write such small, time critical routines fully in assembly.

Edited by ivop

Yaron did raise the issue and an interesting discussion followed and whilst the recommendation is to code these in assembler some coders may wish to stick with C. Plus they can always ask on here as someone will ultimately help out. :)

 

 

It's a pain at the moment that referencing the location names doesn't work in the asm statement, which would help make things more readable.

asm("sta $D40A"); /* ANTIC.wsync - OK */

asm("sta %v", ANTIC.wsync); /* Error: Identifier expected for argument 1 */
asm("sta %w", ANTIC.wsync); /* Error: Constant integer expression expected */
Edited by Wrathchild

Just a moment, let me check out my DLI (which I created last year) ...

 

it looks like this:

#define FONTBASE_HW 0xd409
#define FNT         148
#define WSYNC       0xd40a
#define VSDLST      0x0200
#define NMIEN       0xd40e
#define GPRIOR      0xd01b

void DLI() {
	__asm__("pha");
	__asm__("lda #%b", (unsigned char)FNT);
	__asm__("sta %w", WSYNC); // WSYNC
	__asm__("sta %w", FONTBASE_HW); // font base, hardware register of shadow 756

	__asm__("lda #192");       // Graphics 11
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC
	__asm__("lda #64");        // Graphics 9
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC

	__asm__("lda #192");       // Graphics 11
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC
	__asm__("lda #64");        // Graphics 9
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC

	__asm__("lda #192");       // Graphics 11
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC
	__asm__("lda #64");        // Graphics 9
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC

	__asm__("lda #192");       // Graphics 11
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623
	__asm__("sta %w", WSYNC);  // WSYNC
	__asm__("lda #64");        // Graphics 9
	__asm__("sta %w", GPRIOR); // hardware register of shadow 623

	__asm__("pla");
	__asm__("rti");
}

This one works for me, I hope it helps you create your DLI.

Edited by atarixle

Nice of you doing this tutorial, a fully working sample would be great!

I actually found code posted on AA for doing this (Thanks to Popmilo I think!) and it really helped.

My brief experience with them:

-I was naively hoping I could get around the sprite limitations by changing their positions/sizes/colors every scanline. As it turns out, it's practically impossible. Even though you may have the CPU time, the changes would occur all over the scanline and cause all sorts of glitches.

-I think the concept was a little ahead of its time. It would have been ok with a 6502 at twice the clock speed. It works great on an ST or Amiga.

-I would like to know why on earth they didn't time DLIs to actually start towards the end of the scanline. From my tests, they start in the middle of it... this means you have to waste a lot of cycles on a costly WSYNC to hide color changes.
Those colour changes on every scanline which the Atari is famous for can't actually be done with DLIs without wasting most of your CPU cycles... It's crazy.



Sometimes the pre-amble can be helpful to prep things and then once the wsync is done get out.

e.g.

pha
txa
pha
tya
pha
lda #$34
ldx #$0E
ldy #$86
sta wsync
sta colpf0
stx colpf1
sty colpf2
pla
tay
pla
tax
pla
rti

The CC65 thread also highlights a good method that can save space and cycles:

 sta save_a+1
 stx save_x+1
 sty save_y+1
 lda #$34
 ldx #$0E
 ldy #$86
 sta wsync
 sta colpf0
 stx colpf1
 sty colpf2
save_y:
 ldy #0
save_x:
 ldx #0
save_a:
 lda #0
 rti

This works so long as another DLI is not triggered during the processing of the current one (as that would overwrite the original 'save' values).

Then to save even more cycles and instructions I've even seen such routines placed within zero page :)

Edited by Wrathchild

My preference is to write the entire routine in inline asm. Sometimes I write parts of it in C, sometimes all in asm.

I demonstrated this option to show the programmers here that there is also a way to write it in C.


Yes, as Mark mentioned, I did raise the issue and the feedback was good. Either way, and as explained in the tutorial, it is always best to look at the generated asm file to see what the code looks like and whether any strange things appear there.


 

Thanks for posting your example.

It seems your routine does a lot of things, and as I wrote in the tutorial, these types of DLI routines must be SHORT and super fast, and if needed be broken into multiple DLI routines.

I haven't timed your example, but it seems it might exceed the time "allowed" for a DLI to run through all 3 of its phases, and it can produce abnormal screen results. As you mentioned, it works normally for you, so it depends a lot on what other things are being performed and what your main loop looks like.


 

 

 

 

So I see both you and Mark request a working sample. OK, I will work on a sample test that demonstrates a single DLI and multiple DLIs, and update my post.

As for the sprite limitations, search this forum for a post by popmilo. He did a nice demo showing how 2 sprites can be displayed at different horizontal positions on screen so that it looks as if many sprites appear on screen.

I can try to write a short test app that demonstrates a sprite appearing at 2 horizontal positions at once (not on the same scanline, that is). Let me know if you want that. It should not be that complicated.

The game Crownland uses this method and it seems nicely done (except for the flickering part).


 

these type of dli routines must be SHORT and super fast

 

 

'Must' isn't the right word to use as you are quite entitled to have a single DLI that does a wsync for all lines and, for example, change the background colour. Not overly practical, but hey.

 

Equally for a static-ish page like a high score chart in a text mode, you can use a number of wsync's to add brightness or colour gradients to the letters with a DLI flag set on each line.

So here your DLIs are taking up much of the CPU time, but that doesn't matter because your main code is basically sitting in a loop waiting for some keyboard or joystick input, with perhaps some music running in the VBI.

Edited by Wrathchild

My example does things to more than one scanline, and it's meant to work in Graphics 0.

 

I just wanted to show, how I defined the Constants, e.g. in __asm__("sta %w", WSYNC);

 

I am not that kind of super-programmer (in fact, I am a pro in BASIC and Turbo-BASIC), so I don't even know why I use __asm__() instead of asm() ;-)

 

Oh, btw, I call the DLI like this:

	// initialize DLI
	POKEW(0x0200, (int)DLI); // VSDLST ... POKE the address of the function DLI() into 0x0200
	POKE(0xd40e, 0xc0); // NMIEN ... turn on DLI

which is a quick and dirty PEEK-and-POKE, the way I used to do it in BASIC, instead of a C-like style using the predefined structures.

Edited by atarixle

 

I wasn't asking for myself, just suggesting that a tutorial with a working example is even better (just something simple like a color change)

 

I've seen some multiplexer demos; I was complaining that changing multiple sprite positions/colors on every scanline is not practical.

 

One thing I'm not sure of and I see you don't mention it, is checking for the source of the NMI when one occurs. I made the assumption that it can be omitted but I'm not sure how safe it is.

 

Another trick to get some cycles back when chaining DLIs is to only update the low byte of the NMI vector if you're not crossing a page... that's pretty obvious indeed but that's my modest contribution to this thread :)
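The low-byte trick above boils down to a same-page check on the two handler addresses. Here is a small host-side sketch of that check; the function name and the two-byte vec array (a little-endian stand-in for the vector at $0200/$0201) are made up for illustration. In a real DLI you would do this in asm with a single lda #<next / sta $0200 after making sure at build time that the handlers share a page.

```c
#include <assert.h>
#include <stdint.h>

/* Point a little-endian two-byte vector at 'next', writing as few bytes
   as possible. Returns the number of bytes written:
   0 = already correct, 1 = low byte only (same page), 2 = both bytes. */
unsigned char update_vector(unsigned char vec[2], uint16_t next)
{
    uint16_t cur;

    assert(vec != 0);
    cur = (uint16_t)(vec[0] | (vec[1] << 8));
    if (cur == next) {
        return 0;
    }
    vec[0] = (unsigned char)(next & 0xFF);
    if ((cur >> 8) == (next >> 8)) {
        return 1;                 /* same page: the high byte can stay */
    }
    vec[1] = (unsigned char)(next >> 8);
    return 2;
}
```

Two handlers at $3000 and $3020 need one store per switch, while handlers at $3020 and $3105 need two, so it can pay off to place all chained handlers within one 256-byte page.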


 

 

I had a response on GitHub about using the struct field names in asm statements; there is a workaround:

 

asm("lda %w", (unsigned)&ANTIC.vcount);
asm("sta %w", (unsigned)&GTIA_WRITE.colpf3);

which produces the expected:

lda     $D40B
sta     $D019

The ANTIC and GTIA includes have been around for a long time, and the recent introduction of "_atarios.h" outlined here helps provide an alternative to using defines in C.
Link to comment
Share on other sites

My example is doing things to more than one scan-line, and it's ment to be work in Graphics 0.

 

I just wanted to show, how I defined the Constants, e.g. in __asm__("sta %w", WSYNC);

 

I am not that kind of super-programmer (in fact, I am a pro in BASIC and Turbo-BASIC), so I don't even know why I use __asm__() instead of asm() ;-)

 

Oh, btw, I call the DLI like this:

	// initialize DLI
	POKEW(0x0200, (int)DLI); // VDSLST ... POKE the address of the function DLI() into 0x0200
	POKE(0xd40e, 0xc0); // NMIEN ... turn on DLI

This is a quick and dirty PEEK-and-POKE, the way I used to do it in BASIC, rather than a more C-like style with predefined names.
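
The 0xC0 written to NMIEN above is the DLI enable bit (bit 7) combined with the VBI enable bit (bit 6). A small host-side C sketch of how that value is composed follows; the names `NMIEN_DLI`, `NMIEN_VBI` and `nmien_value` are made up for this illustration.

```c
#include <stdint.h>

#define NMIEN_DLI 0x80u  /* bit 7: enable display list interrupts */
#define NMIEN_VBI 0x40u  /* bit 6: enable vertical blank interrupts */

/* Build the byte to store in NMIEN from the two enable flags. */
static uint8_t nmien_value(int enable_dli, int enable_vbi)
{
    uint8_t value = 0;
    if (enable_dli) value |= NMIEN_DLI;
    if (enable_vbi) value |= NMIEN_VBI;
    return value;
}
```

Writing only 0x80 would switch the VBI off, which is why the setup code keeps bit 6 set when it turns DLIs on.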

 

Well, as Mark mentioned, if you use the latest build of cc65 you can use the built-in OS, ANTIC and GTIA structures, so in your example 0x0200 could be written as OS.vdslst.


 

 

Quote

I've seen some multiplexer demos; I was complaining that changing multiple sprite positions/colors on every scanline is not practical.

 

You can use a DLI on every scanline, for example if you want the screen to read a table of character sets and apply them line by line.
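
A table-driven charset switch could be sketched on the host like this. It assumes each DLI looks up a precomputed CHBASE value, which is the high byte of the character set address. The function names and any addresses passed in are hypothetical.

```c
#include <stdint.h>

/* CHBASE takes the high byte (page number) of the character set address. */
static uint8_t chbase_page(uint16_t charset_address)
{
    return (uint8_t)(charset_address >> 8);
}

/* Value the n-th DLI of a frame would store into CHBASE.
   The index wraps so the table can be shorter than the DLI count. */
static uint8_t chbase_for_dli(const uint8_t *pages, unsigned count,
                              unsigned dli_index)
{
    return pages[dli_index % count];
}
```

On the real machine the lookup would live in the handler itself, with an index that the VBI resets to zero at the start of every frame.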

 

As for sprites (moving sprites, that is, whose x and y positions change all the time), you can use a "kernel".

Please refer to the book De Re Atari. It gives the example of Atari Basketball, where two sprites were used and a kernel was used to give each player two colors.

 

 

 

Quote

One thing I'm not sure of and I see you don't mention it, is checking for the source of the NMI when one occurs. I made the assumption that it can be omitted but I'm not sure how safe it is.

Not sure I understand what you mean.


Quote

not sure i understand what you mean.

 

In the NMI code I found, there was this sequence:

 

 

nmi_handler
        bit NMIST
        bpl nmi_not_dli
        bmi dli_handler

nmi_not_vbi
        lda #%00100000
        bit NMIST
        bne nmi_not_reset
        sta NMIRES
        rti

nmi_not_reset
        pla
        rti

nmi_not_dli
        pha
        bvc nmi_not_vbi
        txa
        pha
        tya
        pha
        sta NMIRES
vbi_handler_jmp
        jmp vbi_handler

dli_handler
        rti

 

I'm guessing that's not required, but I'm not sure.

 

It also checks for the Reset button on the 400/800... again I don't really know if that needs to be there.

 

From the Altirra Hardware Reference Manual p55:

 

 

The reset (RNMI) bit stays latched until cleared by NMIRES, but the VBI and DLI bits are mutually exclusive: the DLI bit is cleared at scan line 248, and the VBI bit is cleared whenever a DLI occurs. This means that it is generally unnecessary to test the VBI bit or write to NMIRES past boot – the NMI routine can test bit 7 for a DLI, bit 5 for reset, and then assume a VBI otherwise.
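
The test order described in that passage can be sketched in host-side C as follows. The enum and the function `classify_nmi` are hypothetical names for this illustration; a real handler would do the same thing with BIT NMIST and branches.

```c
#include <stdint.h>

enum nmi_source { NMI_DLI, NMI_RESET, NMI_VBI };

/* Decode an NMIST value in the order the manual suggests:
   bit 7 means DLI, bit 5 means reset, anything else is taken as a VBI. */
static enum nmi_source classify_nmi(uint8_t nmist)
{
    if (nmist & 0x80u) return NMI_DLI;    /* bit 7 set */
    if (nmist & 0x20u) return NMI_RESET;  /* bit 5 set */
    return NMI_VBI;                       /* assumed, bit 6 is not tested */
}
```

Note that the DLI test comes first, so a stale reset bit does not divert a DLI.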

 

I left the check in the VBI for safety, but I chain DLI handlers that don't do any tests, and in the last DLI handler I set the NMI address back to the VBI.

I'm just hoping that's OK, because I don't have real hardware for testing.


You don't need to check for the reset NMI in a custom NMI handler because the two are mutually exclusive. The reset NMI is only used on the 400/800, and you can't replace the NMI handler on those models. You can only hook (VVBLKI) and (VDSLST) off of the OS NMI dispatcher, which will already have handled the reset NMI.

 

On the XL/XE series, the reset NMI input is disconnected and will never fire. There is one circumstance where you do need to care, and that is if you are running a diagnostic cartridge. In that case, it is possible to see bit 5 set in NMIST when servicing a VBI or DLI due to the way ANTIC powers up, even though an NMI is not actually triggered for it. This is resolved either by ignoring the reset NMI bit or writing to NMIRES at least once before setting up the NMI handler. Beyond that I have never seen a case where writing to NMIRES was needed.

 

You can avoid the NMIST check and sequence the NMI handlers to match the display, but this has to be done perfectly -- one misstep with a VBI or DLI handler running over or switching a screen too late, and you are toast. For most cases the six cycles to check NMIST are affordable insurance against this.


In theory, if you were really desperate for every cycle and were stealing the hardware NMI vector in a custom RAM-based OS, you could ignore NMIST when you absolutely know the NMI will be coming from a DLI. Then keep a count, flag, or whatever, and set the hardware vector to point to your VBlank routine at the last occurrence.

 

Probably a rare requirement, at best might mean you can push an extra register and preload A/X/Y and hit WSYNC and be guaranteed not to have it overrun to the next scanline.
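
The vector sequencing described above can be modelled on the host with a function pointer standing in for the hardware NMI vector. This is only a simulation under that assumption; all handler names are hypothetical, and it says nothing about cycle timing.

```c
#include <stddef.h>

typedef void (*nmi_handler_fn)(void);

static nmi_handler_fn nmi_vector;  /* stands in for the hardware NMI vector */
static int dli_calls;
static int vbi_calls;

static void vbi_handler(void);
static void dli_second(void);
static void dli_last(void);

/* Each handler does no NMIST test; it just points the vector at
   whatever is expected to fire next. */
static void dli_first(void)   { ++dli_calls; nmi_vector = dli_second; }
static void dli_second(void)  { ++dli_calls; nmi_vector = dli_last; }
static void dli_last(void)    { ++dli_calls; nmi_vector = vbi_handler; }
static void vbi_handler(void) { ++vbi_calls; nmi_vector = dli_first; }

/* Simulate one frame: three DLIs followed by the VBI. */
static void simulate_frame(void)
{
    int i;
    if (nmi_vector == NULL) nmi_vector = dli_first;
    for (i = 0; i < 4; ++i) nmi_vector();
}
```

The model also shows the weakness of the approach: if any NMI arrives out of the expected order, the wrong handler runs, and nothing re-synchronises the chain.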


  • 2 weeks later...
  • 1 year later...

@Yaron Nir

 

Finally I found time to read this thread :] I think I've mastered these basics in the meantime ;) but thanks anyway, it's very valuable.

 

Quote

Only the Processor Status Register is being pushed automatically by the OS, but you are likely not using it in your routine.

 

I thought it was a feature of the 6502 not the OS ;)  https://wiki.nesdev.com/w/index.php/Status_flags

Interrupts, including the NMI and also the pseudo-interrupt BRK instruction, implicitly push the status register to the stack.
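
What the CPU itself does on NMI entry can be sketched in host-side C. The struct `cpu_sketch` and the function `enter_nmi` are hypothetical; the model only covers the implicit pushes and ignores cycle timing and the vector fetch.

```c
#include <stdint.h>

#define FLAG_I      0x04u  /* interrupt disable */
#define FLAG_B      0x10u  /* break flag, pushed as 0 for NMI/IRQ */
#define FLAG_UNUSED 0x20u  /* always pushed as 1 */

struct cpu_sketch {
    uint16_t pc;
    uint8_t  sp;
    uint8_t  p;
    uint8_t  mem_stack[256];  /* stands in for page 1, $0100-$01FF */
};

/* Model of 6502 NMI entry: push PC high, PC low, then the status
   register, set the I flag, and jump to the handler.
   A, X and Y are NOT pushed; the handler has to save them itself. */
static void enter_nmi(struct cpu_sketch *cpu, uint16_t handler)
{
    cpu->mem_stack[cpu->sp--] = (uint8_t)(cpu->pc >> 8);
    cpu->mem_stack[cpu->sp--] = (uint8_t)(cpu->pc & 0xFF);
    cpu->mem_stack[cpu->sp--] = (uint8_t)((cpu->p | FLAG_UNUSED) & ~FLAG_B);
    cpu->p |= FLAG_I;
    cpu->pc = handler;
}
```

RTI undoes exactly these three pushes, which is why a handler that uses A, X or Y needs its own PHA/PLA pairs around the work it does.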

 
