Elite must be ported to A8

flashjazzcat · March 24, 2010

Double buffering is pretty essential in most games. I would suspect it might be even more necessary in 3D software due to the extra instructions and extra clock cycles needed to execute a piece of 3D. That's why I'm wondering if it might simply be faster to do a complete wipe of a section of the back buffer where a redraw is about to take place rather than removing the previous line drawn.

Unless the program does a complete re-render of the whole back buffer every frame, surely there's going to be extra processing involved. If you draw an XORed line to the back buffer and then flip the displays, the new back buffer contains a bitmap with no trace of the line you've just drawn (which is now visible on the screen). You can't then erase that line by XORing it into the back buffer, because each buffer is always one frame behind the other. Surely double buffering is most useful where the frame buffer is completely erased and redrawn each time. Ideal for a word processor which does a complete refresh of the screen every frame, but not much good for incremental additions to the display RAM. At the very least, you'd have to swap the frame buffers and then copy the visible buffer to the back buffer so that it always reflects the current state of the display. That would work, but at enormous cost in terms of the block move.

Edited March 24, 2010 by flashjazzcat

Rybags · March 24, 2010

You don't do any processing of Screen1 with respect to Screen2 and vice/versa.

You need to retain (for both screens) old positions of line origin/destination and star coordinates for the second XOR (erase) phase.

You should also be able to save a not insignificant amount of extra time, whether you use the XOR or the "store zeros" erase method, by also retaining other variables such as the DeltaX/DeltaY/LineSize values and the screen addresses of stars.

Heaven/TQA · March 24, 2010

well... while drawing or computing the screen, you could hold on to all the necessary start/end points of the lines? (the starfield is done separately anyway?)

well... if you have less than 128 lines you can keep everything in one page (x,y).

so you would do

show buffer 2

draw buffer 1 save coordinates

flip buffer

show buffer 1

load coordinates_backup

draw buffer 2 (to erase the old stuff)

load new_coords

draw buffer 2

flip

...

does that make sense?
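The sequence above could be sketched roughly like this. It is only a Python illustration of the bookkeeping, not 6502 code: `Buffer`, `xor_line` and `render_frame` are made-up names, and each "screen" is just a set of lit pixels. The point it shows is that each buffer keeps its own saved line list, so the erase pass never needs to know what is in the other buffer.

```python
# Sketch: double buffering with XOR lines, one saved line list per buffer.
# All names are hypothetical; a "screen" is a set of lit (x, y) pixels.

def line_points(x0, y0, x1, y1):
    """Integer Bresenham line; returns the list of (x, y) pixels."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    points = []
    while True:
        points.append((x0, y0))
        if (x0, y0) == (x1, y1):
            return points
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy


class Buffer:
    def __init__(self):
        self.pixels = set()   # lit pixels in this screen buffer
        self.lines = []       # end points of the lines currently drawn in it

    def xor_line(self, line):
        for p in line_points(*line):
            self.pixels ^= {p}    # toggle the pixel, like EOR


def render_frame(back, new_lines):
    """Erase what this buffer last held, then draw the new lines into it."""
    for line in back.lines:       # second XOR with the saved coords erases
        back.xor_line(line)
    for line in new_lines:
        back.xor_line(line)
    back.lines = list(new_lines)  # remember them for this buffer's next erase


buffers = [Buffer(), Buffer()]
frames = [[(0, 0, 7, 3)], [(1, 0, 8, 3)], [(2, 0, 9, 3)]]
shown = None
for n, lines in enumerate(frames):
    back = buffers[n % 2]         # draw into the buffer that is not on screen
    render_frame(back, lines)
    shown = back                  # "flip": the freshly drawn buffer is shown
```

After the third frame, the buffer on screen holds only the third frame's line, and the other buffer still holds the second frame's line until its own erase pass.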

Rybags · March 24, 2010

Yes.

I suspect they'd do most/all the calculations before any drawing anyway, otherwise you'd get disjointed looking ships.

I've not looked at any code, but I'd suspect they'd probably start the erase phase at a predetermined screen position, then draw straight afterwards. Flicker occurs, but you do get some stuff that's drawn in time.

Chances are that subsequent draw calculations would take so long that the drawn stuff remains for 1-2 frames, so most flicker is probably 2:1 or 3:1 on/off at worst.

Edited March 24, 2010 by Rybags

PeteD · March 24, 2010

I've just found out the beeb version of Empire Strikes Back is double buffering in this exact way (after wondering why the screen was flickering in my port). It'll clear the previous lines (if any), draw new ones, remember those points, swap frames, etc. Other than possibly some RAM use, there's no difference, and I'd presume that RAM would be used by a single-buffer routine anyway because, as Rybags and Heaven say, you'd calculate all the points/lines before deleting the old ones, or else you could end up with a blank frame before the new ones are drawn.

*edit*

That's RAM for the line/point list, not the obvious big splodge of ram for another screen.

Pete

Edited March 24, 2010 by PeteD

JamesD · March 24, 2010

Just a thought here... if you have the code replicated 8 times, you can jump into that block of code at the start based on where the line starts (Y) and you never have to test for carry. You only have to test to see if you have drawn the last pixel.

The code increments the high byte every time at the bottom of that section because you know a carry has to take place when it gets there.

That however interferes with horizontal unrolling which probably gains far more speed (hardcoded bitmasks etc).

Not if you only use it for lines with a slope > 45 degrees.

Do you have an example of hardcoded bitmasks somewhere?

I think that's the same or similar to what I proposed to the Oric guys for the < 45 degree case.

Plus you are going to have special cases for horizontal, vertical and less than 45 degree lines.

Those special cases are extremely uncommon. Also don't forget that Elite is very short on memory.

Well, those special cases definitely made a noticeable difference to the Oric frame rate. Ultimately that's going to determine whether or not the game is playable. On a fast machine you may not need special cases, but on a slow machine it makes a difference.

Keep in mind that you can have several ships, a planet and several missiles on screen at once. Your frame rate is going to drop and saving clock cycles anywhere you can helps.

The math is clearly a bottleneck, but every clock cycle saved in drawing can be used by the math. If you are worried about memory more than speed you should look at the size of the tables and double buffering while you are at it. Cutting the screen width may be worth it just for the RAM savings if you are double buffering.

Lazarus · March 24, 2010

Not if you only use it for lines with a slope > 45 degrees.

Do you have an example of hardcoded bitmasks somewhere?

I think that's the same or similar to what I proposed to the Oric guys for the < 45 degree case.

LDA (zp),Y
EOR #$80
STA (zp),Y
...
LDA (zp),Y
EOR #$40
STA (zp),Y
...
...
...
LDA (zp),Y
EOR #$01
STA (zp),Y
INY

Can also be used for vertical oriented lines.

If you render the lines on a char matrix (done often on c64 with 16x16 chars), then you get code like this:

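LDA (zp),Y   ; read the byte holding the current pixel
EOR #$80     ; hardcoded mask for this pixel position, no table lookup
STA (zp),Y
TXA          ; X holds the running error term
SBC dx       ; carry is assumed to be set on entry
BCS skip1    ; no borrow: stay on the same byte
ADC dy       ; carry is clear here; the add is expected to set it again
INY          ; step to the next byte in the char matrix
skip1:
TAX          ; save the error term for the next pixel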

LDA (zp),Y
EOR #$40
STA (zp),Y
TXA
SBC dx
BCS skip2
ADC dy
INY
skip2:
TAX

...

The math is clearly a bottleneck, but every clock cycle saved in drawing can be used by the math. If you are worried about memory more than speed you should look at the size of the tables and double buffering while you are at it. Cutting the screen width may be worth it just for the RAM savings if you are double buffering.

I think memory is the bottleneck. You can do all kinds of optimizations if you don't need to keep a game with 2000 solar systems etc. in memory as well.

Edited March 24, 2010 by Lazarus

PeteD · March 24, 2010

The planets/names/galaxy map etc are all generated based on a Fibonacci algorithm so takes much less RAM than you'd expect.

Pete

Edited March 24, 2010 by PeteD

Lazarus · March 24, 2010

The planets/names/galaxy map etc are all generated based on a Fibonacci algorithm so takes much less RAM than you'd expect.

I'd expect something along the size of C64 Elite RAM usage.

PeteD · March 24, 2010

The beeb version in 32K could've had LOTS more planets/galaxies, but Acornsoft thought it would be too overwhelming, so got them to drop it to 8 galaxies with max 256 planets each (or is that 256 solar systems with 8 planets?). They're created when needed, so having more of them doesn't mean extra RAM. If the C64 version has more then it's the same deal: it doesn't mean more RAM for that stuff; the RAM will be taken up with extra things like the missions the beeb didn't have, the trumbles, the music, etc.
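To illustrate the "created when needed" point, here is a minimal Python sketch of regenerating system data from a small seed instead of storing it. The three-word "sum the previous terms" step is only in the spirit of a Fibonacci sequence; the seed values, the number of steps per system and the fields pulled out of the seed are all made up for the example and are not taken from the game.

```python
# Sketch: regenerate system data from a tiny seed instead of storing it.
# Seed values, steps per system and field layout are all placeholders.

MASK16 = 0xFFFF

def twist(seed):
    """Advance the seed: drop the oldest word, append the sum of all three."""
    w0, w1, w2 = seed
    return (w1, w2, (w0 + w1 + w2) & MASK16)

def system_at(galaxy_seed, index, steps_per_system=4):
    """Walk the sequence from the galaxy seed to system number `index`."""
    seed = galaxy_seed
    for _ in range(index * steps_per_system):
        seed = twist(seed)
    w0, w1, w2 = seed
    # Hypothetical fields extracted from the seed bits.
    return {"x": w1 >> 8, "y": w0 >> 8, "economy": (w0 >> 8) & 7}

GALAXY_SEED = (0x1234, 0x5678, 0x9ABC)   # placeholder, not the game's seed
systems = [system_at(GALAXY_SEED, i) for i in range(256)]
```

Only the three seed words need to live in RAM; any system can be recomputed on demand, so the number of systems has no effect on storage.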

Pete

Edited March 24, 2010 by PeteD

JamesD · March 24, 2010

Not if you only use it for lines with a slope > 45 degrees.

Do you have an example of hardcoded bitmasks somewhere?

I think that's the same or similar to what I proposed to the Oric guys for the < 45 degree case.
LDA (zp),Y
EOR #$80
STA (zp),Y
...
LDA (zp),Y
EOR #$40
STA (zp),Y
...
...
...
LDA (zp),Y
EOR #$01
STA (zp),Y
INY
Can also be used for vertical oriented lines.

I'm thinking OR would be more desirable here.

EOR will erase pixels that have already been drawn.

That can make it look like there are breaks in the object, usually at corners.

That's similar to what I suggested. With less than 2K of RAM left I think unrolling was ruled out on the Oric.

This is what was suggested for the Oric after some discussion. The Oric screen is 6 bits/pixel, 40 bytes/row btw.

<edit> this is the pixel chunking code for lines with a slope under 45 degrees

 MAC PLOT
  IF XOR_MODE
   eor     (buf_r),y   ; 5
  ELSE
   ora     (buf_r),y   ; 5
  ENDIF
   sta     (buf),y     ; 6 = 11
 ENDM

 MAC PLOTP_XC
   dex                 ; 2
   beq     Byte{1}     ; 2/3+50.38     12.5% taken
Cont{1}:
; average: 10.42 (4/55.38=87.5/12.5%)
 ENDM

.loopXMajorC:
   PLOTP_XC P_XC               ;10.42
   adc     .dY                 ; 3
   bcc     .loopXMajorC        ; 2/3           >=50% taken
; average: 16.42
   sbc     .dX                 ; 3
   sta     .tmpSum             ; 3
   lda     Pot2PCTbl-1,x       ; 4
   eor     chunk               ; 3
   PLOT                        ;11
   lda     Pot2PCTbl-1,x       ; 4
   sta     chunk               ; 3
   lda     .tmpSum             ; 3
   dey                         ; 2
   bne     .loopXMajorCSub     ; 2/3=38/39(-3)

The math is clearly a bottleneck, but every clock cycle saved in drawing can be used by the math. If you are worried about memory more than speed you should look at the size of the tables and double buffering while you are at it. Cutting the screen width may be worth it just for the RAM savings if you are double buffering.

I think memory is the bottleneck. You can do all kinds of optimizations if you don't need to keep a game with 2000 solar systems etc. in memory as well.

I said "a" bottleneck.

Well, if I remember right, Elite used some math to calculate solar systems and built planet names from a pool of strings rather than hold all the data for it. Still, it requires a lot of RAM for an 8 bit game.

I was messing with graphics on the MC-10 back in July and was trying all sorts of table theatrics to make it fast.

The fastest horizontal line I came up with did a lookup for the ends of the line and wrote $FF to all places in between. No OR for all middle bytes. Well worth the special case.

Vertical was: look up the bit mask and OR it with RAM in a loop. No carries to worry about on the 6803, so pointing to the next line is one instruction (ABX), assuming bytes/line is in B. Not exactly something that translates straight to the 6502, but the overall idea is the same.

I didn't use any other special cases at the time though now I would.
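The horizontal special case described above might look something like this in outline. This is a Python sketch, not the MC-10 code: the table and function names are made up, and it assumes 1 bit per pixel, 8 pixels per byte, with the most significant bit as the leftmost pixel.

```python
# Sketch: fast horizontal line on a 1-bit-per-pixel row.
# Assumes 8 pixels per byte, MSB = leftmost pixel. Names are hypothetical.

LEFT_MASK = [0xFF >> i for i in range(8)]                   # pixels i..7 set
RIGHT_MASK = [(0xFF << (7 - i)) & 0xFF for i in range(8)]   # pixels 0..i set

def hline(row, x0, x1):
    """Set pixels x0..x1 (inclusive, x0 <= x1) in the bytearray `row`."""
    first, last = x0 >> 3, x1 >> 3
    if first == last:
        # Both ends fall in the same byte: combine the two end masks.
        row[first] |= LEFT_MASK[x0 & 7] & RIGHT_MASK[x1 & 7]
        return
    row[first] |= LEFT_MASK[x0 & 7]      # partial byte at the left end
    for i in range(first + 1, last):
        row[i] = 0xFF                    # middle bytes: plain store, no OR
    row[last] |= RIGHT_MASK[x1 & 7]      # partial byte at the right end

row = bytearray(5)        # a 40-pixel-wide row
hline(row, 3, 29)
```

Only the two end bytes need a read-modify-write; everything in between is a straight store.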

Edited March 24, 2010 by JamesD

JamesD · March 24, 2010

BTW, I didn't write that code, I made a suggestion on how to do chunking and "thrust26" posted it.

Not sure if he wrote it before or after my suggestion.

<edit>

The table contains values with all bits set from the current pixel position on.

Find the start pixel, set all bits from there on with the table, find end pixel, mask remaining bits off by xor with table value.
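As a rough illustration of that chunking idea, here is a Python sketch for a shallow line drawn left to right. It is not the posted Oric routine: `FROM` is a made-up stand-in for the table, the screen is assumed to be 8 pixels per byte with the MSB leftmost, and only lines with x0 <= x1 and 0 <= dy <= dx are handled.

```python
# Sketch: pixel chunking for an X-major line, XORed one byte at a time.
# Assumes 8 pixels per byte, MSB = leftmost pixel. Names are hypothetical.

WIDTH_BYTES = 5
FROM = [0xFF >> i for i in range(9)]   # FROM[i]: pixels i..7 set; FROM[8] == 0

def draw_chunked(screen, x0, y0, x1, y1):
    """XOR a line with x0 <= x1 and 0 <= y1 - y0 <= x1 - x0 into `screen`."""
    dx, dy = x1 - x0, y1 - y0
    err = dx // 2
    y = y0
    start = x0                          # first pixel of the current chunk
    for x in range(x0, x1 + 1):
        err -= dy
        step_y = err < 0                # the row changes after this pixel
        last_in_byte = (x & 7) == 7
        if step_y or last_in_byte or x == x1:
            # The chunk covers pixels start..x, all in one byte on one row:
            # set everything from the start pixel on, then XOR off the tail.
            mask = FROM[start & 7] ^ FROM[(x & 7) + 1]
            screen[y * WIDTH_BYTES + (x >> 3)] ^= mask
            start = x + 1
        if step_y:
            err += dx
            y += 1

screen = bytearray(WIDTH_BYTES * 16)
draw_chunked(screen, 0, 0, 39, 15)
```

The screen is touched once per chunk rather than once per pixel, which is where the saving comes from on nearly horizontal lines.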

Edited March 24, 2010 by JamesD

Rybags · March 24, 2010

OR when drawing is OK in single-colour but not if you use XOR to erase, as it can leave residue where two lines cross.

e.g. intersection point of 2 lines is erased when first line is XOR removed, but will be turned on again when second line is XOR removed.
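A tiny Python model makes the residue easy to see. The "screen" here is just a dict of pixel to 0/1 and the function names are made up; it is a sketch of the logic, not of any real plot routine.

```python
# Sketch: OR-draw plus XOR-erase leaves a pixel behind where two lines cross.
# The "screen" is a dict of pixel -> 0/1; names are hypothetical.

def plot_or(screen, points):
    for p in points:
        screen[p] = screen.get(p, 0) | 1

def plot_xor(screen, points):
    for p in points:
        screen[p] = screen.get(p, 0) ^ 1

horizontal = [(x, 2) for x in range(5)]   # crosses the vertical at (2, 2)
vertical = [(2, y) for y in range(5)]

screen = {}
plot_or(screen, horizontal)
plot_or(screen, vertical)
plot_xor(screen, horizontal)    # erase first line: (2, 2) goes to 0
plot_xor(screen, vertical)      # erase second line: (2, 2) flips back to 1
residue = [p for p, v in screen.items() if v]

clean = {}
for pts in (horizontal, vertical, horizontal, vertical):
    plot_xor(clean, pts)        # XOR for both drawing and erasing
leftover = [p for p, v in clean.items() if v]
```

With OR drawing the crossing pixel survives the erase; with XOR for both passes nothing is left, at the price of a gap at the crossing while both lines are on screen.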

Wrathchild · March 24, 2010

This is shown, I believe, in the Master 128 video of Elite I watched earlier today on YouTube. The Thargoid is a mix of cyan and red.

JamesD · March 24, 2010

OR when drawing is OK in single-colour but not if you use XOR to erase, as it can leave residue where two lines cross.

e.g. intersection point of 2 lines is erased when first line is XOR removed, but will be turned on again when second line is XOR removed.

But it's much faster to have a line erase routine. It's just modified Bresenham logic without updating the screen until the address changes, then write 0 to it. You don't have to worry about accidentally erasing something since you are erasing everything.
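A sketch of that kind of erase routine, in Python rather than 6502: walk the line as usual, but only store a zero when the byte address changes. The names and the 5-bytes-per-row layout are assumptions made for the example.

```python
# Sketch: erase a line by storing 0 into every screen byte it passes through.
# Assumes 8 pixels per byte and 5 bytes per row. Names are hypothetical.

WIDTH_BYTES = 5

def line_points(x0, y0, x1, y1):
    """Integer Bresenham line; yields the (x, y) pixels in order."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    while True:
        yield x0, y0
        if (x0, y0) == (x1, y1):
            return
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy

def erase_line(screen, x0, y0, x1, y1):
    """Zero the bytes under the line; returns how many stores were made."""
    last_addr = None
    writes = 0
    for x, y in line_points(x0, y0, x1, y1):
        addr = y * WIDTH_BYTES + (x >> 3)
        if addr != last_addr:        # only store when the address changes
            screen[addr] = 0         # no read, no mask lookup
            writes += 1
            last_addr = addr
    return writes

screen = bytearray(b"\xff" * (WIDTH_BYTES * 8))
writes = erase_line(screen, 0, 3, 39, 3)
```

Note that this also clears any neighbouring pixels sharing those bytes, which is fine only if everything on screen is being erased anyway.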

Heaven/TQA · March 24, 2010

OR when drawing is OK in single-colour but not if you use XOR to erase, as it can leave residue where two lines cross.

e.g. intersection point of 2 lines is erased when first line is XOR removed, but will be turned on again when second line is XOR removed.

but what about simply storing "0" where you normally would set dots? so same line routine but without bit pattern lookup?

PeteD · March 24, 2010

The Master version seems odd, at least on YT (so it might be them). I've not tried to actually run it yet, but it does seem to be in hires 4 colour, with the colours being black, yellow, red and cyan, and it makes white from yellow and cyan. It might be attempting a colour mix on the ships as well.

check out the planet.

Pete

JamesD · March 24, 2010

OR when drawing is OK in single-colour but not if you use XOR to erase, as it can leave residue where two lines cross.

e.g. intersection point of 2 lines is erased when first line is XOR removed, but will be turned on again when second line is XOR removed.

but what about simply storing "0" where you normally would set dots? so same line routine but without bit pattern lookup?

I think he's been playing with sprites too much.

Edited March 24, 2010 by JamesD

JamesD · March 24, 2010

BTW, there are other line drawing routines that could be used. Two-step is supposed to be faster. I just question whether the limited number of registers on the 6502 will allow any speed increase since the faster ones draw from both ends of the line.

If you use the lookup table as suggested... maybe. It could be similar to unrolling the loop part way if you don't have to do much register swapping.

Lazarus · March 24, 2010

I'm thinking OR would be more desirable here.

EOR will erase pixels that have already been drawn.

Elite draws with EOR. Probably because then they can use the same line routine for erasing too.

JamesD · March 24, 2010

I'm thinking OR would be more desirable here.

EOR will erase pixels that have already been drawn.

Elite draws with EOR. Probably because then they can use the same line routine for erasing too.

Which means it's not that optimized speed wise and it explains the breaks in the lines.

Heaven/TQA · March 24, 2010

why not ask Wrathchild... maybe he can throw up some code of the Elite draw routines from the debugger?

PeteD · March 24, 2010

The source is on Bell's site. I seem to remember after a quick look at it a while ago that it's not the easiest to read.

Pete

Heaven/TQA · March 24, 2010

when watching the videos you see two circles crossing each other, wiping out the intersection, so it is an EOR line draw.

Lazarus · March 24, 2010

Elite draws with EOR. Probably because then they can use the same line routine for erasing too.

Which means it's not that optimized speed wise and it explains the breaks in the lines.

It's no demo. Of course it's not optimized as well as it could be. Most of the memory is occupied by the game itself.
