Elite must be ported to A8

analmux · March 25, 2010

OK, sounds great.

First step of work is done. So, who will do the gfx & who will do the code?

emkay · March 25, 2010

OK, sounds great.

First step of work is done. So, who will do the gfx & who will do the code?

Naaa.... the blue danube is missing

Wrathchild · March 25, 2010

Sounding nice emkay!

When doing these tunes will one version work for both NTSC and PAL?

e.g. if the tune has a speed delay of say '5' (PAL) then altering that within code to '6' (NTSC) would do the job?

Or would the approach of leaving this at '5' and then skipping the call to the replay routine on the 6th turn work?

My worry with the latter would be that an instrument/envelope sound could be impaired by doing this.
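A rough way to compare the two options is to count replay calls and pattern rows per second. This is only a sketch: it assumes a 50Hz VBI on PAL and a 60Hz VBI on NTSC, and the function names are made up for illustration and are not part of RMT or any other tracker.

```python
def replay_calls_per_second(vbi_rate_hz, skip_every=None):
    """Count replay-routine calls during one second of VBIs.

    skip_every=6 models skipping the call on every 6th VBI.
    """
    calls = 0
    for frame in range(1, vbi_rate_hz + 1):
        if skip_every and frame % skip_every == 0:
            continue  # replay routine is not called on this VBI
        calls += 1
    return calls


def rows_per_second(vbi_rate_hz, speed):
    """Pattern rows per second when one row lasts `speed` replay calls."""
    return vbi_rate_hz / speed


pal_calls = replay_calls_per_second(50)
ntsc_skip_calls = replay_calls_per_second(60, skip_every=6)
print(pal_calls, ntsc_skip_calls)                       # calls per second
print(rows_per_second(50, 5), rows_per_second(60, 6))   # tempo with speed 5 vs 6
```

Both options keep the tempo. The difference is in the envelopes: with the speed changed to '6' they still advance 60 times a second, while the skip approach keeps 50 steps a second but spaces them unevenly.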

BTW: I have a nice version of the Blue Danube tune up my sleeve that stRing did for me quite a while back now.

Wrathchild · March 25, 2010

So, who will do the gfx & who will do the code?

Erm...

PeteD · March 25, 2010

afaik newer RMT versions use a VBI vs timer approach to PAL/NTSC timing. ie if you've composed it on PAL it calls every VBI, play it back on NTSC and it calls it via a timer at 80%/frame time. It will make a slight difference to the sounds but it shouldn't be noticeable and afaik POKEY sounds slightly different on PAL/NTSC anyway due to the different clock speeds (internally for POKEY not how many times you call the code), but again, barely noticeable.

Pete

analmux · March 25, 2010

So, who will do the gfx & who will do the code?

Erm...

Oops, sorry. Great work by the way

Not that I personally like the game, but I understand many people are sad that the Atari never got an Elite.

analmux · March 25, 2010

afaik newer RMT versions use a VBI vs timer approach to PAL/NTSC timing. ie if you've composed it on PAL it calls every VBI, play it back on NTSC and it calls it via a timer at 80%/frame time. It will make a slight difference to the sounds but it shouldn't be noticeable and afaik POKEY sounds slightly different on PAL/NTSC anyway due to the different clock speeds (internally for POKEY not how many times you call the code), but again, barely noticeable.

Pete

Yes, the new RMT supports different timing schemes, using (e.g.) a DLI at varying scanlines instead of a fixed one. However, I'm not sure this is a good solution when there is heavy processing in the background; maybe for Elite it is. Another idea might be to also do a separate NTSC version of the tune. Then the RMT engine can operate at a fixed DLI.

Edited March 25, 2010 by analmux

emkay · March 26, 2010

same tune just saved in 60Hz....

elit60.xex

emkay · March 26, 2010

The one thing that never changed across all A8s is the ratio between the CPU clock and POKEY's clock. As long as the timing correction is guaranteed to happen in the VBI, everything might be OK.

The tune's speed just needs to be one step slower at 60Hz.

analmux · March 26, 2010

same tune just saved in 60Hz....

Sounds great.

Well, emkay, this asks for some recordings. Fortunately I've got a PAL and an NTSC setup at home.

EDIT:

When I get home again I'll do some recordings of both 50Hz and 60Hz

Edited March 26, 2010 by analmux

analmux · March 26, 2010

OK, I did recordings from my real Ataris.

I tried "elitemodu" on both PAL and NTSC. The RMT engine automatically corrects the playing speed on NTSC, and will go from a fixed scanline to variable ones. However, there are slight differences in the 'timing' of the filter/pulsewave voices. This is of course caused by the non-exact 5/4 ratio between the numbers of scanlines: PAL has 312 and NTSC has 262 scanlines, i.e. 312/262 is not exactly 5/4. However, the variable DLIs will occur at regular positions on the screen.

That's why I would advise making a separate NTSC version with its own filter settings, at least in the case of a filter envelope. Pulsewave offset could be managed correctly if RMT supports correct handling by direct programming. However, pulsewave evolution will depend on PAL or NTSC.

same tune just saved in 60Hz....

I also included a recording of "elit60", played on my NTSC machine only. To be honest, I like it a bit faster. It's a different question whether you should speed up the PAL version of the tune.

emkay · March 27, 2010

That's odd.

You did not mix up the recordings?

Only elitemodu_ntsc plays the sounds as intended (or at least as far as is possible in RMT). Both other versions sound somewhat unstable. It's the reverse of the emulation.

Possibly we found a bug? Is it possible that the emulations were still calculated at 1.79MHz while PAL emulation needs 1.77MHz? This would partly explain the unstable sound in RMT.

analmux · March 27, 2010

No, I didn't mix up anything. You can check. The 1.79MHz clock of NTSC should result in somewhat higher notes. You can hear it by ear, when paying attention.

I think it's rather difficult to tell what's causing this. Apparently there is a bug somewhere, in the emulator, in RMT, or in both. Or maybe 'bug' isn't the right word. But I wouldn't be surprised if it's a clocking issue.

Anyway, it's still weird that the elitemodu tune, which is WRITTEN for 50Hz programming, sounds most correct on NTSC, with a somewhat irregular RMT engine timing. Note that 262/5 is fractional, so there's no way to reach 262 from 0 in 5 equal integer steps. However, possibly this irregularity is still smaller than whatever is causing the bug.
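To put a rough number on "somewhat higher": for the same POKEY divider value the output frequency scales with the machine clock, so the pitch shift is just the clock ratio. The clock figures below are the commonly quoted approximate values and are an assumption here, not something measured from these recordings.

```python
import math

NTSC_CLOCK_HZ = 1_789_773  # commonly quoted NTSC machine clock (assumed)
PAL_CLOCK_HZ = 1_773_447   # commonly quoted PAL machine clock (assumed)


def pitch_shift_cents(clock_a_hz, clock_b_hz):
    """Pitch difference in cents between two clocks using the same divider."""
    return 1200 * math.log2(clock_a_hz / clock_b_hz)


shift = pitch_shift_cents(NTSC_CLOCK_HZ, PAL_CLOCK_HZ)
print(f"NTSC plays about {shift:.1f} cents sharp of PAL")
```

That is well under a quarter of a semitone, which fits with it being audible only when paying attention.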

By the way:

I wrote 5/4 above, but it's supposed to be 6/5 of course.
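The scanline arithmetic can be sketched as follows. It assumes the 50Hz replay rate is kept on NTSC by spreading 5 calls evenly over 6 frames and rounding each call down to a whole scanline. This is an illustration of the fractional step only, not a description of what RMT actually does.

```python
from fractions import Fraction

PAL_LINES = 312
NTSC_LINES = 262


def ntsc_call_scanlines(calls=5, frames=6):
    """Scanline within its frame for each replay call, rounded down."""
    interval = Fraction(NTSC_LINES * frames, calls)  # scanlines between calls
    return [int(k * interval) % NTSC_LINES for k in range(calls)]


ratio = Fraction(PAL_LINES, NTSC_LINES)
print(ratio, float(ratio))      # close to, but not exactly, 6/5
print(ntsc_call_scanlines())    # positions drift by 52 or 53 lines per call
```

With whole scanlines the gap between calls alternates between 314 and 315 lines, which is the kind of small irregularity described above.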

ZylonBane · March 29, 2010

OR when drawing is OK in single-colour but not if you use XOR to erase, as it can leave residue where two lines cross.

e.g. intersection point of 2 lines is erased when first line is XOR removed, but will be turned on again when second line is XOR removed.
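The residue is easy to reproduce on a toy one-bit "screen". This is a minimal sketch with hypothetical helper names, using a dictionary of pixels rather than real screen memory.

```python
def plot_or(screen, pixels):
    """Draw by OR-ing each pixel on."""
    for p in pixels:
        screen[p] = screen.get(p, 0) | 1


def erase_xor(screen, pixels):
    """'Erase' by XOR-ing each pixel of the line."""
    for p in pixels:
        screen[p] = screen.get(p, 0) ^ 1


horizontal = [(x, 2) for x in range(5)]
vertical = [(2, y) for y in range(5)]

screen = {}
plot_or(screen, horizontal)
plot_or(screen, vertical)
erase_xor(screen, horizontal)   # clears the crossing point
erase_xor(screen, vertical)     # turns the crossing point back on

residue = [p for p, bit in screen.items() if bit]
print(residue)
```

Only the crossing point (2, 2) is left lit, because it was set once by the OR drawing but toggled twice by the XOR erase.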

But it's much faster to have a line erase routine. It's just modified Bresenham logic without updating the screen until the address changes, then write 0 to it. You don't have to worry about accidentally erasing something since you are erasing everything.

Testing for an address change would still be slow. The fastest way would be for the line-erase routine to treat the screen as 32 pixels wide, i.e. one byte per pixel. Then the only overhead would be the up-front cost of dividing each X-coordinate by 8, which is cheap as-is, but could be made even cheaper with a lookup table. This would also have the advantage of the algorithm completing in up to 8 times fewer iterations, depending on how horizontal the line is.

However, obviously beyond a certain number of lines it's going to become faster to just blank the entire screen. A sufficiently intelligent algorithm could track how long the last frame took to draw, then guess how long it would take to "un-draw", and choose the appropriate clearing technique.

On an unrelated note, remember that GR.8 on real hardware yields superior image quality when run in "inverse" mode, with the default colors swapped so that you clear bits to light them up instead of the other way around.

Rybags · March 29, 2010

A line erase might best be done by two routines. Where DeltaY is dominant, the screen address always changes each step, so a standard algorithm is probably the only way to go.

Where DeltaX is dominant, a routine optimised towards only writing once per screen byte could be used.
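To see how much a write-once-per-byte erase saves, here is a small sketch that walks a line with plain integer Bresenham and zeroes a byte only when the screen address changes. It assumes a 256-pixel-wide, 1-bit-per-pixel screen (32 bytes per row); the function names are illustrative.

```python
BYTES_PER_ROW = 32  # 256 pixels wide, 8 pixels per byte (assumed layout)


def line_pixels(x0, y0, x1, y1):
    """Integer Bresenham line, yielding every pixel from start to end."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    while True:
        yield x0, y0
        if (x0, y0) == (x1, y1):
            return
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy


def erase_line(screen, x0, y0, x1, y1):
    """Zero every byte the line touches; return (pixel_steps, byte_writes)."""
    steps = writes = 0
    last_addr = None
    for x, y in line_pixels(x0, y0, x1, y1):
        steps += 1
        addr = y * BYTES_PER_ROW + (x >> 3)
        if addr != last_addr:       # only write when the address changes
            screen[addr] = 0
            writes += 1
            last_addr = addr
    return steps, writes


screen = bytearray([0xFF]) * (BYTES_PER_ROW * 192)
print(erase_line(screen, 0, 10, 255, 10))   # horizontal: few writes per step
print(erase_line(screen, 20, 0, 20, 191))   # vertical: one write per step
```

The horizontal case needs one write per 8 pixel steps, while the vertical case writes on every step, which is why splitting the routine by dominant axis makes sense.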

Erase whole screen... a 256*192 screen takes at least 24576 cycles to erase, essentially an entire frame's worth of work. The situation where a render takes more to undo than that amount would probably be fairly common.
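For scale, that figure can be set against the raw cycle count of one frame. The sketch assumes 114 machine cycles per scanline and ignores cycles lost to ANTIC DMA and memory refresh, so the real budget is smaller than shown.

```python
SCREEN_BYTES = 256 * 192 // 8      # 1 bit per pixel
STA_ABS_CYCLES = 4                 # one unrolled STA $XXXX per byte
CYCLES_PER_SCANLINE = 114          # assumed, before DMA/refresh losses

clear_cycles = SCREEN_BYTES * STA_ABS_CYCLES
frame_cycles = {
    "PAL": 312 * CYCLES_PER_SCANLINE,
    "NTSC": 262 * CYCLES_PER_SCANLINE,
}

print(clear_cycles)
for system, cycles in frame_cycles.items():
    print(system, cycles, f"{clear_cycles / cycles:.0%} of a frame")
```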

ZylonBane · March 29, 2010

Erase whole screen... a 256*192 screen takes at least 24576 cycles to erase, essentially an entire frame's worth of work. The situation where a render takes more to undo than that amount would probably be fairly common.

Fortunately, Elite only uses a bit over two-thirds the available screen for the outer-space view.

Maybe someone should ring up Paul Woakes and ask him how he got Mercenary running so fast.

Edited March 29, 2010 by ZylonBane

Lazarus · March 29, 2010

Erase whole screen... a 256*192 screen takes at least 24576 cycles to erase, essentially an entire frame's worth of work.

That would mean 36k unrolled code for two frame buffers. A bit more realistic would be a minimum of 30720 cycles.

ZylonBane · March 29, 2010

That would mean 36k unrolled code for two frame buffers. A bit more realistic would be a minimum of 30720 cycles.

Why would you be erasing both frame buffers at the same time?

Lazarus · March 29, 2010

That would mean 36k unrolled code for two frame buffers. A bit more realistic would be a minimum of 30720 cycles.

Why would you be erasing both frame buffers at the same time?

Huh? 30720 cycles is the minimum for clearing one buffer if you don't want to waste 36k on lots of STA $XXXX.
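The two figures come from the same 6144-byte screen, just with different store instructions. A quick sketch of the arithmetic, assuming 3-byte stores, 4 cycles for STA $XXXX and 5 cycles for an indexed STA, and ignoring loop overhead:

```python
SCREEN_BYTES = 256 * 192 // 8


def unrolled(buffers=1):
    """Fully unrolled clear: one 3-byte, 4-cycle STA $XXXX per screen byte."""
    code_bytes = SCREEN_BYTES * 3 * buffers
    cycles_per_buffer = SCREEN_BYTES * 4
    return code_bytes, cycles_per_buffer


def indexed_minimum():
    """Looped clear with indexed stores at 5 cycles each, loop overhead ignored."""
    return SCREEN_BYTES * 5


print(unrolled(buffers=2))   # code size for two buffers, cycles per buffer
print(indexed_minimum())     # cycles per buffer
```

So the unrolled version saves 6144 cycles per clear at the price of 18K of code per buffer.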

ZylonBane · March 30, 2010

Oh, never mind. It sounded like you were saying that was the amount of code to clear "two frame buffers".

Lazarus · March 30, 2010

Oh, never mind. It sounded like you were saying that was the amount of code to clear "two frame buffers".

Yes it is the amount of code for clearing two buffers since you have two buffers and you'd need a clear routine for both.

Rybags · March 30, 2010

Doesn't the BBC version allow toggling the radar and stuff, which then gives a full display?

Screen clear would obviously need a balance between speed and size.

Probably something like this for a 256x200 bitmap:

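        ldy #$3f            ; Y counts 63 down to 0
        lda #0
clear:  sta screen,y
        sta screen+64,y
        sta screen+128,y
        ... etc, 95 more
        sta screen+6272,y
        sta screen+6336,y   ; 100th store, covers the last 64 bytes
        dey
        bmi finished        ; Y wrapped to $FF: all 64 offsets done
        jmp clear           ; loop body is too long for a relative branch
finished: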

All up, a little over 300 bytes of code per buffer. Although you get an extra 2 cycle hit for the loop branch. Loop is 64 times, clearing 100 bytes per iteration = 446 cycles in branches including the DEY.

A tighter loop doing 200 iterations clearing 32 bytes per sweep could use normal branch = 999 cycles in branches. Around 100 bytes of program per buffer.

Intermediate loop clearing 50 bytes over 128 iterations (has to use JMP) = 894 cycles in branches, about 160 bytes program code.

ed - fixed figures, include times for DEY.

Edited March 30, 2010 by Rybags

Lazarus · March 30, 2010

All up, a little over 300 bytes of code per buffer. Although you get an extra 2 cycle hit for the loop branch. Loop is 64 times, clearing 100 bytes per iteration = 446 cycles in branches including the DEY.

A tighter loop doing 200 iterations clearing 32 bytes per sweep could use normal branch = 999 cycles in branches. Around 100 bytes of program per buffer.

Intermediate loop clearing 50 bytes over 128 iterations (has to use JMP) = 894 cycles in branches, about 160 bytes program code.

ed - fixed figures, include times for DEY.

First case is 507*64+1 = 32449 cycles, second is 165*200+1 = 33001 cycles, third is 257*128+1 = 32897.
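These figures can be cross-checked with a short script. It assumes 5 cycles per indexed STA, 2 for DEY, 3 for JMP, 3 for a taken branch and 2 for one not taken, with no page-crossing penalty on the branch. The 200-iteration variant is assumed to end with DEY/BNE.

```python
STA_ABS_Y, DEY, JMP = 5, 2, 3
BRANCH_TAKEN, BRANCH_NOT_TAKEN = 3, 2
SCREEN_BYTES = 256 * 200 // 8   # 6400 bytes


def loop_cost(iterations, uses_jmp):
    """Return (branch_overhead_cycles, total_cycles, code_bytes)."""
    stores_per_pass = SCREEN_BYTES // iterations
    if uses_jmp:
        # DEY / BMI finished / JMP clear: BMI is taken only on the last pass
        overhead = ((iterations - 1) * (DEY + BRANCH_NOT_TAKEN + JMP)
                    + DEY + BRANCH_TAKEN)
        tail_bytes = 1 + 2 + 3
    else:
        # DEY / BNE clear: BNE falls through only on the last pass
        overhead = ((iterations - 1) * (DEY + BRANCH_TAKEN)
                    + DEY + BRANCH_NOT_TAKEN)
        tail_bytes = 1 + 2
    total = SCREEN_BYTES * STA_ABS_Y + overhead
    code_bytes = 2 + 2 + stores_per_pass * 3 + tail_bytes  # LDY #, LDA #, stores, tail
    return overhead, total, code_bytes


for iterations, uses_jmp in [(64, True), (200, False), (128, True)]:
    print(iterations, loop_cost(iterations, uses_jmp))
```

The overheads match the 446, 999 and 894 quoted earlier, and the totals land within a few cycles of the three figures above, which charge the full per-pass cost on the last pass as well.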

Kr0tki · March 30, 2010

On an unrelated note, remember that GR.8 on real hardware yields superior image quality when run in "inverse" mode, with the default colors swapped so that you clear bits to light them up instead of the other way around.

Very interesting. What do you exactly mean by "superior quality"?

Edited March 30, 2010 by Kr0tki

Rybags · March 30, 2010

You can achieve the same thing by just swapping the colour settings of PF1 & PF2.

Can't say I've ever experienced a 1-filled background with the pixels/lines you want zeroed out as looking better quality though.

Gr. 8 in default colours tends to look crap anyway... black & white or 2 shades of grey gives much better definition.
