
Why the ripple upward?


tschak909


Pretty cool! :)

 

That 168 scanline block is most of the screen. Where else do you find time for the task queue besides the top and bottom vertical blanks?

 

Anywhere we're waiting for a timer to complete. Just the top and bottom areas of the screen, I seem to recall.

But there are a number of different tasks done with different priorities/orders. Some rather big tasks could only be done in certain places in the frame. Others were 'filler' tasks that took very little processing time, so they could fill in the gaps left over after the big tasks had run and make sure every single drop of available processing time was used.
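To make the idea of big tasks plus 'filler' tasks concrete, here is a minimal sketch of a budgeted task queue. It is not the actual Boulder Dash scheduler (which, as described above, waits on a timer); the task names, cycle costs and the 2000-cycle budget are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    cost: int      # worst-case cycles the task needs (hypothetical figures)
    priority: int  # lower number = considered first

def run_tasks(tasks, budget):
    """Run tasks in priority order, skipping any that no longer fit.

    Returns (names run, names deferred to a later slot, cycles left over).
    """
    ran, deferred = [], []
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.cost <= budget:
            budget -= task.cost
            ran.append(task.name)
        else:
            deferred.append(task.name)  # too big for what is left of this slot
    return ran, deferred, budget

# Hypothetical workload for one blanking period with 2000 free cycles.
tasks = [
    Task("scroll_map", 1200, 0),
    Task("move_creatures", 900, 1),
    Task("update_score", 150, 2),
    Task("animate_gem", 60, 3),
]
ran, deferred, left = run_tasks(tasks, budget=2000)
print(ran, deferred, left)
```

Here the second big task does not fit after the first one, so it waits for the next slot, while the two small filler tasks still run in the gap that would otherwise be wasted.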


 


Most excellent design - a tile-mapped, Mario World-style game without extra hardware requires all the available time in both blanks, plus careful load balancing/queueing.

 

This is one feature I think could be improved in bB. Unless I am mistaken, only one of the vertical blanks is available for the game loop.

 

Virtual World BASIC (inspired by bB and BD) sports two game loops to allow BASIC to run in both vertical blanks, giving the programmer instant access to granular load balancing.

 

It was initially designed to step down to 30 Hz and use an entire frame for repositioning the full-screen playfield camera and sprites. That kept programming in BASIC very simple, like back in the day, by minimizing the need for the programmer to worry about load balancing.

 

It has now evolved to include DLIs, like those on the Atari home computers, which give the programmer fine-grained load balancing control: a region of the screen can be updated from either of the blanks, leaving more time for additional tasks that the programmer can load balance on a per-frame basis. The architecture is more complex, though, as you've described.

 

I think your analogy about a few weeks to write a great game in BASIC compared to six months in asm is spot on. But unless you find a way to alleviate the load (like the first method of stealing a frame, or using a 32-bit co-processor to update the framebuffer), the architecture and concept for load balancing/queueing become just as important for the BASIC programmer as for the assembly programmer.

 

As I was working with ANTIC I realized DLIs gave Atari BASIC programmers the ability to organize regions of the screen and exert fine-grained load balancing control - another great influence for making this design architecture accessible to the BASIC programmer. I also tried to simplify it so it's easier to use in Virtual World BASIC, since DLIs are fairly complicated to use in Atari BASIC.

 

I think much of the development speed advantage of BASIC over asm on the VCS comes from being isolated from kernel load balancing - an architecture the BASIC programmer (thankfully) never has to worry about.

 

I see the BD 168 scanline kernel without WSYNC is yielding 500 extra cycles of processing power to draw that display. That's a fantastic optimization - your kernel tree must be perfectly balanced, with no branch taking even a single extra cycle, for that to work! :)
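For anyone wondering where a figure of roughly 500 cycles could come from, here is a quick back-of-the-envelope check. It assumes the saving is one zero-page `STA WSYNC` (3 cycles) per scanline, which is my reading of the number rather than something stated in the thread.

```python
CYCLES_PER_WSYNC = 3      # STA WSYNC, zero-page addressing, takes 3 CPU cycles
CYCLES_PER_SCANLINE = 76  # CPU cycles available per scanline on the VCS
SCANLINES = 168           # height of the kernel discussed above

# Assumption: exactly one WSYNC strobe per scanline is avoided.
saved = SCANLINES * CYCLES_PER_WSYNC
total = SCANLINES * CYCLES_PER_SCANLINE

print(saved)  # 504
print(total)  # 12768
print(f"{100 * saved / total:.1f}% of the kernel's cycles")
```

The catch is the one already mentioned: without WSYNC to resynchronize, every path through each line has to add up to exactly 76 cycles.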

 


 


 

 

Think of 8 banks all containing IDENTICAL code (the code I posted). When a bank-switch happens inside that code, it's actually switching to one of the 8 banks to draw the particular 'character' line (21 scanlines). That bank ALSO contains the data/visuals for the character line. So where you see the bankswitch, it's actually switching OUT the code that is currently running. The next instruction comes from a completely different bank, and it just HAPPENS to be exactly the same as the next instruction in the bank the bankswitch was executing in. The very LAST bank just has an 'rts' instead of a continuation of the code, so it does the looping/branching for 8 lines without a single cycle of cost. It's essentially inline code, spread over 8 banks, with every single cycle catered for.
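A toy simulation may help picture this. It is only a model of the idea, not the real kernel layout: the opcode names (`draw`, `next_bank`, `rts`) and the two-instructions-per-row layout are invented for illustration. Every bank holds the same instruction list plus its own row of graphics, the program counter only ever moves forward, and only the last bank differs, ending in `rts`.

```python
NUM_BANKS = 8  # one bank per 21-scanline character row, as described above

def build_banks():
    """Build 8 banks with identical code; only the last bank ends in 'rts'."""
    banks = []
    for b in range(NUM_BANKS):
        code = []
        for _ in range(NUM_BANKS):
            code += ["draw", "next_bank"]   # hypothetical opcodes
        if b == NUM_BANKS - 1:
            code[-1] = "rts"                # last bank returns instead of continuing
        banks.append({"code": code, "gfx": f"row{b}"})
    return banks

def run(banks):
    """Execute from bank 0; the PC is shared and never jumps."""
    bank, pc, drawn = 0, 0, []
    while True:
        op = banks[bank]["code"][pc]  # fetch from whichever bank is mapped in
        pc += 1
        if op == "draw":
            drawn.append(banks[bank]["gfx"])  # data comes from the active bank
        elif op == "next_bank":
            bank += 1   # the code that was just running is switched out
        elif op == "rts":
            return drawn

banks = build_banks()
print(run(banks))
```

Because the instruction at any given address is the same in every bank, swapping banks mid-stream changes which graphics data is visible without costing a jump or a loop counter.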


Note that in complex kernels you might not use WSYNC at all (Probably only Boulder Dash does that)...

Too much credit for us. :)

 

I am sure there are quite a lot of kernels that don't use WSYNC (or use it only very rarely), some from well before Boulder Dash. But they are probably a bit less complex.

 

BTW: I just searched my mails. That kernel was done in May 2005.

