It doesn't need to be just a "put pixel" routine, which is hardware specific. What about basic low level functions like:

- plot line/fill rectangle

- plot bitmap

- plot text

- scroll area

- copy area

That's exactly what we've got now. It's still a fair amount of code, and replacing it for another device is a fair old undertaking. :)

 

I'll attempt to catalogue all the hardware-specific drawing code entry points, args, etc. The documentation task has been neglected of late...
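Until that catalogue exists, here is a rough sketch of the idea being discussed: device-independent code calling a small set of low-level primitives, with one back end per device. Everything here (the class names, the method names, the arguments) is hypothetical and not taken from the GOS source, which is 6502 assembly; it only illustrates the split.

```python
from abc import ABC, abstractmethod


class DisplayDriver(ABC):
    """Hypothetical device-specific layer; method names are illustrative only."""

    @abstractmethod
    def fill_rect(self, x, y, w, h, value): ...

    @abstractmethod
    def copy_area(self, sx, sy, w, h, dx, dy): ...


class MemoryDriver(DisplayDriver):
    """Toy back end: one list entry per pixel, no real hardware involved."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [[0] * width for _ in range(height)]

    def fill_rect(self, x, y, w, h, value):
        for row in range(y, y + h):
            for col in range(x, x + w):
                self.pixels[row][col] = value

    def copy_area(self, sx, sy, w, h, dx, dy):
        # Snapshot the source first so overlapping copies stay correct.
        block = [self.pixels[sy + r][sx:sx + w] for r in range(h)]
        for r in range(h):
            self.pixels[dy + r][dx:dx + w] = block[r]


def draw_window_frame(driver, x, y, w, h):
    """Device-independent code only ever talks to the DisplayDriver interface."""
    driver.fill_rect(x, y, w, h, 1)                  # border
    driver.fill_rect(x + 1, y + 1, w - 2, h - 2, 0)  # interior
```

Porting to another device would then mean writing another DisplayDriver subclass, which is still a fair amount of work for the full set of primitives listed above.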

Edited by flashjazzcat

I think the real beauty (and genius) of this GUI is the fact that it runs on the venerable 6502! Sure it would likely run faster on a faster device, but the fact that it runs so amazingly well on the base hardware is what I find the most incredible and worthwhile thing about it.


Jacobus, I agree, it should always be the main goal, that it runs on original hardware in a good/useable way (maybe with some additional ram/rom). It will be always VERY impressive to see it on more powerful hardware (like in the 65816 example - damn cool!), but being able to run it on the original will always produce the most stunning effect! :)

Edited by Prodatron

After watching the fast 65816 demo video, it struck me that, no matter what, there is going to be choppy window movement with the current design. If you were to try drawing the window that is being dragged as a ghosted/dotted image, would you not have a faster & smoother window move? If every other pixel didn't need to be written to the screen or dealt with until the window reaches its destination, it would seem that there would be an apparent 2x speed up. Just a thought. Thanks for all of the great stuff that you're doing!

Nonetheless, the tearing is the problem. I think that the impact of the tearing could be minimized if you were to ensure that you had contiguous blocks of memory for filled-bitmap arrays to work with, when dealing with moving screen objects. Sort of like how contig works for files.

 

For example, if you were to specify that only one specific bank of ram was utilized for the storage of bitmaps designated as movable screen objects (windows & icons), you could have a service routine that ensures that those bits are always packed in ram optimally via a predetermined logical ram region partitioning, based on maximum possible pixel calculations (max window size, max icon size).

 

This service routine could also ensure contiguous free ram cells (a ram defragmenter &/or zeroer, depending on what it's doing). Perhaps you could stripe these ram-based logical partitions across the last 2K of each bank of expansion ram, to act as temp caches for chunked segments of bits during the ROR/ROL, or however you're moving the bits for the windows, provided that the performance penalty of bank switching doesn't adversely affect the overall operation. This might be minimized by an inventive use of pointers to metadata regions, as in PFS, just applied to sets of bits set in arrays in ram.

 

Anyway, I think that there is more performance to be gained in smooth window movement. The above may just be food for thought, but I would suggest trying to wrap your head around a new type of bank manipulation that utilizes the same concepts as RAID 0, just using ram banks, instead of disks. There are many tricks that have already been developed for use on files & filesystems... my thoughts are that these same techniques can be creatively adapted to the quirks of ram banking on 8-bit processors, and the scarcity & allocation aspects of free ram regions.

 

Basically, doing things in parallel is the ideal, but if you can't, then the next best thing is a well-ordered set of bits rolling down the line, segmented into manageable chunks.

 

Maybe if you think on these things you'll get some cool new idea that can boost performance greatly. I'm just applying a bit of out-of-the-box thinking here, since you are up against very difficult problems making this impossible piece of software possible, ha. Hope that something here helps.

Still ironing some bugs out with MrFish's valiant assistance, but I just realized how to make bootable GOS ROMs for the Ultimate 1MB and Incognito and I'm gonna try it out.

 

The (rather tight, as it turns out) 64KB of space in Ultimate/Incognito currently isn't bootable: it's just empty space and there's no way to place any of those banks at $A000 on powerup. The solution is simple: just use one of the BASIC slots as the base bank of the GOS. This also gives the GOS an extra 8KB to play with.

 

Nonetheless, the tearing is the problem. I think that the impact of the tearing could be minimized if you were to ensure that you had contiguous blocks of memory for filled-bitmap arrays to work with, when dealing with moving screen objects. Sort of like how contig works for files.

 

For example, if you were to specify that only one specific bank of ram was utilized for the storage of bitmaps designated as movable screen objects (windows & icons), you could have a service routine that ensures that those bits are always packed in ram optimally via a predetermined logical ram region partitioning, based on maximum possible pixel calculations (max window size, max icon size).

 

This service routine could also ensure contiguous free ram cells (a ram defragmenter &/or zeroer, depending on what it's doing). Perhaps you could stripe these ram-based logical partitions across the last 2K of each bank of expansion ram, to act as temp caches for chunked segments of bits during the ROR/ROL, or however you're moving the bits for the windows, provided that the performance penalty of bank switching doesn't adversely affect the overall operation. This might be minimized by an inventive use of pointers to metadata regions, as in PFS, just applied to sets of bits set in arrays in ram.

The blitter is just copying one area of the screen to another when moving the front window. No bank switching is involved. The rear windows are iteratively redrawn, but there's no movement and therefore no tearing there. So carefully syncing the blitter with the vertical blank would be the method to reduce tearing with the full window drag.

 

Regarding bit rotation: there are no ROL/ROR instructions used during blitting. Everything's done via a look-up table.
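For anyone wondering what "done via a look-up table" can look like, here is a minimal sketch of one common approach: precompute, for every shift amount and every byte value, the part that stays in the current byte and the part that spills into the next one. The table layout and the names SHIFT_HI, SHIFT_LO and shift_row are assumptions for illustration; the post does not say how the GOS tables are organised.

```python
# SHIFT_HI[s][b]: bits of byte b that stay in the same byte after a right shift by s.
# SHIFT_LO[s][b]: bits of byte b that spill into the next byte to the right.
SHIFT_HI = [[(b >> s) & 0xFF for b in range(256)] for s in range(8)]
SHIFT_LO = [[(b << (8 - s)) & 0xFF for b in range(256)] for s in range(8)]


def shift_row(row, s):
    """Shift a row of packed 1-bit-per-pixel bytes right by s pixels (0-7).

    Only table look-ups and ORs are used in the loop; no per-bit rotation.
    The result is one byte longer than the input to hold the spill-over.
    """
    hi, lo = SHIFT_HI[s], SHIFT_LO[s]
    out = [0] * (len(row) + 1)
    for i, b in enumerate(row):
        out[i] |= hi[b]
        out[i + 1] |= lo[b]
    return out
```

The trade is memory for time: the tables cost a few KB, but each source byte is handled with two indexed reads instead of up to seven rotate instructions.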

 

Anyway, I think that there is more performance to be gained in smooth window movement. The above may just be food for thought, but I would suggest trying to wrap your head around a new type of bank manipulation that utilizes the same concepts as RAID 0, just using ram banks, instead of disks. There are many tricks that have already been developed for use on files & filesystems... my thoughts are that these same techniques can be creatively adapted to the quirks of ram banking on 8-bit processors, and the scarcity & allocation aspects of free ram regions.

As I say: the blitter is just the fastest possible copy operation from one area of the frame buffer to another (fast because it completely eliminates bit rotation), but perhaps when there's less to do I'll devote some time to enlightening myself to these advanced concepts. Of course, until I document the thing I'm the only one who knows how it works at the moment, so speculation is to be expected. :) For information, the memory manager is an adapted version of the LNG allocator, augmented to handle up to sixty-five 16KB banks. It does 256 byte page allocations, uses an allocation bitmap in each bank, and performs "best fit" allocation.
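As a rough illustration of the allocator description above (256 byte pages, one allocation bitmap per bank, best fit), here is a minimal sketch for a single 16KB bank. It is not the LNG code or the GOS code; the function names and the use of a Python list as the bitmap are assumptions made for the example.

```python
PAGE_SIZE = 256
PAGES_PER_BANK = 16 * 1024 // PAGE_SIZE  # 64 pages in a 16KB bank


def free_runs(bitmap):
    """Yield (start, length) for each run of free pages.

    bitmap[i] is True when page i is in use.
    """
    start = None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i
        elif used and start is not None:
            yield start, i - start
            start = None
    if start is not None:
        yield start, len(bitmap) - start


def alloc_best_fit(bitmap, size_bytes):
    """Claim the smallest free run that fits; return its first page or None."""
    pages = -(-size_bytes // PAGE_SIZE)  # round up to whole pages
    fits = [(length, start) for start, length in free_runs(bitmap) if length >= pages]
    if not fits:
        return None
    _, start = min(fits)  # smallest run wins; ties go to the lowest page
    for i in range(start, start + pages):
        bitmap[i] = True
    return start
```

Best fit keeps large free runs intact for large requests, at the cost of scanning every free run on each allocation.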

 

Basically, doing things in parallel is the ideal, but if you can't, then the next best thing is a well-ordered set of bits rolling down the line, segmented into manageable chunks.

 

Maybe if you think on these things you'll get some cool new idea that can boost performance greatly. I'm just applying a bit of out-of-the-box thinking here, since you are up against very difficult problems making this impossible piece of software possible, ha. Hope that something here helps.

Hopefully what I've explained about the blitter has cleared a few things up. :) This full window drag business was a bit of a "dare" between me and Prodatron, and makes quite a nice 65816 demo. If it's ever worth trying to sync the blitter with the VBLANK, I might do so, but such things are low on the to-do list.

 

Regarding the "impossibility" of a multitasking graphical OS on the A8: the only impossibility seems now to be finding the time to write the rest of the code. As to the mechanics of the thing: they work.

Edited by flashjazzcat

Some light reading for those who wonder what I've been doing with my time, describing two recent bug fixes. Thanks to MrFish for test-driving the ROMs and running into these issues in the first place.

 

Bug 1:

 

Laboriously triggering the typical crash (when closing apps), I noticed that the scheduler was dumping the CPU in no-man’s land. Unfortunately it was not humanly possible to back-trace the newly entered context to the point at which it previously yielded. The execution history for an endless series of context switches is not for the faint of heart. :) All I really had to go on was the task’s PID (process ID), and I eventually hit lucky when the crash happened when the kernel task (ID #1) was being switched back in after waking up because it had a pending message. I immediately noticed that the scheduler had exited (via RTI) to the wrapper “wake up” code for processes awaking from timed sleep (the Profiler calls this all the time, for example) before crashing on a bad instruction in some weird place. But the kernel process never goes into a timed sleep in the first place: it just waits on messages. So how did the kernel end up with the address of the wake up wrapper on its stack?

 

The only possibility was that this was the wrong stack frame: it couldn’t belong to the kernel process. With that idea in mind, I trawled through the segmented stack caching code and eventually found something. Each task has a slot in the array PROCESS.STACKSLOT. This location holds the stack slot number of the current process (i.e. which quarter of the stack it’s using, as in 1-4), or zero if the task’s stack is cached out (usually because the task went to sleep and another process claimed its stack frame). However, right next to the definition for PROCESS.STACKSLOT, there’s a comment saying “bits 0-1: slot number, bit 7 = cached”. Presumably I’d begun by numbering the slots 0-3 and using bit 7 to mean “cached”. At some point, I’d revised things so that the slots are numbered 1-4 and zero means “cached”. But I didn’t tell that to the delete process routine, which tested bit 7 of PROCESS.STACKSLOT to establish whether the task being killed was occupying the hardware stack space. Of course bit 7 was never set, so it was always assumed that the stack was occupying a slot. The slot number was then extracted from the lower bits and marked free in the stack slot allocation table. But if the stack was cached (PROCESS.STACKSLOT = 0) the lower bits were 0 also, so what tended to happen was that slot 0 was always marked free even if it contained an active process’s stack. What would then happen is the scheduler would look for an unused slot and overwrite an active stack with that of a newly woken up process.
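The mismatch between the two conventions can be sketched in a few lines. This is a simplified model and not the actual routine: the function names, the mask value and the list used as the allocation table are assumptions, and how the real table is indexed is not stated above. The sketch simply treats entry 0 as guarding a live stack, as described.

```python
CACHED = 0       # revised convention: 0 = stack cached out, 1-4 = slot in use
LOW_BITS = 0x07  # assumed mask, wide enough to hold the values 1-4


def kill_task_buggy(stackslot, slot_free):
    # Stale convention: bit 7 set means "cached". It is never set any more,
    # so every task is treated as owning a hardware stack slot.
    if (stackslot & 0x80) == 0:
        slot_free[stackslot & LOW_BITS] = True


def kill_task_fixed(stackslot, slot_free):
    # Revised convention: only a non-zero value names a slot to release.
    if stackslot != CACHED:
        slot_free[stackslot & LOW_BITS] = True
```

With stackslot equal to 0 (a cached stack), the buggy version marks entry 0 free even though nothing was released, which is the point at which the scheduler can hand a live stack frame to another process.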

 

Bug 2:

 

Even with the stack cache bug fixed, I was still getting mangled windows after closing down apps and then opening more. It always seemed to happen at around the sixteenth attempt to open a window (regardless of how many had been closed down). This offered a clue. I used the debugger to keep eyes on the window list and soon discovered that the head and tail pointers were eventually overrunning the limit (sixteen) or becoming set to zero while windows were open. Basically I found that the top-level window close routine was calling the wrong routine to remove the deleted window from the window list. The routine being called deleted the list node, but didn’t add the released node back onto the free node list. A routine one level higher up did this. So, once the right routine was called, new windows were getting proper node numbers and problems went away.
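A toy model of the leak, for illustration only: a fixed pool of sixteen node numbers, a low-level routine that only unlinks a node, and a higher-level routine that also returns it to the free list. The class and method names are hypothetical and the real window list is a linked list in 6502 code, not Python lists.

```python
MAX_WINDOWS = 16


class WindowList:
    """Toy node pool: an ordered list of open windows plus a free-node list."""

    def __init__(self):
        self.free = list(range(MAX_WINDOWS))  # node numbers still available
        self.order = []                       # open windows, back to front

    def open(self):
        """Take a node from the free list, or return None if the pool is empty."""
        if not self.free:
            return None
        node = self.free.pop(0)
        self.order.append(node)
        return node

    def unlink(self, node):
        """Low-level: remove the node from the window list only."""
        self.order.remove(node)

    def close(self, node):
        """Higher-level: unlink the node and hand it back to the free list."""
        self.unlink(node)
        self.free.append(node)
```

Calling unlink() where close() was intended loses one node per closed window, so the pool runs dry at around the sixteenth open regardless of how many windows are actually on screen.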

Edited by flashjazzcat

Bug 1:

 

Wow, thank God for commented code, even your own! That was a hairy one!

 

Bug 2:

 

A little more straightforward with the 16 involved. Glad you got that one fixed.

 

Thanks MrFish for the testing! Those were some serious bugs that needed squashing. :-D


Here's a ROM for a 1Mbit MaxFlash cart:

 

A8 GOS - AtariMax 1Mbit, ST Mouse.zip

 

ST mouse required. I just seem to be pottering around looking for bugs at the moment, so I figured why not put the thing out there, warts and all. ;) I'll make more builds as and when: just want to start the ball rolling here. You can use the MaxFlash cart studio to make a flasher ATR if you have a real cart, or just mount the ROM in Altirra (choose MaxFlash 128K / 1Mbit).

 

All you can really do with this is launch multiple instances of the Profiler (i.e. task manager) and "Jotter" (i.e. dummy text window) from the System menu (i.e. the Fuji menu). So: single-click on the Fuji menu header, and click on "Profiler" and "Jotter". Look in the Applications and Processes tabs of the Profiler to see an auto-updating list of processes, their PIDs, CPU usage, etc. Shut applications down with the close box in the top left corner of the window.

 

 

What doesn't work:

  • Command buttons in the Profiler, which simply aren't plumbed in yet. Close applications with the window close box in the drag bar.
  • Application menu options. Not plumbed in yet.
  • Any kind of message when you hit the current sixteen process limit, or elegant handling of running out of RAM. It'll probably just crash in the latter case.
  • There's no event code attached to desktop icons yet either, so clicking or double-clicking on those does nothing.
  • System clock doesn't advance

 

What should work:

  • Pre-emptive multi-tasking microkernel, pre-empting the running process at 50/60Hz using the same priority scheme as described in the SymbOS documentation. Processes are also explicitly yielding their CPU quantum when they have nothing to do, so cooperative applications ease system load yet further.
  • Sleep timers: the Profiler application puts itself to sleep for 2-3 seconds and is put back on the run queue when the timer expires or it gets another message. If the timer didn't expire, the application receives the number of remaining ticks as a 24-bit integer. This is kind of similar to the way sleep calls in Linux (nanosleep(), for example) report the time remaining when they are interrupted.
  • Global message queue which is where all the IPC happens. Room for sixty four messages. No checks for unresponsive applications yet, but this will be added, so you'll be able to kill apps which have become unresponsive (providing they haven't wiped out the OS). :)
  • ROM disk handler. Applications and fonts are now loaded from a crude file system on the cartridge. There's a FAT-like root directory in one of the ROM banks and a very basic file handler in the kernel.
  • Memory manager. Adapted from the LNG allocator (author approached, didn't reply, so I went ahead); bitmap allocation table, best-fit allocation, 256 byte page chunks, supports up to 65 x 16KB banks (including main bank). Single page per bank reserved for free block map, allocation chains, etc.
  • Rectangle based window manager, actually started at the end of 2013.
  • Pixel precise blitter, recently finished.
  • Relocating loader for MADS relocatable binaries.
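The sleep timer behaviour in the list above (wake on expiry, or wake early on a message and receive the remaining ticks as a 24-bit value) can be sketched roughly as follows. This is an illustrative model only: the class, its methods and the dictionary of wake times are assumptions, and the real kernel keeps its timers in linked lists.

```python
TICK_MASK = 0xFFFFFF  # remaining time is reported as a 24-bit value


class SleepTimers:
    """Toy timer queue keyed by process ID."""

    def __init__(self):
        self.now = 0
        self.wake_at = {}  # pid -> tick at which the timer expires

    def sleep(self, pid, ticks):
        """Put a process to sleep for the given number of ticks."""
        self.wake_at[pid] = self.now + ticks

    def advance(self, ticks=1):
        """Advance the clock; return the pids whose timers have expired."""
        self.now += ticks
        expired = [p for p, t in self.wake_at.items() if t <= self.now]
        for p in expired:
            del self.wake_at[p]
        return expired

    def wake_on_message(self, pid):
        """Wake a sleeper early; return its unexpired ticks (0 if not asleep)."""
        wake = self.wake_at.pop(pid, self.now)
        return max(wake - self.now, 0) & TICK_MASK
```

At 50Hz, a 3 second sleep is 150 ticks; a process woken by a message after 1 second would get 100 back and could go to sleep again for the remainder if it wanted to.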

 

Bugs:

  • Processes on sleep timers very occasionally never wake up again. This one's hard to trigger but a dump of main RAM when it does happen will probably help. Presumably a bug in the linked lists.
  • Occasional freezing (?)
  • Occasional graphical glitches (window manager probably contains many bugs still)
  • Application count in Profiler is wrong (just keeps climbing: thanks MrFish).

 

Quirks:

  • Memory allocator runs in kernel context, which keeps interrupt disable bit set, so mouse pointer stalls when loading applications. This is a carry-over from when the kernel was called via BRK and a less intrusive method of running atomic code will eventually be used for lengthy processes. IPC and other stuff works out quite nicely as a short IRQ, meanwhile.
  • NTSC will run more slowly than PAL because of the 60Hz NMI. If the scheduler ever gets moved back to a timer IRQ, the problem will be slightly eased, but the mouse pointer NMI will always match the frame rate.

 

To-do:

  • System manager, which handles alert boxes, file systems, etc, etc. The meat and potatoes of a working OS, basically.
  • New UI controls and refinements to existing ones.
  • Driver architecture.
  • Lots, lots more. ;)

SIO and file system are the next big jobs, and for these I need to work out how the drivers will look. DCB (page three and ZDCB on page zero) will be compatible with the Atari OS so that PBI HDDs will work, but custom SIO routines will be needed because the Atari OS is always banked out. SIDE2 version will have the HDD driver on the cart (much like SDX does), and will therefore form a self-contained package, requiring nothing more than adequate RAM on the host machine.

Edited by flashjazzcat

Amazing stuff...

I just downloaded the latest Altirra and tested your new rom.

The Atari 600XL was my first computer in January 1985. This is very special to me after (almost exactly) 30 years.

Really, really special.

Edited by Dutch800XL

Can this be flashed to a u1mb?

Not this image, no. The source code is now set up to produce different sized ROMs with different banking schemes, but experiments with Ultimate 1MB's "GUI" slot have proved a tremendous ballache. Aside from the fact that 64KB (eight banks) is barely enough for the core system, the banking register at $D5E0 is only active when SDX is switched on (this is why SDX must be enabled to access the XEX loader, which itself is a 16KB banked ROM). I had assumed that using one of the spare BASIC slots to "chain" to the GUI space would have worked, but I haven't managed to attain anything approaching the desired result yet.

 

Probably not doable any time soon, but it would be awesome to see this support the touchscreen mode of the Colleen emulator for Android. Maybe outside the scope of what you want to do though?

I was running the GUI on Colleen on my Android cell a couple of years ago... it wasn't perfect, but it should still work if the ROM is supported.

 

Great news. AtariMax 8Mbit or SIC! or SIDE version, maybe? :)

Yeah: all in the pipeline, and most flashable with UFlash 2 beta. :)

Edited by flashjazzcat

Here's a ROM for a 1Mbit MaxFlash cart:

 

ST mouse required.

 

Yeah, plug&play was unable to download the latest drivers for the Atari TrackBall I had at the joystick port.

I guess that's the price I pay for being an early adopter!

 

Like many others, I'm enjoying every bit of this time machine ride...


Jon, GOS looks great!

 

Will there be a mouse sensitivity adjustment at some point?

 

My converted Logitech Bus Mouse moves the cursor from edge to edge with only 0.9 inch (2.29 cm) of travel.

 

Terrific job and thanks for offering up the demo!

 

-a8isa1

Edited by a8isa1
