flashjazzcat Posted August 16, 2013 Author

Fixed up some code to handle the "BRK bug", whereby we might find ourselves in the NMI with the B flag set (and thus with a BRK instruction missed).

    .proc ExitInt
    NMI
        tsx
        lda $0103,x ; get saved processor status register
        and #$10    ; check for BRK flag
        beq NotBRK
        lda $0104,x
        sec
        sbc #2      ; drop it by 2 (back to missed BRK)
        sta $0104,x
        lda $0105,x ; get MSB of return address
        sbc #0
        sta $0105,x ; after RTI, missed BRK will now be processed
    NotBRK
        pla         ; get x which we pushed earlier
        tax
    IRQ
        lda pbsave
        sta $d301
        pla         ; get a
        rti
    .endp

Tried doing this at the top of the NMI at first (note the above code is a wrapper which banks the OS back in during the interrupt phase), but it clattered the DLI timing. Doing it on exit causes no problems, since we only have one DLI per frame. If an NMI interrupts an IRQ (while the OS is banked in), it's no problem, since there are no BRK instructions to pre-empt in the interrupt service routines.

Next task is to implement BRK handling in the IRQ, then run a continuous loop to test the BRK bug handling.
flashjazzcat Posted August 18, 2013 Author (edited)

Just completed some tests with a second (64Hz) Pokey IRQ (which will become the scheduler) running alongside the mouse sampler: performance seemed to suffer because of the check for BRK at the top of the IRQ routine (in order to branch to the dispatcher), but I saved many thousands of machine cycles by placing the Pokey timer interrupt dispatcher in the custom IRQ handler before interrupts are handed over to the OS routines. This really made for a drastic speed improvement (since the Pokey timers are now checked before anything else, and the service routines jumped to directly rather than via the timer vectors) - such that the system is faster with the two Pokey IRQs than it was previously with only one. This is a fairly compelling argument for - if not writing a complete custom OS - doing away with the OS interrupt dispatcher altogether and running the whole thing off a custom routine.

    .proc IRQHand
        pha
        txa
        pha
        tsx
        lda $0103,x ; check for BRK
        and #$10
        bne Dispatcher
        pla         ; pull x off stack
        tax
        lda #2
        bit IRQST   ; Pokey Timer 2? (Mouse Sampler)
        bne NotTimer2
        lda #$fd
        sta IRQST
        lda POKMSK
        sta IRQST
        jmp Mouse
    NotTimer2
        lda #4
        bit IRQST   ; Pokey Timer 4? (Scheduler)
        bne NotTimer4
        lda #$fb
        sta IRQST
        lda POKMSK
        sta IRQST
        jmp Scheduler
    NotTimer4
        lda #> [ExitInt.IRQ]
        pha
        lda #< [ExitInt.IRQ]
        pha
        php
        lda $d301
        sta pbsave
        ora #1
        sta $d301
        jmp (vimirq)
    Dispatcher
        pla
        tax
        jmp ExitInt.IRQ
    .endp

Edited August 18, 2013 by flashjazzcat
phaeron Posted August 18, 2013

Yes, the XL/XE OS's interrupt dispatcher is slow and takes ~200 cycles to dispatch an IRQ. For about 30 bytes more it can be doubled in speed even while maintaining the same IRQ priority and support for PBI and PIA interrupts. It could be made even faster if POKEY's IRQ bits weren't so awkwardly arranged -- I have no idea why they chose to put the keyboard IRQs into bits 6 and 7 instead of the serial ready IRQs, and the special case needed for the serial output complete IRQ is a PITA.

I assume since you're using both timers 2 and 4 that you're not concerned with handling serial I/O, which the OS routines are optimized for. The stock OS also can't use abs,X indexing to test for BRK because it has to handle stack wrapping, but that's not an issue for you.

Note that your new optimized dispatch isn't clearing the decimal flag before calling the mouse or scheduler routines like the OS does. Might want to add a CLD.
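The two points in the post above (clearing decimal mode, and testing for BRK without abs,X indexing) can be sketched together. This is only an illustration in MADS syntax, not the stock OS code and not the GUI's handler: `SaveA`, `IRQEntry` and the `org` address are hypothetical names and values chosen for the example. The NMOS 6502 does not clear the D flag when it takes an interrupt, so a handler that uses ADC or SBC needs its own CLD.

```asm
; Sketch only. SaveA is an assumed free zero-page byte; IsBRK and NotBRK
; stand in for whatever the real handler does next.
SaveA	= $cb

	org $2000

.proc IRQEntry
	cld		; force binary mode for any ADC/SBC in the service routines
	sta SaveA	; park A without touching the stack
	pla		; pull the P that the interrupt sequence pushed...
	pha		; ...and put it straight back
	and #$10	; B flag set in the stacked copy?
	bne IsBRK
NotBRK
	lda SaveA	; ordinary IRQ: restore A
	; ... timer dispatch would go here ...
	rti
IsBRK
	lda SaveA	; BRK: restore A
	; ... BRK handling would go here ...
	rti
.endp
```

Because the test uses PLA/PHA instead of `$0103,x`, it gives the same answer wherever the stack pointer happens to be, including right after it has wrapped. The price is the zero-page temporary, which must not be shared with any code that can interrupt this handler.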
flashjazzcat Posted August 18, 2013 Author (edited)

> Yes, the XL/XE OS's interrupt dispatcher is slow and takes ~200 cycles to dispatch an IRQ. For about 30 bytes more it can be doubled in speed even while maintaining the same IRQ priority and support for PBI and PIA interrupts. It could be made even faster if POKEY's IRQ bits weren't so awkwardly arranged -- I have no idea why they chose to put the keyboard IRQs into bits 6 and 7 instead of the serial ready IRQs, and the special case needed for the serial output complete IRQ is a PITA.

Well, I'm learning a lot fast here, since I haven't had a reason to delve deeply into the interrupt handlers in the past. But when I realized how much more quickly I could get to that mouse sampler (and I've since moved the BRK check so that it's done after the two Pokey dispatches), I wondered why I hadn't done it sooner: Pokey timer 2 fires around 1,000 times a second, so saving 50-100 cycles getting there amounts to a considerable optimisation. Without doubt a lot of things can be done more efficiently than the OS implementation when the dispatcher is tailored for a specific purpose... but then the OS routine is generic, of course.

> I assume since you're using both timers 2 and 4 that you're not concerned with handling serial I/O, which the OS routines are optimized for. The stock OS also can't use abs,X indexing to test for BRK because it has to handle stack wrapping, but that's not an issue for you.

The custom interrupt handlers won't be used during serial I/O, since the GUI's handlers are wrappers which were initially intended simply to switch the OS ROM back in (it's RAM the rest of the time). CIO calls also use wrapper code (to bank in the OS ROM), and the interrupt service routines themselves are carefully designed so that the stock OS has no trouble calling them if the ROM happens to be switched in... or if an NMI pre-empts an IRQ, for instance.
Regarding the stack: I may end up using a segmented stack (possibly 64 bytes per process), so I'll have to give some careful thought to the indexed addressing. More than four processes will be supported, of course, but the idea is to try and minimize stack copying (although I don't see it being such a major performance problem as was first supposed).

> Note that your new optimized dispatch isn't clearing the decimal flag before calling the mouse or scheduler routines like the OS does. Might want to add a CLD.

Thanks... I'd missed that.

BTW: does using 2 Pokey timers mean we only have 2 channels left for audio beeps and burps? I'm a bit of a sound noob...

Edited August 18, 2013 by flashjazzcat
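As a rough sanity check on the figures quoted in this post (about 1,000 timer 2 interrupts per second, 50-100 cycles saved on each), the saving can be set against the CPU clock. The clock value is an assumption not stated in the thread: about 1.77 MHz for a PAL machine, with cycles lost to ANTIC DMA ignored.

```latex
\begin{align*}
\text{cycles saved per second} &= 1000 \times (50 \text{ to } 100) = 50{,}000 \text{ to } 100{,}000 \\[4pt]
\text{share of the CPU clock} &\approx \frac{50{,}000}{1{,}773{,}447} \text{ to } \frac{100{,}000}{1{,}773{,}447}
 \approx 2.8\% \text{ to } 5.6\%
\end{align*}
```

An NTSC machine (about 1.79 MHz) gives practically the same percentages. Since display DMA takes cycles away from the CPU, the share of the time actually available to code would be somewhat higher than these figures.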
phaeron Posted August 18, 2013 (edited)

> BTW: does using 2 Pokey timers mean we only have 2 channels left for audio beeps and burps? I'm a bit of a sound noob...

It means you only have two channels left that you can control the pitch on. It's perfectly valid to enable audio output on a channel that you're also using to fire IRQs, as long as the timer period works for both uses. Tricks with the noise generators can help squeeze out a couple of extra pitches from the same timer period.

You can also still reuse those channels for other sounds by driving them directly in volume-only mode in a loop. It won't be the highest quality sound, but it suffices for a short click. This is what the OS keyboard handler does for the key click by means of VCOUNT polling, but with CONSOL instead of an audio channel.

Edited August 18, 2013 by phaeron
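The volume-only idea described above might look something like the following in MADS syntax. It is a sketch under assumptions, not the OS key-click routine: the `Click` routine, the choice of channel 4, the period count and the `org` address are all made up for illustration. The loop toggles the channel's volume between full and zero in volume-only mode and uses VCOUNT changes for timing, leaving the channel's frequency register alone.

```asm
AUDC4	= $d207		; POKEY channel 4 control; bit 4 ($10) = volume-only mode
VCOUNT	= $d40b		; ANTIC vertical counter (changes every two scan lines)

	org $2000

.proc Click
	ldy #32		; number of on/off periods to emit (arbitrary)
Loop
	lda #$1f	; volume-only mode, level 15
	jsr Hold
	lda #$10	; volume-only mode, level 0
	jsr Hold
	dey
	bne Loop
	rts		; channel is left silent (volume-only, level 0)

Hold
	sta AUDC4	; set the output level
	lda VCOUNT
Wait
	cmp VCOUNT	; spin until VCOUNT moves on
	beq Wait
	rts
.endp
```

The routine uses A and Y and busy-waits for roughly a hundred scan lines in total, so it only suits very short effects. Any IRQs that fire during the loop will stretch the timing slightly.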
flashjazzcat Posted August 19, 2013 Author

> It means you only have two channels left that you can control the pitch on. It's perfectly valid to enable audio output on a channel that you're also using to fire IRQs, as long as the timer period works for both uses. Tricks with the noise generators can help squeeze out a couple of extra pitches from the same timer period.

OK - thanks for that.

I could have run the scheduler off the NMI, but jumping into the scheduler's context would have been that much more difficult. Using the IRQ with BRK as a "syscall" mechanism is pretty elegant... at least it should be. We also get much more flexibility with the scheduler's frequency using the IRQ, of course.
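One common way to turn BRK into a "syscall" is to follow the BRK opcode with a byte naming the service, and have the handler read that byte back through the stacked return address. The thread does not say how the GUI passes its service numbers, so everything below is hypothetical: `SYS_YIELD`, `Ptr`, `Caller`, `BrkEntry` and the `org` address are invented for this MADS-syntax sketch, which also assumes the handler is entered with only the CPU-pushed P and return address on the stack.

```asm
SYS_YIELD = 1		; hypothetical service number
Ptr	= $cc		; assumed free zero-page word

	org $2000

.proc Caller
	brk
	.byte SYS_YIELD	; operand byte read by the handler
	rts		; execution resumes here after the handler's RTI
.endp

.proc BrkEntry
	pha
	txa
	pha
	tya
	pha		; stack: S+1=Y, S+2=X, S+3=A, S+4=P, S+5=PCL, S+6=PCH
	tsx
	sec
	lda $0105,x	; low byte of return address (BRK address + 2)
	sbc #1		; step back one byte to the operand
	sta Ptr
	lda $0106,x	; high byte of return address
	sbc #0
	sta Ptr+1
	ldy #0
	lda (Ptr),y	; A = service number that followed the BRK
	; ... dispatch on A here ...
	pla
	tay
	pla
	tax
	pla
	rti
.endp
```

BRK pushes the address of the opcode plus two, which is why the RTI lands on the instruction after the operand byte without any extra adjustment. The shared `Ptr` makes this version non-reentrant, so a real handler would need to keep it out of reach of anything that can interrupt it.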
flashjazzcat Posted August 20, 2013 Author

Brave new development environment:
phaeron Posted August 21, 2013

Stop making my debugger look ugly. Go to Debug > Options > Change Font to change the debugger font back to a monospace font.
popmilo Posted August 21, 2013

> Brave new development environment:

For a moment I thought you were showing us a new Atari IDE written inside your GUI!
flashjazzcat Posted August 21, 2013 Author

> Stop making my debugger look ugly. Go to Debug > Options > Change Font to change the debugger font back to a monospace font.

Ah... That explains it. Actually I came a cropper with Eclipse and was on the point of abandoning this setup. The new builds don't play nice with the plug-ins, so I'm having to jump through a few hoops.
fibrewire Posted September 15, 2013

I like the updated site Jon - looking good! Those mock-up screenshots are fantastic! I posted them here for those too lazy to click a link...
flashjazzcat Posted September 16, 2013 Author

Thanks! And all credit to MrFish for those superb new mock-ups. It'll take a while to knock the new site into shape, and I don't even have all the original content online yet.
fibrewire Posted September 19, 2013 (edited)

Well, here's how it looked on a 65C02 CPU Apple II with 128K banked RAM: http://www.youtube.com/watch?v=weFF8P6gyCE

How difficult would it be to locate the text rendering section with only the disassembled binary available? I only have my phone available to me right now, so I'm even more useless at the moment than usual.

That text render sure is fast on this demo... Also it looks like the text is rendered simultaneously with the graphics; this is apparent when the GUI draws text with icons.

Edited September 19, 2013 by fibrewire
flashjazzcat Posted September 19, 2013 Author

> How difficult would it be to locate the text rendering section with only the disassembled binary available? I only have my phone available to me right now, so I'm even more useless at the moment than usual. That text render sure is fast on this demo... Also it looks like the text is rendered simultaneously with the graphics; this is apparent when the GUI draws text with icons.

I've just watched the whole thing through again and there's nothing which makes me want to dig the disassembler out, impressive though it is. The refresh in BeagleWrite isn't all that startlingly fast, and the menu and client area rendering looks on a par with what we've got at the moment... possibly slower in places (especially the menus). Not sure I see what you mean about the simultaneous rendering: it just looks to me like each icon's label is rendered immediately after the icon is drawn. I can't see much to be gained by doing any kind of interleaved / parallel rendering.

Great system, though... totally inspiring!
atarixle Posted September 21, 2013

I agree with you. Here is a brief history of text rendering in BOSS-X ... I don't think there will be any need to disassemble Apple's text renderer.

http://www.youtube.com/watch?v=I5iNBPXy8Ys
flashjazzcat Posted September 21, 2013 Author

> I agree with you, here is a brief history in Text-Rendering in BOSS-X ... I don't think there will be any need to disassemble Apple's Text-Renderer http://www.youtube.com/watch?v=I5iNBPXy8Ys

Good video. This kind of continuous refinement is lots of fun. I think the Apple renderer uses the same look-up table approach we're using, however.
Wrathchild Posted September 21, 2013

Whilst the video showed the improvement of the BOSS rendering routine's speed, the change from showing line scrolling to page transitions spoilt the comparison for me.
flashjazzcat Posted September 21, 2013 Author

> Whilst the video showed the improvement of the BOSS rendering routine's speed, the change from showing line scrolling to page transitions spoilt the comparison for me.

Yeah, the scrolling version was really nice.
atarixle Posted September 22, 2013 (edited)

I doubled the scrolling speed by scrolling two scan-lines in one step, so the comparison would have been corrupted by this already.

But the rendering speed in fact is faster. In the Viewer 2011 I removed a PROCedure call and replaced it with a direct USR call. This saved some time in Turbo-BASIC. The new 2011 routine in fact is faster, but you have to use a stopwatch to measure that.

Displaying one page is a lot faster than before. More important, I found, was that the Viewer can additionally jump from page to page.

Yeah, and the 2013 version boosts the speed further by also using an ML routine for word-wrapping.

Edited September 22, 2013 by atarixle
flashjazzcat Posted September 22, 2013 Author (edited)

Why not quadruple the scrolling speed again by scrolling a full line of text at a time?

Edited September 22, 2013 by flashjazzcat
atarixle Posted September 22, 2013 (edited)

I had to do it five times faster ... because I use 10 scanlines for one line of text, which makes it more readable (although I do not support letters with descenders). And I like the feeling of fine scrolling.

Oh btw, the newer viewer still does fine-scrolling; I was just showing the page-by-page view in the video.

Re-upload of the video, hopefully in the correct ratio ...

Edited September 22, 2013 by atarixle
crash Posted September 22, 2013

I would choose scrolling a line at a time rather than fine/smooth scrolling if there was a choice. The difference in speed over the years is very impressive. Thanks for sharing this!
UNIXcoffee928 Posted September 22, 2013

Hi Jon! I was wondering if you have included Koala Pad & Atari Light Pen support? I am of the opinion that the addition of these two input devices would provide a modern feel on original hardware, and may also help with pointer accuracy when running the software in an emulator, on a touch enabled device.

You've been doing a fabulous job on this GUI, thanks for all of your efforts!
flashjazzcat Posted September 22, 2013 Author

> Hi Jon! I was wondering if you have included Koala Pad & Atari Light Pen support? I am of the opinion that the addition of these two input devices would provide a modern feel on original hardware, and may also help with pointer accuracy when running the software in an emulator, on a touch enabled device. You've been doing a fabulous job on this GUI, thanks for all of your efforts!

Thanks. No reason why we can't provide drivers for all manner of input devices. The only problem I can foresee is trying to use a 320x200 drawing package with an input device (such as a tablet) with lower resolution. Other than that: no worries.
flashjazzcat Posted September 22, 2013 Author

For light relief: http://hackaday.com/2013/08/28/a-one-third-scale-macintosh/

Interestingly it uses a 320x200 display.