flashjazzcat (Author), posted September 7, 2014 (edited)

"Now that you mention that, that makes perfect sense, but sheezus, these are the kinds of weird things that you only run into after you've implemented interrupt sensitive code."

Exactly, and I've learned a lot about interrupts while writing this (and from Avery's Altirra Hardware Reference Manual, which I always keep to hand). But this was an interesting case, and my previous theory about skipped IRQs didn't hold much water. After all, IRQs are level triggered, and until the interrupt source has been acknowledged, they won't be missed. The scheduler IRQ would have to be delayed for an entire frame before a second timer IRQ would cause the original IRQ to be superseded.

While changing things so the scheduler runs in the VBI (which - right now - seems a much better idea), I figured out what was wrong. Since the kernel was using the I flag to prevent context switches in critical sections, the entire scheduler IRQ was being blocked, potentially for long enough to miss a whole frame of VCOUNT ticks. What we wanted to skip was just the context switch, though - not the entire interrupt, which measures VCOUNT and assumes no more than a single frame's worth of ticks have occurred. The solution was, of course, to update the process's cumulative VCOUNT total once per frame (now in the NMI), regardless of whether a context switch occurs.

The VCOUNT totals are now accurately maintained, and I've corrected the frames-per-sample count (which was off). Fifteen per cent idle is much more like it, and I think this is correct now, although I'm actually (pleasantly) surprised it's that low.

Note: I'm moving those percentage counters out of the control group frame ASAP. They look ugly when they update (although the momentary "strikethrough" effect is much more pronounced in the video than in actuality).

PS: We can (and do) still use SEI to define a critical section. The NMI updates the VCOUNT counters and then checks the state of "I" on the stack.

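The fix described in the post above can be sketched roughly as follows. This is only an illustration in Python, not the kernel's actual 6502 code: the `Process` class, the `on_frame_nmi` function and the idea of passing in `ticks_this_frame` are all assumptions made for the example. The one hard fact used is that the interrupt-disable flag is bit 2 (`0x04`) of the 6502 status register, which the CPU pushes on the stack when an interrupt is taken.

```python
from dataclasses import dataclass

I_FLAG = 0x04  # interrupt-disable bit in the 6502 status register


@dataclass
class Process:
    name: str
    vcount_total: int = 0  # cumulative VCOUNT ticks charged to this process


def on_frame_nmi(current, next_process, ticks_this_frame, stacked_status):
    """Hypothetical per-frame handler: account first, then decide on a switch.

    Returns the process that should run once the interrupt returns.
    """
    # Accounting happens every frame, whether or not a switch follows.
    current.vcount_total += ticks_this_frame

    # If the interrupted code had I set (via SEI), it was inside a critical
    # section: skip the context switch, but the totals above stay accurate.
    if stacked_status & I_FLAG:
        return current
    return next_process
```

The point of the ordering is that a long critical section can delay a context switch but can no longer cause a frame's worth of ticks to go uncounted.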
Heaven/TQA, posted September 7, 2014

I find this amazing each time I watch it...

Prodatron, posted September 7, 2014 (edited)

Wow, 15%, that looks much better!! Didn't I say you would get the idle time increased? I like the way the diagram is displayed! Maybe it's even faster if you don't redraw the frame line every time? OK, probably that doesn't matter so much. I am very impressed!

TheNameOfTheGame, posted September 7, 2014

Wow! So idle actually just consumes 15% then. Fantastic insight you had there.

flashjazzcat (Author), posted September 7, 2014 (edited)

"Wow, 15%, that looks much better!! Didn't I say you will get the idle time increased? And I like the way the diagram is displayed! Maybe it's even faster if you don't redraw the frame every time?"

You were right. Indeed, nearly every suggestion you've made has been right (window rectangles onwards). As I said, the frame looks ugly being redrawn every time, and it won't have to be if I move the percentage value.

"So idle actually just consumes 15% then."

Yeah. I had forgotten to update the tally on a forced context switch (having moved a lot of code around) in the video, but even when I reinstated this, the idle load remained at 15-16%.

EDIT: turns out the idle load is 32 per cent on an NTSC machine. I think the drop in scheduler frequency from 64Hz to 50Hz had a fairly large impact on the idle load in itself, but with NTSC's 60Hz NMI we're right back up there again. Of course the mouse pointer is rendered 20 per cent more frequently on NTSC Ataris too. So - do we artificially worsen PAL performance to match that of NTSC? Oh - the joys of video standard dependent code.

Oops... forgot to reboot the system after changing from PAL to NTSC in Altirra. The kernel measures the maximum VCOUNT value at boot-time, and thus we were still counting the number of frames required on a PAL system. NTSC idle load is actually 18 per cent, which is damned surprising given the 20 per cent increase in interrupt overhead. Commit!

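For readers wondering how a boot-time VCOUNT measurement can tell PAL from NTSC, here is a minimal sketch. It relies on VCOUNT advancing once per two scan lines, so a 312-line PAL frame reaches higher values than a 262-line NTSC frame. The function name, the way the samples are passed in and the threshold of 140 are assumptions for illustration, not the kernel's real code.

```python
def detect_video_standard(vcount_samples):
    """Classify the machine from the largest VCOUNT value seen over a frame.

    Returns a (name, frames_per_second) pair.
    """
    max_vcount = max(vcount_samples)
    # VCOUNT advances once per two scan lines, so a PAL frame (312 lines)
    # reaches higher values than an NTSC frame (262 lines).
    if max_vcount > 140:  # assumed threshold sitting between the two ranges
        return "PAL", 50
    return "NTSC", 60
```

Because the result is cached at boot in the scenario described above, switching the emulator's video standard without rebooting leaves the cached value stale, which is what produced the spurious 32 per cent reading.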
Eyvind Bernhardsen, posted September 8, 2014

"NTSC idle load is actually 18 per cent, which is damned surprising given the 20 per cent increase in interrupt overhead."

Not that surprising. 18 is 20% more than 15, so I'd say it's exactly as expected.

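The arithmetic behind that remark, written out. The sketch assumes the idle load is dominated by work done once per frame, so that it scales linearly with the frame rate; that is an assumption made for the example, not something measured in the thread.

```python
pal_hz, ntsc_hz = 50, 60
pal_idle_load = 15.0  # per cent, as measured on PAL in the thread

# Per-frame work runs 60/50 = 1.2 times as often on NTSC.
scale = ntsc_hz / pal_hz
predicted_ntsc_idle_load = pal_idle_load * scale
```

Under that assumption the prediction comes out at about 18 per cent, which matches the figure reported after the reboot.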
MrFish, posted September 8, 2014

I know you're not a fan of the headings in this design style, Jon, but I couldn't help satisfying my curiosity about a few things. Anyway... curiosity satisfied...

flashjazzcat (Author), posted September 8, 2014

"Not that surprising. 18 is 20% more than 15, so I'd say it's exactly as expected."

Good point. The measurement is obviously accurate, at last.

Stephen, posted September 8, 2014

"I know you're not a fan of the headings in this design style Jon, but I couldn't help satisfying my curiosity about a few things. Anyway... curiosity satisfied..."

[Images: TM Graph 13b1 (In Situ, Scaled).png, TM Graph 13d1 (In Situ, Scaled).png]

I love the bottom one, it's less crowded and everything is clear and easy to read.

flashjazzcat (Author), posted September 8, 2014 (edited)

[Image: TM Graph 13d1 (In Situ, Scaled).png]

I do like the economical coalescing of information in the second example. Food for thought...

Thoughts of NTSC and PAL plagued my sleep last night. If the scheduler interrupt is going to be tied to the refresh rate (50Hz for PAL, 60Hz for NTSC), then applications which want - for example - to sleep for a specified number of system ticks will need to be PAL/NTSC aware. I guess this is no big ask, though: one location holding "jiffies per second" should suffice.

Plans for the clock are taking shape now, although I'll have to be troubled by writing a driver for the Ultimate RTC in the first instance. Ideally, the on-screen clock can be a discrete process which puts itself to sleep for a minute at a time, to be woken by the timer or by the user choosing an item from the clock's registered menu. Sure - tasks can go into idle mode after checking for conditions or maintaining their own internal timers, but a "sleep for n ticks" facility will be optimal.

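A rough sketch of how a single "jiffies per second" value could keep applications independent of the video standard. Everything here is hypothetical: `seconds_to_ticks`, `SleepQueue` and its methods are names invented for this example and are not part of the project.

```python
def seconds_to_ticks(seconds, jiffies_per_second):
    """Convert a wall-clock duration to scheduler ticks (at least one)."""
    return max(1, round(seconds * jiffies_per_second))


class SleepQueue:
    """Hypothetical 'sleep for n ticks' facility driven by the frame interrupt."""

    def __init__(self):
        self._remaining = {}  # task name -> ticks left before wake-up

    def sleep(self, task, ticks):
        self._remaining[task] = ticks

    def tick(self):
        """Call once per scheduler tick; returns the tasks that just woke."""
        woken = []
        for task in list(self._remaining):
            self._remaining[task] -= 1
            if self._remaining[task] <= 0:
                del self._remaining[task]
                woken.append(task)
        return woken
```

With this shape, a clock process would ask for `seconds_to_ticks(60, jiffies_per_second)` ticks and get 3000 on a 50Hz system or 3600 on a 60Hz one, without containing any PAL/NTSC logic of its own.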
Xuel, posted September 8, 2014

One thing that could potentially throw off CPU usage measurements based on VCOUNT is that the number of cycles available to the CPU differs between scan lines. During vertical blank, the CPU gets 105 cycles per scan line, whereas on a typical 40-byte mode F scan line, the CPU only gets 64 cycles. Maybe this averages out over many frames if the IRQ pattern isn't locked to the frame rate, but perhaps there's some economical way to account for this variation more directly with a lookup table of some kind?

flashjazzcat (Author), posted September 8, 2014

"One thing that could potentially throw off CPU usage measurements based on VCOUNT is that there are a different number of cycles available to the CPU for different scan lines. During vertical blank, the CPU gets 105 cycles per scan line whereas on a typical 40-byte mode F scanline, the CPU only gets 64 cycles. Maybe this averages out over many frames if the IRQ pattern isn't locked to the frame rate but perhaps there's some economical way to account for this variation more directly with a lookup table of some kind?"

It should average out, since although processes running outside of the DMA region might get more cycles (and therefore effectively run faster than processes occurring while the screen's being drawn), everything will eventually get a "bite" of that non-DMA CPU time. The measurement would be completely accurate if each process always used its full time-slice (since then every process would execute in both the DMA and non-DMA phases), but of course idle processes often yield early to the next task. A task which yields immediately after the NMI will appear to use up fewer CPU cycles than the next task which gets the CPU, assuming DMA has started by that time. But my feeling is that context switches will naturally tend to "drift" across the frame anyway, and thus tend to even out (albeit possibly less so than if the scheduler was deliberately staggered across frames).

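For illustration, the lookup-table idea raised in the question might look something like the sketch below. The two cycle counts are the figures quoted in the thread; the frame layout (312 lines, display starting at line 8, 192 display lines) and both function names are assumptions made for the example, and the sketch ignores any other DMA that a real display would incur.

```python
CYCLES_BLANK = 105   # per scan line outside display DMA (figure from the thread)
CYCLES_MODE_F = 64   # per 40-byte mode F scan line (figure from the thread)


def build_cycle_table(total_lines, first_display_line, display_lines):
    """One entry per scan line: CPU cycles available on that line."""
    last = first_display_line + display_lines
    return [
        CYCLES_MODE_F if first_display_line <= line < last else CYCLES_BLANK
        for line in range(total_lines)
    ]


def cycles_between(table, start_line, end_line):
    """Cycles available from start_line up to (not including) end_line.

    Wraps across the end of the frame; equal lines count as zero elapsed.
    """
    total = 0
    line = start_line
    while line != end_line:
        total += table[line]
        line = (line + 1) % len(table)
    return total
```

A scheduler using this would charge a process `cycles_between(table, line_in, line_out)` rather than a raw line count, at the cost of a table and a loop (or a prefix-sum table) per context switch. Whether that cost is worth it depends on how much the drift described above already evens things out.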
MrFish, posted September 8, 2014 (edited)

"I love the bottom one, it's less crowded and everything is clear and easy to read."

"I do like the economical coalescing of information in the second example. Food for thought..."

Alright... Here's a compacted version of that, and a title style variation, one with a black graph. The white graph styles give a slightly shorter window size, by nature of how their graph borders are formed.

[Edit: They (white graphs) also end up slightly wider horizontally -- for the same reason mentioned above -- with the same width window as used for the design with the black graphs.]

TheNameOfTheGame, posted September 8, 2014 (edited)

Nice... They look good!

flashjazzcat (Author), posted September 8, 2014

I think we're getting warm. Really the lower panels still need titles, since the number of tasks and timers is relevant not only to the CPU but to memory usage as well. I like the idea of having the totals in two columns, though, especially given the limited vertical space.

MrFish, posted September 8, 2014 (edited)

"Really the lower panels still need titles since the number of tasks and timers is relevant not only to the CPU but to memory usage as well."

Yeah, it's just a design experiment. As you say, the Process, Program, and Timer group does not directly relate to the CPU as the memory totals do. I don't think it causes too much cognitive dissonance though. I'm really just looking for a way to compact the design, though I can't help but think that two memory headings seem redundant when every section has headings. But if I add the memory group with its graph, then the other totals end up getting orphaned, and the design doesn't have as much unity.

Stephen, posted September 8, 2014

Looking great! I like the bottom one the best, but none look bad. (BTW - just stating an opinion, and giving a thumbs up for all the work - don't want to make this design by committee.)

MrFish, posted September 8, 2014 (edited)

"I think we're getting warm. I like the idea of having the totals in two columns, though, especially given the limited vertical space."

Yeah, I know this is what you want to see. Not bad, really...

[Edit: Missed something. This is better, saved two pixels as well.]

MrFish, posted September 8, 2014

"Looking great! I like the bottom one the best, but none look bad. (BTW - just stating an opinion, and giving a thumbs up for all the work - don't want to make this design by committee.)"

Thanks. I usually work through stuff like this in private. But it's good to hear opinions sometimes. If I didn't want opinions, I wouldn't post up.

flashjazzcat (Author), posted September 8, 2014 (edited)

"Yeah, I know this is what you want to see. Not bad, really... [Edit: Missed something. This is better, saved two pixels as well.]"

[Image: TM Graph 22 (In Situ, Scaled).png]

Nice. I'm working on white text on an opaque background as we speak. Actually it's a bit more far-reaching than that. The font rendering code is so similar to the bitmap rendering code that they can be coalesced (fonts are just masked bitmaps, after all). This will reduce most graphics operations - including drawing glyphs - down to pixel set/clear or blit, decreasing code size (and complexity) and making it much easier to write a different display driver. The new bitmap code has already shaved 1% off the idle load when the monitor is running.

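The "fonts are just masked bitmaps" observation can be illustrated with a small sketch. This is not the project's rendering code: rows are modelled as bit-packed Python integers, horizontal shifting is left out, and the function names are made up for the example.

```python
def blit_masked(dest_row, src_row, mask_row):
    """Take source bits where the mask is 1, keep destination bits elsewhere."""
    return (dest_row & ~mask_row) | (src_row & mask_row)


def draw_bitmap(dest, src, mask, top):
    """Blit a masked bitmap into dest (a list of rows), starting at row 'top'."""
    for i, (src_row, mask_row) in enumerate(zip(src, mask)):
        dest[top + i] = blit_masked(dest[top + i], src_row, mask_row)


def draw_glyph(dest, glyph_rows, top, set_pixels=True):
    """A glyph is a bitmap whose mask is its own shape.

    set_pixels=True sets the glyph's pixels; False clears them.
    """
    src = glyph_rows if set_pixels else [0] * len(glyph_rows)
    draw_bitmap(dest, src, glyph_rows, top)
```

Here the glyph path is a two-line wrapper around the bitmap path, which is the kind of code sharing the post describes: one blit routine serves both, so a different display driver would only need to replace that one routine.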
pixelmischief, posted September 8, 2014

"Thanks. I usually work through stuff like this in private. But it's good to hear opinions sometimes. If I didn't want opinions, I wouldn't post up."

Hmmm. Interesting. =)

pixelmischief, posted September 8, 2014

"Yeah, I know this is what you want to see. Not bad, really... [Edit: Missed something. This is better, saved two pixels as well.]"

[Image: TM Graph 22 (In Situ, Scaled).png]

I vastly prefer versions that have a blacked out area with white ticks for the graph, vs. a black-bordered, white area with black ticks.

pixelmischief, posted September 8, 2014

DebiOS? DebOS. 'Cause she be da bawss.

flashjazzcat (Author), posted September 8, 2014

"I vastly prefer versions that have a blacked out area with white ticks for the graph, vs. a black-bordered, white area with black ticks."

Me too, and I still like the 'heart monitor' graph as well. Maybe we do need 'Graph type' in the menu after all.

Note: selected tab has 1px upper border in the actual dialog.

MrFish, posted September 8, 2014

"Maybe we do need 'Graph type' in the menu after all."

Or just click on the graph itself to cycle through styles.