flashjazzcat Posted October 7, 2013 (edited)

Quote: Possible improvement: could you redraw only the edges of the slider when moving by one step (by clicking on the arrows)? (Now it looks like you are erasing the slider and then drawing it in the new place.) That would look nicer but it's more code.

Might end up kludgy but I'll see if it's possible (arrow mousedown just updates the offset, then calls scrollbar redraw). Actually thinking about it, we might as well not draw the scrollbar background under the scroll thumb at all... That would work with the new control model.

Edited October 7, 2013 by flashjazzcat
flashjazzcat Posted October 7, 2013 (edited)

Well, that looks about a thousand times better:

Sometimes we're so busy coding, we overlook obvious enhancements which can be made. Thanks Popmilo!

Note: strictly speaking, in this kind of columnar view, I realize the scrollbars should "snap" to the nearest column offset. Haven't done that yet...

Edited October 7, 2013 by flashjazzcat
w1k Posted October 7, 2013

woow, fastest how?
flashjazzcat Posted October 7, 2013 (edited)

Quote: woow, fastest how?

How is it rendered? The left portion of the scrollbar background is drawn, then the thumb, then the right portion of the background. Also, I take care not to obliterate anything with the opposite colour when rendering: the white space inside the scrollbar thumb now does not overlap the black border. This simple change reduced flickering to almost nil.

Another point which just occurred to me is that the sliders should probably be drawn ghosted anyway until the button is released, unless the scrolling is "live".

Edited October 7, 2013 by flashjazzcat
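The three-part redraw described above can be sketched in a few lines. This is Python rather than the project's 6502 assembly, and the function name, the pixel units and the half-open ranges are all assumptions made for illustration; the actual scrollbar routine is not shown in the thread.

```python
def scrollbar_segments(track_len, thumb_pos, thumb_len):
    """Split a scrollbar track into the spans drawn in order:
    left background, thumb, right background (half-open ranges).
    No span overlaps another, so no pixel is drawn twice."""
    # Keep the thumb inside the track (hypothetical clamping rule).
    thumb_pos = max(0, min(thumb_pos, track_len - thumb_len))
    thumb_end = thumb_pos + thumb_len
    segments = [
        ("background", 0, thumb_pos),
        ("thumb", thumb_pos, thumb_end),
        ("background", thumb_end, track_len),
    ]
    # Drop empty spans so nothing is drawn for a thumb at either end.
    return [s for s in segments if s[1] < s[2]]
```

Because the background is never painted underneath the thumb, there is no moment at which the thumb area holds the wrong colour, which is presumably why the flicker went away.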
popmilo Posted October 7, 2013

Quote: Well, that looks about a thousand times better: Sometimes we're so busy coding, we overlook obvious enhancements which can be made. Thanks Popmilo!

No problem! Glad to help. Come on, give us more puzzles like this one! It is nice to sit relaxed in a chair, thinking about higher concepts, letting others do the hard work, and then feeling like you solved something big.

ps. Will be really nice to have a chance to start actual work on applications for your Gui... Great work man!
flashjazzcat Posted October 7, 2013

Quote: It is nice to sit relaxed in a chair, thinking about higher concepts, letting others do the hard work, and then feeling like you solved something big. ps. Will be really nice to have a chance to start actual work on applications for your Gui... Great work man!

Well, if it wasn't for observant folk who aren't totally wrapped up in the coding side, I'd probably miss a lot of stuff. That's one good thing about posting regular progress updates, I suppose (although on the face of it, not much seems to have changed in the videos at times).

I share your enthusiasm for starting work on applications. It's gratifying to get the skeleton app in the video responding to events, but when you think how rich the functionality will eventually be... you just want it done!
popmilo Posted October 7, 2013

Quote: I share your enthusiasm for starting work on applications. It's gratifying to get the skeleton app in the video responding to events, but when you think how rich the functionality will eventually be... you just want it done!

Would like to see more small enhancements like this one... A dozen of those, and the gui would look even more awesome than it already is. I'm not sure about a dozen possible improvements, but two or three would be nice...

Will you provide source code for the gui once it is out in public? Would like to see the core text drawing routines... Those look like they are taking most of the drawing time, and I have a feeling something could be done about it...
flashjazzcat Posted October 7, 2013 (edited)

Quote: Would like to see more small enhancements like this one... A dozen of those, and the gui would look even more awesome than it already is. I'm not sure about a dozen possible improvements, but two or three would be nice...

Well, best speak now or leave the tweaks till later, since the main drive at the moment is to make real progress, rather than just prettifying the same demo for months on end.

Quote: Will you provide source code for the gui once it is out in public? Would like to see the core text drawing routines... Those look like they are taking most of the drawing time, and I have a feeling something could be done about it...

Unsure about the source code situation. I considered releasing other big projects in the past but was talked out of it... many people just download code then forget about it, for one thing. I do it myself all the time.

We optimised the text drawing to death earlier in these pages (or it seemed like it at the time), but I guess it can be looked at again when there are fewer competing priorities. My guess is that because every byte written to the screen goes through two masks (clipping mask and window mask), there's a real limit to just how much faster things can be made without throwing the baby out with the bath water.

With such limitations in mind, here's the core rendering loop for unstyled text. Feel free to suggest improvements.
charlineloop ; line loop
    ldy linecount ; inline code instead of call to get_scr for speed
    lda LineMask,y ; see if we need to render this line
    bne dorenderline
    tay
    ora PrevYMask
    bne notlastline
    jmp char_done
notlastline
    sty PrevYMask
    jmp nextline
dorenderline
    sta PrevYMask
    lda linetable,y ; need code to set up mask as well
    sta scr
    lda linetable+200,y
    sta scr+1
    lda (WindowMask),y
    tay
    lda mask_slot_table,y
    sta maskptr
    lda mask_slot_table_hi,y
    sta maskptr+1
    ldx char_byte_width
    ldy #0
    lda (ptr1),y
    tay
    lda (shiftptr),y
    tay
    and lmask
    sta chbuffer
    tya
    and rmask
    sta chbuffer+1
    dex
    beq done1
    ldy #1
    lda (ptr1),y
    tay
    lda (shiftptr),y
    tay
    and lmask
    ora chbuffer+1
    sta chbuffer+1
    tya
    and rmask
    sta chbuffer+2
    dex
    beq done1
    ldy #2
    lda (ptr1),y
    tay
    lda (shiftptr),y
    tay
    and lmask
    ora chbuffer+2
    sta chbuffer+2
    tya
    and rmask
    sta chbuffer+3
    dex
    beq done1
    ldy #3
    lda (ptr1),y
    tay
    lda (shiftptr),y
    and lmask
    ora chbuffer+3
    sta chbuffer+3
done1 ; x should already be 0
    ldy xbyte
char_render_loop
    lda ClipMask,y
;   beq SkipByte ; is this worth doing?
    and (maskptr),y
    and chbuffer,x ; and with render bits
    and greymask
    ora (scr),y
    sta (scr),y
SkipByte
    iny
    inx
    cpx byte_width
    bne char_render_loop
nextline
    lda ptr1
    clc
    adc char_byte_width
    sta ptr1
    bcc *+4
    inc ptr1+1
    lda greymask
    and #1
    cmp #1
    ror greymask
    inc linecount
    dec tmp5
    beq char_done
    jmp charlineloop
char_done ; finished render, so check if we need to underline the character

Edited October 7, 2013 by flashjazzcat
The Usotsuki Posted October 8, 2013

This makes me wonder if a 64K version of Apple II Desktop is possible. xD
flashjazzcat Posted October 8, 2013 (edited)

Although I rightfully credited popmilo and others for suggesting and describing the method for drawing and erasing the mouse pointer in an NMI interrupt, I notice that analmux suggested the very same thing in post 2 of this topic! So apologies for failing to acknowledge that when the method was eventually implemented.

Apologies also to tebe, whose graphics OBX contribution went by unacknowledged by me all that time ago. Lots of missed treasures in this thread... tebe, if you have sources for your demo, I'd be most interested to have a look.

Edited October 8, 2013 by flashjazzcat
popmilo Posted October 8, 2013

Date of mentioned post: "Dec 6, 2009". In a couple of months it will be 4 years. Don't we just love these projects that get developed over such long time periods, and during all that time the hardware doesn't change?

ps. Thanks for that code snippet! Printed it out, started reading it during work.
flashjazzcat Posted October 8, 2013

Quote: Date of mentioned post: "Dec 6, 2009". In a couple of months it will be 4 years. Don't we just love these projects that get developed over such long time periods, and during all that time the hardware doesn't change? ps. Thanks for that code snippet! Printed it out, started reading it during work.

Yeah... fortunately I didn't really start writing anything in earnest until a year later, so we're coming up to three years.

BTW: sorry for lack of comments in code, but I'm certain you'll be able to intuit what's going on.
Irgendwer Posted October 8, 2013 (edited)

Quote: With such limitations in mind, here's the core rendering loop for unstyled text. Feel free to suggest improvements.

...
    lda greymask
    and #1
    cmp #1
    ror greymask
...

    lda greymask
    eor #255
    sta greymask

??? (had only a quick look, and may not understand what you are trying to do here...)

Edit:

done1 ; x should already be 0
    ldy xbyte
char_render_loop
    lda ClipMask,y
;   beq SkipByte ; is this worth doing?
    and (maskptr),y
    and chbuffer,x ; and with render bits
    and greymask
    ora (scr),y
    sta (scr),y
SkipByte
    iny
    inx
    cpx byte_width
    bne char_render_loop

Try to turn the horizontal direction around to get rid of the CPX byte_width. It seems the count is always positive, so

    dex
    bpl char_render_loop

should work instead...

Edited October 8, 2013 by Irgendwer
popmilo Posted October 9, 2013

Quote:
    lda greymask
    and #1
    cmp #1
    ror greymask

As I understood it, it's just supposed to rotate greymask, so a simple lsr should work:

    lda greymask
    lsr
    ror greymask
popmilo Posted October 9, 2013

Quote:
    lda greymask
    eor #255
    sta greymask

Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.
flashjazzcat Posted October 9, 2013

Quote: Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.

Yes he is. However, there has to be some archaic reason why I did it this way, since I use EOR #$FF to do the same thing on the dithered desktop background. Who knows... anyway, it can be changed.

Using X to count down to zero (branching when positive) is a good idea, although the contents of chbuffer would have to be stored in reverse order, so perhaps this would bloat the set-up routine.
flashjazzcat Posted October 9, 2013

Quote: Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.

And sure enough, it turns out that I didn't use EOR #$FF because it corrupts greymask when it's solid, resulting in skipping alternate scanlines of non-greyed text (since all output is ANDed with greymask). The shifting method used leaves a greymask of $FF intact.
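To make the three greymask variants discussed above easy to compare, here is a small Python model of each instruction sequence. It is only a sketch of the 8-bit arithmetic (carry handling included), with function names invented for this illustration; it is not code from the GUI.

```python
def rotate_and_cmp(mask):
    """Model of: lda greymask / and #1 / cmp #1 / ror greymask."""
    a = mask & 1
    carry = 1 if a >= 1 else 0          # CMP #1 sets carry when A >= 1
    return (carry << 7) | (mask >> 1)   # ROR moves carry into bit 7


def rotate_lsr(mask):
    """Model of: lda greymask / lsr / ror greymask."""
    carry = mask & 1                    # LSR A moves bit 0 into carry
    return (carry << 7) | (mask >> 1)


def invert(mask):
    """Model of: lda greymask / eor #255 / sta greymask."""
    return mask ^ 0xFF


# Both rotating variants agree for every byte value.
assert all(rotate_and_cmp(m) == rotate_lsr(m) for m in range(256))
```

With a chequerboard mask, rotation and inversion give the same result ($AA becomes $55 and back again). With a solid mask they differ: rotating $FF leaves $FF, while inverting it gives $00 and blanks every other scanline.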
popmilo Posted October 9, 2013

Quote: As I understood it, it's just supposed to rotate greymask, so a simple lsr should work:
    lda greymask
    lsr
    ror greymask

Me loves puzzles. lsr is already faster and shorter than:

    and #1
    cmp #1

But, can this be any faster? I knew there was another way. As greymask is alternating every odd-even line, you could make a table 200 bytes high and then use the y register as a counter at the beginning of the loop:

    lda GreyMaskTable,y
    sta greymask

It is 200 bytes, but you could even design masks that wouldn't be just a simple xxxx pattern...

ps. Bear with me, I had more free time than usual on the job today.
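The table idea can be modelled like this. It is a Python sketch, not the 6502 table itself; the choice of $AA on even lines and $55 on odd lines is an assumption, as is the helper name.

```python
SCREEN_LINES = 200  # one table entry per scanline, as suggested above


def build_grey_mask_table(even=0xAA, odd=0x55):
    """Return a 200-byte table indexed by scanline number.
    Which phase lands on even lines is an assumption."""
    return bytes(even if y % 2 == 0 else odd for y in range(SCREEN_LINES))


GREY = build_grey_mask_table()              # chequerboard for greyed text
SOLID = build_grey_mask_table(0xFF, 0xFF)   # hypothetical table for normal text
```

One consequence worth noting: a single chequerboard table only covers greyed text. Normal text would need either a second, all-$FF table (as in `SOLID` above) or a branch around the lookup, whereas arbitrary per-line patterns come for free.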
flashjazzcat Posted October 9, 2013 (edited)

Quote: Me loves puzzles. lsr is already faster and shorter than:
    and #1
    cmp #1
But, can this be any faster? I knew there was another way. As greymask is alternating every odd-even line, you could make a table 200 bytes high and then use the y register as a counter at the beginning of the loop:
    lda GreyMaskTable,y
    sta greymask
It is 200 bytes, but you could even design masks that wouldn't be just a simple xxxx pattern... ps. Bear with me, I had more free time than usual on the job today.

The point I was trying to make in post #2217 is that I use AND #1 / CMP #1 because it has absolutely no effect at all on a mask with all bits set:

    lda greymask
    and #1
    cmp #1
    ror greymask

The above operation performed when greymask is $FF yields $FF. I did it this way so that I can set greymask to $AA (greyed) or $FF (normal) right at the top of the render routine, and then it just takes care of itself without any tables or conditional code. Of course we use tables elsewhere, for stuff like dithered scrollbar backgrounds in just the way you describe, but for greyed text we'll never need anything more complex than a simple chequerboard pattern. I must have considered this transformation a puzzle in itself at the time, and was quite pleased with the solution.
Anyway, I hurled some more code at the render loop in general, to fit those occasions when a) a character is 8 or fewer pixels wide and will be shifted across two bytes, and b) when it is 8 or fewer pixels wide and does not cross a byte boundary on the screen:

    ldx char_byte_width
    cpx #2
    bcs Wide
    lda byte_width
    cmp #2
    bcs Wide2
    ldy #0
    lda (ptr1),y
    tay
    lda (shiftptr),y
    and lmask
    ldy xbyte
    and (maskptr),y
    and ClipMask,y
    and greymask
    ora (scr),y
    sta (scr),y
SkipByte3
    jmp nextline
Wide2
    ldy #0
    lda (ptr1),y
    tay
    lda (shiftptr),y
    tay
    and rmask
    tax
    tya
    and lmask
    ldy xbyte
    and (maskptr),y
    and clipmask,y
    beq @+
    and greymask
    ora (scr),y
    sta (scr),y
@
    iny
    txa
    and (maskptr),y
    and clipmask,y
    beq @+
    and greymask
    ora (scr),y
    sta (scr),y
@
    jmp nextline
Wide
    [original variable-width char render routine]

Haven't calculated the cycle savings (and I can't notice any discernible improvement in speed), but it must be a bit faster. If we hit the wall here or hereabouts, I don't mind.

Edited October 9, 2013 by flashjazzcat
Irgendwer Posted October 9, 2013

Quote: And sure enough, it turns out that I didn't use EOR #$FF because it corrupts greymask when it's solid, resulting in skipping alternate scanlines of non-greyed text (since all output is ANDed with greymask). The shifting method used leaves a greymask of $FF intact.

Yes, just after reading your previous post I thought about this aspect too. Sorry for the noise. At least popmilo's 'lsr' does the same job and is faster.

Regarding the chbuffer order: yes, this may affect the font format, but it may be worth it. (In 'seitensprung' I also changed it three times to get more speed. The latest version, unpublished yet, is about 5% faster.)

Just a guess, but do you make use of a 2K table for 'shiftptr' data, and are your character images byte-aligned?

I find your code quite interesting, as mine in 'seitensprung' is totally different (it works pixel-wise). Thank you for the peek!
flashjazzcat Posted October 9, 2013 (edited)

I suddenly see why popmilo's code works... D'oh! I just didn't get it before. Apologies!

Yes: char data is byte-aligned. The 2K shifting table is definitely quick, but still takes some setting up.

Edited October 9, 2013 by flashjazzcat
popmilo Posted October 10, 2013

Glad to see our code is helping at least a little. Sorry if this seems like digging into smaller details while you should be thinking about more important stuff ("API! Khhmmm...").

Couple of questions:

1. What are the general requirements for the text drawing routine?

2. Is this correct:
char_byte_width - width of character in bytes (Do we have characters wider than 8 pixels?)
ptr1 - character data
shiftptr - shift tables
lmask - ?
xbyte - x coordinate of byte in line
ClipMask - beginning address of current line in clipmask
maskptr - ?
greymask - mask byte for current line
scr - beginning address of current line on screen

In the last shown code there is no chbuffer. Does that mean you threw it out, or is it still used?

ps. Good morning to all.
flashjazzcat Posted October 10, 2013 (edited)

Quote: Glad to see our code is helping at least a little. Sorry if this seems like digging into smaller details while you should be thinking about more important stuff ("API! Khhmmm..."). Couple of questions:
1. What are the general requirements for the text drawing routine?
2. Is this correct:
char_byte_width - width of character in bytes (Do we have characters wider than 8 pixels?)
ptr1 - character data
shiftptr - shift tables
lmask - ?
xbyte - x coordinate of byte in line
ClipMask - beginning address of current line in clipmask
maskptr - ?
greymask - mask byte for current line
scr - beginning address of current line on screen
In the last shown code there is no chbuffer. Does that mean you threw it out, or is it still used? ps. Good morning to all.

Good Morning! Heh... I appreciate it, and I'm sorry for just not grasping where you were coming from before - especially since the concept was so blindingly obvious.

1. Prior to rendering a character, various page zero pointers should be set up, and this is done by the SetFont routine, which is called with the ID of a font already in memory (this will be refined somewhat later, so we can cache and pull fonts from disk when required). All rendering is therefore performed with that font until SetFont is called again with a different ID.

There are also a number of styling flags which can be set, including the "outline" flag. More complex styling options (such as outline text) cause a branch to a more complex rendering routine. So, the less styling applied, generally the faster the rendering (outline is the slowest, since it shifts the character on four axes to achieve the effect).

All text rendering routines (indeed all rendering routines) call SetUpX first (specifying X, Y, Width and Height), and this takes care of clipping, masking, and sets up lmask and rmask, etc.
If SetUpX exits with carry set, the object is clipped out of the visible region and the rendering routine should simply abort at that point. Characters of up to 4 bytes (32 pixels) wide are currently supported, which should be ample.

2. char_byte_width is indeed the width of the character in bytes. This is obtained by taking the pixel width (stored in the font data), and applying it against a LUT. The same LUT is used to obtain byte_width, which is the rendering extent in bytes, including the pixel (bit) offset into the leftmost byte.

ptr1 = character data

shiftptr = index into shift table (only the MSB is manipulated, since each table is 256 bytes long)

lmask = mask applied to shifted bits obtained via shiftptr - basically masks out the most significant bits. rmask is its complement, and is used to extract the most significant bits, which become the high bits in the next byte along.

xbyte = x coordinate of byte in line

ClipMask = complete X clipping mask (40 bytes). The mask is 256 bytes long, but bytes 40-255 are "off-screen".

maskptr = hardly know how to explain this one: I'll draw a diagram later. Each window has a 200-element list of codes, each referring to a line of data in the global window mask. The lists effectively form RLE-compressed masks, unique to each window, but referring to shared masks in the global resource. Each window, therefore, effectively has a complete mask describing its (arbitrarily shaped) visible area, taking into account any foreground windows obscuring it. Window contents are always drawn through the window mask and clip mask.

The window masks are effectively "regions", as they were known in the Classic Mac OS, and they can be merged and compared (indeed, this is how they are generated). I must stress that any conceptual similarity to Mac regions is coincidental: I only discovered and read about regions after the window masks had been implemented, which I found rather serendipitous.
Instead of each window's mask being an 8KB bitmap, all the masks for all the windows (assuming a maximum of maybe eight windows) fit quite happily into a 2KB shared buffer (fifty unique 40-byte masks), owing to the compression used. Thus a background window can be re-rendered, even if foreground objects had rounded corners (or drop-shadows, as is currently the case).

greymask = "greying-out" mask for current line, although if set to $FF, it remains opaque.

scr = address of current line on screen

There's no chbuffer in the new code, since for character data of only 1 byte, it's convenient to skip the buffer generation stage and hold the character bits in other locations. For wide characters, however, it appeared more expedient to build the buffer first, rather than wrestle with lots of indirect access using different offsets in the Y register, and for those characters, the original routine is branched to.

Edited October 10, 2013 by flashjazzcat
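The variables explained above can be tied together in a small Python model of one scanline of the render loop. This is a sketch under stated assumptions, not the GUI's code: it assumes the shift table holds each byte rotated right by the pixel offset, that glyph data is left-aligned in its bytes, and that the width LUT is a simple round-up to whole bytes. The function names are invented for the illustration.

```python
def make_shift_table(offset):
    """256-entry table: every byte rotated right by `offset` bits (assumed layout)."""
    return bytes(((b >> offset) | (b << (8 - offset))) & 0xFF for b in range(256))


def bytes_needed(pixels):
    """Assumed form of the width LUT: pixel count -> byte count."""
    return (pixels + 7) // 8


def render_line(screen, xbyte, offset, glyph_row, pixel_width,
                clip_mask, window_mask, greymask=0xFF):
    """OR one row of a glyph into `screen`, through all three masks."""
    shift = make_shift_table(offset)
    lmask = 0xFF >> offset       # bits that stay in the current byte
    rmask = lmask ^ 0xFF         # wrapped bits that belong to the next byte
    byte_width = bytes_needed(offset + pixel_width)

    # Build chbuffer as the unrolled set-up code does.
    chbuffer = [0] * (len(glyph_row) + 1)
    for i, b in enumerate(glyph_row):
        rotated = shift[b]
        chbuffer[i] |= rotated & lmask
        chbuffer[i + 1] = rotated & rmask

    # Inner loop: clip mask AND window mask AND glyph bits AND greymask.
    for x in range(byte_width):
        y = xbyte + x
        screen[y] |= clip_mask[y] & window_mask[y] & chbuffer[x] & greymask
```

For example, an 8-pixel solid row drawn at byte 5 with a 3-bit offset lands as $1F in byte 5 and $E0 in byte 6; zeroing byte 6 of either mask suppresses the right-hand part without touching the left.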
flashjazzcat Posted October 10, 2013 (edited)

Here's a depiction of the window masks:

In the first figure, Window A's window list is entirely opaque (i.e. no masked areas), since it's the front window. So, it would comprise 200 repetitions of index "0". The desktop's window list, meanwhile, consists of:

75 * index 0
100 * index 1
25 * index 0

The desktop's window list is basically a screenful of binary 1s with a window-shaped hole stamped on it. It shares the global mask "0" with Window A, and introduces a second mask: mask 1.

In the second figure, we've added another window behind the first. Window A's list is still 200 repetitions of index 0, and Window B's list is identical to what the desktop list used to be:

75 * index 0
100 * index 1
25 * index 0

The desktop's new mask list, meanwhile, has changed:

40 * index 0
35 * index 2
50 * index 3
50 * index 1
25 * index 0

Masks 2 and 3 were created in the global mask resource when Window B's silhouette was superimposed on Window A's shape. The routines which merge masks are smart enough to re-use existing mask entries in the RLE run, rather than creating redundant duplicates of existing ones.

The mask lists are recalculated only when a window moves or is closed, or when the window order changes. There's remarkably little overhead to their computation: repeated indices are simply added to a window's list until a change in the underlying intersected masks is detected. The indices in a window's mask list refer directly to interleaved (LSB/MSB) entries in a look-up table of mask addresses. Mask buffer space is dynamically allocated when a new mask pattern is required during each window's RLE run.

For the purposes of this explanation, I've omitted the drop-shadows on the windows, which introduce two extra mask patterns per window (i.e. the top and bottom borders).
Rounded corners would be easy to do using this technique, although the mask storage requirements naturally increase with very complex shapes.

Edited October 10, 2013 by flashjazzcat
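The run lists quoted above can be reproduced with a short Python model of the shared mask pool. Everything concrete here is an assumption chosen to match the figures' numbers: the window rectangles, their byte-aligned edges, the class and function names, and the brute-force per-line computation (the real routines merge existing masks rather than rebuilding them).

```python
SCREEN_LINES = 200
BYTES_PER_LINE = 40


class MaskPool:
    """Shared store of unique 40-byte line masks (hypothetical name)."""

    def __init__(self):
        self.masks = []

    def intern(self, mask):
        # Re-use an existing entry rather than storing a duplicate.
        if mask in self.masks:
            return self.masks.index(mask)
        self.masks.append(mask)
        return len(self.masks) - 1


def line_mask(y, windows_in_front):
    """Visible bytes on line y: $FF everywhere except under windows in front."""
    row = bytearray([0xFF] * BYTES_PER_LINE)
    for top, bottom, left, right in windows_in_front:
        if top <= y < bottom:
            for x in range(left, right):
                row[x] = 0x00
    return bytes(row)


def mask_list(pool, windows_in_front):
    """One pool index per scanline, as in each window's 200-element list."""
    return [pool.intern(line_mask(y, windows_in_front))
            for y in range(SCREEN_LINES)]


def runs(indices):
    """Collapse a list of indices into (count, index) runs for display."""
    out = []
    for index in indices:
        if out and out[-1][1] == index:
            out[-1] = (out[-1][0] + 1, index)
        else:
            out.append((1, index))
    return out


pool = MaskPool()
window_a = (75, 175, 10, 30)   # (top, bottom, left byte, right byte), assumed
window_b = (40, 125, 5, 20)    # assumed position behind window A

list_a = mask_list(pool, [])                      # front window: unobscured
list_b = mask_list(pool, [window_a])              # B is obscured only by A
desktop = mask_list(pool, [window_a, window_b])   # desktop is behind both
```

With these rectangles, `runs(desktop)` comes out as 40 of index 0, 35 of index 2, 50 of index 3, 50 of index 1 and 25 of index 0, and the pool ends up holding just four 40-byte masks for the whole screen.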
popmilo Posted October 10, 2013

Just as I thought - simple. Seriously - those mask lists are one of the cooler techniques that I saw recently. Great work.

On the char drawing side, it seems like any kind of complex optimization would be ill-advised before the rest of the gui is developed. A couple of things I would think about are:

1. 'Somehow' optimize that 1-byte-wide character drawing.
2. When scrolling window contents, try not to redraw the entire window. Copy the parts that remain the same.
3. Use a different routine when drawing text in the top window. One that would ignore masking in the largest part of the window.

ps. Just thinking out loud - too tired to recommend anything concrete.