New GUI for the Atari 8-bit

flashjazzcat · October 7, 2013

Possible improvement: could you redraw only the edges of the slider when it moves by one step (by clicking on the arrows)?

(Now it looks like you are erasing the slider and then drawing it in the new place).

That would look nicer but it's more code. Might end up kludgy but I'll see if it's possible (arrow mousedown just updates offset then calls scrollbar redraw).

Actually thinking about it, we might as well not draw the scrollbar background under the scroll thumb at all... That would work with the new control model.

Edited October 7, 2013 by flashjazzcat

flashjazzcat · October 7, 2013

Well, that looks about a thousand times better:

Sometimes we're so busy coding, we overlook obvious enhancements which can be made. Thanks Popmilo!

Note: strictly speaking, in this kind of columnar view, I realize the scrollbars should "snap" to the nearest column offset. Haven't done that yet...

Edited October 7, 2013 by flashjazzcat

w1k · October 7, 2013

woow, fastest

how?

flashjazzcat · October 7, 2013

woow, fastest

how?

How is it rendered? The left portion of the scrollbar background is drawn, then the thumb, then the right portion of the background. Also, I take care not to obliterate anything with the opposite colour when rendering: The white space inside the scrollbar thumb now does not overlap the black border. This simple change reduced flickering to almost nil.
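To make the ordering concrete, here's a rough Python sketch of the idea (not the actual 6502 code): the track is split into three non-overlapping runs, so no pixel is ever written twice in one redraw. The function name and the (kind, start, length) tuples are made up for illustration.

```python
def scrollbar_segments(track_len, thumb_pos, thumb_len):
    """Split a scrollbar track into non-overlapping (kind, start, length) runs.

    Hypothetical helper: background left of the thumb, the thumb itself,
    then background right of the thumb. Empty runs are dropped.
    """
    if thumb_pos < 0 or thumb_len < 0 or thumb_pos + thumb_len > track_len:
        raise ValueError("thumb must lie inside the track")
    runs = [
        ("background", 0, thumb_pos),
        ("thumb", thumb_pos, thumb_len),
        ("background", thumb_pos + thumb_len,
         track_len - thumb_pos - thumb_len),
    ]
    return [run for run in runs if run[2] > 0]


if __name__ == "__main__":
    for kind, start, length in scrollbar_segments(100, 20, 30):
        print(kind, start, length)
```

Because the runs tile the track exactly, nothing is erased and then redrawn, which is where the flicker came from.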

Another point which just occurred to me is that the sliders should probably be drawn ghosted anyway until the button is released unless the scrolling is "live".

Edited October 7, 2013 by flashjazzcat

popmilo · October 7, 2013

Well, that looks about a thousand times better:

Sometimes we're so busy coding, we overlook obvious enhancements which can be made. Thanks Popmilo!

No problem! Glad to help.

Come on, give us more puzzles like this one!

It is nice to sit relaxed in a chair, thinking about higher concepts, letting others do the hard work, and then feeling like you solved something big

ps. Will be really nice to have a chance to start actual work on applications for your Gui... Great work man!

flashjazzcat · October 7, 2013

It is nice to sit relaxed in a chair, thinking about higher concepts, letting others do the hard work, and then feeling like you solved something big

ps. Will be really nice to have a chance to start actual work on applications for your Gui... Great work man!

Well, if it wasn't for observant folk who aren't totally wrapped up in the coding side, I'd probably miss a lot of stuff. That's one good thing about posting regular progress updates, I suppose (although on the face of it, not much seems to have changed in the videos at times).

I share your enthusiasm for starting work on applications. It's gratifying to get the skeleton app in the video responding to events, but when you think how rich the functionality will eventually be... you just want it done!

popmilo · October 7, 2013

I share your enthusiasm for starting work on applications. It's gratifying to get the skeleton app in the video responding to events, but when you think how rich the functionality will eventually be... you just want it done!

Would like to see more small enhancements like this one... Dozen of those, and gui would look even more awesome than it already is

I'm not sure about a dozen possible improvements, but two or three would be nice...

Will you provide source code for gui once it is out in public ?

Would like to see core text drawing routines... Those look like they are taking most of drawing time, and I have a feeling something could be done about it ...

flashjazzcat · October 7, 2013

Would like to see more small enhancements like this one... Dozen of those, and gui would look even more awesome than it already is

I'm not sure about a dozen possible improvements, but two or three would be nice...

Well, best speak now or leave the tweaks till later, since the main drive at the moment is to make real progress, rather than just prettifying the same demo for months on end.

Will you provide source code for gui once it is out in public ?

Would like to see core text drawing routines... Those look like they are taking most of drawing time, and I have a feeling something could be done about it ...

Unsure about the source code situation. I considered releasing other big projects in the past but was talked out of it... many people just download code then forget about it, for one thing. I do it myself all the time.

We optimised the text drawing to death earlier in these pages (or it seemed like it at the time), but I guess it can be looked at again when there are fewer competing priorities. My guess is that because every byte written to the screen goes through two masks (clipping mask and window mask), there's a real limit to just how much faster things can be made without throwing the baby out with the bath water.

With such limitations in mind, here's the core rendering loop for unstyled text. Feel free to suggest improvements.

charlineloop ; line loop
	ldy linecount ; inline code instead of call to get_scr for speed
	lda LineMask,y ; see if we need to render this line
	bne dorenderline
	tay
	ora PrevYMask
	bne notlastline
	jmp char_done
notlastline
	sty PrevYMask
	jmp nextline
	
dorenderline
	sta PrevYMask
	lda linetable,y ; need code to set up mask as well
	sta scr
	lda linetable+200,y
	sta scr+1
	lda (WindowMask),y
	tay
	lda mask_slot_table,y
	sta maskptr
	lda mask_slot_table_hi,y
	sta maskptr+1

	ldx char_byte_width

	ldy #0
	lda (ptr1),y
	tay
	lda (shiftptr),y
	tay
	and lmask
	sta chbuffer
	tya
	and rmask
	sta chbuffer+1
	dex
	beq done1
	
	ldy #1
	lda (ptr1),y
	tay
	lda (shiftptr),y
	tay
	and lmask
	ora chbuffer+1
	sta chbuffer+1
	tya
	and rmask
	sta chbuffer+2
	dex
	beq done1
	
	ldy #2
	lda (ptr1),y
	tay
	lda (shiftptr),y
	tay
	and lmask
	ora chbuffer+2
	sta chbuffer+2
	tya
	and rmask
	sta chbuffer+3
	dex
	beq done1
	
	ldy #3
	lda (ptr1),y
	tay
	lda (shiftptr),y
	and lmask
	ora chbuffer+3
	sta chbuffer+3
	dex ; 4-byte characters would otherwise reach done1 with X = 1

done1 ; x should already be 0
	ldy xbyte
char_render_loop
	lda ClipMask,y
;	beq SkipByte ; is this worth doing?
	and (maskptr),y
	and chbuffer,x ; and with render bits
	and greymask
	ora (scr),y
	sta (scr),y
SkipByte
	iny
	inx
	cpx byte_width
	bne char_render_loop

nextline
	lda ptr1
	clc
	adc char_byte_width
	sta ptr1
	bcc *+4
	inc ptr1+1
	
	lda greymask
	and #1
	cmp #1
	ror greymask
	inc linecount
	dec tmp5
	beq char_done
	jmp charlineloop

char_done ; finished render, so check if we need to underline the character
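For anyone who'd rather read the inner loop at a higher level first, here is a rough Python model of what char_render_loop does to one scanline. It is only a sketch: the function name and the list-based buffers are invented for illustration, and it assumes the masks are indexed by screen byte column exactly as ClipMask,y and (maskptr),y are above.

```python
def blit_char_row(screen_row, xbyte, chbuffer, clip_mask, window_mask,
                  greymask=0xFF):
    """Model of char_render_loop for a single scanline.

    Each shifted character byte is ANDed with the clip mask, the window
    mask and the grey mask, and the surviving bits are ORed into the
    screen. Bits masked out leave the screen byte untouched.
    """
    for i, bits in enumerate(chbuffer):
        col = xbyte + i                      # Y register in the 6502 code
        visible = clip_mask[col] & window_mask[col] & bits & greymask
        screen_row[col] |= visible
    return screen_row


if __name__ == "__main__":
    row = [0x00] * 40
    clip = [0xFF] * 40
    clip[5] = 0x0F                           # right half of column 5 only
    window = [0xFF] * 40
    blit_char_row(row, 4, [0xFF, 0xFF], clip, window)
    print([hex(b) for b in row[4:6]])
```

The two mask lookups per byte are the cost mentioned earlier: they can't be dropped without giving up arbitrary clipping or overlapping windows.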

Edited October 7, 2013 by flashjazzcat

The Usotsuki · October 8, 2013

This makes me wonder if a 64K version of Apple II Desktop is possible. xD

flashjazzcat · October 8, 2013

Although I rightfully credited popmilo and others for suggesting and describing the method for drawing and erasing the mouse pointer in an NMI interrupt, I notice that analmux suggested the very same thing in post 2 of this topic! So apologies for failing to acknowledge that when the method was eventually implemented. Apologies also to tebe, whose graphics OBX contribution went by unacknowledged by me all that time ago. Lots of missed treasures in this thread... tebe, if you have sources for your demo, I'd be most interested to have a look.

Edited October 8, 2013 by flashjazzcat

popmilo · October 8, 2013

Date of mentioned post: "Dec 6, 2009", in couple of months it will be 4 years

Don't we just love these projects that get developed in such a long time periods and during all that time the hardware doesn't change ?

ps. Thanks for that code snippet ! Printed it out, started reading it during work

flashjazzcat · October 8, 2013

Date of mentioned post: "Dec 6, 2009", in couple of months it will be 4 years

Don't we just love these projects that get developed in such a long time periods and during all that time the hardware doesn't change ?

ps. Thanks for that code snippet ! Printed it out, started reading it during work

Yeah... fortunately I didn't really start writing anything in earnest until a year later, so we're coming up to three years.

BTW: sorry for lack of comments in code, but I'm certain you'll be able to intuit what's going on.

Irgendwer · October 8, 2013

With such limitations in mind, here's the core rendering loop for unstyled text. Feel free to suggest improvements.
...
	lda greymask
	and #1
	cmp #1
	ror greymask
...

lda greymask

eor #255

sta greymask

??? (had only a quick look, and may not understand what you try to do here.. )

Edit:

done1 ; x should already be 0
ldy xbyte

char_render_loop

lda ClipMask,y

; beq SkipByte ; is this worth doing?

and (maskptr),y

and chbuffer,x ; and with render bits

and greymask

ora (scr),y

sta (scr),y

SkipByte

iny

inx

cpx byte_width

bne char_render_loop

Try to turn the horizontal direction to get rid of the CPX byte_width

(seems it is always positive so

dex

bpl char_render_loop

should work instead...)

Edited October 8, 2013 by Irgendwer

popmilo · October 9, 2013

	
	lda greymask
	and #1
	cmp #1
	ror greymask

As I understand it, it's just supposed to rotate greymask, so a simple lsr should work:

	
	lda greymask
	lsr
	ror greymask

popmilo · October 9, 2013

lda greymask

eor #255

sta greymask

Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.

flashjazzcat · October 9, 2013

Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.

Yes he is. However, there has to be some archaic reason why I did it this way, since I use EOR #$FF to do the same thing on the dithered desktop background. Who knows... anyway, it can be changed.

Using X to count down to zero (branching when positive) is a good idea, although the contents of chbuffer would have to be stored in reverse order, so perhaps this would bloat the set-up routine.

flashjazzcat · October 9, 2013

Irgendwer is right, this does the same thing as 'rotate' if greymask is '10101010'.

And sure enough, it turns out that I didn't use EOR #$FF because it corrupts greymask when it's solid, resulting in skipping alternate scanlines of non-greyed text (since all output is ANDed with greymask). The shifting method used leaves a greymask of $FF intact.

popmilo · October 9, 2013

As I understand it, it's just supposed to rotate greymask, so a simple lsr should work:
	
	lda greymask
	lsr
	ror greymask

Me loves puzzles

lsr is already faster and shorter than:

and #1

cmp #1

But, can this be any faster ?

I knew there was another way

As greymask is alternating every odd-even line, you could make a table 200 bytes high and then use y register as counter at the beginning of the loop:

	
lda GreyMaskTable,y
sta greymask

It is 200 bytes, but you could even design masks that wouldn't be just simple xxxx pattern...

ps. Bear with me, I had more free time than usual on job today

flashjazzcat · October 9, 2013

Me loves puzzles

lsr is already faster and shorter than:

and #1

cmp #1

But, can this be any faster ?

I knew there was another way

As greymask is alternating every odd-even line, you could make a table 200 bytes high and then use y register as counter at the beginning of the loop:
	
lda GreyMaskTable,y
sta greymask
It is 200 bytes, but you could even design masks that wouldn't be just simple xxxx pattern...

ps. Bear with me, I had more free time than usual on job today

Point I was trying to make in post #2217 is that I use AND #1 / CMP #1 because it has absolutely no effect at all on a mask with all bits set:

lda greymask
and #1
cmp #1
ror greymask

The above operation performed when greymask is $FF yields $FF. I did it this way so that I can set greymask to $AA (greyed) or $FF (normal) right at the top of the render routine, and then it just takes care of itself without any tables or conditional code. Of course we use tables elsewhere, for stuff like dithered scrollbar backgrounds in just the way you describe, but for greyed text we'll never need anything more complex than a simple chequerboard pattern.

I must have considered this transformation a puzzle in itself at the time, and was quite pleased with the solution.
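For the record, the difference between the two approaches is easy to check with a small Python model of the 8-bit operations (a sketch only; the function names are invented here). AND #1 / CMP #1 sets carry exactly when bit 0 of greymask is set, and ROR then feeds that carry into bit 7, so the whole thing is a plain 8-bit rotate right. LDA greymask / LSR puts the same bit into carry, which is why the shorter version behaves identically.

```python
def rotate_greymask(mask):
    """Model of AND #1 / CMP #1 / ROR greymask (or LDA / LSR / ROR)."""
    carry = mask & 1                  # bit 0 ends up in the carry flag
    return (carry << 7) | (mask >> 1)  # ROR: carry enters at bit 7


def invert_greymask(mask):
    """Model of LDA greymask / EOR #$FF / STA greymask."""
    return mask ^ 0xFF


if __name__ == "__main__":
    for start in (0xAA, 0xFF):
        print(f"${start:02X}: rotate -> ${rotate_greymask(start):02X}, "
              f"invert -> ${invert_greymask(start):02X}")
```

Both methods flip $AA to $55 and back, but only the rotate leaves $FF alone; the inversion turns it into $00, which would blank every other scanline of normal text.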

Anyway, I hurled some more code at the render loop in general, to fit those occasions when a) a character is 8 or fewer pixels wide and will be shifted across two bytes, and b) when it is 8 or fewer pixels wide and does not cross a byte boundary on the screen:

	ldx char_byte_width
	cpx #2
	bcs Wide

	lda byte_width
	cmp #2
	bcs Wide2
	
	ldy #0
	lda (ptr1),y
	tay
	lda (shiftptr),y
	and lmask
	ldy xbyte
	and (maskptr),y
	and ClipMask,y
	and greymask
	ora (scr),y
	sta (scr),y
SkipByte3
	jmp nextline


Wide2
	ldy #0
	lda (ptr1),y
	tay
	lda (shiftptr),y
	tay
	and rmask
	tax
	tya
	and lmask
	ldy xbyte
	and (maskptr),y
	and clipmask,y
	beq @+
	and greymask
	ora (scr),y
	sta (scr),y
@
	iny
	txa
	and (maskptr),y
	and clipmask,y
	beq @+
	and greymask
	ora (scr),y
	sta (scr),y
@
	jmp nextline
	
Wide

[original variable-width char render routine]

Haven't calculated the cycle savings (and I can't notice any discernible improvement in speed), but it must be a bit faster. If we hit the wall here or hereabouts, I don't mind.

Edited October 9, 2013 by flashjazzcat

Irgendwer · October 9, 2013

And sure enough, it turns out that I didn't use EOR #$FF because it corrupts greymask when it's solid, resulting in skipping alternate scanlines of non-greyed text (since all output is ANDed with greymask). The shifting method used leaves a greymask of $FF intact.

Yes, just after reading your previous post I thought about this aspect too. Sorry for the noise. At least popmilo's 'lsr' does the same job and is faster.

Regarding the chbuffer order: Yes, this may affect the font format, but may be worth it. (In 'seitensprung' I changed it also three times to get more speed. Latest version (unpublished yet) is about 5% faster.)

Just a guess, but do you make use of a 2k table for 'shiftptr' data, and your character images are byte aligned?

I find your code quite interesting, as mine in 'seitensprung' is totally different (works pixel-wise).

Thank you for the peek!

flashjazzcat · October 9, 2013

I suddenly see why popmilo's code works... D'oh! I just didn't get it before. Apologies!

Yes: char data is byte-aligned. 2K shifting table is definitely quick, but still takes some setting up.

Edited October 9, 2013 by flashjazzcat

popmilo · October 10, 2013

Glad to see our code is helping at least a little

Sorry if this seems like digging into smaller details while you should be thinking about more important stuff ("API! Khhmmm..." )

Couple of questions:

1. What are general requirements for text drawing routine ?

2. Is this correct:

char_byte_width - width of character in bytes (Do we have characters wider than 8 pixels ?)

ptr1 - character data

shiftptr - shift tables

lmask - ?

xbyte - x coordinate of byte in line

ClipMask - beginning address of current line in clipmask

maskptr - ?

greymask - mask byte for current line

scr - beginning address of current line on screen

In last shown code there is no chbuffer. Does it mean you threw it out or is it still used ?

ps. Good morning to all

flashjazzcat · October 10, 2013

Glad to see our code is helping at least a little

Sorry if this seems like digging into smaller details while you should be thinking about more important stuff ("API! Khhmmm..." )

Couple of questions:

1. What are general requirements for text drawing routine ?

2. Is this correct:

char_byte_width - width of character in bytes (Do we have characters wider than 8 pixels ?)

ptr1 - character data

shiftptr - shift tables

lmask - ?

xbyte - x coordinate of byte in line

ClipMask - beginning address of current line in clipmask

maskptr - ?

greymask - mask byte for current line

scr - beginning address of current line on screen

In last shown code there is no chbuffer. Does it mean you threw it out or is it still used ?

ps. Good morning to all

Good Morning!

Heh... I appreciate it, and I'm sorry for just not grasping where you were coming from before - especially since the concept was so blindingly obvious.

1. Prior to rendering a character, various page zero pointers should be set up, and this is done by the SetFont routine, which is called with the ID of a font already in memory (this will be refined somewhat later, so we can cache and pull fonts from disk when required). All rendering is therefore performed with that font until SetFont is called again with a different ID. There are also a number of styling flags which can be set, including the "outline" flag. More complex styling options (such as outline text) cause a branch to a more complex rendering routine. So, the less styling applied, generally the faster the rendering (outline is the slowest, since it shifts the character on four axes to achieve the effect). All text rendering routines (indeed all rendering routines) call SetUpX first (specifying X, Y, Width and Height), and this takes care of clipping, masking, and sets up lmask and rmask, etc. If SetUpX exits with carry set, the object is clipped out of the visible region and the rendering routine should simply abort at that point. Characters of up to 4 bytes (32 pixels) wide are currently supported, which should be ample.

2. char_byte_width is indeed the width of the character in bytes. This is obtained by taking the pixel width (stored in the font data), and applying it against a LUT. The same LUT is used to obtain byte_width, which is the rendering extent in bytes, including the pixel (bit) offset into the leftmost byte.

ptr1 = character data

shiftptr = index into shift table (only MSB is manipulated, since each table is 256 bytes long)

lmask = mask applied to shifted bits obtained via shiftptr - basically masks out the most significant bits. Rmask is its complement, and is used to extract the most significant bits, which become the high bits in the next byte along.

xbyte = x coordinate of byte in line

ClipMask = complete X clipping mask (40 bytes). The mask is 256 bytes long, but bytes 40-255 are "off-screen".

MaskPtr = hardly know how to explain this one: I'll draw a diagram later. Each window has a 200-element list of codes, each referring to a line of data in the global window mask. The lists effectively form RLE compressed masks, unique to each window, but referring to shared masks in the global resource. Each window, therefore, effectively has a complete mask describing its (arbitrarily shaped) visible area, taking into account any foreground windows obscuring it. Window contents are always drawn through the window mask and clip mask. The Window masks are effectively "regions" (as they were known in the Classic Mac OS, although I must stress that any conceptual similarity to Mac regions is coincidental, and indeed I only discovered and read about regions after the window masks had been implemented - which I found rather serendipitous), and they can be merged and compared (indeed, this is how they are generated). Instead of each window's mask being an 8KB bitmap, all the masks for all the windows (assuming a maximum of maybe eight windows) fit quite happily into a 2KB shared buffer (fifty unique 40-byte masks), owing to the compression used. Thus a background window can be re-rendered, even if foreground objects have rounded corners (or drop-shadows, as is currently the case).

greymask = "greying-out" mask for current line, although if set to $FF, it remains opaque.

scr = address of current line on screen

There's no chbuffer in the new code, since for character data of only 1 byte, it's convenient to skip the buffer generation stage and hold the character bits in other locations. For wide characters, however, it appeared more expedient to build the buffer first, rather than wrestle with lots of indirect access using different offsets in the Y register, and for those characters, the original routine is branched to.
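To illustrate how shiftptr, lmask and rmask fit together, here's a rough Python model of the buffer-building stage. It rests on assumptions not spelled out above: that each of the eight 256-byte tables holds every byte rotated right by the pixel offset, and that lmask is $FF shifted right by that offset. The names SHIFT_TABLES, rotate_right and build_chbuffer are invented for the sketch.

```python
def rotate_right(byte, n):
    """Rotate an 8-bit value right by n bits (0..7)."""
    return ((byte >> n) | (byte << (8 - n))) & 0xFF


# Assumed layout: 8 tables x 256 bytes = 2K, one table per pixel offset.
SHIFT_TABLES = [[rotate_right(b, n) for b in range(256)] for n in range(8)]


def build_chbuffer(row_bytes, shift):
    """Model of the chbuffer set-up for one scanline of a character.

    Each source byte is looked up in the table for this pixel offset.
    lmask keeps the bits that stay in the current screen byte; rmask
    keeps the wrapped-around bits, which belong to the next byte along.
    """
    table = SHIFT_TABLES[shift]
    lmask = 0xFF >> shift
    rmask = lmask ^ 0xFF
    buf = [0] * (len(row_bytes) + 1)
    for i, b in enumerate(row_bytes):
        rotated = table[b]
        buf[i] |= rotated & lmask       # ORed with spill from previous byte
        buf[i + 1] = rotated & rmask    # spill into the next screen byte
    return buf


if __name__ == "__main__":
    print([hex(b) for b in build_chbuffer([0x81, 0xFF], 4)])
```

The point of the rotate-plus-two-masks trick is that one table lookup yields both halves of the shifted byte, so no per-pixel shifting happens at render time.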

Edited October 10, 2013 by flashjazzcat

flashjazzcat · October 10, 2013

Here's a depiction of the window masks:

In the first figure, Window A's window list is entirely opaque (i.e. no masked areas), since it's the front window. So, it would comprise 200 repetitions of index "0".

The desktop's window list, meanwhile, consists of:

75 * index 0

100 * index 1

25 * index 0

The desktop's window list is basically a screenful of binary 1s with a window-shaped hole stamped on it. It shares the global mask "0" with Window A, and introduces a second mask: mask 1.

In the second figure, we've added another window behind the first. Window A's list is still 200 repetitions of index 0, and Window B's list is identical to what the desktop list used to be:

75 * index 0

100 * index 1

25 * index 0

The desktop's new mask list, meanwhile, has changed:

40 * index 0

35 * index 2

50 * index 3

50 * index 1

25 * index 0

Masks 2 and 3 were created in the global mask resource when Window B's silhouette was superimposed on Window A's shape. The routines which merge masks are smart enough to re-use existing mask entries in the RLE run, rather than creating redundant duplicates of existing ones.

The mask lists are recalculated only when a window moves or is closed, or when the window order changes. There's remarkably little overhead to their computation: repeated indices are simply added to a window's list until a change in the underlying intersected masks is detected.

The indices in a window's mask list refer directly to interleaved (LSB/MSB) entries in a look-up-table of mask addresses. Mask buffer space is dynamically allocated when a new mask pattern is required during each window's RLE run.

For the purposes of this explanation, I've omitted the drop-shadows on the windows, which introduce two extra mask patterns per window (i.e. the top and bottom borders). Rounded corners would be easy to do using this technique, although the mask storage requirements naturally increase with very complex shapes.
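The two figures can be reproduced with a small Python model. This is a sketch under loud assumptions: windows are plain rectangles aligned to byte columns, the column extents chosen for Windows A and B are invented, and every scanline is recomputed from scratch rather than extending a run until the intersected masks change. MaskPool, window_mask_list and runs are hypothetical names.

```python
from itertools import groupby

SCREEN_LINES, ROW_BYTES = 200, 40


class MaskPool:
    """Shared pool of unique 40-byte line masks (the 'global resource')."""

    def __init__(self):
        self.masks = []

    def intern(self, row):
        row = bytes(row)
        if row not in self.masks:       # re-use an existing pattern
            self.masks.append(row)
        return self.masks.index(row)


def window_mask_list(pool, occluders):
    """Return 200 mask indices for a window hidden by `occluders`.

    Each occluder is (top, bottom, left_byte, right_byte), half-open.
    """
    indices = []
    for y in range(SCREEN_LINES):
        row = bytearray([0xFF] * ROW_BYTES)
        for top, bottom, left, right in occluders:
            if top <= y < bottom:
                row[left:right] = bytes(right - left)   # punch a hole
        indices.append(pool.intern(row))
    return indices


def runs(indices):
    """Collapse an index list into (count, index) runs for display."""
    return [(len(list(group)), index) for index, group in groupby(indices)]


if __name__ == "__main__":
    pool = MaskPool()
    win_a = (75, 175, 10, 30)           # assumed column extents
    win_b = (40, 125, 5, 20)
    print(runs(window_mask_list(pool, [])))               # Window A
    print(runs(window_mask_list(pool, [win_a])))          # Window B
    print(runs(window_mask_list(pool, [win_a, win_b])))   # desktop
```

With these rectangles, the pool ends up holding just four 40-byte patterns for the whole screen, which is why the shared buffer stays so small.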

Edited October 10, 2013 by flashjazzcat

popmilo · October 10, 2013

Just as I thought - simple

Seriously - those mask lists are one of the cooler techniques that I saw recently. Great work.

On the char drawing side, seems like any kind of complex optimization would be ill advised before rest of gui is developed.

Couple things I would think about are:

1. 'Somehow' optimize that 1 char wide character drawing.

2. When scrolling window contents try not to redraw entire window. Copy parts that remain same.

3. Use different routine when drawing text in top window. One that would ignore masking in the largest part of window.

ps. Just thinking loudly - too tired to recommend anything concrete
