Irgendwer, posted December 20, 2022

6 hours ago, Spaced Cowboy said: The stack grows downwards

While this is the "classic" orientation of a stack, for a 6502 the other way has some advantages. I wish I could easily turn it around for cc65... See discussion here: https://github.com/cc65/cc65/issues/1226
Spaced Cowboy (Author), posted December 20, 2022

I'm open to the idea, but right now I'm in the "get it working" phase, not the "optimise the bejeezus out of it" phase...

Example: because I'm not tracking register use (or rather, I'm not tracking it at the block-construct level), I'm freeing all registers at the end of each expression. That means there's a lot of repeated code to load a "register" from a variable, do something, and persist it back to that variable, and the next statement might load the same variable to a register...

The logic is all in place for tracking register use within an arbitrarily complicated expression, and that will generalise to multiple expressions pretty easily I think, but I don't want to bake anything into how it works until it's at a level I consider "working". Right now we're definitely in "development". I can't even pass function arguments yet.

You can imagine just how inefficient local variables are, when the general structure is "move from storage to register, operate, move back" and all the calculations for where to move are being repeated because there's no vector holding the {sp + x} location of where the variable is stored.

So at the moment, the compiler is very inefficient - but I'm beginning to think that's also because of the one-pass* nature where it takes high-level source code and produces working assembly. That might have to change, and I've been reading up on QBE as "bedtime reading". QBE is very focussed on x86 and (to a lesser extent) arm, but the principles behind it (SSA, register graph-colouring, call-graph optimizations etc.) are pretty portable. Since I'm effectively defining a very minimal virtual machine here (with the extensive use of zero-page "registers"), even if all the logic is bare-metal, I think there's a reasonable match that could be made to how QBE operates. It'd be a far bigger jump if we stayed only with X, Y and A.
The goals are:

1) Get things working to a level I'm happy with.
2) Start optimizing the output. This might involve splitting xtal-c (xtal-compiler) into xtal-cfe and xtal-cbe, for the front and back-end parts of the compiler. The cfe will parse the source and generate a high-level linear intermediate representation (eg: the QBE style). Then xtal-cbe can operate at that linear-code level, and optimize the register use, assign global (and zero-page) storage where it makes sense etc., finally producing assembly language output for the assembler.

Lots of work to do before any decision needs to be taken - proper functions are first on the list, then variadic ones (I want to write "printf" in xtal, for xtal), then structs/classes, and then a bit of mop-up/clean-up and then I'll start to think about memory, optimisation etc.

* The current compiler is in fact two-pass - I generate an AST as a first pass and then walk the tree to generate code. The conceptual level is one-pass though - there's no intermediate representation other than the tree structure (and content) itself.
Spaced Cowboy (Author), posted December 24, 2022

Beginning to almost be useful now... If you squint and pretend a total lack of optimization is ok...

We now have proper arbitrary-depth function support. There's currently an imposed limit of 16 bytes worth of function argument per function call, but that lets me make the calling protocol a lot more efficient. I think the small use-case of needing more than eight 2-byte parameters to a function could be adequately addressed in future by passing a pointer to a class/struct. I also have the option of bumping that to 32 bytes: the next 16 bytes after the function args have been earmarked as function scratch-space, but I'm not really using anything there, so it's currently unused. Bear in mind that arrays and strings etc. are passed as pointers, so even a 1k-element array only uses 2 bytes to pass its start address.

If it really becomes an issue, I could always spill anything that comes after 16 bytes of argument to the stack; at least then you'd only pay the price for stack-backed variables if you used them. That shouldn't actually be too hard to add, but I'm good with the current solution for now.

What this means is you can do something like:

    void dummy(s32 a, u8 b, u16 c)
        [
        print a;
        print b;
        print c;
        ]

    s32 main()
        [
        s32 a;
        a = 1000;
        dummy(a, 5, a/2);
        return (0);
        ]

and see the output looking like:

    @elysium xtal % atarisim /tmp/out.com
    1000 5 500

Next up I think is function prototypes, and then I'll see if I can figure out how to do varargs based on a prototype indicating that state. At that point printf() might actually be viable.
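The 16-byte argument area described in the post above can be sketched in a few lines of Python. This is only an illustration, not the xtal compiler's code: the names `SIZES`, `ARG_AREA_BYTES` and `layout_args` are made up here, and packing the arguments back to back with no alignment is an assumption, since the post does not say how they are laid out.

```python
# Hypothetical sketch: assign each function argument an offset inside a
# fixed 16-byte argument area, rejecting calls that would overflow it.
SIZES = {"u8": 1, "s8": 1, "u16": 2, "s16": 2, "u32": 4, "s32": 4}
ARG_AREA_BYTES = 16  # the limit mentioned in the post


def layout_args(param_types):
    """Return a list of (type, offset, size) tuples, packed back to back.

    Packing without padding is an assumption made for this sketch.
    """
    offset = 0
    slots = []
    for type_name in param_types:
        size = SIZES[type_name]
        if offset + size > ARG_AREA_BYTES:
            raise ValueError(
                f"arguments need more than {ARG_AREA_BYTES} bytes; "
                "pass a pointer to a struct instead"
            )
        slots.append((type_name, offset, size))
        offset += size
    return slots


if __name__ == "__main__":
    # The dummy(s32 a, u8 b, u16 c) signature from the post uses 7 bytes.
    for slot in layout_args(["s32", "u8", "u16"]):
        print(slot)
```

Under these assumptions eight `u16` parameters fit exactly and a ninth is rejected, which is the case the post suggests handling with a pointer to a class/struct or by spilling to the stack.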
Ecernosoft, posted December 24, 2022

On 11/18/2022 at 10:51 PM, mytek said: I can't even pretend to understand what you are creating, but I can say you are quite the genius

Same here unfortunately, or rather fortunately as DUDE, YOU ARE FREAKING AWESOME.
Spaced Cowboy (Author), posted December 24, 2022

Thank you for the kind words, but it's not in fact *very* difficult once you get past a few concepts.

- You create a Parser that breaks down text into tokens.
- You create an Abstract Syntax Tree (AST), which is just a fancy phrase for something that relates those tokens together. The fact that some nodes (and not others) are allowed to be children of others is what defines the syntax of your language.
- Once you can define an 'expression' as a collection of nodes (eg: integer, add, integer) and you can parse those expressions at any time, you have the basics of a language.
- Then there's a whole bunch of "how do I actually want this to work as a language?"

And there are a lot of good resources for that. It turns out there's not *much* difference between a compiler (which is what I'm doing) and an interpreter - you can imagine a compiler to be akin to another stage after the interpreter, and there's quite a bit of the same type of code up until the point where they diverge, so "Crafting Interpreters" has been pretty useful. It's very well written, and free to read online. I'm actually culling stuff from lots of places to get what I want to happen, and my work gives me access to O'Reilly "Safari" (basically an online technical library) where there's a lot of awesome stuff to pull from.

As for where it's going, I'm hoping that it ends up being something a bit like Action! except with fewer limitations (adding recursion, 32-bit arithmetic, floats, fast-math, ...). I'm generally headed towards a 'C' like language, but with differences where I think it makes sense to fit into the 8-bit architecture.

The other point is to have a language that actually makes it easy to take advantage of the new hardware that I will (eventually, honest!) get working. The first step is getting something working at the unmodified-computer level, and then I can compare it against the extended hardware.
It is called extended-atari language after all.

Paging functions into and out of RAM is going to be a very big part of it. Being able to swap out a block of memory with a write, and having literally hundreds of megabytes to play with, will make a difference, I think - so building into the language something that seems to "magically" extend the old 8-bit will be quite a bit of fun. The same thing goes for data, of course - and since the screen is just data, that's a big win too.
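The tokens, AST and expression steps listed in the post above can be illustrated with a very small Python sketch. It is not xtal's parser: the node classes `Num` and `BinOp`, the grammar (integers, `+ - * /`, parentheses) and the stack-machine mnemonics printed by `emit` are all invented for this example. It also shows the point about interpreters and compilers sharing a front end: `evaluate` and `emit` walk the same tree.

```python
import re
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Break source text into integer and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Recursive-descent parser: which node may be a child of which
    node is what encodes the grammar (and the operator precedence)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        token = take()
        if token == "(":
            node = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        return Num(int(token))

    def term():
        node = primary()
        while peek() in ("*", "/"):
            node = BinOp(take(), node, primary())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = BinOp(take(), node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree


def evaluate(node):
    """Interpreter back end: compute the value while walking the tree."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    return {"+": left + right, "-": left - right,
            "*": left * right, "/": left // right if right else 0}[node.op]


def emit(node, out=None):
    """Compiler back end: same walk, but emit (made-up) instructions."""
    out = [] if out is None else out
    if isinstance(node, Num):
        out.append(f"push {node.value}")
    else:
        emit(node.left, out)
        emit(node.right, out)
        out.append({"+": "add", "-": "sub", "*": "mul", "/": "div"}[node.op])
    return out


if __name__ == "__main__":
    tree = parse(tokenize("1 + 2 * 3"))
    print(tree)
    print(evaluate(tree))
    print("\n".join(emit(tree)))
```

The two back ends diverge only at the last step, which is roughly why a book about interpreters is still useful when writing a compiler.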
Ecernosoft, posted December 24, 2022 (edited)

20 hours ago, Spaced Cowboy said: Thank you for the kind words, but it's not in fact *very* difficult once you get past a few concepts. [...]

I'm super excited! By the way, do any of you know about "professional" atari 8 bit compiling software? Thanks! I'd say that the Atari sold well enough to get it, so why not? (GUI's please....)
Spaced Cowboy (Author), posted December 27, 2022

So the claim above "We now have proper arbitrary-depth function support" (*cough*) turned out to not, in fact, be the case... I was neglecting to consider the case of what happens when we call a function from a function, and also need to use the variables passed into the first function, later on in the first function after we've called the second function.

I think that sentence is a little unclear. Let me illustrate with an example:

    void dummy2(s32 a, u8 b, u16 c)
        [
        print a;
        print b;
        print c;
        ]

    void dummy(s32 a, u8 b, u16 c)
        [
        dummy2(a/2,b/2,c/2);
        dummy2(a,b,c);
        ]

    s32 main()
        [
        s32 a;
        a = 1000;
        dummy(a, 5, a/2);
        return (0);
        ]

Here main() calls into dummy(), which uses the passed arguments to generate values to call into dummy2() with. However, it then uses those same arguments to call into dummy2() again immediately afterwards. That means that we need to preserve the input values for later use, possibly (like in this case) still calling out to another function with them.

The solution wasn't terribly difficult, but it does mean I have to preserve data on the stack, something I was trying to avoid. We don't need to do this unless it's a nested function-call (ie: anything called from 'main' doesn't need to preserve what was there, because that's the bottom level), but any other function call needs to preserve the data for any of the function-passing registers that it uses - and this has to happen recursively for deeply nested function-calls.

I've implemented this save/restore as high-level move.x "instructions", and I'm hopeful that some later optimisation step can use the call-graph to figure out what actually needs to be preserved, and get rid of any excess data moves, but in the meantime - adhering to the rules of optimisation - I'm going to let it be. There's many a mountain to climb before this becomes the low-hanging fruit ...
I also turned the above into a test:

    @elysium xtal % atarisim /tmp/out.com
    500 2 250 1000 5 500
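The save/restore problem from this post can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the compiler's generated code: the three-slot `arg_regs` list stands in for the zero-page argument registers, `stack` stands in for the 6502-side data stack, and the `preserve` flag exists only so the broken and fixed behaviour can be compared.

```python
def run(preserve):
    """Simulate main() -> dummy() -> dummy2() with shared argument registers.

    preserve=True saves the caller's argument registers around nested
    calls (the fix described in the post); preserve=False reproduces the bug.
    """
    arg_regs = [0, 0, 0]   # stand-in for the shared argument registers
    stack = []             # stand-in for the stack used to preserve them
    output = []            # what the 'print' statements would show

    def call(func, *args, nested=True):
        # Arguments are already evaluated here, as they would be at a call site.
        if preserve and nested:
            stack.append(list(arg_regs))   # save the caller's incoming args
        arg_regs[:len(args)] = args        # load the callee's arguments
        func()
        if preserve and nested:
            arg_regs[:] = stack.pop()      # restore the caller's args

    def dummy2():
        output.extend(arg_regs)            # print a; print b; print c;

    def dummy():
        call(dummy2, arg_regs[0] // 2, arg_regs[1] // 2, arg_regs[2] // 2)
        call(dummy2, arg_regs[0], arg_regs[1], arg_regs[2])

    def main():
        a = 1000
        # Calls made from main need no save: there is nothing below it.
        call(dummy, a, 5, a // 2, nested=False)

    main()
    return output


if __name__ == "__main__":
    print("with save/restore:   ", run(preserve=True))
    print("without save/restore:", run(preserve=False))
```

With `preserve=True` the model produces the same six values as the test run in the post; with `preserve=False` the second call to `dummy2` sees the halved values left behind by the first call.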
Spaced Cowboy (Author), posted December 27, 2022

On 12/24/2022 at 1:09 PM, Ecernosoft said: I'm super excited! By the way, do any of you know about "professional" atari 8 bit compiling software? Thanks! I'd say that the Atari sold well enough to get it, so why not? (GUI's please....)

I don't. I know there's WUDSN (spelling ?) on Windows (I think it's just Windows), but I'm currently using bbedit and make for all this - my IDE is Xcode because I'm really writing a cross-compiler here, even if I'm making sure the language can be entered on the Atari keyboard.

There is a pipe-dream, nothing more than a twinkle in a somewhat closed eye, that I might be able to compile all of this for the embedded RP2040 on the expansion board, and have the editor/debugger environment running there, deploying to the XL/XE it's attached to - this is the reason I wanted the character set to be completely and easily (trigraphs, yuk!) mappable to the XL/XE keyboard... I wouldn't hold your breath for this.
Ecernosoft, posted December 27, 2022 (edited)

18 hours ago, Spaced Cowboy said: There is a pipe-dream, nothing more than a twinkle in a somewhat closed eye, that I might be able to compile all of this for the embedded RP2040 on the expansion board, and have the editor/debugger environment running there, deploying to the XL/XE it's attached to - this is the reason I wanted the character set to be completely and easily (trigraphs, yuk!) mappable to the XL/XE keyboard... I wouldn't hold your breath for this

*tries to process the information* *explodes* *r.i.p. Ecernosoft. Tried to understand information he cannot understand.*

Take your time, you've got this! 😃 I don't want to pretend to know what you are trying to do. I bet it'll be awesome when it's done!
Spaced Cowboy (Author), posted December 28, 2022

Two things of note to talk about ...

1) Function prototypes

I was never very happy with the way I was accepting function arguments. All the logic was there to do the work, but there was very little error-checking going on - in fact I spent quite some time trying to figure out why a unit test wasn't producing the correct results (poring over the assembly code and ignoring the high-level code) because I'd passed a u32 to a u16 function-parameter, and nothing was telling me that there was anything wrong there.

So, now we have function prototypes - one can define just the parameters of the function ahead of time, and when any call is made to the function, we match the passed parameters with the defined ones in the prototype. At the moment, the prototype parameter names also need to match the function ones - this is a minor niggle, and I might sort it out, but it's not terribly high on the list, if I'm being honest.

What this means is that something like this will work...

    #import <stdio>

    void dummy(s32 a, u8 b, u16 c);

    s32 main()
        [
        s32 a;
        a = 1000;
        dummy(a, 5, a/2);
        return (0);
        ]

    void dummy(s32 a, u8 b, u16 c)
        [
        print a;
        print b;
        print c;
        ]

but something like this will throw a compile-time error:

    #import <stdio>

    void dummy(s32 a, u8 b);

    s32 main()
        [
        s32 a;
        a = 1000;
        dummy(a, 5, a/2);
        return (0);
        ]

    void dummy(s32 a, u8 b, u16 c)
        [
        print a;
        print b;
        print c;
        ]

2) Modules

Eagle-eyed readers may have spotted something else in the above two programs - we now have modules of code.
Basically the way this works is that if there is a #import <filename> then the compiler will do three things before starting to process any of the source code:

1) It looks for the file $BASEDIR/modules/filename/filename.h, and if it finds it, it prepends the contents of that file to the source code it's about to compile.
2) It looks for the file $BASEDIR/modules/filename/filename.xt, and if it finds it, it appends the contents of that file to the source code it's about to compile.
3) It looks for the file $BASEDIR/modules/filename/filename.s, and if it finds it, it emits a directive to the assembler to include that file.

BASEDIR in this case is either obtained from an environment variable, or passed in on the command line. If nothing is set, it defaults to /opt/xtal/lib (which is where I currently deploy everything).

Now that we have prototypes, you can put function prototypes into a .h file, functions corresponding to those prototypes into a .xt file, and any assembly code that your function depends on into the .s file. The assembler will only emit code for functions that are used, so there's no need to worry about unused functions bloating the final binary.

I'm testing this with stdio at the moment, hence the #import <stdio> calls - this is all in preparation for printf().

In fact, though, now that there are multiple source-files in the compiled code, and line-counts no longer mean much, I'm planning on trying to clean up the line/file counts at the high-level-language stage. The assembler already tracks things it includes and understands different contexts, but the compiler stage isn't as capable. I might try putting some effort into that before trying to get varargs/stdargs working. I have a plan for the variable arguments, but it's still just an in-my-head plan, I haven't tried it, and I think knowing where things are going wrong might be useful while debugging.
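The three-step #import lookup described in the post above could look roughly like this in Python. It is a sketch under several assumptions that the post does not confirm: the environment variable name `XTAL_BASEDIR` is invented, the command-line value is assumed to take precedence over the environment, and the `.include "..."` text is a placeholder for whatever directive the xtal assembler really accepts. The function names are hypothetical too.

```python
import os
from pathlib import Path

DEFAULT_BASEDIR = "/opt/xtal/lib"  # default location named in the post


def find_basedir(cli_value=None, env=None):
    """Pick BASEDIR: command line, then environment, then the default.

    The precedence and the XTAL_BASEDIR name are assumptions.
    """
    env = os.environ if env is None else env
    return Path(cli_value or env.get("XTAL_BASEDIR") or DEFAULT_BASEDIR)


def expand_imports(source, imports, basedir):
    """Apply the three lookups for each imported module name.

    Returns (combined_source, assembler_directives).
    """
    prepend, append, directives = [], [], []
    for name in imports:
        module_dir = Path(basedir) / "modules" / name
        header = module_dir / f"{name}.h"
        body = module_dir / f"{name}.xt"
        asm = module_dir / f"{name}.s"
        if header.is_file():
            prepend.append(header.read_text())      # prototypes go first
        if body.is_file():
            append.append(body.read_text())         # function bodies go last
        if asm.is_file():
            directives.append(f'.include "{asm}"')  # placeholder directive
    return "".join(prepend) + source + "".join(append), directives


if __name__ == "__main__":
    text, directives = expand_imports("s32 main() [ return (0); ]\n",
                                      ["stdio"], find_basedir())
    print(text)
    print(directives)
```

Each of the three files is optional in this sketch, matching the "if it finds it" wording, so a module can be header-only or assembly-only.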
Spaced Cowboy (Author), posted January 23, 2023

Wow, it doesn't seem like last year that I last posted about this, but apparently it was... This post isn't really about any huge development within xtal per se, it's about how I've been spending that time, and 'spending' the time seems like the correct phrase..

The next goal, back then, was basically to get printf() working - ie: get functions with variable-length argument lists. Printf() may seem like an esoteric target this early on, but it's a surprisingly useful debugging tool, and being your typical lazy programmer, I like to make my life as easy as possible.

However, one of the things about printf is that it takes variable types of argument as well as variable arguments themselves. My plan for this is for the compiler to emit (at the call-site) an encoding of byte-lengths for each argument, so the assembler can figure out what to put where in the register-assignment. It'll make the AST a bit more difficult, but it ought to make the assembler's job a lot easier.

While I was thinking about that, it occurred to me that now would be a good time to get the f32 (float) type into the language - we have all the integer types I'm expecting to support (signed and unsigned 8, 16 and 32-bit types), structs are on the horizon, and the last outstanding primitive type is the float, which would make my yet-to-be-written printf() even more generic...

"Ok, let's add floats", I thought. That was last year. Floats have been metaphorically kicking my behind since then. To the point where I stopped, and wrote a small debugging tool to help me out. It's incredibly useful to be able to page back and forwards through time to see what's happening in memory (and more importantly, why). The largest obstacle though, as will become clear, was my own understanding of floating point formatting...
I originally started out using Woz's floating point routines [PDF link], and it all started reasonably well. I wrote myself a small command line app "wozf" that let me type in a float and get the correct representation (or so I thought) and everything was proceeding pretty well until it got to the test-suite stage. I couldn't get it to subtract 9 from 2: it always gave me -1, not -7, and I couldn't for the life of me see why.

I reached out for help on the 6502 forum where there was a suggestion that the bias offset was different for negative values compared to positive ones. That made sense in isolation, but it didn't seem to work for other examples... I was fairly sure I was doing something wrong, and that the code was fine - I'd compared my code to his oh, about a thousand times - and I couldn't see any typos. I checked the simulator (that was fine), I checked my assembly output (that was fine).

So I gave up - and this was when I started writing the debugging aid above. I wanted to step through things and see them happening, and I wanted to be able to cross-jump between the source-code and the instruction stream to jump around the code easily. A couple of weeks passed while I got this up and running. And then ... I still couldn't see anything wrong [sigh]

I decided to look around for other FP engines - I looked into MS basic (but didn't like the 40-bit format, I wanted it to fit into the same space as a u32), I looked at BBC basic, same problem (5 byte floats). Eventually I found an older implementation: the 6502 Software Gourmet Guide and Cookbook [Internet archive link]. This was a 4-byte FP system from 1979 (!)
which was actually pretty similar to Woz-format, maybe slightly simpler, but it also had the advantage of "good documentation", and it's written to be far and away more straightforward to follow - Woz was a fiend for compressing his code into as small a space as possible, and it can make it hard to follow the jsr to this that jsr'd to that, which jsr'd back into this etc. This is fine when you know the code (especially if you've written it). When you're trying to grok it, it's ... not so fine.

I couldn't find a source with the book-code already in text format (the PDF is just images), but I figured that typing it in might give me a better idea of how it worked anyway, so that's what I did. I found a few typos in the code in the book, but it all went fairly smoothly... Until I tested 2-9, and got -1.

Ok, two completely separate implementations of floating point, both giving the same "wrong" result, means it's definitely a PEBCAK problem. I used my new debugging aid to step through, writing down notes as I went, and finally the penny dropped.

Stepping through the code made it clear: the logic works well and we got to the penultimate step with an answer of +7 in the accumulator when subtracting 9 from 2. It then complements that and you end up with 00:00:90:03

I was interpreting the high byte of the mantissa as

    $90  = 10010000
           smmmmmmm

    0x80 = 1 -> negative
    0x40 = 0 -> 0 x 0.5
    0x20 = 0 -> 0 x 0.25
    0x10 = 1 -> 1 x 0.125

where the exponent = 2^3 = 8 => (-1) * (8) * (0.125) => an answer of -1

But that's not the case - the mantissa should be regarded as valid for the number of bits specified in the exponent, and if it's -ve then it's a left-aligned 2's complement number, so 0x90 is more recognizably specified (if it were a 32-bit int) as 0xfffffff9 or -7

And 2-9 is indeed -7, so the logic works just fine.
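The two readings of 00:00:90:03 discussed above can be checked with a short Python sketch. The byte layout used here (three mantissa bytes, least significant first, followed by one exponent byte) is inferred from the examples in this thread rather than taken from the book, and treating the exponent byte as a signed two's complement value is an assumption. The function names are made up for the example.

```python
def decode_float(b0, b1, b2, exp_byte):
    """Decode a 4-byte float as a two's complement 24-bit mantissa
    (binary point just after the sign bit) scaled by 2**exponent.

    Assumed layout: b0 = mantissa low byte, b2 = mantissa high byte,
    exp_byte = signed exponent.
    """
    mantissa = b0 | (b1 << 8) | (b2 << 16)
    if mantissa & 0x800000:            # sign bit set: two's complement
        mantissa -= 1 << 24
    exponent = exp_byte - 256 if exp_byte & 0x80 else exp_byte
    return (mantissa / 2**23) * 2.0**exponent


def naive_sign_magnitude(high_byte, exp_byte):
    """The mistaken reading from the post: a sign bit followed by
    magnitude bits worth 0.5, 0.25, 0.125, ... (high byte only)."""
    sign = -1 if high_byte & 0x80 else 1
    fraction = (high_byte & 0x7F) / 128
    return sign * fraction * 2.0**exp_byte


if __name__ == "__main__":
    print(decode_float(0x00, 0x00, 0x90, 0x03))   # the 2 - 9 result
    print(naive_sign_magnitude(0x90, 0x03))       # the misleading reading
```

Read as two's complement, the mantissa is -0.875 and the value is -0.875 * 8 = -7; read as sign-magnitude, the same byte gives -0.125 * 8 = -1, which is exactly the "wrong" answer both implementations seemed to produce.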
I think I was being thrown off by the large number of zero-bits in a negative number. I normally expect to see an exuberance of 1-bits, and I hadn't twigged about the mantissa valid-bits being governed by the exponent. I also think the far-more-logically-straightforward treatment of floats in the book made it a lot clearer what was going on, even though Woz's code produces exactly the same sort of result.

Anyway, right now I have addition, subtraction, multiplication and division working to my satisfaction, and also input (ascii->float) and output (float->ascii) routines, although the float->ascii could probably do with a little polish: you always get scientific notation as output, and it doesn't trim trailing zeros...

I also rewrote my little command line utility as a class, which I'll integrate into the assembler so I'll be able to have things like:

    m_pi: .float 3.141592

.. in the assembly. It turned out to be really useful on the command line:

    @elysium % float65 0x1d906402
    Float: 3.142592
    @elysium % float65 3.142592
    Hex : 1d.90.64:02

This has been a pretty long post, so as a little bit of eye-candy, this is the result of calling 'FP_ftos' on the floating point representation of 2.

If you look closely at the buffer starting at 0x40 (it was just easier to have all the changing registers/values on the same page, I wouldn't really put stuff there), you can see the sequence:

    2B 30 2E 32 30 30 30 30 30 30 45 2B 30 31 00

or +0.2000000E+01 (with a trailing \0)

This gives me the basics of a working floating point library (most of the other typical FP functions can be expressed in terms of the 4 basic ones) and a way to get the data into and out of the floating point realm.
Now I have to create the 'float' types in xtal, and get them worked through the compiler and assembler, as well as the various conversions between floats and ints of various lengths.

But since this has been, as mentioned at the top, metaphorically kicking my behind for a month or so, getting to this point feels like progress, and I got a reasonably useful debugging tool out of it too, so today is a good day, well better than most of the past month or so, anyway.