Developing a new language - ACUSOL

Alfred · March 21, 2014

Looks interesting.

I think your choice of passing strings by value rather than reference is a mistake. One could very easily waste huge chunks of memory merely by having a bunch of procedures that take string arguments. There is also the overhead of all that copying.

Pab · March 21, 2014

Well, they're PASSED by reference (3-byte pointer) but if a string is called for in the procedure's argument, a string variable will be created.

Alternately, I could just have the compiler treat all STRING arguments as STRING POINTER arguments, and references to those strings be converted by the compiler to pointer references.

Alfred · March 21, 2014

Ok, but I don't like the implications of that array. I sure hope that:

Procedure Test()
  Byte Array mybuffer[4096]
Return

doesn't generate 4K of binary zeroes. It was nice to see you special cased the +1 to an INC.

Pab · March 21, 2014

At the moment it does (since I'm just trying to get the compiler to spit out the proper code), but one of the TODOs in my source code is to just close a file module and jump ahead whenever something needs to jump ahead by more than a sector's worth of data. That will be taken care of long before it gets anywhere near even pre-release, and it will probably always be "fixed" in the native Atari compiler.

Actually, this is one of the areas where my memory is a little hazy since I haven't worked on actual Atari hardware in over a decade. What is the tradeoff point where it's faster to just load binary zeros than to start a new module and risk the slowdown of loading the new header (thinking of AtariDOS here) and risking a disk seek? I used to think that if it was less than 128 bytes it was better to keep it inline. What's the consensus nowadays?

Yeah, I special-case ==+1 to INC, ==-1 to DEC, ==+2 to INC INC, and ==-2 to DEC DEC at the moment. I also plan to special-case multiplying and dividing by 2, 4, 8, 16, 32, 64, 128, etc. into bit shifts when I get knee-deep into the math code.
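
As an illustration of the special-casing described above, here is a minimal sketch in Python (not the compiler's actual Pascal source). The function name `reduce_op` and the `CALL` fallback are hypothetical; the sketch assumes a one-byte value and only shows which pseudo-instructions would be chosen.

```python
def reduce_op(op: str, k: int) -> list[str]:
    """Pick pseudo-instructions for `value = value <op> k` on a one-byte value."""
    # +1/-1/+2/-2 become one or two INC/DEC instructions.
    if op in ("+", "-") and k in (1, 2):
        return ["INC" if op == "+" else "DEC"] * k
    # Multiplying or dividing by a power of two becomes repeated shifts.
    if op in ("*", "/") and k > 1 and k & (k - 1) == 0:
        shifts = k.bit_length() - 1
        return ["ASL" if op == "*" else "LSR"] * shifts
    # Hypothetical fallback: hand everything else to a runtime routine.
    return [f"CALL {op} {k}"]
```

For example, `reduce_op("*", 8)` selects three left shifts, while `reduce_op("*", 3)` falls through to the general routine.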

Edited March 21, 2014 by Pab

flashjazzcat · March 21, 2014

You shouldn't get seeks anyway: new segment address is just 4 consecutive bytes.

Kyle22 · March 21, 2014

I don't think that matters as much these days, with IDE solid state drives w/512 byte sectors under SpartaDOS-X. They are so fast, I don't think a disk seek would be noticed at all.

Pab · March 21, 2014

I'm still thinking of the days of 1050's and XF551's. And how AtariDOS would retrieve those four bytes one at a time. That was what really slowed down changing segments.

Edited March 21, 2014 by Pab

flashjazzcat · March 21, 2014

Well, the four bytes will likely be sitting there in the sector buffer already (since it's likely that the tail-end of the last segment was a partial sector, so not a burst read), so the only overhead is the code required to pull them out of there. But of course there's a (much greater) code overhead associated with pulling many, many zeroes out of the same stream. I don't see a situation where any additional seeking would be necessary at all. A binary executable is just a linear stream of bytes.

Edited March 21, 2014 by flashjazzcat

Pab · March 22, 2014

Okay, I'm now creating a new file segment if a variable or area is allocated in program space of more than 16 characters. Takes the output file from the same "Hello" source code down from 209 bytes to 173.
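
For anyone following along, the segment splitting can be sketched as below. This is a Python illustration under stated assumptions, not Pab's code: the names `split_segments` and `write_binary` are hypothetical, and the file layout assumed is the usual Atari DOS binary load format ($FF $FF, then for each segment a little-endian start address, an inclusive end address, and the data). The 16-byte threshold is the one mentioned in the post.

```python
import struct

ZERO_RUN_THRESHOLD = 16  # from the post: split when a zero area exceeds 16 bytes

def split_segments(base, image, threshold=ZERO_RUN_THRESHOLD):
    """Return (address, data) pairs, skipping zero runs longer than threshold."""
    segments, start, i, n = [], 0, 0, len(image)
    while i < n:
        if image[i] != 0:
            i += 1
            continue
        j = i
        while j < n and image[j] == 0:   # measure the zero run
            j += 1
        if j - i > threshold:
            if i > start:                # close the segment before the run
                segments.append((base + start, image[start:i]))
            start = j                    # next segment begins after the run
        i = j
    if start < n:
        segments.append((base + start, image[start:]))
    return segments

def write_binary(segments):
    """Serialize segments with the $FFFF signature and 4-byte headers."""
    out = bytearray(b"\xff\xff")
    for addr, data in segments:
        out += struct.pack("<HH", addr, addr + len(data) - 1)
        out += data
    return bytes(out)
```

Each skipped run costs 4 header bytes instead of the run itself. Note that memory skipped this way is left untouched by the loader rather than cleared, so this only suits areas the program does not expect to start out zeroed.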

TXG/MNX · March 22, 2014

Nice, let's see how much smaller it can get... next time :-)

Pab · March 22, 2014

I had imagined that people with large numbers of variables to allocate and huge structures would be using banked RAM or an area outside the code area.

Pab · March 23, 2014

Question: Is the order of operations all that important to programmers nowadays? I'm working on my math code, and it could be a bit faster if I didn't sort out the traditional order of operations.

Action treats "a + 3 * b" as "3 times b, added to a." Reading from left to right, it would come out to "a + 3, times b."

Of course, if you want to add three times the value of b to a, you could write "a + (3 * b)" which is what I think most programmers do nowadays anyhow.

At the moment I am special-casing "*2" "*4" "*8" "*256" "/2" "/4" "/8" and "/256" to bitwise operations (or in the case of 256, copying and zeroing).

Pab · March 23, 2014

Also, what does everyone think about using "\" as a symbol for "mod" in mathematical equations? I'm accepting "mod" but wanted something shorter.

C uses "%" but Action already has that assigned to Bitwise OR.

I'd also like to come up with symbolic operators for "LSH" and "RSH." I will be evaluating boolean expressions (for IF-FI) with a different routine than mathematical expressions, so '<' is available for left shift and '>' available for right shift, or would that confuse programmers? Of course, on the native Atari version we could use ATASCII 30 and 31 (the left and right arrows) as the operators. Of course, for the old school, "LSH" and "RSH" will still be usable.

Edited March 23, 2014 by Pab

a8isa1 · March 23, 2014

Pab, I'm excited about your project.

Please don't make me use parentheses when precedence of multiplication or division is implied. It's ingrained. At least it is for people who grew up using AOS calculators.

3 * 2 + 2 * 4 is obvious and much easier than (3 * 2) + (2 * 4)

As for suggestions, these are all I can think of.

I would love to see code generation for simple cartridges. Code that can run from the cartridge and not simply copy to RAM later.

Perhaps, long integers and long CARDs.

If supporting Action!'s records perhaps borrow Pascal's WITH statement

Does Action! allow assignments of whole records? I can't remember.

A CASE statement wouldn't be a bad thing.

-SteveS

ilmenit · March 23, 2014

Proper order of operations would be very nice to have. Most modern languages follow C's order of operations.

Pab · March 23, 2014

My solution to speeding things up was to essentially create a "math stack" in the cassette buffer, and every time an open parenthesis was encountered to push the current value of the operation onto that stack. Then pop it back off and perform the previously requested operation once the close parenthesis came along.

So if it parsed the expression c * (b - 256) it would:

Evaluate C.
Push the current value of C onto the math stack.
Evaluate B.
Subtract 256 from B
Pop the previous value of C off the math stack.
Multiply the previous value times the current value.
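
The steps above can be sketched as follows. This is a Python illustration of the idea only (strict left-to-right evaluation, with a stack used for parentheses); the function name, the token list input, and the `variables` dictionary are assumptions made for the example, and the real compiler would emit 6502 code rather than compute values.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def eval_left_to_right(tokens, variables):
    """Evaluate tokens strictly left to right; parentheses use a stack."""
    stack = []                   # the "math stack": (saved value, pending operator)
    acc, pending = None, None
    for tok in tokens:
        if tok == "(":
            stack.append((acc, pending))   # push the current value
            acc, pending = None, None
        elif tok == ")":
            inner = acc
            acc, pending = stack.pop()     # pop the previous value
            acc = inner if pending is None else OPS[pending](acc, inner)
            pending = None
        elif tok in OPS:
            pending = tok
        else:
            value = variables[tok] if tok in variables else int(tok)
            acc = value if pending is None else OPS[pending](acc, value)
            pending = None
    return acc
```

With c = 3 and b = 300, the tokens of "c * ( b - 256 )" give 132. Without parentheses, "a + 3 * b" is taken as (a + 3) * b, which is exactly the precedence question being asked in the thread.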

To do order of operations is going to take a bit of presorting, which will have to be done in the compiler. I guess it could be done by pre-sorting the expression before generating the code.

As for Longints and Longcards, I don't think so at this point. Maybe as the language grows.

Direct assignment of objects would be simple. Just move/copy the memory of one into the other. Hadn't thought of doing that. Will do that once I'm knee-deep in full object support.

I'll think about doing CASE once I'm onto boolean operations. What would everyone say to a BOOL type, while I'm at it?

thorfdbg · March 23, 2014

What really worries me is that you're thinking too concretely about your implementation. Really, please look up "recursive descent parsing", and this whole problem will go away. Whether that will be a stack, or whether this is in the tape buffer is completely irrelevant for the matter. Actually, it will be simply a subroutine call stack if you do it right, and whether that's in any type of tape buffer or on the 6502 stack or a software stack is irrelevant.

What's probably constructive is to use a very simple compiler like the ABC compiler, and analyze that. The ABC basic compiler is written in BASIC itself, and it is a direct clean approach of a recursive descent parser. Try to understand that first, you'll learn a lot. When I saw that first my reaction (like 20 years ago) was, "Oh, it's really *that* simple". And yes, it is, it's a beauty in itself.

Oh, the ABC compiler is compiled by itself. It's in BASIC. I provided a decompiler for ABC a while ago. You can either find it on the net or drop me a note here.

Probably a good idea to write your compiler in a high-level language first, then once you have it, convert it to its own language and compile it by itself. Makes a good test, too.
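
To show how small a recursive descent parser for this kind of expression can be, here is a minimal sketch in Python. It is not taken from the ABC compiler or from Pab's code: the grammar, class, and function names are assumptions for illustration, it evaluates values instead of emitting code, and it leaves out unary minus. Operator precedence falls out of which routine calls which.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()])")

def tokenize(text):
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out

class Parser:
    def __init__(self, tokens, variables):
        self.tokens, self.pos, self.vars = tokens, 0, variables

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expression(self):   # expression := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):         # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value //= self.factor()
        return value

    def factor(self):       # factor := number | name | '(' expression ')'
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        return int(tok) if tok.isdigit() else self.vars[tok]

def evaluate(text, variables=None):
    parser = Parser(tokenize(text), variables or {})
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value
```

The only "stack" involved is the call stack of `expression`, `term`, and `factor` calling each other. A compiler would emit code at the points where this sketch does arithmetic.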

Pab · March 23, 2014

That is precisely what I'm doing. I'm writing it in Lazarus (Free Pascal RAD) under Windows. Once it's fully functional, I will translate it to its own language to compile a native Atari version.

Alfred · March 23, 2014

I think it's important that you support proper precedence of operations. It's something that any language is expected to handle as a matter of course. As for the LSH/RSH, you might want to be careful about using </> as symbols there. I don't know how you have structured your language but since those symbols are used for conditional expressions, will their presence as bitwise operators make things more difficult to parse?

Are you meaning to use the stack idea for the parsing of the expressions or do you mean the runtime code will use that mechanism to actually evaluate an expression?

I think a larger data type than CARD would also be important, say three bytes if you don't want to go to four. You already have a three byte pointer type; would it be that much more work to have a three byte CARD type as well?

Edited March 23, 2014 by Alfred

JoSch · March 24, 2014

I also think it's not a good idea to use >/< for shift operations. It will turn into a PL/2 like hell ;-)

Consider:

IF VAR > 2 > 8 THEN
...
FI

How should that expression get parsed?

How about using <</>> for shift?

Pab · March 24, 2014

Having slept on the matter, it occurs to me that 32-bit integers and cards might not be all that hard to implement after all. For add/subtract I just ROL/ROR through four bytes instead of two. I know there are multiplication routines out there for the 6502 that multiply two 16 bit numbers into a 32 bit product that I could cadge for this project, and I could probably find some division routines as well.
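
A sketch of what widening addition to four bytes amounts to, written in Python for illustration. The function name `add32` is hypothetical. It assumes little-endian byte order and models the usual 6502 pattern of adding one byte at a time while the carry flag links each byte to the next (on the 6502 that is ADC per byte; the rotate instructions use the carry the same way when shifting).

```python
def add32(a: bytes, b: bytes) -> tuple[bytes, int]:
    """Add two 4-byte little-endian values; return (sum bytes, final carry)."""
    carry = 0
    out = bytearray(4)
    for i in range(4):                 # low byte first, as on the 6502
        total = a[i] + b[i] + carry
        out[i] = total & 0xFF          # keep the low 8 bits
        carry = total >> 8             # carry into the next byte
    return bytes(out), carry
```

A 16-bit version is the same loop stopped after two bytes, which is why the extension is mostly a matter of repeating the per-byte step.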

Only drawback would be that we couldn't necessarily take the same cheap shortcut that Action did and use the floating point package for STR/VAL operations. We'd have to write fresh code for the RTL.

Speaking of floating point, I'm having second thoughts about natively supporting FLOAT variables. My code is littered with TODO statements saying "implement floating point here" and with longcards it might not be as important. If it came down to having longcards for 32-bit values with speed, or just using floating point if you need a variable to hold more than 16 bits of data, which would everyone rather have? Speed and values of up to about 4.3 billion, or decimals and larger values at a speed tradeoff?

flashjazzcat · March 24, 2014

32-bit integers. These are essential for writing any kind of sector editor these days, for a start. Can't copy an FP value directly into the DCB.

Pab · March 24, 2014

Since we're coming up on the fifth page of the topic and people might wander in late in the discussion, let me just review a few basics.

This discussion is about developing a new language for the Atari 8-bit platforms, based upon and developed out of Action. Originally I had called the language "ACUSOL" for Atari Computer Unified Symbolic Object Language, but some people objected to the name so that's now up in the air. There were some good suggestions and we can discuss/vote on it later.

I'm writing the initial compiler in Lazarus (Free Pascal) for Windows. I've since learned that this is the same language that "Effectus" was written in, although I wasn't aware of that project when I started. Free Pascal is cross-compilable, so versions of the compiler could be easily built for Mac, Linux, eCom, and other platforms.

While I've not released any source code at this point because the initial compiler is growing and changing fast as I work on it, once that milestone is achieved everything from that point on is going to be open source and public domain. I do not intend to be the only developer on this project, nor do I expect to have the final say on how the language grows and changes. Anyone and everyone who wants to have a hand in the thing is welcome to it.

The major additions to Action in this new language are Object Oriented Programming extensions, and the ability to address banked RAM easily through RTL routines and native in the compiler.

While a Runtime Library will be needed for many things, the RTL will be open source (as will everything in the project) and modular. Portions of the RTL can be easily swapped out to allow for cross-development for different situations. (For example, developing for the 5200 in the language.)

The purpose of the initial compiler that I'm writing at the moment is to be able to bootstrap a native Atari compiler. In other words, the native compiler for the Atari will be written in its own language, adapted from my Pascal original. Help on that part of the project would be welcome and appreciated.

Much of the language is fluid even at this stage, and opinions and suggestions are actively sought.

snicklin · March 25, 2014

What's wrong with the name Acusol?

It's only a typo for goodness sake...

... away from this:

http://www.echemist.co.uk/productmedia/anusol-cream.jpg

flashjazzcat · March 25, 2014

Well, I expect rapid, soothing relief from this compiler when it's done.
