Improving TI BASIC performance


3 hours ago, OLD CS1 said:

I went back to see the disappointment with my own eyes.  I missed the entire post with the XML modification.  This will not work in hardware, but works great in Classic99.  Still an interesting project, even while lacking application to real iron without modification.

I will see if I have time today to add this stuff to the StrangeCart. This is all doable from the cartridge port without changes to the real iron.


21 hours ago, senior_falcon said:

Since we have a disassembly of the BASIC groms in INTERN, I believe that it should be possible to convert every gpl instruction to an assembly equivalent. Such an approach should produce a Basic interpreter that is totally compatible with the original gpl based interpreter, but considerably faster. It would have to go somewhere, probably in bank switched pages of ram in the cartridge. Certain things such as the editor would not have to be converted to assembly, although that would be nice.

I may not have been clear in what I was saying.

First off, let's start with an observation about GPL: It is not exactly a performance enhancer.

The GPL instruction DST @>8324,@>836E is simple and elegant. It requires 3 bytes, but takes 55 assembly language instructions to execute.

The assembly equivalent MOV @>836E,@>8324 requires 6 bytes and 1 assembly instruction.

 

What I was proposing was to exactly duplicate the GPL code with an assembly equivalent. For example, here are some lines from Basic starting at >260B, with GPL on the left and the assembly equivalent on the right:

GPL                     Assembly
ST   @>8330,@>8373      MOVB @>8373,@>8330
CLR  @>8311             MOVB @HX00,@>8311
CALL GROM @>282C        BL   @CALL
                        DATA >282C
BS   GROM >265B         JEQ  G265B
CEQ  @>8342,>2C         CB   @>8342,@HX2C
BR   GROM @>2624        JNE  G2624
ST   @>8342,>B3         MOVB @HXB3,@>8342
CALL GROM @>2850        BL   @CALL
                        DATA >2850
INC  @>8311             AB   @HX01,@>8311

 

I have no idea what Basic is doing with these instructions, but that doesn't matter. In a very short time I created assembly equivalents that will work exactly the same.

 

 

 


3 hours ago, senior_falcon said:

I may not have been clear in what I was saying.

First off, let's start with an observation about GPL: It is not exactly a performance enhancer.

The GPL instruction DST @>8324,@>836E is simple and elegant. It requires 3 bytes, but takes 55 assembly language instructions to execute.

The assembly equivalent MOV @>836E,@>8324 requires 6 bytes and 1 assembly instruction.

 

What I was proposing was to exactly duplicate the GPL code with an assembly equivalent. For example, here are some lines from Basic starting at >260B, with GPL on the left and the assembly equivalent on the right:

GPL                     Assembly
ST   @>8330,@>8373      MOVB @>8373,@>8330
CLR  @>8311             MOVB @HX00,@>8311
CALL GROM @>282C        BL   @CALL
                        DATA >282C
BS   GROM >265B         JEQ  G265B
CEQ  @>8342,>2C         CB   @>8342,@HX2C
BR   GROM @>2624        JNE  G2624
ST   @>8342,>B3         MOVB @HXB3,@>8342
CALL GROM @>2850        BL   @CALL
                        DATA >2850
INC  @>8311             AB   @HX01,@>8311

 

I have no idea what Basic is doing with these instructions, but that doesn't matter. In a very short time I created assembly equivalents that will work exactly the same.

 

 

 

CEQ @>8342,>2C     CB @>8342,@HX2C     Is this not CI >2C,@>8342 ?

Where is this HX2C? Are you not wasting a byte somewhere to store a >2C?

ST @>8342,>B3     MOVB @HXB3,@>8342     Is this not LI >B3,@>8342 ?

Again, where is HXB3, and did you not waste a byte in memory to use it?

I ask because I have been doing something like this with RXB 2022.

 

Also I give everyone a leg up on me by including all my source code for any project I do.


10 minutes ago, RXB said:

CEQ @>8342,>2C     CB @>8342,@HX2C     Is this not CI >2C,@>8342 ?

Where is this HX2C? Are you not wasting a byte somewhere to store a >2C?

ST @>8342,>B3     MOVB @HXB3,@>8342     Is this not LI >B3,@>8342 ?

Again, where is HXB3, and did you not waste a byte in memory to use it?

   . . .

 

Definitely not.  CI and LI are not byte instructions and they only operate on registers.

 

...lee


10 minutes ago, Lee Stewart said:

 

Definitely not.  CI and LI are not byte instructions and they only operate on registers.

 

...lee

Yeah, you would need a byte swap; I messed up and forgot that.

But still, where are the >2C and >B3 stored?

 


10 hours ago, speccery said:

I don't think we have a C++ compiler for the TMS9900

Insomnia's GCC for the TI has C++ support, although I don't recall how complete the runtime is nor do I recall which version of C++ it supports.

 

One of the example projects (hello_cpp.tar.gz) in the following post is a C++ project:

 

 


1 hour ago, chue said:

Insomnia's GCC for the TI has C++ support, although I don't recall how complete the runtime is nor do I recall which version of C++ it supports.

 

One of the example projects (hello_cpp.tar.gz) in the following post is a C++ project:

 

 

Thanks @chue, I didn't remember there was C++ support too. I recently compiled the toolchain for macOS and posted some messages in the same thread about how I did it. I had issues compiling it for my Apple silicon Mac but got a (seemingly) working build for Intel. I haven't used it much yet, and I haven't tested C++ compilation for the TMS9900.


10 hours ago, apersson850 said:

 

Now I see you are considering the same issue. What you want is actually the same thing as the p-system does, but you want a line-by-line compilation, rather than compiling the whole program in one fell swoop. If I understood correctly?

 

Yes. Or rather, I'm suggesting that such an approach would be low-hanging fruit in the design of a new BASIC interpreter. Your point about bounds checking at run-time is a good one. Personally, I'd have a switch allowing bounds checking to be turned on and off, allowing a well-running program to benefit from further performance gains. One reason for Forth's speed is its lack of bounds checking, or indeed any other sanity checking. You're completely free to screw your program (and the system) up in any way you desire. This is possibly a step too far for BASIC systems.
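To make the on/off switch concrete, here is a minimal Python sketch of a toy interpreter memory. Everything in it (the `TinyBasicMemory` class, the `check_bounds` flag, the error text) is made up for illustration and is not taken from any real BASIC. It only shows the trade-off: with the check on, a bad subscript is caught; with it off, the store silently lands in the neighbouring array.

```python
class TinyBasicMemory:
    """Flat memory holding arrays back to back, as a toy interpreter might."""

    def __init__(self, check_bounds=True):
        self.check_bounds = check_bounds   # the hypothetical on/off switch
        self.cells = []                    # one flat list of numeric cells
        self.arrays = {}                   # name -> (base offset, length)

    def dim(self, name, length):
        self.arrays[name] = (len(self.cells), length)
        self.cells.extend([0] * length)

    def _address(self, name, index):
        base, length = self.arrays[name]
        # This test is the per-access cost that the switch removes.
        if self.check_bounds and not 0 <= index < length:
            raise IndexError(f"BAD SUBSCRIPT in {name}({index})")
        return base + index

    def store(self, name, index, value):
        self.cells[self._address(name, index)] = value

    def load(self, name, index):
        return self.cells[self._address(name, index)]


# Checking on: the out-of-range store is rejected.
safe = TinyBasicMemory(check_bounds=True)
safe.dim("A", 10)
safe.dim("B", 10)
try:
    safe.store("A", 10, 99)
    caught = False
except IndexError:
    caught = True

# Checking off: the same store quietly overwrites B(0).
fast = TinyBasicMemory(check_bounds=False)
fast.dim("A", 10)
fast.dim("B", 10)
fast.store("A", 10, 99)
clobbered = fast.load("B", 0)
print(caught, clobbered)
```

The unchecked version is faster only by the cost of one comparison per access, which is why it makes sense as an option for programs already known to behave.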


4 hours ago, RXB said:

Yeah, you would need a byte swap; I messed up and forgot that.

But still, where are the >2C and >B3 stored?

Not only would you need to swap bytes, you would also have to mask off the other byte, which CI includes in the comparison, to keep it from affecting the test. CB is simpler, even allowing for the fact that you need to store the values somewhere.

It doesn't matter exactly where the constants are, as long as they are somewhere, with a label, so the assembler can find them.


1 hour ago, Willsy said:

Your point about bounds checking at run-time is a good one. Personally, I'd have a switch allowing bounds checking to be turned on and off, allowing a well-running program to benefit from further performance gains.

With the p-system you can. There's a compiler directive to turn range checking on/off. You can turn it off as you like, so you can keep checking if you suspect some part of your program is less robust, and turn it off in the rest.

You can do the same with IO-checking. If you turn it off, then you have to check the ioresult right after an IO-operation, to see if it worked or not. If it didn't, and you don't check, then all sorts of things can go wrong.

But it allows you to just ask for a file name, attempt to open it and then ask again, if the open command fails, instead of having the program interrupted by a file error.
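The ask, try, ask again pattern looks roughly like this in Python. This is only an analogy: Python has no `ioresult`, so catching `OSError` stands in for testing the result after the open. The `open_with_retry` function and its `ask_name` callback are names invented for this sketch.

```python
import os
import tempfile


def open_with_retry(ask_name, max_tries=3):
    """Keep asking for a file name until one opens, instead of aborting.

    ask_name is a hypothetical callback that returns the next name to try.
    """
    for _ in range(max_tries):
        name = ask_name()
        try:
            return open(name, "r")
        except OSError as err:
            # Equivalent of checking the I/O result and carrying on.
            print(f"cannot open {name!r} ({err}); asking again")
    return None


# Demo: the first answer does not exist, the second one does.
fd, good_name = tempfile.mkstemp()
os.close(fd)
answers = iter([good_name + ".missing", good_name])
handle = open_with_retry(lambda: next(answers))
opened_name = handle.name
handle.close()
os.remove(good_name)
```

The program stays in control of the error path, which is the point being made above: a failed open becomes a normal branch in the code and does not stop the program with a file error.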


On 6/11/2022 at 1:25 AM, OLD CS1 said:

How about making CHAR faster?

It looks like that can be done and would be around 2.5x faster. (CHAR is pretty slow in Basic)

(EDIT) Looks to be a little over 3x faster. It still needs to check to be sure characters are only "0123456789ABCDEF"
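For reference, the validity check mentioned here is small. The Python sketch below is not the routine being discussed, just an illustration of the logic: it accepts only the characters "0123456789ABCDEF" and turns the string into pattern bytes. It assumes a single 8-byte character definition of at most 16 hex digits, with short strings padded with zeros; the function name is made up.

```python
HEX_DIGITS = "0123456789ABCDEF"


def parse_char_pattern(pattern):
    """Convert a CALL CHAR style hex string into 8 pattern bytes."""
    if len(pattern) > 16:
        raise ValueError("pattern longer than 16 hex digits")
    for ch in pattern:
        if ch not in HEX_DIGITS:
            raise ValueError(f"bad hex digit {ch!r}")
    padded = pattern.ljust(16, "0")   # assumption: short patterns are zero-filled
    return bytes(int(padded[i:i + 2], 16) for i in range(0, 16, 2))


box = parse_char_pattern("FF818181818181FF")
print(box.hex().upper())
```

In assembly the same test is typically a range check or a table lookup per character, so it should cost only a small part of the speed gain.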

Edited by senior_falcon

2 hours ago, senior_falcon said:

It looks like that can be done and would be around 2.5x faster. (CHAR is pretty slow in Basic)

(EDIT) Looks to be a little over 3x faster. It still needs to check to be sure characters are only "0123456789ABCDEF"

Not an insignificant increase in speed.  Very nice!


19 hours ago, apersson850 said:

Not only would you need to swap bytes, you would also have to mask off the other byte, which CI includes in the comparison, to keep it from affecting the test. CB is simpler, even allowing for the fact that you need to store the values somewhere.

It doesn't matter exactly where the constants are, as long as they are somewhere, with a label, so the assembler can find them.

We do not have a computer with vast gigabytes of memory to waste.

Unless you are using ROM in a FinalGROM, or SAMS, memory is really hard to come by.


Hopefully there aren't too many constants of different values. "Ugly" code in tight environments sometimes uses values that coincidentally happen to be correct, although they are really part of something else.

But code executed directly by the CPU is frequently less compact than interpreted byte-code, so it's likely there will be a memory issue anyway.


This could reside in a bank switched cartridge. With up to 512K available, there is no shortage of space. 12K of BASIC groms would probably be no more than 24K of assembly, so that's only 3 banks.
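The arithmetic behind the bank estimate, as a quick Python check. The 8K bank size is the cartridge ROM window at >6000 to >7FFF; the 2x growth factor is the assumption from the post (12K of GROM becoming at most 24K of assembly), and the sketch ignores any space needed for bank-switching glue or shared routines.

```python
import math

BANK_BYTES = 8 * 1024      # cartridge ROM window >6000->7FFF
CART_BYTES = 512 * 1024    # cartridge capacity quoted above
GROM_BYTES = 12 * 1024     # size of the BASIC GROMs, per the post
EXPANSION = 2              # assumed worst-case growth, GPL -> assembly

asm_bytes = GROM_BYTES * EXPANSION
banks_needed = math.ceil(asm_bytes / BANK_BYTES)
banks_available = CART_BYTES // BANK_BYTES

print(f"{asm_bytes // 1024}K of assembly -> {banks_needed} of {banks_available} banks")
```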

@RXB It's kind of comical to see someone with such a rudimentary knowledge of assembly try to tutor the rest of us.


The value also has to be somewhere: either in a byte at some address, or as an immediate inline with the code. If it's elsewhere, we need two bytes for the address and one for the constant. If it's inline, we need two bytes for the immediate, as the TMS 9900 can only handle word-sized immediates. This in turn implies that if we want to use only half of the immediate, we need to handle the other half in some way, to keep it from interfering.

Say we want to compare R3 to >12. The instruction CI  R3,>1200 has zeroes in the uninteresting half, but we can't tell whether the left byte of R3 is equal to >12 without first zeroing the right byte of that register. We can do that with an additional ANDI  R3,>FF00, but that uses another four bytes.

If we now want to compare >12 to a byte somewhere else in memory, it's even more tricky to use an immediate, since an immediate can only work with registers. Another four bytes to fetch the value first.

 

So doing a CB  R3,@HX12 (four bytes) or maybe a CB @SOMEWHERE,@HX12 (six bytes) is preferred for all reasons at almost all times.
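The argument can be checked with a small Python model. This is not an emulator; the three helper functions are invented for the sketch and model only the comparison each instruction makes (CI compares all 16 bits, byte instructions on a register use its most significant byte). The sizes are the instruction lengths quoted in the post.

```python
def ci_equal(reg, imm):
    """CI Rn,imm: compares the full 16-bit register with a 16-bit immediate."""
    return (reg & 0xFFFF) == (imm & 0xFFFF)


def andi(reg, mask):
    """ANDI Rn,mask: returns the masked register value."""
    return reg & mask & 0xFFFF


def cb_equal_reg(reg, const_byte):
    """CB Rn,@CONST: a byte operation on a register uses its high byte."""
    return ((reg >> 8) & 0xFF) == (const_byte & 0xFF)


R3 = 0x12FF                                   # left byte is >12, right byte is junk
naive = ci_equal(R3, 0x1200)                  # junk in the right byte spoils the test
masked = ci_equal(andi(R3, 0xFF00), 0x1200)   # works, but R3's right byte is destroyed
bytewise = cb_equal_reg(R3, 0x12)             # works and leaves R3 alone

# Code size in bytes for the register case.
SIZE = {"CI": 4, "ANDI": 4, "CB R,@": 4, "BYTE": 1}
ci_route = SIZE["ANDI"] + SIZE["CI"]
cb_route = SIZE["CB R,@"] + SIZE["BYTE"]
print(naive, masked, bytewise, ci_route, cb_route)
```

Note that the byte holding the constant is paid for once. Every further CB against the same label costs only the instruction itself.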


17 hours ago, senior_falcon said:

This could reside in a bank switched cartridge. With up to 512K available, there is no shortage of space. 12K of BASIC groms would probably be no more than 24K of assembly, so that's only 3 banks.

That's probably right. I wonder if we could write a compiler which could take care of the conversion automatically or at least semi automatically, to have a draft which could be manually patched.
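As a rough idea of what such a draft generator could look like, here is a Python sketch that handles only the handful of GPL forms shown in senior_falcon's table earlier in the thread. The rewrite rules are copied from that table; the `translate` function itself, the `G` label prefix handling and the TODO comment format are assumptions of this sketch. It also collects every byte immediate into a labelled constant pool, which answers the question of where HX2C and HXB3 live. Anything it does not recognise is passed through as a comment for manual patching, and it does not check whether JEQ/JNE targets are within jump range.

```python
def translate(gpl_lines):
    """Rewrite a few GPL forms into TMS9900 assembly plus a byte-constant pool."""
    asm, pool = [], set()

    def const(value):
        pool.add(value)                      # remember the byte for the pool
        return f"@HX{value:02X}"

    def source(op):
        # ">2C" is an immediate byte; "@>8342" is a memory operand.
        return const(int(op[1:], 16)) if op.startswith(">") else op

    for line in gpl_lines:
        op, _, rest = line.strip().partition(" ")
        args = [a.strip() for a in rest.replace("GROM", "").split(",")]
        target = args[0].lstrip("@>")        # bare hex address for labels/DATA
        if op == "ST":
            asm.append(f"MOVB {source(args[1])},{args[0]}")
        elif op == "DST":                    # memory source only in this sketch
            asm.append(f"MOV {args[1]},{args[0]}")
        elif op == "CLR":
            asm.append(f"MOVB {const(0)},{args[0]}")
        elif op == "INC":
            asm.append(f"AB {const(1)},{args[0]}")
        elif op == "CEQ":
            asm.append(f"CB {args[0]},{source(args[1])}")
        elif op == "CALL":
            asm += ["BL @CALL", f"DATA >{target}"]
        elif op == "BS":
            asm.append(f"JEQ G{target}")
        elif op == "BR":
            asm.append(f"JNE G{target}")
        else:
            asm.append(f"* TODO by hand: {line.strip()}")
    constants = [f"HX{v:02X} BYTE >{v:02X}" for v in sorted(pool)]
    return asm, constants


sample = [
    "ST @>8330,@>8373",
    "CLR @>8311",
    "CALL GROM @>282C",
    "BS GROM >265B",
    "CEQ @>8342,>2C",
    "BR GROM @>2624",
    "ST @>8342,>B3",
    "CALL GROM @>2850",
    "INC @>8311",
]
asm, constants = translate(sample)
print("\n".join(asm + constants))
```

A real converter would have to cover the full GPL instruction set and addressing modes (VDP, indexed, indirect), so treat this as a shape for the tool, not a start on it.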


22 hours ago, senior_falcon said:

This could reside in a bank switched cartridge. With up to 512K available, there is no shortage of space. 12K of BASIC groms would probably be no more than 24K of assembly, so that's only 3 banks.

@RXB It's kind of comical to see someone with such a rudimentary knowledge of assembly try to tutor the rest of us.

You go directly to insults like you are some kind of god on the mountain.

Granted you are better at assembly, but that does not give you privilege to keep attacking me.

Just quit being a dick about it. That is arrogance.

 

No reason to behave like this and you are smart enough to know better.


21 hours ago, apersson850 said:

The value also has to be somewhere: either in a byte at some address, or as an immediate inline with the code. If it's elsewhere, we need two bytes for the address and one for the constant. If it's inline, we need two bytes for the immediate, as the TMS 9900 can only handle word-sized immediates. This in turn implies that if we want to use only half of the immediate, we need to handle the other half in some way, to keep it from interfering.

Say we want to compare R3 to >12. The instruction CI  R3,>1200 has zeroes in the uninteresting half, but we can't tell whether the left byte of R3 is equal to >12 without first zeroing the right byte of that register. We can do that with an additional ANDI  R3,>FF00, but that uses another four bytes.

If we now want to compare >12 to a byte somewhere else in memory, it's even more tricky to use an immediate, since an immediate can only work with registers. Another four bytes to fetch the value first.

 

So doing a CB  R3,@HX12 (four bytes) or maybe a CB @SOMEWHERE,@HX12 (six bytes) is preferred for all reasons at almost all times.

What Lee Stewart and I did with the RXB ROM was to use scratchpad for almost everything.

From XB GPL we load values such as row, column, character and a repetition count (or 1) into scratchpad.

Then the assembly executes using GPL registers 0 to 10, so everything runs from scratchpad.

Yes, the code is run from ROM 3, but XB already has 2 other ROMs made by Texas Instruments, and that seems to have sold well.

Anyway, thanks again to Lee Stewart for his help and for the idea of using GPL registers and only scratchpad for execution.


Seems good, but it's a different issue. That's for accessing values frequently used. At least it should be, or there's no point. But if you only want to do an occasional check if a byte is larger than 32, for example, then a CB @BYTETOCHECK,@DD32, where DD32 contains the value 32, is the most efficient. This is true also if the byte to check is in a register. Then you do CB R2,@DD32.

If you repeatedly want to compare against 32, then you should do a MOVB @DD32,R1 first, and then compare CB R2,R1.

In any case, the value 32 has to come from somewhere.
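A quick Python tally shows where the break-even lies for the register case. It counts code bytes only, using the standard instruction lengths (one opcode word plus one word per symbolic address), and assumes R1 is free to hold the constant. The one byte for DD32 itself is needed either way, so it is left out. The function names are made up for this sketch.

```python
CB_REG_MEM = 4     # CB R2,@DD32   : opcode word + one address word
CB_REG_REG = 2     # CB R2,R1      : opcode word only
MOVB_MEM_REG = 4   # MOVB @DD32,R1 : opcode word + one address word


def bytes_constant_in_memory(n):
    """n compares, each fetching the constant from memory."""
    return n * CB_REG_MEM


def bytes_constant_in_register(n):
    """One MOVB to load the constant, then n register-to-register compares."""
    return MOVB_MEM_REG + n * CB_REG_REG


table = [(n, bytes_constant_in_memory(n), bytes_constant_in_register(n))
         for n in (1, 2, 3, 10)]
for n, in_memory, in_register in table:
    print(f"{n:2d} compares: memory {in_memory:2d} bytes, register {in_register:2d} bytes")
```

By this count a single compare is cheaper against memory, two compares come out even, and from three onwards loading the constant into a register first wins on size.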


1 hour ago, RXB said:

You go directly to insults like you are some kind of god on the mountain.

Granted you are better at assembly, but that does not give you privilege to keep attacking me.

Just quit being a dick about it. That is arrogance.

I'm sorry and I stand corrected. On the other hand, I think most people would detect more than a trace of arrogance in these responses of yours. Perhaps you could follow your own advice?

 

I am a GPL programmer working on XB Source code of GPL and Assembly for 30 years now and you are going to quote the XB manual that has many errors in it?

 

LOL sorry you are just wrong.

 

Clearly you know very little about GPL or XB internal workings.

 

LOL you want to argue with me about GPL Code?

 

LOL no you are way off.

 

 

Edited by senior_falcon

54 minutes ago, senior_falcon said:

I'm sorry and I stand corrected. On the other hand, I think most people would detect more than a trace of arrogance in these responses of yours. Perhaps you could follow your own advice?

 

I am a GPL programmer working on XB Source code of GPL and Assembly for 30 years now and you are going to quote the XB manual that has many errors in it?

 

LOL sorry you are just wrong.

 

Clearly you know very little about GPL or XB internal workings.

 

LOL you want to argue with me about GPL Code?

 

LOL no you are way off.

 

 

Really, do I need to go back and pull up all your quotes?

And the numerous insults too?

How about we just stay civil and not hurl them at all? No debate works if you take it personally; the point is the subject, not personal attacks.

Just stick to the topic, please. That is what I am doing.


Let me try to bring this thread back to the original topic, which I find interesting. This message is itself somewhat off-topic, I guess, but bear with me; personally I found this amusing.

 

Background: I have never used an actual PEB, and I have only once seen one in the flesh (at a TI get-together in Rome organised by @ti99iuc). In Europe PEBs were hard to come by, and the 12 or 13 year old me certainly could not have afforded one. That is a long way of saying that I have no real experience using a TI with a genuine TI disk drive. So when I play with the real iron, I use my ET-PEB for SAMS memory expansion, disk drive emulation and so on. I designed it before, or around the same time as, the TIPI started to exist. And I have only had a real Mini Memory module for a few years, thanks to @kl99.

 

Anyway it turns out I can't properly use my own stuff:

 

In the very beginning of this thread @senior_falcon listed the first three steps in the process: CALL INIT, CALL LOAD("DSK2.MMXB.OBJ"), CALL LINK("WTGROM"). I tried these in Classic99 and it all worked. But when I switched to the real iron, the CALL LOAD naturally uses a disk drive, and in my case this is the ET-PEB, and it did not work. This is not a complete surprise, since the DSR and the microcontroller firmware reading the files off the SD card are my own design. But I have used Editor/Assembler in the past and those things worked fine. So what gives? After some testing and head scratching I realised (or remembered) that my SD card FIAD disk system expects to find files in the TIFILES format, but my copy of the MMXB.OBJ file on the SD card wasn't in TIFILES format. The ET-PEB gave me some cryptic error message I wrote a few years ago, but this really was not helpful. Anyway, once I realised that the problem was the file being in the wrong format, it turned out that the xdt99 suite already had a program to turn the file into TIFILES format:

xdm99.py --format DIS/FIX80 -T MMXB.obj

This produced the file mmxb.tfi, which my ET-PEB was able to process and load! Also CALL LINK("WTGROM") worked, which was a bit of a surprise since the MiniMemory version I am using is the one I modified for the StrangeCart.
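For anyone hitting the same wall, a quick check on the PC side would have pointed at the problem. A TIFILES file starts with a 128-byte header whose first eight bytes are >07 followed by the text "TIFILES". The Python sketch below tests for that signature; the function names and the sample bytes are made up for illustration, and it does not validate the rest of the header.

```python
TIFILES_MAGIC = b"\x07TIFILES"   # first 8 bytes of a TIFILES header
HEADER_SIZE = 128


def has_tifiles_header(data):
    """True if the bytes begin with a TIFILES signature and a full header."""
    return len(data) >= HEADER_SIZE and data[:8] == TIFILES_MAGIC


def file_has_tifiles_header(path):
    """Same check, reading only the header bytes from a file."""
    with open(path, "rb") as f:
        return has_tifiles_header(f.read(HEADER_SIZE))


# Made-up sample data: a wrapped file and a bare one without a header.
wrapped = TIFILES_MAGIC + bytes(HEADER_SIZE - 8) + b"payload"
bare = b"plain object text without any header" * 4
print(has_tifiles_header(wrapped), has_tifiles_header(bare))
```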

 

So with all of that out of the way, I can next start to work on an update to the StrangeCart, so that it can start to support GRAM functionality on the system GROM area, and after that this whole thing should run on the real iron.

