Jump to content
IGNORED

GCC for the TI


insomnia

Recommended Posts

I have merged the 1.30 branch to main.  I found some issues in the lib1funcs implementations such as not saving some scratch regs.  I'm still seeing some failures in corner cases, such as unoptimised arithmetic shift of a 32-bit value by a count of 16 or more, but I'm releasing anyway as these could take some time to work through.  Release notes for 1.30 are:

gcc patch 1.30
* Pass constants as wides to force_const_mem to avoid assert in combine.c:do_SUBST
* Added calls to correct byte order on all byte and word arith and move
* Changed inline debug to dump entire insn not just operands
* Removed wrongly associative constraints on subtract
* Added separate reg constraints to addhi3, andhi3, subhi3 to allow longer lengths for subreg offset fixes
* Added more unit tests
* Removed constraints in movqi - causes assert in reload
* Added 32 bit shift operations
* Marked R0 as fixed so allocator won't use it
* Added reg saves in lib1funcs as some regs were being trampled
 

  • Like 6
  • Thanks 2
Link to comment
Share on other sites

So with that, to answer the earlier question: yes, libti99 is certified on the 1.30 patches. I've updated my readme files to note that.

 

Hopefully have libti99all out to replace it in another month or two... depends on work.

 

  • Like 3
Link to comment
Share on other sites

I'm not having as much luck with 1.30, compiling my stuff. 

 

I have a lot of literal C strings that are fairly long in my ForceCommand help. 

For example, the literal string here passed to 'wraptext' function:

 

    wraptext("==Run XB program==\n\n"
      "xb <program-path>\n\n"
      "Switch a FinalGROM99 to an Extended BASIC cartridge, configured to RUN the specified program.\n"
      "Default cartridge name is TIXB_G with start address 25474.\n"
      "Use variable XBMOD to override cartridge name, and XBADDR to override start address.\n");

 

Notice below in the '.s' file, I get a rogue quote character on a line all by its lonesome (about half way down).

 

LC3
        text '==Run XB program=='
        byte 10
        byte 10
        text 'xb <program-path>'
        byte 10
        byte 10
        text 'Switch a FinalGROM99 to an Ex'
        text 'tended BASIC cartridge, configured to RUN the specified program.'
        byte 10
'
        text 'Default cartridge name is TIXB_G with start address 25474.'
        byte 10
        text 'Use va'
        text 'riable XBMOD to override cartridge name, and XBADDR to override '
        text 'start address.'
        byte 10
        byte 0

 

I have a lot of these sort of strings in https://github.com/jedimatt42/fcmd/blob/main/b14_moreHelp.c

 

This is the only occurrence I have seen in the .s, and it is the first of the conjoined string literals. It does not occur in the subsequent literals from the same source file. And this is not the largest of those literal strings either.

 

After commenting it (bulk of b14_moreHelp.c) out, it happens in one other case, my next file full of builtin help text.. But, not the first one (in that file)... in this case the 10th... I'll have to eat some dinner and then see if I can spot what the 2 failing occurrences have in common.

Edited by jedimatt42
clarifications
Link to comment
Share on other sites

3 hours ago, jedimatt42 said:

Notice below in the '.s' file, I get a rogue quote character on a line all by its lonesome (about half way down).

I see it.  It looks like the fix I made to limit strings to 64 bytes has a bugged corner case.  If the 63rd byte is non printable (\n in this case) then it closes the already-closed string and resets the count resulting in an extra closing single quote.  It should check if in_text before closing the string when the length is reached.
 

diff --git a/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.c b/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.c
index d7b9fb5..42a13e0 100644
--- a/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.c
+++ b/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.c
@@ -720,7 +720,7 @@ void tms9900_output_ascii(FILE* stream, const char* ptr, int len)
       if (ISPRINT(c))
       {
          /* End TEXT statement */
-         if (count==64)
+         if (in_text && count==64)
          {
             fprintf (stream, "'\n");
             in_text = 0;

 

  • Like 5
  • Thanks 2
Link to comment
Share on other sites

3 hours ago, jedimatt42 said:

Damn, everything compiled... now I have to test it all. Thanks @khanivore 

Built fine for me too ... once I had paths set up to xdt99 and so on.

But it doesn't get far, unfortunately.  It bombed almost immediately with a bad opcode after trying to execute BL *R12 when R12 contained >6000.  I'm thinking the compiler had a function address stored in R12 but it got trampled by your trampoline or other inline assembly?

I built with -fno-function-cse and it got further and printed some junk on the screen, but that was about it.

Link to comment
Share on other sites

:) I have some inline assembly that needs to declare the clobbers properly. I plan to compare the .s files from old compiler, and the new. 

 

I do a lot of weird things, like type casting my trampoline to the function I want to call so the compiler sets up all the arguments while I then go to a different function, screw with the stack, and then go to the target bank and function.

 

I observed with -Os vs -O2 that the small version of some parts were significantly bigger than the O2 version. This is probably bugs like missing volatiles on my end. Or it could be benign.. we'll see. But at least none of my nearly full banks overflowed. 

  • Like 1
Link to comment
Share on other sites

On a new program I ran into a weird one... I'm getting:

 

/mnt/d/work/libti99ALL/vdp.h:98: error: ‘asm’ operand requires impossible reload
make: *** [Makefile:69: tarot.o] Error 1

 

This is on VDP_SET_ADDRESS_WRITE, which is:

 

inline void VDP_SET_ADDRESS_WRITE(unsigned int x) { __asm__  volatile ( "swpb %0\n\tmovb %0,@>8c02\n\tswpb %0\n\tori %0,>4000\n\tmovb %0,@>8c02\n\tandi %0,>3fff" : : "r"(x) : "cc"); }

 

Sorry for the single line formatting. Now the function it's failing in calls this four times, but only the third instance causes the error:

 

Spoiler

 

 

I don't see anything weird in the assembly output - it seems to have built it. The third instance is the only one that uses R1, but why would that matter?

 

Spoiler

 

 

It's getting late so no minimal code at the moment... what does that error actually mean?

 

 

Link to comment
Share on other sites

1 hour ago, Tursi said:

It's getting late so no minimal code at the moment... what does that error actually mean?

I don't know tbh.  The comment in gcc/reload1.c just says:

              /* If this was an ASM, make sure that all the reload insns
                 we have generated are valid.  If not, give an error
                 and delete them.  */

but without reproducing it, I can't see what "reload insns" it means.

Do you need to declare the condition code clobber?  I think that would be implied.  Not sure if related though.

Link to comment
Share on other sites

13 hours ago, khanivore said:

I don't know tbh.  The comment in gcc/reload1.c just says:

              /* If this was an ASM, make sure that all the reload insns
                 we have generated are valid.  If not, give an error
                 and delete them.  */

but without reproducing it, I can't see what "reload insns" it means.

Do you need to declare the condition code clobber?  I think that would be implied.  Not sure if related though.

I never saw any difference whether I did or didn't, but as I thought I was learning more I was trying to be more diligent. ;)

 

Removing the "cc" doesn't help the error. It's strange that of three instances in the same function, that's the only one, and it's even in the middle - not first or last. Hell, it's even a copy of the first block.

 

Lovely to see even the gcc wiki don't really seem to know... ;)

https://gcc.gnu.org/wiki/reload

Some forum posts suggest this code was unchanged even quite a while before this version, I saw the same block in a snippet from 3.3, supposedly. So lots of people discussing it, some even with insight. Most of the examples are very complicated, though... I don't see much that leads to resolution. It's odd that it's never happened before and the macro, to me, seems about as simple as you can get.

 

I tried a number of tricks.. it had the same fault with O2, with a return variable, as a #define (well, I was desperate). It's not out of registers, that's all I could determine. When I replaced it with C code, it happily found two registers to use for the operation:

 

VDPWA=vdpdst&0xff;
VDPWA=(vdpdst>>8)|0x40;

	mov  r9, r3
	swpb r3
	movb r3, @>8C02
	mov  r9, r2
	sra  r2, >8
	ori  r2, >40
	swpb r2
	movb r2, @>8C02

 

So not ideal, but it works. I also did a read through of the patch changes, but everything made sense enough to me... what little sense I have ;) My code runs, for now, I don't know there's much else here but to use the workaround.

 

  • Like 2
Link to comment
Share on other sites

6 hours ago, Tursi said:

My code runs, for now, I don't know there's much else here but to use the workaround.

Yeah, 'fraid so.  I'm glad you did find a workaround so probably best to just wait and see does a pattern emerge in time to help us reproduce it.

  • Like 2
Link to comment
Share on other sites

42 minutes ago, khanivore said:

Yeah, 'fraid so.  I'm glad you did find a workaround so probably best to just wait and see does a pattern emerge in time to help us reproduce it.

Workarounds are not new here... all my GCC code has at least one. Just glad you're been willing to invest so many hours to get things more stable. I'll take a compiler 'crash' over incorrect code anyday - far faster to debug. ;)

 

  • Like 5
  • Thanks 3
Link to comment
Share on other sites

22 hours ago, khanivore said:

wait and see does a pattern emerge in time to help us reproduce it.

// tms9900-gcc -S -Os -std=c99 test.c

int gColor;

inline void VDP_SET_ADDRESS_WRITE(unsigned int x) {
    __asm__ volatile ("swpb %0\n\tmovb %0,@>8c02\n\tswpb %0\n\tori  %0,>4000\n\tmovb %0,@>8c02\n\tandi %0,>3fff" : : "r"(x) : "cc");
}

void vdpmemread (int, unsigned char *, int);

unsigned char test () {
    for (int row=0; row<2; ++row) {
        for (int col=4; col<6; ++col) {
            unsigned char buf[1];
            unsigned char buf2[1];
            int vdpsrc=0;
            int vdpdst=0;
            vdpmemread(vdpdst, buf2, 1);

            vdpdst=row*2 + col*2 + gColor;
            vdpmemread(vdpsrc, buf, 1);
            vdpmemread(vdpdst, buf2, 1);

            VDP_SET_ADDRESS_WRITE(vdpdst);
        }
    }
}

 

Your blog is surprisingly complete.

  • Like 5
Link to comment
Share on other sites

@khanivore This isn't anything you've done, it also happens on patch 1.12A (the special build we made for Ghostbusters from many years ago).  It also doesn't happen on GCC 6.3.0 for Windows/x86 which is the native GCC I'm using to build the much older 4.4.0 TI GCC.

  • Like 4
Link to comment
Share on other sites

3 hours ago, JasonACT said:
// tms9900-gcc -S -Os -std=c99 test.c

Very good.  That does reproduce it.  I googled and other is some chatter about this error in the GCC mailing lists so it could be a compiler bug?  I built using my last gcc13 build and it doesn't appear, so it should go away once we upgrade.

I seems to be to do with the "r" constraint and for some reason reload doesn't feel able to alloc a new reg here.  If I change it to general "g" it does compile but the output is useless:

        swpb @>2(r10)
        movb @>2(r10),@>8c02
        swpb @>2(r10)
        ori @>2(r10),>4000
        movb @>2(r10),@>8c02
        andi @>2(r10),>3fff

 

Another simple workaround would be to use r0 as a scratch which would also obviate the final andi.

4 hours ago, JasonACT said:

Your blog is surprisingly complete.

Cheers mate!  I'll try to keep it up to date as we find out new stuff.

  • Like 3
Link to comment
Share on other sites

34 minutes ago, khanivore said:

Another simple workaround would be to use r0 as a scratch which would also obviate the final andi.

Ah, that's a pretty good idea. I could just do that instead. :)

 

  • Like 3
Link to comment
Share on other sites

Just to make sure I understand the rules: r0, r10, and r11 are the only 'reserved' registers.

Consequently, declaring them in clobbers is pointless. No harm, no help.

 

Does r0's usage ever leak ?  Such that I need to preserve it in my __asm__ blocks? I see it is often used as an index into the stack without being preserved at the beginning of a function... 

The only other usages I observe are replacements of ci register, 0 with mov register, r0 

 

It looks like 'no', r0 is never read from, except during the sequence of stashing things onto the stack. Can this assumption be confirmed?

 

--- 

Aside, preserving r12 in ForceCommand's trampoline for cartridge banking calls fixed fundamentals in my programming model, still very broken, but it gets the banner and chime out before crashing in the user prompt loop

Link to comment
Share on other sites

5 hours ago, jedimatt42 said:

Does r0's usage ever leak ?  Such that I need to preserve it in my __asm__ blocks? I see it is often used as an index into the stack without being preserved at the beginning of a function... 

The only other usages I observe are replacements of ci register, 0 with mov register, r0 

 

It looks like 'no', r0 is never read from, except during the sequence of stashing things onto the stack. Can this assumption be confirmed?

I can only talk about the R0 question, as I don't know the answers to the others, but yes R0 has been taken out of the register pool and may be used as a scratch register without preserving its value.  It's actually used for variable shifts too, not just the stack, but again its value is never needed once the shift is completed.

  • Like 2
  • Thanks 1
Link to comment
Share on other sites

11 hours ago, jedimatt42 said:

Just to make sure I understand the rules: r0, r10, and r11 are the only 'reserved' registers.

Consequently, declaring them in clobbers is pointless. No harm, no help.

As @JasonACT said, R0 is assumed to be a clobber, no need to declare it.  No insns expect R0 to have a persistent value.


R10 is Stack pointer, so must be preserved (easiest option) or declared clobber (though this might confuse gcc because it wouldn't be able to save it if it has no free regs).

 

R11 is already clobbered on entry to a function so no need to preserve it.  Unless you are an inline, in which case, yes, do declare it as clobbered or save it.

 

R9 is the frame pointer.  It should also be preserved or declared clobbered.  It isn't always used, especially if build is optimised, and never if you use -fomit-frame-pointer, but better to preserve or delcare clobbered.

 

You can also clobber as many regs as you have params (up to 8).  So if your function has 5 params, these are in R1-R5 and can be clobbered since they are passed by value (edit: I can think you safely use any reg R1-R8 without preserving).

 

Any other regs should be preserved or clobbered in case they are in use by the calling function.

Edited by khanivore
  • Like 3
  • Thanks 1
Link to comment
Share on other sites

A follow up on register allocation ...  tl;dr "preserving regs creates smaller object code".

Using R12-R15 as a test case, if we mark a reg as 0 in the CALL_USED_REGISTERS this means that reg must be "preserved across function call boundaries".  Then the function prologue generates code like this:

        ai   r10, >FFF4
        mov  r10, r0
        mov  r11, *r0+
        mov  r9, *r0+
        mov  r12, *r0+
        mov  r13, *r0+
        mov  r14, *r0+
        mov  r15, *r0

 

and a function can safely call any other functions it wants without saving any regs, since it knows the called function will save them.  OTOH if we don't mark them as preserved then the prologue is much shorter, and instead declares a stack frame .... 

        ai   r10, >FFF0
        mov  r9, @>E(r10)
        mov  r11, @>C(r10)

 

... because it must save everything it is using into the frame before calling another function like this:

        mov  r3, @>2(r10)
        mov  r4, @>4(r10)
        mov  r5, @>6(r10)
        mov  r6, @>8(r10)
        mov  r13, @>A(r10)
        bl   @f2

 

I see two issues with this: 1) the code to save the regs on the stack frame is not as efficient since it using labels instead of register indirect; 2) the caller doesn't know what the callee may use so must be conservative and save everything.

 

So it seems to me we would be better marking more regs as preserved and the onus is on each function to save the regs they use.  This shouldn't affect inline assembly that declares clobbers.  Also I think 8 regs for function params is excessive.  If we reduce to 4 and mark R5-R8 as general available, but to be preserved, it seems we can further reduce output code size.

 

A counter argument is that leaf functions have to save what they use if regs are preserved but don't need to save anything if not.  So the code for leaf functions does get bigger.  But I built libti99 with and without preserved regs though and overall the code is smaller with preserved regs.

 

 

 

  • Like 2
Link to comment
Share on other sites

Outsider curiosity question. 

 

Do you need any special incantations to declare a leaf sub-routine? 

Perhaps the compiler simply looks for any calls within the sub-routine and modifies the entry and exit code?

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...