Jump to content
IGNORED

GCC for the TI


insomnia

Recommended Posts

4 hours ago, khanivore said:

For example if addhi3 operands[2] is a subreg and it is not dead, then I think we have no choice but to move to an intermediate (r0), swap there, and then add.

That's fine with me. The goal is correct code -- we can worry about faster code later. As long as the code works it makes it safer to pull in code from other platforms.

 

If you are writing code and need top performance from C on the TI, just stop using chars and you're golden. ;)

 

  • Like 4
Link to comment
Share on other sites

On 2/16/2024 at 7:12 AM, khanivore said:

Thanks. I was missing that library.  Building now.  But has hit one of the assertions I didn't think could be hit, so need to do a little bit more work...

(edit: basically operands[2] to addhi3 has a missed extend from byte to int in _do_map_screen.  I had mistakenly assumed that could only happen to operands[1])

I've finally got back to testing the new GCC 1.30 patch and...  Well, I guess I don't need to report this :)

Link to comment
Share on other sites

On 2/14/2024 at 5:18 AM, JasonACT said:

I wonder why changing the function return type from unsigned char read_keyboard() to int makes a difference?

I've been bisecting my commits to find where this behaviour changed and it was when I added constraints to cmpqi.  For reasons I cannot explain it works if I delete the constraints (or replace with general 'g' so at least there is a length):
 

--- a/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.md
+++ b/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.md

@@ -332,11 +338,14 @@
 ;;
 ;; This is a good candidate for define_insn_and_rewrite but this doesn't exist yet
 ;; in gcc4.4.0
+;;
+;; For some unknwon reason, adding specific constraints here causes a reload error.
+;; Making a general constraint with a worst-case length of 6 bytes avoid the issue.
 
 (define_insn_and_split "cmpqi"
   [(set (cc0)
-       (compare (match_operand:QI 0 "nonimmediate_operand" "=rR>,Q,  rR>,Q")
-                (match_operand:QI 1 "general_operand"      "rR>,rR>,Q,  Q")))]
+       (compare (match_operand:QI 0 "nonimmediate_operand" "=g")
+                (match_operand:QI 1 "general_operand"      "g")))]
   ""
   "cb   %0, %1"
   "CONST_INT_P (operands[1])"
@@ -348,7 +357,7 @@
     int val = INTVAL (operands[1]) & 0xff;
     operands[1] = force_const_mem (QImode, GEN_INT (val));
   }
-  [(set_attr "length" "2,4,4,6")]
+  [(set_attr "length" "6")]
 )
 
 ;;-------------------------------------


 

  • Like 2
Link to comment
Share on other sites

12 hours ago, khanivore said:

I've been bisecting my commits to find where this behaviour changed and it was when I added constraints to cmpqi.  For reasons I cannot explain it works if I delete the constraints (or replace with general 'g' so at least there is a length):

That's good news, and the latest changes are building it now...  but while I hope I applied all the changed files properly, I'm now seeing this in Ghostbusters:

image.thumb.png.23e8a11148c6bd1c1363dfb4d9b179dd.png

The map screen isn't flashing the roamers anymore.  FYI: I'm using a modified set of source code files, to make each bank fit into 8KB here, because it's overflowing almost all the banks now.  I had made these changes a couple of months ago, changing functions to static or inline, depending on how much smaller each became, then I made a few small edits here and there for any remaining overflows.

 

Also, the car, when driving, with its initial move left via the joystick, seems to blink the graphics.  Probably the same issue though, everything else in the game is working properly, as far as I can tell.

Link to comment
Share on other sites

3 hours ago, JasonACT said:

I'm now seeing this in Ghostbusters:

Yeah that's the known issue with andhi3 and a byte mask.  It's fixed in head of 1.30 branch.  Planning to merge later today hopefully.  This is my output for the code above:

 

L3
        movb @gSaveIntCnt, r1
        sra  r1,8 ; andhi extend op1
        andi r1, >1


(I just noticed actually I can use a SWPB instead of SRA if the mask is a constant less than 0x100, saves a few cycles)

3 hours ago, JasonACT said:

Probably the same issue though, everything else in the game is working properly, as far as I can tell.

Could be.  I've added subreg/offset checks to add, and, sub for now so will hopefully be much improved after I get them merged.

 

  • Like 4
Link to comment
Share on other sites

@khanivore The latest 1.30 is better, but the PK counter is no longer increasing - I think it's this generated code, extending op1 when it shouldn't:

 

	def	increase_pk
increase_pk
	ai   r10, >FFFC
	mov  r14, @>2(r10)
	mov  r11, *r10
	mov  r1, r14
	li   r1, >1
	mov  r1, @>83D6
	mov  @game+22, r4
	jeq  L82
	dec  r4
	mov  r4, @game+22
	jmp  L83
L82
	mov  r4, @game+24
	mov  @game, r6
	mov  @game+2, r3
	li   r2, buildings+8
	mov  r2, r5
	inct r5
L87
	mov  *r2, r0
	jeq  L84
	inc  r4
	mov  *r5, r7
	jeq  L85
	dec  r7
	mov  r7, *r5
L85
	mov  *r2, r7
	dec  r7
	mov  r7, *r2
	jne  L84
	mov  r3, r8
	sra  r8,8 ; addhi extend op1
	ai   r8, >12C

 

Link to comment
Share on other sites

14 hours ago, JasonACT said:

I think it's this generated code, extending op1 when it shouldn't:

Yes, it sees a non zero offset of 2 and emits an extend:

 

; addhi3-71 : (insn 71 315 293 <stdin>:186 (set (reg:HI 8 r8 [orig:146+2 ] [146])
;         (plus:HI (reg:HI 8 r8 [orig:146+2 ] [146])                              
;             (const_int 300 [0x12c]))) 63 {addhi3} (nil))
                                                           
; addhi3, alt=4
; addhi3, op[1] off 2
        sra  r8,8 ; addhi extend op1
        ai   r8, >12C   

 

But 2 is an even number so we were good, no byte offset correction was needed.  It's an offset into a struct I think in this case.   I've seen it happen in arrays too.  I've changed the offset check to be if  % 2 == 1.  Need to add another unit test or two as well.

 

(edit: actually I don't think that's a sufficient test.  Value may have come an even address from a byte array for example.  I need to do some testing on original reg with structs and arrays.  Standby .....)

  • Like 2
Link to comment
Share on other sites

The offsets of +/- 2 actually related to the long (32-bit) add in increase_pk.  I've tidied up the offset checker function and added a mode param so it now verifies for +1/-1 QImode or HImode respectively.  I've removed the check of REG_EXPR.  I can't remember why I added it and I don't think it does what I want anyway.  Just comparing REGNO to ORIGINAL_REGNO seems to be enough to rule out offsets in registers that have already been extended or truncated.

 

I've run the tests I have and all seems to be working.  I tried building ghostbusters but got errors about section headers overlapping.  I guess that's the known bank size issue. 

 

Changes all pushed to the 1.30 branch.  If anyone has a chance to test it before I merge to main that would be great.  Thanks.

  • Like 2
Link to comment
Share on other sites

Pleased to report that all my code works:

 

- libti99all

- vgmcomp2 (although small changes were needed, I think these were actually bugs in the original code, nothing to do with the compiler)

- mario bros

- super space acer (I have one bug I didn't see before, but it's newer code and not tested well, might be my fault)

 

I recompiled the original libti99 and everything appeared to work there, but I didn't run deeper tests than the testlib app. puff() seems broken, but it has been for a while and is not necessarily the compiler's fault, I haven't looked at it.

 

I also didn't look at the previous pain points to see if the new code was clean, apologies. It's been a few hours already and I didn't plan for this today.

 

I did note that the library compatibility is not there - it cost me a bit of time to realize some of vgmcomp2 was linking the old libti99 and some was linking the new one. ;) So upgrading the compiler will mean rebuilding the libs. I think that's a fair cost for the improvements.

 

 

 

  • Like 6
Link to comment
Share on other sites

13 hours ago, khanivore said:

If anyone has a chance to test it before I merge to main that would be great.

Ghostbusters still has $0 showing (all the time) and the PK counter remains at 0.  I'm not sure why, the code looks ok, and I even started doubting myself (I.E. if I had edited the source code for tests previously) so I recompiled with the really old GCC 1.12A and it all works again.

  • Like 1
Link to comment
Share on other sites

@khanivore It looks like the compiler is setting up R0 with the value 10, hoping it doesn't change (which it does) so it can divide by 10 later:

 

	def	uint2str
uint2str
	clr  r7
	li   r2, >2710
	li   r6, strbuf
	mov  r7, r8
	li   r0, >A
L22
	mov  r1, r5
	clr  r4
	div  r2, r4
	mov  r4, r5
	mov  r1, r4
	clr  r3
	div  r2, r3
	mov  r4, r1
	mov  r5, r0
	jgt  L20
	mov  r8, r0
	jeq  L21
L20
	ai   r5, >30
	swpb r5
	movb r5, *r6+
	li   r8, >1
L21
	mov  r2, r4
	xor  r0, r4
	abs  r2
	mov  r2, r3
	seto r2
	jlt  $+>4
	clr  r2
	div  r0, r2
	inv  r4
	jlt  $+4
	neg  r2
	inc  r7
	ci   r7, >5
	jne  L22
	mov  r8, r0
	jne  L23
	movb @LC3, *r6+
L23
	clr  r1
	movb r1, *r6
	li   r1, strbuf
	b    *r11
	.size	uint2str, .-uint2str

 

  • Like 1
Link to comment
Share on other sites

2 hours ago, JasonACT said:

 It looks like the compiler is setting up R0 with the value 10, hoping it doesn't change (which it does) so it can divide by 10 later:

 

That's odd.  R0 should not be available to the allocator as it is used as a scratch reg by some insns.  I'll look into that.

  • Like 1
Link to comment
Share on other sites

7 hours ago, Tursi said:

I did note that the library compatibility is not there - it cost me a bit of time to realize some of vgmcomp2 was linking the old libti99 and some was linking the new one. ;) So upgrading the compiler will mean rebuilding the libs. I think that's a fair cost for the improvements.

What's changed that needs the libs rebuilt?  I thought we reverted those changes.  SP is back to R10 was the main one I think.

 

(edit: oh wait, I see, R0 was a general reg in 1.19 and so is used by some lib functions.  That explains the use in uint2str.  @JasonACT maybe gb needs a lib rebuilt?)


(edit2: reminder to self, test before posting, nvm I see R0 used in uint2str with that latest compiler too)

  • Haha 1
Link to comment
Share on other sites

7 hours ago, Tursi said:

puff() seems broken

I see these errors there:

/libti99.a(puff.o): In function `L19':
(.text+0xd2): undefined reference to `__ashlsi3'
./libti99.a(puff.o): In function `L18':
(.text+0xfe): undefined reference to `__ashrsi3'
./libti99.a(puff.o): In function `L18':
(.text+0x112): undefined reference to `__ashlsi3'
make: *** [Makefile:154: test] Error 1

 

It looks like gcc doesn't emit its own code to implement 32 bits shifts when optimising for size.  I'll need to add versions of these into libgcc.

Link to comment
Share on other sites

I can see R0 isn't actually reserved, but it should be:

 

--- a/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.h
+++ b/dev/gcc-4.4.0/gcc/config/tms9900/tms9900.h
@@ -289,7 +289,7 @@ extern short *reg_renumber; /* def in local_alloc.c */
 /* 1 for registers that have pervasive standard uses and are not available
  * for the register allocator.  */
 #define FIXED_REGISTERS \
-  {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0}
+  {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0}
 /* SC 1  2  3  4  5  6  7  8  BP SP LR CB AP 14 15*/
 

 

  • Confused 1
Link to comment
Share on other sites

10 hours ago, khanivore said:

What's changed that needs the libs rebuilt?  I thought we reverted those changes.  SP is back to R10 was the main one I think.

 

(edit: oh wait, I see, R0 was a general reg in 1.19 and so is used by some lib functions.  That explains the use in uint2str.  @JasonACT maybe gb needs a lib rebuilt?)


(edit2: reminder to self, test before posting, nvm I see R0 used in uint2str with that latest compiler too)

Register usage changed a bit, and was breaking the assumptions made about what registers needed to be saved. I don't recall the specifics, as once I saw it I figured it was expected.

 

 

  • Like 1
Link to comment
Share on other sites

10 hours ago, khanivore said:

I see these errors there:

/libti99.a(puff.o): In function `L19':
(.text+0xd2): undefined reference to `__ashlsi3'
./libti99.a(puff.o): In function `L18':
(.text+0xfe): undefined reference to `__ashrsi3'
./libti99.a(puff.o): In function `L18':
(.text+0x112): undefined reference to `__ashlsi3'
make: *** [Makefile:154: test] Error 1

 

It looks like gcc doesn't emit its own code to implement 32 bits shifts when optimising for size.  I'll need to add versions of these into libgcc.

That's weird, since it builds clean here. I wonder if my tests were invalid.

I copied my work folder to a new folder, deleted the build output folder, pulled down and switched to branch 1.30, and ran install from there.

 

Link to comment
Share on other sites

39 minutes ago, Tursi said:

That's weird, since it builds clean here. I wonder if my tests were invalid.

I copied my work folder to a new folder, deleted the build output folder, pulled down and switched to branch 1.30, and ran install from there.

 

Are you building with -O2 or -Os ?  I think I might have changed that in the libti99 Makefile.  I definitely saw calls to __ashlsi3 and friends in my unit tests too.

Link to comment
Share on other sites

10 hours ago, khanivore said:

I can see R0 isn't actually reserved, but it should be:

That's fixed it, I no longer see any issues in Ghostbusters in my special version, and a single overlap of 4 bytes in the real version, which should be easy to work around.

  • Like 3
Link to comment
Share on other sites

In bank0, main.c, you can change function detect_f18a() to be "static void detect_f18a()" which gets you back enough space to build properly.

 

Edit: I forgot to add, change the makefile to include " -fno-function-cse" at the end of the CCFLAGS

Edited by JasonACT
  • Like 1
Link to comment
Share on other sites

33 minutes ago, khanivore said:

Are you building with -O2 or -Os ?  I think I might have changed that in the libti99 Makefile.  I definitely saw calls to __ashlsi3 and friends in my unit tests too.

Looks like O2 (I think I'm going to change all my stuff to Os though).

 

So we can expect one more patch for the R0 issue? I'm surprised that didn't bite me :)

 

Link to comment
Share on other sites

1 hour ago, Tursi said:

Looks like O2 (I think I'm going to change all my stuff to Os though).

 

So we can expect one more patch for the R0 issue? I'm surprised that didn't bite me :)

 

Yeah, I’ll merge that tomorrow.  I’ll include the 32-bit shift lib funcs too. 

  • Like 1
Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...