TheMole Posted December 6, 2023

Just to be sure, I tried the following:

Running the compiler without input -> works fine, complains about no input (obviously)
Trying to compile a non-existent file -> works fine, complains that the file couldn't be opened
Trying to compile an empty file -> segfaults

I tried all of these without flags and with -O2 on the off chance that a specific codepath was broken, but the results were exactly the same. So, I don't think it makes much sense for me to PM you a file. However, on the off chance that it is helpful anyway, this is the file I tried compiling on my first attempt: https://raw.githubusercontent.com/themole-ti/ghostbusters/main/bank0/crt0.c
+khanivore Posted December 6, 2023

3 hours ago, TheMole said:
> Just to be sure, I tried the following:
> Running the compiler without input -> works fine, complains about no input (obviously)
> Trying to compile a non-existent file -> works fine, complains that the file couldn't be opened
> Trying to compile an empty file -> segfaults
> I tried all of these without flags and with -O2 on the off chance that a specific codepath was broken, but the results were exactly the same. So, I don't think it makes much sense for me to PM you a file. However, on the off chance that it is helpful anyway, this is the file I tried compiling on my first attempt: https://raw.githubusercontent.com/themole-ti/ghostbusters/main/bank0/crt0.c

Compiles fine for me, though I had to comment out the include of tramplines.h.

Is there any way to efficiently distribute binaries? deb, VM image, docker, tarball? But I guess if your CPU is arm then my x86-64 bins are no good anyway? (I'm not even going to think about how you would build a tms9900 target compiler for arm host on an x86 build host 🙂)
Tursi Posted December 6, 2023

> (I'm not even going to think about how you would build a tms9900 target compiler for arm host on an x86 build host 🙂)

Canadian Cross!! (No, I haven't done one for over 15 years. Anything I knew is obsolete.)

Linux exes are a pain in the ass. They hard code dependencies in the executable so there's no guarantee they'll run on another system even if it is the same cpu. That said, a lot of people use Docker because it's easier to ship an entire environment than to fix the problem that needed it in the first place. It's not the worst thing I've ever used. That would still be Lotus Notes. Docker does cause some conflicts on Windows with WSL2 that made me have to roll back to WSL1 on my old PC, I don't know if they have fixed those.

I might be able to help a little with the segfault... MAYBE. I isolated two segfaults in my Super Space Acer code that I worked around with the old compiler.

One was a very long string constant - anything longer than 1k would fail to compile. (ie: const char x[] = "hello world for 1024+1 bytes...."; ) It'd be nice if this was extended, not sure where such a limit exists, but IMO there's no harm filling a whole bank with a single string. (That'd be 8k, but my string is a bit under 2k ). Of course, I worked around it by splitting the string, though that does insert an unwanted NUL in the middle.

I thought that the other was related to dividing by a signed char, but stepping back through my git history, I can't reproduce.

Running the install.sh under Ubuntu WSL, it seemed to finish binutils fine (which, frankly, I don't think I managed before). GCC warned me about needing GMP 4.1+ and MPFR 2.3.2+, and I told the machine to MPFR itself, but that didn't work. It's very possible I didn't build on this installation before. I installed those libs and ran again.
It looked like binutils and gcc itself built okay, but I had some issues with libiberty and libgcc:

    make[3]: Entering directory '/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/libiberty/testsuite'
    make[3]: Nothing to be done for 'install'.
    make[3]: Leaving directory '/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/libiberty/testsuite'
    make[2]: Leaving directory '/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/libiberty'
    /bin/bash: line 3: cd: tms9900/libstdc++-v3: No such file or directory
    make[1]: *** [Makefile:10624: install-target-libstdc++-v3] Error 1
    make[1]: Leaving directory '/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build'
    make: *** [Makefile:2476: install] Error 2
    ...
    /home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/./gcc/xgcc -B/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/./gcc/ -B/home/tursilion/newtms9900-gcc/newgcc9900/tms9900/bin/ -B/home/tursilion/newtms9900-gcc/newgcc9900/tms9900/lib/ -isystem /home/tursilion/newtms9900-gcc/newgcc9900/tms9900/include -isystem /home/tursilion/newtms9900-gcc/newgcc9900/tms9900/sys-include -g -O2 -O2 -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wcast-qual -Wold-style-definition -isystem ./include -g -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -Dinhibit_libc -I. -I. -I../.././gcc -I../../../libgcc -I../../../libgcc/. -I../../../libgcc/../gcc -I../../../libgcc/../include -DHAVE_CC_TLS -o _ffsdi2.o -MT _ffsdi2.o -MD -MP -MF _ffsdi2.dep -DL_ffsdi2 -c ../../../libgcc/../gcc/libgcc2.c \
    ../../../libgcc/../gcc/libgcc2.c: In function ‘__ffsdi2’:
    ../../../libgcc/../gcc/libgcc2.c:547: error: unrecognizable insn:
    (insn 101 100 102 22 ../../../libgcc/../gcc/libgcc2.c:545 (set (subreg:HI (reg:SI 21 [ prephitmp.34 ]) 2) (const_int 65535 [0xffff])) -1 (nil))
    ../../../libgcc/../gcc/libgcc2.c:547: internal compiler error: in extract_insn, at recog.c:2048
    Please submit a full bug report, with preprocessed source if appropriate.
    See <http://gcc.gnu.org/bugs.html> for instructions.
    make[1]: *** [Makefile:359: _ffsdi2.o] Error 1
    make[1]: Leaving directory '/home/tursilion/newtms9900-gcc/build/gcc-4.4.0/build/tms9900/libgcc'
    make: *** [Makefile:11552: all-target-libgcc] Error 2
    === Failed to build libgcc.a ===

I seem to remember seeing those before, so probably not new. I never cared in the past, and sure enough the executables I care about are installed where I need them.

It's late and I shouldn't be up, but I did a quick test on libti99 (which is actually a new version that merges the Coleco and TI versions - I'll release it properly soon) and Super Space Acer (which builds and runs on the old compiler, too, so no bug fix checking).

With this build TESTLIB fails right after asking whether to enable F18A tests. It ends up setting an illegal graphics mode. The code is clearly running, I can watch the heat map and see it scanning the keyboard and running the graphics tests. EXAMPLE fails as well, similar symptoms.

Everything's right except VDP register 1, which is returned from SET_BITMAP_RAW as an unsigned char (it was an INT in the other libti99, for exactly this reason, but it was working in the old compiler - an assertion I will double check before I close this email )

It was probably enough to just look at vdp_setgraphics.c - it shows the bug. The call looks like this:

    void set_graphics(unsigned char sprite_mode) {
        unsigned char x = set_graphics_raw(sprite_mode);
        VDP_SET_REGISTER(VDP_REG_MODE1, x);
        VDP_REG1_KSCAN_MIRROR = x;
    }

    unsigned char set_graphics_raw(unsigned char sprite_mode) {
        vdpchar = vdpchar_default;
        scrn_scroll = scrn_scroll_default;
        unsigned char unblank = VDP_MODE1_16K | VDP_MODE1_UNBLANK | VDP_MODE1_INT | sprite_mode;
        (.. do a bunch of register setup which is all correct .. )
        return unblank;    <-- we lose it here. For some reason, instead of calculating the value above (which is E0 + sprite_mode (0) in this case), it does a SETO R1
    }

I'll attach the two files for your review in case it helps. I'll also attach the assembly for the old version of the compiler, which seems to work.

vdp_setgraphics_bug.zip
+khanivore Posted December 6, 2023

1 hour ago, Tursi said:
> ../../../libgcc/../gcc/libgcc2.c:547: error: unrecognizable insn:
> (insn 101 100 102 22 ../../../libgcc/../gcc/libgcc2.c:545 (set (subreg:HI (reg:SI 21 [ prephitmp.34 ]) 2) (const_int 65535 [0xffff])) -1 (nil))
> ../../../libgcc/../gcc/libgcc2.c:547: internal compiler error: in extract_insn, at recog.c:2048

This looks like the same issue I saw when initialising a long. It says it can't find an insn to set a HI (16-bit) from a const in the expansion of an SI (32-bit) init, even though "movhi" is right there. I'll keep looking into this one. (edit: just noticed, this is building libgcc, which is broken anyway for now, but not used unless using floats)

1 hour ago, Tursi said:
> I'll attach the two files for your review in case it helps. I'll also attach the assembly for the old version of the compiler, which seems to work.

Very good, thanks, I'll look into this as well.
TheMole Posted December 6, 2023

Update: I fetched the latest from the main branch, and while it still errors out, it doesn't segfault anymore... maybe the error message itself is helpful:

    [CC] bank0/crt0.c...
    bank0/crt0.c:1: internal compiler error: in subreg_highpart_offset, at emit-rtl.c:1304
    Please submit a full bug report, with preprocessed source if appropriate.
    See <http://gcc.gnu.org/bugs.html> for instructions.
    make: *** [bank0/crt0.o] Error 1

3 hours ago, khanivore said:
> Compiles fine for me. Though I had to comment out include tramplines.h Is there any way to efficiently distribute binaries? deb, VM image, docker, tarball? But I guess if your CPU is arm then my x86-64 bins are no good anyway? (I'm not even going to think about how you would build a tms9900 target compiler for arm host on an x86 build host 🙂)

I think it's something macos (and perhaps arm) specific, so don't worry too much about it. I don't want this to get in the way of the progress you're making (have I said thank you for picking up where Insomnia left off already?)! Maybe we can get back to it once you're comfortable with the stability on Linux...
+khanivore Posted December 6, 2023

Oops, I missed a byte shift. The value should be in the MSB before the &. Defining DEBUG in tms9900.c dumps the insn expansions into the .s file.

In tms9900.md:1404 this:

    val = INTVAL(operands[2]) & 0xFF00;

was causing:

    ; iorqi3-28
    ; OP0 : (reg:QI 1 r1)code=[reg:QI]
    ; OP1 : (reg:QI 1 r1)code=[reg:QI]
    ; OP2 : (const_int -32 [0xffffffffffffffe0])code=[const_int:VOID]
    ; iorqi3 intval=FFFFFFE0 val=FF00
        seto r1

but should be:

    ; iorqi3-28
    ; OP0 : (reg:QI 1 r1)code=[reg:QI]
    ; OP1 : (reg:QI 1 r1)code=[reg:QI]
    ; OP2 : (const_int -32 [0xffffffffffffffe0])code=[const_int:VOID]
    ; iorqi3 intval=FFFFFFE0 val=E000
        ori r1, >E000

Unit test added.
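For anyone following along, the masking slip above can be reproduced on a host machine. This is only a sketch, not the code in tms9900.md: the function names are made up for illustration, and a plain `long` stands in for whatever `INTVAL()` returns. It shows why the sign-extended constant -32 (>E0 as a byte) turns into >FF00 without the shift and >E000 with it.

```c
#include <stdint.h>

/* Hypothetical helper: the mask as computed before the fix. The
 * sign-extended constant is masked without first moving the byte
 * into the high half of the 16-bit word. */
uint16_t byte_imm_before_fix(long intval)
{
    return (uint16_t)((unsigned long)intval & 0xFF00UL);
}

/* Hypothetical helper: the mask with the byte shifted into the MSB
 * first. The cast to unsigned avoids left-shifting a negative value,
 * which is undefined behaviour in C. */
uint16_t byte_imm_after_fix(long intval)
{
    return (uint16_t)(((unsigned long)intval << 8) & 0xFF00UL);
}
```

With `intval = -32` the first function returns 0xFF00 (matching `val=FF00` in the debug dump) and the second returns 0xE000 (matching `val=E000`).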
+khanivore Posted December 6, 2023

Oops2, also in iorqi3, this:

    output_asm_insn("socb %3, %0", operands);

should be:

    output_asm_insn("socb %2, %0", operands);

as there is no op3.
+khanivore Posted December 6, 2023

4 hours ago, Tursi said:
> One was a very long string constant - anything longer than 1k would fail to compile. (ie: const char x[] = "hello world for 1024+1 bytes...."; ) It'd be nice if this was extended, not sure where such a limit exists, but IMO there's no harm filling a whole bank with a single string. (That'd be 8k, but my string is a bit under 2k ). Of course, I worked around it by splitting the string, though that does insert an unwanted NUL in the middle.

I'm not seeing this one, init of a string of 1029 chars works fine for me. Maybe related to bss init in crt0 or other? (Correction: while it compiles ok, it crashes when run in the emulator. Looks like maybe an assembler limitation, most likely binutils-2.19.1/gas/config/tc-tms9900.c:566)
+khanivore Posted December 6, 2023

1 hour ago, TheMole said:
> Update: I fetched the latest from the main branch, and while it still errors out, it doesn't segfault anymore... maybe the error message itself is helpful:

Unfortunately, I still can't reproduce that one. I even tried a clean checkout in case something on my branch fixed it, but I get no errors at all here (aside from the known ones in libgcc).

1 hour ago, TheMole said:
> I think it's something macos (and perhaps arm) specific, so don't worry too much about it. I don't want this to get in the way of the progress you're making (have I said thank you for picking up where Insomnia left off already?)! Maybe we can get back to it once you're comfortable with the stability on Linux...

Thanks! No problem, I'm happy to help and learn something new.
Asmusr Posted December 6, 2023

Imagine how much easier it would be if xdt99 had a built-in C compiler.
Tursi Posted December 6, 2023

5 hours ago, khanivore said:
> I'm not seeing this one, init of a string of 1029 chars works fine for me. Maybe related to bss init in crt0 or other? (correction, while it compiles ok, it crashes when run in the emulator, looks like maybe an assembler limitation, most likely binutils-2.19.1/gas/config/tc-tms9900.c:566)

Nope, it was definitely the compiler that crashed. I wasn't anywhere near running code yet by that point. Try 2k... it might not have been exactly 1k. I worked around it, but I was surprised by it. Also not terribly important. There are other ways to put large amounts of data in there.
Tursi Posted December 6, 2023

6 hours ago, khanivore said:
> Oops, I missed a byte shift. value should be in MSB before &. Defining DEBUG in tms9900.c dumps the insn expansions into the .s file. In tms9900.md:1404 this:

Ah, that makes sense. I was trying to figure out why the compiler came up with >FFFF for that sequence - sign extension happened. Good catch!
+chue Posted December 6, 2023

I will just put out another data point here. I have about 40 unit tests that I've written in the past that test various things: my own TI code, @Tursi's libTi99, as well as gcc tms9900 compiler output.

Almost all of my unit tests pass with yesterday's release (patch gcc-4.4.0-tms9900-1.23.patch). I saw a couple of issues during testing: the first is background/foreground colors not being set as expected, and the second is unexpected output on one of my unit tests. I still need to investigate, but thought I'd post what I've found so far.

The GCC version I built with is 13.2.1, and I am running Fedora Linux 39 x64.

Great work so far with the GCC updates, @khanivore!
+khanivore Posted December 6, 2023

38 minutes ago, Tursi said:
> Nope, it was definitely the compiler that crashed. I wasn't anywhere near running code yet by that point. Try 2k... it might not have been exactly 1k. I worked around it, but I was surprised by it. Also not terribly important. There are other ways to put large amounts of data in there.

Ok, have it now. I was using cc1 for my tests but if I use gcc -c I see the error. Looks like the tms9900.c file just tries to put all the text in one block where other backends split it across multiple lines. Should be an easy fix.
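To illustrate what "split it across multiple lines" could look like, here is a rough host-side sketch. It is not the tms9900.c code and not necessarily how the fix was done: the function name and chunk parameter are hypothetical, and `.ascii` is used as a generic GNU assembler directive, which may differ from what the TMS9900 port actually emits. The idea is simply that no single output line grows with the length of the string.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: write `len` bytes of `data` to `out` as a series
 * of `.ascii` directives, at most `chunk` source bytes per line.
 * Returns the number of directive lines written. */
size_t emit_ascii_chunks(FILE *out, const unsigned char *data,
                         size_t len, size_t chunk)
{
    size_t lines = 0;
    size_t i = 0;

    if (chunk == 0)
        return 0;

    while (i < len) {
        /* End index of this chunk, clamped to the end of the data. */
        size_t end = (len - i > chunk) ? i + chunk : len;

        fputs("\t.ascii \"", out);
        for (; i < end; i++) {
            unsigned char c = data[i];
            if (c == '"' || c == '\\')
                fprintf(out, "\\%c", c);          /* escape quote/backslash */
            else if (c >= 0x20 && c < 0x7f)
                fputc(c, out);                    /* printable as-is */
            else
                fprintf(out, "\\%03o", (unsigned)c); /* octal escape */
        }
        fputs("\"\n", out);
        lines++;
    }
    return lines;
}
```

For example, an 11-byte string with a chunk size of 4 comes out as three directive lines (4 + 4 + 3 bytes), with no NUL inserted between them.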
mrvan Posted December 6, 2023

It’s nice to see this thread moving gcc forward again. 🙂
+khanivore Posted December 6, 2023

I've pushed another update to main, patch 1.24, to fix the issues we found today.
Tursi Posted December 7, 2023

Built and tested again. Libti99ALL works now (there are a couple of bugs but I don't know if they are my side or gcc - but it's the same as the old compiler.)

However, Super Space Acer fails - coming up corrupted and then crashing. It will take me longer to dig into it to see where it's failing as it's much, much more complex, but as far as the title page /runs/, the title page is just corrupted. So I have a good idea where to debug, early in the startup.
Tursi Posted December 7, 2023

In this case, the bug was in my RLE unpack function. At a point where it's supposed to mask off a bit in a byte value, it instead zeros it. The rest of the function looks correct.

    void RLEUnpack(unsigned int p, const unsigned char *buf, unsigned int nMax) {
        unsigned char z;
        int cnt;

        // looks like the boss pack code has some bugs and packs too many bytes, we need this
        cnt = nMax;
        VDP_SET_ADDRESS_WRITE(p);
        while (cnt > 0) {
            z=*buf;
            if (z&0x80) {
                // run of byte
                buf++;
                z&=0x7f;
                raw_vdpmemset(*buf, z);
                buf++;
            } else {
                // sequence of data
                buf++;
                raw_vdpmemcpy(buf, z);
                buf+=z;
            }
            cnt-=z;
        }
    }

This generates this asm. Our inputs are R1=>0000, R2=>6D72, R3=>1800. The first bytes at >6D72 are 8B 00 01 04 B7 00 01 10

            def  RLEUnpack
    RLEUnpack
            ai   r10, >FFF6           Stack setup
            mov  r10, r0
            mov  r11, *r0+
            mov  r9, *r0+
            mov  r13, *r0+
            mov  r14, *r0+
            mov  r15, *r0
            mov  r2, r14              save CPU address to R14
            mov  r3, r15              save byte count to R15
            mov  r1, r2               move VDP address to R2
            sla  r2, 8                get LSB of address
            movb r2, @>8C02           write to VDP address
            ori  r1, >4000            merge command bits to VDP address for write
            srl  r1, >8               unnecessary (in this case) clear of LSB
            sla  r1, 8
            movb r1, @>8C02           write command and high byte to VDP address
            jmp  L163                 jump into loop terminator, it'll come back to L166
    L166
            movb *r14+, r13           get first byte into R13 (8D) and increment (optimization!)
            jgt  L164                 jump if it's positive
            jeq  L164                 or zero
    * handler for run of byte (>80 bit set)
            clr  r13                  ** Bug? We are supposed to do z&=0x7f to remove the >80 bit <<-----
            movb r13, r2              copy the result into r2 for the call (count)
            srl  r2, 8                make it a byte
            movb *r14+, r1            get the data byte we need into R1 (and increment)
            li   r3, raw_vdpmemset    address of the function to call
            bl   *r3                  call it (why not use immediate?)
            jmp  L165                 jump down to wrap up this token
    * handler for sequence of data (>80 bit clear)
    L164
            movb r13, r9              copy the (now a count) into R9 (R9 temp is unneeded)
            srl  r9, 8                make byte count into a word
            mov  r9, r2               copy the word into R2 for the function call
            mov  r14, r1              copy the data address into R1 for the function call
            li   r3, raw_vdpmemcpy    address of the function to call
            bl   *r3                  call it (again, immediate?)
            a    r9, r14              add the count to the source address
    L165
            srl  r13, 8               make byte count into a word
            s    r13, r15             subtract it from the total count
    L163
            mov  r15, r15             check remaining count
            jgt  L166                 if still positive, loop around
            mov  *r10+, r11           else, restore stack and return
            mov  *r10+, r9
            mov  *r10+, r13
            mov  *r10+, r14
            mov  *r10+, r15
            b    *r11
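As a cross-check of what the function is meant to do, the same token format can be unpacked into an ordinary memory buffer on a host machine. This is a sketch under assumptions, not the game's code: `rle_unpack_to_buffer`, `sample_tokens`, and `sample_out` are illustrative names, `memset`/`memcpy` stand in for `raw_vdpmemset`/`raw_vdpmemcpy`, and only the eight sample bytes quoted in the post are used (they expand to 68 bytes, so 68 is used as the limit here instead of the real >1800).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The first eight packed bytes quoted in the post. */
const uint8_t sample_tokens[8] = {
    0x8B, 0x00,   /* run: 0x8B & 0x7F = 11 copies of 0x00 */
    0x01, 0x04,   /* sequence: 1 literal byte, 0x04       */
    0xB7, 0x00,   /* run: 0xB7 & 0x7F = 55 copies of 0x00 */
    0x01, 0x10    /* sequence: 1 literal byte, 0x10       */
};

/* Output buffer with slack, because the last token may overshoot n_max
 * by up to 127 bytes, just as in the original loop. */
uint8_t sample_out[68 + 128];

/* Hypothetical host-side version of the unpack loop. Returns the number
 * of bytes written to dst. */
size_t rle_unpack_to_buffer(uint8_t *dst, const uint8_t *src, int n_max)
{
    size_t written = 0;
    int cnt = n_max;

    while (cnt > 0) {
        uint8_t z = *src++;
        if (z & 0x80) {
            /* Run of one byte: this mask is the step the compiler
             * replaced with a clear. */
            z &= 0x7f;
            memset(dst + written, *src++, z);
        } else {
            /* Sequence of z literal bytes. */
            memcpy(dst + written, src, z);
            src += z;
        }
        written += z;
        cnt -= z;
    }
    return written;
}
```

Calling `rle_unpack_to_buffer(sample_out, sample_tokens, 68)` writes 11 + 1 + 55 + 1 = 68 bytes, with the literal 0x04 at offset 11 and 0x10 at offset 67. If the mask step produced zero instead, the first run would have length 0 and `cnt` would not decrease for that token, which is consistent with the corrupted title page.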
+TheBF Posted December 7, 2023

It's really cool to see the compiler output. How hard will it be to convince the compiler to use:

    BL @raw_vdpmemcpy
Tursi Posted December 7, 2023

4 hours ago, TheBF said:
> It's really cool to see the compiler output. How hard will it be to convince the compiler to use: BL @raw_vdpmemcpy

In cases where the call is made more than once, loading it in a register is good for performance, but I don't know if there's enough information to decide which way is quicker.

There are a few other places where it could be more optimal - like loading the address to VDP would be quicker to just OR the >4000 into the original value and use SWPB. But, the compiler is also trying to account for every possible case, and sometimes that leads to slightly less optimal code. Overall I'm usually pretty impressed by the GCC output.
+khanivore Posted December 7, 2023

7 hours ago, Tursi said:
> In this case, the bug was in my RLE unpack function. At a point where it's supposed to mask off a bit in a byte value, it instead zeros it. The rest of the function looks correct.

Oops again, same bug in AND as was in OR. Missing left shift in tms9900.md:1326. Should be:

    val = (INTVAL(operands[2]) << 8) & 0xFF00;

My test missed it because this code path is only executed for an immediate. I'll add another test.

Regarding all the bit shifts, one thing I think I can do is define "strict" and "nonstrict" versions of byte extensions. Nonstrict means we don't care about the low byte in a reg if it is only ever used for byte ops, so we could use SWPB, which should be faster than SRL. Though in the unnecessary case above, the shift actually comes from the VDP_SET_ADDRESS_WRITE macro:

    VDPWA=(((x|0x4000)>>8));

The compiler doesn't know why you are shifting right, but it knows it needs to shift it left to do a MOVB.

Another possible improvement: I'm thinking that saving R13, R14, R15 on every function call is excessive as we never emit BLWP. I could make R15 the SP and R14 the BP to make R1-R10 general regs and reduce stack/mem use.
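The effect on the RLE code can be modelled on a host in a few lines. This is a sketch only: the function names and the `shifted` flag are invented for illustration, a `long` stands in for the value `INTVAL()` yields, and the register is modelled as a 16-bit word with the byte in its high half, as MOVB leaves it. It presumably explains the `clr r13`: without the shift, the mask for `z &= 0x7f` works out to zero.

```c
#include <stdint.h>

/* Hypothetical helper: immediate operand for a byte-wide AND.
 * shifted != 0 applies the fix (byte moved into the MSB first). */
uint16_t and_byte_immediate(long intval, int shifted)
{
    unsigned long v = (unsigned long)intval;
    if (shifted)
        v <<= 8;
    return (uint16_t)(v & 0xFF00UL);
}

/* Models "z &= mask" on a byte held in the high half of a register. */
uint8_t masked_count(uint8_t token, long mask, int shifted)
{
    uint16_t reg = (uint16_t)((uint16_t)token << 8); /* like MOVB *r14+, r13 */
    reg &= and_byte_immediate(mask, shifted);        /* the AND immediate    */
    return (uint8_t)(reg >> 8);                      /* like SRL r13, 8      */
}
```

For the first token in the sample data, `masked_count(0x8B, 0x7F, 0)` gives 0 (the mask is 0x0000, so everything is cleared), while `masked_count(0x8B, 0x7F, 1)` gives 0x0B, the expected run length of 11.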
+khanivore Posted December 7, 2023

7 hours ago, TheBF said:
> It's really cool to see the compiler output. How hard will it be to convince the compiler to use: BL @raw_vdpmemcpy

In fact it does, but not when using the optimiser flags. It thinks it is faster not to. I'll have to look into why it thinks that.
Tursi Posted December 7, 2023

10 hours ago, khanivore said:
> Though in the unnecessary case above, the shift actually comes from the VDP_SET_ADDRESS_WRITE macro VDPWA=(((x|0x4000)>>8)); The compiler doesn't know why you are shifting right, but it knows it needs to shift it left to do a MOVB. Another possible improvement, I'm thinking that saving R13,R14,R15 on every function call is excessive as we never emit BLWP. I could make R15 the SP and R14 the BP to make R1-R10 general regs and reduce stack/mem use.

Ahh, interesting. Some 8-bit compilers, like SDCC for the Z80, recognize that sequence as accessing a single byte of a temporary and directly reach for it instead of shifting. I have no idea how they detect that though.

I'd say you're right on R13-R15. If we save that on every function call, removing that could be a big win.
Tursi Posted December 7, 2023

I suppose I could optimize my macro using pointers to save those shifts. That wouldn't hurt my feelings too much and would be worth the cycles.
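One way to read "using pointers" is to pick the high byte of the command word out of memory instead of shifting it down. The sketch below is a guess at that idea, not the libti99 macro: all names are hypothetical, and the endianness probe exists only so the sketch gives the same answer on a little-endian host; on the big-endian TMS9900 the high byte is simply the first byte of the word. Whether this actually saves cycles depends on the code the compiler generates for it.

```c
#include <stdint.h>

/* Returns 1 on a big-endian host (like the TMS9900), 0 otherwise. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0100;
    return *(const unsigned char *)&probe;
}

/* High byte of the VDP write command, computed with a shift,
 * as in the macro quoted earlier in the thread. */
uint8_t vdp_cmd_high_shift(uint16_t addr)
{
    return (uint8_t)((addr | 0x4000u) >> 8);
}

/* Same value, fetched through a byte pointer into the word. */
uint8_t vdp_cmd_high_ptr(uint16_t addr)
{
    uint16_t cmd = (uint16_t)(addr | 0x4000u);
    const unsigned char *bytes = (const unsigned char *)&cmd;
    return bytes[host_is_big_endian() ? 0 : 1];
}
```

Both functions return 0x58 for the VDP address 0x1800 and 0x40 for address 0x0000.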
+chue Posted December 7, 2023

On 12/6/2023 at 2:42 PM, chue said:
> I saw a couple of issues, during testing: The first being background/ foreground colors not being set as expected, and the second being unexpected output on one of my unit tests. I still need to investigate

Just to close on the above, these are code issues on my end and not issues in the compiler.