42bs Posted October 5, 2022
Added a depacker for LZSA format 1: https://github.com/42Bastian/new_bjl/tree/main/exp/depacker
Slightly better compression than LZ4, but 6% slower decompression.
42bs Posted October 9, 2022 (edited)
Update:
- Syncing the GPU with the 68k was somewhat wrong. Reading the "sync" flag from GPU RAM does not seem to be a good idea. It now uses the GPU->CPU interrupt.
- untp hung or crashed spuriously. Added a counter to see if depacking works.
- Added a raw byte copy for speed comparison.
42bs Posted October 10, 2022
Added speed-optimized versions of LZSA1, LZ4 and TP.
Cyprian Posted October 11, 2022
On 10/9/2022 at 2:45 PM, 42bs said: - Sync'ing GPU with 68k was somewhat wrong. Reading the "sync" flag from GPU RAM seems to be no good idea.
What was wrong with that sync flag?
42bs Posted October 11, 2022 (edited)
I saw strange behavior where the 68k sometimes continued even though the GPU had not yet finished. Maybe it was because I used only one flag/semaphore. I need to try whether it works better if one flag is used to kick the GPU and _another_ is used to notify the 68k when the GPU is done. But actually, I think the GPU->CPU interrupt is the better method, as the 68k can be stopped.
Edit: I tried to step back, but could not reproduce the problem. Nevertheless, using the interrupt speeds things up, as the 68k does not hog the bus.
TP-fast w/ interrupt and `stop #$2000`: 122 ms
TP-fast w/o interrupt: 154 ms
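The two-flag idea mentioned in the post above can be sketched as a small C model. This is only an illustration of the handshake order, not the code from the repository: `mailbox_t`, `host_start`, `worker_step`, `host_finished` and `sample_job` are hypothetical names, and the model runs both sides in one thread instead of on a 68k and a GPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared mailbox. On real hardware both words would sit in
 * memory that both processors can access. */
typedef struct {
    volatile uint32_t kick; /* set by the host (68k), cleared by the worker (GPU) */
    volatile uint32_t done; /* set by the worker, cleared by the host */
} mailbox_t;

/* Illustrative job that only counts how often it ran. */
int jobs_run = 0;
void sample_job(void) { jobs_run++; }

/* Host side: clear the old completion flag first, then start the worker. */
void host_start(mailbox_t *mb) {
    assert(mb != NULL);
    mb->done = 0;
    mb->kick = 1;
}

/* Worker side: one polling step. Returns true if a job was executed. */
bool worker_step(mailbox_t *mb, void (*job)(void)) {
    assert(mb != NULL && job != NULL);
    if (mb->kick == 0) {
        return false;       /* nothing requested */
    }
    mb->kick = 0;           /* acknowledge the request */
    job();                  /* do the work */
    mb->done = 1;           /* signal completion only after the work is finished */
    return true;
}

/* Host side: has the worker reported completion? */
bool host_finished(const mailbox_t *mb) {
    assert(mb != NULL);
    return mb->done != 0;
}
```

With separate flags, each word has exactly one side that sets it and one side that clears it, so the host cannot mistake its own "kick" for a completion. The interrupt-based approach described in the post avoids the polling altogether.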
42bs Posted October 11, 2022 (edited)
I cannot emphasize this enough: DO NOT USE CLR.L to GPU RAM!!!
ggn Posted October 11, 2022
Oh, that's what the .noclear directive does in rmac! (which actually does nothing, there's no implementation in there, just a message)
42bs Posted October 11, 2022
6 minutes ago, ggn said: Oh, that's what the .noclear directive does in rmac! (which actually does nothing, there's no implementation in there, just a message)
"Warning: CLR.L opcode ignored..." 🙂
That would be tough if rmac just removed all "clr.l" from the code 🙂
ggn Posted October 11, 2022
5 hours ago, 42bs said: "Warning: CLR.L opcode ignored..." 🙂 That'll be tough, if rmac would just remove all "clr.l" in the code 🙂
Well, I guess we could range-check the clr.l instructions when outputting Jaguar code (whenever possible) and issue a warning. Or we could simply tell everyone off for using clr.l on memory, as it's a load-modify-store instruction and wastes cycles.
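A range check of the kind suggested above might look like the following C sketch. It is not rmac code. The function name is hypothetical, and the address window `0xF03000`..`0xF03FFF` for GPU local RAM is an assumption that should be verified against the Jaguar hardware documentation. It also only works when the target address is known at assembly time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: GPU local RAM window on the Jaguar (4 KB); verify before use. */
#define GPU_RAM_START 0xF03000u
#define GPU_RAM_END   0xF03FFFu /* inclusive */

/* Hypothetical helper: true if a CLR.L at `addr` would touch GPU RAM.
 * A long access covers the four bytes addr..addr+3. */
bool clr_l_hits_gpu_ram(uint32_t addr) {
    assert(GPU_RAM_START <= GPU_RAM_END);
    uint32_t last = addr + 3u;
    return addr <= GPU_RAM_END && last >= GPU_RAM_START;
}
```

An assembler could call such a check for every `clr.l` with an absolute operand and print a warning instead of silently dropping the instruction.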
42bs Posted October 12, 2022
clr.l is load-modify-store? Why do you think so?
Cyprian Posted October 12, 2022
OK, so the clr instruction is buggy on the 68k. I wonder what would happen if you used e.g. "move.l" instead.
"CLR instruction always reads from an operand before clearing it" http://www.easy68k.com/paulrsm/doc/trick68k.htm
22 hours ago, 42bs said: using the interrupt speeds things up, as the 68k does not hog the bus. TP-Fast w/ interrupt and `stop #$2000`: 122 ms TP-fast w/o interrupt: 154 ms
Nice hint, thanks.
42bs Posted October 12, 2022 (edited)
1 hour ago, Cyprian said: ok. clr instruction is buggy in 68k, I wonder what would it be if you use e.g. "move.l" instead. "CLR instruction always reads from an operand before clearing it" http://www.easy68k.com/paulrsm/doc/trick68k.htm
Ouch! Thanks! This explains why "clr.l d0" takes 6 cycles.
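The read-before-clear behavior quoted above can be pictured with a toy bus model in C. This is a simplified sketch, not a cycle-accurate 68000 model: it counts one logical read and one logical write per operand, and `bus_t`, `model_clr` and `model_move_zero` are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bus with a single memory cell and access counters. */
typedef struct {
    uint32_t cell;
    unsigned reads;
    unsigned writes;
} bus_t;

uint32_t bus_read(bus_t *b) {
    assert(b != NULL);
    b->reads++;
    return b->cell;
}

void bus_write(bus_t *b, uint32_t v) {
    assert(b != NULL);
    b->writes++;
    b->cell = v;
}

/* Model of CLR to memory on the 68000: the operand is read, the value is
 * discarded, and then zero is written. */
void model_clr(bus_t *b) {
    (void)bus_read(b);
    bus_write(b, 0);
}

/* Model of moving a zero to memory: write only, no read. */
void model_move_zero(bus_t *b) {
    bus_write(b, 0);
}
```

Both functions leave the cell at zero, but only `model_clr` generates a read on the bus, which is the part that matters for shared memory and hardware registers.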
Chilly Willy Posted October 16, 2022
Yeah, lots of 68K based computers/consoles (like the Amiga and Genesis) warned against using clr on hardware registers. It's one of those things you learned while programming in assembly on those systems. Jaguar is just another with limitations on instructions like clr.
42bs Posted October 17, 2022
8 hours ago, Chilly Willy said: Yeah, lots of 68K based computers/consoles (like the Amiga and Genesis) warned against using clr on hardware registers. It's one of those things you learned while programming in assembly on those systems. Jaguar is just another with limitations on instructions like clr.
Yep, HW registers are often a source of trouble. But I did not know that "clr.l" reads its operand (on the 68k). You never stop learning new things.
DEATH Posted November 19, 2022
"CLR" is not buggy on the 68000. It's an instruction intended for multiprocessor architectures or for subroutine/conditional speed tests. It is the "reverse" instruction of "TAS": TAS is test and set, CLR is test and clear. This is the reason why it cannot be used on address registers; it is reserved for testing data (again, on a multiprocessor, multithread/subroutine or whatever architecture). CLR (and TAS) should therefore always be followed by a conditional jump.
DEATH Posted November 19, 2022
This should be tested, because it was a long time ago...
SCPCD Posted November 20, 2022 (edited)
I don't think that the CLR instruction is the "reverse" of TAS, as CLR sets the flags according to the destination value (which is always 0). TAS sets the flags depending on what was there and what it now is. I would imagine that, to reduce transistor count, the CLR instruction probably shares all or part of its state machine with another instruction, like the NEG instruction for example (as they both have the same size field, effective address modes and timing).
42bs Posted November 20, 2022
Also, TAS is only defined for byte access. The m680xx manual lists only TAS (for m68000) and CAS/CAS2 (x40) as multi-CPU instructions. Another hint that @SCPCD is right is the fact that the m68000 and m68008 read the destination.
DEATH Posted November 20, 2022 (edited)
Like I said, it was a long time ago and I don't remember very well...
At the time, I remember that the CLR instruction was already causing problems as soon as the 68000 came out. The recommendation, the rule, was to NEVER use (or to be very careful with) the CLR instruction, whether on a multi-processor/multi-BUS-MASTER system or even on a single-CPU/BUS-MASTER system.
For a single-CPU/BUS-MASTER system, it was because the instruction takes an extra read cycle (whatever the reason, if it's just to erase an operand...). For a multi-processor/BUS-MASTER system, it was (from memory) because the READ-MODIFY-WRITE cycle of this instruction could be "broken" in the middle, and the end result could then be unexpected. So you had to be careful where (and/or when) to use it.
Officially, only the TAS instruction uses the RMW cycle of the 68000, which cannot be split. Logically, any other instruction that does a sort of RMW would do so with a simple read cycle, an ALU cycle, then a write cycle, with the bus free to be taken by another master between the read and the write cycle.
There may be a misunderstanding somewhere, a bug in the documentation, or something else...
I think it may be best never to use an "RMW"-type instruction on the Jaguar with data shared between multiple processors (edit: except for the TAS instruction).
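The "broken in the middle" case described above is the classic lost update. The following C sketch replays it deterministically in one thread; it is a model under stated assumptions, not Jaguar code, and `rmw_t`, `rmw_read_phase`, `rmw_write_phase` and `lost_update_demo` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A divisible read-modify-write, split into its two bus phases. */
typedef struct {
    uint32_t latched; /* value captured during the read phase */
} rmw_t;

void rmw_read_phase(rmw_t *op, const uint32_t *shared) {
    assert(op != NULL && shared != NULL);
    op->latched = *shared;
}

void rmw_write_phase(const rmw_t *op, uint32_t *shared, uint32_t set_bits) {
    assert(op != NULL && shared != NULL);
    *shared = op->latched | set_bits; /* result is based on the stale value */
}

/* Scenario: a second bus master writes between the read and the write. */
uint32_t lost_update_demo(void) {
    uint32_t shared = 0x0u;
    rmw_t cpu;

    rmw_read_phase(&cpu, &shared);        /* first master reads 0 */
    shared |= 0x2u;                       /* second master sets bit 1 in between */
    rmw_write_phase(&cpu, &shared, 0x1u); /* first master writes 0 | 1 */
    return shared;                        /* bit 1 from the second master is gone */
}
```

If the read and write could not be separated, the final value would contain both bits. In this interleaving only bit 0 survives, which is why shared data needs either an indivisible cycle or a protocol where each word has a single writer.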
DEATH Posted November 20, 2022
Oh, also, for some hardware registers this may have unexpected consequences, because some registers can be modified (or cause an event) just by being read.
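To illustrate the point about read-sensitive registers, here is a small C model of a register that acknowledges its pending bits when read. The register is invented for the example and does not correspond to a specific Jaguar register; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-to-acknowledge status register. */
typedef struct {
    uint32_t pending;
} status_reg_t;

/* Hardware side: latch an event. */
void raise_event(status_reg_t *r, uint32_t bits) {
    assert(r != NULL);
    r->pending |= bits;
}

/* Reading returns the pending bits and clears them as a side effect. */
uint32_t reg_read(status_reg_t *r) {
    assert(r != NULL);
    uint32_t v = r->pending;
    r->pending = 0;
    return v;
}

/* Writes are ignored by this particular model register. */
void reg_write(status_reg_t *r, uint32_t v) {
    assert(r != NULL);
    (void)v;
}

/* A CLR-style access: hidden read first, then the write of zero. */
void clr_style_access(status_reg_t *r) {
    (void)reg_read(r); /* this read silently acknowledges the event */
    reg_write(r, 0);
}
```

In this model a plain `reg_write` leaves a pending event intact, while `clr_style_access` consumes it before the program ever looks at it.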