ivop Posted May 26, 2019

BTW, you can also record the converted Pokey values and stream those. Perhaps you can just record a sid2gumby track with Altirra. Don't know if it supports stereo SAP-R.

Edit: BTW2: this does not necessarily mean way more streams (16 or 18). AUDCTL is constant and one SID voice is mirrored on the second Pokey. Also, IIRC, the AUDC values of one half of the 16-bit channels are constant. But even if that's not the case, you end up with 6 AUDF/AUDC pairs, hence 12 streams of importance. You just need two extra stores for the mirrored SID voice.

Edited May 26, 2019 by ivop
rensoup Posted May 27, 2019 Author

ivop said:
BTW you can also record the converted Pokey values and stream those. Perhaps you can just record a sid2gumby track with Altirra. Don't know if it supports stereo SAP-R. Edit: BTW2: this does not necessarily mean way more streams (16 or 18). AUDCTL is constant and one sid voice is mirrored on the second pokey. Also, IIRC, the AUDC values of one half of the 16-bit channels are constant. But even if that's not the case, you end up with 6 AUDF/AUDC pairs, hence 12 streams of importance. Just need two extra stores for the mirrored sid voice.

I didn't know Sid2Gumby, I was only aware of the latest iteration of your AtariSid... Honestly, S2G feels a little dated, especially stacked against AS5. I wish I could SAP-R those instead... but yeah, at 15kHz that's not going to happen (although only the first channel seems to vary much during a frame).

I don't know the tech behind AS5, nor do I know anything about Pokey. But just out of curiosity: I think I read you were doing software mixing, does that mean a SID voice is not mapped to a Pokey voice? Although according to the PMG visualizer that seems to be the case.
_The Doctor__ Posted May 28, 2019

IDK, as I was just listening to a song in another thread that captured the angry bees perfectly... I think emkay has the bees all trained and spinning about in Pokey just like they do in SID...
dmsc Posted May 28, 2019

Hi all!

Perhaps larger buffer sizes in dmsc's lzss would yield good gains too? It's pretty usable for any tune under 2 minutes right now.

Interesting that stream #3 actually increases in size (probably the drum channel), but stream #1 and #5 decrease in size, with an overall saving of an extra 85 bytes. But it would need an extra operation during decompression.

I just implemented 12 bit matches in the LZSS coder/decoder. This allows using larger window sizes and produces a big gain in compression ratio. Compare:

SHADOW.SAP:
  8 bit match, 16 bytes window:   max offset = 16,  max len = 17, match bits = 8,  ratio: 5675 / 42759 = 13.27%
  12 bit match, 128 bytes window: max offset = 128, max len = 33, match bits = 12, ratio: 2201 / 42759 = 5.15%

3D_RMT.SAP (from RMT distribution, "3D, Atari version by raster/c.p.u. 2009"):
  8 bit match, 16 bytes window:   max offset = 16,  max len = 17, match bits = 8,  ratio: 15493 / 53568 = 28.92%
  12 bit match, 128 bytes window: max offset = 128, max len = 33, match bits = 12, ratio: 4800 / 53568 = 8.96%

As the player is only 180 bytes, in the case of the "3D_RMT" sample the compressed tune plus player is now smaller than the original RMT player: 4980 vs. 5259 bytes.

The new "lzss" program accepts options: use "-8" for the original 8 bit matches, "-2" for the new 12 bit matches, or any other combination to try other match window sizes.

lzss-sap-20190527.zip
shadows-12.xex
3d_rmt-12.xex
3d_rmt.xex
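To make the effect of the window size concrete, here is a minimal LZSS sketch in Python. It is not dmsc's coder: it is a plain greedy matcher that emits a token list instead of a packed bit stream, and the names `lzss_encode`, `lzss_decode` and `encoded_bits` are made up for this illustration. The defaults (7 offset bits, 5 length bits, minimum match of 2) are an assumption chosen to reproduce the "12 bit match, 128 bytes window, max len 33" figures quoted above.

```python
def lzss_encode(data, offset_bits=7, length_bits=5, min_match=2):
    """Greedy LZSS. Returns ('lit', byte) and ('match', distance, length) tokens."""
    window = 1 << offset_bits                       # e.g. 128 bytes
    max_len = (1 << length_bits) - 1 + min_match    # e.g. 31 + 2 = 33
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        # Try every distance that still lies inside the window.
        for dist in range(1, min(window, pos) + 1):
            length = 0
            while (length < max_len and pos + length < len(data)
                   and data[pos + length - dist] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, dist
        if best_len >= min_match:
            tokens.append(("match", best_dist, best_len))
            pos += best_len
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def lzss_decode(tokens):
    """Rebuild the original bytes; copies byte by byte so overlapping matches work."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)


def encoded_bits(tokens, offset_bits=7, length_bits=5):
    """Size estimate: 1 flag bit per token, plus 8 bits (literal) or the match bits."""
    return sum(1 + (8 if tok[0] == "lit" else offset_bits + length_bits)
               for tok in tokens)


# A register stream that repeats every 4 frames compresses very well.
stream = bytes([0x0A, 0x0A, 0x0B, 0x0B] * 50)
tokens = lzss_encode(stream)
```

A real packer would write the flag bits and match fields into a byte stream and might use a smarter parse than greedy matching, so the sizes from `encoded_bits` are only indicative.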
ivop Posted May 28, 2019

rensoup said:
I didn't know Sid2Gumby, was only aware of the latest iteration of your AtariSid... Honestly, S2G feels a little dated, especially stacked against AS5 . Wish I could SAPR those instead... but yeah at 15khz that's not going to happen (although only the first channel seems to vary much during a frame) I don't know the tech behind AS5, nor do I know anything about Pokey . But just out of curiosity, I think I read you were doing software mixing, does that mean a Sid voice is not mapped to a Pokey voice ? although according to the PMG visualizer that seems to be the case.

https://github.com/ivop/atarisid

It seems you have missed AtariSid 6? The binaries are in the xex directory.

You are right that the PMGs do not vary much, but that's because they only show the volume, at 50Hz. The width is not based on the samples. There simply wasn't enough CPU time to show some sort of oscilloscope when I managed to switch from 7.8kHz to 15.6kHz.

Atarisid uses Pokey's first channel as a 15.6kHz timer. Each time it fires, it writes to Pokey channels 2, 3 and 4. The whole emulation is a sort of soft-synth.

Edited May 28, 2019 by ivop
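A rough back-of-the-envelope check of why this cannot be captured as a regular SAP-R stream. The sketch assumes a SAP-R recording holds one register snapshot per 50Hz frame (an assumption, not something stated in the thread) and uses the rounded 15.6kHz figure from the post; the constant names are made up for this illustration.

```python
TIMER_HZ = 15_600        # timer rate quoted for AtariSid
FRAME_HZ = 50            # PAL frame rate, also the PMG update rate
CHANNELS_WRITTEN = 3     # Pokey channels 2, 3 and 4 are written on every tick

# How often the timer fires during one video frame.
ticks_per_frame = TIMER_HZ // FRAME_HZ

# How many Pokey register writes that produces per frame.
writes_per_frame = ticks_per_frame * CHANNELS_WRITTEN

print(ticks_per_frame, writes_per_frame)
```

So a per-frame snapshot would see only one of the roughly 312 updates per channel in each frame.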
rensoup Posted May 29, 2019 Author

dmsc said:
Hi all! I just implemented 12 bit matches in the LZSS coder/decoder, this allows using larger window sizes, and produces a big gain in compression ratio, compare:

Tried it on 7 gates of Jambala (110KB SAP).

Lzs8: 30KB
Lzs12: 8.5KB

The new version can be a tiny bit slower but is still several times faster than the original player! Guess it's an all-around solution now.
rensoup Posted May 29, 2019 Author

ivop said:
https://github.com/ivop/atarisid It seems you have missed AtariSid 6? The binaries are in the xex directory.

Well yeah, I somehow missed it. Sounds really nice (and CPU intensive)!
dmsc Posted May 30, 2019

Hi!

rensoup said:
Tried it on 7 gates of Jambala (110KB SAP). Lzs8: 30KB Lzs12: 8.5KB The new version can be a tiny bit slower but still several times faster than the original player! Guess it's an all around solution now

The main difference is in the RAM usage, as the 12 bit version uses 128 bytes for each stream.

I implemented a third version, with 16 bit matches. Using 8 bits for the window size (so 256 bytes for each stream), the compression is much better. I tested it with 4 samples (original and compressed files attached):

---- shadows.sap -----
max offset=  16, max len=  17, match bits=  8, ratio:  5675 /  42759 = 13.27%
max offset= 128, max len=  33, match bits= 12, ratio:  2201 /  42759 =  5.15%
max offset= 256, max len= 256, match bits= 16, ratio:  1103 /  42759 =  2.58%

---- 3d_rmt.rsap -----
max offset=  16, max len=  17, match bits=  8, ratio: 15493 /  53568 = 28.92%
max offset= 128, max len=  33, match bits= 12, ratio:  4800 /  53568 =  8.96%
max offset= 256, max len= 256, match bits= 16, ratio:  3536 /  53568 =  6.60%

---- 4tk35.rsap ------
max offset=  16, max len=  17, match bits=  8, ratio: 27051 / 106587 = 25.38%
max offset= 128, max len=  33, match bits= 12, ratio:  9442 / 106587 =  8.86%
max offset= 256, max len= 256, match bits= 16, ratio:  6342 / 106587 =  5.95%

---- aurora.rsap -----
max offset=  16, max len=  17, match bits=  8, ratio: 38741 / 114048 = 33.97%
max offset= 128, max len=  33, match bits= 12, ratio: 14495 / 114048 = 12.71%
max offset= 256, max len= 256, match bits= 16, ratio: 11903 / 114048 = 10.44%

As you see, now "shadow.sap" is a little more than 1kB.

My samples are from the RMT128 distribution, converted to SAP type-R.

Have Fun!

lzss-sap-20190529.zip
3d_rmt-16.xex
shadows-16.xex
aurora-16.xex
4tk35-16.xex
samples.zip
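The RAM cost mentioned above can be estimated with a few lines of Python. This is only a sketch: it assumes one history buffer per POKEY register stream and 9 streams for a mono SAP-R tune (4 AUDF + 4 AUDC + AUDCTL). The stream count and the helper name `window_ram` are assumptions for illustration; a real player may skip streams that never change, so treat the result as an upper bound.

```python
STREAMS = 9  # assumption: 4 AUDF + 4 AUDC + AUDCTL for a mono tune


def window_ram(offset_bits, streams=STREAMS):
    """Bytes of history buffer needed when every stream has its own window."""
    return streams * (1 << offset_bits)


for name, offset_bits in (("8 bit", 4), ("12 bit", 7), ("16 bit", 8)):
    print(f"{name} matches: {window_ram(offset_bits)} bytes of buffers")
```

Under these assumptions the 16 bit variant needs about 2.25KB of buffers, twice what the 12 bit variant needs, in exchange for the better ratios listed in the post.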
tebe Posted May 30, 2019

Great tool, DMSC.

playlzs16.asm (aurora.lz16) plays wrong.
dmsc Posted May 30, 2019

Hi!

tebe said:
great tool DMSC playlzs16.asm (aurora.lz16) plays wrong

Thanks. I'm not near the PC now, but you must use the following command line:

[code]
lzss -6 input.rsap test.lz12
[/code]

This is the same as:

[code]
lzss -b 16 -o 8 -m 1 input.rsap test.lz12
[/code]
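One way to read those options, inferred only from the figures quoted in this thread and not from the lzss source: "-b" is the total number of match bits, "-o" the number of those bits used for the offset, and "-m" the minimum match length. Under that reading the "max offset" and "max len" columns follow directly. The function name `match_params` and the minimum match of 2 for the 8 and 12 bit modes are assumptions made for this sketch.

```python
def match_params(match_bits, offset_bits, min_match):
    """Derive (max_offset, max_len) from a hypothetical -b / -o / -m reading."""
    length_bits = match_bits - offset_bits
    max_offset = 1 << offset_bits
    # The length field stores (length - min_match), so the largest value adds min_match.
    max_len = (1 << length_bits) - 1 + min_match
    return max_offset, max_len


print(match_params(8, 4, 2))    # 8 bit mode
print(match_params(12, 7, 2))   # 12 bit mode
print(match_params(16, 8, 1))   # 16 bit mode, as in "-b 16 -o 8 -m 1"
```

The three results agree with the "max offset" / "max len" pairs in the tables above (16/17, 128/33 and 256/256), which is the only evidence for this interpretation.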
tebe Posted May 30, 2019

Aurora with CPU usage indicator

lzss_POKEY_cpu_usage.zip
pirx Posted May 31, 2019

These are fantastic results, kudos!!!
emkay Posted May 31, 2019

_The Doctor__ said:
IDK as was just listening to a song in another thread that captured the angry bees perfectly... I think emkay has the bees all trained and spinning about in pokey just like they do in sid...

Hehe. Which one do you refer to exactly?
ivop Posted May 31, 2019

Another idea: turn the compressed data+player into a SAP file again, but this time Type B. This would allow existing SAP players without SAP-R support to play these songs (i.e. ALL of them, except for Altirra, but that's not strictly a SAP player).
rensoup Posted May 31, 2019 Author

dmsc said:
Hi! Main difference is in the RAM usage, as the 12 bit version uses 128 bytes for each stream. I implemented a third version, with 16 bit matches. Using 8 bits for the window size (so 256 bytes for each stream), the compression is much better, tested it with 4 samples (attached original and compressed files):

Tried it on 7 gates of Jambala (110KB SAP).

Lzs8: 30KB
Lzs12: 8.5KB
Lzs16: 3.7KB !!

Can't wait for the next release
dmsc Posted December 3, 2019

Hi all!

On 5/31/2019 at 5:22 PM, rensoup said:
Can't wait for the next release

Well, @rensoup discovered a bug in the compressor, so here is a new version.

Have Fun!

lzss-sap-20191202.zip
elmer Posted December 15, 2019

On 5/13/2019 at 11:14 PM, xxl said:
LZ4 is very fast

gpl3.txt - 35147 bytes
exomizer - 12382 bytes + depacker 1 page =~ 12.3 KB, decompress 128 frames (2.6 sec)
deflate - 11559 bytes + depacker 2 pages =~ 11.8 KB, decompress 179 frames (3.6 sec)
LZ4 - 15622 bytes + depacker <150 bytes =~ 15.3 KB, decompress 55 frames (1,1 sec)

Even though @dmsc has done such a wonderful job in fulfilling @rensoup's needs for SAP compression/decompression, I'm resurrecting this old post to go back to the less-specific discussion about file compression on our old 8-bit and 16-bit computers/consoles.

For me, LZ4's minimum match length of 4 bytes is a serious problem on these old target machines, although it is absolutely fine on the 32-bit target platforms that the LZ4 algorithm was actually designed for.

To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it.

The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ...

aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1,4 sec)

So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB.

Looking at the compressed data, over 30% of the matches are under 4 bytes in length.

Looking at the "Legend of Xanadu 2" actual game data that I have mentioned before, using aPLib results in 934005 out of 1635397 matches (i.e. 57%) that were less than 4 bytes.

This suggests that even if decompression speed is your most important criterion, someone should be able to come up with a better solution than LZ4 for the kind of data that we see on 8-bit and 16-bit machines.

For programmers that haven't seen it yet, may I point out Emmanuel Marty's LZSA ...

https://github.com/emmanuel-marty/lzsa

For anyone that is willing to spend a few more cycles to get better compression than LZSA, my 6502 decompressor for aPLib can be found here ...

https://github.com/jbrandwood/aplpak

Edited December 16, 2019 by elmer: Remove whitespace.
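The timings and the short-match share quoted in this post can be cross-checked with a little arithmetic. The sketch below assumes the frame counts are PAL frames at 50 per second (not stated in the posts, but it reproduces the quoted seconds); the variable names are made up for this illustration.

```python
FRAMES_PER_SECOND = 50  # assumption: PAL machine


def frames_to_seconds(frames):
    """Convert a decompression time measured in video frames to seconds."""
    return frames / FRAMES_PER_SECOND


# Frame counts as quoted in the thread for gpl3.txt.
timings = {"exomizer": 128, "deflate": 179, "LZ4": 55, "aPLib": 68.5}
for name, frames in timings.items():
    print(f"{name}: {frames_to_seconds(frames):.2f} s")

# Share of aPLib matches shorter than 4 bytes in the "Legend of Xanadu 2" data.
short_share = 934005 / 1635397
print(f"short matches: {short_share:.1%}")

# Extra time aPLib needs compared with LZ4 on this file.
extra_seconds = frames_to_seconds(timings["aPLib"] - timings["LZ4"])
print(f"aPLib vs LZ4: +{extra_seconds:.2f} s")
```

Every match shorter than 4 bytes is one that the LZ4 format would have to store as literals instead, which is the core of the argument above.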
dmsc Posted December 16, 2019

Hi!

2 hours ago, elmer said:
Even though @dmsc has done such a wonderful job in fulfilling @rensoup's needs for SAP compression/decompression, I'm resurrecting this old post to go back to the less-specific discussion about file compression on our old 8-bit and 16-bit computers/consoles. For me, LZ4's minimum match length of 4-bytes is a serious problem on these old target machines, although it is absolutely fine on the 32-bit target platforms that the LZ4 algorithm was actually designed for.

Yes, I do believe that the LZ4 format is not the best for our 8-bit machines!

2 hours ago, elmer said:
To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it. The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ... aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1,4 sec) So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB. Looking at the compressed data, over 30% of the matches are under 4-bytes in length. Looking at the "Legend of Xanadu 2" actual game data that I have mentioned before, using aPLib results in 934005 out of 1635397 matches (i.e. 57%) that were less than 4-bytes. This suggests that even if decompression speed is your most important criteria, then someone should be able to come up with a better solution than LZ4 for the kind of data that we see on 8-bit and 16-bit machines. For programmers that haven't seen it yet, may I point out Emmanuel Marty's LZSA ... https://github.com/emmanuel-marty/lzsa

I did not know that one, a great format indeed!

But it got me thinking: perhaps an LZSS-derived format with an optional bit for reusing the last match offset could achieve greater compression with a very small decoder, especially on the 6502, where you can load bits with one instruction.

Reusing the match offset is great when compressing frames of an animation or other data where only some bytes change from the previous frame: all the offsets reference the last frame, so they are all the same.

Have Fun!
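To illustrate the "reuse last offset" idea, here is a small Python decoder for a hypothetical token format. Nothing here is an existing format: the token kinds `lit`, `match` and `rep` and the function name are invented for this sketch, and a real design would pack them into bits. A `rep` token carries only a length and copies from the distance used by the previous match.

```python
def decode_with_rep(tokens):
    """Decode ('lit', byte), ('match', distance, length) and ('rep', length) tokens."""
    out = bytearray()
    last_dist = None
    for tok in tokens:
        kind = tok[0]
        if kind == "lit":
            out.append(tok[1])
            continue
        if kind == "match":
            _, last_dist, length = tok      # remember the distance for later 'rep' tokens
        else:                               # 'rep': reuse the previous distance
            _, length = tok
            if last_dist is None:
                raise ValueError("'rep' token before any match")
        for _ in range(length):
            out.append(out[-last_dist])
    return bytes(out)


# Two 8-byte "frames"; the second differs from the first only in byte 3.
frame1 = bytes(range(8))
frame2 = bytes([0, 1, 2, 0xFF, 4, 5, 6, 7])
tokens = [("lit", b) for b in frame1] + [
    ("match", 8, 3),   # bytes 0..2 copied from one frame back
    ("lit", 0xFF),     # the changed byte
    ("rep", 4),        # bytes 4..7, same distance of 8, no offset stored
]
```

In a bit-packed stream the `rep` token would save the offset field on every match after the first one in a frame, which is where the gain for animation-like data would come from.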
xxl Posted December 16, 2019

9 hours ago, elmer said:
aplib - 13148 bytes + depacker 1 page =~12.8 KB, decompress 68.5 frames (1,4 sec) So ... 0.3 sec longer than @xxl's LZ4 depacker, for a saving of 2.5KB.

An even more efficient LZ4 compressor, smallz4, is available: https://create.stephan-brumme.com/smallz4/#numbers

In tests, the decompression time on the Atari did not change.
rensoup Posted December 16, 2019 Author

16 hours ago, elmer said:
To show why I think that, I've written an optimized 6502 decompressor for aPLib (to replace the slow 65C02 example that is in aPLib), and done some testing on it. The gpl3.txt text file example that @xxl shows doesn't really match Atari game code or data, but it is interesting to see how aPLib does in comparison to the others ...

aPLib looks like it could be interesting for Prince of Persia. I tried to compress a bunch of files and the compression ratio was pretty good for code (not as good as deflate but within 5%) but not for graphics.

I could probably use it for loading the main executable, which is around 23KB compressed (40+KB uncompressed). Inflate takes about 4-5 seconds, which is a big pause.

Where do you think it sits on the decompression rate axis? Somewhere around LZSA2?

Any chance you could provide the source in MADS format with a decompression example? I quickly hacked it to get it to build but I can't get it to decompress properly (I skipped the 4-byte APK header).
rensoup Posted December 16, 2019 Author

To follow up on my original question... I was looking to compress 2 types of data and decompress them in realtime:

1. music: which @dmsc masterfully solved with LZSS
2. sprite data.

There probably isn't any good solution for 2. For a single uncompressed 200-byte frame, LZ4 can't do much; it gave almost no compression. I also tried @Irgendwer's autogamy, which compressed slightly better than LZ4. Only deflate gave reasonable results, but the decompression time was astronomical.

I got almost as good results as deflate by simply removing empty bytes and storing an extra byte per line (sprites are 8 bytes wide max), but the prospect of having to write a sprite routine for this format was too daunting, so I gave up.
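The "remove empty bytes, store an extra byte per line" scheme could look roughly like the Python sketch below. The post does not say what the extra byte contains; this sketch assumes it is a bitmask with one bit per byte of the line (which fits, since lines are at most 8 bytes wide). The function names are invented for this illustration.

```python
def pack_line(line):
    """Pack one sprite line (at most 8 bytes) as: mask byte + the non-zero bytes."""
    if len(line) > 8:
        raise ValueError("a one-byte mask can only describe 8 bytes")
    mask = 0
    kept = bytearray()
    for i, value in enumerate(line):
        if value:
            mask |= 1 << i      # bit i set means byte i is stored
            kept.append(value)
    return bytes([mask]) + bytes(kept)


def unpack_line(packed, width):
    """Inverse of pack_line: re-insert zero bytes where the mask bit is clear."""
    mask = packed[0]
    stored = iter(packed[1:])
    return bytes(next(stored) if mask & (1 << i) else 0 for i in range(width))


# A line that is mostly empty: 8 bytes shrink to 1 mask byte + 2 data bytes.
line = bytes([0x00, 0x00, 0x3C, 0x7E, 0x00, 0x00, 0x00, 0x00])
packed = pack_line(line)
```

A completely filled line costs one byte more than the raw data, so the scheme only pays off when sprites contain many empty bytes. It also means the drawing routine has to test the mask bits per line, which is the extra work the post refers to.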
rensoup Posted December 16, 2019 Author

7 hours ago, xxl said:
an even more efficient smallz4 compressor is available - https://create.stephan-brumme.com/smallz4/#numbers in tests, the decompression time on atari did not change

I tried a bunch of LZ4 compressors like the one you mentioned, but most of them gave marginal gains at best (especially if you use the best compression option with the original LZ4). LZ4 still has the worst compression ratio.
elmer Posted December 16, 2019

3 minutes ago, rensoup said:
I tried a bunch of LZ4 compressors like the one you mentioned but most of them gave marginal gains at best (especially if you use the best compression option with the original LZ4). LZ4 still has the worst compression ratio.

Yes, the problem isn't in the LZ4 compressor, it's in the LZ4 data format. LZ4 was never designed to produce the best compression ratios; it was specifically designed for both fast compression and decompression in order to reduce the memory usage of databases and other large datasets on modern PCs and servers.

Just like smallz4, Emmanuel Marty has also done his own "optimal" LZ4 packer that can get slightly better results than the official LZ4 packer ...

https://github.com/emmanuel-marty/lz4ultra

Again, the results are marginal improvements, and not major gains.
elmer Posted December 16, 2019

1 hour ago, rensoup said:
Where do you think it sits on the decompression rate axis ? somewhere around LZSA2 ?

I've not written an LZSA2 decompressor, so I don't really know.

Peter Ferrie's 6502 decompressor that is included in LZSA2 is written very similarly to his 65C02 decompressor for aPLib, and my aPLib code is about 2.5x faster than his.

An optimized LZSA2 decompressor should be faster than an aPLib decompressor ... there has to be some size/speed trade-off for the extra compression that aPLib gets, just like with the LZ4 decompressors.

Now that Emmanuel Marty has written an open source compressor for aPLib (https://github.com/emmanuel-marty/apultra), we have the opportunity to tweak the format a tiny bit in order to improve the decompression speed on the 6502.

1 hour ago, rensoup said:
Any chance you could provide the source in MADS format with a decompression example? I quickly hacked it to get it to build but I can't get it do decompress properly (I skipped the APK 4 bytes header)

I don't know MADS format, but I could probably take a look at it. I'm more interested in writing for a banked Atari cartridge rather than a floppy-disc game, so I modified the assembler in HuC to output Atari .car format files.

If you've already hacked my source to get it to build, then you've probably already done everything that is needed.

I suspect that the problem that you've found with my aPLpak format is because you only skipped the first 4 bytes of header info instead of the 12 bytes that you can skip if you are only compressing a single file.

aPLpak is designed to store multiple compressed files/assets, so it starts with a header table that lists the start and size of each file within the archive.

I'll post some example code.
xxl Posted December 16, 2019

Seems to be a popular compressor: https://github.com/svendahl/cap