Great Exploitations

cd-w · June 24, 2008

Would it be possible to support the DEFLATE algorithm used to compress files in a ZIP archive? The algorithm is basically string matching (LZ77) combined with huffman coding, so you are halfway there already.

Chris

Thomas Jentzsch · June 24, 2008

Looks like some of my code is really nicely optimized (try Jammed too).

For optimal Hufman encoding, you should make the compression more flexible. E.g. count all bytes and adjust the compression accordingly. Decompression of Huffman data is really simple. I have done that in Jammed. The central loop looks like that:

.loopValue:
asl	 shift			  ; 5
beq	 .loadValue		 ; 2/3
.contValue:
rol						; 2
inx						; 2
sec						; 2
sbc	 SubTab,x		   ; 4
bcs	 .loopValue		 ; 2/3

The biggest problem is the table with the ordered bytes. It takes 256 bytes, which is quite a lot compared to the possible compression ratio. Though maybe the table could be optoimized somehow too.

Thomas Jentzsch · June 24, 2008

Would it be possible to support the DEFLATE algorithm used to compress files in a ZIP archive? The algorithm is basically string matching (LZ77) combined with huffman coding, so you are halfway there already.

I suppose that would be overkill for most files.

But maybe have a look at pucrunch which is an extremely compact [~260 bytes) implementation of what you are looking for.

Also have a look at xip, which is simplier but also even smaller.

batari · June 24, 2008

Would it be possible to support the DEFLATE algorithm used to compress files in a ZIP archive? The algorithm is basically string matching (LZ77) combined with huffman coding, so you are halfway there already.

Chris

Thanks - I read that and got some ideas. LZ77 got me thinking that certain byte pairs might be common, e.g. 85 02. I'm going to analyze some binaries and see how common they really are. Maybe searching for unused bytes and using those as a code for common byte pairs would work - as every binary seems to have a few unused bytes.

I also read up more on Huffman coding, and found my codes are suboptimal. I tried various types and found that I can get Swoops! down to 3649 bytes using just Huffman codes (not bad considering zipping it results in a 3625 byte file, though some of that is due to zip's headers and the like.)

batari · June 24, 2008

Looks like some of my code is really nicely optimized (try Jammed too).

For optimal Hufman encoding, you should make the compression more flexible. E.g. count all bytes and adjust the compression accordingly. Decompression of Huffman data is really simple. I have done that in Jammed. The central loop looks like that:
.loopValue:
asl	 shift			 ; 5
beq	 .loadValue		; 2/3
.contValue:
rol					; 2
inx					; 2
sec					; 2
sbc	 SubTab,x		  ; 4
bcs	 .loopValue		; 2/3
The biggest problem is the table with the ordered bytes. It takes 256 bytes, which is quite a lot compared to the possible compression ratio. Though maybe the table could be optoimized somehow too.

Just looked at xip, interestingly it's doing exactly what I tried earlier today (fixed-length Huffman-like coding.) I wonder if I can beat it by looking for byte pairs?

batari · June 24, 2008

Looks like some of my code is really nicely optimized (try Jammed too).

My first attempt increases the size to 4354 bytes... Jammed indeed

However, using a xip-like method as I did above, I was able to beat .zip with this one - 4071 vs. 3899 bytes.

Thomas Jentzsch · June 24, 2008

However, using a xip-like method as I did above, I was able to beat .zip with this one - 4071 vs. 3899 bytes.

Including decompression code size?

batari · June 24, 2008

However, using a xip-like method as I did above, I was able to beat .zip with this one - 4071 vs. 3899 bytes.

Including decompression code size?

No, that hasn't been written yet

But presumably, a zip decompression routine would be much longer than xip. I haven't looked into the length of the xip decompressor, but I don't think it's very large.

Thomas Jentzsch · June 24, 2008

But presumably, a zip decompression routine would be much longer than xip. I haven't looked into the length of the xip decompressor, but I don't think it's very large.

~60 bytes, IIRC.

supercat · June 25, 2008

Many 2600 games contain a fair amount of redundant data. In some cases, significant quantities of code or data are duplicated in multiple banks; in other cases, data may be duplicated to allow for sufficiently-rapid access. I would think something like the following approach would be pretty good: Divide the last 4K bank (the only one to be compressed) into records, each of which starts with a type/length header and the specified follow-on data:

0nnnnnnn -- 1 to 128 bytes of literal data (most common record type)

10nnnnnn -- 2 to 65 copies of a particular data value (first byte following the record header)

11nnnnnn -- 2 to 65 bytes copied from the address specified in the next two bytes. Note that the specified address could be in any bank.

The worst-case behavior of this scheme would be to expand the code by one byte per 128, or 32 bytes total. Any time a byte appears three or more times consecutively would cost 2-3 bytes. Cloning a sequence of up to 65 bytes would take 3-4 bytes.

Lempel-Ziv coding may also work well, but might take more work to decompress. Something like the modified run-length encoding described above should be helpful in many situations without being overly complex.

Thomas Jentzsch · June 25, 2008

Lempel-Ziv coding may also work well, but might take more work to decompress. Something like the modified run-length encoding described above should be helpful in many situations without being overly complex.

Doing all together we would have something like LZW here, right?

I did an experiment yesterday with my games folder and compressed all 4K files with ZIP into single archives. On average(!) the IMO better games take more space, while prototypes, WIP and bad games take less space. Sizes were between 1.8 and 4.2k.

batari · June 25, 2008

Many 2600 games contain a fair amount of redundant data. In some cases, significant quantities of code or data are duplicated in multiple banks; in other cases, data may be duplicated to allow for sufficiently-rapid access. I would think something like the following approach would be pretty good: Divide the last 4K bank (the only one to be compressed) into records, each of which starts with a type/length header and the specified follow-on data:

0nnnnnnn -- 1 to 128 bytes of literal data (most common record type)

10nnnnnn -- 2 to 65 copies of a particular data value (first byte following the record header)

11nnnnnn -- 2 to 65 bytes copied from the address specified in the next two bytes. Note that the specified address could be in any bank.

The worst-case behavior of this scheme would be to expand the code by one byte per 128, or 32 bytes total. Any time a byte appears three or more times consecutively would cost 2-3 bytes. Cloning a sequence of up to 65 bytes would take 3-4 bytes.

Then I think it would make more sense to do 4-67 for 10nnnnnn, and 5-68 for 11nnnnnn, as there would be no point in changing modes unless there was actual gain.

I like this idea so far, and unless my initial tests are incorrect, it's the most efficient of the bunch. I'll have to write an actual compressor and decompressor now to see if my compression test results were indeed accurate.

Thomas Jentzsch · June 25, 2008

Then I think it would make more sense to do 4-67 for 10nnnnnn, and 5-68 for 11nnnnnn, as there would be no point in changing modes unless there was actual gain.

But you don't know what is coming next. If there is a mode change anyway, you'd loose one byte in compression. 3-66 and 4-67 seems best.

batari · June 25, 2008

Then I think it would make more sense to do 4-67 for 10nnnnnn, and 5-68 for 11nnnnnn, as there would be no point in changing modes unless there was actual gain.

But you don't know what is coming next. If there is a mode change anyway, you'd loose one byte in compression. 3-66 and 4-67 seems best.

I think that changing modes will result in a very small gain in exchange for slower decompression. I'd think that 99% of the time, changing modes would simply break even space-wise.

I successfully implemented this and I've already beat all other schemes, and I didn't even do the repeating byte code - I found that 0 for literal and 1 for copy was all I needed.

EDIT: Tried 5-68 and 4-67, and 4-67 resulted in a savings of around 1% as I predicted, but I counted the number of mode switches and that number is unchanged on the 4 binaries I tried, which kind of surprised me. So 4-67 it is (actually, I'm using 4-131, as there are just two modes.)

supercat · June 25, 2008

I successfully implemented this and I've already beat all other schemes, and I didn't even do the repeating byte code - I found that 0 for literal and 1 for copy was all I needed.

Depending upon the exact implementation of the compressor/decompressor, you could represent a repeating byte sequence using four bytes: the byte value, then a 'copy' code for the number of repetitions minus one, then the address of the first byte.

Thomas Jentzsch · June 26, 2008

Depending upon the exact implementation of the compressor/decompressor, you could represent a repeating byte sequence using four bytes: the byte value, then a 'copy' code for the number of repetitions minus one, then the address of the first byte.

Or you define a relative window for code repeating (e.g. up to -4k). Then you only need e.g. 12 bits for the address.

supercat · June 27, 2008

Depending upon the exact implementation of the compressor/decompressor, you could represent a repeating byte sequence using four bytes: the byte value, then a 'copy' code for the number of repetitions minus one, then the address of the first byte.

Or you define a relative window for code repeating (e.g. up to -4k). Then you only need e.g. 12 bits for the address.

Data may be repeated in any bank. Often at the same address within a bank but not always. A "short repeat" code may be helpful, but there are always tradeoffs.

Thomas Jentzsch · June 27, 2008

Data may be repeated in any bank. Often at the same address within a bank but not always. A "short repeat" code may be helpful, but there are always tradeoffs.

Sure. But why reserving 16 bits for a 13 bit ROM?

18 Comments

Recommended Comments

cd-w 553

Link to comment

Thomas Jentzsch 10,864

Link to comment

Thomas Jentzsch 10,864

Link to comment

batari 4,549

Link to comment

batari 4,549

Link to comment

batari 4,549

Link to comment

Thomas Jentzsch 10,864

Link to comment

batari 4,549

Link to comment

Thomas Jentzsch 10,864

Link to comment

supercat 125

Link to comment

Thomas Jentzsch 10,864

Link to comment

batari 4,549

Link to comment

Thomas Jentzsch 10,864

Link to comment

batari 4,549

Link to comment

supercat 125

Link to comment

Thomas Jentzsch 10,864

Link to comment

supercat 125

Link to comment

Thomas Jentzsch 10,864

Link to comment

Recently Browsing 0 members

Apps

My Activity Streams

More