2600 Compression redux
This is actually quite different than the first topic on 2600 compression.
Let's suppose we have some RAM available, and want to crunch a 4k bank of arbitrary 2600 code/data by as much as feasible, assuming the decompressor will live within some of the space we have freed. Therefore the decompressor should be simple, but compression can be as complicated as a modern computer can handle.
I have analyzed various 2600 binaries, and I'm showing the most common 5 bytes are typically A9, A5, 85, 02, and F0 (which would probably come as no surprise.) Also showing up a lot are 00, FF, 01 (more so on bankswitched games.) But in any case the most common 8 bytes typically comprise around 25% of the total size, sometimes up to 50%.
I have come up with a scheme based on Huffman coding. The program analyzes the binary, calculates the most common 8 bytes, and sorts them. The 8 bytes comprise the first 8 bytes of the image as as a lookup table, then the binary is encoded. If found, each of the 8 bytes are encoded in a bitsteam as 10, 110, 1110, 11110, 111110, 1111110, 11111110, and 11111111 respectively. The other 248 bytes are encoded as they appear but with a leading zero bit to distinguish them from the shorter codes. The decompression routine should be simple and short. This one certainly would be both, albeit a little slow computationally, but that's probably fine.
I tried it with bankswitched games first, and these are the results (per bank):
Asteroids: 2862, 4048
Gingerbread Man: 3810, 3817
Thrust: 3575, 3886, 3899, 4125(!)
Minigame Multicart: 2633, 2808, 3150, 3780, 2546, 3245, 3117, 3213
Medieval Mayhem: 4023, 3171, 2974, 3927, 3682, 3736, 3831, 3847
Juno First (beta 10): 3860, 3298, 3300, 3595, 3373, 3834, 2591, 2643
Strat-O-gems: 3528, 3421, 3972, 3970, 2963, 2953, 3939, 2713
(Some of you may know why I'm pleased with the 32k results, but I won't get into that now.)
4k games were less impressive:
Yars' Revenge: 3945
Space Invaders: 3964
Pitfall: 3896
Adventure: 3874
Gunfight: 3680
Swoops!: 4161(!!)
So the reason I've posted this was to open up a discussion as to how compression of 2600 binaries might be improved in terms of compressed code size, while balancing the size of the decompressor as it must be integrated into the compressed image.
18 Comments
Recommended Comments