newcoleco Posted June 7, 2016 (edited)

This thread is about DAN1 data compression. I would love to get feedback before and after the final release.

Name : DAN1
Author : Daniel Bienvenu aka NewColeco
Start Year : 2014
End Year : 2016
Description : LZSS variant data compression format using Elias gamma length values, four fixed offset sizes, and raw parts (literals stored as-is, see RLE).
Objective : Great compression ratio with no extra memory needed during decompression. Can be used on vintage 8-bit systems.

Details : DAN1 is a format based on LZSS (a variant of LZ77) which compresses by finding matches in already seen data. These matches are encoded as a pair of values : size and relative offset. The size is encoded using Elias gamma; the offset is encoded in 1, 4, 8 or 12 bits depending on its value. When no match is found for some particular data, it is stored raw, as literals. Bits are used as markers in the DAN1 format to distinguish literals from matches, and to distinguish the different offset sizes. The first byte is a literal by default, since no already seen data exists yet, so it needs no marker. An end marker tells the decompressor to stop regardless of whether more data follows in the stream or in memory. A special marker tells the decompressor to copy a sequence of bytes as-is, all at once. Since the offset sizes are hard-coded and no dynamic tables (such as adaptive Huffman) are used in this format, no extra memory is needed to decompress the data.

Data format (short text version) :
The first byte is a literal.
The following bytes provide bits indicating what comes next, reading from the highest bit first.
If a bit is set to 1, the next byte is a literal.
If a bit is reset to 0, the next bits (and bytes) are the values for size and relative offset.
The size is encoded in Elias gamma format : 1 = 1, 010 = 2, 011 = 3, 00100 = 4, 00101 = 5, 00110 = 6, 00111 = 7, etc.
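The size codes listed above follow the standard Elias gamma scheme: a value whose binary form has k+1 digits is written as k zero bits followed by those digits. A quick Python illustration of that scheme (my own sketch, not part of the DAN1 tools):

```python
def elias_gamma(n):
    """Elias gamma bit string for an integer n >= 1."""
    body = bin(n)[2:]                    # binary digits, leading 1 kept
    return "0" * (len(body) - 1) + body  # prefix with len-1 zero bits

def elias_gamma_decode(bits):
    """Inverse: count leading zeros, then read that many bits after the 1."""
    zeros = len(bits) - len(bits.lstrip("0"))
    return int(bits[zeros:2 * zeros + 1], 2)
```

For example, elias_gamma(4) gives "00100" and elias_gamma_decode("00111") gives 7, matching the table in the post.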
The offset is encoded as marker bits telling which field size (1, 4, 8 or 12 bits) holds the offset, followed by the value itself.
The RLE marker is a sequence of 17 bits reset to 0 followed by 1 bit set to 1; the number of literals to copy is 27 more than the value of the following byte.
The END marker is a sequence of 18 bits reset to 0.

Offset encoding details : The offset is relative to where the upcoming data will be written; if the offset value is zero, then the next byte to write is the same as the last byte just written. Because the size value can be one, meaning a match of a single nearby byte, such an offset should not be encoded too big, so 8- and 12-bit offsets are not allowed in that case; the bits saved would not cover the cost. Likewise, for a match of only 2 bytes, a 12-bit offset value doesn't provide any real saving. So here's a table explaining the way offsets are written in DAN1.

If the size value is 1 : marker bit 0 means the offset is encoded in one bit; marker bit 1 means the offset value (minus 2) is encoded in 4 bits. 0 0 = offset of zero (the last byte we just wrote), 0 1 (the byte before that), 1 0000 (the byte before that one), 1 0001, ..., 1 1111.
If the size value is 2 : marker bit 1 means the offset value (minus 18) is encoded in a byte (8 bits); marker bits 01 mean the offset value (minus 2) is encoded in 4 bits; marker bits 00 mean the offset value is encoded in 1 bit.
If the size value is greater than 2 : marker bit 1 means the offset value (minus 274) is encoded in 4 bits and a byte (12 bits); marker bits 01 mean the offset value (minus 18) is encoded in a byte (8 bits); marker bits 001 mean the offset value (minus 2) is encoded in 4 bits; marker bits 000 mean the offset value is encoded in 1 bit.
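The offset table above can be written out as a small bit-string function (a sketch with my own names, not the author's code; note that in the real stream the 8-bit value, and the byte half of the 12-bit value, are stored as whole byte-aligned bytes rather than through the bit buffer, which this simplification ignores):

```python
def dan1_offset_bits(length, offset):
    """Marker + value bits for a match offset, per the DAN1 table."""
    def field(v, n):
        return format(v, "0%db" % n)     # v as an n-bit binary string
    if offset < 2:                       # 1-bit offset value
        return {1: "0", 2: "00"}.get(length, "000") + field(offset, 1)
    if offset < 18:                      # 4-bit offset value, minus 2
        return {1: "1", 2: "01"}.get(length, "001") + field(offset - 2, 4)
    if offset < 274 and length >= 2:     # 8-bit offset value, minus 18
        return ("1" if length == 2 else "01") + field(offset - 18, 8)
    if offset < 274 + 4096 and length > 2:  # 12-bit offset value, minus 274
        return "1" + field(offset - 274, 12)
    raise ValueError("offset too large for this match length")
```

So a size-1 match at offset 0 costs only two bits ("00"), while a long match far away costs a marker bit plus 12 value bits.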
Download :
DAN1 (EXE, SRC, ASM) Beta version 20160715 : dan1-beta-20160715.zip

Older versions :
DAN1 (EXE file) Beta version 20160712 : dan1-beta-20160712.zip
DAN1 (EXE file) Beta version 20160710 (all-in-one tool) : dan1-beta-20160710.zip
DAN1 (EXE file) Beta version 20160704 (compression tool only) : dan1-beta-20160704.zip
DAN1 (EXE files) Alpha version 20160618 (limited to files up to 512K) : dan1-alpha-20160618.zip
DAN1 (EXE files) Alpha version 20160604 (limited to files up to 256K) : dan1.zip

Usage :
Compression (beta version) command-line :> dan1.exe [options] filename
Options include : -r to activate RLE encoding, -v for verbose output, -y to automatically say yes to overwrite prompts, -o to specify an output filename
Compression (alpha version) command-line :> dan1.exe filename
Decompression command-line :> ddan1.exe filename

Z80 Assembly Code to decompress in VRAM (ColecoVision version)
* 20160715 - fixed RLE bug and optimized for size *

; DAN1 + RLE to VRAM
; 2016 Daniel Bienvenu, Canada
; HL = Source
; DE = Destination (in VRAM)
dan1rvram:
; Set Write in VRAM at DE
    ld c, $BF
    out (c), e
    set 6, d
    out (c), d
    res 6, d
; Init. Read bits
    ld a, $80
; Copy literal byte
dan1r_copy_byte:
    ld b, $01
    jr dan1r_literals2main
dan1r_special:
    pop de
    call getbit
    ret nc                  ; EXIT
    ld b, (hl)              ; Load counter value as a byte
    inc hl
    inc b
    call dan1r_literals
    ld b, $1A
dan1r_literals2main:
    call dan1r_literals
; Main loop
dan1r_main_loop:
    call getbit             ; Check next bit
    jr c, dan1r_copy_byte
; Elias gamma decoding + Special marker
    push de
    ld de, $0001
    ld b, d
dan1r_eliasgamma_0:
    inc b
    call getbit             ; Check next bit
    jr c, dan1r_eliasgamma_value
    bit 4, b
    jr nz, dan1r_special    ; Special marker "0000000000000000"
    jr dan1r_eliasgamma_0
dan1r_eliasgamma_value_loop:
    call getbite            ; Load next bit into DE
    rl d
dan1r_eliasgamma_value:
    djnz dan1r_eliasgamma_value_loop
    push de
    pop bc                  ; BC = LENGTH
; Get Offset value
    ld d, $00               ; Reset Offset to $0000
; ON LEN GOTO TWO_OFFSETS, THREE_OFFSETS
; GOTO FOUR_OFFSETS
    ex af, af'
    ld a, b
    or a
    jr z, dan1r_bzero
    ld a, $03
dan1r_bzero:
    or c
    ld e, a
    ex af, af'
    dec e
    jr z, dan1r_offset2
    dec e
    jr z, dan1r_offset3
    ld e, d
dan1r_offset4:
    call getbit             ; Check next bit
    jr nc, dan1r_offset3
    call getnibblee         ; Get next nibble -> E
    inc e
    ld d, e                 ; D = E + 1
    jr dan1r_offset3a
dan1r_offset3:
    call getbit             ; Check next bit
    jr nc, dan1r_offset2
dan1r_offset3a:
    ld e, (hl)              ; Load offset as a byte (8 bits) -> E
    inc hl
    ex af, af'
    ld a, e
    add a, $12
    ld e, a
    jr nc, dan1r_offset3b
    inc d
dan1r_offset3b:
    ex af, af'
    jr dan1r_copy_from_offset
dan1r_offset2:
    call getbit             ; Check next bit
    jr nc, dan1r_offset1
    call getnibblee         ; Get next nibble -> E
    inc e
    inc e
    jr dan1r_copy_from_offset
dan1r_offset1:
    call getbit             ; Load next bit -> E
    rl e
; Copy previously seen bytes
dan1r_copy_from_offset:
    ex (sp), hl             ; Store source, Restore destination
    push hl                 ; Store destination
    scf
    sbc hl, de              ; HL = source = destination - offset - 1
    pop de                  ; DE = destination
                            ; BC = count
; COPY BYTES
    ex af, af'
    set 6, d
dan1r_copybytes_loop:
    push bc
    ld c, $BF
    out (c), l
    nop
    out (c), h
    inc hl
    nop
    nop
    in a, ($BE)
    nop
    nop
    nop
    out (c), e
    nop
    out (c), d
    inc de
    nop
    nop
    out ($BE), a
    pop bc
    dec bc
    ld a, b
    or c
    jr nz, dan1r_copybytes_loop
    res 6, d
    ex af, af'
    pop hl                  ; Restore source address
    jp dan1r_main_loop
dan1r_literals:
    ld c, $BE
dan1r_literals_loop:
    outi
    inc de
    jr nz, dan1r_literals_loop
    ret
getnibblee:
    call getbite            ; Load next bit -> E
    call getbite            ; Load next bit -> E
    call getbite            ; Load next bit -> E
getbite:
    call getbit             ; Load next bit -> E
    rl e
    ret
getbit:
    add a, a
    ret nz
    ld a, (hl)
    inc hl
    rla
    ret

Example :
Here is a file encoded in DAN1 format (in hex values) :
71 00 00 FF FF 00 00 0F FF F0 00 00 FF FF 02 00 00 00
Input size : 18 bytes
Decode :
71
< size = 65535 , offset in a bit = 0 >
< size = 65535 , offset in a bit = 0 >
< size = 65535 , offset in a bit = 0 >
< size = 2 , offset in a bit = 0 >
< END marker >
Output : 71 71 71 71 71 71 ... 71
Output size : 196608 bytes
Details : 196608 times the letter 'q' (see test2.bin file in Exomizer ZIP archive)

Edited July 17, 2016 by newcoleco
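The format description can be checked against this example with a straightforward Python decoder (a RAM-to-RAM sketch of the logic without the VRAM I/O; the names are mine, not from the DAN1 sources):

```python
def dan1_decode(data):
    """Decode a DAN1 stream to bytes (RAM-to-RAM sketch)."""
    out = bytearray([data[0]])        # first byte is always a literal
    pos, bitbuf, nbits = 1, 0, 0

    def getbit():
        nonlocal pos, bitbuf, nbits
        if nbits == 0:                # refill shift register from stream
            bitbuf, pos, nbits = data[pos], pos + 1, 8
        nbits -= 1
        return (bitbuf >> nbits) & 1

    def getbits(n):
        v = 0
        for _ in range(n):
            v = (v << 1) | getbit()
        return v

    def getbyte():                    # whole bytes are stored byte-aligned
        nonlocal pos
        pos += 1
        return data[pos - 1]

    while True:
        if getbit():                  # flag 1: a literal byte follows
            out.append(getbyte())
            continue
        zeros = 0                     # flag 0: Elias gamma length
        while zeros < 16 and getbit() == 0:
            zeros += 1
        if zeros == 16:               # special marker reached
            if not getbit():
                return bytes(out)     # END: 18 zero bits in total
            n = getbyte() + 27        # RLE: copy n raw literals
            out += data[pos:pos + n]
            pos += n
            continue
        length = (1 << zeros) | getbits(zeros)
        if length == 1:               # offsets follow the table above
            if getbit():
                offset = getbits(4) + 2
            else:
                offset = getbit()
        elif length == 2:
            if getbit():
                offset = getbyte() + 18
            elif getbit():
                offset = getbits(4) + 2
            else:
                offset = getbit()
        else:
            if getbit():
                offset = (getbits(4) << 8) + getbyte() + 274
            elif getbit():
                offset = getbyte() + 18
            elif getbit():
                offset = getbits(4) + 2
            else:
                offset = getbit()
        for _ in range(length):       # offset 0 = the byte just written
            out.append(out[-offset - 1])
```

Running it on the 18-byte example above produces 196608 bytes of $71 ('q'), matching the listed output size.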
newcoleco Posted July 11, 2016

DAN1 Compression tool version Beta 20160710 seems stable and performs at a good speed. The decompression routine is a bit slow in its small version, and that's expected (the size vs speed dilemma). Overall results surpass or equal the majority of available data compression schemes for 8-bit systems. Mission accomplished. I will give a talk during ADAMCon, July 15-17, 2016, in Guelph, Ontario, Canada. Any questions? Remarks? Comments?
Kamshaft Posted July 12, 2016

Over my head but still cool!
Tursi Posted July 12, 2016

What are your license terms? I could include it in my libti99coleco, but my library is PD.
newcoleco Posted July 12, 2016 (edited)

As you probably know, I have liked sharing my tools and games for decades, making them available freely, so what are the right license terms for what I believe is right? Is freeware good? Or perhaps GPL? Or is LGPL applicable? I don't know. But what I do know is that I make it available and hope it helps retro gaming enthusiasts like you and me.

Edited July 12, 2016 by newcoleco
newcoleco Posted July 12, 2016 (edited)

And that's part of my challenge right now as I write my talk for this weekend : making this new information somewhat clear, less abstract. Please ask your questions if any.

Edited July 12, 2016 by newcoleco
Tursi Posted July 12, 2016

I don't agree with most of the various drives to formalize free software (which is why my library went straight to public domain; I'm not owed anything on it and I don't control its usage at all). But that's why I need to ask before I can include your code: if I put it into my library, then the same thing happens to that code (not to your ownership of the protocol or any future versions, of course). So it's entirely up to you! But I'd like to include the decoder and also port it across to the TI-99 for that version of my lib.
Tursi Posted July 12, 2016

Oh, but I would add that GPL and LGPL should be applied with care. GPL is an "infectious" license which places demands on anyone using your code -- it would force anything using DAN1 to ALSO be GPL, and not everyone wants to release their source. LGPL is okay only if your code is distributed as a library; in source form as you list above, it would be the same as GPL. Licenses like "BSD" are a little more free in terms of freedom, IMO. Personally I don't use any of them except in cases where GPL has forced me to use it.
CrazyBoss Posted July 12, 2016

I tried to use it in the upcoming CollectorVision Star Command game; I used it to compress 16 KB of VRAM down to around 1600 bytes. It took a very long time to compress, using an i5 CPU, and the CPU did not use more than 33% all the time. I don't know if the decompressor is quicker than the one I usually use, but it works well and, most important, no use of RAM. So I think from now on I will use DAN1, if it's allowed for commercial use. By the way, the compressor said it's PP2PC and a tool to extract bitmapped data. Don't know if it's the newest version; it said beta-20160710.
newcoleco Posted July 12, 2016

Thanks for reminding me to clean up my code for the final release. PP2PC is another tool I made to convert PowerPaint pictures into Dale Wick's ".pc" files. I simply copy-pasted part of its code to implement options (or flags) such as "-v" to verbose all steps, "-h" to show help, and "-r" to allow RLE encoding, as in "there is a string of uncompressible data here; better not to try to compress it, that wastes memory space". I will explain all the DAN1 options (flags) in my talk.

The compression tool tries to explore all possibilities and pick the best solution for this format, and that's why it's slow. I guess a quick, not-so-optimized mode for development purposes should be an option for this tool, giving more of an idea of the space needed and decoding speed while programming and testing projects. If the actual DAN1 decompression routine is a bit slow for your taste, I'm sure someone will come up with a faster version the more this tool is used.

Side note : Try "dan1 -r filename" to allow RLE markers if you consider your data difficult to compress, such as highly dithered pictures.
JEFF31 Posted July 13, 2016

Hey Coleco programmers (from A to Z)... I've followed this thread for several years and I'm just a Coleco fan with ZERO knowledge of programming on the system (years ago I wanted to learn, but those days are behind me...). But to all of you (for what it's worth)... THANK YOU for sharing ideas here so that you can continue producing games for the Coleco. Continue please... Jeff31
CrazyBoss Posted July 13, 2016

I don't really mind the speed; just wondering why it doesn't use all the CPU speed available. The most important part is the Z80 side. Thanks for a great tool!
alekmaul Posted July 13, 2016

Sorry Daniel, I don't understand one thing. How do we use it with a BMP file, for example? Is it currently only for raw files?
newcoleco Posted July 14, 2016

Short answer : It takes raw files. It is not limited to pictures, and it is not a BMP to CV converter.

Long answer : No, it's not an image converter. It's a tool to compress raw data into this format I named DAN1. Data in VRAM can be many things such as fullscreen pictures, charsets, game levels, etc. For BMP files, try an image converter such as BMP2PP or DITHER to transform the BMP into a picture file. Extract the PATTERN and COLOR data tables into binary files and apply DAN1 compression. Add the compressed data to your project and use the appropriate code to set up the video chip and see the result on screen.
alekmaul Posted July 14, 2016

OK, thanks Daniel, it is clear to me; that's what I understood!
newcoleco Posted July 17, 2016 (edited)

The talk about DAN1 data compression was given at ADAMCon 28, July 16, 2 PM, in Guelph, Ontario, Canada. I tried as much as possible to be clear without going into details. I explained why I consider this format to be great for ColecoVision projects. A link to a copy of the presentation and related files soon.

Edited July 17, 2016 by newcoleco
newcoleco Posted July 22, 2016

Just a few words before I receive tons of messages about how to use this tool and what the difference is between the dan1vram and dan1rvram routines included in the zip archive.

DAN1 has two "flavors" : a normal one, and a special one with RLE coding for long strings of uncompressible data. You won't need the extra RLE coding most of the time because LZSS works just fine by itself. In case you want to use the special RLE coding, use the flag ' -r ' to allow its usage during compression, which doesn't mean it will necessarily be used for your data.

DAN1VRAM is for normally compressed DAN1 data.
DAN1RVRAM* is for DAN1 data using the special RLE coding.
* : You can use the DAN1RVRAM routine for both normal DAN1 data and DAN1 data with RLE.

I hope this is clear. Please share and comment. I'm already getting suggestions about tools using DAN1 in various ways.
alekmaul Posted November 1, 2016 (edited)

Hi, I have a strange behavior with your compressor, Daniel. I tried to integrate it in my tool (thanks for sharing the source code); it works fine, but the compression ratio is worse than the Pletter compressor's!

Dan1 :
const byte TILtitle3DAN1[] = { // tile size dan1 compressed 965 bytes (52%)
const byte COLtitle3DAN1[] = { // color size dan1 compressed 793 bytes (60%)
const byte MAPtitle3DAN1[] = { // map size dan1 compressed 385 bytes (49%)
Pletter :
const byte TILtitle3PLE[] = { // tile size pletter compressed 957 bytes (52%)
const byte COLtitle3PLE[] = { // color size pletter compressed 787 bytes (61%)
const byte MAPtitle3PLE[] = { // map size pletter compressed 385 bytes (49%)

I tried (only the TIL part of my image) with your exe compressor, and it gave the same size, 965 bytes. Do you think this is normal behavior for DAN1 compression? I attached the C source file of the Pletter compressor I used (the entry point is int pletterCompress(byte *src, byte *dst, int n)). Of course, for some pictures your compressor is sometimes better, but not all the time. This is just to inform you and to share information about compression. I used this image for the test. plet.txt

Edited November 1, 2016 by alekmaul
newcoleco Posted November 24, 2016 (edited)

Dear alekmaul,

The behavior you are experiencing is normal, for 3 reasons :

1. The fixed set may be overkill and waste bits in some cases, but my tests tend to demonstrate that a set of 4 sizes is optimal (5 being too much). The static set of 1, 4, 8 and 12 bits to encode the offset values is optimal on average for "large" ColecoVision graphics, according to my sample set. An offset using 12 bits (values going up to 2^12) goes beyond the limit of a charset (size 2^11), making the highest bit totally useless and so artificially increasing the size of the compressed data. Your picture has lots of repeated sequences very close to each other, within at most a charset's distance.

2. Pletter has multiple options to encode offset values and uses the best-fitting one according to the data. DAN1 encodes offset values into the four fixed sizes of 1, 4, 8 and 12 bits depending on their values. This set is ideal on average, but not adapted to the data. A workaround is to modify DAN1 to use a different set more adapted to the data used in your project, which is what I did in the slideshow demo in another thread; I named it DAN1C (C simply means it was variation C that I tried that day, and it worked nicely).

3. DAN1 is not a miracle solution, but it is surely a great one. My sample set for developing DAN1 was a bunch of bitmap screens from ColecoVision projects, pixelated photos and converted ZX Spectrum pixel art. I would love to see someone run an intensive test with hundreds of data files related to 8-bit homebrew projects. In fact, this would be very helpful for building the next generation of data compression tools.

Extra note : I've used with success a variation named DAN1C (variation C) that uses 1, 3, 8 and 11 bits to encode offset values. This variation is briefly mentioned in another thread as an alternative that works better for some graphics, like the slideshow demo.
Compared to the original set of 1, 4, 8 and 12 bits, this variation tends to reduce even more the simpler data, such as the color table, and so gives better results.

Edited November 24, 2016 by newcoleco
newcoleco Posted December 5, 2016 (edited)

Solved. Somehow, it is space-consuming to split into 3 tables (PATTERN, COLOR, NAME) instead of just 2 (PATTERN, COLOR), assuming we are talking about a bitmap screen. When using only 2 tables, DAN1 performs as expected, getting better results than Pletter. When using 3 tables under 2048 bytes each, DAN1 uses more bits than needed to store offset values. A solution is to adapt, and there are some ways to do it. For this problem, I considered a simple variation of DAN1 using a reasonable number of bits to store offset values, adapted to the data; in this case a maximum of 10 bits does the trick.

The results are :

With 3 tables
Pletter :
const byte TILtitle3PLE[] = { // tile size pletter compressed 957 bytes (52%)
const byte COLtitle3PLE[] = { // color size pletter compressed 787 bytes (61%)
const byte MAPtitle3PLE[] = { // map size pletter compressed 385 bytes (49%)
Total = 2129 bytes
Dan1_10 (adapted) :
const byte TILtitle3DAN1_10[] = { // tile size dan1 compressed 947 bytes (53%)
const byte COLtitle3DAN1_10[] = { // color size dan1 compressed 784 bytes (61%)
const byte MAPtitle3DAN1_10[] = { // map size dan1 compressed 385 bytes (49%)
Total = 2116 bytes

As for when the picture is kept as a bitmap in 2 tables, DAN1 performs nicely, and sometimes a maximum of only 11 bits performs better.
The results are :

With 2 tables (assuming a screen filled with all the possible characters to show a bitmap screen)
Pletter :
const byte TILtitle3Plet[] = { // tile size pletter compressed 1259 bytes (79%)
const byte COLtitle3Plet[] = { // color size pletter compressed 794 bytes (87%)
Total = 2053 bytes
Dan1_12 (original) :
const byte TILtitle3DAN1_12[] = { // tile size dan1 compressed 1248 bytes (79%)
const byte COLtitle3DAN1_12[] = { // color size dan1 compressed 799 bytes (87%)
Total = 2047 bytes
Dan1_11 (adapted) :
const byte TILtitle3DAN1_11[] = { // tile size dan1 compressed 1233 bytes (79%)
const byte COLtitle3DAN1_11[] = { // color size dan1 compressed 791 bytes (87%)
Total = 2024 bytes

Since the DAN1 source code is provided, it is possible to change it yourself and adapt the solution to your needs, as I've done here to present these results. With more data and time, I may create a new software solution, perhaps as a new format, and keep this one as-is; it's at least something to think about.

Edited December 5, 2016 by newcoleco
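One way to see the bit accounting behind these adapted variants: for matches longer than 2 bytes, the cost of the offset field depends only on which bucket the offset falls into, so shrinking the widest field from 12 bits saves 2 bits on every long-range offset at the price of a smaller match window (274 + 2^10 - 1 = 1297 bytes at 10 bits). A hypothetical helper illustrating this, not taken from the DAN1 sources:

```python
def offset_field_bits(offset, top_bits=12):
    """Bits spent on the offset of a match longer than 2 bytes,
    with the widest field set to top_bits (12 in stock DAN1,
    10 or 11 in the adapted variants discussed above)."""
    if offset < 2:
        return 3 + 1            # marker 000 + 1-bit value
    if offset < 18:
        return 3 + 4            # marker 001 + 4-bit value
    if offset < 274:
        return 2 + 8            # marker 01  + 8-bit value
    if offset < 274 + (1 << top_bits):
        return 1 + top_bits     # marker 1   + wide value
    raise ValueError("offset out of range for this variant")
```

For example, a long-range offset of 300 costs 13 bits in stock DAN1 but only 11 bits in the 10-bit variant, which is where the savings on small, close-packed tables come from.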