Graph2Font RLE

glurk · September 17, 2021

So I had a few K of data, a font and some G2F screen data, and noticed it had lots of repeated runs interspersed with sequential runs of bytes. So I made this RLE decoder for it, and saved about 32% (on this particular data). Maybe someone will find it useful. It doesn't handle runs of non-repeated, non-sequential data, as I didn't have many. So it was sort of purpose-built. Uses 5 zero page locations.

Quote

RLEDECOMP       SUBROUTINE

                 ldy   #1
                 lda   (src_addr),Y     ; load the data byte first
                 pha                      ; and push it
                 dey
                 sty    temp               ; disable the incrementer
                 lda   (src_addr),Y    ; get the command
                 bne   .ahead3
                 pla                         ; pop 'A' and then,
                 rts                         ; exit when CMD is $00
.ahead3   asl
                 bcc   .ahead2        ; if high bit set,
                 inc    temp            ; enable the incrementer
.ahead2     lsr
                 tax
                 pla                        ; pop the data byte
.rle1          sta   (dst_addr),Y
                ;clc                        ; unneeded I guess??
                 adc    temp           ; add the increment
                 inc    dst_addr
                 bne   .ahead1
                 inc    dst_addr+1
.ahead1     dex
                bne   .rle1
                ;clc                        ; apparently unneeded also?
                 lda   #2
                 adc    src_addr
                 sta    src_addr
                 txa                        ; faster than LDA #0
                 adc    src_addr+1
                 sta     src_addr+1
                 bne   RLEDECOMP ; unconditional

Sorry about the formatting, I couldn't figure out how to do that on this board. I think the two CLC instructions are unneeded, it works for me without them. I also don't have an encoder, because I did it by hand, hahaha... Just posting it in case someone finds it useful.

EDIT FOR EXAMPLE:

.BYTE 3,$01,128+5,$04 decodes to: 01 01 01 04 05 06 07 08

Edited September 17, 2021 by glurk

TGB1718 · September 17, 2021

May be just lucky, but both asl and lsr can set the carry bit

glurk · September 17, 2021

But it's an ASL soon followed by an LSR. So I guess the same zero that went in comes back out ! Hahaha.

It's the second CLC that I wonder about. I left them in there just in case so I can put them back in.

EDIT: Looking at it again, the carry should still be clear at that point (I think)

Edited September 17, 2021 by glurk

Harry Potter · September 17, 2021

If you have cc65, I have a version of the Stac technique online, but it's for CBM machines and will need to be converted for the Atari. I can give you a link. Are you interested?

glurk · September 17, 2021

Don't know much about 'Stac' but example code is always good... This one really came about by accident, and it worked really well for me, so I cleaned up the code and posted. I tend to like using code I write myself, so I can (usually) understand what it's doing :)

Harry Potter · September 17, 2021

Try looking for "Stac compression" on wikipedia.com. My version is at c65 additions - Manage /compress at SourceForge.net.

Harry Potter · September 17, 2021

I had an update to make Stac slightly better, but the copy broke. I need to redo it. Any recommendations to make it better?

glurk · September 17, 2021

If you're asking me in particular, I'm no expert on data compression at all. I wrote something similar to the above on an Apple ][ years ago, but this was something I just hacked together... I'm sure there are others here that know much more about this kinda stuff.

Harry Potter · September 17, 2021

glurk: I thank you for your response. I was not asking you in particular. I was asking the group. I'm asking because I want to improve it. I tried MTF and some of my own ideas. Only one of them worked. It was simply an optimization for performance that performs extra checks to make sure LZ77 blocks were efficient. I need to redo it.

tebe · September 17, 2021

RLE Encoder 1.2

http://madteam.atari8.info/index.php?prod=uzytki

the fastest and shortest


rle_depacker

    sta inputPointer+1

loop    jsr getByte
    beq stop
    lsr @

    tay
lp0    jsr getByte
lp1    sta $ffff
outputPointer    equ *-2

    inw outputPointer

    dey
_bpl    bmi loop

    bcs lp0
    bcc lp1

getByte    inx
    sne
    inc inputPointer+1

    lda $ff00,x        ; lo(inputPointer) = 0 !!!
inputPointer    equ *-2

stop    rts

glurk · September 17, 2021

tebe-

I wrote this one to run from ROM (cartridge-based), so no self-modifying code, plus it decodes runs like $00,$00,$32,$33,$34,$35 which I didn't see any other schemes doing, really. So hopefully I'm not re-inventing the wheel. I even looked at NES stuff, which uses RLE a lot, but that stuff is too different architecture.

glurk · September 18, 2021

Replying to myself, I've worked on this some more and got it to also handle runs of arbitrary data. Here's a before/after sample of g2f data:

(128 BYTES)
00 00 00 00 00 01 02 00 00 00 3B 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 07 08 09 0A 3C 3D 3E 3F 40 0F 41 11 3F 42 0B 02 15 43 44 45 0F 41 11 00 00 00 00 00
00 00 00 00 17 18 19 1A 46 47 48 49 4A 4B 4C 4D 46 4E 4F 26 23 50 51 52 53 4C 4D 00 00 00 00 00
00 00 00 00 29 2A 2B 2C 54 55 56 35 57 58 59 5A 5B 5C 34 38 5D 5E 5F 2F 00 59 5A 00 00 00 00 00

(72 BYTES)
05 00 82 01 03 00 01 3B 08 00 01 04 0F 00 84 07 85 3C 48 0F 41 11 3F 42 0B 02 15 83 43 43 0F 41
11 09 00 84 17 88 46 01 46 82 4E 42 26 23 84 50 82 4C 09 00 84 29 83 54 01 35 86 57 42 34 38 83
5D 42 2F 00 82 59 05 00

I think 128 bytes down to 72 is 44% compression. That's pretty good, IMO. It's really intended for g2f data, which generates a lot of runs of consecutive bytes. I haven't cleaned up the code yet, but it works. But no-one seems interested in it anyway, haha.... ?

tebe · September 18, 2021

3 hours ago, glurk said:

But no-one seems interested in it anyway, haha.... ?

it's not true

glurk · September 18, 2021

Ok, here's the new version then. It even uses an (optional) undocumented opcode, which saves 1 byte and 2 cycles. I guess I need to write an encoder to go with it, but not sure how to go about that... As a Windows program, or in C, or something else?

Anyway, this code is as tight and fast as I can make it. Someone more clever than myself might be able to optimize further:

RLEDECOMP        SUBROUTINE

            ldy    #1
            lda    (TEMP3),Y        ; load the data byte first
            pha                ; and push it
            dey
            sty    TEMP5            ; disable the incrementer
            lda    (TEMP3),Y        ; get the command
            bne    .ahead3
            pla                ; pop 'A' and then,
            rts                ; exit when CMD is $00
.ahead3        asl
            bmi    .dorun
            bcc    .ahead2        ; if high bit set,
            inc    TEMP5            ; enable the incrementer
.ahead2        lsr
            tax
            pla                ; pop the data byte
.rle1            sta    (TEMP1),Y
            adc    TEMP5            ; add the increment
            inc    TEMP1
            bne    .ahead1
            inc    TEMP2
.ahead1        dex
            bne    .rle1
            lda    #2
            adc    TEMP3
            sta    TEMP3
            bcc    RLEDECOMP
            inc    TEMP4
            bne    RLEDECOMP        ; unconditional
.dorun        .byte $4B,$7F        ; undocumented ASR #$7F, same as AND #$7F, LSR
            tax
            pla
            inc    TEMP3
            bne    .drloop
            inc    TEMP4
.drloop        inc    TEMP3
            bne    .ahead4
            inc    TEMP4
.ahead4        sta    (TEMP1),Y
            lda    (TEMP3),Y
            inc    TEMP1
            bne    .ahead6
            inc    TEMP2
.ahead6        dex
            bne    .drloop
            beq    RLEDECOMP        ; unconditional

TEMP1 and 2 are the destination, TEMP3 and 4 are the source, TEMP5 is the "increment-er" - all zero-page locations.

$00 = END of stream

$01-$3F = Repeat the following byte that number of times

$41-$7F = Dump the following n-$40 bytes directly

$81-$FF = Repeat the following byte n-$80 times, incrementing by 1 each byte

I (think) that's an accurate description. In any case, I'm using it and it works...

Edited September 18, 2021 by glurk

Harry Potter · September 18, 2021

I'm jealous of you. I ask you to port it to cc65 C and let me use it for Template Creator CBM which is to be ported to the Atari 8-bit series.

glurk · September 18, 2021

There was a slight bug in the above code, the relevant part should be this:

.ahead3        asl
            bcc    .ahead2        ; if high bit set,
            inc    TEMP5            ; enable the incrementer
.ahead2        bmi    .dorun
            lsr
            tax

Harry Potter-

This is just a de-compressor for a certain type of data, with specific characteristics. Graph2Font generated data primarily. It wouldn't be great at general purpose stuff. It's "kind of" based on RLEINC for the NES, but they decompress into a 'port' (a single address) on their PPU, and this one works with src/dest addresses.

I probably need to test it more to make sure there are no other bugs like the one I caught above. It's really easy to make an "off-by-one" error that messes up the whole thing. I haven't tested every possible case, but I think it's solid now(?) I mean, I'm using it in a program now, and it's working with the data there...

Harry Potter · September 18, 2021

Okay. How does it fare against the Stac technique?

Harry Potter · September 18, 2021

BTW, I'm working on my own compression technique. It's very slow, but, right now, I'm concentrating on getting better numbers. I have to optimize later.

glurk · September 18, 2021

The way I look at it - at least for "game stuff", which this is - is that it's OK for the COMPRESSOR to be slow, since more time can/should allow it to compress better, and you only do that part once anyway. But you want the in-game DECOMPRESSOR to be really fast.

That said, I "invented" this scheme in my head, and then wrote a decoder for it. Writing the code to actually do the compression itself is a whole other thing. Recursion, ouch, makes my head hurt. Plus, I guess it should be a Windows-based compressor, and Windows programming is not my thing at all.

I actually encoded/compressed my own data by hand, which was not that big a deal, really, mostly involved removing bytes from a text file of data and putting in the counts etc... For this to really be useful for others, a compressor will need to be written, but I'm not sure how to do that part of it, hahaha. Never written one before, I just came up with the (simple, really) RLE-variant compression scheme.

glurk · September 19, 2021

tebe-

I decided to test this on a real g2f image, and I used your "Gollum" image. The file Gollum_C64_Rensoup_Tebe.scr which is 1200 bytes, compresses down to 253 bytes using this scheme. I actually encoded it by hand (lots of work) because I don't yet have an encoder, and I'm not sure at all that I really have the skills/know how to write one. It would have to be a Windows program, or at least commandline C.

But, 1-(253/1200) is 0.789166666 which is 79% compression!! I'd say that's pretty good.

I could REALLY use help writing an encoder for this, from anyone who wants to volunteer. I don't think I can do it, I'm mostly a 6502 guy, and can't do much with Windows-based programming.

tebe · September 19, 2021

I don't know how you got that result.

I compressed this file with RLE and got a result of 994 bytes

gollum.7z

glurk · September 19, 2021

It's not just simple RLE, it encodes sequences like 06,07,08,09,0A,0B,0C,0D etc. That sequence would be encoded like:

.byte $88,$06

And those n+1 increasing sequences are so common in the .SCR files, that's why I wrote this thing. Another example:

.byte 06,FF,86,DD,42,BA,AE,02,FF decodes to: FF,FF,FF,FF,FF,FF,DD,DE,DF,E0,E1,E2,,BA,AE,FF,FF

It's really efficient at n+1 increasing sequences.

Edited September 19, 2021 by glurk
fixed bad mistake!

glurk · September 19, 2021

I'm working on writing an encoder for this, it's going very slowly. Here is the current output when run on the "Gollum_C64_Rensoup_Tebe.scr" file:

File is 1200 bytes long.

Repeated run of 81 bytes of value: 0

Increasing run of 31 bytes starting with: 1

Increasing run of 3 bytes starting with: 160

Increasing run of 2 bytes starting with: 35

Repeated run of 3 bytes of value: 0

Increasing run of 4 bytes starting with: 37

Those values are all decimal - and correct. It then gets to the $0D byte, which I haven't handled yet. I'm trying to write this encoder for Windows, and the language I'm using is QB64 (BASIC) and I've never really used it before.

tebe · September 19, 2021

1 hour ago, glurk said:

It's not just simple RLE, it encodes sequences like 06,07,08,09,0A,0B,0C,0D etc. That sequence would be encoded like:

.byte $88,$06

ahhh, now I understand your idea, clever

Edited September 19, 2021 by tebe

ivop · September 19, 2021

Perhaps an easier way to implement this is to see it as a form of delta compression. First create the deltas, then use plain RLE on the deltas. Increasing runs result in runs of 1. Decreasing runs result in runs of $ff. Normal runs turn into delta, 0, 0, 0 etc...

Edited September 19, 2021 by ivop

Graph2Font RLE

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members