
the VeadoDelta encoding  
21 march 2023

figured it was a good idea to document the image encoding format i’m currently using in veadotube (it’s not used in any currently available versions, though)! it’s a simple delta-encoding format, hence the name.

each channel is encoded separately, compressing delta values with prefix codes. the compressed image is in RGB with alpha, with all channels having 8-bit depth (0 to 255).

to compress a channel, the encoder first replaces each value with the difference between it and the previous value, keeping the first value intact. like this:

original:  6   7   8   7   11  14  14  14  14  14
   delta:  6   1   1  -1   4   3   0   0   0   0

these deltas are stored as signed bytes, meaning that they range from -128 to 127 and take advantage of byte overflows/underflows:

original:  0   255 0   240
   delta:  0   -1  1  -16
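
a minimal python sketch of this delta step, reproducing both examples above. the function names are made up for illustration, and treating the first value as a delta from 0 is an assumption (it keeps the first byte intact once wrapped):

```python
def to_deltas(values):
    """Turn 0..255 channel values into wrapped signed-byte deltas."""
    deltas = []
    prev = 0  # assumption: the first value is a delta from 0
    for v in values:
        d = (v - prev) & 0xFF   # wrap the difference into 0..255
        if d >= 128:
            d -= 256            # reinterpret as a signed byte (-128..127)
        deltas.append(d)
        prev = v
    return deltas


def from_deltas(deltas):
    """Inverse of to_deltas: accumulate deltas with byte wraparound."""
    values = []
    prev = 0
    for d in deltas:
        prev = (prev + d) & 0xFF
        values.append(prev)
    return values
```

because the addition wraps at 8 bits, `from_deltas(to_deltas(v))` gives back `v` for any list of byte values.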

the delta values are then encoded as prefix codes of variable bit length:

abs value    | sequence
x = 0        | 1 0
x = 1 ~ 2    | 1 1 [1 bit, x - 1] [sign bit]
x = 3 ~ 4    | 0 1 0 [1 bit, x - 3] [sign bit]
x = 5 ~ 8    | 0 1 1 [2 bits, x - 5] [sign bit]
x = 9 ~ 16   | 0 0 1 0 [3 bits, x - 9] [sign bit]
x = 17 ~ 32  | 0 0 1 1 [4 bits, x - 17] [sign bit]
x = 33 ~ 64  | 0 0 0 1 0 [5 bits, x - 33] [sign bit]
x = 65 ~ 128 | 0 0 0 1 1 [6 bits, x - 65] [sign bit]

the sign bit is 1 for positive, and 0 for negative. there’s also a special bit sequence that must be used when 0 is repeated at least 7 times, up to a maximum run of 262 (here x is the number of repeats):

0 0 0 0 [8 bits, x - 7]
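
here’s a rough python sketch of how the table and the zero-run sequence could be turned into bits. it is not veadotube’s actual code: the names are hypothetical, the order of bits inside each payload field (least significant first) is an assumption, and so is splitting runs longer than 262 into several sequences:

```python
# (largest |delta| in the row, prefix bits, payload width, smallest |delta|)
ROWS = [
    (2,   [1, 1],          1, 1),
    (4,   [0, 1, 0],       1, 3),
    (8,   [0, 1, 1],       2, 5),
    (16,  [0, 0, 1, 0],    3, 9),
    (32,  [0, 0, 1, 1],    4, 17),
    (64,  [0, 0, 0, 1, 0], 5, 33),
    (128, [0, 0, 0, 1, 1], 6, 65),
]


def field(value, width):
    # assumption: payload fields are emitted least significant bit first
    return [(value >> i) & 1 for i in range(width)]


def encode_delta(d):
    """Bits for a single delta in -128..127 (a lone 0 becomes 1 0)."""
    if not -128 <= d <= 127:
        raise ValueError("delta does not fit in a signed byte")
    if d == 0:
        return [1, 0]
    x = abs(d)
    sign = 1 if d > 0 else 0
    for upper, prefix, width, base in ROWS:
        if x <= upper:
            return prefix + field(x - base, width) + [sign]


def encode_deltas(deltas):
    """Bits for a whole channel, using the run sequence for 7+ zeros."""
    bits = []
    i = 0
    while i < len(deltas):
        if deltas[i] != 0:
            bits += encode_delta(deltas[i])
            i += 1
            continue
        run = 1
        while i + run < len(deltas) and deltas[i + run] == 0 and run < 262:
            run += 1
        if run >= 7:
            bits += [0, 0, 0, 0] + field(run - 7, 8)
        else:
            bits += [1, 0] * run   # short runs: one "1 0" per zero
        i += run
    return bits
```

under these assumptions a delta of 1 comes out as `1 1 0 1`, and a run of seven zeros as `0 0 0 0` followed by eight zero bits.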

it’s important to note that bit sequences are read/written from the least significant bit to most.

each encoded channel is padded to a whole number of bytes, meaning that each one has an exact byte length, instead of one byte storing bits from more than one channel.
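
the bit order and padding could be sketched like this in python. `pack_bits` and `unpack_bits` are hypothetical names, and filling the unused bits of the last byte with zeros is an assumption (a decoder would stop once it has produced the expected number of values, so the padding is never interpreted):

```python
def pack_bits(bits):
    """Pack a list of 0/1 bits into bytes, filling each byte from its
    least significant bit upwards; the last byte is zero-padded."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_bits(data):
    """Read bits back in the same order (padding bits included)."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]
```

since every channel is packed on its own, each one starts on a byte boundary and can be handed to a separate thread.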

here’s the entire data buffer itself:

- start offset of the encoded R channel, 4 bytes
- start offset of the encoded G channel, 4 bytes
- start offset of the encoded B channel, 4 bytes
- encoded alpha channel
- encoded R channel
- encoded G channel
- encoded B channel
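
as an illustration, this layout could be written and read like so in python. the byte order of the offsets (little-endian here) and measuring them from the start of the buffer are both assumptions, and the function names are made up:

```python
import struct

HEADER_SIZE = 12  # three 4-byte offsets (R, G, B)


def build_buffer(alpha, r, g, b):
    """Concatenate four already-encoded channels behind the offset header."""
    r_off = HEADER_SIZE + len(alpha)   # alpha starts right after the header
    g_off = r_off + len(r)
    b_off = g_off + len(g)
    # assumption: unsigned 32-bit little-endian, relative to buffer start
    header = struct.pack("<III", r_off, g_off, b_off)
    return header + alpha + r + g + b


def split_buffer(buf):
    """Return the (alpha, r, g, b) encoded channels from a buffer."""
    r_off, g_off, b_off = struct.unpack_from("<III", buf, 0)
    return (buf[HEADER_SIZE:r_off], buf[r_off:g_off],
            buf[g_off:b_off], buf[b_off:])
```

alpha needs no offset of its own because it always begins right after the header, and having all the start offsets up front is what lets each channel be decoded independently.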

a few notes:

i employed this encoding as i needed something to replace the previous solution i went with; the image in the avatar file used to be stored as the original image file. this meant that if the user imported a PNG file, that same PNG file was kept intact in the avatar file, so every time an avatar was loaded the entire image file needed to be parsed again, which can be slow.

images in memory also took a lot of space. i had two options here:

the latter option is what mini 1.4 does, and it’s not ideal, honestly – it’s partly why mini currently has a 2048x2048 limit, so that people don’t innocently crash veado with heavy images.

so the idea was to have an encoding that met the following requirements:

i tried to go with QOI, which works for most purposes, but i realised that i wanted to take advantage of threading. so i figured i’d go down the “write my own encoding” route, which isn’t as insane as it sounds – as i’ve found while working on image file decoding, most project file formats (Photoshop, MediBang, SAI, whatnot) roll their own solutions for encoding images for their own purposes (and most do delta encoding as well!)

in the end, VeadoDelta works quite well! it seemed to achieve the file size goal from my tests (remind me to properly post all the tests here! i don’t have them with me right now), so it’s what i’m using for future versions of veado :]

it has one caveat: in the worst case (white pixel followed by grey pixel and so on), an encoded channel takes up 1.25x the original size. that rarely happens, but if the encoding ends up the same size as the raw image or larger, the encoder simply throws it away and stores the raw image instead. thus, before decoding, the program must check if the data is smaller than what the raw image would take – if so, decode! otherwise, just use the buffer straight away.
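
that size check could look something like this in python. it’s only a sketch: the function names are hypothetical, and the raw size assumes 4 channels at 1 byte each, per the RGB-with-alpha description earlier:

```python
def raw_size(width, height, channels=4):
    """Byte size of the unencoded image (assumes 1 byte per channel)."""
    return width * height * channels


def choose_payload(raw, encoded):
    """Encoder side: keep the encoding only if it is strictly smaller."""
    return encoded if len(encoded) < len(raw) else raw


def needs_decoding(payload, width, height):
    """Decoder side: a payload smaller than the raw image must be encoded."""
    return len(payload) < raw_size(width, height)
```

the check only works because the encoder never stores an encoding that is exactly the raw size, so the length alone tells the two cases apart.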

it’s also honestly surprising how image processing is taking a lot more of my time than i expected, but i guess it makes sense, as i’m making an app that in its essence simply puts images together!