borg prefix compressed chunks with decompressed size?

we don't have that yet, but I guess it would be good for:

- decompression buffer allocation
- knowing not only the csize, but also the size, without actually reading and decompressing the chunk

as the prefix is encrypted together with the chunk, we do not disclose anything by this.

borg can then find out:

- csize: by looking at the PUT header, by looking at the length of the read data, or via the new repo index
- size: by reading a few bytes from the chunk; decrypting them reveals compression type, level, and size

borg 1.1/1.2 needs to read all archive metadata streams to find out all chunk sizes. the new way would "only" need to read a few bytes from each chunk exactly once.

May 18 '22 01:05 ThomasWaldmann

to simplify getting the size, the ObfuscatedSize type chunks would also be prefixed with the un-obfuscated, uncompressed size.

May 18 '22 12:05 ThomasWaldmann

What we had in borg 1.2 is like:

OBFUS_HEADER = TYPE8 0x00 csize32  # the length is of the payload without the obfuscation trailer
COMPR_HEADER = TYPE8 0x00

OK, so the first idea for borg 2 was like this:

OBFUS_HEADER = TYPE8 0xFF size32 csize32  # csize32 = len(payload) - len(obfusc_trailer)
COMPR_HEADER = TYPE8 LEVEL8 size32

Maybe simpler 2nd idea:

COMPR_HEADER = TYPE8 LEVEL8 size32 csize32
- type, level, csize refer to the compressed data
- size is the length after decompression
- and, important, the payload **might be longer than csize (if obfusc_trailer is appended)**
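
A minimal sketch of how such a header could be packed and parsed with Python's `struct` module. The little-endian byte order, the type id `0x05`, and the helper names `pack_chunk` / `parse_chunk` are assumptions for illustration only, not borg's actual format or API:

```python
import struct
import zlib

# Assumed layout: TYPE8 LEVEL8 size32 csize32, little-endian, no padding.
COMPR_HEADER = struct.Struct("<BBII")


def pack_chunk(ctype, level, size, compressed, trailer=b""):
    """Build header + compressed data + optional obfuscation trailer."""
    header = COMPR_HEADER.pack(ctype, level, size, len(compressed))
    return header + compressed + trailer


def parse_chunk(payload):
    """Return (ctype, level, size, compressed_data), ignoring any trailer."""
    ctype, level, size, csize = COMPR_HEADER.unpack_from(payload)
    start = COMPR_HEADER.size
    data = payload[start:start + csize]
    if len(data) != csize:
        raise ValueError("payload shorter than csize")
    return ctype, level, size, data


# Usage: the trailer makes the payload longer than csize, but parsing
# still recovers exactly the compressed bytes.
plain = b"hello borg " * 100
compressed = zlib.compress(plain, 6)
payload = pack_chunk(0x05, 6, len(plain), compressed, trailer=b"\0" * 37)
ctype, level, size, data = parse_chunk(payload)
```

Because csize is stored explicitly, the reader does not need a separate header type to know where the obfuscation trailer starts.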

Guess this would be a nice basis for implementing a size/csize API:

headers = repo.get_headers(chunkids)
decrypt_parse(chunkids, headers) -> [(id, size, csize), ...]
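
Roughly, the two calls could fit together as sketched here. `FakeRepo` is a hypothetical in-memory stand-in for a repository, the header layout is assumed, and `decrypt` is an identity placeholder; this only works with a real cipher if that cipher allows decrypting just the header bytes:

```python
import struct

HEADER = struct.Struct("<BBII")  # assumed: type, level, size, csize


class FakeRepo:
    """Hypothetical in-memory stand-in for a repository."""

    def __init__(self, chunks):
        self.chunks = chunks  # chunk id -> stored bytes

    def get_headers(self, chunkids):
        # Return only the first HEADER.size bytes of each chunk.
        return [self.chunks[cid][:HEADER.size] for cid in chunkids]


def decrypt_parse(chunkids, headers, decrypt=lambda b: b):
    """Return [(id, size, csize), ...]; `decrypt` is a placeholder."""
    result = []
    for cid, raw in zip(chunkids, headers):
        _type, _level, size, csize = HEADER.unpack(decrypt(raw))
        result.append((cid, size, csize))
    return result


# Usage with two made-up chunks.
repo = FakeRepo({
    b"a": HEADER.pack(1, 3, 1000, 400) + b"x" * 400,
    b"b": HEADER.pack(1, 3, 50, 60) + b"y" * 60,
})
ids = [b"a", b"b"]
sizes = decrypt_parse(ids, repo.get_headers(ids))
```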

Hrm, crap, guess we can not use the AEAD ciphers to decrypt just a part of the payload: the authentication tag covers the whole ciphertext, so it can not be verified from a prefix alone.

Aug 19 '22 11:08 ThomasWaldmann

So guess we would need something like:

- sizeof(encrypted_metadata)
- encrypted_metadata (with type, level, size, csize, ... whatever), using Struct or msgpacked dict.
- obfuscated, (separately) encrypted, compressed data
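
A rough sketch of that framing, using the Struct variant. The 32-bit length prefix, the field layout, and the names `frame` / `read_meta` are assumptions; `toy` is an XOR placeholder and NOT encryption, standing in for a real AEAD cipher applied separately to the metadata:

```python
import struct

LEN = struct.Struct("<I")      # assumed: 32-bit length of the encrypted metadata
META = struct.Struct("<BBII")  # assumed: type, level, size, csize


def frame(meta, data_enc, encrypt_meta):
    """Build: sizeof(encrypted_metadata) | encrypted_metadata | encrypted data."""
    meta_enc = encrypt_meta(META.pack(*meta))
    return LEN.pack(len(meta_enc)) + meta_enc + data_enc


def read_meta(blob, decrypt_meta):
    """Decrypt only the metadata part; return (meta_tuple, offset_of_data)."""
    (meta_len,) = LEN.unpack_from(blob)
    end = LEN.size + meta_len
    meta = META.unpack(decrypt_meta(blob[LEN.size:end]))
    return meta, end


def toy(b):
    """Placeholder transform (XOR), only to make the sketch runnable."""
    return bytes(x ^ 0x5A for x in b)


# Usage: size and csize are available without touching the data part.
blob = frame((2, 6, 1000, 400), b"D" * 450, toy)
meta, data_offset = read_meta(blob, toy)
```

The explicit length prefix matters mostly when the encrypted metadata has a variable size, for example with a msgpacked dict or a cipher that adds a nonce and tag.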

Aug 19 '22 16:08 ThomasWaldmann

superseded by #6987.

Sep 05 '22 14:09 ThomasWaldmann