For people running into memory issues, I guess we can assume they have a lot of relatively small files, so there is only 1 chunk in the chunk_ids list. So,...
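To make that concrete, here is a rough back-of-envelope estimate of how a files cache like this grows with many small files. The per-entry sizes below are illustrative assumptions, not borg's actual cache layout:

```python
# Rough estimate of files-cache memory for many small files.
# All sizes here are illustrative assumptions, not borg's actual layout.
ID_SIZE = 32          # assumed bytes per chunk id
ENTRY_OVERHEAD = 100  # assumed bytes per entry (path hash, mtime, size, inode, ...)

def files_cache_estimate(num_files, chunks_per_file=1):
    """Estimate files-cache size in MiB for num_files entries."""
    per_entry = ENTRY_OVERHEAD + chunks_per_file * ID_SIZE
    return num_files * per_entry / 2**20

# With mostly small files (1 chunk each), the cache grows linearly with file count:
print(f"{files_cache_estimate(5_000_000):.0f} MiB")  # roughly 630 MiB under these assumptions
```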
@RonnyPfannschmidt can you check? ^^^
@Gelma borg accesses these hashtables a lot and my gut feeling about speed is: hashindex (RAM) > hashindex (mmap) >> sqlite (disk). Especially for non-first backups of mostly unchanged files,...
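A toy comparison (not borg's hashindex code; names, sizes and counts are just for illustration) of why many small id lookups favour an in-RAM table over a disk-backed sqlite database:

```python
import os, sqlite3, time

# Toy benchmark: many random id lookups against an in-RAM dict vs. sqlite on disk.
# This only illustrates the "hashindex (RAM) >> sqlite (disk)" gut feeling,
# it is not borg's actual index implementation.
N = 200_000
ids = [os.urandom(32) for _ in range(N)]

ram_index = {i: (1, 4096, 2048) for i in ids}  # id -> refcount, size, csize

db = sqlite3.connect("chunks.sqlite")
db.execute("CREATE TABLE IF NOT EXISTS chunks (id BLOB PRIMARY KEY, refcount INT, size INT, csize INT)")
db.executemany("INSERT OR REPLACE INTO chunks VALUES (?, 1, 4096, 2048)", [(i,) for i in ids])
db.commit()

t0 = time.perf_counter()
for i in ids:
    _ = ram_index[i]
t1 = time.perf_counter()
for i in ids:
    db.execute("SELECT refcount, size, csize FROM chunks WHERE id = ?", (i,)).fetchone()
t2 = time.perf_counter()
print(f"dict: {t1 - t0:.3f}s  sqlite: {t2 - t1:.3f}s")
```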
@RonnyPfannschmidt maybe we can split the "save RAM" (by using less RAM), "mmap compatible" (layout) and "actually use mmap" (only have in RAM what we access) aspects. The first two...
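For the "mmap compatible layout" part, the key idea is a flat, fixed-size record format that can be read the same way whether it lives in a buffer in RAM or in an mmap'ed file. A minimal sketch, with field sizes and names as assumptions for illustration (not borg's hashindex format):

```python
import mmap, struct

# One record: 32-byte id + refcount + size + csize. Fixed size, no pointers,
# so the same accessor works on bytes/bytearray in RAM and on an mmap of a file.
RECORD = struct.Struct("<32sIII")

def get_entry(buf, index):
    """Read record `index` from any buffer-like object (bytes, bytearray, mmap)."""
    return RECORD.unpack_from(buf, index * RECORD.size)

# In RAM:
buf = bytearray(RECORD.size * 10)
RECORD.pack_into(buf, 0, b"\x01" * 32, 2, 4096, 2048)
print(get_entry(buf, 0))

# Same accessor over a file via mmap (only the pages actually touched get paged in):
with open("index.bin", "wb") as f:
    f.write(bytes(buf))
with open("index.bin", "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    print(get_entry(m, 0))
```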
Please be more specific: which (full) command exactly takes too long?
OK, so you recompress "if different" AND you exclude stuff. So how do you think this can get faster?
If you exclude stuff, it has to process all archives and look at all files in there anyway. If you only recompress "if different", it will recompress a specific chunk...
Note: the chunks index only has `id -> refcount, size, csize`. Question: does it read the whole chunk to determine the compression type? Answer: yes, see `ArchiveRecreator.chunk_processor`:
```
if recompress ...
```
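For context, "determine the compression type" boils down to fetching the stored chunk and looking at its decrypted (but not decompressed) header. A rough sketch of that decision, where `repository`, `key` and `detect_compressor` are stand-ins for illustration, not borg's real API:

```python
def needs_recompression(chunk_id, repository, key, detect_compressor, target_compressor):
    """Decide whether a stored chunk must be recompressed ("recompress if different").

    All collaborators are passed in as stand-ins; the point is only that the
    stored chunk has to be read (and decrypted) before its compression type is known.
    """
    stored = repository.get(chunk_id)                    # read the whole stored chunk
    plaintext = key.decrypt(stored, decompress=False)    # authenticate + decrypt, no decompression
    return detect_compressor(plaintext) != target_compressor  # compare detected type with target
```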
If you do actions which influence the contents of an archive (by recreating it with new excluded files), it of course needs to read each archive, how else should it...
IIRC, I've seen code that activates re-chunking only if chunker params are different. So, in that case it would be: read, authenticate, decrypt, decompress, rechunk, compress, encrypt, authenticate, write. But...
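As a sketch of that pipeline (the helper names are placeholders, not borg's API), the per-item rechunking path would look roughly like:

```python
# Placeholder pipeline for the "chunker params differ" case during recreate.
# None of these helpers are borg's real API; this only spells out the
# read -> authenticate/decrypt -> decompress -> rechunk -> compress ->
# encrypt/authenticate -> write sequence described above.
def recreate_item(old_chunk_ids, repository, key, chunker, compressor):
    plaintext = b"".join(
        key.decrypt(repository.get(cid))      # read + authenticate + decrypt + decompress
        for cid in old_chunk_ids
    )
    for new_chunk in chunker.chunkify(plaintext):          # rechunk with the new params
        data = compressor.compress(new_chunk)               # compress
        envelope = key.encrypt(data)                         # encrypt + authenticate
        repository.put(key.id_hash(new_chunk), envelope)     # write
```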