Consider caching decrypted and decompressed data

commial opened this issue on Apr 26, 2023 · 0 comments

When a seek is performed in an archive, the reader has to:

  1. go back to the beginning of the corresponding encrypted block
  2. decrypt it
  3. decompress from the start of the compressed block

If another seek occurs, the decrypted and decompressed data are discarded, and all of this work must be done again.
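To make the cost concrete, here is a minimal sketch of the block arithmetic, assuming a fixed uncompressed block size (the constant and function are illustrative, not MLA's actual layout):

```rust
/// Illustrative only: assumes every block holds BLOCK_SIZE uncompressed
/// bytes; MLA's real block layout may differ.
const BLOCK_SIZE: u64 = 4 * 1024 * 1024;

/// For a seek to `target` (an uncompressed offset), return the block to
/// rewind to (step 1) and the number of bytes that must be decrypted and
/// decompressed again just to reach the target (steps 2 and 3).
fn seek_cost(target: u64) -> (u64, u64) {
    let block_index = target / BLOCK_SIZE;
    let bytes_redone = target % BLOCK_SIZE;
    (block_index, bytes_redone)
}
```

Under this assumption, seeking to byte 6 MiB rewinds to block 1 and re-decodes 2 MiB before reaching the first useful byte.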

This is not a problem when performing linear extraction, but some access patterns suffer from this behavior, for instance:

  1. Create an archive with interleaved files, i.e. [File 1 content 1][File 2 content 1][File 1 content 2][File 2 content 2], etc.
  2. Iterate over the file list (File 1, File 2) and read each of them

In the worst case of n tiny parts for each of n files, every block could be decrypted and decompressed up to n times.
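A hedged sketch of that pattern over a generic seekable reader (the part table and function are hypothetical, not MLA's API); without a cache, every seek below restarts decryption and decompression from the enclosing block:

```rust
use std::io::{Read, Seek, SeekFrom};

/// Read one file whose content is split into `parts` (offset, length)
/// pairs scattered across the archive. Each seek forces the layers below
/// to rewind to the block start, decrypt it, and decompress it again,
/// even when consecutive parts land in the same block.
fn read_scattered_file<R: Read + Seek>(
    reader: &mut R,
    parts: &[(u64, usize)],
) -> std::io::Result<Vec<u8>> {
    let mut content = Vec::new();
    for &(offset, len) in parts {
        reader.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; len];
        reader.read_exact(&mut buf)?;
        content.extend_from_slice(&buf);
    }
    Ok(content)
}
```

Calling this once per file, with the parts of all files interleaved in the same blocks, produces exactly the repeated decrypt/decompress work described above.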

To avoid this, a cache could be used between layers. The implementation:

  • can be provided to each layer
  • could be a layer of its own
  • must have an eviction strategy to avoid consuming the whole RAM in the case of very large files (maybe an LRU? see the sketch below)
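As a starting point, here is a minimal sketch of such a cache, assuming decompressed blocks keyed by block index and a capacity bounded in bytes (all names are illustrative, not a proposed API):

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical cache of decompressed blocks, keyed by block index.
/// Capacity is bounded in bytes; least-recently-used blocks are evicted.
struct BlockCache {
    blocks: HashMap<u64, Vec<u8>>,
    lru: VecDeque<u64>, // front = least recently used
    used_bytes: usize,
    capacity_bytes: usize,
}

impl BlockCache {
    fn new(capacity_bytes: usize) -> Self {
        Self {
            blocks: HashMap::new(),
            lru: VecDeque::new(),
            used_bytes: 0,
            capacity_bytes,
        }
    }

    /// Return the cached block, if any, marking it as most recently used.
    fn get(&mut self, idx: u64) -> Option<&[u8]> {
        if self.blocks.contains_key(&idx) {
            if let Some(pos) = self.lru.iter().position(|&i| i == idx) {
                self.lru.remove(pos);
            }
            self.lru.push_back(idx);
            self.blocks.get(&idx).map(|v| v.as_slice())
        } else {
            None
        }
    }

    /// Insert a freshly decompressed block (assumed not already cached),
    /// evicting least-recently-used blocks until it fits. A single block
    /// larger than the whole capacity is still inserted; a real
    /// implementation would need a policy for that case.
    fn put(&mut self, idx: u64, data: Vec<u8>) {
        while self.used_bytes + data.len() > self.capacity_bytes {
            let Some(victim) = self.lru.pop_front() else { break };
            if let Some(evicted) = self.blocks.remove(&victim) {
                self.used_bytes -= evicted.len();
            }
        }
        self.used_bytes += data.len();
        self.lru.push_back(idx);
        self.blocks.insert(idx, data);
    }
}
```

Bounding the cache in bytes rather than in entry count keeps memory use predictable when block sizes vary; whether this sits between the decryption and decompression layers or wraps the stack as a layer of its own is the open design question above.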

Implementing #156 would be a way to verify that the performance actually improves.
