Index encryption via Encrypted Directory [reopen]

Open mcrakhman opened this issue 11 months ago • 0 comments

Hi @fulmicoton!

I decided to open a new issue instead of commenting on the closed one (https://github.com/quickwit-oss/tantivy/issues/1474). We really need your input on this, because if you don't agree with the approach we won't be able to merge it into the repo.

This is what we propose:

  • use AES-CTR for each individual file
  • the key for the file is derived from the master key for the directory using HKDF

On the implementation side I have a few questions. How should we do this architecturally? Looking at the code, there is the Directory level (atomic_read, atomic_write), the FileHandle level for reading, and WritePtr, which is used for writing.

It seems we would need to do the encryption/decryption in atomic_read and atomic_write. For FileHandle, the EncryptedDirectory could return a custom file handle that decrypts data on the fly, given the IV for the file and the derived symmetric key.
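To make the on-the-fly decryption concrete, here is a minimal sketch of decrypting an arbitrary byte range in CTR mode. It assumes the counter block is the file IV read as a 128-bit big-endian integer plus the block index, which is one common convention and not something tantivy defines. `apply_keystream_at` and `counter_block` are hypothetical names, and `toy_block` is a stand-in that is NOT a cipher; a real handle would call AES with the derived file key there.

```rust
/// AES block size in bytes; it is 16 for every AES key length.
const BLOCK: usize = 16;

/// Counter block for the `index`-th keystream block (assumed layout):
/// the file IV as a 128-bit big-endian integer plus the block index.
fn counter_block(iv: [u8; BLOCK], index: u64) -> [u8; BLOCK] {
    u128::from_be_bytes(iv)
        .wrapping_add(index as u128)
        .to_be_bytes()
}

/// XORs `buf` with the keystream that starts at absolute file offset `offset`.
/// `encrypt_block` stands in for AES under the derived file key.
/// In CTR mode the same call both encrypts and decrypts.
fn apply_keystream_at<F>(encrypt_block: F, iv: [u8; BLOCK], offset: u64, buf: &mut [u8])
where
    F: Fn([u8; BLOCK]) -> [u8; BLOCK],
{
    let mut index = offset / BLOCK as u64; // first keystream block needed
    let mut skip = (offset % BLOCK as u64) as usize; // bytes to skip inside it
    let mut done = 0;
    while done < buf.len() {
        let keystream = encrypt_block(counter_block(iv, index));
        let n = (BLOCK - skip).min(buf.len() - done);
        for i in 0..n {
            buf[done + i] ^= keystream[skip + i];
        }
        done += n;
        skip = 0; // only the first block can start mid-block
        index += 1;
    }
}

/// NOT a cipher: a toy stand-in so the sketch runs without dependencies.
fn toy_block(input: [u8; BLOCK]) -> [u8; BLOCK] {
    let mut out = input;
    for (i, b) in out.iter_mut().enumerate() {
        *b = b.wrapping_mul(31).wrapping_add(i as u8 ^ 0xA5);
    }
    out
}
```

With this shape, a read of bytes 21..40 only needs `apply_keystream_at(toy_block, iv, 21, &mut bytes)` on the ciphertext slice, so the file offsets seen by the rest of the code stay unchanged.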

For WritePtr we need to handle writes that are not aligned to the AES block size, which is 16 bytes regardless of key length. If we append some bytes to a file whose length is not a multiple of 16, the append starts in the middle of a block. Since CTR mode XORs the data with a keystream, this should not need padding: the writer only has to continue the keystream from the right block and byte offset, either by tracking its position or by regenerating the keystream for the partial last block.
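A minimal sketch of how an appending writer could handle the unaligned tail in CTR mode, assuming it only needs the current file length to resume. `CtrWriter`, `resume`, and `toy_keystream` are hypothetical names; `toy_keystream` is NOT a cipher and stands in for "AES of (IV + block index)" under the derived file key.

```rust
use std::io::{self, Write};

/// AES block size in bytes; it is 16 for every AES key length.
const BLOCK: usize = 16;

/// Wraps a writer and XORs everything written with a CTR keystream.
/// `keystream_block(i)` must return the i-th 16-byte keystream block.
struct CtrWriter<W: Write, F: Fn(u64) -> [u8; BLOCK]> {
    inner: W,
    keystream_block: F,
    pos: u64, // absolute plaintext offset of the next byte
}

impl<W: Write, F: Fn(u64) -> [u8; BLOCK]> CtrWriter<W, F> {
    /// Starts (or resumes) writing at `existing_len` bytes into the file.
    fn resume(inner: W, keystream_block: F, existing_len: u64) -> Self {
        CtrWriter { inner, keystream_block, pos: existing_len }
    }
}

impl<W: Write, F: Fn(u64) -> [u8; BLOCK]> Write for CtrWriter<W, F> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let mut out = Vec::with_capacity(buf.len());
        for (i, byte) in buf.iter().enumerate() {
            let at = self.pos + i as u64;
            // Recomputed per byte for brevity; a real writer would cache
            // the current keystream block.
            let ks = (self.keystream_block)(at / BLOCK as u64);
            out.push(*byte ^ ks[(at % BLOCK as u64) as usize]);
        }
        self.inner.write_all(&out)?;
        self.pos += buf.len() as u64;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// NOT a cipher: a toy keystream so the sketch runs without dependencies.
fn toy_keystream(index: u64) -> [u8; BLOCK] {
    let mut block = [0u8; BLOCK];
    for (i, b) in block.iter_mut().enumerate() {
        *b = (index as u8).wrapping_mul(17).wrapping_add(i as u8 * 3 + 1);
    }
    block
}
```

In this sketch, writing 21 bytes, reopening with `CtrWriter::resume(file, toy_keystream, 21)`, and appending the rest produces the same ciphertext as a single write, without touching the bytes already on disk.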

Also, where should we store the IVs? We could store them separately, so the files and their offsets are not affected, or we could append them at the end of each file, so the offsets again stay the same.
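For the "append at the end" option, a minimal sketch of the layout, assuming a 16-byte IV stored as a fixed-size trailer after the ciphertext. `seal_with_trailer` and `split_trailer` are hypothetical helpers, not existing tantivy functions.

```rust
/// Length of the per-file IV in bytes (assumed to be one AES block).
const IV_LEN: usize = 16;

/// Appends the IV after the ciphertext, so payload offsets are unchanged.
fn seal_with_trailer(mut ciphertext: Vec<u8>, iv: [u8; IV_LEN]) -> Vec<u8> {
    ciphertext.extend_from_slice(&iv);
    ciphertext
}

/// Splits a stored file into (ciphertext, iv).
/// Returns `None` if the file is too short to contain a trailer.
fn split_trailer(stored: &[u8]) -> Option<(&[u8], [u8; IV_LEN])> {
    let payload_len = stored.len().checked_sub(IV_LEN)?;
    let (payload, trailer) = stored.split_at(payload_len);
    let mut iv = [0u8; IV_LEN];
    iv.copy_from_slice(trailer);
    Some((payload, iv))
}
```

One trade-off with a trailer is that it can only be written once the file is complete, and a reader has to subtract `IV_LEN` from the on-disk length; storing the IVs separately avoids both.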

Thanks!

mcrakhman, Apr 02 '24 15:04