
[Enhancement] Split cache sqlite to improve performance?

Open · Kabouik opened this issue 3 years ago · 1 comment

Maybe not an enhancement, but rather a discussion to investigate ways to improve performance with large encrypted caches.

Following https://github.com/d99kris/nmail/issues/74#issuecomment-835238996, it appears that the new sqlite caching has many benefits for small or unencrypted caches, but its downsides quickly show with large encrypted caches (long encryption, decryption and read times). At the moment, with my >18GB cache, it can be frustrating even on a modern computer, and I also observe long processing times on an account with a more moderate 4GB cache.

From there, how helpful and feasible would splitting the cache into smaller files be? The cache is already subdivided into multiple sqlite files at the moment, but I think the split is based on folders. I am sorry if I have forgotten previous discussions where this might have been ruled out already, but I assume a larger number of smaller files would mean more frequent encrypting/decrypting, but also faster processing each time. Perhaps there's a sweet spot where this time would be unnoticeable on a relatively modern machine?

For instance, sqlite files could be split per day, week, contact, or whatnot. This would result in a large size variance among files though, so a maximum size per sqlite file, above which a new file is created, would likely be more elegant and robust. I assume this would help with implementing #81, and could also mitigate the issue of nmail requiring twice as much disk space as the actual cache size, due to the decrypted temp/ folder.
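To make the size-cap idea concrete, here is a minimal sketch of how a writer could roll over to a new sqlite file once the current one reaches a limit. This is not how nmail works; the file naming scheme, the `messages` table, and the names `shard_path`, `pick_shard`, `store_message` and `MAX_SHARD_BYTES` are all assumptions made up for illustration.

```python
import os
import sqlite3

# Hypothetical cap, kept tiny for demonstration; a real value would be far larger.
MAX_SHARD_BYTES = 64 * 1024


def shard_path(cache_dir, index):
    # Hypothetical naming scheme: messages_0000.sqlite, messages_0001.sqlite, ...
    return os.path.join(cache_dir, f"messages_{index:04d}.sqlite")


def pick_shard(cache_dir, max_bytes=MAX_SHARD_BYTES):
    """Return the newest shard, or the next one if the newest has reached max_bytes."""
    index = 0
    while os.path.exists(shard_path(cache_dir, index + 1)):
        index += 1
    path = shard_path(cache_dir, index)
    if os.path.exists(path) and os.path.getsize(path) >= max_bytes:
        path = shard_path(cache_dir, index + 1)
    return path


def store_message(cache_dir, uid, body, max_bytes=MAX_SHARD_BYTES):
    """Append one message to the current shard and return the shard path used."""
    path = pick_shard(cache_dir, max_bytes)
    con = sqlite3.connect(path)
    try:
        with con:  # commits on success
            con.execute(
                "CREATE TABLE IF NOT EXISTS messages "
                "(uid INTEGER PRIMARY KEY, body BLOB)"
            )
            # Note: uniqueness of uid is only enforced within a single shard.
            con.execute(
                "INSERT OR REPLACE INTO messages VALUES (?, ?)", (uid, body)
            )
    finally:
        con.close()
    return path
```

A shard can overshoot the cap by roughly one message, since the size is only checked before each write. With such a scheme, only the shards that are actually touched would need to be decrypted and re-encrypted.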

The way I imagine it, decrypting could be done only once per session to limit reading times, i.e., the temp/ folder associated with each sqlite file would be deleted only when quitting nmail. However, I think an extra max_temp_storage=5GB option would be welcome: upon reaching that threshold, temp/ folders could be deleted sequentially based on recency.
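The threshold behaviour could look roughly like the sketch below. It assumes that "based on recency" means the least recently modified folders go first, and it uses each folder's modification time as a stand-in for "last used"; `dir_size` and `evict_temp_dirs` are hypothetical names, and `max_bytes` plays the role of the proposed max_temp_storage option.

```python
import os
import shutil


def dir_size(path):
    """Total size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def evict_temp_dirs(temp_root, max_bytes):
    """Delete the oldest sub-folders of temp_root until the total fits max_bytes.

    Returns the list of removed folder paths, oldest first.
    """
    entries = []
    for name in os.listdir(temp_root):
        path = os.path.join(temp_root, name)
        if os.path.isdir(path):
            # Assumption: folder mtime approximates when it was last used.
            entries.append((os.path.getmtime(path), path, dir_size(path)))
    entries.sort()  # oldest first
    total = sum(size for _mtime, _path, size in entries)
    removed = []
    for _mtime, path, size in entries:
        if total <= max_bytes:
            break
        shutil.rmtree(path)
        total -= size
        removed.append(path)
    return removed
```

Calling something like this after each decryption would keep the decrypted copies bounded during a session, at the cost of decrypting an evicted file again if it is needed later.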

Potential issues with split cache (not exhaustive):

  • how to deal with searches across the whole cache (i.e., multiple sqlite files) when some files have not been decrypted yet? How is this handled at the moment: is search done per folder only?
  • how to deal with continuous scrolling?

There could be a keybind for "Force decrypting all cache before search", but I feel it's an ugly workaround.
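For the search question, one simple approach is to run the same query against every file and merge the results. The sketch below assumes all files are already decrypted and share a hypothetical `messages(uid, subject)` table; `search_shards` is a made-up name and none of this reflects nmail's actual schema or search code.

```python
import glob
import os
import sqlite3


def search_shards(cache_dir, needle):
    """Run one LIKE query per sqlite file and merge the hits.

    Returns (file name, uid, subject) tuples, ordered by file name.
    """
    hits = []
    for path in sorted(glob.glob(os.path.join(cache_dir, "*.sqlite"))):
        con = sqlite3.connect(path)
        try:
            # '%' and '_' inside needle act as LIKE wildcards in this sketch.
            rows = con.execute(
                "SELECT uid, subject FROM messages "
                "WHERE subject LIKE ? ORDER BY uid",
                (f"%{needle}%",),
            ).fetchall()
        finally:
            con.close()
        hits.extend((os.path.basename(path), uid, subject) for uid, subject in rows)
    return hits
```

The cost is one decryption per file that has not been opened yet in the session, which is exactly the trade-off raised in the first bullet above.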

Kabouik, Jun 29 '21 09:06

Hi! Here are two other potential ways to improve performance for encrypted caches:
a. have nmail encrypt only the email body data fields in the table rather than the entire sqlite file (email headers would be kept in a full-file encrypted sqlite db);
b. investigate again whether any sqlite encryption "plugin/add-on" can be used.

During the development of sqlite support I initially used option (a) in my local implementation, but encountered some bugs. I suppose I could try to revive that code and iron out the remaining bugs.
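As a rough illustration of option (a), the sketch below encrypts only the body column on write and decrypts it on read, so that opening the database never requires decrypting a whole file. It is not the implementation mentioned above: the table layout and the names `open_bodies`, `put_body` and `get_body` are hypothetical, and the cipher is passed in as a pair of callables because the choice of cipher is not specified here. The separate, full-file encrypted header database is left out.

```python
import sqlite3


def open_bodies(path=":memory:"):
    """Open (or create) a hypothetical body store with one encrypted column."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS bodies (uid INTEGER PRIMARY KEY, body BLOB)"
    )
    return con


def put_body(con, uid, text, encrypt):
    """Encrypt a single body with the caller-supplied function and store it."""
    with con:  # commits on success
        con.execute(
            "INSERT OR REPLACE INTO bodies VALUES (?, ?)",
            (uid, encrypt(text.encode("utf-8"))),
        )


def get_body(con, uid, decrypt):
    """Fetch and decrypt one body; returns None if the uid is unknown."""
    row = con.execute("SELECT body FROM bodies WHERE uid = ?", (uid,)).fetchone()
    if row is None:
        return None
    return decrypt(row[0]).decode("utf-8")
```

Here `encrypt` and `decrypt` would be any pair of functions mapping bytes to bytes, for example wrappers around an authenticated cipher. Reading one message then costs one small decryption instead of a pass over the whole file.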

d99kris, Jul 09 '21 06:07

Will proceed to move all feature requests into Discussions section, for a clearer separation of bugs and feature requests.

d99kris, Nov 29 '22 12:11