
Purge cached ZIMs / open entries from kiwix-serve after a certain delay

Open benoit74 opened this issue 2 years ago • 4 comments

Currently kiwix-serve caches ZIMs / open entries, and its cache size is limited only by the number of ZIMs / open entries (if I understood @mgautierfr correctly).

We could probably benefit from purging from the cache the ZIMs / open entries that have not been accessed for a given amount of time; this would probably free a significant amount of memory.

This expectation comes from the fact that on library.kiwix.org we have a Varnish cache in front of kiwix-serve with a retention of 24h. The Varnish cache consumes between 2.5 and 5G of RAM, while kiwix-serve consumes a lot more, and it keeps growing (even if the growth gets slower and slower as time passes). For instance, today, after 3.5 days of uptime, kiwix-serve is already consuming about 10G of RAM.

Would it make any sense / be feasible with a reasonable effort?
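
To make the proposal concrete, here is a minimal Python sketch of a count-limited LRU cache with an extra idle-time purge. It is only an illustration of the idea, not how libkiwix or libzim implement their caches; the class name `IdlePurgingCache`, its parameters and the `loader` callback are all hypothetical.

```python
import time
from collections import OrderedDict


class IdlePurgingCache:
    """Hypothetical LRU cache bounded by entry count that can also drop idle entries."""

    def __init__(self, max_entries, max_idle_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        # key -> (value, last_access_time), ordered from least to most recently used
        self._items = OrderedDict()

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss."""
        now = self._clock()
        if key in self._items:
            value, _ = self._items.pop(key)
        else:
            value = loader(key)
        self._items[key] = (value, now)  # re-insert as most recently used
        # Existing behaviour described in this issue: limit by number of entries only.
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)
        return value

    def purge_idle(self):
        """Drop entries not accessed within max_idle_seconds; return how many were dropped."""
        deadline = self._clock() - self.max_idle_seconds
        purged = 0
        # Entries are ordered by last access, so stop at the first fresh one.
        while self._items:
            key, (_, last_access) = next(iter(self._items.items()))
            if last_access >= deadline:
                break
            del self._items[key]
            purged += 1
        return purged

    def __len__(self):
        return len(self._items)
```

In such a design, `purge_idle()` would have to be called periodically (for example from a housekeeping thread or on each request); with a 24h idle limit it would mirror the Varnish retention mentioned above.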

benoit74 avatar Nov 24 '23 10:11 benoit74

I would like to come back to the problem description before we talk about any solution.

@mgautierfr Can you confirm please:

  • Does kiwix-serve run in a stable manner over the long term from a memory consumption perspective?
  • Do we have any working system to prevent memory exhaustion?

kelson42 avatar Nov 26 '23 13:11 kelson42

> Does kiwix-serve run in a stable manner over the long term from a memory consumption perspective? Do we have any working system to prevent memory exhaustion?

We have a system that limits what is stored in terms of the number of entries (pretty well summarized in https://github.com/kiwix/k8s/issues/147).

While it is not, technically speaking, a system to prevent memory exhaustion (we don't reason in terms of memory, and an entry could be really big), in the long term the memory consumption hits a limit (which one? I can't say) and we should be stable.
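
To illustrate the difference between limiting by number of entries and limiting by memory, here is a minimal Python sketch of an LRU cache bounded by a total cost (e.g. bytes) instead of a count. This is an assumption-laden illustration, not the current libzim/libkiwix code; `CostBoundedCache` and its `cost` argument are hypothetical names.

```python
from collections import OrderedDict


class CostBoundedCache:
    """Hypothetical LRU cache that evicts by total cost (e.g. bytes), not by count."""

    def __init__(self, max_total_cost):
        self.max_total_cost = max_total_cost
        self.total_cost = 0
        # key -> (value, cost), ordered from least to most recently used
        self._items = OrderedDict()

    def put(self, key, value, cost):
        """Insert or replace an entry, then evict old entries until within budget."""
        if key in self._items:
            _, old_cost = self._items.pop(key)
            self.total_cost -= old_cost
        self._items[key] = (value, cost)
        self.total_cost += cost
        # Always keep the newest entry, even if it alone exceeds the budget.
        while self.total_cost > self.max_total_cost and len(self._items) > 1:
            _, (_, evicted_cost) = self._items.popitem(last=False)
            self.total_cost -= evicted_cost

    def get(self, key):
        """Return the cached value (marking it recently used), or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key][0]

    def __len__(self):
        return len(self._items)
```

With a count limit, one really big entry costs the same as a tiny one; with a cost limit, the big entry pushes out several small ones, which is what makes an upper bound on memory easier to state.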

mgautierfr avatar Nov 27 '23 09:11 mgautierfr

I have read our two tickets more than twice and what I can say is:

  • I still don't really understand how it works today
  • We have many levels of caching: libzim, libkiwix, kiwix-serve?
  • It does not seem to be properly documented!?
  • The garbage collection of memory is not explained (all these things should benefit from standard caching methodologies... but I'm not sure they do).
  • Obviously it's very hard to control memory usage that way
  • It is impossible to state a maximum memory usage
  • I don't like having these internals driven (only) by env. variables.

That makes a lot of problems, and no clue where to really start... documenting things in detail is probably the thing to do.

kelson42 avatar Dec 03 '23 19:12 kelson42

@veloman-yunkan Would you be able to write dedicated documentation, in the online doc, on how the cache works in libkiwix?

kelson42 avatar May 11 '24 15:05 kelson42