meilisearch-python

optimization: with an in memory cache decorator

Sanix-Darker opened this issue 3 years ago • 11 comments

Description
A small "in memory" cache system as a decorator (not talking about lru_cache and its cousins...), but rather a customizable cache with a TTL...

Basic example
Inside the class:

import time
from typing import Any, Dict, List, Optional


class Index:
    def __init__(self) -> None:
        ...
        self.cache_activated: bool = True
        self.cache_args_values: Dict[str, Any] = {}
        self.cache_ttl: int = 1

    def clean_cache(self) -> None:
        """Clear the current cache."""
        self.cache_args_values = {}

    def cache(self, func: Any) -> Any:
        """
        A custom cache decorator that saves the output of a function,
        keyed by its arguments, with a TTL.

        func: the function to wrap
        """
        def _wrapped(*args: Any, **kwargs: Any) -> Any:
            if not self.cache_activated:
                return func(*args, **kwargs)

            cache_key: str = str(args) + str(kwargs)
            if cache_key in self.cache_args_values:
                if time.time() - self.cache_args_values[cache_key]['at'] <= self.cache_ttl:
                    return self.cache_args_values[cache_key]['as']

            self.cache_args_values[cache_key] = {
                'at': time.time(),
                'as': func(*args, **kwargs)
            }
            return self.cache_args_values[cache_key]['as']

        return _wrapped


    @self.cache
    def search(self, query: str, opt_params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        ...

    @self.cache
    def get_filterable_attributes(self) -> List[str]:
        ...

    @self.cache
    def get_stop_words(self) -> List[str]:
        ...

    @self.cache
    def get_document(self, document_id: str) -> Dict[str, Any]:
        ...

    @self.cache
    def get_settings(self) -> Dict[str, Any]:
        ...

This approach could be useful to prevent Meilisearch from getting hit with the same parameters repeatedly within a short delay (I faced that issue, btw)...


Sanix-Darker avatar Jan 08 '22 18:01 Sanix-Darker

I have thought about lru_cache for some of these, but have not suggested it because I was worried about stale cache hits. If I am looking at your proposal correctly, it looks like it would only hit the cache for rapid requests (within 1 second by default) and would solve the problem I was worried about?

If the team decides to implement it, I would suggest also using functools.wraps.

from functools import wraps
...

  def cache(self, func: Any) -> Any:
      """
      A custom cache decorator that saves the output of a function,
      keyed by its arguments, with a TTL.

      func: the function to wrap
      """
      @wraps(func)
      def _wrapped(*args: Any, **kwargs: Any) -> Any:
          ...
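
For context, functools.wraps copies the wrapped function's metadata (__name__, __doc__, etc.) onto the wrapper, so introspection and generated docs keep working. A tiny standalone illustration, not meilisearch-specific:

from functools import wraps

def passthrough(func):
    @wraps(func)
    def _wrapped(*args, **kwargs):
        return func(*args, **kwargs)
    return _wrapped

@passthrough
def search(query: str) -> dict:
    """Run a search."""
    return {}

print(search.__name__)  # "search" thanks to @wraps; it would be "_wrapped" without it
print(search.__doc__)   # "Run a search."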

sanders41 avatar Jan 08 '22 19:01 sanders41

it would only hit the cache for rapid requests (within 1 second by default) and would solve the problem I was worried about?

Yes, IMO, the dev using the lib in a project can adapt that cache_ttl value depending on their needs.
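
For example, with the attributes from my sketch above (hypothetical, not part of the real meilisearch-python API):

index.cache_ttl = 5            # keep cached responses for 5 seconds
index.cache_activated = False  # or opt out of caching entirely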

If the team decides to implement it, I would suggest also using functools.wraps.

Ok ok, keep me updated if possible then, I will be happy to implement that... :-)

Sanix-Darker avatar Jan 08 '22 19:01 Sanix-Darker

Seems reasonable to me.

I'm just a contributor, not a maintainer, so it won't be my call; just giving my thoughts as a contributor :smile:

sanders41 avatar Jan 08 '22 19:01 sanders41

@Sanix-Darker our conversation got me thinking. What do you think about using, instead of a TTL cache, an LRU cache that gets cleared on update/insert/delete of the corresponding attribute? I'm thinking this would give better performance because it would always hit the cache if no changes have been made, and it would also solve the stale cache problem. Something like:

from functools import lru_cache
from typing import Dict, List


class Index:
    ...

    @lru_cache(maxsize=1)
    def get_filterable_attributes(self) -> List[str]:
        ...

    def update_filterable_attributes(self, body: List[str]) -> Dict[str, int]:
        ...
        self.get_filterable_attributes.cache_clear()

    def reset_filterable_attributes(self) -> Dict[str, int]:
        ...
        self.get_filterable_attributes.cache_clear()
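
(For reference, functools exposes cache_clear() and cache_info() on the decorated function, which is what the snippet above relies on. A quick standalone check, unrelated to the SDK:)

from functools import lru_cache

@lru_cache(maxsize=1)
def expensive() -> int:
    print("computing...")
    return 42

expensive()                    # prints "computing...", returns 42
expensive()                    # served from the cache, nothing printed
print(expensive.cache_info())  # CacheInfo(hits=1, misses=1, maxsize=1, currsize=1)
expensive.cache_clear()        # invalidate; the next call recomputes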

sanders41 avatar Jan 09 '22 13:01 sanders41

Mhhhh, the custom TTL cache up there is a kind of lru_cache with a lifetime for each value in the dictionary, and if you look at the source code of lru_cache, it's also a dictionary-based cache system...

IMHO, I think it would be better to implement a TTL cache here, and it would be nice to make it as customizable as possible... (for example, depending on the size of the data saved in the 'as' key, we could adjust the 'at' value)...

And if the problem is about performance, we could also implement my decorator in a small Cython module and import it as a C-binding function... :)

In the end... either solution is fine with me; as I said above, we need a cache, whatever the implementation is... :)

Sanix-Darker avatar Jan 09 '22 13:01 Sanix-Darker

Agreed, either option would work. Good point about the potentially large size of a dataset and not wanting to keep that around.

sanders41 avatar Jan 09 '22 13:01 sanders41

I did some testing, and lru_cache is going to be more complicated to implement here than the benefits it would bring are worth.

For the TTL cache, the @self.cache causes an error because self isn't available at that point, so the cache decorator would need to be changed to a static method. It feels like a weird hybrid of a static method and an instance method, so I'm not sure whether it has any unintended consequences, but this is what it took to get it to work:

    @staticmethod
    def cache(func: Any) -> Any:
        """
        A custom cache decorator that saves the output of a function,
        keyed by its arguments, with a TTL.

        func: the function to wrap
        """

        @wraps(func)
        def _wrapped(self, *args: Any, **kwargs: Any) -> Any:
            if not self.cache_activated:
                return func(self, *args, **kwargs)

            cache_key: str = str(args) + str(kwargs)
            if cache_key in self.cache_args_values:
                if (
                    time.time() - self.cache_args_values[cache_key]["at"]
                    <= self.cache_ttl
                ):
                    return self.cache_args_values[cache_key]["as"]

            self.cache_args_values[cache_key] = {
                "at": time.time(),
                "as": func(self, *args, **kwargs),
            }
            return self.cache_args_values[cache_key]["as"]

        return _wrapped

    ...

    @cache 
    def search(...

Then, after doing that, I did some testing, and there would be some trade-offs to implementing it. There are speed-ups with multiple requests, but they come at the cost of slightly slowing down single requests.

Admittedly, the test I did was "quick and dirty", just to get an idea of the differences:

import json
from datetime import datetime, timedelta

from meilisearch import Client

index_name = "test"
client = Client("http://localhost:7700", "masterKey")
response = client.create_index(index_name)
client.wait_for_task(response["uid"])
index = client.get_index(index_name)

with open("/path/to/datasets/movies.json", "r") as f:
    data = json.load(f)
    response = index.add_documents(data)
    index.wait_for_task(response["uid"], timeout_in_ms=100000)

test_runs = 10
single_filterable = timedelta(0)
multi_filterable = timedelta(0)
single_search = timedelta(0)
multi_search = timedelta(0)
for _ in range(test_runs):
    start = datetime.now()
    index.get_filterable_attributes()
    end = datetime.now()
    single_filterable = single_filterable + (end - start)
    index.clean_cache()   # only for test with cache

    start = datetime.now()
    result = index.search("spiderman")
    end = datetime.now()
    single_search = single_search + (end - start)
    index.clean_cache()   # only for test with cache

for _ in range(test_runs):
    start = datetime.now()
    for _ in range(50):
        index.get_filterable_attributes()
    end = datetime.now()
    multi_filterable = multi_filterable + (end - start)

    start = datetime.now()
    for _ in range(50):
        index.search("spiderman")
    end = datetime.now()
    multi_search = multi_search + (end - start)

print(f"single_filterable = {single_filterable  / test_runs}")
print(f"multi_filterable = {multi_filterable  / test_runs}")
print(f"single_search = {single_search  / test_runs}")
print(f"multi_search = {multi_search  / test_runs}")

Without cache:
single_filterable = 0:00:00.003637
multi_filterable = 0:00:00.116656
single_search = 0:00:00.010340
multi_search = 0:00:00.433834

With cache:
single_filterable = 0:00:00.004560
multi_filterable = 0:00:00.000518
single_search = 0:00:00.012421
multi_search = 0:00:00.001159

sanders41 avatar Jan 15 '22 13:01 sanders41

Hi @Sanix-Darker, thank you so much for your interest and your proposal. I am sorry for the delay in answering. Yes, it seems like a great idea, but I have a couple of questions about it:

  • What is the impact for the "default Meilisearch user", i.e. the user who only defines the URL and the master key and that's it? Does he have to configure anything else if he has a large dataset, for example?

This approach could be useful to prevent Meilisearch from getting hit with the same parameters repeatedly within a short delay (I faced that issue, btw)...

  • Is it useful to implement it for each method? Shouldn't we focus on the search method first? Just for the record, this is not a priority right now as far as our work is concerned and if it is implemented, it needs to be well tested. Thanks again

alallema avatar Jan 18 '22 11:01 alallema

Thanks for this benchmark @sanders41, and for the update with the @staticmethod. The code I provided can be tweaked and optimized even more :-), so that the cache only gets activated depending on the size of a function's output data: for some specific incoming args, because the outputs are small, we can keep hitting Meilisearch on the next calls, and we only activate the cache when the returned size is bigger than the allowed limit seen on the previous call. (That's why I prefer a self-made implementation instead of lru_cache and... yeah, it could sound a little weird, but I still think it could be 'nice' to have the cache here, and we can implement it well.)
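
A rough sketch of what I mean (all names hypothetical, standing alone from the SDK, using sys.getsizeof as a crude proxy for payload size):

import sys
import time
from functools import wraps
from typing import Any, Callable, Dict

def ttl_cache(ttl: float = 1.0, min_size_bytes: int = 32) -> Callable:
    """Cache a function's result for `ttl` seconds, but only keep results
    that are "big enough" to be worth storing (shallow size check only).
    Hypothetical sketch, not a meilisearch-python API."""
    store: Dict[str, Dict[str, Any]] = {}

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapped(*args: Any, **kwargs: Any) -> Any:
            key = str(args) + str(kwargs)
            entry = store.get(key)
            if entry and time.time() - entry["at"] <= ttl:
                return entry["as"]
            result = func(*args, **kwargs)
            if sys.getsizeof(result) >= min_size_bytes:
                store[key] = {"at": time.time(), "as": result}
            return result
        return wrapped

    return decorator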

Sanix-Darker avatar Jan 18 '22 20:01 Sanix-Darker

Hi @alallema, don't worry about the delay :-).

Does he have to configure anything else if he has a large dataset, for example?

The only thing I am seeing here is to add some properties for the cache:

from meilisearch import CACHE_ON_SEARCH

# we could have these parameters:

client = Client(
    ...
    cache=True,                              # optional, default is False
    cache_ttl=10,                            # optional, with a default value, maybe 1s
    cache_max_allowed_size=128,              # optional, default is 32: the binary size of the functions' outputs
    cache_targets=[CACHE_ON_SEARCH, '...'],  # optional, list of methods where the cache applies; default is everywhere
)
# by adding these attributes at the instantiation of the class,
# the dev is telling exactly what he wants his client to do...
# of course, those parameters could be shared with the Index class too

Is it useful to implement it for each method? Shouldn't we focus on the search method first?

Absolutely agreed... we could also allow the dev to choose where he wants to activate it!

Just for the record, this is not a priority right now as far as our work is concerned and if it is implemented, it needs to be well tested.

Also agreed :-)

Sanix-Darker avatar Jan 18 '22 20:01 Sanix-Darker

Hi @Sanix-Darker, I love the idea!

Ok ok, keep me updated if possible then, I will be happy to implement that... :-)

It would be so great if you still want to do it 🤩 Thanks again

alallema avatar Feb 03 '22 16:02 alallema

@alallema I would suggest we close this one. I implemented something similar and did see benefits, but I got a report from a user that he was getting an out-of-memory error caused by the cache. The cache had a short expiry time of 30 seconds, so it wasn't caused by the cache growing over a long period of time. I never reproduced it myself, so I'm not sure if it was caused by a high number of requests, very large responses that needed caching, or a machine with low memory.

Either way, it seems the use of an in-memory cache runs the potential for issues here. I think the solution would be to use something like Redis, but that is out of scope for the SDK and something the user could implement on their own if needed.
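
A rough sketch of that user-side approach with redis-py, for anyone who lands here (illustrative only, not part of the SDK; assumes a local Redis server and an existing index):

import hashlib
import json

import redis
from meilisearch import Client

client = Client("http://localhost:7700", "masterKey")
index = client.get_index("movies")  # assumes the index already exists
r = redis.Redis(host="localhost", port=6379, db=0)

def cached_search(query: str, ttl: int = 30) -> dict:
    """Search with a Redis-backed cache, so the memory pressure lives
    outside the Python process. Illustrative only."""
    key = "meili:search:" + hashlib.sha256(query.encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    result = index.search(query)
    r.setex(key, ttl, json.dumps(result))
    return result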

sanders41 avatar Sep 07 '23 15:09 sanders41

Let's close it in this case, @sanders41, I'll go with your judgment! 😊

curquiza avatar Sep 12 '23 09:09 curquiza