
Feature request: A clean/documented place to perform a deepcopy() when memoizing mutable content with in-memory storage

Open charles-dyfis-net opened this issue 1 year ago • 2 comments

While this is moot with any backend where deserialization creates a new object on every retrieval, with basic in-memory storage a memoized result can be modified in-place by the code it was returned to.

A mechanism for users to perform postprocessing, such as invocation of copy.deepcopy(), on returned results would mitigate the problems this introduces.

charles-dyfis-net, Jan 10 '24 23:01

My initial thought would be to implement postprocessing (such as deepcopy) as the final step in the decorated/wrapped method. How does that sound?

zmumi, Jan 12 '24 15:01

The intent is to postprocess the value returned from the cache, not the value returned by the inner function.

The problem we want to fix is this:

from memoize.wrapper import memoize

@memoize
async def getNewMutableValue():
  return {"foo": 1}

async def useMutableValue():
  v = await getNewMutableValue()
  v["foo"] += 1
  return v

await useMutableValue() # returns {"foo": 2}, as it should _always_ do

await useMutableValue() # returns {"foo": 3} because memoization caused the first value to be returned twice,
                        # and the prior call mutated it

Changing getNewMutableValue() to return copy.deepcopy({"foo": 1}) won't solve the problem; it's still that same copy being returned from the in-memory cache.
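
To make that concrete without depending on py-memoize, here is a minimal sketch using a hypothetical `toy_memoize` decorator (a plain dict cache, assumed here purely for illustration; it is not the library's storage). The `deepcopy` inside the memoized function only runs on the cache miss, so both callers still receive the same object:

```python
import asyncio
import copy
import functools


def toy_memoize(func):
    """Hypothetical stand-in for an in-memory memoizer (not py-memoize)."""
    cache = {}

    @functools.wraps(func)
    async def wrapper(*args):
        if args not in cache:
            cache[args] = await func(*args)
        # Every hit hands back the very same cached object.
        return cache[args]

    return wrapper


@toy_memoize
async def get_new_mutable_value():
    # This copy happens once, on the miss; later calls never reach this line.
    return copy.deepcopy({"foo": 1})


async def use_mutable_value():
    v = await get_new_mutable_value()
    v["foo"] += 1
    return v


async def demo():
    first = await use_mutable_value()
    second = await use_mutable_value()
    return first is second, second["foo"]


same_object, foo = asyncio.run(demo())
print(same_object, foo)  # True 3
```

The copy has to happen after the cache lookup, not before the value is stored.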


What I am doing as a workaround is roughly as follows:

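import copy
import functools
from collections.abc import Awaitable, Callable

# The generic syntax below requires Python 3.12+.
# memoize_wrapper is not shown here; it stands for the memoize decorator factory already in use.
def cache[**P, R](*, copy_on_retrieval: bool = False):
    def decorator(f: Callable[P, Awaitable[R]]) -> Callable[P, Awaitable[R]]:
        cached_func = memoize_wrapper()(f)

        @functools.wraps(f)
        async def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            result = await cached_func(*args, **kwargs)
            # Copy on the way out so callers never share the cached object.
            return copy.deepcopy(result) if copy_on_retrieval else result

        return wrapper

    return decorator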

...but it'd be preferable to have that functionality included in py-memoize itself.

charles-dyfis-net, Jan 12 '24 15:01

I do get your point now. The solution you are using (a wrapper that does the deepcopy) sounds reasonable.

As for making it a feature, I believe it could become the last step of the existing memoize wrapper (with a toggle, or maybe with some configurable strategies).

I need to find some more time to prototype it.
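
For illustration only, the "last step with a configurable strategy" idea could look roughly like the sketch below. It uses a hypothetical `toy_memoize` decorator with a made-up `postprocess` parameter (both are assumptions for this example, not py-memoize's API): whatever callable is passed in is applied to the value after it is read from the cache.

```python
import asyncio
import copy
import functools
from typing import Any, Callable


def identity(value: Any) -> Any:
    """Default strategy: hand back the cached object unchanged."""
    return value


def toy_memoize(postprocess: Callable[[Any], Any] = identity):
    """Hypothetical memoizer with a pluggable postprocessing step."""
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        async def wrapper(*args):
            if args not in cache:
                cache[args] = await func(*args)
            # Last step: postprocess the value read from the cache.
            return postprocess(cache[args])

        return wrapper

    return decorator


@toy_memoize(postprocess=copy.deepcopy)
async def get_value():
    return {"foo": 1}


async def demo():
    a = await get_value()
    a["foo"] += 1  # mutates the caller's copy only
    b = await get_value()
    return a, b


a, b = asyncio.run(demo())
print(a, b)  # {'foo': 2} {'foo': 1}
```

Passing a callable rather than a boolean toggle leaves room for cheaper strategies (for example `copy.copy`) where a full deep copy is not needed.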

zmumi, May 04 '24 17:05

See the option introduced in v2.1.0:

import asyncio

from memoize.configuration import MutableCacheConfiguration, DefaultInMemoryCacheConfiguration
from memoize.postprocessing import DeepcopyPostprocessing
from memoize.wrapper import memoize


@memoize(
    configuration=MutableCacheConfiguration
    .initialized_with(DefaultInMemoryCacheConfiguration())
    .set_postprocessing(DeepcopyPostprocessing())
)
async def sample_method(arg):
    return {'arg': arg, 'list': [4, 5, 1, 2, 3]}  # unsorted


async def main():
    # when
    result1 = await sample_method('test')
    result2 = await sample_method('test')
    result1['list'].sort()

    # then
    print(result1)
    print(result2)
    assert result1 == {'arg': 'test', 'list': [1, 2, 3, 4, 5]}  # sorted in-place
    assert result2 == {'arg': 'test', 'list': [4, 5, 1, 2, 3]}  # still unsorted


if __name__ == "__main__":
    asyncio.run(main())

zmumi, May 07 '24 11:05