
feat(memoize): add memoize

Open de-novo opened this issue 1 year ago • 10 comments

Summary

This PR addresses the goal outlined in Issue #91 by implementing the memoize function to cache the results of function calls based on the arguments provided. Additionally, function overloads have been added to enhance flexibility and support various use cases.
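As context, memoization means that repeated calls with the same argument(s) return a cached result instead of re-running the function. A minimal illustration (hypothetical usage; the exact API surface is discussed in the review below):

import { memoize } from 'es-toolkit'; // assumes the export added by this PR

const slowDouble = (n: number) => {
  console.log('computing...');
  return n * 2;
};

const fastDouble = memoize(slowDouble);
fastDouble(21); // logs 'computing...', returns 42
fastDouble(21); // returns 42 from the cache, no log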

Benchmark

(Benchmark results screenshot, 2024-07-15 3:04 PM)

Changes Made

Functionality:

  • Implemented the memoize function to cache function results.

Tests:

  • Added comprehensive test cases to ensure the correctness of the memoization functionality with different overloads.
  • Verified the correct functioning of memoize with default cache, custom resolver, and custom cache.

Documentation:

  • Updated documentation with examples demonstrating the usage of memoize with different combinations of resolvers and caches.

de-novo avatar Jul 15 '24 06:07 de-novo

@de-novo is attempting to deploy a commit to the Toss Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Jul 15 '24 06:07 vercel[bot]

This function is quite complex; give me some time to review it!

raon0211 avatar Jul 16 '24 15:07 raon0211

Hello, this looks great!

However, it seems like the function's implementation has become a bit too complex with all the function overloading. One of the key design principles of es-toolkit is simplicity, as mentioned here. We should aim to keep the implementation simple, even if it means not covering every edge case.

How about we use objects to manage the resolver and cache? This could help simplify our implementation.

import { head } from '../../array/head.ts';

function memoize(func, { resolver = head, cache = new Map() } = {}) {
  const memoized = function(...args) {
    const key = resolver(args);
    
    if (cache.has(key)) {
      return cache.get(key);
    }

    const result = func(...args);
    cache.set(key, result);
    return result;
  }

  memoized.cache = cache;

  return memoized;
}
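For illustration, here is a hypothetical usage of the sketch above (the resolver receives the argument array, so it can pick any argument as the cache key):

const getUser = (orgId: string, userId: string) => ({ orgId, userId });

// Cache by the second argument instead of the default (first)
const memoizedGetUser = memoize(getUser, {
  resolver: (args: [string, string]) => args[1],
  cache: new Map(),
});

memoizedGetUser('org-1', 'u-42');  // computed and cached under 'u-42'
memoizedGetUser.cache.has('u-42'); // true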

raon0211 avatar Jul 19 '24 01:07 raon0211

Meanwhile, I'm not sure we should accept all types of functions in our memoize.

Why not accept only functions that take a single argument in es-toolkit, and have users pass unary functions to our memoize? If we only accept unary functions, implementing memoize becomes straightforward.

interface MemoizeOptions<P, R> {
  cache?: Cache<P, R>;
}

function memoize<P, R>(func: (arg: P) => R, { cache = new Map() }: MemoizeOptions<P, R> = {}) {
  const memoized = function(arg: P): R {    
    if (cache.has(arg)) {
      return cache.get(arg)!;
    }

    const result = func(arg);
    cache.set(arg, result);
    return result;
  }

  memoized.cache = cache;

  return memoized;
}
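For illustration, a hypothetical usage of this unary version:

const square = memoize((n: number) => n * n);

square(4);           // 16, computed and cached
square(4);           // 16, served from the cache
square.cache.has(4); // true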

I believe this approach is much simpler and easier to read. We could implement a separate memoize for compatibility in es-toolkit/compat.

raon0211 avatar Jul 19 '24 01:07 raon0211

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.77%. Comparing base (82010d8) to head (6ed45bc). Report is 1 commit behind head on main.

Additional details and impacted files


@@           Coverage Diff           @@
##             main     #208   +/-   ##
=======================================
  Coverage   99.77%   99.77%           
=======================================
  Files         160      161    +1     
  Lines        1311     1325   +14     
  Branches      353      358    +5     
=======================================
+ Hits         1308     1322   +14     
  Misses          2        2           
  Partials        1        1           

codecov-commenter avatar Jul 22 '24 07:07 codecov-commenter

Hello, this looks great!

However, it seems like the function's implementation has become a bit too complex with all the function overloading. One of the key design principles of es-toolkit is simplicity, as mentioned here. We should aim to keep the implementation simple, even if it means not covering every edge case.

How about we use objects to manage the resolver and cache? This could help simplify our implementation.

import { head } from '../../array/head.ts';

function memoize(func, { resolver = head, cache = new Map() } = {}) {
  const memoized = function(...args) {
    const key = resolver(args);
    
    if (cache.has(key)) {
      return cache.get(key);
    }

    const result = func(...args);
    cache.set(key, result);
    return result;
  }

  memoized.cache = cache;

  return memoized;
}

@raon0211 I agree that the function has become complex due to the overloading! I have modified it to manage the resolver and cache with an options object, as you suggested.

de-novo avatar Jul 22 '24 07:07 de-novo

Meanwhile, I'm not sure we should accept all types of functions in our memoize.

Why not accept only functions that take a single argument in es-toolkit, and have users pass unary functions to our memoize? If we only accept unary functions, implementing memoize becomes straightforward.

interface MemoizeOptions<P, R> {
  cache?: Cache<P, R>;
}

function memoize<P, R>(func: (arg: P) => R, { cache = new Map() }: MemoizeOptions<P, R> = {}) {
  const memoized = function(arg: P): R {    
    if (cache.has(arg)) {
      return cache.get(arg)!;
    }

    const result = func(arg);
    cache.set(arg, result);
    return result;
  }

  memoized.cache = cache;

  return memoized;
}

I believe this approach is much simpler and easier to read. We could implement a separate memoize for compatibility in es-toolkit/compat.

I think we need to consider the second suggestion a bit more.

interface MemoizeOptions<P, R> {
  cache?: Cache<P, R>;
}

function memoize<P, R>(func: (arg: P) => R, { cache = new Map() }: MemoizeOptions<P, R> = {}) {
  const memoized = function(arg: P): R {    
    if (cache.has(arg)) {
      return cache.get(arg)!;
    }

    const result = func(arg);
    cache.set(arg, result);
    return result;
  }

  memoized.cache = cache;

  return memoized;
}

If implemented as suggested, there is an issue with how the memoized function's argument types are inferred.

const memoized = memoize((a: number, b: number, c: number) => a + b + c);
// memoized =  (...args: number[]) => number & { cache: Cache<number[], number>; }
expect(memoized(1, 2, 3)).toBe(6);
expect(memoized(1, 3, 5)).toBe(9);

While simplicity of implementation is important, if the memoized function is inferred as (...args: number[]) => number & { cache: Cache<number[], number>; }, developers will need to read the implementation to understand how to call it, which degrades the developer experience.
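For comparison, one possible typing that preserves the original parameter list would look roughly like this (a hypothetical sketch, not the code in this PR):

type Memoized<F extends (...args: any[]) => any> = F & {
  cache: Map<unknown, ReturnType<F>>;
};

declare function memoize<F extends (...args: any[]) => any>(
  func: F,
  options?: {
    resolver?: (...args: Parameters<F>) => unknown;
    cache?: Map<unknown, ReturnType<F>>;
  }
): Memoized<F>;

const memoized = memoize((a: number, b: number, c: number) => a + b + c);
// inferred as ((a: number, b: number, c: number) => number) & { cache: Map<unknown, number> }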

Additionally, a custom resolver for creating cache keys should be supported; it would let developers use the function more conveniently. Lodash, for example, supports this feature.
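For reference, lodash's memoize takes the resolver as a second argument; the resolver receives the call arguments and its return value becomes the cache key:

import memoize from 'lodash/memoize';

const add = (a: number, b: number) => a + b;
const memoizedAdd = memoize(add, (a, b) => `${a},${b}`);

memoizedAdd(1, 2);            // 3, cached under '1,2'
memoizedAdd.cache.has('1,2'); // true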

Let me know if you need any further modifications!

de-novo avatar Jul 22 '24 07:07 de-novo

I thought using the first argument as the default key for memoization was quite implicit.

So I thought of supporting memoize only for functions that take a single argument; this would let us avoid edge cases and keep the behavior straightforward.

We might support resolvers in our compat version of memoize, but I thought supporting only single-argument functions might be enough in most cases.

raon0211 avatar Jul 24 '24 15:07 raon0211

@raon0211 You're right that using the first argument as the default key for memoization is quite implicit, and that supporting memoization only for functions that accept a single argument would avoid edge cases and keep the behavior straightforward. However, consider the following test case:

  interface Test {
    id: string;
    name: string;
    age: number;
  }
  const fn = (test: Test) => {
    return `${test.id} ${test.name} ${test.age}`;
  };
  const memoized = memoize(fn);
  const test: Test = { id: '1', name: 'test', age: 1 };
  const test2: Test = { id: '1', name: 'test', age: 1 };
  memoized(test); // caches the result under the `test` object reference
  console.log(memoized.cache.get(test2)); // undefined: test2 is a different reference

Without a resolver, when the first argument is an object, the cache is keyed by the object reference itself, so logically identical objects (e.g., ones with the same id) are not recognized as the same cache key and cause unnecessary cache misses. Hence, supporting a resolver in our memoize function is crucial for handling such scenarios effectively.
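For example, a resolver that derives the key from the object's id would let both calls share one cache entry (a hypothetical sketch that continues the test case above and assumes the resolver receives the argument array, as in the earlier sketch):

const memoizedById = memoize(fn, {
  resolver: ([arg]: [Test]) => arg.id,
});

memoizedById(test);                       // computed, cached under '1'
console.log(memoizedById.cache.get('1')); // '1 test 1'
memoizedById(test2);                      // cache hit: test2 resolves to the same key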

de-novo avatar Jul 29 '24 06:07 de-novo

Yes, you are right. I guess we should accept a resolver here.

raon0211 avatar Jul 29 '24 15:07 raon0211

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name: es-toolkit
Status: ❌ Failed (Inspect)
Updated (UTC): Aug 15, 2024 0:24am

vercel[bot] avatar Aug 15 '24 12:08 vercel[bot]