
Mechanism to refresh shared cache

gpsaggese opened this issue 7 months ago

The idea is to check the cache into Git so that everyone can run the unit tests without hitting the API.

The problem is that sometimes we want to refresh it (e.g., there is a new model, the format changes, or the system prompt changes):

> pytest --update_llm_cache

We already have an analogous --update_outcomes flag.

The design is that the LLM cache is a singleton across the entire code base, and the switch above flips its mode from "FALLBACK" to "REFRESH", which forces everything that is touched to be updated in the JSON file; then you commit the changed cache.
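A minimal sketch of the singleton-with-modes design described above. All names here (`LlmCache`, `get_instance`, `get_completion`) and the JSON layout are assumptions for illustration, not the real `hopenai.py` API:

```python
import json
import os


class LlmCache:
    """Sketch of a process-wide singleton LLM cache with two modes.

    In "FALLBACK" mode, a cached response is returned when present;
    in "REFRESH" mode, the API is always called and the JSON file is
    rewritten, ready to be committed.
    """

    _instance = None

    def __init__(self, path="llm_cache.json"):
        self.path = path
        self.mode = "FALLBACK"
        self._data = {}
        if os.path.exists(path):
            with open(path) as f:
                self._data = json.load(f)

    @classmethod
    def get_instance(cls):
        # Lazily create the single shared instance.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def get_completion(self, prompt, call_api):
        # In FALLBACK mode, serve a hit from the cache without calling the API.
        if self.mode == "FALLBACK" and prompt in self._data:
            return self._data[prompt]
        # Cache miss or REFRESH mode: call the API and rewrite the JSON file.
        response = call_api(prompt)
        self._data[prompt] = response
        with open(self.path, "w") as f:
            json.dump(self._data, f, indent=2)
        return response
```

With this shape, `pytest --update_llm_cache` only has to flip `mode` on the singleton; every call site refreshes the entries it touches as a side effect.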

  • [ ] Make a proposal

gpsaggese avatar May 21 '25 13:05 gpsaggese

@gpsaggese When I run pytest --update_llm_cache to refresh the cache, do I need to regenerate all the cached responses, or only the specific responses used in the tests?

srinivassaitangudu avatar May 29 '25 20:05 srinivassaitangudu

The approach should be similar to --update_outcomes: the global cache mode is switched to UPDATE, and all the code that touches the cache then refreshes the entries for whatever it touches. Does it make sense?

gpsaggese avatar May 30 '25 13:05 gpsaggese

Got it, I have a few questions regarding the use cases of the cache:

  1. Are we using the cache file solely for testing purposes?
  2. If it’s only for testing, then the above approach makes sense.
  3. However, if the cache is also used outside of testing (e.g., in production or for local development), then having a mechanism to refresh the entire cache would be more appropriate.

Correct me if I am wrong.

srinivassaitangudu avatar Jun 01 '25 16:06 srinivassaitangudu

@gpsaggese I have raised draft PR #788.

It contains two proposals; please review them.

srinivassaitangudu avatar Jun 01 '25 22:06 srinivassaitangudu

> Got it, I have a few questions regarding the use cases of the cache:
>
> 1. Are we using the cache file solely for testing purposes?
> 2. If it’s only for testing, then the above approach makes sense.
> 3. However, if the cache is also used outside of testing (e.g., in production or for local development), then having a mechanism to refresh the entire cache would be more appropriate.
>
> Correct me if I am wrong.

Let's document all the assumptions and specs in a running doc (see my PR comments).

  • The cache is used mainly for testing purposes.
  • I'm not sure why production versus testing would really make a big difference. The logic is "use the cached value for the same input"; if the user knows that the cache needs to be refreshed, flip the switch to refresh only what is touched.
  • As for production, all the executables should have an --update_llm_cache flag, in the same way that everybody has --verbose to control the logging. You can instrument hparser.add_verbosity_arg(parser) and hdbg.init_logger(verbosity=args.log_level) to generalize this. This is a different PR.
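Modeled on hparser.add_verbosity_arg(), the per-executable flag could be added by a small helper like the sketch below. The helper name `add_llm_cache_arg` is hypothetical; only argparse itself is assumed:

```python
import argparse


def add_llm_cache_arg(parser):
    """Hypothetical helper, analogous to hparser.add_verbosity_arg(),
    that any executable can call to expose the cache-refresh switch."""
    parser.add_argument(
        "--update_llm_cache",
        action="store_true",
        help="Force the LLM cache to refresh every entry that is touched",
    )
    return parser
```

An executable would then call `add_llm_cache_arg(parser)` next to its other standard arguments and, if `args.update_llm_cache` is set, flip the singleton cache into refresh mode before doing any work.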

gpsaggese avatar Jun 03 '25 14:06 gpsaggese

@gpsaggese I have a few questions regarding the shared cache setup:

  1. Currently, the get_completion() tests use a temporary cache file where we store dummy requests and responses, and we then test in "REPLAY" mode by verifying that the same response is returned.
  2. In the new approach, we plan to use the actual cache file for testing. Should we remove the old test cases based on the temporary cache?
  3. If any changes (a few fixes) are required in hopenai.py to support this, should they go in a separate PR?

srinivassaitangudu avatar Jun 05 '25 17:06 srinivassaitangudu

1 and 2, good point.

  • There should be one cache for the unit tests (which we check in) and one for local execution (specific to a user, not shared, etc.)
  • The cache infra should already support setting a path
    • So when we run in unit test mode (I think there is a hook somewhere), the global path to the cache gets changed to llm_cache.unit_test..., while in normal mode the cache is tmp.llm_cache...
  • Does it make sense?
  • I would still create a design doc explaining all this
  3. The goal is to have cohesive PRs, each one fixing one thing, adding unit tests, and so on.
  • My workflow is to work in a single PR, e.g., HelpersTask..., then, as I get to pieces that are mergeable, I use i git_branch_copy to create HelpersTask..._2 PRs, which I can regress individually and merge to master, and then do i git_merge_master
  • This is the flow described in the other bug https://github.com/causify-ai/helpers/issues/722
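The path-switching idea in the bullets above can be sketched as a single function. The detection heuristic (checking whether pytest is a loaded module) and the concrete file names are assumptions standing in for the elided llm_cache.unit_test... and tmp.llm_cache... paths:

```python
import sys


def get_llm_cache_path(in_unit_test):
    """Sketch: pick the cache file based on the execution mode.

    The checked-in cache is shared and used by the unit tests; the
    tmp cache is per-user scratch space that is never committed.
    Both file names are illustrative, not the real paths.
    """
    if in_unit_test:
        return "llm_cache.unit_test.json"
    return "tmp.llm_cache.json"


# One possible hook: pytest appears in sys.modules while the test
# runner is active, so a caller could do:
cache_path = get_llm_cache_path("pytest" in sys.modules)
```

The singleton would read this path once at construction time, so no call site needs to know which mode it is running in.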

gpsaggese avatar Jun 05 '25 17:06 gpsaggese

> So when we run in unit test mode (I think there is a hook somewhere) the global path to the cache gets changed to llm_cache.unit_test..., while in normal mode, the cache is tmp.llm_cache...

@gpsaggese

  1. Regarding the cache path switching in unit test mode versus normal mode, do we need to update anything in hcache.py to handle that? I’ve tried tracking down the hook that changes the path to llm_cache.unit_test... but haven’t been able to find it.

  2. Also, could you review PR #788? The code is less confusing now. Finally, please open a new issue for fixing hopenai.py so I can add more tests and push up the coverage percentage.

srinivassaitangudu avatar Jun 06 '25 19:06 srinivassaitangudu

I've done the review for #788. Can we merge it?

Then we can start using the entire system.

gpsaggese avatar Jun 17 '25 13:06 gpsaggese

@gpsaggese I have made a few changes; can you review it now?

srinivassaitangudu avatar Jun 17 '25 14:06 srinivassaitangudu