
Mechanism to refresh shared cache

gpsaggese opened this issue 7 months ago

The idea is to check the cache into Git so that everyone can run the unit tests without hitting the API.

The problem is that sometimes we want to refresh it (e.g., there is a new model, the format changes, or the system prompt changes):

> pytest --update_llm_cache

We already have an analogous --update_outcomes flag.

The design is that the LLM cache is a singleton across the entire code base, and the switch above flips its mode from "FALLBACK" to "REFRESH", which forces everything that is touched to be updated in the JSON file; then you commit the changed cache.
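A minimal sketch of the singleton-with-modes design described above. All names here (`LlmCache`, `get_instance`, `get_completion`) and the JSON layout are assumptions for illustration, not the real `hopenai.py` API:

```python
import json
import os


class LlmCache:
    """Sketch of a process-wide singleton LLM cache with two modes.

    In "FALLBACK" mode, a cached response is returned when present;
    in "REFRESH" mode, the API is always called and the JSON file is
    rewritten, ready to be committed.
    """

    _instance = None

    def __init__(self, path="llm_cache.json"):
        self.path = path
        self.mode = "FALLBACK"
        self._data = {}
        if os.path.exists(path):
            with open(path) as f:
                self._data = json.load(f)

    @classmethod
    def get_instance(cls):
        # Lazily create the single shared instance.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def get_completion(self, prompt, call_api):
        # In FALLBACK mode, serve a hit from the cache without calling the API.
        if self.mode == "FALLBACK" and prompt in self._data:
            return self._data[prompt]
        # Cache miss or REFRESH mode: call the API and rewrite the JSON file.
        response = call_api(prompt)
        self._data[prompt] = response
        with open(self.path, "w") as f:
            json.dump(self._data, f, indent=2)
        return response
```

With this shape, `pytest --update_llm_cache` only has to flip `mode` on the singleton; every call site refreshes the entries it touches as a side effect.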

  • [ ] Make a proposal

gpsaggese avatar May 21 '25 13:05 gpsaggese

@gpsaggese When I run pytest --update_llm_cache to refresh the cache, do I need to regenerate all the cached responses, or only the specific responses used in the tests?

srinivassaitangudu avatar May 29 '25 20:05 srinivassaitangudu

The approach should be similar to --update_outcomes: the global cache mode is switched to UPDATE, and all the code that touches the cache then refreshes the entries for whatever it touches. Does it make sense?

gpsaggese avatar May 30 '25 13:05 gpsaggese

Got it, I have a few questions regarding the use cases of the cache:

  1. Are we using the cache file solely for testing purposes?
  2. If it’s only for testing, then the above approach makes sense.
  3. However, if the cache is also used outside of testing (e.g., in production or for local development), then having a mechanism to refresh the entire cache would be more appropriate.

Correct me if I am wrong.

srinivassaitangudu avatar Jun 01 '25 16:06 srinivassaitangudu

@gpsaggese I have raised draft PR #788.

It contains two proposals; please review them.

srinivassaitangudu avatar Jun 01 '25 22:06 srinivassaitangudu

> Got it, I have a few questions regarding the use cases of the cache:
>
> 1. Are we using the cache file solely for testing purposes?
> 2. If it’s only for testing, then the above approach makes sense.
> 3. However, if the cache is also used outside of testing (e.g., in production or for local development), then having a mechanism to refresh the entire cache would be more appropriate.
>
> Correct me if I am wrong.

Let's document all the assumptions and specs in a running doc (see my PR comments).

  • The cache is used mainly for testing purposes.
  • I'm not sure why production versus testing would really make a big difference. The logic is "use the cached value for the same input"; if the user knows that the cache needs to be refreshed, flip the switch to refresh only what is touched.
  • As for production, all the executables should have an --update_llm_cache flag, in the same way that everybody has --verbose to control the logging. You can instrument hparser.add_verbosity_arg(parser) and hdbg.init_logger(verbosity=args.log_level) to generalize this. This is a different PR.
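Modeled on hparser.add_verbosity_arg(), the per-executable flag could be added by a small helper like the sketch below. The helper name `add_llm_cache_arg` is hypothetical; only argparse itself is assumed:

```python
import argparse


def add_llm_cache_arg(parser):
    """Hypothetical helper, analogous to hparser.add_verbosity_arg(),
    that any executable can call to expose the cache-refresh switch."""
    parser.add_argument(
        "--update_llm_cache",
        action="store_true",
        help="Force the LLM cache to refresh every entry that is touched",
    )
    return parser
```

An executable would then call `add_llm_cache_arg(parser)` next to its other standard arguments and, if `args.update_llm_cache` is set, flip the singleton cache into refresh mode before doing any work.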

gpsaggese avatar Jun 03 '25 14:06 gpsaggese

@gpsaggese I have a few questions regarding the shared cache setup:

  1. Currently, the get_completion() tests use a temporary cache file where we store dummy requests and responses, and we then test in "REPLAY" mode by verifying that the same response is returned.
  2. In the new approach, we plan to use the actual cache file for testing. Should we remove the old test cases based on the temporary cache?
  3. If any changes (a few fixes) are required in hopenai.py to support this, should they go in a separate PR?

srinivassaitangudu avatar Jun 05 '25 17:06 srinivassaitangudu

1 and 2, good point.

  • There should be one cache for the unit tests (which we check in) and one for local execution (specific to a user, not shared, etc.)
  • The cache infra should already support setting a path
    • So when we run in unit test mode (I think there is a hook somewhere), the global path to the cache gets changed to llm_cache.unit_test..., while in normal mode the cache is tmp.llm_cache...
  • Does it make sense?
  • I would still create a design doc explaining all this
  3. The goal is to have cohesive PRs, each one fixing one thing, adding unit tests, and so on.
  • My workflow is to work in a single PR, e.g., HelpersTask..., then, as I get to pieces that are mergeable, I use i git_branch_copy to create HelpersTask..._2 PRs, which I can regress individually and merge to master, and then do i git_merge_master
  • This is the flow described in the other bug https://github.com/causify-ai/helpers/issues/722
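The path-switching idea in the bullets above can be sketched as a single function. The detection heuristic (checking whether pytest is a loaded module) and the concrete file names are assumptions standing in for the elided llm_cache.unit_test... and tmp.llm_cache... paths:

```python
import sys


def get_llm_cache_path(in_unit_test):
    """Sketch: pick the cache file based on the execution mode.

    The checked-in cache is shared and used by the unit tests; the
    tmp cache is per-user scratch space that is never committed.
    Both file names are illustrative, not the real paths.
    """
    if in_unit_test:
        return "llm_cache.unit_test.json"
    return "tmp.llm_cache.json"


# One possible hook: pytest appears in sys.modules while the test
# runner is active, so a caller could do:
cache_path = get_llm_cache_path("pytest" in sys.modules)
```

The singleton would read this path once at construction time, so no call site needs to know which mode it is running in.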

gpsaggese avatar Jun 05 '25 17:06 gpsaggese

> So when we run in unit test mode (I think there is a hook somewhere) the global path to the cache gets changed to llm_cache.unit_test..., while in normal mode, the cache is tmp.llm_cache...

@gpsaggese

  1. Regarding the cache path switching in unit test mode versus normal mode, do we need to update anything in hcache.py to handle that? I’ve tried tracking down the hook that changes the path to llm_cache.unit_test... but haven’t been able to find it.

  2. Also, could you review PR #788? The code is less confusing now. Finally, please open a new issue for fixing hopenai.py so I can add more tests and push up the coverage percentage.

srinivassaitangudu avatar Jun 06 '25 19:06 srinivassaitangudu

I've done the review for #788. Can we merge it?

Then we can start using the entire system.

gpsaggese avatar Jun 17 '25 13:06 gpsaggese

@gpsaggese I have made a few changes; can you review it now?

srinivassaitangudu avatar Jun 17 '25 14:06 srinivassaitangudu