Add cache for LLM generation
🚀 Feature Request
promptulate needs a cache for LLM generation. With a cache, the first time an input is sent, the model generates the output and that output is stored; on later calls with the same input, the previously cached output is returned directly instead of calling the model again.
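To make the requested behavior concrete, below is a minimal sketch of such a cache, assuming a simple in-memory dict keyed by a hash of the model name and prompt. cached_chat is a hypothetical wrapper, not part of promptulate's API, and it follows the pne.chat calling convention shown in this issue.

import hashlib
import json

import pne

_cache: dict[str, str] = {}  # hypothetical in-memory store

def cached_chat(model: str, prompt: str) -> str:
    # Key the cache on the exact (model, prompt) pair.
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key in _cache:
        # Second call with the same input: return the cached output directly.
        return _cache[key]
    # First call: let the model generate, then store the output.
    response: str = pne.chat(model, prompt)
    _cache[key] = response
    return response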
Method 1
For example:
import pne
response: str = pne.chat("gpt-4o", "What's promptulate?")
The answer is generated by the gpt-4o driver on the first run and then stored in the cache; on the second run, the cached data is used directly.
Caching is off by default. To enable it, use the following pattern:
import pne
response: str = pne.chat("gpt-4o", "What's promptulate?", cache_seed=111)
When the same cache_seed (here 111) is passed again, the cache is queried and the stored answer is reused.
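A sketch of how cache_seed could factor into the cache key (an assumption about the proposed design, not an implemented API): the seed is hashed together with the model and prompt, so calls that share all three hit the same entry, while different seeds get separate entries.

import hashlib
import json

def make_cache_key(model: str, prompt: str, cache_seed=None) -> str | None:
    # cache_seed=None keeps the default behavior: no caching.
    if cache_seed is None:
        return None
    payload = json.dumps([model, prompt, cache_seed])
    return hashlib.sha256(payload.encode()).hexdigest()

# Same seed -> same key, so the second call is served from the cache.
assert make_cache_key("gpt-4o", "What's promptulate?", 111) == make_cache_key("gpt-4o", "What's promptulate?", 111)
# A different seed produces a separate cache entry.
assert make_cache_key("gpt-4o", "What's promptulate?", 222) != make_cache_key("gpt-4o", "What's promptulate?", 111)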
Method 2
Use the enable_cache parameter. For example:
import pne
response: str = pne.chat("gpt-4o", "What's promptulate?", enable_cache=True)
As above, the answer is generated by the gpt-4o driver on the first run and then stored in the cache; on the second run, the cached data is used directly.
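Under this option, a sketch of the keying (again an assumption, not the implemented API) would drop the seed entirely: enable_cache only switches the lookup on, and the key is derived from the model and prompt alone, so every caller with the same prompt shares one entry.

import hashlib
import json

def make_cache_key(model: str, prompt: str, enable_cache: bool = False) -> str | None:
    # Caching stays off by default.
    if not enable_cache:
        return None
    # No per-caller seed: identical (model, prompt) pairs share one entry.
    return hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()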
Comparison
The first approach is more granular: the cache could even be partitioned by user id, for example:

import pne

user_id = "123123"
response: str = pne.chat("gpt-4o", "What's promptulate?", cache_seed=user_id)

However, that granularity is not useful here, because the prompt key is the same for every user, so per-user entries would only duplicate the same answer. Method 2 (enable_cache) is therefore simpler and sufficient.