
Add cache for llm generate


🚀 Feature Request

promptulate needs a cache for LLM generation. With a cache, the model output produced for a given input on the first call is stored; on a later call with the same input, the previously cached output is returned directly instead of calling the model again.
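A minimal sketch of that behaviour is shown below. The helper name `cached_generate` and the in-memory dict are illustrative only, not promptulate's actual internals: the (model, prompt) pair is hashed into a key, the first call stores the LLM output under that key, and later identical calls reuse it.

```python
import hashlib
import json
from typing import Callable, Dict

_cache: Dict[str, str] = {}

def cached_generate(model: str, prompt: str, generate: Callable[[str, str], str]) -> str:
    # Hash the (model, prompt) pair into a stable cache key.
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate(model, prompt)  # first run: call the LLM and store the answer
    return _cache[key]  # later runs with the same input: reuse the cached answer
```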

Method 1

For example:

```python
import pne

response: str = pne.chat("gpt-4o", "What's promptulate?")
```

On the first run, the answer is generated by gpt-4o and written to the cache; on the second run, the cached data is used directly.

Caching is disabled by default. To enable it, pass a cache_seed:

```python
import pne

response: str = pne.chat("gpt-4o", "What's promptulate?", cache_seed=111)
```

When cache_seed is 111, the cache associated with that seed is queried.
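One way to realize this, shown as a hedged sketch (the function `make_cache_key` is hypothetical, not part of pne), is to fold the seed into the cache key so that different seeds never share entries and the default of no seed skips the cache entirely:

```python
import hashlib
import json
from typing import Optional

def make_cache_key(model: str, prompt: str, cache_seed: Optional[int]) -> Optional[str]:
    # cache_seed=None means caching stays off (the default).
    if cache_seed is None:
        return None
    # Different seeds hash to different keys, so they never share cache entries.
    payload = json.dumps([model, prompt, cache_seed])
    return hashlib.sha256(payload.encode()).hexdigest()
```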

Method 2

Use an enable_cache parameter. For example:

```python
import pne

response: str = pne.chat("gpt-4o", "What's promptulate?", enable_cache=True)
```

As with Method 1, the answer is generated by gpt-4o on the first run and written to the cache; on the second run, the cached data is used directly.

Compare

The first approach is a little more granular: the cache could be partitioned by different user IDs via cache_seed. In practice this adds little, because the prompt is the same and already determines the cache key.

So Method 2 is simpler and sufficient. For reference, the per-user variant of Method 1 would look like this:

```python
import pne

user_id = "123123"
response: str = pne.chat("gpt-4o", "What's promptulate?", cache_seed=user_id)
```
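If enable_cache were implemented on top of the same key scheme, enable_cache=True could simply map to a fixed internal seed, which is one way Method 2 can cover the common case without exposing cache_seed to callers. The sketch below is purely illustrative; the names are not promptulate internals.

```python
from typing import Optional

# A fixed internal seed that enable_cache=True maps to.
_DEFAULT_CACHE_SEED = 0

def resolve_cache_seed(enable_cache: bool = False,
                       cache_seed: Optional[int] = None) -> Optional[int]:
    # An explicit cache_seed takes precedence (Method 1 behaviour).
    if cache_seed is not None:
        return cache_seed
    # enable_cache=True falls back to the shared default seed (Method 2 behaviour);
    # otherwise caching stays disabled.
    return _DEFAULT_CACHE_SEED if enable_cache else None
```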
