GPTCache
[Enhancement]: Option to set context in request in GPTCache Server
What would you like to be added?
I am using GPTCache Server, primarily the /put and /get endpoints. In my use case, multiple users share this server. I want to add context to every request, such as an id or request_id, so that put and get store and look up entries according to that context.
Example:
A /put body might look like this:
{
"prompt": "hello",
"answer": "Hi there!",
"id": "abc123"
}
The following /get returns the answer because it was cached with the same id:
{
"prompt": "hi",
"id": "abc123"
}
The following /get returns no answer, even though a matching prompt is cached, because the id is different:
{
"prompt": "hi",
"id": "xyz567"
}
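The requested behavior can be sketched in a few lines. This is a minimal, self-contained illustration, not GPTCache code: `TenantScopedCache`, `context_id`, and the 0.8 threshold are hypothetical names and values, and `difflib.SequenceMatcher` is only a toy stand-in for the embedding similarity a real deployment would use.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def similarity(a: str, b: str) -> float:
    """Toy stand-in for embedding similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class TenantScopedCache:
    """Hypothetical cache that keeps a separate partition per context id."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed cutoff, for illustration only
        self._partitions: Dict[str, List[Tuple[str, str]]] = {}

    def put(self, prompt: str, answer: str, context_id: str) -> None:
        # Entries are stored only in the partition of the given context id.
        self._partitions.setdefault(context_id, []).append((prompt, answer))

    def get(self, prompt: str, context_id: str) -> Optional[str]:
        # The id is matched exactly; similarity is scored only inside that partition.
        best_score, best_answer = 0.0, None
        for cached_prompt, answer in self._partitions.get(context_id, []):
            score = similarity(prompt, cached_prompt)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


cache = TenantScopedCache()
cache.put("hello", "Hi there!", context_id="abc123")
hit = cache.get("hello!", context_id="abc123")   # same id: similar prompt is found
miss = cache.get("hello!", context_id="xyz567")  # different id: nothing is returned
```

The key design choice is that the id selects a partition before any fuzzy matching happens, so a lookup can never cross tenants no matter how similar the prompts are.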
Why is this needed?
My application uses GPTCache Server as is, and it is multi-tenant. I can have multiple users, organisations, or projects, and they don't want to share a cache between them.
Anything else?
No response
Good idea!
Agreed, I have run into this issue as well: currently, identical content from multiple sessions cannot be kept separate.
Until a change is made, there is a workaround:
Each time an entry is added to the cache, prepend an identification ID to the content, and prepend the same ID when querying. For example, when adding: {ID} Hello; when querying: {ID} Hello
I have tried this before. I cached the prompt and response with the prompt formatted as {user_id} {prompt}, and queried the same way. It produces too many false positives.
Example:
The prompts "132 Hello" and "133 Hello" matched the same response.
I think that because lookup is vector-based (semantic matching), it cannot do strict matching on the ID, which results in false matches. I could be wrong though.
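The effect described above can be reproduced with a toy similarity measure. The sketch below is an illustration under stated assumptions, not GPTCache's actual matching: `difflib.SequenceMatcher` stands in for embedding similarity, the 0.8 threshold is arbitrary, and `prefixed_match` and `scoped_match` are hypothetical helpers. It contrasts folding the ID into the text with comparing the ID exactly.

```python
from difflib import SequenceMatcher
from typing import Tuple

THRESHOLD = 0.8  # assumed similarity cutoff, for illustration only


def similarity(a: str, b: str) -> float:
    """Toy stand-in for embedding similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def prefixed_match(query: str, cached: str) -> bool:
    """Workaround: the ID is part of the text, so it only nudges the score."""
    return similarity(query, cached) >= THRESHOLD


def scoped_match(query: Tuple[str, str], cached: Tuple[str, str]) -> bool:
    """Alternative: compare IDs exactly, then score only the prompt text."""
    (query_id, query_prompt), (cached_id, cached_prompt) = query, cached
    return query_id == cached_id and similarity(query_prompt, cached_prompt) >= THRESHOLD


# With the ID folded into the prompt, one differing character barely lowers the score.
prefixed_score = similarity("132 Hello", "133 Hello")
false_positive = prefixed_match("132 Hello", "133 Hello")

# With the ID compared exactly, a different user never matches.
cross_tenant = scoped_match(("132", "Hello"), ("133", "Hello"))
same_tenant = scoped_match(("132", "Hello"), ("132", "Hello"))
```

Under this toy measure the two prefixed prompts score well above the threshold, so the prefix approach treats them as the same entry, while the exact ID comparison separates them.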
We maintain our own fork and have added multi-tenancy there: https://github.com/NumexaHQ/GPTCache/pull/1/commits/41aae693ff6534523f3db4e423ccda5bf72efc12