GPTCache
Chroma API change for 0.4.0 version
** This should land Monday the 17th **
Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes:
- A simplified and improved client setup. Instead of having to remember weird settings, users can just use `EphemeralClient`, `PersistentClient`, or `HttpClient` (the underlying direct `Client` implementation is also still accessible).
- We migrated data stores away from `duckdb` and `clickhouse`. This changes the API for the `PersistentClient`, which used to reference `chroma_db_impl="duckdb+parquet"`. Now we simply set `is_persistent=True`. `is_persistent` is set to `True` for you if you use `PersistentClient`.
- Because we migrated away from `duckdb` and `clickhouse`, users need to migrate their data into the new layout and schema. Chroma is committed to providing extensive notification and tooling around any schema and data migrations (for example, this PR!).
After upgrading to `0.4.0`, if users try to access data that was stored in the previous regime, the system will throw an `Exception` and instruct them how to use the migration assistant to migrate their data. The migration assistant is a pip-installable CLI: `pip install chroma_migrate`. It is run by calling `chroma_migrate`.
Please reference the readme at chroma-core/chroma-migrate to see a full write-up of our philosophy on migrations as well as more details about this particular migration.
Please direct any users facing issues upgrading to our Discord channel called #get-help. We have also created an email listserv to notify developers directly in the future about breaking changes.
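The simplified client setup described in the list above can be sketched roughly as follows. This is a minimal illustration, not code from this PR: `make_client` and its `mode` argument are hypothetical names, the default path, host, and port are placeholder assumptions, and it presumes `chromadb>=0.4.0` is installed.

```python
def make_client(mode="ephemeral", path="./chroma_data", host="localhost", port=8000):
    """Return a Chroma client for the requested mode.

    Hypothetical helper for illustration only; the default values
    for `path`, `host`, and `port` are assumptions.
    """
    if mode not in ("ephemeral", "persistent", "http"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Imported lazily so the argument check above works even when
    # chromadb is not installed. Requires chromadb>=0.4.0.
    import chromadb

    if mode == "ephemeral":
        # In-memory client; nothing is written to disk.
        return chromadb.EphemeralClient()
    if mode == "persistent":
        # Stores data under `path`; is_persistent is set for you.
        return chromadb.PersistentClient(path=path)
    # Talks to a separately running Chroma server.
    return chromadb.HttpClient(host=host, port=port)
```

For example, `make_client("persistent", path="./my_cache")` would take the place of the old `chroma_db_impl="duckdb+parquet"` configuration.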
TODO
- [x] Migrated any `duckdb+parquet` strings to the new format
- [ ] Notified users about the breaking change (this PR, other suggestions?)
Welcome @jeffchuber! It looks like this is your first PR to zilliztech/GPTCache 🎉
Please make the dev branch the target branch.
@SimFG done!
@jeffchuber If I use Chroma 0.3.29 and run the latest code, there will be an error, right?
@SimFG that is correct - this new API change only supports `0.4.0` and above.
@jeffchuber please take a look at the failed unit test
@SimFG looks like sqlite needs to be updated - https://github.com/chroma-core/chroma/issues/836
are you all open to making this change?
@jeffchuber I have an idea. Is it possible to let users choose through a parameter, that is to say, keep the previous code by default? If you want to use Chroma 0.4.0, you can pass an additional parameter.
    def __init__(
        self,
        client_settings=None,
        persist_directory=None,
        collection_name: str = "gptcache",
        top_k: int = 1,
        use_new_version: bool = False,
    ):
        self.top_k = top_k
        if client_settings:
            self._client_settings = client_settings
        else:
            self._client_settings = chromadb.config.Settings()
        if persist_directory is not None:
            if use_new_version:
                self._client_settings = chromadb.config.Settings(
                    is_persistent=True, persist_directory=persist_directory
                )
            else:
                self._client_settings = chromadb.config.Settings(
                    chroma_db_impl="duckdb+parquet", persist_directory=persist_directory
                )
        self._client = chromadb.Client(self._client_settings)
        self._persist_directory = persist_directory
This can minimize the impact on users. When users want to pursue a better experience, they can manually pass a parameter.
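An alternative to an explicit flag is to pick the settings from the installed Chroma version. The sketch below is one hedged way to do that and is not the code merged in this PR or in the langchain PR: `persistent_settings_kwargs` is a hypothetical helper, and it assumes the version string starts with numeric `major.minor` components.

```python
def persistent_settings_kwargs(chroma_version: str, persist_directory: str) -> dict:
    """Return keyword arguments for chromadb.config.Settings.

    Hypothetical helper: chooses the old or new persistence settings
    from a version string such as "0.3.29" or "0.4.0".
    """
    # Compare only the major and minor components.
    major, minor = (int(part) for part in chroma_version.split(".")[:2])
    if (major, minor) >= (0, 4):
        # 0.4.0 and above: duckdb/clickhouse are gone, use is_persistent.
        return {"is_persistent": True, "persist_directory": persist_directory}
    # Older releases still expect the duckdb+parquet implementation string.
    return {
        "chroma_db_impl": "duckdb+parquet",
        "persist_directory": persist_directory,
    }


# Possible usage (requires chromadb to be installed):
#   import chromadb
#   kwargs = persistent_settings_kwargs(chromadb.__version__, "./data")
#   settings = chromadb.config.Settings(**kwargs)
```

This keeps the old behavior for users who stay on 0.3.x without asking them to pass anything extra, at the cost of coupling the code to version parsing.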
@SimFG we could do something like what this user proposed (and was merged) for langchain - https://github.com/hwchase17/langchain/pull/7891?
@jeffchuber yes you can try to do it!
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jeffchuber
To complete the pull request process, please assign cxie after the PR has been reviewed.
You can assign the PR to them by writing `/assign @cxie` in a comment when ready.
@SimFG added backwards compatibility, can you retrigger the tests?
@jeffchuber Now the error is that the sqlite version is too low. Judging by the proposed solution, on Python below 3.10 you need to manually install a newer version of sqlite and swap it in. I think this is very unfriendly to users.
As far as I can tell, this is a different base OS issue. We use `python:3.10-slim-bookworm` to back our Docker images that run tests. I'm not sure if GPTCache uses `python:3.8-slim-bullseye` or `ubuntu-20.04` or something else?
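To see which sqlite a given Python build is linked against, a quick check like the following may help. It is only a sketch: the thread does not state the minimum version, so the `(3, 35, 0)` threshold is an assumption taken from my reading of Chroma's troubleshooting notes, and `sqlite_is_new_enough` is a hypothetical helper name.

```python
import sqlite3

# Assumed minimum sqlite version for Chroma 0.4.x; not stated in this thread.
MIN_SQLITE = (3, 35, 0)


def sqlite_is_new_enough(version_info=None, minimum=MIN_SQLITE):
    """Return True if the sqlite library version meets the minimum.

    With no argument, checks the sqlite library that this Python
    interpreter is linked against.
    """
    if version_info is None:
        version_info = sqlite3.sqlite_version_info
    return tuple(version_info) >= tuple(minimum)


if __name__ == "__main__":
    print("sqlite version:", sqlite3.sqlite_version)
    print("new enough for the assumed minimum:", sqlite_is_new_enough())
```

Running this inside each candidate base image shows whether the failure comes from the OS-provided sqlite rather than from the Python version itself.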
@jeffchuber You can solve this problem by merging the latest dev branch. If the user uses chromadb, the lower version 0.3.26 will be installed by default, because I need to ensure the availability of GPTCache. If users want the new features of a higher chromadb version, I believe they should also be aware of this incompatibility.