chatgpt-retrieval-plugin
Add Chroma datastore support
This adds vector datastore support for Chroma, an open-source embedding database. Resolves #60.
cc @atroyn pls review!
- [x] Test this code end-to-end
- [x] Make sure it can run with `poetry run start`!
- [x] Write some integration tests
- [x] Make sure it can run with
- [x] Resolve uncertainties (see `TODO(csvoss)` in the current PR) about which embeddings function to use. The outer `DataStore` wrapper provides its own scaffolding for invoking embeddings on each query, because it generates `QueryWithEmbedding` objects that then get passed to our child class's `_query` method. However, the Chroma client also accepts an `embedding_function`.
  - [x] Pick one of these embedding functions to be the source-of-truth method by which the Chroma datastore implementation handles embeddings
  - [x] Make sure that embeddings are only handled by that source-of-truth method, and that the code paths do not result in embeddings being created more than once by both methods (which would be more expensive!)
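To make the double-embedding concern concrete, here is a minimal sketch of the "outer wrapper is the source of truth" option. Everything in it is a hypothetical stand-in written for illustration (`CountingEmbedder`, `ToyVectorStore`, `OuterDataStore`, and a simplified `QueryWithEmbedding`); it is not the plugin's actual code and it does not call the Chroma client. The idea it shows is that the wrapper embeds each text exactly once and hands the store precomputed vectors, so the store never needs an embedding function of its own.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical stand-ins for illustration only; not the plugin's or Chroma's classes.
@dataclass
class QueryWithEmbedding:
    query: str
    embedding: List[float]


class CountingEmbedder:
    """Toy embedding function that records how many texts it has embedded."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, texts: List[str]) -> List[List[float]]:
        self.calls += len(texts)
        # Fake 2-d "embedding": text length and vowel count.
        return [[float(len(t)), float(sum(c in "aeiou" for c in t))] for t in texts]


class ToyVectorStore:
    """Stores precomputed vectors and never embeds anything itself."""

    def __init__(self) -> None:
        self._rows = []  # list of (id, embedding) pairs

    def add(self, ids, embeddings):
        self._rows.extend(zip(ids, embeddings))

    def query(self, query_embedding, n_results=1):
        def squared_distance(row):
            return sum((a - b) ** 2 for a, b in zip(row[1], query_embedding))

        return [row[0] for row in sorted(self._rows, key=squared_distance)[:n_results]]


class OuterDataStore:
    """Source of truth for embeddings: embeds once, then delegates to the store."""

    def __init__(self, embed, store):
        self._embed = embed
        self._store = store

    def upsert(self, ids, texts):
        self._store.add(ids, self._embed(texts))

    def query(self, text, n_results=1):
        q = QueryWithEmbedding(query=text, embedding=self._embed([text])[0])
        return self._query(q, n_results)

    def _query(self, q, n_results):
        # Uses the embedding already attached to the query; no re-embedding here.
        return self._store.query(q.embedding, n_results)


embedder = CountingEmbedder()
datastore = OuterDataStore(embedder, ToyVectorStore())
datastore.upsert(["a", "b"], ["hello", "a much longer document"])
result = datastore.query("hallo")
```

Counting calls on the embedder is a cheap way to assert in an integration test that each document and each query is embedded once and only once.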
@csvoss lmk when this is shipping, trying to use chroma for a chatgpt plugin template to publish on replit
@Bardia95 this is currently blocked on me; we're refactoring some internal Chroma stuff before going ahead. @csvoss, we might want to move this into draft until that lands, which is expected in the next couple of days.
Freshly rebased off main!
I don't seem to have permissions to mark this PR as a draft.
While running `poetry run start`, it keeps saying
ValueError: The sentence_transformers python package is not installed. Please install it with `pip install sentence_transformers`
even after running `python3.10 -m pip install sentence_transformers`.
It also says
File "/usr/local/lib/python3.10/sqlite3/dbapi2.py", line 27, in <module>
from _sqlite3 import *
ModuleNotFoundError: No module named '_sqlite3'
while trying to use `python3.10`.
Any suggestions or fixes?
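One possible cause, offered as a guess rather than a diagnosis, is that `poetry run` uses the project's virtual environment while `python3.10 -m pip install` targets a different interpreter. A small check like the sketch below reports which interpreter is running and whether it can locate the two modules named in the errors. The helper name `missing_modules` and the file name `check_env.py` are made up for this example.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Show which Python binary is actually executing this script.
print("interpreter:", sys.executable)
# Both modules appear in the error messages quoted above.
print("missing:", missing_modules(["sentence_transformers", "_sqlite3"]))
```

Running it once as `poetry run python check_env.py` and once as `python3.10 check_env.py` would show whether the two commands resolve to the same interpreter and whether either one lacks the modules.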
+1 any reason this hasn't been merged?
This should be deprecated in favor of https://github.com/openai/chatgpt-retrieval-plugin/pull/232
Deprecated in favor of https://github.com/openai/chatgpt-retrieval-plugin/pull/232!