Add Chroma datastore support

Open csvoss opened this issue 1 year ago • 4 comments

This adds vector datastore support for Chroma, an open-source embedding database. Resolves #60.

cc @atroyn pls review!


  • [x] Test this code end-to-end
    • [x] Make sure it can run with poetry run start!
    • [x] Write some integration tests
  • [x] Resolve uncertainties (see TODO(csvoss) in the current PR) about which embeddings function to use. The outer DataStore wrapper provides its own scaffolding for invoking embeddings on each query, because it generates QueryWithEmbedding objects that then get passed to our child class's _query method. However, the Chroma client also accepts an embedding_function.
    • [x] Pick one of these embedding functions to be the source of truth for how the Chroma datastore implementation handles embeddings
    • [x] Make sure that embeddings are only handled by that source-of-truth method, and that the code paths do not result in embeddings being created more than once by both methods (which would be more expensive!)
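
As an illustration of the checklist above, here is a minimal sketch of the "outer wrapper is the source of truth" option. It is not the code in this PR: `CountingEmbedder`, `ToyCollection`, and `OuterDataStore` are hypothetical stand-ins, and `QueryWithEmbedding` is a simplified version of the plugin's model. The point is that the child `_query` only ever receives precomputed embeddings, so no second embedding function is configured on the collection and each text is embedded exactly once.

```python
from dataclasses import dataclass
from typing import Dict, List

Embedding = List[float]


@dataclass
class QueryWithEmbedding:
    # Simplified stand-in for the plugin's model of the same name.
    query: str
    embedding: Embedding


class CountingEmbedder:
    """Toy embedder (hypothetical) that records how many texts it embeds."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, texts: List[str]) -> List[Embedding]:
        self.calls += len(texts)
        # Deterministic fake vectors; a real embedder would call a model here.
        return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


class ToyCollection:
    """Hypothetical vector collection that accepts precomputed embeddings only."""

    def __init__(self) -> None:
        self._rows: Dict[str, Embedding] = {}

    def add(self, ids: List[str], embeddings: List[Embedding]) -> None:
        for doc_id, emb in zip(ids, embeddings):
            self._rows[doc_id] = emb

    def query(self, query_embeddings: List[Embedding], n_results: int = 1) -> List[List[str]]:
        results = []
        for q in query_embeddings:
            # Rank stored ids by squared Euclidean distance to the query vector.
            ranked = sorted(
                self._rows,
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self._rows[i], q)),
            )
            results.append(ranked[:n_results])
        return results


class OuterDataStore:
    """Sketch of the wrapper: it alone calls the embedder."""

    def __init__(self, embed: CountingEmbedder, collection: ToyCollection) -> None:
        self.embed = embed
        self.collection = collection

    def upsert(self, docs: Dict[str, str]) -> None:
        ids = list(docs)
        self.collection.add(ids, self.embed([docs[i] for i in ids]))

    def query(self, texts: List[str], top_k: int = 1) -> List[List[str]]:
        queries = [QueryWithEmbedding(t, e) for t, e in zip(texts, self.embed(texts))]
        return self._query(queries, top_k)

    def _query(self, queries: List[QueryWithEmbedding], top_k: int) -> List[List[str]]:
        # The child implementation forwards embeddings; it never re-embeds text.
        return self.collection.query([q.embedding for q in queries], n_results=top_k)
```

Counting calls on the embedder, as `CountingEmbedder.calls` does here, is one simple way an integration test could assert that no text is embedded twice.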

csvoss avatar Mar 26 '23 19:03 csvoss

@csvoss lmk when this is shipping, trying to use chroma for a chatgpt plugin template to publish on replit

Bardia95 avatar Apr 06 '23 16:04 Bardia95

@Bardia95 this is currently blocked on me; we're refactoring some internal chroma stuff before going ahead. @csvoss, we might want to move this into draft until that lands, which is expected in the next couple of days.

atroyn avatar Apr 06 '23 16:04 atroyn

Freshly rebased off main!

I don't seem to have permissions to mark this PR as a draft.

csvoss avatar Apr 09 '23 23:04 csvoss

While running poetry run start, it keeps saying

ValueError: The sentence_transformers python package is not installed. Please install it with `pip install sentence_transformers`

even after running python3.10 -m pip install sentence_transformers

It also says

File "/usr/local/lib/python3.10/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ModuleNotFoundError: No module named '_sqlite3'

while trying to use python3.10. Any suggestions or fixes?
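
One possible cause, which is an assumption and not something confirmed in this thread, is that `poetry run` uses a different interpreter or virtualenv than the one `python3.10 -m pip install` targeted. A small check like the sketch below, run through `poetry run python`, prints which interpreter is active and whether the two modules from the errors above are importable from it. The helper name `check_modules` is made up for illustration.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report whether each module can be located by the current interpreter."""
    found: Dict[str, bool] = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for malformed or half-initialised modules.
            found[name] = False
    return found


if __name__ == "__main__":
    # The path shows which Python (system or virtualenv) is actually running.
    print("interpreter:", sys.executable)
    for name, ok in check_modules(["sentence_transformers", "_sqlite3"]).items():
        print(f"{name}: {'importable' if ok else 'MISSING'}")
```

If `_sqlite3` is reported missing, that usually means the Python build itself was compiled without SQLite support, which installing packages with pip will not fix.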

justanotherlad avatar Apr 19 '23 13:04 justanotherlad

+1 any reason this hasn't been merged?

sdan avatar Apr 26 '23 02:04 sdan

This should be deprecated in favor of https://github.com/openai/chatgpt-retrieval-plugin/pull/232

atroyn avatar May 09 '23 02:05 atroyn

Deprecated in favor of https://github.com/openai/chatgpt-retrieval-plugin/pull/232 !

csvoss avatar May 10 '23 19:05 csvoss