
Fix future incompatibility from langchain Chroma

Open johnbrisbin opened this issue 2 years ago • 0 comments

langchain's cleanup PR #5003 removes the from_documents functionality, which conflicted with another method of the same name. This change replaces the from_documents call with from_texts. See: https://github.com/hwchase17/langchain/pull/5003

It looks like that PR will go through, and the maintainers have marked the change as breaking existing code once it is merged into the main branch and released.

In more detail: the from_documents method is removed, and only another, incompatible prototype remains (in vectorstores.py). Apparently, the from_documents implementation used in privateGPT was a convenience for connecting the output of the text splitters (Documents, which contain both text and metadata) to Chroma's from_texts, which expects two lists: one of strings and one of dicts (metadata). The code that the old from_documents implementation used to split the Document list into two lists is inserted into ingest.py (main) so that from_texts can be called directly. This approach is compatible with both the old and the prospective versions of langchain's support for the Chroma db.
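For illustration, here is a minimal sketch of the split described above. It is not the exact code from the PR: the `Document` dataclass is a stand-in for langchain's class so the snippet runs without langchain installed, `split_documents` is a hypothetical helper name, and the commented-out `Chroma.from_texts` call assumes the `embeddings` and `persist_directory` variables that ingest.py already defines.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Document:
    """Stand-in for langchain's Document: a text chunk plus its metadata."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def split_documents(
    documents: List[Document],
) -> Tuple[List[str], List[Dict[str, Any]]]:
    """Split Documents into the two parallel lists that from_texts expects."""
    texts = [doc.page_content for doc in documents]
    metadatas = [doc.metadata for doc in documents]
    return texts, metadatas


# Example input, shaped like the output of a text splitter.
docs = [
    Document("first chunk", {"source": "a.txt"}),
    Document("second chunk", {"source": "b.txt"}),
]
texts, metadatas = split_documents(docs)

# In ingest.py the two lists would then be handed to Chroma, roughly like so
# (assumed call shape, shown as a comment so the sketch runs standalone):
# db = Chroma.from_texts(texts, embeddings, metadatas=metadatas,
#                        persist_directory=persist_directory)
```

Because only from_texts is called, the same ingest code should work whether or not the installed langchain still ships the old from_documents convenience.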

This is my first pull request, so feel free to correct my technique.

johnbrisbin, May 20 '23 01:05