
FAISS.add_embeddings is typed to take iterables but fails when passed an iterator.

Open startakovsky opened this issue 1 year ago • 0 comments

System Info

macOS, LangChain version 0.0.181, Python version 3.11.3

Who can help?

@eyurtsev I wasn't sure who to reach out to. The following is the signature for adding embeddings to FAISS:

FAISS.add_embeddings(
    self,
    text_embeddings: 'Iterable[Tuple[str, List[float]]]',
    metadatas: 'Optional[List[dict]]' = None,
    **kwargs: 'Any',
) -> 'List[str]'

Note that text_embeddings is annotated as an Iterable. However, passing an iterator raises an error, while wrapping the same iterator in list() succeeds.

Information

  • [ ] The official example notebooks/scripts
  • [ ] My own modified scripts

Related Components

  • [ ] LLMs/Chat Models
  • [ ] Embedding Models
  • [ ] Prompts / Prompt Templates / Prompt Selectors
  • [ ] Output Parsers
  • [ ] Document Loaders
  • [X] Vector Stores / Retrievers
  • [ ] Memory
  • [ ] Agents / Agent Executors
  • [ ] Tools / Toolkits
  • [ ] Chains
  • [ ] Callbacks/Tracing
  • [ ] Async

Reproduction

from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

vs = FAISS.from_texts(['a'], embedding=OpenAIEmbeddings())
vector = OpenAIEmbeddings().embed_query('b')

# error happens with this next line, see "Expected behavior" below.
vs.add_embeddings(iter([('b', vector)]))

# no error happens when wrapped in a list
vs.add_embeddings(list(iter([('b', vector)])))

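Expected behavior

Both calls should succeed, because the signature accepts any iterable. Instead, the first call raises: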

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
File ~/.pyenv/versions/3.11.3/envs/myenv/lib/python3.11/site-packages/faiss/class_wrappers.py:227, in handle_Index.<locals>.replacement_add(self, x)
    214 def replacement_add(self, x):
    215     """Adds vectors to the index.
    216     The index must be trained before vectors can be added to it.
    217     The vectors are implicitly numbered in sequence. When `n` vectors are
   (...)
    224         `dtype` must be float32.
    225     """
--> 227     n, d = x.shape
    228     assert d == self.d
    229     x = np.ascontiguousarray(x, dtype='float32')

ValueError: not enough values to unpack (expected 2, got 1)
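
A possible explanation, sketched under an assumption rather than confirmed against the LangChain source: if add_embeddings walks text_embeddings twice (once for the texts, once for the vectors), a one-shot iterator is already exhausted on the second pass, so the vector list ends up empty. An empty list converted to a NumPy array is one-dimensional, which would be consistent with `n, d = x.shape` failing to unpack. The helpers split_twice and split_once below are hypothetical stand-ins written for illustration; they are not LangChain functions.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, List[float]]


def split_twice(text_embeddings: Iterable[Pair]) -> Tuple[List[str], List[List[float]]]:
    """Hypothetical two-pass split: iterates the argument twice."""
    texts = [te[0] for te in text_embeddings]
    # If text_embeddings is a one-shot iterator, it is exhausted by now,
    # so this second pass yields nothing.
    embeddings = [te[1] for te in text_embeddings]
    return texts, embeddings


def split_once(text_embeddings: Iterable[Pair]) -> Tuple[List[str], List[List[float]]]:
    """Hypothetical single-pass split: materializes the argument once."""
    pairs = list(text_embeddings)
    texts = [text for text, _ in pairs]
    embeddings = [vector for _, vector in pairs]
    return texts, embeddings


pairs = [("b", [0.1, 0.2, 0.3])]

twice_texts, twice_embeddings = split_twice(iter(pairs))
once_texts, once_embeddings = split_once(iter(pairs))

print(twice_texts, twice_embeddings)  # ['b'] []
print(once_texts, once_embeddings)    # ['b'] [[0.1, 0.2, 0.3]]
```

If this is the cause, materializing the argument once inside the method (as split_once does) would let the Iterable annotation hold for iterators and generators as well as lists.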

startakovsky, May 27 '23 11:05