namespace argument not taken into account when creating Pinecone index

Open LeoGrin opened this issue 1 year ago • 0 comments

Quick summary

Passing the namespace argument to Pinecone.from_existing_index has no effect: the argument is forwarded to pinecone.Index, which has no namespace argument.

Steps to reproduce a relevant bug

import pinecone
from langchain.docstore.document import Document
from langchain.vectorstores.pinecone import Pinecone
from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings

# Assumes pinecone.init(api_key=..., environment=...) has already been called.
index = pinecone.Index("langchain-demo")  # this should be a new index
texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]
Pinecone.from_texts(
    texts,
    FakeEmbeddings(),
    index_name="langchain-demo",
    metadatas=metadatas,
    namespace="test-namespace",
)

texts = ["foo2", "bar2", "baz2"]
metadatas = [{"page": i} for i in range(len(texts))]
Pinecone.from_texts(
    texts,
    FakeEmbeddings(),
    index_name="langchain-demo",
    metadatas=metadatas,
    namespace="test-namespace2",
)

# Search with namespace
docsearch = Pinecone.from_existing_index(
    "langchain-demo",
    embedding=FakeEmbeddings(),
    namespace="test-namespace",
)
output = docsearch.similarity_search("foo", k=6)
# Check that we don't get results from the other namespace
page_contents = [o.page_content for o in output]
assert set(page_contents) == {"foo", "bar", "baz"}

Fix

The namespace argument passed to Pinecone.from_existing_index and Pinecone.from_texts should be stored as an attribute on the vector store and used as the default by every method that accepts a namespace.
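A minimal sketch of that idea, not the actual langchain implementation: `FakeIndex` and `NamespacedStore` are hypothetical names, and the in-memory index only stands in for a namespaced vector index (it does no similarity ranking). The point is the fallback: a per-call namespace wins, otherwise the one stored at construction time is used.

```python
from typing import Dict, List, Optional


class FakeIndex:
    """Hypothetical in-memory stand-in for a namespaced vector index."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def upsert(self, texts: List[str], namespace: str = "") -> None:
        # Store texts under the given namespace.
        self._data.setdefault(namespace, []).extend(texts)

    def query(self, top_k: int, namespace: str = "") -> List[str]:
        # No real ranking: return up to top_k texts from this namespace only.
        return self._data.get(namespace, [])[:top_k]


class NamespacedStore:
    """Hypothetical wrapper that remembers a default namespace."""

    def __init__(self, index: FakeIndex, namespace: Optional[str] = None) -> None:
        self._index = index
        self._namespace = namespace  # kept as an attribute, as proposed above

    def similarity_search(
        self, query: str, k: int = 4, namespace: Optional[str] = None
    ) -> List[str]:
        # Fall back to the stored namespace when none is passed explicitly.
        if namespace is None:
            namespace = self._namespace
        return self._index.query(top_k=k, namespace=namespace or "")
```

With two namespaces populated as in the reproduction above, a store built with namespace="test-namespace" would return only "foo", "bar" and "baz" from similarity_search, while an explicit namespace="test-namespace2" on the call would still override the default.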

LeoGrin, Mar 18 '23 12:03