langchain
langchain copied to clipboard
namespace argument not taken into account when creating Pinecone index
Quick summary
Using the namespace
argument in the function Pinecone.from_existing_index
has no effect. Indeed, it is passed to pinecone.Index
, which has no namespace
argument.
Steps to reproduce a relevant bug
import pinecone
from langchain.docstore.document import Document
from langchain.vectorstores.pinecone import Pinecone
from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings
index = pinecone.Index("langchain-demo") # this should be a new index
texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]
Pinecone.from_texts(
texts,
FakeEmbeddings(),
index_name="langchain-demo",
metadatas=metadatas,
namespace="test-namespace",
)
texts = ["foo2", "bar2", "baz2"]
metadatas = [{"page": i} for i in range(len(texts))]
Pinecone.from_texts(
texts,
FakeEmbeddings(),
index_name="langchain-demo",
metadatas=metadatas,
namespace="test-namespace2",
)
# Search with namespace
docsearch = Pinecone.from_existing_index("langchain-demo",
embedding=FakeEmbeddings(),
namespace="test-namespace")
output = docsearch.similarity_search("foo", k=6)
# check that we don't get results from the other namespace
page_contents = [o.page_content for o in output]
assert set(page_contents) == set(["foo", "bar", "baz"])
Fix
The namespace
argument used in Pinecone.from_existing_index
and Pinecone.from_texts
should be stored as an attribute and used by default by every method.