haystack-core-integrations

MongoDBAtlasDocumentStore doesn't recognize/use my connection string when creating MongoClient

Open scooter4j opened this issue 1 year ago • 2 comments

Describe the bug

When I try to create a MongoDBAtlasDocumentStore with my own MongoDB connection string, set via the environment variable os.environ["MONGO_CONNECTION_STRING"] = "mongodb+srv://scooter4j:[email protected]/?retryWrites=true&w=majority&appName=cluster0", I can't establish a connection to my MongoDB instance. Instead, I get the following error:

pymongo.errors.ServerSelectionTimeoutError: ac-qoeetyn-shard-00-01.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms),ac-qoeetyn-shard-00-02.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms),ac-qoeetyn-shard-00-00.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: <TopologyDescription id: 66a6d4661a19d4beecec5a30, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('ac-qoeetyn-shard-00-00.3oecvqa.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('ac-qoeetyn-shard-00-00.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>, <ServerDescription ('ac-qoeetyn-shard-00-01.3oecvqa.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('ac-qoeetyn-shard-00-01.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>, <ServerDescription ('ac-qoeetyn-shard-00-02.3oecvqa.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('ac-qoeetyn-shard-00-02.3oecvqa.mongodb.net:27017: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006) (configured timeouts: socketTimeoutMS: 20000.0ms, 
connectTimeoutMS: 20000.0ms)')>]> python-BaseException

The code sees my personal connection string and sets resolved_connection_string to my connection string, but this string isn't used when creating the MongoClient connection. See image below:

[Image: Screen Shot 2024-08-01 at 12 41 18 PM]

To Reproduce

My code:

import os

from InstructorEmbedding import INSTRUCTOR
from haystack_integrations.document_stores.mongodb_atlas import MongoDBAtlasDocumentStore
from haystack_integrations.components.retrievers.mongodb_atlas import MongoDBAtlasEmbeddingRetriever

os.environ["MONGO_CONNECTION_STRING"] = "mongodb+srv://scooter4j:[email protected]/?retryWrites=true&w=majority&appName=cluster0"

model = INSTRUCTOR('hkunlp/instructor-base')

instruction = "Represent the physical fitness paragraph for retrieval:"
query = "What days did my right leg have odd sensations?"

query_embedding = model.encode([[instruction, query]])

# Initialize the document store
document_store = MongoDBAtlasDocumentStore(
    database_name="sq_rag_sandbox",
    collection_name="training_notes",
    vector_search_index="vector_index",
)

print(f"Document store contains {document_store.count_documents()} documents")

retriever = MongoDBAtlasEmbeddingRetriever(document_store=document_store)

# example run query
blah = retriever.run(query_embedding=query_embedding[0].tolist())

print("placeholder....")

Note that I use Instructor for my embeddings, but that's immaterial, really, as the problem comes when trying to connect to the MongoDB Atlas cluster independently of getting the embeddings.

Describe your environment (please complete the following information):

  • OS: macOS 12.6.7

  • Haystack version (output of pip freeze | grep haystack):

    chroma-haystack==0.18.0
    haystack-ai @ git+https://github.com/deepset-ai/haystack.git@1c53aae8f09acb9866de2af503331865710857eb
    haystack-bm25==1.0.2
    haystack-experimental==0.1.0
    mongodb-atlas-haystack==0.4.1

  • Integration version: mongodb-atlas-haystack==0.4.1 (from the pip freeze output above)

Aug 06 '24 15:08 scooter4j

I've investigated the issue on my end and was able to set up the connection without any errors. The problem you're encountering might be due to a missing SSL certificate required for the connection. Installing the certifi package could resolve this. You can refer to this post for more details: PyMongo SSL Certificate Verify Failed.

Additionally, the official documentation offers resources for troubleshooting TLS errors: PyMongo TLS Troubleshooting.

I recommend trying these solutions and letting us know if the issue persists.

Aug 14 '24 12:08 Amnah199
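For anyone hitting the same "unable to get local issuer certificate" error, the sketch below shows one way to check which CA locations the Python interpreter uses by default, and to point that default at a different bundle for the current process. It relies only on the standard library. It assumes that PyMongo falls back to the interpreter's default CA locations when no tlsCAFile is given, which was not verified in this thread. The helper names default_ca_report and point_default_ca_at are hypothetical.

```python
import os
import ssl


def default_ca_report() -> dict:
    """Summarise where Python's ssl module looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        # cafile/capath are None when the resolved location does not exist,
        # which is a common cause of CERTIFICATE_VERIFY_FAILED.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # The location compiled into OpenSSL, whether or not it exists.
        "openssl_cafile": paths.openssl_cafile,
        "ssl_cert_file_env": os.environ.get("SSL_CERT_FILE"),
    }


def point_default_ca_at(bundle_path: str) -> None:
    """Use bundle_path as the default CA file for this process.

    OpenSSL reads the SSL_CERT_FILE environment variable when default
    verify paths are loaded, so this must run before the client is created.
    """
    if not os.path.isfile(bundle_path):
        raise FileNotFoundError(bundle_path)
    os.environ["SSL_CERT_FILE"] = bundle_path


# Hypothetical usage, assuming certifi is installed:
#   import certifi
#   point_default_ca_at(certifi.where())
#   print(default_ca_report())
```

If default_ca_report() shows cafile as None, the interpreter has no usable default bundle, which would be consistent with the error in the issue description.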

I've been down that path. In my own code I use certifi when setting up the MongoClient, as shown below, and I'm able to connect to MongoDB Atlas without problems using this code. However, one cannot specify the tlsCAFile parameter to pass to the MongoClient when creating a MongoDBAtlasDocumentStore object.

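import certifi
from pymongo import MongoClient

def connect_to_db(mongodb_uri, database, use_tls=True):
    # Pass certifi's CA bundle explicitly so that certificate verification
    # does not depend on the interpreter's default CA store.
    if use_tls:
        mongodb_client = MongoClient(mongodb_uri, tlsCAFile=certifi.where())
    else:
        mongodb_client = MongoClient(mongodb_uri)

    # Look up the database handle by name. MongoClient connects lazily, so
    # TLS errors only surface on the first real operation.
    db = mongodb_client[database]
    print("Connected to the MongoDB database!")
    return mongodb_client, db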

Aug 14 '24 14:08 scooter4j
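One possible workaround for the missing constructor parameter: tlsCAFile is also a MongoDB connection string option, so the CA bundle path can be carried in the URI itself. The sketch below appends it using only the standard library. The helper name with_tls_ca_file is hypothetical, and whether this resolves the error with mongodb-atlas-haystack 0.4.1 was not confirmed in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_tls_ca_file(uri: str, ca_file: str) -> str:
    """Return uri with its tlsCAFile query option set to ca_file.

    Credentials and host are left untouched; an existing tlsCAFile
    option, if any, is replaced rather than duplicated.
    """
    parts = urlsplit(uri)
    options = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "tlscafile"
    ]
    options.append(("tlsCAFile", ca_file))
    # Keep "/" readable in the path; other special characters are escaped.
    return urlunsplit(parts._replace(query=urlencode(options, safe="/")))


# Hypothetical usage, assuming certifi is installed and the variable is set:
#   import os, certifi
#   os.environ["MONGO_CONNECTION_STRING"] = with_tls_ca_file(
#       os.environ["MONGO_CONNECTION_STRING"], certifi.where()
#   )
```

The environment variable would need to be updated before the MongoDBAtlasDocumentStore is created, since the store reads the connection string from it.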

@scooter4j, could you provide any updates on whether this issue has been resolved?

Sep 16 '24 20:09 Amnah199

Closing this issue as stale.

Sep 20 '24 11:09 Amnah199