Problem with trying to use AWS OpenSearch
Hi,
I am trying to create an OpenSearchDocumentStore. I created an AWS OpenSearch domain using my AWS account (using root access to AWS).
I set the hosts argument to OpenSearchDocumentStore as hosts=[{'host': "blah.aos.us-east-1.on.aws", 'port': 443}]. The host is my OpenSearch domain endpoint (I've used blah in place of what's in the actual domain name) in AWS.
My issue is that I don't know how to set up the http_auth argument when creating the OpenSearchDocumentStore. From the Haystack code, it looks like one can pass either an OpenSearch username-password tuple or AWS authorization.
I decided to go with the AWS authorization, since I do not know which username and password is required. I created an IAM user after logging into my AWS account as root and set up access credentials per AWS instructions. I now have an IAM username, an ARN (which comprised my
Then, per the Haystack instructions for OpenSearchDocumentStore, I started Docker on my Windows computer and ran the following command (I added OPENSEARCH_INITIAL_ADMIN_PASSWORD because of the error message I got when I had not included it):
docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<myIAMpassword>" opensearchproject/opensearch:2.17.0
Then I created the OpenSearchDocumentStore by setting http_auth in a few different ways (with AWS4Auth, AWSV4SignerAuth, AWSV4SignerAsyncAuth) using my credentials (access key and secret access key). But whenever I tried to call the file converter in my pipeline, I got an authorization exception, like the below:
Traceback (most recent call last):
File "C:\Users\chawl\anaconda3\envs\ragenv\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\chawl\anaconda3\envs\ragenv\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\chawl\rag\ragenv\haystack_rag_docloader.py", line 281, in
I am not sure how to fix the issue. I have no idea where it gets the expected String-to-Sign, or why the AWS Secret Access Key I set (which I got when I set up myself as the IAM user with access credentials) is not correct.
I also tried setting the http_auth to an AWSAuth() instance, after setting os.environ['AWS_ACCESS_KEY_ID'], os.environ['AWS_SECRET_KEY_ID'], and os.environ['AWS_DEFAULT_REGION']. Again, I did not have an error creating the OpenSearchDocumentStore, but when I called write_documents, I got the error below:
Traceback (most recent call last):
File "C:\Users\chawl\anaconda3\envs\ragenv\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\chawl\anaconda3\envs\ragenv\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\chawl\rag\ragenv\haystack_rag_docloader.py", line 200, in
At this point, I've spent many days trying to get this to work and have read a lot about IAM, access credentials, and authorization on the AWS and OpenSearch websites, but to no avail. I could not find any documentation on Haystack's GitHub or website that helped resolve the issue.
I would really appreciate your help so I can start to use OpenSearchDocumentStore on my Windows machine, for my RAG project.
Thanks in advance, Sanjay
Hello @sanjayc2, if you run the docker command, you are running OpenSearch locally on your own machine. In that case you don't need an AWS account.
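If you later do want to point the document store at the AWS domain instead, one thing that may be worth checking first is the environment variable names. The post above sets AWS_SECRET_KEY_ID, while the name the AWS SDKs and CLI conventionally read is AWS_SECRET_ACCESS_KEY. A minimal sketch of such a check follows; the function name missing_aws_vars and the example values are hypothetical, and this is only a sanity check, not a diagnosis of the errors above:

```python
import os

# Standard variable names read by the AWS SDKs/CLI for static credentials.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env=None):
    """Return the standard AWS variable names that are unset or empty.

    `env` defaults to the real process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


# Hypothetical values that mirror the variable names used in the post.
example_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLE_KEY_ID",
    "AWS_SECRET_KEY_ID": "not-a-standard-name",
    "AWS_DEFAULT_REGION": "us-east-1",
}
print(missing_aws_vars(example_env))  # ['AWS_SECRET_ACCESS_KEY']
```

Calling missing_aws_vars() with no argument checks the environment of the current Python process.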
If you are just starting with this project, I would recommend using Docker locally. https://opensearch.org/blog/replacing-default-admin-credentials/
I can confirm that the steps listed here are still up to date: https://docs.haystack.deepset.ai/docs/opensearch-document-store#initialization
docker pull opensearchproject/opensearch:2.11.0
docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" opensearchproject/opensearch:2.11.0
Then, in a new Python environment, run pip install opensearch-haystack and execute the following:
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from haystack import Document
document_store = OpenSearchDocumentStore(
    hosts="http://localhost:9200",
    use_ssl=True,
    verify_certs=False,
    http_auth=("admin", "admin"),
)
document_store.write_documents([
    Document(content="This is first"),
    Document(content="This is second"),
])
print(document_store.count_documents())
Note that starting from OpenSearch version 2.12, there is no default password anymore: https://opensearch.org/blog/replacing-default-admin-credentials/ So the authentication in the example above only works that way with opensearch:2.11 or older.
Thank you very much.
Regarding running a later version of OpenSearch (e.g., 2.17) locally: will the instructions in https://opensearch.org/blog/replacing-default-admin-credentials/ allow that? If not, how would I need to set up the authentication? (Would I have to create an OpenSearch account with a username and password?) I ask because the newer version of OpenSearch is much faster and uses less memory (which is at a premium on my Windows machine).
Thank you again for your help.
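For a 2.12 or later image such as 2.17.0, here is a minimal sketch, assuming the container was started with -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=..." as in the docker run command earlier in this thread. Under that assumption no separate OpenSearch account should be needed: the built-in admin user takes the password you passed to the container. The helper names local_store_kwargs and make_local_store are hypothetical, and the HTTPS URL with verify_certs=False assumes the image's default self-signed demo certificate:

```python
def local_store_kwargs(password: str) -> dict:
    """Build keyword arguments for a local, security-enabled OpenSearch container.

    Assumption: the container was started with OPENSEARCH_INITIAL_ADMIN_PASSWORD
    set to `password`, so the built-in "admin" user accepts it.
    """
    if not password:
        raise ValueError("an admin password is required for OpenSearch 2.12 or later")
    return {
        "hosts": "https://localhost:9200",
        "use_ssl": True,
        "verify_certs": False,  # assumed: the demo certificate is self-signed
        "http_auth": ("admin", password),
    }


def make_local_store(password: str):
    """Create the document store; requires `pip install opensearch-haystack`."""
    # Imported lazily so the helper above can be used without Haystack installed.
    from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore

    return OpenSearchDocumentStore(**local_store_kwargs(password))
```

Calling make_local_store with the same password that was given to the container should return a store that can be used with write_documents and count_documents as in the 2.11 example above. As far as I know, the image rejects weak passwords at startup, so the value has to be a reasonably strong one.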