ml-commons
ml-commons copied to clipboard
Add integration tests for the RAG pipeline covering OpenAI and Bedrock.
Description
Integration (end-to-end) testing of RAG.
Issues Resolved
[List any issues this PR will resolve]
Check List
- [ ] New functionality includes testing.
- [ x] All tests pass
- [ ] New functionality has been documented.
- [ ] New functionality has javadoc added
- [ x] Commits are signed per the DCO using --signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
env:
JAVA_HOME_20.0.2_x64: /opt/hostedtoolcache/jdk/20.0.2/x64
JAVA_HOME: /opt/hostedtoolcache/jdk/20.0.2/x64
JAVA_HOME_20_0_2_X64: /opt/hostedtoolcache/jdk/20.0.2/x64
AWS_DEFAULT_REGION: us-west-2
AWS_REGION: us-west-2
AWS_ACCESS_KEY_ID: ***
AWS_SECRET_ACCESS_KEY: ***
AWS_SESSION_TOKEN: ***
@ylwu-amzn do you know if the AWS region used for Github CI is always going to be us-west-2? If so, I can just hard code that in my tests. I am currently using us-east-1 (that's what we have in the example in the public docs).
When I use AWS CLI to test it in us-west-2, I get this error:
An error occurred (AccessDeniedException) when calling the InvokeModel operation: Your account is not authorized to invoke this API operation.
The service doesn't appear to be GA in us-west-2.
That message clearly indicates that it's blocking my access. I am using AWS credentials with admin access.
OK, we figured it out. We had to sign up for Anthropic in each region. Can you check if the AWS account used for Github CI has access to all the Bedrock models in us-west-2?
@dhrubo-os? Can you check if the AWS account used for Github CI has permission to use Bedrock Anthropic in PDX?
@dhrubo-os It's only the Bedrock tests that are still failing. I am able to run them fine using my own AWS credentials in both us-east-1 and us-west-2 so I suspect the issue is with the AWS account that is tied to GH CI/CD. Who can help us find the right AWS account and enable Bedrock in us-west-2?
cc: @ylwu-amzn
@dhrubo-os It's only the Bedrock tests that are still failing. I am able to run them fine using my own AWS credentials in both us-east-1 and us-west-2 so I suspect the issue is with the AWS account that is tied to GH CI/CD. Who can help us find the right AWS account and enable Bedrock in us-west-2?
cc: @ylwu-amzn
This test ran just an hour ago. I see the spotless error only:
+import�static�org.opensearch.ml.utils.TestHelper.makeRequest;
+import�static�org.opensearch.ml.utils.TestHelper.toHttpEntity;
+
+import�java.nio.file.Files;
+import�java.nio.file.Path;
+import�java.util.Locale;
+import�java.util.Map;
+import�java.util.Set;
+
import�org.apache.hc.core5.http.HttpHeaders;
import�org.apache.hc.core5.http.io.entity.EntityUtils;
import�org.apache.hc.core5.http.message.BasicHeader;
@@ -28,155 +35,165 @@
import�org.opensearch.ml.common.MLTaskState;
import�org.opensearch.ml.utils.TestHelper;
-import�java.nio.file.Files;
-import�java.nio.file.Path;
-import�java.util.Locale;
-import�java.util.Map;
-import�java.util.Set;
-
-import�static�org.opensearch.ml.utils.TestHelper.makeRequest;
-import�static�org.opensearch.ml.utils.TestHelper.toHttpEntity;
+import�com.google.common.collect.ImmutableList;
+import�com.google.common.collect.ImmutableMap;
public�class�RestMLRAGSearchProcessorIT�extends�RestMLRemoteInferenceIT�{
����private�static�final�String�OPENAI_KEY�=�System.getenv("OPENAI_KEY");
-����private�static�final�String�OPENAI_CONNECTOR_BLUEPRINT�=
-��������"{\n"
-������������+�"����\"name\":�\"OpenAI�Chat�Connector\",\n"
-������������+�"����\"description\":�\"The�connector�to�public�OpenAI�model�service�for�GPT�3.5\",\n"
-������������+�"����\"version\":�2,\n"
-������������+�"����\"protocol\":�\"http\",\n"
-������������+�"����\"parameters\":�{\n"
-������������+�"��������\"endpoint\":�\"api.openai.com\",\n"
-������������+�"��������\"model\":�\"gpt-3.5-turbo\",\n"
-������������+�"��������\"temperature\":�0\n"
-������������+�"����},\n"
-������������+�"����\"credential\":�{\n"
-������������+�"��������\"openAI_key\":�\""�+�OPENAI_KEY�+�"\"\n"
... (363 more lines that didn't fit)
Violations also present in:
src\test\java\org\opensearch\ml\rest\RestMLRemoteInferenceIT.java
Run 'gradlew.bat :opensearch-ml-plugin:spotlessApply' to fix these violations.
Can we fix this first to see if this is working? I believe I gave all the access it requires for bedrock.
@dhrubo-os can you check again?
This is what I see in the latest run:
64 tests completed, 2 failed, 10 skipped
Tests with failures:
- org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithBedrock
- org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithBedrockWithConversation
That was a really bad merge. I may need to scrap this one and start over.
I created a new PR as this one is on a branch that seems way out of date/sync with upstream.
https://github.com/opensearch-project/ml-commons/pull/2213
Should we close this PR as I see another PR merged https://github.com/opensearch-project/ml-commons/pull/2213 ? @austintlee
This is already done. Closing.