ml-commons icon indicating copy to clipboard operation
ml-commons copied to clipboard

Add integration tests for the RAG pipeline covering OpenAI and Bedrock.

Open austintlee opened this issue 2 years ago • 13 comments

Description

Integration (end-to-end) testing of RAG.

Issues Resolved

[List any issues this PR will resolve]

Check List

  • [ ] New functionality includes testing.
    • [ x] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [ x] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

austintlee avatar Oct 13 '23 19:10 austintlee

env:
    JAVA_HOME_20.0.2_x64: /opt/hostedtoolcache/jdk/20.0.2/x64
    JAVA_HOME: /opt/hostedtoolcache/jdk/20.0.2/x64
    JAVA_HOME_20_0_2_X64: /opt/hostedtoolcache/jdk/20.0.2/x64
    AWS_DEFAULT_REGION: us-west-2
    AWS_REGION: us-west-2
    AWS_ACCESS_KEY_ID: ***
    AWS_SECRET_ACCESS_KEY: ***
    AWS_SESSION_TOKEN: ***

@ylwu-amzn do you know if the AWS region used for Github CI is always going to be us-west-2? If so, I can just hard code that in my tests. I am currently using us-east-1 (that's what we have in the example in the public docs).

austintlee avatar Oct 13 '23 22:10 austintlee

When I use AWS CLI to test it in us-west-2, I get this error:

An error occurred (AccessDeniedException) when calling the InvokeModel operation: Your account is not authorized to invoke this API operation.

The service doesn't appear to be GA in us-west-2.

austintlee avatar Oct 13 '23 23:10 austintlee

ar to be GA in us-west-2.

From the docs, bedrock is available for us-west-2?

dhrubo-os avatar Oct 13 '23 23:10 dhrubo-os

That message clearly indicates that it's blocking my access. I am using AWS credentials with admin access.

austintlee avatar Oct 13 '23 23:10 austintlee

OK, we figured it out. We had to sign up for Anthropic in each region. Can you check if the AWS account used for Github CI has access to all the Bedrock models in us-west-2?

austintlee avatar Oct 14 '23 00:10 austintlee

@dhrubo-os? Can you check if the AWS account used for Github CI has permission to use Bedrock Anthropic in PDX?

austintlee avatar Oct 24 '23 06:10 austintlee

@dhrubo-os It's only the Bedrock tests that are still failing. I am able to run them fine using my own AWS credentials in both us-east-1 and us-west-2 so I suspect the issue is with the AWS account that is tied to GH CI/CD. Who can help us find the right AWS account and enable Bedrock in us-west-2?

cc: @ylwu-amzn

austintlee avatar Nov 04 '23 17:11 austintlee

@dhrubo-os It's only the Bedrock tests that are still failing. I am able to run them fine using my own AWS credentials in both us-east-1 and us-west-2 so I suspect the issue is with the AWS account that is tied to GH CI/CD. Who can help us find the right AWS account and enable Bedrock in us-west-2?

cc: @ylwu-amzn

This test ran just an hour ago. I see the spotless error only:


          +import�static�org.opensearch.ml.utils.TestHelper.makeRequest;
          +import�static�org.opensearch.ml.utils.TestHelper.toHttpEntity;
          +
          +import�java.nio.file.Files;
          +import�java.nio.file.Path;
          +import�java.util.Locale;
          +import�java.util.Map;
          +import�java.util.Set;
          +
           import�org.apache.hc.core5.http.HttpHeaders;
           import�org.apache.hc.core5.http.io.entity.EntityUtils;
           import�org.apache.hc.core5.http.message.BasicHeader;
          @@ -28,155 +35,165 @@
           import�org.opensearch.ml.common.MLTaskState;
           import�org.opensearch.ml.utils.TestHelper;
           
          -import�java.nio.file.Files;
          -import�java.nio.file.Path;
          -import�java.util.Locale;
          -import�java.util.Map;
          -import�java.util.Set;
          -
          -import�static�org.opensearch.ml.utils.TestHelper.makeRequest;
          -import�static�org.opensearch.ml.utils.TestHelper.toHttpEntity;
          +import�com.google.common.collect.ImmutableList;
          +import�com.google.common.collect.ImmutableMap;
           
           public�class�RestMLRAGSearchProcessorIT�extends�RestMLRemoteInferenceIT�{
           
           ����private�static�final�String�OPENAI_KEY�=�System.getenv("OPENAI_KEY");
          -����private�static�final�String�OPENAI_CONNECTOR_BLUEPRINT�=
          -��������"{\n"
          -������������+�"����\"name\":�\"OpenAI�Chat�Connector\",\n"
          -������������+�"����\"description\":�\"The�connector�to�public�OpenAI�model�service�for�GPT�3.5\",\n"
          -������������+�"����\"version\":�2,\n"
          -������������+�"����\"protocol\":�\"http\",\n"
          -������������+�"����\"parameters\":�{\n"
          -������������+�"��������\"endpoint\":�\"api.openai.com\",\n"
          -������������+�"��������\"model\":�\"gpt-3.5-turbo\",\n"
          -������������+�"��������\"temperature\":�0\n"
          -������������+�"����},\n"
          -������������+�"����\"credential\":�{\n"
          -������������+�"��������\"openAI_key\":�\""�+�OPENAI_KEY�+�"\"\n"
      ... (363 more lines that didn't fit)
  Violations also present in:
      src\test\java\org\opensearch\ml\rest\RestMLRemoteInferenceIT.java
  Run 'gradlew.bat :opensearch-ml-plugin:spotlessApply' to fix these violations.

Can we fix this first to see if this is working? I believe I gave all the access it requires for bedrock.

dhrubo-os avatar Nov 04 '23 19:11 dhrubo-os

@dhrubo-os can you check again?

This is what I see in the latest run:

64 tests completed, 2 failed, 10 skipped
Tests with failures:
 - org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithBedrock
 - org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithBedrockWithConversation

austintlee avatar Nov 04 '23 20:11 austintlee

That was a really bad merge. I may need to scrap this one and start over.

austintlee avatar Mar 01 '24 00:03 austintlee

I created a new PR as this one is on a branch that seems way out of date/sync with upstream.

austintlee avatar Mar 18 '24 22:03 austintlee

https://github.com/opensearch-project/ml-commons/pull/2213

austintlee avatar Mar 18 '24 22:03 austintlee

Should we close this PR as I see another PR merged https://github.com/opensearch-project/ml-commons/pull/2213 ? @austintlee

ylwu-amzn avatar Mar 21 '24 17:03 ylwu-amzn

This is already done. Closing.

austintlee avatar Sep 29 '24 00:09 austintlee