ml-commons [Feature/multi_tenancy] Tenant-aware integration tests for Connector, Model, Agent, Model Groups

Description

Tests the entire Create-Get-Update-Search-Delete cycle with multi-tenancy (both enabled and not).

Validates expected results when tenant aware and when not (current status quo).

Looking for review on the approach before replicating to models and agents and other tenant-aware updates, although I won't wait before starting that work. If you want a simpler review, get started now. :)

To execute just these tests:

./gradlew ':opensearch-ml-plugin:integTest' -Dtests.rest.tenantaware=true

or

./gradlew ':opensearch-ml-plugin:integTest' -Dtests.rest.tenantaware=false

Because the classes end in IT, the "false" version executes with normal integ tests as well, a separate test needs to be done for the true case.

Check List

[x] New functionality includes testing.
[x] Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Aug 09 '24 05:08 dbwiddis

Should to separate out all the bug related fixes in one PR and then may be a separate PR for integ tests?

Aug 30 '24 14:08 dhrubo-os

Should to separate out all the bug related fixes in one PR and then may be a separate PR for integ tests?

Why?

Aug 30 '24 15:08 dbwiddis

Should to separate out all the bug related fixes in one PR and then may be a separate PR for integ tests?

Why?

easier review, clearer commit history, simplified debugging and rollback (if needed), maintaining PR scope.

Aug 30 '24 15:08 dhrubo-os

It's somewhat normal to write a (failing) test and bug fix in the same PR.

This was sitting in a mergeable state for over a week with failing tests commented out before the latest fixes. Fixes without a corresponding failing (integ) test will lack the context of why they are being changed.

Seems like it will continue to delay continuing to fix other bugs with a lot of admin work.

Aug 30 '24 16:08 dbwiddis

./gradlew ':opensearch-ml-plugin:integTest' --tests "org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithCohereUsingLlmResponseField" is currently failing, output from the model is

  "ext": {
    "retrieval_augmented_generation": {
      "error": "Unknown error or response."
    }
  }

Aug 30 '24 16:08 dbwiddis

./gradlew ':opensearch-ml-plugin:integTest' --tests "org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithCohereUsingLlmResponseField" is currently failing, output from the model is
  "ext": {
    "retrieval_augmented_generation": {
      "error": "Unknown error or response."
    }
  }

Let me retry again. Yesterday test was passing in my PRs.

Aug 30 '24 16:08 dhrubo-os

Yesterday test was passing in my PRs.

It passes without the cohere api key change.

Looks like BM25_SEARCH_REQUEST_WITH_LLM_RESPONSE_FIELD_TEMPLATE expects 5 strings before 3 ints, but we are sending 7 strings resulting in two nulls:

 {
     "_source": ["text"],
     "query" : {
       "match": {"text": "president"}
     },
      "ext": {
         "generative_qa_parameters": {
           "llm_model": "command",
           "llm_question": "who is lincoln",
           "context_size": null,
           "message_size": null,
           "timeout": 5,
           "llm_response_field": "5"
         }
     }
   }

Aug 30 '24 16:08 dbwiddis

Looks like #2246, #2250, and #2257 are related but the tests was disabled at the time (and still lis with your PR). There was a recent pull to main, probably should put that in in favor of my fix.

Aug 30 '24 16:08 dbwiddis

I just need this commit: https://github.com/opensearch-project/ml-commons/commit/659a836775047e1d8f88a09eced7828d92e65d41

Aug 30 '24 16:08 dbwiddis

I just need this commit: 659a836

Backported and merged to this branch.

Aug 30 '24 16:08 dhrubo-os

I just need this commit: 659a836

Backported and merged to this branch.

Thanks! I was in the process of doing that on my PR branch. I'll see if I can extract some of the bugfix commits from this PR without too much effort.

Aug 30 '24 16:08 dbwiddis

Tests are now passing on this PR but it's failing on coverage checks for the remote model. Guess I'll add that to jacoco suppressions for now.

> Task :opensearch-ml-plugin:jacocoTestCoverageVerification FAILED
[ant:jacocoReport] Rule violated for class org.opensearch.ml.sdkclient.SdkClientFactory: lines covered ratio is 0.7, but expected minimum is 0.8
[ant:jacocoReport] Rule violated for class org.opensearch.ml.sdkclient.SdkClientFactory.SocketAccess: lines covered ratio is 0.0, but expected minimum is 0.8

Aug 30 '24 17:08 dbwiddis

This is pretty consistently failing on these tests:

Tests with failures:
117 tests completed, 2 failed, 14 skipped
 - org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testOpenAITextEmbeddingModel_ISO8859_1
 - org.opensearch.ml.rest.RestMLRemoteInferenceIT.testOpenAITextEmbeddingModel_ISO8859_1

These are not associated with my commits, but possibly related to other recent merges on feature/multi_tenancy.

Oct 24 '24 05:10 dbwiddis

This is pretty consistently failing on these tests:
Tests with failures:
117 tests completed, 2 failed, 14 skipped
 - org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testOpenAITextEmbeddingModel_ISO8859_1
 - org.opensearch.ml.rest.RestMLRemoteInferenceIT.testOpenAITextEmbeddingModel_ISO8859_1
These are not associated with my commits, but possibly related to other recent merges on feature/multi_tenancy.

Yeah I noticed these tests are also failing in 2.x branch. So this is related to OpenAi api. I'll look into this tomorrow to fix.

Oct 24 '24 06:10 dhrubo-os

Closing in favor of #3516 which squashed and merged these commits

Feb 12 '25 00:02 dbwiddis

ml-commons ml-commons copied to clipboard

[Feature/multi_tenancy] Tenant-aware integration tests for Connector, Model, Agent, Model Groups

Description

Check List

ml-commons
ml-commons copied to clipboard