
Support OCI Connector and conversation search with GenAI

Open khoaisohd opened this issue 1 year ago • 7 comments

Description

We need to support an ML connector for Oracle Cloud Infrastructure (OCI) and the OCI GenAI service so that conversational search can work on Oracle Cloud Infrastructure.

Issues Resolved

https://github.com/opensearch-project/ml-commons/issues/2049

Changes

Add OCI_SIGV1 protocol and connector to call OCI services API

  • Add dependencies on the OCI Java SDK common library V2
  • Add implementations for OciConnector and OciConnectorExecutor
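For readers unfamiliar with signature-based request signing, here is a minimal sketch of the general idea: build a signing string from selected headers, sign it with RSA-SHA256, and place the result in an `Authorization` header. It is not the code in this PR, which delegates signing to the OCI Java SDK. The class name, the header set, the `keyId` value, and the host are all illustrative assumptions, and the key pair is generated on the fly only for demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only; the PR itself uses the OCI Java SDK for signing.
public class HttpSignatureSketch {

    // Joins "name: value" pairs with newlines, preserving insertion order.
    static String signingString(Map<String, String> headers) {
        StringJoiner joiner = new StringJoiner("\n");
        headers.forEach((name, value) ->
            joiner.add(name.toLowerCase(Locale.ROOT) + ": " + value));
        return joiner.toString();
    }

    // Signs the signing string with RSA-SHA256 and returns it Base64-encoded.
    static String sign(String signingString, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(signingString.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Verifies a Base64-encoded signature against the signing string.
    static boolean verify(String signingString, String signatureBase64, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(signingString.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(Base64.getDecoder().decode(signatureBase64));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assembles an Authorization header value; the exact field layout is an assumption.
    static String authorizationHeader(String keyId, Map<String, String> headers, String signature) {
        return "Signature version=\"1\",keyId=\"" + keyId + "\",algorithm=\"rsa-sha256\""
            + ",headers=\"" + String.join(" ", headers.keySet()) + "\""
            + ",signature=\"" + signature + "\"";
    }

    // Throwaway key pair for the demo; real callers would load their own API key.
    static KeyPair newDemoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("(request-target)", "get /example/path"); // placeholder path
        headers.put("date", "Tue, 13 Feb 2024 04:02:00 GMT");
        headers.put("host", "example.invalid");               // placeholder host

        KeyPair pair = newDemoKeyPair();
        String toSign = signingString(headers);
        String signature = sign(toSign, pair.getPrivate());

        System.out.println(authorizationHeader("demo-key-id", headers, signature));
        System.out.println("verified: " + verify(toSign, signature, pair.getPublic()));
    }
}
```

Using the SDK's signer, as the PR does, avoids hand-maintaining details like this.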

Support conversational search with OCI GenAI

  • Add OCI_GENAI LLM provider
  • Whitelist the OCI GenAI service endpoint
  • Update DefaultLlmImpl to handle OCI_GENAI LLM provider
  • Generate prompt for OCI_GENAI LLM provider
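The provider handling listed above could look roughly like the following. This is a hedged sketch, not the actual `DefaultLlmImpl` change: the `Provider` enum, the `oci_genai/` model-name prefix, and the flat prompt layout are all assumptions made for illustration.

```java
import java.util.List;

// Hypothetical sketch; names and prompt layout are not taken from the PR.
public class ProviderPromptSketch {

    // Illustrative provider set, not the real ml-commons enum.
    enum Provider { OPENAI, OCI_GENAI }

    // Assumption: the provider is inferred from a model-name prefix.
    static Provider providerFor(String modelName) {
        if (modelName != null && modelName.startsWith("oci_genai/")) {
            return Provider.OCI_GENAI;
        }
        return Provider.OPENAI;
    }

    // Builds a single text prompt from retrieved passages and the user question.
    static String buildPrompt(Provider provider, String question, List<String> contexts) {
        switch (provider) {
            case OCI_GENAI: {
                StringBuilder prompt = new StringBuilder("Context:\n");
                for (String context : contexts) {
                    prompt.append("- ").append(context).append('\n');
                }
                prompt.append("\nQuestion: ").append(question).append("\nAnswer:");
                return prompt.toString();
            }
            default:
                // Other providers typically use different payloads; out of scope here.
                throw new UnsupportedOperationException("Not covered by this sketch: " + provider);
        }
    }

    public static void main(String[] args) {
        Provider provider = providerFor("oci_genai/some-model"); // placeholder model name
        System.out.println(buildPrompt(provider, "What is OCI?",
            List.of("OCI stands for Oracle Cloud Infrastructure.")));
    }
}
```

Keeping prompt generation behind a per-provider branch means a new provider only adds a case rather than touching the existing ones.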

Add a few more test cases to improve code coverage from 82.68% to 86.81%

Test Plan

  • Unit test
  • End-to-end test on a cluster with the new OCI LLM provider

Check List

  • [ ] New functionality includes testing.
    • [ ] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [ ] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

khoaisohd avatar Feb 13 '24 04:02 khoaisohd

Codecov Report

Attention: Patch coverage is 86.80556% with 19 lines in your changes missing coverage. Please review.

Project coverage is 81.91%. Comparing base (7add721) to head (6aa57ab). Report is 6 commits behind head on main.

:exclamation: Current head 6aa57ab differs from pull request most recent head 47e8278

Please upload reports for the commit 47e8278 to get more accurate results.

Files | Patch % | Lines
...engine/algorithms/remote/OciConnectorExecutor.java | 83.92% | 7 Missing and 2 partials :warning:
...g/opensearch/ml/common/connector/OciConnector.java | 82.22% | 8 Missing :warning:
...estionanswering/generative/llm/DefaultLlmImpl.java | 90.47% | 0 Missing and 2 partials :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2088      +/-   ##
============================================
+ Coverage     81.12%   81.91%   +0.78%     
+ Complexity     5991     5700     -291     
============================================
  Files           565      545      -20     
  Lines         24822    22993    -1829     
  Branches       2619     2368     -251     
============================================
- Hits          20138    18835    -1303     
+ Misses         3580     3222     -358     
+ Partials       1104      936     -168     
Flag Coverage Δ
ml-commons 81.91% <86.80%> (+0.78%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.


codecov[bot] avatar Feb 13 '24 05:02 codecov[bot]

DCO is missing

dhrubo-os avatar Feb 13 '24 22:02 dhrubo-os

Rebase on top of main

khoaisohd avatar Feb 23 '24 16:02 khoaisohd

Rebased and fixed the security check by updating org.apache.commons:commons-compress from 1.25.0 to 1.26.0

khoaisohd avatar Feb 23 '24 17:02 khoaisohd

Re-triggering. There are no changes, yet the test that previously passed on Linux now fails.

khoaisohd avatar Feb 27 '24 15:02 khoaisohd

The PR passed all the checks before. I only added a few comments to address PR feedback, and now the Linux test always fails because of disk issues. I tried re-triggering the tests multiple times, but it does not help. Are there any instructions I can follow to clean up the testing environment? cc @dhrubo-os @ylwu-amzn @samuel-oci

{"error":{"root_cause":[{"type":"m_l_limit_exceeded_exception","reason":"Disk Circuit Breaker is open, please check your resources!"}],"type":"m_l_limit_exceeded_exception","reason":"Disk Circuit Breaker is open, please check your resources!"},"status":500}
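To check whether a test host is simply low on disk, something like the following can help. It is a minimal sketch, assuming the disk circuit breaker trips when free space falls below a fixed threshold. The 5 GiB value and the class name are placeholders, not the actual ml-commons configuration.

```java
import java.io.File;

// Hypothetical diagnostic sketch; the threshold is an assumed placeholder.
public class DiskBreakerSketch {

    // Assumed minimum free space (5 GiB); not read from ml-commons.
    static final long ASSUMED_MIN_FREE_BYTES = 5L * 1024 * 1024 * 1024;

    // Returns true when free space is below the threshold, i.e. the breaker would be open.
    static boolean isOpen(long freeBytes, long minFreeBytes) {
        return freeBytes < minFreeBytes;
    }

    public static void main(String[] args) {
        // Directory to inspect; defaults to the current working directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        long freeBytes = dir.getUsableSpace();
        System.out.printf("usable bytes in %s: %d%n", dir.getAbsolutePath(), freeBytes);
        System.out.println("breaker would be open: " + isOpen(freeBytes, ASSUMED_MIN_FREE_BYTES));
    }
}
```

Running it against the data directory of the CI runner would show whether the failure comes from the environment rather than from the PR.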

khoaisohd avatar Mar 01 '24 04:03 khoaisohd

> Can you add a blueprint of OCI_GENAI LLM in the ml-commons/docs/remote_inference_blueprints that has been used in this conversational search test example? I don't see it in the current repo.

Yeah, added documentation.

khoaisohd avatar Mar 04 '24 22:03 khoaisohd

@samuel-oci @khoaisohd Do you guys want to get this ready for the 2.18 release? Are there any Oracle customers asking for this feature? @khoaisohd can you rebase against main?

austintlee avatar Sep 29 '24 00:09 austintlee