ml-commons [FEATURE] Support for HttpConnector request and response body transformations through scripts

Is your feature request related to a problem? Client applications that call various HTTP endpoints through remote models and HTTP connectors have to be vendor-aware, because each LLM vendor (e.g. OpenAI, Cohere, and AWS Bedrock) defines its own inputs and outputs for its inference API. As a result, every client application ends up duplicating the same vendor-specific logic.

What solution would you like? At a minimum, it would be good to have a way to map vendor-specific parameters to a common set of parameters that client applications can use.

For cases that fall outside the common set of parameters that work across multiple vendors (those that cover 90% of the use cases we know of today), we can provide support for Mustache or Painless scripts to allow users to customize how inputs are prepared before being sent to LLMs and how outputs are presented back to the calling application. This can be used to tweak prompts and transform responses.

What alternatives have you considered? A clear and concise description of any alternative solutions or features you've considered.

Do you have any additional context?

Oct 10 '23 17:10 austintlee

OpenAI vs Bedrock

OpenAI Input

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Output

{
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "finish_reason": "stop"
  }]
}

Bedrock (based on what I see in the blueprint) Input

{
  "prompt": "\\n\\nHuman: this is my question\\n\\nAssistant:"
}

Output

{
  "completion": "this is your answer."
}
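To make the gap between the two formats above concrete, here is a minimal Python sketch of the kind of request transformation this issue asks for. It is not ml-commons code: the function names `to_openai_request` and `to_bedrock_request` are hypothetical, the common input is assumed to be an OpenAI-style list of role/content messages, and folding system messages into a Human turn is an assumption made only for illustration. The sketch emits real newline characters, whereas the blueprint example above shows escaped ones.

```python
def to_openai_request(messages, model="gpt-3.5-turbo"):
    """Build an OpenAI-style chat request body from common messages."""
    return {"model": model, "messages": list(messages)}


def to_bedrock_request(messages):
    """Flatten common messages into the single-prompt body shown above."""
    parts = []
    for message in messages:
        if message["role"] == "assistant":
            parts.append("\n\nAssistant: " + message["content"])
        else:
            # Assumption: system and user messages are both rendered as Human turns.
            parts.append("\n\nHuman: " + message["content"])
    # The prompt ends with an open Assistant turn for the model to complete.
    return {"prompt": "".join(parts) + "\n\nAssistant:"}


common_messages = [{"role": "user", "content": "this is my question"}]
print(to_openai_request(common_messages))
print(to_bedrock_request(common_messages))
```

With something along these lines, a client would only ever build the common message list, and the per-vendor shaping would live in one place.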

Oct 10 '23 19:10 austintlee

Hey @austintlee I believe LiteLLM can help here.

consistent i/o

We simplify these LLM API calls by translating from the OpenAI format to provider-specific formats.

Here's our current OpenAI param mapping (missing coverage occurs if the provider just doesn't offer an equivalent param): [screenshot of the parameter mapping table, 2023-10-21]

We also guarantee a consistent input/output format:

from litellm import completion
import os

## set ENV variables 
os.environ["OPENAI_API_KEY"] = "your-openai-key" 
os.environ["COHERE_API_KEY"] = "your-cohere-key" 

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)

With a guaranteed consistent output, text responses will always be available at ['choices'][0]['message']['content']
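As an illustration of what a consistent output shape buys the caller, the following sketch wraps the Bedrock-style output shown earlier in this thread in the OpenAI-style `choices` structure, so the text can be read from the same path. This is not LiteLLM's implementation; `normalize_bedrock_response` and `get_text` are hypothetical helper names, and the sketch assumes the raw body carries only a `completion` field.

```python
def normalize_bedrock_response(raw):
    """Wrap a {"completion": ...} body in an OpenAI-style choices list."""
    return {
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": raw["completion"]},
            }
        ]
    }


def get_text(response):
    """Read the text from the common path, regardless of the original vendor."""
    return response["choices"][0]["message"]["content"]


raw_bedrock = {"completion": "this is your answer."}
print(get_text(normalize_bedrock_response(raw_bedrock)))
```

The caller only needs `get_text`; any vendor-specific reshaping stays inside the normalizer.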

bedrock

import os 
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
    model="anthropic.claude-instant-v1",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)

https://docs.litellm.ai/docs/providers/bedrock

Oct 22 '23 01:10 krrishdholakia

@krrishdholakia Thanks for bringing this to our attention. Let's move this discussion over to #1495.

Oct 24 '23 05:10 austintlee

@austintlee, can you provide specific examples of where Painless scripts lack the utilities you need to perform request/response data transformations?

What is the model or AI service / API you were trying to integrate with?
What is the specific function that is missing, and can you point to equivalent functions in other languages?

Are there specific tensor transformation functions that you need? For instance, are there specific functions available in these tools that you need in OpenSearch:

https://d2l.djl.ai/chapter_preliminaries/ndarray.html
https://spark.apache.org/docs/latest/ml-features

Feb 05 '24 20:02 dylan-tong-aws

You can find the detail here -> #1990

Feb 05 '24 21:02 austintlee

@austintlee With this PR https://github.com/opensearch-project/ml-commons/pull/1954, we can support pre/post process functions on any type of input data starting from 2.12, not just text docs input data.

BTW, I replied in https://github.com/opensearch-project/ml-commons/issues/1990#issuecomment-1928167932 with the correct post process function.

Feb 05 '24 22:02 ylwu-amzn
