ml-commons
[FEATURE] Allow the RAG processor to use HttpConnectors directly
Is your feature request related to a problem? The RAG processor today makes remote inference calls (chat completion) via remote models. Although the model framework exposes a single interface for interacting with models of all types, for remote inference each vendor (such as OpenAI, Cohere, AWS SageMaker, and Bedrock) has its own API with different input parameters, and each returns inference results in its own format.
The model framework handles this today by accepting a Map<String, String> in MLInput and returning the remote endpoint's response as another bag of properties, dataAsMap, nested deep inside MLOutput. The RAG pipeline is currently implemented against OpenAI's input and output formats and does not work with any other remote inference endpoint. If we were to simply return dataAsMap to the calling application, we would put the burden of parsing it on the client, which would have to maintain vendor-specific if-else logic. That also runs counter to the whole point of putting connectors behind the model interface, which is to push the connector implementation behind the scenes.
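For illustration, here is a minimal sketch of the client-side parsing that returning dataAsMap directly would force on every application. The class and method names are hypothetical, and the response keys follow the published OpenAI and Cohere response shapes; the exact structure a client sees depends on the connector configuration.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: `dataAsMap` is the raw property bag a client would have to dig out of
// the model tensor nested inside MLOutput. The keys ("choices", "message", "content",
// "generations", "text") mirror the public OpenAI and Cohere response formats.
public class VendorSpecificAnswerExtractor {

    @SuppressWarnings("unchecked")
    public static String extractAnswer(Map<String, ?> dataAsMap, String vendor) {
        // The kind of per-vendor branching every client application would have to maintain.
        if ("openai".equals(vendor)) {
            List<Map<String, ?>> choices = (List<Map<String, ?>>) dataAsMap.get("choices");
            Map<String, ?> message = (Map<String, ?>) choices.get(0).get("message");
            return (String) message.get("content");
        } else if ("cohere".equals(vendor)) {
            List<Map<String, ?>> generations = (List<Map<String, ?>>) dataAsMap.get("generations");
            return (String) generations.get(0).get("text");
        }
        throw new IllegalArgumentException("Unsupported vendor: " + vendor);
    }
}
```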
What solution would you like? I don't think it makes sense to try to make the RAG pipeline work with arbitrary (HTTP) connectors. It is designed to work with remote inference endpoints that perform a specific function, i.e., chat completion. We can pick a few well-known vendors (OpenAI, Cohere, and AWS) and support them by mapping their inputs and outputs to a standard set of input parameters ("model" and "question") and outputs ("error" and "answer"). This will probably cover 90% of RAG use cases and keep all client-side code simple and clean. Ideally, this should be handled by the model and connector framework (I will open a separate issue for this).
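As a rough sketch of what that mapping layer could look like (the interface, class names, and response keys below are hypothetical, not existing ml-commons APIs), each supported vendor would translate the standard parameters into its own request shape and fold its response back into the standard "answer"/"error" fields:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mapper: translates the standard RAG parameters ("model", "question") into a
// vendor-specific request body and folds the vendor response back into ("answer", "error").
interface ChatCompletionMapper {
    Map<String, Object> toVendorRequest(String model, String question);
    Map<String, String> toStandardOutput(Map<String, ?> dataAsMap);
}

class OpenAiChatCompletionMapper implements ChatCompletionMapper {
    @Override
    public Map<String, Object> toVendorRequest(String model, String question) {
        // OpenAI chat completion request shape.
        return Map.of(
            "model", model,
            "messages", List.of(Map.of("role", "user", "content", question)));
    }

    @Override
    @SuppressWarnings("unchecked")
    public Map<String, String> toStandardOutput(Map<String, ?> dataAsMap) {
        Map<String, String> standard = new HashMap<>();
        if (dataAsMap.containsKey("error")) {
            standard.put("error", String.valueOf(dataAsMap.get("error")));
            return standard;
        }
        List<Map<String, ?>> choices = (List<Map<String, ?>>) dataAsMap.get("choices");
        Map<String, ?> message = (Map<String, ?>) choices.get(0).get("message");
        standard.put("answer", (String) message.get("content"));
        return standard;
    }
}
```

A Cohere or Bedrock mapper would implement the same interface, so the RAG processor (and any client) only ever sees "model"/"question" in and "answer"/"error" out.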
What alternatives have you considered? #1475
Notes from the community meeting: create the connector, then create the model automatically; or, creating the model could also create a connector.
Leave it in the backlog for now; @austintlee will bring it up for discussion later.