ml-commons

[FEATURE] DNS resolution support for model endpoints

Open gaurav7830 opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? The current implementation has scaling challenges when a model is used at scale. Some of them are listed below.

  1. There is no way to dynamically check whether the model/endpoint is healthy before calling the underlying model.
  2. There is no way to distribute load from a single registered model across multiple model endpoints that serve the same underlying model.
  3. If multiple clients use the same model endpoint and that endpoint goes down, there is no way to move away from it dynamically other than updating the model.

What solution would you like? We have come up with a solution in which the model configuration uses a DNS endpoint instead of a direct model endpoint. When a predict API call is made, that DNS name is resolved to the actual underlying model endpoint. DNS services such as Route53 can balance load across multiple endpoints with weighted routing. We can also set up a mechanism in the DNS service to monitor the health of the model. If a model endpoint has an issue, we only have to update the DNS entry with the new endpoint rather than updating the model configuration.
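To make the resolve-at-predict-time idea concrete, here is a minimal Java sketch. It is not taken from ml-commons: the class `DnsEndpointResolver`, the `HostLookup` interface, and the hostname `models.example.internal` are all hypothetical names used for illustration. It assumes that "resolving" means an ordinary address lookup through the JVM resolver (`InetAddress.getAllByName`); the actual feature might instead follow a CNAME or another record type.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class DnsEndpointResolver {

    /** Hypothetical lookup abstraction so the DNS call can be stubbed or replaced. */
    public interface HostLookup {
        List<String> lookup(String host) throws UnknownHostException;
    }

    /** Default lookup backed by the JVM's standard resolver. */
    public static final HostLookup SYSTEM_LOOKUP = host -> {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    };

    /**
     * Resolves the host of the configured endpoint and returns one candidate
     * URI per address, keeping scheme, port, path and query unchanged.
     */
    public static List<URI> resolve(URI configured, HostLookup lookup)
            throws UnknownHostException, URISyntaxException {
        List<URI> targets = new ArrayList<>();
        for (String address : lookup.lookup(configured.getHost())) {
            targets.add(new URI(
                configured.getScheme(),
                configured.getUserInfo(),
                address,
                configured.getPort(),
                configured.getPath(),
                configured.getQuery(),
                configured.getFragment()));
        }
        return targets;
    }
}
```

A caller would invoke `DnsEndpointResolver.resolve(URI.create("https://models.example.internal/invocations"), DnsEndpointResolver.SYSTEM_LOOKUP)` just before each predict call, so that a changed DNS entry takes effect without touching the model configuration. Note that request signing and TLS validation generally depend on the original hostname, so a real implementation would probably need to preserve it rather than send requests to a bare IP address.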

As part of this feature, I am in favour of creating a new protocol ("dns") that extends aws_sigv4. The new protocol will internally use all of the aws_sigv4 functionality and add the ability to resolve a DNS endpoint.
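One way to picture a protocol that reuses aws_sigv4 and only adds DNS resolution is a thin wrapper around the existing executor. The sketch below is an assumption about the shape of such a design, not the ml-commons API: `PredictExecutor`, `DnsResolvingExecutor`, and the resolver function are hypothetical names.

```java
import java.util.function.UnaryOperator;

public class DnsProtocolSketch {

    /** Hypothetical stand-in for whatever executes a signed predict call. */
    public interface PredictExecutor {
        String predict(String endpoint, String payload);
    }

    /**
     * Wraps an existing executor (for example one that signs with SigV4) and
     * resolves the configured DNS endpoint before delegating to it.
     */
    public static class DnsResolvingExecutor implements PredictExecutor {
        private final PredictExecutor delegate;
        private final UnaryOperator<String> resolver;

        public DnsResolvingExecutor(PredictExecutor delegate, UnaryOperator<String> resolver) {
            this.delegate = delegate;
            this.resolver = resolver;
        }

        @Override
        public String predict(String endpoint, String payload) {
            // Resolve on every call so DNS updates are picked up without a model update.
            String resolved = resolver.apply(endpoint);
            return delegate.predict(resolved, payload);
        }
    }
}
```

With this shape, all signing behaviour stays in the wrapped executor, and the "dns" protocol contributes only the resolution step in front of it.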

gaurav7830, Mar 05 '24 12:03