kubedl
[feature request] inference pipeline support
What would you like to be added: Does KubeDL also support optimizing inference pipelines, i.e., situations where we have a set of consecutive models chained in a sequential pipeline? It would be nice if this feature could be added in future versions.
Why is this needed:
Hi @saeid93, thanks for raising this. Currently, it is not supported. We welcome your contribution if you would like to add this!
@saeid93 hi, let me make sure I understand what you mean:
you have a set of consecutive models deployed in different pods; a user initiates an inference request, it hits these model-serving pods in some order (defined by your pipeline), and the result is returned.
Do I understand it correctly?
@SimonCqk Exactly!
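For readers new to the idea, the flow described above can be sketched in a few lines. This is only an illustration, not KubeDL code: the stage names (`preprocess`, `classify`) and the `run_pipeline` helper are hypothetical, and plain Python functions stand in for what would, in a real deployment, be HTTP/gRPC calls to separate model-serving pods.

```python
from typing import Any, Callable, List

def run_pipeline(stages: List[Callable[[Any], Any]], request: Any) -> Any:
    """Feed a request through each model stage in order.

    Each stage here is a plain function; in a real serving setup each
    would be a network call to a separate model-serving pod, invoked
    in the sequence the pipeline defines.
    """
    result = request
    for stage in stages:
        result = stage(result)
    return result

# Hypothetical two-stage pipeline: normalize raw pixel values,
# then a toy "classifier" that thresholds their sum.
def preprocess(pixels):
    return [v / 255.0 for v in pixels]

def classify(features):
    return int(sum(features) > 1.0)

print(run_pipeline([preprocess, classify], [200, 180, 90]))  # -> 1
```

The scheduling question the feature request raises is which pods to place where (and how to autoscale each stage) so that the end-to-end latency of this chain, not just each model in isolation, is optimized.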