MLServer

Deal with expensive explain calls

Open seldondev opened this issue 4 years ago • 0 comments

Currently, the Explain endpoint is served as a v2 predict endpoint, which is synchronous by design (from the client's perspective).

For some explanations, especially when we are not using a GPU, the call to explain is expensive and cannot be done synchronously (timeouts, etc.).

We might need to make it async, i.e. start_explain returns an id and the client then checks get_explain_results with this id (these are just examples).
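To make the idea concrete, here is a rough sketch of the start/poll pattern using only the Python standard library. Nothing here is MLServer API: `ExplainJobs`, `slow_explain`, `wait_for`, and the status strings are all hypothetical, and the method names `start_explain` / `get_explain_results` are just the placeholder names from above. It assumes an in-memory job registry in a single process, which would not survive restarts or work across replicas.

```python
import threading
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class ExplainJobs:
    """Hypothetical in-memory job registry (illustration only, not MLServer API)."""

    def __init__(self, explain_fn: Callable[[Any], Any], max_workers: int = 2) -> None:
        self._explain_fn = explain_fn
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}
        self._lock = threading.Lock()

    def start_explain(self, payload: Any) -> str:
        """Schedule the expensive explain call and return an id immediately."""
        job_id = uuid.uuid4().hex
        future = self._pool.submit(self._explain_fn, payload)
        with self._lock:
            self._jobs[job_id] = future
        return job_id

    def get_explain_results(self, job_id: str) -> Dict[str, Any]:
        """Report the job status without blocking; include the result once done."""
        with self._lock:
            future = self._jobs.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "done", "result": future.result()}


def slow_explain(payload: Any) -> Dict[str, Any]:
    """Stand-in for an expensive explainer (assumption: real ones take far longer)."""
    time.sleep(0.2)
    return {"explained": payload}


def wait_for(jobs: ExplainJobs, job_id: str,
             poll_interval: float = 0.05, timeout: float = 5.0) -> Dict[str, Any]:
    """Client-side polling loop: check the id until the job leaves 'pending'."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = jobs.get_explain_results(job_id)
        if state["status"] != "pending":
            return state
        time.sleep(poll_interval)
    return {"status": "timeout"}
```

A client would call `start_explain(...)` once, keep the returned id, and then poll `get_explain_results(id)` (as `wait_for` does) so no single request has to stay open for the full duration of the explanation.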

seldondev avatar Oct 18 '21 08:10 seldondev