torchserve-dashboard
add methods to InferenceAPI
Adds `get_predictions`, `get_explanations` and `get_workflow_predictions`.
I created my own branch from your 'inference' branch and added what I had already implemented, plus docstrings.
I'm not sure whether to return the raw response or its JSON-decoded version.
I don't think we should implement logic that depends on the input type inside InferenceAPI. It should work whether the input_
is a buffer, a byte array, a file path, etc.
Same for the task type: InferenceAPI should behave the same way for images, text, ...
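A minimal sketch of that idea, assuming the goal is just "coerce whatever the caller passes into raw bytes before sending" (the helper name and the decision to treat a `str` as a file path are my assumptions, not part of the PR):

```python
# Hypothetical helper: normalize a buffer, byte array, or file path to bytes,
# so InferenceAPI itself stays agnostic to the caller's input type.
from pathlib import Path
from typing import IO, Union


def to_bytes(input_: Union[bytes, bytearray, str, Path, IO]) -> bytes:
    """Coerce the supported input types to a raw bytes payload."""
    if isinstance(input_, (bytes, bytearray)):
        return bytes(input_)
    if isinstance(input_, (str, Path)):
        # Assumption for this sketch: a string is a file path, not a text payload.
        return Path(input_).read_bytes()
    if hasattr(input_, "read"):  # file-like object (buffer)
        data = input_.read()
        return data.encode() if isinstance(data, str) else data
    raise TypeError(f"Unsupported input type: {type(input_)!r}")


print(to_bytes(b"hello"))  # b'hello'
```

This only solves the client-side half of the problem, though; as noted below, the handler still dictates what encoding it expects.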
> I don't think we should implement logic that depends on the input type inside InferenceAPI. It should work whether the input_ is a buffer, a byte array, a file path, etc.
Yes, it feels straightforward to convert the inputs to a common type input_
and send a request to the TorchServe server.
But the problem is that we don't know what structure/type of input the model server/handler expects. Is it JSON, is it a data stream, is it a string, is it form-encoded, etc.
Input request parsing logic lives here and also here. We can just implement things for the predefined/base handler classes (for now).
For example, a VisionHandler request would be:

```python
input_ = open(file_path, "rb").read()  # read the image as bytes
httpx.post(PRED_API, data=input_)  # or files={'data': input_}?
```
That said, I have built custom handlers that expect different inputs, like:

```python
input_ = {
    "path": file_path,
    "some_other_param": "blahblah",
}
httpx.post(PRED_API, json=input_)  # a whole different request format; would fail otherwise
```
> Same for the task type: InferenceAPI should behave the same way for images, text, ...
Also, it is not possible to determine which UI to show for which model endpoint. I guess we can show all input modalities and let the person choose/combine them smartly, and we would send everything with a default encoding format.
But that wouldn't be as cool as the Swagger UI.
FastAPI uses a Pydantic BaseModel to define a request/response type class, which solves this problem: from a BaseModel it is possible to generate an OpenAPI schema. I think the request_envelope
is a similar concept, just less developed (no schema).
It might be worth looking at replacing the base envelope with Pydantic; I don't know how extensive a change this would require on the TorchServe side.
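To make the comparison concrete, here is roughly what the Pydantic side of that idea looks like, assuming Pydantic v2 (the model name and fields are illustrative, borrowed from the custom-handler example above, not anything TorchServe defines):

```python
# Sketch: a Pydantic model both validates the request payload and emits a
# JSON Schema that an OpenAPI/Swagger UI could be generated from, which is
# exactly what the base request envelope lacks.
from pydantic import BaseModel


class PredictionRequest(BaseModel):
    path: str
    some_other_param: str = "blahblah"


schema = PredictionRequest.model_json_schema()
print(schema["title"])       # PredictionRequest
print(schema["properties"])  # field names, types, and defaults
```

With a schema like this per endpoint, the dashboard could render the right input form per model instead of guessing the modality.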