Iman Tabrizian
@sourabh-burnwal Sorry for the delayed response. We don't have any specific examples for ONNX, Python, and TorchScript models, but you should be able to use repository agents for these backends as well....
Can you elaborate on where you want support for `asyncio`? The Python backend already supports asyncio in BLS: https://github.com/triton-inference-server/python_backend#business-logic-scripting-beta. There is also a ticket on our roadmap to support asyncio for sending...
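For reference, a minimal sketch of async BLS in the Python backend; the model name `preprocess` and the tensor names `INPUT0`/`OUTPUT0` are hypothetical placeholders, not part of any real repository:

```python
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    # Declaring execute as a coroutine enables asyncio inside the model.
    async def execute(self, requests):
        responses = []
        for request in requests:
            input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")

            # BLS request against another model in the repository
            # ("preprocess" is made up for this example).
            infer_request = pb_utils.InferenceRequest(
                model_name="preprocess",
                requested_output_names=["OUTPUT0"],
                inputs=[input_tensor],
            )

            # async_exec returns an awaitable, so other coroutines can make
            # progress while this inference is in flight.
            infer_response = await infer_request.async_exec()

            output = pb_utils.get_output_tensor_by_name(infer_response, "OUTPUT0")
            responses.append(pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```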
Can you elaborate more on the use case that you want asyncio support for? The current asyncio support in the Python backend is only for async BLS requests. The...
I see. I am wondering: if you do not want to block until the `execute` function returns, then when do you need the results from your RPC/HTTP requests, and how...
If your requests are batched, you can use asyncio with the current version of the Python backend to create a coroutine for each request and download the images asynchronously. You...
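A rough sketch of that pattern, assuming each request carries an `IMAGE_URL` string input and that `aiohttp` is installed in the execution environment (both are assumptions for illustration, not part of the Triton API):

```python
import asyncio

import aiohttp
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    async def execute(self, requests):
        async with aiohttp.ClientSession() as session:
            # One coroutine per request in the batch, so the downloads
            # overlap instead of running back to back.
            images = await asyncio.gather(
                *[self._download(session, request) for request in requests]
            )

        responses = []
        for image_bytes in images:
            # .copy() gives a writable array; frombuffer alone is read-only.
            out = pb_utils.Tensor(
                "OUTPUT0", np.frombuffer(image_bytes, dtype=np.uint8).copy()
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses

    async def _download(self, session, request):
        url_tensor = pb_utils.get_input_tensor_by_name(request, "IMAGE_URL")
        url = url_tensor.as_numpy()[0].decode()
        async with session.get(url) as resp:
            return await resp.read()
```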
@manhtd98 We have implemented decoupled API support, which partially addresses the feature requested here: https://github.com/triton-inference-server/python_backend#decoupled-mode. Having said that, we have a feature on the roadmap to support full async...
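A minimal sketch of the decoupled mode mentioned above; the output name and the number of responses per request are illustrative, and the model's `config.pbtxt` also needs `model_transaction_policy { decoupled: true }`:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        for request in requests:
            sender = request.get_response_sender()

            # A decoupled model may send zero or more responses per request...
            for i in range(3):
                out = pb_utils.Tensor("OUTPUT0", np.array([i], dtype=np.int32))
                sender.send(pb_utils.InferenceResponse(output_tensors=[out]))

            # ...and must close the stream with the FINAL flag.
            sender.send(flags=pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL)

        # In decoupled mode, execute returns None; all responses flow
        # through the response senders instead.
        return None
```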
I think detecting dependencies automatically in BLS can be complicated. How about adding a feature to Triton, named lazy model loading, such that Triton will try to automatically load the...
The `platform` field is the legacy field and the `backend` field is the new convention; there is no difference in functionality at all. Specifying `platform: tensorrt_plan` or `backend:...
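As an illustration, for a TensorRT model the two equivalent spellings in `config.pbtxt` would be (use one or the other, not both; the `tensorrt` backend name is the standard one for TensorRT plans):

```
# Legacy field:
platform: "tensorrt_plan"

# New convention:
backend: "tensorrt"
```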
@AJHoeh Are you compiling the stub against the same version as the Python backend? For example, if you are using the 22.05 version of Triton, you need to compile the 22.05 version of...
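For reference, a sketch of building a matching stub for a 22.05 container, following the build steps in the python_backend README (adjust the branch and repo tags to your Triton version):

```bash
# Check out the branch that matches the Triton container version.
git clone https://github.com/triton-inference-server/python_backend -b r22.05
cd python_backend
mkdir build && cd build

# The repo tags must match the branch so the stub ABI lines up.
cmake -DTRITON_ENABLE_GPU=ON \
      -DTRITON_BACKEND_REPO_TAG=r22.05 \
      -DTRITON_COMMON_REPO_TAG=r22.05 \
      -DTRITON_CORE_REPO_TAG=r22.05 \
      -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
make triton-python-backend-stub
```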
That's correct. If the Python version matches, you don't need to compile the stub. @dyastremsky Could you please file a ticket so that we can investigate this further? Thanks.