[question] Development workflow for custom C++ backends
When developing custom backends with Triton, the workflow can be fairly slow, mainly because I have to restart the server each time I want to test something. A restart can take up to 30 seconds, especially if I have a lot of debug settings enabled or I'm running under a debugger. Are there any recommendations for speeding up that workflow, perhaps by not shutting the server down completely each time?
Thanks!
I don't think there is a way to get around this. The backend shared library needs to be reloaded to pick up the new changes. Currently, once a backend library is loaded in Triton, it is not unloaded until Triton is shut down:
https://github.com/triton-inference-server/backend#backend-and-model
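To illustrate why a rebuilt library is not picked up by a running process, here is a minimal sketch of the general POSIX loader behavior. It assumes the backend library is loaded through `dlopen`, which this thread does not state, and `SameHandleOnReopen` is a hypothetical helper, not part of Triton. While a library is still loaded, a second `dlopen` of it returns the existing handle rather than mapping the file again:

```cpp
#include <dlfcn.h>  // POSIX dynamic loader: dlopen, dlclose

// Hypothetical helper (not a Triton API). Opens the same library twice and
// reports whether both calls returned the same handle, i.e. whether the
// second call reused the already-loaded image instead of loading it again.
// Passing nullptr refers to the running program itself.
bool SameHandleOnReopen(const char* path) {
  void* first = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  void* second = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  const bool same = (first != nullptr) && (first == second);
  // Each successful dlopen must be balanced by one dlclose.
  if (second != nullptr) dlclose(second);
  if (first != nullptr) dlclose(first);
  return same;
}
```

On older glibc versions this may need to be linked with `-ldl`. The library is only unloaded once every open handle has been closed, which is consistent with needing a full restart when the server keeps its handle until shutdown.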