Iman Tabrizian
I don't know whether we expose this parameter in the model configuration. @pranavsharma Do you know whether it is possible to use this option when serving the models using onnxruntime...
Hi @zhaozhiming37, sorry for the delayed response. The ONNXRuntime backend is managed by the Microsoft team, so they should be able to provide more info.
@debermudez is this something that you could help with reviewing?
Thanks for the contribution! We've merged another PR that should fix this.
@rwgk Python 3.12 has implemented [PEP 684](https://peps.python.org/pep-0684/) which allows each sub-interpreter to have its own GIL. Does this move the needle for supporting sub-interpreter interface in pybind11?
> What exactly do you have in mind?

I meant mostly providing a path for embedding sub-interpreters in pybind11 without having to use the underlying Python C-API. For example, it...
@debermudez @matthewkotila @tgerdesnv Is this something that we could potentially add for the C++ clients?
> @Tabrizian Wouldn't this change need to go into the server repo?

No, I think what @philipp-schmidt is proposing is to make changes to the client repo's CMake files so that...
@tanmayv25 any ideas?
Dynamic batching is a feature of Triton Inference Server that allows inference requests to be combined by the server, so that a batch is created dynamically. This typically results in...
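As a concrete illustration, dynamic batching is enabled per model in its `config.pbtxt`. The fragment below is only a sketch: the model name and platform are hypothetical, and the batch sizes and queue delay are illustrative values, not recommendations.

```proto
# config.pbtxt (illustrative values)
name: "my_model"                  # hypothetical model name
platform: "onnxruntime_onnx"      # example backend
max_batch_size: 8

dynamic_batching {
  # Batch sizes the scheduler should prefer to build.
  preferred_batch_size: [ 4, 8 ]
  # How long a request may wait in the queue for batching.
  max_queue_delay_microseconds: 100
}
```

An empty `dynamic_batching { }` block also works; Triton then falls back to its default batching behavior for the model.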