BentoML
Support for TensorRT
Given the high performance of transformer models with TensorRT, would it be feasible to adapt BentoML to work with it?
I assume this is for docker container?
Yes for sure.
You can build it on top of the Docker GPU image we provide and work from there 😄
Hi @aarnphm, sorry, I don't really know what TensorRT involves. Doesn't it correspond to a framework/runtime like ONNX?
+1 on tensorrt support. Benchmarks against Triton would also be great!
We will consider this integration after BentoML 1.0.
Hello! Do we have any updates on the TensorRT integration? First-class TensorRT support for model serving seems to be a very useful feature.
This is supported in BentoML 1.2 now! An example project is coming soon!
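While the official example project lands, here is a minimal sketch of what wrapping a TensorRT engine in a BentoML 1.2 service could look like. The class name, the `model.engine` path, and the inference details are assumptions for illustration; it also requires a GPU and the `tensorrt` package, so it is a sketch rather than a drop-in implementation.

```python
import bentoml
import numpy as np


@bentoml.service(resources={"gpu": 1})
class TensorRTService:
    def __init__(self) -> None:
        # Hypothetical: deserialize a prebuilt TensorRT engine from disk.
        import tensorrt as trt

        logger = trt.Logger(trt.Logger.WARNING)
        runtime = trt.Runtime(logger)
        with open("model.engine", "rb") as f:  # path is an assumption
            self.engine = runtime.deserialize_cuda_engine(f.read())
        self.context = self.engine.create_execution_context()

    @bentoml.api
    def predict(self, input: np.ndarray) -> np.ndarray:
        # Buffer allocation and execution are engine-specific
        # (binding shapes, CUDA memory copies) and omitted here.
        ...
```

Serving it would then be the usual `bentoml serve` workflow; the TensorRT-specific part is confined to the service constructor.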