deepvats
Speed up model inference with Nvidia TensorRT
I just saw this a couple of days ago: https://developer.nvidia.com/blog/nvidia-announces-tensorrt-8-2-and-integrations-with-pytorch-and-tensorflow/
It seems to be an "easy" way of speeding up inference time in PyTorch/TF models with a single line of code. This could be critical for the tool, since right now the bottleneck lies in the computation of the embeddings.
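As a rough idea of what the integration looks like, here is a sketch of the Torch-TensorRT workflow from the blog post. It assumes an NVIDIA GPU with TensorRT and the `torch-tensorrt` package installed; the small `Encoder` module and the input shape are placeholders for whatever model actually computes the embeddings in deepvats.

```python
# Sketch of the "one line" Torch-TensorRT compilation described in the post.
# Assumes: NVIDIA GPU, TensorRT, and the torch-tensorrt package installed.
# The Encoder module and input shape below are placeholders, not the real
# deepvats embedding model.
import torch
import torch_tensorrt


class Encoder(torch.nn.Module):
    """Stand-in for the embedding encoder."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, 32),
        )

    def forward(self, x):
        return self.net(x)


model = Encoder().eval().cuda()

# The advertised one-liner: compile the model into a TensorRT-optimized
# TorchScript module.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 128))],  # placeholder input shape
    enabled_precisions={torch.half},          # allow FP16 kernels for speed
)

# Inference then works exactly like with the original module:
batch = torch.randn(1, 128).cuda()
embeddings = trt_model(batch)
```

If this works as advertised, the only change on our side would be wrapping the embedding model with `torch_tensorrt.compile` before running it over the dataset.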