Implement support for CUDA Streams
TensorRT engines support two execution modes: synchronous and asynchronous. Synchronous execution is already supported and doesn't require any CUDA streams to run.
To support asynchronous execution, we need to be able to create CUDA streams and pass them to the execution context along with the data to be executed.
We may be able to get this support via https://github.com/bheisler/RustaCUDA since it's already wrapping the CUDA API.
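For reference, stream creation and synchronization through RustaCUDA would look roughly like the sketch below. This is untested and requires a CUDA-capable GPU; the idea of handing the stream's raw handle to TensorRT's `IExecutionContext::enqueueV2` is how the C++ API does async execution, but the exact Rust-side binding for that is still hypothetical.

```rust
use rustacuda::prelude::*;
use rustacuda::stream::{Stream, StreamFlags};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the CUDA driver API and set up a context on device 0.
    rustacuda::init(CudaFlags::empty())?;
    let device = Device::get_device(0)?;
    let _context =
        Context::create_and_push(ContextFlags::SCHED_AUTO, device)?;

    // Create a non-blocking stream. Its raw handle is what would be
    // passed to the TRT execution context for async enqueue
    // (that binding doesn't exist yet -- this is the part to build).
    let stream = Stream::new(StreamFlags::NON_BLOCKING, None)?;

    // ... enqueue async inference work on `stream` here ...

    // Block until all work queued on the stream has completed.
    stream.synchronize()?;
    Ok(())
}
```

If this works out, the stream type could probably be re-exported so users don't need a direct rustacuda dependency.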
Picking up work on this.
The overall project is looking awesome! What is the state of this and is it ready for simple AIs?
@enfipy Thanks! It is ready for simple AIs as long as they don't require custom plugins to run in TRT. Work on adding plugin support is ongoing, but it might take a little while since managing the C++ -> Rust -> C++ relationship is kind of tricky.
There are some pretty big QoL changes in the upcoming 0.4 release that I'm still putting the finishing touches on. If you're comfortable with it, I would recommend using the crate directly from the develop branch instead of the current published version on crates.io.
Let me know if you run into any issues or sharp edges while using the library!
Brilliant! Thanks a lot for your hard work, I will definitely try this out!