Support CUDA streams
PyTorch includes CUDA streams, which allow independent sequences of GPU operations to run concurrently.
However, it appears that TorchSharp does not support CUDA streams. I searched the codebase and can't find anything like PyTorch's torch.cuda.Stream class, or C# wrappers for methods such as wait_stream(), default_stream() and record_stream().
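For reference, here is a minimal Python sketch of the stream pattern that currently has no TorchSharp equivalent, using only the documented PyTorch calls named above (the tensor and its size are placeholders; the GPU work is guarded so the snippet is a no-op on machines without PyTorch or a CUDA device):

```python
import importlib.util

def overlap_on_side_stream():
    """Queue work on a side CUDA stream, then make the default stream wait on it.

    Mirrors the torch.cuda.Stream / wait_stream / record_stream API that
    TorchSharp would need to wrap.
    """
    import torch

    side = torch.cuda.Stream()            # new stream on the current device
    with torch.cuda.stream(side):         # ops below are enqueued on `side`
        x = torch.randn(1024, device="cuda")
        y = x * 2
    default = torch.cuda.default_stream()
    default.wait_stream(side)             # default stream waits for `side`
    y.record_stream(default)              # mark y's memory as in use on default
    return y

# Only exercise the API when PyTorch is installed and a CUDA device exists.
if importlib.util.find_spec("torch") is not None:
    import torch
    if torch.cuda.is_available():
        overlap_on_side_stream()
```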
hey @medovina, thanks for the heads up.
It seems we're currently missing this implementation, so I'm adding the missing-feature tag here.
I've checked PyTorch's wrapper over libtorch; it mostly depends directly on CUDA API calls:
stream.py
Stream.cpp
We'll consider this for a future version.
Great, thanks for considering this. Streams can be quite important for good performance when running inference from multiple threads, so I'd be very happy to see them supported in TorchSharp.
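To illustrate the multi-threaded inference use case, here is a hedged Python sketch using PyTorch's existing stream API; the model, batch shapes, and threading scheme are invented for the example, and the GPU portion only runs when PyTorch and a CUDA device are present:

```python
import importlib.util
import threading

def infer_on_own_stream(model, batch):
    """Run one inference request on its own CUDA stream, so kernels launched
    from concurrent threads can overlap instead of serializing on the
    default stream."""
    import torch

    stream = torch.cuda.Stream()
    with torch.cuda.stream(stream), torch.no_grad():
        out = model(batch.to("cuda", non_blocking=True))
    stream.synchronize()  # wait only for this request's kernels to finish
    return out

# Guarded demo: one thread (and one stream) per inference request.
if importlib.util.find_spec("torch") is not None:
    import torch
    if torch.cuda.is_available():
        model = torch.nn.Linear(16, 4).cuda().eval()
        batches = [torch.randn(8, 16) for _ in range(4)]
        threads = [
            threading.Thread(target=infer_on_own_stream, args=(model, b))
            for b in batches
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```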