WIP Cupy integration
This changes the storage of decoded frames from torch tensors to cupy arrays.
The reason is that cupy's GPU memory footprint is much lower than pytorch's. I plan to use this in a multiprocess environment, so I need the per-process footprint on the GPU to be low. This should let me run ~50 processes on a GPU with 16GB of VRAM, which is much better than about 15 with pytorch.
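As a rough sanity check, here is the per-process VRAM budget those process counts imply. It is only arithmetic on the figures above, assuming the full 16GB is usable and split evenly between processes; it is not a measurement, and `budget_mib` is a made-up helper name.

```python
# Back-of-the-envelope budget, assuming all 16GB of VRAM is usable
# and shared evenly between processes (an assumption, not a measurement).
VRAM_MIB = 16 * 1024


def budget_mib(n_processes: int) -> float:
    """Average VRAM available per process, in MiB."""
    return VRAM_MIB / n_processes


cupy_budget = budget_mib(50)   # ~50 processes with cupy
torch_budget = budget_mib(15)  # ~15 processes with pytorch

print(f"cupy:    ~{cupy_budget:.0f} MiB per process")
print(f"pytorch: ~{torch_budget:.0f} MiB per process")
```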
I wasn't really planning on merging this, or on doing the work to make the storage type selectable with a flag or whatever. I only planned on making it run for my specific use case (hence the many deleted functionalities). But I think it's still good for people coming here to know that cupy integration is possible.
I also tried jax, but the immutable nature of jax arrays prevented me from doing anything.
Weirdly, I noticed that I need to import cupy before torch, or else cupy fails to import!
Note that torch / cupy interoperability is alive and well.
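For anyone who wants to check the interoperability themselves, here is a minimal sketch of a zero-copy round trip through DLPack. It is not code from this PR: `roundtrip_shares_memory` is a hypothetical helper, the 4x4 `uint8` array just stands in for a decoded frame, and it assumes reasonably recent cupy and torch versions that both provide `from_dlpack`. It returns `None` when cupy, torch, or a CUDA device is missing.

```python
def roundtrip_shares_memory():
    """Return True if a cupy array and its torch view share GPU memory.

    Returns None when cupy, torch, or a CUDA device is unavailable.
    """
    try:
        import cupy as cp  # imported before torch, as noted above
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    frame = cp.zeros((4, 4), dtype=cp.uint8)  # stand-in for a decoded frame
    tensor = torch.from_dlpack(frame)         # torch view, no copy
    tensor += 1                               # in-place write through torch
    back = cp.from_dlpack(tensor)             # back to cupy, still no copy

    same_buffer = back.data.ptr == frame.data.ptr
    write_visible = bool((frame == 1).all())  # cupy sees torch's write
    return same_buffer and write_visible


if __name__ == "__main__":
    print(roundtrip_shares_memory())
```

Since both sides point at the same buffer, frames stored as cupy arrays can still be handed to torch code when needed without an extra copy on the GPU.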
That's pretty cool, thanks for making your code visible here! It will definitely save others time if they have the same idea of using cupy+tvl.