VideoReader with cuda backend fails with num_workers > 1
🐛 Describe the bug
I set the backend to CUDA. I have compiled torchvision master from source against ffmpeg 4.2.9 with nvenc support.
```python
import torchvision
torchvision.set_video_backend("cuda")
```
In my dataloader I have

```python
vid_reader = torchvision.io.VideoReader(video_path, "video")
```
which immediately raises this error:

```
vid_reader = torchvision.io.VideoReader(video_path, "video")
  File "/home/ganesh/vision/torchvision/io/video_reader.py", line 161, in __init__
    self._c = torch.classes.torchvision.GPUDecoder(src, device)
RuntimeError: CUDA error: initialization error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
It works fine if I don't set num_workers in the DataLoader, i.e. num_workers=0.
```python
dataloader = DataLoader(
    dataset,
    collate_fn=collate_fn,
    batch_size=2,
    # num_workers=2,
    # prefetch_factor=2,
    shuffle=True,
)
```
The dataloader is included above for reference as well. With num_workers and prefetch_factor commented out it works fine. Can Dataset classes not use the GPU inside a worker?
Versions
torchvision master at commit 3fb88b3ef1ee8107df74ca776cb57931fe3e9e1e
PyTorch nightly as of 27 Oct
ffmpeg 4.2.9
CUDA 11.8
Hi @ganessh22,

Using CUDA tensors within a multi-processing context is not really supported, unfortunately. See e.g. https://pytorch.org/docs/stable/data.html#multi-process-data-loading and the other resources linked from there:
> It is generally not recommended to return CUDA tensors in multi-process loading because of many subtleties in using CUDA and sharing CUDA tensors in multiprocessing
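The underlying issue is how worker processes are created: with the default `fork` start method, a child inherits the parent's memory, including any already-initialized CUDA context, and such an inherited context is unusable in the child. A torch-free sketch of the difference between the start methods, using a plain dict as an illustrative stand-in for library state like a CUDA context:

```python
import multiprocessing as mp

# Illustrative stand-in for library state (e.g. a CUDA context) that
# gets initialized in the parent process before workers start.
STATE = {"initialized_in": None}

def _report(q):
    # A 'fork' child inherits the parent's memory and sees the
    # initialized state; a 'spawn' child starts a fresh interpreter,
    # re-imports this module, and sees the untouched default.
    q.put(STATE["initialized_in"])

def probe(method):
    """Start a child with the given start method; return what it sees."""
    STATE["initialized_in"] = "parent"  # simulate init in the parent
    ctx = mp.get_context(method)
    q = ctx.Queue()
    p = ctx.Process(target=_report, args=(q,))
    p.start()
    seen = q.get()
    p.join()
    return seen

if __name__ == "__main__":
    print("fork  ->", probe("fork"))
    print("spawn ->", probe("spawn"))
```

This is why `spawn` sidesteps the "CUDA reinitialized" failure: each worker builds its own state from scratch instead of inheriting a broken copy.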
Thank you, I will use it cautiously. I was able to get the above code working, though, by adding

```python
import multiprocessing as mp
mp.set_start_method('spawn', force=True)
```

at the beginning of the program, since fork causes a "CUDA reinitialized" error. For a fast video dataset I sadly see no other option, unless I have much more CPU power and RAM.
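One note on the workaround: `set_start_method(force=True)` changes the start method for every multiprocessing user in the program. A more scoped alternative is `multiprocessing.get_context`, which binds one start method to a single context object; PyTorch's DataLoader also accepts a `multiprocessing_context` argument for the same purpose. A minimal stdlib sketch of a scoped spawn context:

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # get_context returns a context bound to one start method,
    # leaving the process-wide default untouched (unlike
    # set_start_method(force=True)).
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

With DataLoader the equivalent would be passing `multiprocessing_context="spawn"` instead of forcing the global default.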