
Parallel loading with torch.utils.data.DataLoader

Open YingtianDt opened this issue 11 months ago • 4 comments

When loading videos through torch.utils.data.DataLoader, the worker processes seem to deadlock. May I ask if this is expected?

YingtianDt avatar Mar 21 '24 14:03 YingtianDt

Are you using cpu(0) with num_workers > 1?

xiao7199 avatar Mar 22 '24 20:03 xiao7199

I was using the default setting, so yes, with cpu(0). It turns out the problem is solved by initializing the video reader within each worker process, instead of initializing the readers first and then sending them to the worker processes.

YingtianDt avatar Mar 24 '24 12:03 YingtianDt

Could you please share your code for this part? I've been facing a similar issue recently. Thanks!

xiao7199 avatar Mar 24 '24 15:03 xiao7199

My code is a part of a larger codebase, but I basically did the following:

from decord import VideoReader
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    ...  # __init__ stores only self.paths (picklable state, no readers)

    def __getitem__(self, index):
        path = self.paths[index]
        # Create the VideoReader inside the worker process, not in the parent.
        frames = VideoReader(path)
        ...

dataset = MyDataset()
dataloader = DataLoader(dataset, ..., num_workers=16)

YingtianDt avatar Mar 26 '24 20:03 YingtianDt
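To turn the per-worker reader into a fixed-size clip, `__getitem__` typically samples evenly spaced frame indices and passes them to `VideoReader.get_batch`. A minimal sketch of that index computation (the helper name and the clip length of 16 are illustrative, not part of the decord API):

```python
def sample_frame_indices(num_frames_total, num_frames_wanted):
    """Evenly spaced frame indices to feed VideoReader.get_batch (illustrative helper)."""
    if num_frames_total <= 0:
        return []
    step = num_frames_total / num_frames_wanted
    # Center each sample within its segment so indices don't bunch at the start.
    return [min(int(step * i + step / 2), num_frames_total - 1)
            for i in range(num_frames_wanted)]

# Inside __getitem__ (sketch; decord's get_batch returns an NDArray,
# converted to NumPy via .asnumpy()):
#     vr = VideoReader(path)
#     idx = sample_frame_indices(len(vr), 16)
#     clip = vr.get_batch(idx).asnumpy()
```

When the video is shorter than the requested clip, this scheme repeats nearby frames rather than failing, which keeps batch shapes uniform across the dataset.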