Using lists as path/clips/frames containers leads to memory leaks for very large datasets and num_workers > 1
https://github.com/pytorch/vision/blob/6c2e0ae88b056ba2ac897d4a7c1b7153cefcb444/torchvision/datasets/video_utils.py#L73C1-L73C1
Hi,
The VideoClips class uses the Python list data structure to store video_paths, clips, frame_rate, etc. However, in the multi-worker DataLoader paradigm and with very large datasets (Kinetics400/700), this leads to memory leaks. This is described in the following issue:
https://github.com/pytorch/pytorch/issues/13246
Please replace these lists with torch tensors, pandas DataFrames, or NumPy arrays.
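For illustration, a minimal sketch of the pattern that triggers the growth: a dataset holding a large Python list of objects, iterated with several workers. The dataset class, paths, and sizes below are hypothetical, not the actual VideoClips code.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ListBackedDataset(Dataset):
    """Hypothetical dataset mimicking the problematic pattern:
    metadata kept in a plain Python list of Python objects."""

    def __init__(self, num_items=1_000_000):
        # Every element is a separate Python object; reading it from a
        # worker updates its refcount, which dirties the memory page and
        # defeats copy-on-write sharing after fork().
        self.paths = [f"/data/video_{i}.mp4" for i in range(num_items)]

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Merely touching self.paths[idx] in a forked worker is enough
        # to make that page private to the worker.
        return self.paths[idx]

if __name__ == "__main__":
    # With fork-based workers (the Linux default), resident memory grows
    # roughly per-worker as the lists' pages are copied on access.
    loader = DataLoader(ListBackedDataset(), batch_size=64, num_workers=4)
    for _ in loader:
        pass
```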
Thank you for the report @gurdeepmaurya. As you noted, this is largely due to an unsolved issue on the PyTorch DataLoader side, and there isn't much that can be done in torchvision to resolve it.
Unfortunately, storing all of that data in tensors / arrays is not possible, because many of those lists have variable lengths. For example, the .clips attribute is a list of length num_videos containing tensors of shape (num_clips_for_that_video, num_frames); since num_clips_for_that_video varies between videos, that list cannot be stacked into a single 3D tensor.
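That said, for anyone who needs to mitigate this in their own code, one pattern discussed in pytorch/pytorch#13246 is to flatten a ragged list into one contiguous tensor plus an offsets tensor, so the workers share only a couple of refcounted objects. A minimal sketch under that assumption (the data below is illustrative; this is not what VideoClips currently does):

```python
import torch

# Hypothetical ragged data: one (num_clips_for_that_video, num_frames)
# tensor per video, with num_clips varying between videos (num_frames
# is constant, as in the VideoClips case described above).
clips = [torch.arange(n * 16).reshape(n, 16) for n in (3, 1, 5)]

# Flatten into one contiguous tensor plus per-video offsets. Both are
# single tensors, so forked workers can read them without dirtying
# per-element Python refcounts.
flat = torch.cat(clips, dim=0)                      # (total_clips, 16)
lengths = torch.tensor([c.shape[0] for c in clips])
offsets = torch.cat([torch.zeros(1, dtype=torch.long),
                     lengths.cumsum(0)])            # (num_videos + 1,)

def clips_for_video(video_idx: int) -> torch.Tensor:
    """Recover the original per-video tensor as a view, without copying."""
    return flat[offsets[video_idx]:offsets[video_idx + 1]]

assert torch.equal(clips_for_video(2), clips[2])
```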