
Using lists as path/clips/frames containers leads to memory leaks for very large datasets and num_workers > 1

Open · gurdeepmaurya opened this issue 2 years ago · 1 comment

https://github.com/pytorch/vision/blob/6c2e0ae88b056ba2ac897d4a7c1b7153cefcb444/torchvision/datasets/video_utils.py#L73C1-L73C1

Hi,

The VideoClips class uses Python lists to store video_paths, clips, frame rates, etc. However, with multiple DataLoader workers (num_workers > 1) and very large datasets such as Kinetics-400/700, this leads to memory leaks. The underlying problem is described in this issue:

https://github.com/pytorch/pytorch/issues/13246

Please replace these lists with torch tensors, pandas DataFrames, or NumPy arrays.
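
For reference, the workaround usually suggested in the linked issue is to serialize this kind of metadata into NumPy arrays up front, since iterating a Python list from worker processes touches per-object reference counts and turns copy-on-write pages into real copies. A rough sketch of that idea (the packing scheme and `get_path` helper below are only illustrative, not torchvision code):

```python
import numpy as np

# Hypothetical example: pack a list of video paths into NumPy arrays so that
# reads from DataLoader worker processes do not update per-object Python
# refcounts (which is what forces copy-on-write pages to be copied).
video_paths = ["/data/videos/a.mp4", "/data/videos/b.mp4", "/data/videos/c.mp4"]

# Encode all paths into one contiguous byte buffer plus an offsets array.
encoded = [p.encode("utf-8") for p in video_paths]
offsets = np.cumsum([0] + [len(e) for e in encoded])       # shape (N + 1,)
buffer = np.frombuffer(b"".join(encoded), dtype=np.uint8)  # shape (total_bytes,)

def get_path(i: int) -> str:
    """Decode the i-th path without keeping per-item Python objects around."""
    return bytes(buffer[offsets[i]:offsets[i + 1]]).decode("utf-8")

print(get_path(1))  # -> /data/videos/b.mp4
```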

gurdeepmaurya · Dec 19 '23

Thank you for the report @gurdeepmaurya. As you noted, this is largely due to an unsolved issue on the PyTorch DataLoader's side, and there isn't much torchvision can do to resolve it. Unfortunately, storing all of that data in tensors / arrays is not possible, because many of those lists have variable lengths. For example, the .clips attribute is a list of length num_videos whose entries are tensors of shape (num_clips_for_that_video, num_frames); since the number of clips varies per video, that list cannot be converted into a single 3D tensor.
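
To make the shape constraint concrete, here is a small sketch (with made-up example data, not actual torchvision internals) of why the per-video clip tensors cannot be stacked into one 3D tensor, and what a flat-tensor-plus-offsets alternative would look like:

```python
import torch

# Illustration of the structure described above: one tensor per video, each of
# shape (num_clips_for_that_video, num_frames). The first dimension varies, so
# torch.stack cannot turn the list into a single 3D tensor.
clips = [
    torch.zeros(2, 16),  # video 0: 2 clips of 16 frames
    torch.zeros(5, 16),  # video 1: 5 clips of 16 frames
    torch.zeros(1, 16),  # video 2: 1 clip of 16 frames
]

try:
    torch.stack(clips)   # fails: sizes differ along dim 0
except RuntimeError as e:
    print(e)

# A ragged alternative (concatenating along dim 0 and keeping per-video
# offsets) would avoid the Python list of tensors, but it is only a sketch
# here and would require changes throughout VideoClips:
flat = torch.cat(clips, dim=0)            # shape (8, 16)
offsets = torch.tensor([0, 2, 7, 8])      # start index of each video's clips
video_1_clips = flat[offsets[1]:offsets[2]]
assert video_1_clips.shape == (5, 16)
```

Adopting such a ragged layout would be more than a simple container swap, since every consumer of these attributes would need to index through the offsets.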

NicolasHug · Jan 2 '24