DALI
Video reader fails when random shuffle is enabled
Version
1.30
Describe the bug.
Hi! I recently stumbled upon a bug: if I create a pipeline with random_shuffle=True, it fails with the error:
Detected variable frame rate video. The decoder returned frame that is past the expected one
This happens even though the video I'm using was transcoded to a fixed frame rate (5 fps) and works perfectly fine when random_shuffle=False is passed to the pipeline. I'm attaching code for reproduction and the file that fails for me. I'd appreciate any help with this issue. :)
File used to reproduce the issue (basically black screen, duration=4m40s, fps=5): https://github.com/NVIDIA/DALI/assets/5787705/57817017-b574-4bbf-b7aa-d9c61d856b66
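One way to double-check the "fixed frame rate" claim independently of DALI is to compare the spacing between consecutive frame timestamps. The sketch below is only an illustration: it assumes the presentation timestamps have already been extracted into a list of seconds (for example with a tool such as ffprobe), and `is_constant_frame_rate` and its `tolerance` parameter are hypothetical names, not part of DALI.

```python
def is_constant_frame_rate(timestamps, fps, tolerance=1e-3):
    """Return True if every pair of consecutive timestamps is 1/fps seconds apart.

    `timestamps` is a list of frame presentation times in seconds (assumed to be
    extracted beforehand); `tolerance` absorbs rounding in the container timebase.
    """
    expected_gap = 1.0 / fps
    return all(
        abs((later - earlier) - expected_gap) <= tolerance
        for earlier, later in zip(timestamps, timestamps[1:])
    )


# Synthetic timestamps matching the attached file: 4m40s at 5 fps = 1400 frames.
fixed_rate = [i / 5 for i in range(1400)]

# The same stream with one frame dropped, which creates a double-length gap.
with_gap = fixed_rate[:750] + fixed_rate[751:]

print(is_constant_frame_rate(fixed_rate, fps=5))
print(is_constant_frame_rate(with_gap, fps=5))
```

If a check like this passes for the real file, the variable frame rate message is more likely to come from how frames are sought and decoded than from the file itself.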
Minimum reproducible example
import nvidia.dali.fn as fn
from nvidia.dali import pipeline_def
from nvidia.dali.plugin.pytorch import DALIGenericIterator, LastBatchPolicy


class CustomDataloader:
    def __init__(
        self,
        video_paths,
        batch_size,
        num_threads,
        sequence_length,
        stride=1,
        pad_sequences=True,
        transpose=True,
        resolution=None,
        random_shuffle=False,
        drop_last=False,
    ):
        self.sequence_length = sequence_length
        self.stride = stride
        self.pad_sequences = pad_sequences
        self.transpose = transpose
        self.resolution = resolution
        self.random_shuffle = random_shuffle
        self.video_pipeline = self._video_pipeline(
            video_paths=video_paths,
            batch_size=batch_size,
            num_threads=num_threads,
            device_id=0,
        )
        self.dali_iterator = DALIGenericIterator(
            [self.video_pipeline],
            ["frames", "video_idx", "frame_idx"],
            reader_name="clips_reader",
            auto_reset=False,
            last_batch_policy=LastBatchPolicy.DROP if drop_last else LastBatchPolicy.PARTIAL,
        )

    @pipeline_def
    def _video_pipeline(self, video_paths):
        frames, video_idx, frame_idx = fn.readers.video(
            name="clips_reader",
            device="gpu",
            filenames=video_paths,
            enable_timestamps=True,
            labels=[],
            sequence_length=self.sequence_length,
            pad_sequences=self.pad_sequences,
            shard_id=0,
            num_shards=1,
            random_shuffle=self.random_shuffle,
            initial_fill=512,
            stride=self.stride,
            prefetch_queue_depth=4,
            pad_last_batch=True,
            read_ahead=True,
            seed=42,
            file_list_include_preceding_frame=False,
        )
        if self.transpose:
            frames = fn.transpose(frames, perm=(0, 3, 1, 2))
        return frames, video_idx, frame_idx

    def __iter__(self):
        for batch, *_ in self.dali_iterator:
            yield batch
        self.dali_iterator.reset()


def main():
    print("> random_shuffle=False")
    dataloader = CustomDataloader(
        ["void-4m40s.mp4"],
        batch_size=1,
        num_threads=1,
        sequence_length=5 * 15,
        random_shuffle=False,
        drop_last=True,
    )
    for _ in dataloader:
        pass

    print("> random_shuffle=True")
    dataloader = CustomDataloader(
        ["void-4m40s.mp4"],
        batch_size=1,
        num_threads=1,
        sequence_length=5 * 15,
        random_shuffle=True,
        drop_last=True,
    )
    for _ in dataloader:
        pass


if __name__ == "__main__":
    main()
Relevant log output
> random_shuffle=False
> random_shuffle=True
140362360223296 Exception in thread: [/opt/dali/dali/operators/reader/loader/video_loader.cc:683] Detected variable frame rate video. The decoder returned frame that is past the expected one
Stacktrace (6 entries):
[frame 0]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x69a2ee) [0x7fa9d20252ee]
[frame 1]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x54008c) [0x7fa9d1ecb08c]
[frame 2]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x284f6fc) [0x7fa9d41da6fc]
[frame 3]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x4ad4a50) [0x7fa9d645fa50]
[frame 4]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7faa03ae9b43]
[frame 5]: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7faa03b7ba00]
Traceback (most recent call last):
File "/../bug.py", line 117, in <module>
main()
File "/../bug.py", line 112, in main
for _ in dataloader:
File "/../bug.py", line 82, in __iter__
for batch, *_ in self.dali_iterator:
File "/../python3.10/site-packages/nvidia/dali/plugin/pytorch.py", line 211, in __next__
outputs = self._get_outputs()
File "/../python3.10/site-packages/nvidia/dali/plugin/base_iterator.py", line 298, in _get_outputs
outputs.append(p.share_outputs())
File "/../python3.10/site-packages/nvidia/dali/pipeline.py", line 1003, in share_outputs
return self._pipe.ShareOutputs()
RuntimeError: Critical error in pipeline:
Error when executing GPU operator readers__Video encountered:
Error in worker thread: [/opt/dali/dali/operators/reader/loader/video_loader.cc:683] Detected variable frame rate video. The decoder returned frame that is past the expected one
Stacktrace (6 entries):
[frame 0]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x69a2ee) [0x7fa9d20252ee]
[frame 1]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x54008c) [0x7fa9d1ecb08c]
[frame 2]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x284f6fc) [0x7fa9d41da6fc]
[frame 3]: /../python3.10/site-packages/nvidia/dali/libdali_operators.so(+0x4ad4a50) [0x7fa9d645fa50]
[frame 4]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7faa03ae9b43]
[frame 5]: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7faa03b7ba00]
Current pipeline object is no longer valid.
Other/Misc.
No response
Check for duplicates
- [X] I have searched the open bugs/issues and have found no duplicates for this bug report
Hi @Grzego,
Thank you for reaching out. The code you provided reproduces the problem on my side too. Let me investigate what is going on under the hood.
I checked the code, and it seems that the video decoder is fed the same packets from the video container but returns frames with different timestamps: in the faulty case we seek to frame 750, but the decoder returns 749 and then 751, skipping 750; when shuffling is off, it returns 750 and 751 as expected. I have reported this to the corresponding video team, but it may take a while for them to get back to me.
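To make the failure mode concrete, here is a simplified model of the kind of check that produces this error. It is a sketch only and not the actual logic in video_loader.cc; the function name `frames_skipped_before` and the plain lists of frame numbers are assumptions made for illustration.

```python
def frames_skipped_before(requested, decoded_frames):
    """Walk the decoder output until the requested frame shows up.

    Returns how many earlier frames had to be discarded. Raises RuntimeError
    if a frame numbered past the requested one arrives first, which is the
    situation the error message in this issue describes.
    """
    for skipped, frame in enumerate(decoded_frames):
        if frame == requested:
            return skipped
        if frame > requested:
            raise RuntimeError(
                f"decoder returned frame {frame}, which is past the expected frame {requested}"
            )
    raise RuntimeError(f"decoder output ended before frame {requested}")


# Shuffling off: the decoder returns 750 and then 751, so the seek succeeds.
print(frames_skipped_before(750, [750, 751]))

# Shuffling on: the decoder returns 749 and then 751, so frame 750 never appears.
try:
    frames_skipped_before(750, [749, 751])
except RuntimeError as error:
    print(error)
```

In this model the video is not actually variable frame rate; the check simply cannot tell a missing frame apart from an irregular timestamp, which would explain the misleading message.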