
How to use a single pipeline function to decode a video file (e.g. .mp4) into video AND audio tensors

Open zade-twelvelabs opened this issue 1 year ago • 1 comments

Describe the question.

  1. fn.readers.video must be used to read and decode video files
  2. fn.readers.file must be used to decode audio files, but does not accept video formats

So if I can't use fn.readers.file to read a video's audio, and fn.readers.video does not decode a video's audio, how do I decode the audio of an .mp4 file?

Check for duplicates

  • [X] I have searched the open bugs/issues and have found no duplicates for this bug report

zade-twelvelabs avatar Aug 05 '24 20:08 zade-twelvelabs

Hi @zade-twelvelabs,

Thank you for reaching out. Currently, DALI doesn't support decoding audio from mp4 files. The current audio decoding capabilities (and the flow) are described here. What you can do is use the external source operator and utilize FFmpeg to load and decode audio from mp4 containers. As audio decoding is not GPU accelerated in DALI, there shouldn't be a substantial perf overhead due to this.

JanuszL avatar Aug 05 '24 21:08 JanuszL
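
The external source suggestion above could be sketched roughly as follows. This is an illustration rather than an official DALI recipe: it assumes the `ffmpeg` binary is on `PATH` and that `numpy` and `nvidia.dali` are installed. `ffmpeg_audio_cmd`, `decode_audio`, and `build_pipeline` are hypothetical helper names, and the sample rate, channel count, and sequence length are arbitrary choices. It makes no attempt to align audio samples with the video sequences that fn.readers.video produces.

```python
import subprocess


def ffmpeg_audio_cmd(path, sample_rate=16000, channels=1):
    """Build an FFmpeg command that writes raw float32 PCM to stdout."""
    return [
        "ffmpeg", "-nostdin", "-v", "error",
        "-i", path,
        "-vn",                    # drop the video stream
        "-f", "f32le",            # raw little-endian float32 samples
        "-acodec", "pcm_f32le",
        "-ac", str(channels),
        "-ar", str(sample_rate),
        "pipe:1",                 # write to stdout
    ]


def decode_audio(path, sample_rate=16000, channels=1):
    """Decode the audio track of a container on the CPU via FFmpeg."""
    import numpy as np  # imported lazily so the command helper stays stdlib-only

    cmd = ffmpeg_audio_cmd(path, sample_rate, channels)
    raw = subprocess.run(cmd, check=True, stdout=subprocess.PIPE).stdout
    # Interleaved samples -> array of shape (num_samples, channels)
    return np.frombuffer(raw, dtype=np.float32).reshape(-1, channels)


def build_pipeline(files, sequence_length=16):
    """Assumed layout: video from fn.readers.video, audio from external_source."""
    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def(batch_size=1, num_threads=2, device_id=0)
    def video_audio_pipeline():
        video = fn.readers.video(
            device="gpu", filenames=files, sequence_length=sequence_length
        )
        # With batch=False the callable receives a SampleInfo per sample.
        audio = fn.external_source(
            source=lambda info: decode_audio(files[info.idx_in_epoch % len(files)]),
            batch=False,
            dtype=types.FLOAT,
        )
        return video, audio

    return video_audio_pipeline()
```

Calling `build_pipeline(["clip.mp4"])`, then `build()` and `run()` on the result, would return one video batch and one audio batch per iteration. Pairing each audio track with the matching video sequence (for example by slicing the audio using frame timestamps) would still have to be handled by the caller.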