Indexing video with binary mask
Describe the question.
Is it possible to index data nodes using a binary mask within a pipeline?
More precisely, I want to remove possible black bars at the top and bottom of videos. To do that, I sum each row over all frames, columns, and channels, which gives a binary mask that is true for every row containing at least one non-black pixel:
mask = fn.reductions.sum(video, axes=[0, 2, 3]) > 0
How can I now use the mask to crop the video accordingly?
Check for duplicates
- [X] I have searched the open bugs/issues and have found no duplicates for this bug report
Hi @Tomsen1410,
I think you can try something like this:
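from nvidia.dali import fn, types
# True for every row (axis 1, assuming an FHWC layout) that contains a non-black pixel.
mask = fn.reductions.sum(video, axes=[0, 2, 3]) > 0
# Number of content rows, i.e. the height of the cropped video.
shape = fn.reductions.sum(fn.cast(mask, dtype=types.INT32), dtype=types.INT32)
# Black rows within the first `shape` rows, i.e. the height of the top bar.
mask_shifted = fn.slice(fn.cast(mask, dtype=types.UINT8), 0, shape, axes=[0]) == 0
anchor = fn.reductions.sum(fn.cast(mask_shifted, dtype=types.INT32), dtype=types.INT32)
# Crop along the row axis.
video_trim = fn.slice(video, anchor, shape, axes=[1])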
This works as long as the content region is taller than the black bars.
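To see why that condition matters, here is a minimal pure-Python sketch of the same anchor/shape arithmetic applied to a plain list used as the row mask. It is only an illustration, not DALI code: `crop_window` and the sample masks are hypothetical names and data made up for this example.

```python
def crop_window(row_mask):
    """Return (anchor, shape) for a 1-D row mask where 1 marks a content row."""
    # Number of content rows: the height of the cropped video.
    shape = sum(1 for m in row_mask if m)
    # Black rows among the first `shape` rows: the height of the top bar,
    # provided the top bar is no taller than the content.
    anchor = sum(1 for m in row_mask[:shape] if not m)
    return anchor, shape

# 2 black rows on top, 5 content rows, 1 black row at the bottom.
mask = [0, 0, 1, 1, 1, 1, 1, 0]
anchor, shape = crop_window(mask)
rows = list(range(len(mask)))
cropped = rows[anchor:anchor + shape]  # keeps row indices 2..6

# Top bar (3 rows) taller than the content (2 rows):
# only 2 of the 3 top rows are counted, so the anchor comes out too small.
bad_anchor, bad_shape = crop_window([0, 0, 0, 1, 1, 0])
```

In the second case the crop would start one row too early and include a black row, which is the situation the caveat above rules out.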
You can also try the nonsilent_region operator, which was designed with audio signals in mind but should work in your case as well:
anchor, shape = fn.nonsilent_region(fn.cast(mask, dtype=types.UINT8), reset_interval=1, window_length=1)
It would be best for you to check which one performs better with your data.
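For intuition about what the second approach should produce, the sketch below computes a begin index and a length from the first and last non-zero entries of a row mask in plain Python. This is an assumption about the intended result on a 0/1 mask, not DALI's implementation: the real operator also applies a power threshold, and `content_span` and the choice to return `(0, 0)` for an all-black mask are hypothetical choices of this sketch.

```python
def content_span(row_mask):
    """Return (begin, length) of the span from the first to the last non-zero row."""
    nonzero = [i for i, m in enumerate(row_mask) if m]
    if not nonzero:
        # No content rows at all; this sketch reports an empty span.
        return 0, 0
    begin = nonzero[0]
    length = nonzero[-1] - begin + 1
    return begin, length

# Top bar taller than the content: the span is still located correctly.
begin, length = content_span([0, 0, 0, 1, 1, 0])
```

Unlike the counting approach, a first/last search like this does not depend on the content being taller than the bars, and black rows inside the content stay part of the span.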
Closing this issue. If there is anything else we can help with please reopen this one or create a new one.