decord: apply a transformation when using `VideoReader`

Hi, I know that we can pass the `width` and `height` arguments to `VideoReader` to resize the decoded frames. However, if we want to crop the frames first (or apply any other transformation) and then resize, I think there is currently no way to do this directly. For now I call `.asnumpy()` to convert the frames to a NumPy array and then crop it, but I noticed that this is far slower than resizing directly with `width` and `height` (more than 6 hours of difference on my dataset). On the other hand, `width` and `height` lack the flexibility to apply a custom transformation to the video frames.

So I am wondering whether it would be possible to add an optional function argument to `VideoReader`. Thanks!
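For reference, the crop-first workaround described above can be sketched roughly as follows. This is only a minimal sketch, not a decord feature: `center_crop_box` and `read_cropped` are hypothetical helper names, and it assumes the frames come back from `get_batch(...).asnumpy()` in `(T, H, W, C)` layout. It decodes a whole batch and crops it with a single slice instead of cropping frame by frame, which may or may not close the speed gap.

```python
def center_crop_box(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centered crop, clamped to the frame."""
    crop_h = min(crop_h, height)
    crop_w = min(crop_w, width)
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


def read_cropped(video_path, indices, crop_h, crop_w):
    """Decode the requested frames, then crop all of them with one NumPy slice."""
    import decord  # imported lazily so center_crop_box has no dependencies

    vr = decord.VideoReader(video_path)
    frames = vr.get_batch(indices).asnumpy()  # assumed layout: (T, H, W, C)
    _, height, width, _ = frames.shape
    top, left, bottom, right = center_crop_box(height, width, crop_h, crop_w)
    return frames[:, top:bottom, left:right, :]
```

The cropped array can then be resized with whatever library is preferred; the box computation is kept separate so that other crops (random, corner) can be swapped in.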

Jul 08 '21 00:07 chris-tkinter

@chris-tkinter Your question implies that decord's `width` and `height` arguments, on both `VideoLoader` and `VideoReader`, apply a pure resize. That means the image can potentially be stretched.

I can't find an explanation of what the `width` and `height` arguments do, either in the docs or in the code comments.
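If the arguments really do behave as a plain resize (which this thread assumes but has not confirmed), one way to avoid stretching is to compute a target size with the same aspect ratio as the source before passing it in. A minimal sketch, where `fit_short_side` and `open_resized` are hypothetical helper names and the source size is probed by reading the shape of the first frame:

```python
def fit_short_side(src_h, src_w, short_side):
    """Return (height, width) with the shorter side scaled to short_side."""
    if src_h <= src_w:
        return short_side, max(1, round(src_w * short_side / src_h))
    return max(1, round(src_h * short_side / src_w)), short_side


def open_resized(video_path, short_side):
    """Open the video twice: once to probe its size, once with the resize applied."""
    import decord  # imported lazily so fit_short_side has no dependencies

    probe = decord.VideoReader(video_path)
    src_h, src_w = probe[0].shape[:2]  # first frame, assumed (H, W, C)
    del probe
    height, width = fit_short_side(src_h, src_w, short_side)
    return decord.VideoReader(video_path, width=width, height=height)
```

For a 1920x1080 source and `short_side=256` this asks for a 455x256 output, so the frame is scaled rather than squashed into a square.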

Dec 09 '21 05:12 yasirroni

Me too.

I want to crop a region and then resize, but I cannot find any API in `VideoReader` that can do this.

Jan 17 '22 09:01 Dawn-LX

Also, I could not find a way to set the interpolation method, nor could I find the default one in the code or anywhere else. Please share if anyone finds it.

Jun 06 '22 06:06 alercelik

I wrote a transform to do this. I think we can read the video in its original shape and then use torchvision for the transformation.

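```python
import decord
from torchvision import transforms as T
# ToTensorVideo and NormalizeVideo are assumed to come from torchvision's
# private video-transforms module; the original post did not show this import.
from torchvision.transforms._transforms_video import NormalizeVideo, ToTensorVideo

# Return torch tensors from decord so the torchvision transforms can consume them.
decord.bridge.set_bridge("torch")

image_size = 224  # example value; the original post left this undefined

transform = T.Compose([
    ToTensorVideo(),               # (T, H, W, C) uint8 -> (C, T, H, W) float in [0, 1]
    T.Resize(image_size),          # resize the shorter side to image_size
    T.CenterCrop(image_size),      # then crop the center square
    NormalizeVideo([0.5], [0.5]),
])

video_path = "the place of your video here"
vr = decord.VideoReader(video_path)
sample_index = [0, 1, 2, 3]
video = vr.get_batch(sample_index)  # frames at their original resolution
video = transform(video)
print(video.shape)
```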

Feb 25 '23 07:02 kangqiyue