
Resized videos are read as their original size

Open lindapaiste opened this issue 2 years ago • 0 comments

We convert an HTMLVideoElement to a tensor using TensorFlow.js's tf.browser.fromPixels. For videos, this function reads the intrinsic size from videoWidth and videoHeight, rather than the current size from width and height as it does for images (source). Users might not be aware of this and wonder why their "small" videos are so slow to process.

Some models require a fixed image size, so this is not an issue for them: we already resize all inputs to that size. I noticed it while working on StyleTransfer, which can accept an image of any size.

We could add a check in toTensor that resizes the tensor to the width and height of the video element using tf.image.resizeBilinear.
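A minimal sketch of what that check might look like. The function name videoToTensor is hypothetical and this is not the existing toTensor code; tf is passed in as an argument so the sketch makes no assumption about how TensorFlow.js is loaded. It also assumes that a width or height of 0 means the attribute was never set on the element.

```javascript
// Hypothetical helper, not the actual ml5 toTensor implementation.
// `tf` is the TensorFlow.js namespace, passed in by the caller.
function videoToTensor(tf, video) {
  return tf.tidy(() => {
    // fromPixels reads the intrinsic size: videoHeight x videoWidth.
    const pixels = tf.browser.fromPixels(video);
    const [height, width] = pixels.shape;

    // Assumption: width/height are 0 when the attributes are unset,
    // in which case we fall back to the intrinsic size.
    const targetWidth = video.width || width;
    const targetHeight = video.height || height;

    if (targetWidth === width && targetHeight === height) {
      return pixels; // nothing to resize
    }

    // resizeBilinear expects [newHeight, newWidth] and returns float32.
    return tf.image.resizeBilinear(pixels, [targetHeight, targetWidth]);
  });
}
```

Note that the resized tensor is float32 while the fromPixels result is int32, so callers that depend on the dtype may need a cast.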

We could also add an imageSize option to the "any size" models, where the user can specify the size that they want their input media resized to. For example, they might want to evaluate a video at half of its original size. The StyleTransfer webcam example is noticeably faster when resizing the input.
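A rough sketch of how such an option might be handled. The { width, height } shape of imageSize and the helper names scaledImageSize and resizeInput are assumptions for illustration only; nothing like this exists in the library yet. As before, tf is passed in by the caller.

```javascript
// Hypothetical helper: build an imageSize value from a scale factor,
// based on the intrinsic size of the video.
function scaledImageSize(video, scale) {
  return {
    width: Math.max(1, Math.round(video.videoWidth * scale)),
    height: Math.max(1, Math.round(video.videoHeight * scale)),
  };
}

// Hypothetical option handling: resize only when imageSize is provided.
// The { width, height } option shape is an assumption.
function resizeInput(tf, tensor, options = {}) {
  const { imageSize } = options;
  if (!imageSize) {
    return tensor; // keep the original size
  }
  // resizeBilinear expects [newHeight, newWidth].
  return tf.image.resizeBilinear(tensor, [imageSize.height, imageSize.width]);
}
```

Evaluating a video at half size would then be something like resizeInput(tf, tensor, { imageSize: scaledImageSize(video, 0.5) }).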

lindapaiste, Jun 12 '22 21:06