ml5-library
Resized videos are read as their original size
We convert an `HTMLVideoElement` to a tensor using TensorFlow's `tf.browser.fromPixels`. This function looks at the intrinsic size of the video using `videoWidth` and `videoHeight` rather than looking at the current size using `width` and `height` like it does for images (source). Users might not be aware of this and wonder why their "small" videos are so slow to process.
Some models require a fixed image size, so this is not an issue for them because we resize all inputs to that size. I noticed it while working on `StyleTransfer`, which can accept images of any size.
We could add a check in `toTensor` that applies the `width` and `height` of the video using `tf.image.resizeBilinear`.
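As a rough sketch of what such a check might look like (not the existing ml5 `toTensor` code): the helper names `needsResize` and `videoToTensor` are hypothetical, and `tf` is passed in as an argument so the caller supplies the TensorFlow.js module. It assumes that an unset `width` or `height` attribute reads as `0` and should be treated as "no resize requested".

```javascript
// Hypothetical helper: true when the video has an explicit display size
// that differs from its intrinsic (decoded) size.
function needsResize(video) {
  return (
    video.width > 0 &&
    video.height > 0 &&
    (video.width !== video.videoWidth || video.height !== video.videoHeight)
  );
}

// Sketch of a toTensor-style conversion for videos.
// `tf` is the TensorFlow.js module, supplied by the caller.
function videoToTensor(tf, video) {
  return tf.tidy(() => {
    // fromPixels reads the intrinsic size: [videoHeight, videoWidth, 3]
    const pixels = tf.browser.fromPixels(video);
    if (!needsResize(video)) {
      return pixels;
    }
    // resizeBilinear takes the size as [newHeight, newWidth]
    // and returns a float32 tensor.
    return tf.image.resizeBilinear(pixels, [video.height, video.width]);
  });
}
```

One thing to keep in mind with this approach is that the resized result is a float32 tensor while the untouched `fromPixels` result is int32, so downstream code may need a cast to keep the dtype consistent.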
We could add an `imageSize` option to the "any size" models where the user can specify a size that they want their input media resized to. For example, they might want to evaluate a video at half of its original size. The `StyleTransfer` webcam example is noticeably faster when resizing the input.
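A minimal sketch of how an `imageSize` option could be applied, assuming it takes the shape `{ width, height }` (that shape, and the names `resolveTargetSize` and `mediaToTensor`, are made up here for illustration and are not an existing ml5 API). As above, `tf` is passed in by the caller.

```javascript
// Hypothetical helper: turns an optional `imageSize` option into the
// [height, width] pair that tf.image.resizeBilinear expects.
// Falls back to the video's intrinsic size when no option is given.
function resolveTargetSize(video, imageSize) {
  if (!imageSize) {
    return [video.videoHeight, video.videoWidth];
  }
  return [Math.round(imageSize.height), Math.round(imageSize.width)];
}

// Sketch: convert a video to a tensor at the size the user asked for.
// `tf` is the TensorFlow.js module, supplied by the caller.
function mediaToTensor(tf, video, options = {}) {
  const [targetHeight, targetWidth] = resolveTargetSize(video, options.imageSize);
  return tf.tidy(() => {
    const pixels = tf.browser.fromPixels(video);
    const [height, width] = pixels.shape;
    if (height === targetHeight && width === targetWidth) {
      return pixels; // already the requested size
    }
    return tf.image.resizeBilinear(pixels, [targetHeight, targetWidth]);
  });
}
```

With this shape, evaluating a video at half of its original size would look like `mediaToTensor(tf, video, { imageSize: { width: video.videoWidth / 2, height: video.videoHeight / 2 } })`.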