vision-camera-resize-plugin

Float32 array returns zeroed out values past index 128 on each row

Open computerjazz opened this issue 4 months ago • 7 comments

Hi, I'm trying to pipe vision camera output into a tflite face detection model that requires 128x128x3 float32 input. Here's the relevant code:

        const IMAGE_SIZE = 128
        const COLOR_CHANNELS = 3

        const data = resize(frame, {
          crop: {
            x: 0,
            y: 0,
            width: frame.width,
            height: frame.height,
          },
          scale: {
            width: IMAGE_SIZE,
            height: IMAGE_SIZE,
          },
          pixelFormat: 'rgb',
          dataType: 'float32',
        })

        // log out first row of image (and 1st pixel R value of the second row)
        console.log('first row plus 1px', data.slice(0, IMAGE_SIZE * COLOR_CHANNELS + 1))

        const outputs = bfsr.model?.runSync([data])

I noticed that my face was only detected when it was on the left side of the screen. Upon inspecting the resized array, I found that only the first 128 indices of each row had values, even though each row should hold 128 x 3 = 384 values:

Screenshot 2024-02-21 at 2 17 40 PM

This did not happen when I changed dataType to uint8:

Screenshot 2024-02-21 at 2 22 02 PM
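
Since the screenshots may not come through, here is a minimal sketch of how the per-row pattern can be checked in code. It assumes the resized output is a flat, row-major, interleaved RGB buffer of width * height * channels values; `lastNonZeroIndexPerRow` is a hypothetical helper written for illustration and is not part of vision-camera-resize-plugin.

```typescript
// Hypothetical diagnostic helper; not part of vision-camera-resize-plugin.
// Assumes `data` is a flat, row-major, interleaved buffer holding
// width * height * channels values (e.g. 128 * 128 * 3 = 49152 floats).
function lastNonZeroIndexPerRow(
  data: ArrayLike<number>,
  width: number,
  height: number,
  channels: number,
): number[] {
  const rowLength = width * channels
  if (data.length !== rowLength * height) {
    throw new Error(`expected ${rowLength * height} values, got ${data.length}`)
  }
  const result: number[] = []
  for (let row = 0; row < height; row++) {
    const offset = row * rowLength
    let last = -1 // -1 means the entire row is zero
    for (let i = 0; i < rowLength; i++) {
      if (data[offset + i] !== 0) last = i
    }
    result.push(last)
  }
  return result
}

// Synthetic data reproducing the reported pattern on a small scale:
// a 4x2 RGB image (12 values per row) where only the first 4 values
// of each row are filled in.
const demo = new Float32Array(4 * 2 * 3)
for (let row = 0; row < 2; row++) {
  for (let i = 0; i < 4; i++) demo[row * 12 + i] = 0.5
}
console.log(lastNonZeroIndexPerRow(demo, 4, 2, 3)) // last non-zero index is 3 in both rows
```

On a correctly filled 128x128 RGB buffer, most rows should report an index at or near 383 (a genuinely black right edge would be the exception), whereas the behaviour described above would report 127 for every row. To call this from inside a frame processor, the function would presumably also need the 'worklet' directive.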

Am I missing something? Thanks!

computerjazz, Feb 21 '24 22:02