ml5-library

[devOps] Error: "n.videoElt.captureStream is not a function" in Safari

Open • ccarse opened this issue 6 years ago • 4 comments

Dear ml5 community,

I'm submitting a new issue. Please see the details below.

→ Step 1: Describe the issue 📝

Did you find a bug? Want to suggest an idea for a feature? I'm getting the following error when trying to use the YOLO model in Safari:

Unhandled Promise Rejection: TypeError: n.videoElt.captureStream is not a function. (In 'n.videoElt.captureStream()', 'n.videoElt.captureStream' is undefined)
dispatchException — runtime.js:569

It looks like captureStream isn't supported in Safari. Is there an alternative API I can use?

Here's my code:

import * as React from 'react';
import * as ReactDOM from 'react-dom';
const ml5 = require('ml5');

interface SmartCameraState {
  isLoading: boolean;
  results: string;
  width: number;
  height: number;
}

export class SmartCamera extends React.Component<{}, SmartCameraState> {
  videoRef?: HTMLVideoElement;
  canvasRef?: HTMLCanvasElement;

  detector?: any;

  constructor(props: {}) {
    super(props);
    this.state = {
      isLoading: true,
      results: '',
      width: 640,//1280,
      height: 480//960
    };
  }

  async componentDidMount() {
    if (!this.videoRef || !this.canvasRef) { return; }

    const ctx = this.canvasRef.getContext('2d') as CanvasRenderingContext2D;
    ctx.lineWidth = 5;
    ctx.strokeStyle = "#FFFFFF";
    ctx.font = '20px Arial';
    ctx.textBaseline = 'top';

    // Create a webcam capture
    const stream = await navigator.mediaDevices.getUserMedia({ video: { width: this.state.width, height: this.state.height, facingMode: 'environment'} });
    console.log('Camera loaded');
    this.videoRef.srcObject = stream;
    await this.videoRef.play();

    const classifyVideo = () => {
      this.detector.detect(gotResult);
    }

    const gotResult = (err: any, results: {label: string, confidence: number, x: number, y: number, w: number, h: number}[]) => {
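      // Stop the loop if detection failed; results may be undefined in that case.
      if (err) { console.error(err); return; }
      if (this.state.isLoading) { this.setState({isLoading: false}); }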

      ctx.clearRect(0, 0, this.state.width, this.state.height);
      results.forEach(result => {
        const resultStr = `${result.label} ${(result.confidence * 100).toFixed(1)}%`;
        const xpos = this.state.width * result.x;
        const ypos = this.state.height * result.y;
        const boxWidth = this.state.width * result.w;
        const boxHeight = this.state.height * result.h;
        const textWidth = ctx.measureText(resultStr).width;
        // console.log(`x: ${xpos} y: ${ypos} w: ${boxWidth} h: ${boxHeight}`);

        ctx.beginPath();
        ctx.rect(xpos, ypos, boxWidth, boxHeight);
        ctx.stroke();
        ctx.fillStyle = "#FFFFFF";
        ctx.fillRect(xpos, ypos, textWidth, 22);
        ctx.fillStyle = "#000000";
        ctx.fillText(resultStr, xpos, ypos);
      });

      // this.setState({results: JSON.stringify(results, null, 2)});
      classifyVideo();  
    }

    // Create the detector; start the detection loop once the model has loaded.
    this.detector = await ml5.YOLO(this.videoRef, () => classifyVideo());
    // classifyVideo();
  }
  
  render() {
    return (
      <>
        {this.state.isLoading ? <div>loading...</div> : null}
        <div>
          <video id="video" autoPlay muted loop playsInline ref={this.setVideoInputRef} width={this.state.width} height={this.state.height} style={{position: 'fixed'}}/>
          <canvas width={this.state.width} height={this.state.height} ref={ ref => ref && (this.canvasRef = ref)} style={{position: 'fixed'}}/>
        </div>
      </>
    );
  }

  setVideoInputRef = (ref: HTMLVideoElement) => {
    ref && (this.videoRef = ref) 
  }
}

ccarse • Oct 14 '19 18:10

I was able to monkey-patch this with: this.videoRef.captureStream = () => stream;

It would be nice if, instead of passing the model a video or image element, we could pass a MediaStream. That way I could just give it the stream that is returned from navigator.mediaDevices.getUserMedia.
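The workaround above could be wrapped in a small helper, sketched here in TypeScript. The names ensureCaptureStream and StreamSource are hypothetical and not part of ml5. In the browser the two arguments would be the video element and the MediaStream returned by getUserMedia. This is only a sketch, and it assumes the library does nothing with captureStream beyond calling it.

```typescript
// Structural stand-in for HTMLVideoElement, so the sketch type-checks without DOM typings.
// (Hypothetical interface; not an ml5 or browser type.)
interface StreamSource<S> {
  captureStream?: () => S;
}

// Installs a captureStream() fallback only when the element lacks one
// (as Safari did in this report). Returns true if the fallback was installed,
// false if a captureStream method was already present.
function ensureCaptureStream<S>(video: StreamSource<S>, stream: S): boolean {
  if (typeof video.captureStream === 'function') {
    return false;
  }
  // Hand back the stream that already feeds the element.
  video.captureStream = () => stream;
  return true;
}
```

In the component above this would be called with this.videoRef and stream before ml5.YOLO(...). A type cast may be needed there, since TypeScript's DOM typings may not declare captureStream on video elements.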

ccarse • Oct 16 '19 14:10

@ccarse - Thanks so much for this investigation! This is definitely something we need to keep in mind + something to add to the browser testing todos to make sure we can support the modern browsers. Let's keep this issue open as a note.

Thanks for following up!

joeyklee • Oct 16 '19 18:10

Hello, any update on this topic? I am having the same issue in Safari when using YOLO with mp4 files.

mr1985 • Jun 17 '20 06:06

This issue is caused by the Video utility which we use in YOLO, MobileNet, and StyleTransfer.

The Video utility copies the user's video into a new <video> element by capturing its stream with captureStream(), which Safari does not support.

https://github.com/ml5js/ml5-library/blob/c3123cac0b1dfa0ed8e3e2588e8dea72ccd05aa8/src/utils/Video.js#L58-L63

I believe that the reason for copying the video is so that we can resize it into the shape that is required by the model, without messing up the video that is displayed on the page.

Funny thing is...it does not actually resize the input data! We are setting the width and height properties on the video, but the TensorFlow tf.browser.fromPixels function gets the size from the videoWidth and videoHeight properties, which contain the intrinsic size of the video. These are read-only properties which we cannot set.

https://github.com/ml5js/ml5-library/blob/c3123cac0b1dfa0ed8e3e2588e8dea72ccd05aa8/src/utils/Video.js#L64-L67

https://github.com/tensorflow/tfjs/blob/8d96f5dd140e7114e167dfe4d4fe4300f4aaf4a8/tfjs-core/src/ops/browser.ts#L131-L136
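A minimal TypeScript sketch of the mismatch described above: the settable width and height are display properties, while the pixel data is sized by the read-only videoWidth and videoHeight. VideoLike and frameShape are hypothetical names used only for illustration, the sizes in the usage note are made up, and the three-channel default is an assumption.

```typescript
// Structural stand-in for the relevant HTMLVideoElement properties (hypothetical type).
interface VideoLike {
  width: number;                 // display width, settable
  height: number;                // display height, settable
  readonly videoWidth: number;   // intrinsic width of the video, read-only
  readonly videoHeight: number;  // intrinsic height of the video, read-only
}

// Shape [height, width, channels] of the pixel data read from the video.
// It depends only on the intrinsic size, never on width/height.
// numChannels = 3 (RGB) is an assumption made for this sketch.
function frameShape(video: VideoLike, numChannels = 3): [number, number, number] {
  return [video.videoHeight, video.videoWidth, numChannels];
}
```

For a 640x480 camera whose element has width and height set to 416, frameShape gives [480, 640, 3], so the model still receives the full-size frame.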

My recommendation is that we read the video at its current size and convert the current frame into a TensorFlow tensor first. Then we can use TensorFlow functions like tf.image.resizeBilinear to resize the tensor of pixel data.
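As a rough illustration of that order of operations (read the frame at its intrinsic size, then resize the pixel data), here is a plain-TypeScript sketch of bilinear resizing on a single-channel frame. It is not TensorFlow's implementation, and resizeBilinearGray is a hypothetical name. In the library this step would be tf.browser.fromPixels followed by tf.image.resizeBilinear on the resulting tensor.

```typescript
// Resizes a row-major, single-channel image using bilinear interpolation.
// Each output pixel maps back to the source at (x * srcW / dstW, y * srcH / dstH)
// and blends the four surrounding source pixels, clamping at the right/bottom edge.
function resizeBilinearGray(
  src: ArrayLike<number>,
  srcW: number,
  srcH: number,
  dstW: number,
  dstH: number
): number[] {
  const out = new Array<number>(dstW * dstH);
  const scaleX = srcW / dstW;
  const scaleY = srcH / dstH;

  for (let y = 0; y < dstH; y++) {
    const sy = y * scaleY;
    const y0 = Math.min(Math.floor(sy), srcH - 1);
    const y1 = Math.min(y0 + 1, srcH - 1);
    const fy = sy - y0; // vertical blend weight

    for (let x = 0; x < dstW; x++) {
      const sx = x * scaleX;
      const x0 = Math.min(Math.floor(sx), srcW - 1);
      const x1 = Math.min(x0 + 1, srcW - 1);
      const fx = sx - x0; // horizontal blend weight

      const top = src[y0 * srcW + x0] * (1 - fx) + src[y0 * srcW + x1] * fx;
      const bottom = src[y1 * srcW + x0] * (1 - fx) + src[y1 * srcW + x1] * fx;
      out[y * dstW + x] = top * (1 - fy) + bottom * fy;
    }
  }
  return out;
}
```

For example, resizeBilinearGray([0, 10], 2, 1, 4, 1) returns [0, 5, 10, 10]. The point of the design is that the source element is never touched, so the video displayed on the page keeps its size and no captureStream copy is needed.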

lindapaiste • Jun 12 '22 18:06