ml5-library
[devOps] Error: "n.videoElt.captureStream is not a function" in Safari
Dear ml5 community,
I'm submitting a new issue. Please see the details below.
→ Step 1: Describe the issue 📝
Did you find a bug? Want to suggest an idea for a feature? I'm getting the following error when trying to use the YOLO model in Safari:
Unhandled Promise Rejection: TypeError: n.videoElt.captureStream is not a function. (In 'n.videoElt.captureStream()', 'n.videoElt.captureStream' is undefined)
dispatchException — runtime.js:569
It looks like Safari doesn't support captureStream. Is there an alternative API I can use?
Here's my code:
import * as React from 'react';
import * as ReactDOM from 'react-dom';
const ml5 = require('ml5');

interface SmartCameraState {
  isLoading: boolean;
  results: string;
  width: number;
  height: number;
}

export class SmartCamera extends React.Component<{}, SmartCameraState> {
  videoRef?: HTMLVideoElement;
  canvasRef?: HTMLCanvasElement;
  detector?: any;

  constructor(props: {}) {
    super(props);
    this.state = {
      isLoading: true,
      results: '',
      width: 640,//1280,
      height: 480//960
    };
  }

  async componentDidMount() {
    if (!this.videoRef || !this.canvasRef) { return; }

    // Set up the overlay canvas used to draw bounding boxes and labels
    const ctx = this.canvasRef.getContext('2d') as CanvasRenderingContext2D;
    ctx.lineWidth = 5;
    ctx.strokeStyle = "#FFFFFF";
    ctx.font = '20px Arial';
    ctx.textBaseline = 'top';

    // Create a webcam capture
    const stream = await navigator.mediaDevices.getUserMedia({ video: { width: this.state.width, height: this.state.height, facingMode: 'environment'} });
    console.log('Camera loaded');
    this.videoRef.srcObject = stream;
    await this.videoRef.play();

    // Request one detection; gotResult calls this again to keep the loop going
    const classifyVideo = () => {
      this.detector.detect(gotResult);
    }

    const gotResult = (err: any, results: {label: string, confidence: number, x: number, y: number, w: number, h: number}[]) => {
      // Stop the detection loop if the model reports an error
      if (err) { console.error(err); return; }
      if (this.state.isLoading) { this.setState({isLoading: false}); }
      ctx.clearRect(0, 0, this.state.width, this.state.height);
      results.forEach(result => {
        // Results are normalized to 0..1, so scale them to the canvas size
        const resultStr = `${result.label} ${(result.confidence * 100).toFixed(1)}%`;
        const xpos = this.state.width * result.x;
        const ypos = this.state.height * result.y;
        const boxWidth = this.state.width * result.w;
        const boxHeight = this.state.height * result.h;
        const textWidth = ctx.measureText(resultStr).width;
        // console.log(`x: ${xpos} y: ${ypos} w: ${boxWidth} h: ${boxHeight}`);
        ctx.beginPath();
        ctx.rect(xpos, ypos, boxWidth, boxHeight);
        ctx.stroke();
        ctx.fillStyle = "#FFFFFF";
        ctx.fillRect(xpos, ypos, textWidth, 22);
        ctx.fillStyle = "#000000";
        ctx.fillText(resultStr, xpos, ypos);
      });
      // this.setState({results: JSON.stringify(results, null, 2)});
      classifyVideo();
    }

    this.detector = await ml5.YOLO(this.videoRef, () => classifyVideo());
    // classifyVideo();
  }

  render() {
    return (
      <>
        {this.state.isLoading ? <div>loading...</div> : null}
        <div>
          <video id="video" autoPlay muted loop playsInline ref={this.setVideoInputRef} width={this.state.width} height={this.state.height} style={{position: 'fixed'}}/>
          <canvas width={this.state.width} height={this.state.height} ref={ ref => ref && (this.canvasRef = ref)} style={{position: 'fixed'}}/>
        </div>
      </>
    );
  }

  setVideoInputRef = (ref: HTMLVideoElement) => {
    ref && (this.videoRef = ref)
  }
}
I was able to work around this by monkey-patching the video element: this.videoRef.captureStream = () => stream;
I think it would be nice if, instead of passing the model a video or image element, we could pass a MediaStream. That way I could just give it the stream that is returned from navigator.mediaDevices.getUserMedia.
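For anyone who wants to try the same workaround, here is a minimal sketch of it as a guarded helper, so the patch is applied only where captureStream is missing. The names ensureCaptureStream and CapturableVideo are hypothetical and not part of ml5; the structural interface stands in for the video element so the sketch does not depend on what the DOM typings declare.

```typescript
// Hypothetical stand-in for the part of the video element used here.
interface CapturableVideo<S> {
  captureStream?: () => S;
}

// Installs a captureStream fallback only when the element lacks one.
// Returns true if the fallback was installed, false if a native
// implementation was already present and left untouched.
function ensureCaptureStream<S>(video: CapturableVideo<S>, stream: S): boolean {
  if (typeof video.captureStream === 'function') {
    return false;
  }
  // Same patch as in the comment above: hand back the stream we already have.
  video.captureStream = () => stream;
  return true;
}
```

In the component above this would be called as ensureCaptureStream(this.videoRef as any, stream) before ml5.YOLO(...); the cast is an assumption, in case the TypeScript DOM typings in use do not declare captureStream on video elements.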
@ccarse - Thanks so much for this investigation! This is definitely something we need to keep in mind + something to add to the browser testing todos to make sure we can support the modern browsers. Let's keep this issue open as a note.
Thanks for following up!
Hello, any update on this topic? I am having the same issue in Safari when using YOLO with MP4 files.
This issue is caused by the Video utility which we use in YOLO, MobileNet, and StyleTransfer.
The Video utility copies the user's video into a new <video> element by capturing the original element's stream with captureStream(), which Safari does not support.
https://github.com/ml5js/ml5-library/blob/c3123cac0b1dfa0ed8e3e2588e8dea72ccd05aa8/src/utils/Video.js#L58-L63
I believe that the reason for copying the video is so that we can resize it into the shape that is required by the model, without messing up the video that is displayed on the page.
Funny thing is...it does not actually resize the input data! We are setting the width and height properties on the video, but the TensorFlow tf.browser.fromPixels function gets the size from the videoWidth and videoHeight properties, which contain the intrinsic size of the video. These are read-only properties which we cannot set.
https://github.com/ml5js/ml5-library/blob/c3123cac0b1dfa0ed8e3e2588e8dea72ccd05aa8/src/utils/Video.js#L64-L67
https://github.com/tensorflow/tfjs/blob/8d96f5dd140e7114e167dfe4d4fe4300f4aaf4a8/tfjs-core/src/ops/browser.ts#L131-L136
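To make the width/height versus videoWidth/videoHeight distinction concrete, here is a small sketch that uses a plain object in place of a real video element. VideoLike, intrinsicShape and matchesModelInput are hypothetical names, and 416 in the usage note is only an illustrative model input size.

```typescript
// Hypothetical stand-in for the four size properties of a video element.
interface VideoLike {
  width: number;       // display size, settable
  height: number;      // display size, settable
  videoWidth: number;  // intrinsic size, read-only on a real element
  videoHeight: number; // intrinsic size, read-only on a real element
}

// The [height, width] a pixel reader sees if, as described above,
// it takes the size from the intrinsic properties.
function intrinsicShape(video: VideoLike): [number, number] {
  return [video.videoHeight, video.videoWidth];
}

// True only when the intrinsic frame already has the square size the model wants.
function matchesModelInput(video: VideoLike, size: number): boolean {
  const [h, w] = intrinsicShape(video);
  return h === size && w === size;
}
```

For a 640x480 camera feed whose width and height have been set to 416, matchesModelInput(video, 416) is still false, because the intrinsic size has not changed.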
My recommendation is that we read the video at its current size and convert the current frame into a TensorFlow tensor first. Then we can use TensorFlow functions like tf.image.resizeBilinear to resize the tensor of pixel data.
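A rough sketch of that recommendation follows. To keep it independent of any particular TensorFlow.js version, the three operations are passed in through a hypothetical FrameOps adapter; FrameOps and frameToModelInput are illustrative names, not existing ml5 or TensorFlow.js APIs. With the real library the adapter would wrap tf.browser.fromPixels, tf.image.resizeBilinear and the tensor's dispose(), as shown in the trailing comment.

```typescript
// Hypothetical adapter over the tensor operations this sketch needs.
interface FrameOps<Frame, Tensor> {
  fromPixels: (frame: Frame) => Tensor;
  resizeBilinear: (image: Tensor, size: [number, number]) => Tensor;
  dispose: (tensor: Tensor) => void;
}

// Reads the frame at its current size, then resizes the tensor (not the
// video element) to a square size x size model input.
function frameToModelInput<Frame, Tensor>(
  ops: FrameOps<Frame, Tensor>,
  frame: Frame,
  size: number
): Tensor {
  const fullSize = ops.fromPixels(frame);
  try {
    // Size is given as [newHeight, newWidth]; a non-square frame is stretched.
    return ops.resizeBilinear(fullSize, [size, size]);
  } finally {
    // The full-size tensor is no longer needed once the resized copy exists.
    ops.dispose(fullSize);
  }
}

// Assumed wiring with TensorFlow.js (not executed here):
//   import * as tf from '@tensorflow/tfjs';
//   const tfOps: FrameOps<HTMLVideoElement, tf.Tensor3D> = {
//     fromPixels: v => tf.browser.fromPixels(v),
//     resizeBilinear: (t, s) => tf.image.resizeBilinear(t, s),
//     dispose: t => t.dispose(),
//   };
```

Because the original element is read directly, this approach would not need a copied <video> or captureStream at all, which is what makes it relevant to the Safari error.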