Investigate porting some of the engines to run in the browser
It is technically possible, overall, since the core components, espeak-ng and onnxruntime, both fully support running in the browser. In fact, onnxruntime-web, unlike onnxruntime-node (the currently used package), can also make use of the GPU via WebGL and WebGPU, which may give a performance boost for some users.
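For illustration, here is a minimal sketch of what creating an inference session with onnxruntime-web could look like, preferring the WebGPU execution provider and falling back to the WASM (CPU) one. The function name and model URL are placeholders, not existing Echogarden APIs, and depending on the onnxruntime-web version the WebGPU backend may need to be loaded from a dedicated entry point:

```ts
import * as ort from 'onnxruntime-web'

// Placeholder helper, not part of Echogarden.
// Creates an ONNX Runtime session in the browser, trying WebGPU first
// and falling back to the WASM (CPU) execution provider.
async function createBrowserSession(modelUrl: string) {
	const session = await ort.InferenceSession.create(modelUrl, {
		// Providers are tried in order; 'webgpu' requires a browser that exposes the WebGPU API
		executionProviders: ['webgpu', 'wasm'],
	})

	return session
}
```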
However, it would be a lot of work, and only a subset of the engines could be supported (no cloud engines, in particular). There are several reasons why the web may not be the most effective platform for Echogarden:
- Significantly slower inference when using CPU for ONNX models
- No cross-domain network connectivity: connecting to Google Cloud, Microsoft, Amazon, etc. isn't possible without a proxy
- Large initial download size would make it too heavy and slow to load as part of a standard web page directly, especially for Whisper models, which range from several hundred megabytes to gigabytes in size
- Large memory requirement for the VITS models, starting at roughly 800 MB to 1 GB, which is a bit too much for a browser
- Due to the high code complexity, data size, and memory consumption, it is unlikely that a browser extension internally bundling some of the models would be accepted into the Chrome and Firefox extension stores
- Will require a virtual file system to store models and make use of downloadable packages (see the sketch after this list)
- Requires duplicating a lot of prior work, porting many node.js-only APIs, and increasing code complexity
- Potentially many issues with inconsistent browser support and browser security constraints
- Not future-proof. Due to changing browser restrictions, the runtime environment is not guaranteed to be reliably reproducible in the future, meaning it may need continuous maintenance to keep working on the newest browsers
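As a rough illustration of the virtual file system point above, here is a sketch that caches a downloaded model file in the browser's Origin Private File System (OPFS). The function, directory, and argument names are hypothetical and not part of Echogarden:

```ts
// Hypothetical sketch: cache a downloaded model package in the browser's
// Origin Private File System (OPFS), which could serve as the virtual file system.
async function getCachedModel(fileName: string, downloadUrl: string): Promise<ArrayBuffer> {
	const rootDir = await navigator.storage.getDirectory()
	const modelsDir = await rootDir.getDirectoryHandle('models', { create: true })

	try {
		// Return the cached copy if it already exists
		const existingHandle = await modelsDir.getFileHandle(fileName)
		const existingFile = await existingHandle.getFile()

		return await existingFile.arrayBuffer()
	} catch {
		// Not cached yet: download, store, and return
		const response = await fetch(downloadUrl)
		const data = await response.arrayBuffer()

		const newHandle = await modelsDir.getFileHandle(fileName, { create: true })
		const writable = await newHandle.createWritable()
		await writable.write(data)
		await writable.close()

		return data
	}
}
```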
Currently, getting a working web-based UI client is a higher priority, so work on this is not likely to start any time soon.