Question: Compatibility with XGBoost or scikit-learn models
Hello! Firstly, thanks for the amazing work on enabling WebAssembly for ML 🤗!
I'm fairly new to using WebAssembly for machine learning, but I'm particularly interested in compiling non-neural-network models to WebAssembly for inference. Can I clarify whether the backend support listed for wasi-nn (TensorFlow, ONNX, OpenVINO, etc.), and specifically ONNX, implies that wasi-nn should also work with XGBoost and scikit-learn models/pipelines, for example as shown in this documentation? The idea here is that if I have an existing XGBoost model that I'd like to deploy for inference, I would first need to convert the model to the ONNX format and then write the wasi-nn bindings (AssemblyScript or Rust) to execute it?
Thanks in advance!
@chongshenng, sounds like there are several options here:
- if you're interested in getting something working soon-ish, you could go down the model conversion route (e.g., convert the XGBoost or scikit-learn model to ONNX with onnxmltools or skl2onnx) and try to run your model in ONNX (or any of the other backends that are implemented by a WebAssembly engine: OpenVINO in Wasmtime, GGML in WasmEdge, several of these in WAMR...). If you're really interested in ONNX specifically, you might want to talk to @devigned, who has this mostly implemented and could probably use some help.
- if you're really just interested in XGBoost or scikit-learn, maybe the answer is to add those to the `graph-encoding` enum (see the sketch after this list); this is a longer-term proposition, however, because you or someone else would have to implement the WASI bindings to that ML backend in some WebAssembly engine. This is not too difficult, but you may not be interested in bootstrapping a whole new ML backend.
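To make the second option concrete: the wasi-nn crate exposes the spec's `graph-encoding` enum roughly as below (the variant list is from memory and may differ by crate version; the `Xgboost` variant is hypothetical and shown only to illustrate the shape of the change, since the real work is the backend implementation in an engine):

```rust
// Rough shape of the wasi-nn crate's mirror of the spec's graph-encoding
// enum; variant names are from memory and may differ across versions.
pub enum GraphEncoding {
    Openvino,
    Onnx,
    Tensorflow,
    Pytorch,
    TensorflowLite,
    // Xgboost,  // hypothetical: would also require a spec change and a
    //           // backend implementation in some WebAssembly engine
}
```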
Once you get the above piece figured out ("what engine and backend will run this model?"), you can write a program in some language, compile it to WebAssembly, and execute the model. Here's a good example of that using Wasmtime and OpenVINO: main.rs. Notice how, since we're using Rust and some Rust bindings already exist (the wasi-nn crate), you won't need to create any additional bindings yourself: just use the crate and compile to the `wasm32-wasi` target. If you use some other language, e.g., C, you would have to create the bindings yourself. Again, this is not too difficult, but you may not be interested in this part.
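For a feel of what that program looks like, here is a minimal sketch using the safe API of a recent wasi-nn crate release (the model path, tensor shapes, and output size are placeholders, and exact method names vary a bit between crate versions):

```rust
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Ask the host engine to load the model with its ONNX backend; this
    // assumes the engine you run under has an ONNX backend wired up.
    let graph = GraphBuilder::new(GraphEncoding::Onnx, ExecutionTarget::CPU)
        .build_from_files(["model.onnx"])?;
    let mut ctx = graph.init_execution_context()?;

    // Placeholder input: substitute your model's real shape and data.
    let input = vec![0f32; 28 * 28];
    ctx.set_input(0, TensorType::F32, &[1, 28, 28], &input)?;

    // Inference runs in the host's ML backend, outside the sandbox.
    ctx.compute()?;

    // Read back the first output tensor (size is also a placeholder).
    let mut output = vec![0f32; 10];
    ctx.get_output(0, &mut output)?;
    println!("output: {output:?}");
    Ok(())
}
```

Build it with `cargo build --target wasm32-wasi` and run the resulting module in an engine that has the matching wasi-nn backend enabled.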
Some more details on bindings (feel free to ignore if this is too much!): because this wasi-nn specification has switched to using the WIT language, the bindings could be auto-generated for you by wit-bindgen. This allows you to use more languages, but note that not all engines support the component model (i.e., the ABI for WIT) yet, so for those engines this path is not helpful.
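To make the WIT path concrete, guest-side binding generation looks roughly like this in Rust (macro options vary across wit-bindgen releases, and the `path` value here is an assumption pointing at a local checkout of the wasi-nn WIT files):

```rust
// Generate Rust bindings for the `ml` world from the wasi-nn WIT package.
// After this macro expands, the wasi:nn interfaces are available as
// ordinary Rust modules in this crate.
wit_bindgen::generate!({
    path: "wit",  // directory containing the wasi-nn *.wit files
    world: "ml",  // the world defined by the wasi-nn specification
});
```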
To sum it up: (1) decide which ML backend to use (ONNX?), (2) decide which engine to use and make sure it supports the backend, and (3) compile your code to WebAssembly and run it in the engine. Hope that helps!
@abrown, thank you for the clear explanations! I appreciate the details; they help my understanding as well.
I would very much be interested in learning about and helping with running ONNX models. Let me reach out to @devigned separately to understand what currently exists.
Am I correct in understanding that [tract](https://github.com/sonos/tract) (specifically tract_onnx) is another option for running ONNX in WebAssembly? How is wasi-nn different from tract_onnx?
I think what tract is trying to do is compile the ML backend itself (the implementations of all the operators) into a WebAssembly module. (That is, if I remember correctly how they use WebAssembly...) I found an example where they explain this a bit. The major difference with wasi-nn is that wasi-nn delegates to an ML backend outside the WebAssembly sandbox, which can use any special hardware features the system has available (threads, wider SIMD, special instructions, etc.). From measurements @mingqiusun and I did some time ago, there is a large-ish performance gap between using an ML backend outside the sandbox (wasi-nn) and using one inside it. But the inside approach is more portable.
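For comparison, the in-sandbox route looks roughly like this with tract_onnx (API details are from memory; the model path and input shape are placeholders). Note that everything here, operator implementations included, compiles into your WebAssembly module:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // The ONNX runtime itself (all operator implementations) is part of
    // this program, so it gets compiled into the WebAssembly module.
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")? // placeholder path
        .with_input_fact(
            0,
            InferenceFact::dt_shape(f32::datum_type(), tvec![1, 3, 224, 224]),
        )?
        .into_optimized()?
        .into_runnable()?;

    // Placeholder input matching the declared input fact.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();
    let result = model.run(tvec![input.into()])?;
    println!("{:?}", result[0]);
    Ok(())
}
```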