wasmtime
wasmtime copied to clipboard
[WASI-NN] Add support for a ONNXruntime backend using ort
This change adds an ONNXruntime backend for WASI-NN. Since there is only one backend implemented, this will help to move the standardization of this proposal forward.
Also, the example usage of the ONNXruntime backend shows how to build a component using WASI-NN and ONNXruntime.
@abrown please take a look when you have a minute. Thank you!!
@devigned, I'll take a closer look next week. I will say I've been been changing the ground under your feet here (sorry 😁) with https://github.com/bytecodealliance/wasmtime/pull/7679; that PR will mean we have to move some of your tests over to test-programs
and write Rust tests instead of bash scripts. An overall improvement, just inconvenient timing.
As for the CI failures here: I think we can add a line to skip-tree
in deny.toml
to ignore the version mismatch of the half
dependency: { name = "half", depth = 1 }. As for the
cargo vet` failures, I think we'll have to audit those, maybe even as a separate PR... anything you can do to reduce the number of dependencies coming in will make that easier (drop features?).
This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:
1: error while executing at wasm backtrace:
0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute
1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77
at ....................bindings.rs:11137:15
2: Failed while accessing backend
3: Data type mismatch: was Int64, tried to convert to Float32
As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.
Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.
For some extra context, both models which have produced this error have been classification models (specifcially an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?
This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:
1: error while executing at wasm backtrace: 0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute 1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77 at ....................bindings.rs:11137:15 2: Failed while accessing backend 3: Data type mismatch: was Int64, tried to convert to Float32
As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.
Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.
For some extra context, both models which have produced this error have been classification models (specifcially an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?
Is there any chance you could share code that reproduces this behavior, or create a branch with a failing test reproducing this behavior?
@sunfishcode, I believe I need someone that is a trusted contributor to Wasmtime to approve the cargo vet
dependencies added in this PR (per https://docs.wasmtime.dev/contributing-coding-guidelines.html#dependencies-of-wasmtime). Would you please consider allow listing the failing vet dependencies?
I'm not sure if the cargo deny
result for ittapi-sys
is spurious or if the license is going to be a problem. What would you advise?
Also, if this is not how this stuff should work, please let me know :).
@devigned. I see a bunch of dependencies for the failing vet check:
bytemuck:1.14.2 missing ["safe-to-deploy"]
bytes:1.5.0 missing ["safe-to-deploy"]
castaway:0.2.2 missing ["safe-to-deploy"]
color_quant:1.1.0 missing ["safe-to-deploy"]
compact_str:0.7.1 missing ["safe-to-deploy"]
crunchy:0.2.2 missing ["safe-to-deploy"]
fdeflate:0.3.4 missing ["safe-to-deploy"]
filetime:0.2.16 missing ["safe-to-deploy"]
flate2:1.0.28 missing ["safe-to-deploy"]
half:2.3.1 missing ["safe-to-deploy"]
image:0.24.8 missing ["safe-to-deploy"]
matrixmultiply:0.3.8 missing ["safe-to-deploy"]
ndarray:0.15.6 missing ["safe-to-deploy"]
num-complex:0.4.4 missing ["safe-to-deploy"]
num-integer:0.1.45 missing ["safe-to-deploy"]
ort:2.0.0-rc.0 missing ["safe-to-deploy"]
ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
png:0.17.11 missing ["safe-to-deploy"]
rawpointer:0.2.1 missing ["safe-to-deploy"]
rustversion:1.0.14 missing ["safe-to-deploy"]
simd-adler32:0.3.7 missing ["safe-to-deploy"]
static_assertions:1.1.0 missing ["safe-to-deploy"]
tar:0.4.40 missing ["safe-to-deploy"]
ureq:2.9.1 missing ["safe-to-deploy"]
xattr:1.1.1 missing ["safe-to-deploy"]
I would guess that at least some of these are related to testing? If that is the case, one thing that I suggested in #7807 is to create a test fixture file with all the tensor bytes in the right format. This could remove some of these dependencies, reducing the amount of crates that we will have to audit.
@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on half
and ndarray
which cut the list of dependencies by over half.
12 unvetted dependencies:
bytes:1.5.0 missing ["safe-to-deploy"]
castaway:0.2.2 missing ["safe-to-deploy"]
compact_str:0.7.1 missing ["safe-to-deploy"]
filetime:0.2.16 missing ["safe-to-deploy"]
flate2:1.0.28 missing ["safe-to-deploy"]
ort:2.0.0-rc.0 missing ["safe-to-deploy"]
ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
rustversion:1.0.14 missing ["safe-to-deploy"]
static_assertions:1.1.0 missing ["safe-to-deploy"]
tar:0.4.40 missing ["safe-to-deploy"]
ureq:2.9.1 missing ["safe-to-deploy"]
xattr:1.2.0 missing ["safe-to-deploy"]
WDYT?
@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on
half
andndarray
which cut the list of dependencies by over half.WDYT?
Thanks! Let's talk about this in the Wasmtime meeting tomorrow.
I audited the dependencies this PR would bring in and the only crate I didn't audit was compact_str
: I do not feel comfortable putting a stamp of approval on this crate without some further discussion. The crate is well-designed and has many kinds of tests, including proptests, but the fundamental unsafe
issue is that the crate manually implements what would otherwise be an enum
: if a string is less than 24 bytes, it lives on the stack; otherwise, it lives on the heap. This causes all kinds of unsafe
pointer copies for its special integer representation, unsafe
ways to pass in unchecked UTF-8 strings, several unsafe
implementations of Sync
, Send
, LifetimeFree
, etc. I would note that all of these unsafe instances are carefully documented with "SAFETY:" comments which seemed reasonable, but the large amount of unsafety and how inherent it is to the crate gave me pause. Perhaps someone else can more confidently vouch for this?
Given your review and the fact that I'm seeing extensive fuzzing and miri testing in the repo I think it's safe to put all that into the vet entry. Vetting isn't so much guaranteeing that the crate is correct but moreso that it doesn't intentionally do bad things. You've done an initial review and the author is clearly covering all the bases they can, so I think it's reasonable to say it's well vetted.
@abrown would you like me to add the vet audit entry for compact_str
? Is that all we need to get this moving forward?
@devigned: yeah, go for it!
outstanding. Thanks for your work, @devigned and to the reviewers!
go figure:
thread 'main' panicked at /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ort-sys-2.0.0-rc.0/build.rs:308:22:
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
you may have to compile ONNX Runtime from source
stack backtrace:
0: rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
2: build_script_build::prepare_libort_dir
3: build_script_build::real_main
4: build_script_build::main
5: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with RUST_BACKTRACE=full
for a verbose backtrace.
warning: build failed, waiting for other jobs to finish...
Error: Process completed with exit code 101.
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
That's fine; this just needs a few more target checks.
downloaded binaries not available for target riscv64gc-unknown-linux-gnu
That's fine; this just needs a few more target checks.
I'm on it.
let's go baby
GACK. MingGW!!! crazy amazing
you having fun yet, @devigned ?
Since this isn't intended to work everywhere just yet, it might be best to add a new builder on CI specifically dedicated to testing ONNX rather than trying to get it to work across all the platforms.
If that works for you I'd recommend copying this to start, updating all the bits there (e.g. run on ubuntu, not on windows), and then be sure to add an entry down here to ensure we gate on it in CI.
Also, what I might also recommend, if you add prtest:full
as a string in a commit message it'll run full CI on this PR before going to the merge queue, and that may help avoid the bouncing back and forth between the merge queue and back here.
drinking booze all night now over here, best soap opera I've watched in years
@alexcrichton, in https://github.com/bytecodealliance/wasmtime/pull/7691/commits/2b7a104684f0c45e5969f5ef25b858f8069d69bd I'm reducing the set of triplets in which ONNX will run. I do not plan on pursuing riscv or s390 precompiled onnxruntime bins at this point. Are you good with the matrix-test approach or would you still prefer I break it out into another builder?
Nah if that works for you seems fine, just trying to save some CI headache.