wasmtime [WASI-NN] Add support for a ONNXruntime backend using ort

[WASI-NN] Add support for a ONNXruntime backend using ort

Open devigned opened this issue 8 months ago • 8 comments

This change adds an ONNXruntime backend for WASI-NN. Since there is only one backend implemented, this will help to move the standardization of this proposal forward.

Also, the example usage of the ONNXruntime backend shows how to build a component using WASI-NN and ONNXruntime.

Dec 14 '23 23:12 devigned

@abrown please take a look when you have a minute. Thank you!!

Dec 14 '23 23:12 devigned

@devigned, I'll take a closer look next week. I will say I've been been changing the ground under your feet here (sorry 😁) with https://github.com/bytecodealliance/wasmtime/pull/7679; that PR will mean we have to move some of your tests over to test-programs and write Rust tests instead of bash scripts. An overall improvement, just inconvenient timing.

As for the CI failures here: I think we can add a line to skip-tree in deny.toml to ignore the version mismatch of the half dependency: { name = "half", depth = 1 }. As for the cargo vet` failures, I think we'll have to audit those, maybe even as a separate PR... anything you can do to reduce the number of dependencies coming in will make that easier (drop features?).

Dec 15 '23 22:12 abrown

This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:

1: error while executing at wasm backtrace:
       0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute
       1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77
                       at ....................bindings.rs:11137:15
2: Failed while accessing backend
3: Data type mismatch: was Int64, tried to convert to Float32

As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.

Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.

For some extra context, both models which have produced this error have been classification models (specifcially an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?

Feb 06 '24 10:02 mtobin-tdab

This is great, I have been testing different ONNX models and encountered a data type mismatch error for certain models, specifically at the wasi-nn compute function in bindings.rs. The error indicates a failure to parse ctx: GraphExecutionContext as i32:
1: error while executing at wasm backtrace:
       0: 0x245b42a3 - wit-component:shim!indirect-wasi:nn/inference-compute
       1: 0x698ad - octaioxide::bindings::wasi::nn::inference::compute::h1bf34b4c8ec1bf77
                       at ....................bindings.rs:11137:15
2: Failed while accessing backend
3: Data type mismatch: was Int64, tried to convert to Float32
As far as I can tell the ctx value is 0 when I pass it to compute, I've tried to cast it to f32, but the compiler expects a u32.

Very possible I'm missing something obvious as I am fairly new to both WASM and Rust. But thought I would share.

For some extra context, both models which have produced this error have been classification models (specifcially an xgboost and sklearn's SGDClassifier model) - Are there model types in ONNX format which we would never expect this to be compatible with?

Is there any chance you could share code that reproduces this behavior, or create a branch with a failing test reproducing this behavior?

Feb 08 '24 14:02 devigned

@sunfishcode, I believe I need someone that is a trusted contributor to Wasmtime to approve the cargo vet dependencies added in this PR (per https://docs.wasmtime.dev/contributing-coding-guidelines.html#dependencies-of-wasmtime). Would you please consider allow listing the failing vet dependencies?

I'm not sure if the cargo deny result for ittapi-sys is spurious or if the license is going to be a problem. What would you advise?

Also, if this is not how this stuff should work, please let me know :).

Feb 08 '24 14:02 devigned

@devigned. I see a bunch of dependencies for the failing vet check:

  bytemuck:1.14.2 missing ["safe-to-deploy"]
  bytes:1.5.0 missing ["safe-to-deploy"]
  castaway:0.2.2 missing ["safe-to-deploy"]
  color_quant:1.1.0 missing ["safe-to-deploy"]
  compact_str:0.7.1 missing ["safe-to-deploy"]
  crunchy:0.2.2 missing ["safe-to-deploy"]
  fdeflate:0.3.4 missing ["safe-to-deploy"]
  filetime:0.2.16 missing ["safe-to-deploy"]
  flate2:1.0.28 missing ["safe-to-deploy"]
  half:2.3.1 missing ["safe-to-deploy"]
  image:0.24.8 missing ["safe-to-deploy"]
  matrixmultiply:0.3.8 missing ["safe-to-deploy"]
  ndarray:0.15.6 missing ["safe-to-deploy"]
  num-complex:0.4.4 missing ["safe-to-deploy"]
  num-integer:0.1.45 missing ["safe-to-deploy"]
  ort:2.0.0-rc.0 missing ["safe-to-deploy"]
  ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
  png:0.17.11 missing ["safe-to-deploy"]
  rawpointer:0.2.1 missing ["safe-to-deploy"]
  rustversion:1.0.14 missing ["safe-to-deploy"]
  simd-adler32:0.3.7 missing ["safe-to-deploy"]
  static_assertions:1.1.0 missing ["safe-to-deploy"]
  tar:0.4.40 missing ["safe-to-deploy"]
  ureq:2.9.1 missing ["safe-to-deploy"]
  xattr:1.1.1 missing ["safe-to-deploy"]

I would guess that at least some of these are related to testing? If that is the case, one thing that I suggested in #7807 is to create a test fixture file with all the tensor bytes in the right format. This could remove some of these dependencies, reducing the amount of crates that we will have to audit.

Feb 08 '24 22:02 abrown

@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on half and ndarray which cut the list of dependencies by over half.

12 unvetted dependencies:
  bytes:1.5.0 missing ["safe-to-deploy"]
  castaway:0.2.2 missing ["safe-to-deploy"]
  compact_str:0.7.1 missing ["safe-to-deploy"]
  filetime:0.2.16 missing ["safe-to-deploy"]
  flate2:1.0.28 missing ["safe-to-deploy"]
  ort:2.0.0-rc.0 missing ["safe-to-deploy"]
  ort-sys:2.0.0-rc.0 missing ["safe-to-deploy"]
  rustversion:1.0.14 missing ["safe-to-deploy"]
  static_assertions:1.1.0 missing ["safe-to-deploy"]
  tar:0.4.40 missing ["safe-to-deploy"]
  ureq:2.9.1 missing ["safe-to-deploy"]
  xattr:1.2.0 missing ["safe-to-deploy"]

WDYT?

Feb 13 '24 23:02 devigned

@abrown, I've trimmed down to about as slim as I think I can get the dependencies. I've removed any dependency on half and ndarray which cut the list of dependencies by over half.

WDYT?

Thanks! Let's talk about this in the Wasmtime meeting tomorrow.

Feb 14 '24 17:02 abrown

I audited the dependencies this PR would bring in and the only crate I didn't audit was compact_str: I do not feel comfortable putting a stamp of approval on this crate without some further discussion. The crate is well-designed and has many kinds of tests, including proptests, but the fundamental unsafe issue is that the crate manually implements what would otherwise be an enum: if a string is less than 24 bytes, it lives on the stack; otherwise, it lives on the heap. This causes all kinds of unsafe pointer copies for its special integer representation, unsafe ways to pass in unchecked UTF-8 strings, several unsafe implementations of Sync, Send, LifetimeFree, etc. I would note that all of these unsafe instances are carefully documented with "SAFETY:" comments which seemed reasonable, but the large amount of unsafety and how inherent it is to the crate gave me pause. Perhaps someone else can more confidently vouch for this?

Feb 20 '24 22:02 abrown

Given your review and the fact that I'm seeing extensive fuzzing and miri testing in the repo I think it's safe to put all that into the vet entry. Vetting isn't so much guaranteeing that the crate is correct but moreso that it doesn't intentionally do bad things. You've done an initial review and the author is clearly covering all the bases they can, so I think it's reasonable to say it's well vetted.

Feb 22 '24 18:02 alexcrichton

@abrown would you like me to add the vet audit entry for compact_str? Is that all we need to get this moving forward?

Feb 29 '24 16:02 devigned

@devigned: yeah, go for it!

Mar 04 '24 21:03 abrown

outstanding. Thanks for your work, @devigned and to the reviewers!

Mar 12 '24 14:03 squillace

go figure:

thread 'main' panicked at /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ort-sys-2.0.0-rc.0/build.rs:308:22: downloaded binaries not available for target riscv64gc-unknown-linux-gnu you may have to compile ONNX Runtime from source stack backtrace: 0: rust_begin_unwind at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5 1: core::panicking::panic_fmt at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14 2: build_script_build::prepare_libort_dir 3: build_script_build::real_main 4: build_script_build::main 5: core::ops::function::FnOnce::call_once note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace. warning: build failed, waiting for other jobs to finish... Error: Process completed with exit code 101.

Mar 12 '24 16:03 squillace

downloaded binaries not available for target riscv64gc-unknown-linux-gnu

That's fine; this just needs a few more target checks.

Mar 12 '24 16:03 abrown

downloaded binaries not available for target riscv64gc-unknown-linux-gnu

That's fine; this just needs a few more target checks.

I'm on it.

Mar 12 '24 16:03 devigned

let's go baby

Mar 12 '24 20:03 squillace

GACK. MingGW!!! crazy amazing

Mar 12 '24 20:03 squillace

you having fun yet, @devigned ?

Mar 12 '24 20:03 squillace

Since this isn't intended to work everywhere just yet, it might be best to add a new builder on CI specifically dedicated to testing ONNX rather than trying to get it to work across all the platforms.

If that works for you I'd recommend copying this to start, updating all the bits there (e.g. run on ubuntu, not on windows), and then be sure to add an entry down here to ensure we gate on it in CI.

Also, what I might also recommend, if you add prtest:full as a string in a commit message it'll run full CI on this PR before going to the merge queue, and that may help avoid the bouncing back and forth between the merge queue and back here.

Mar 12 '24 20:03 alexcrichton

drinking booze all night now over here, best soap opera I've watched in years

Mar 12 '24 20:03 squillace

@alexcrichton, in https://github.com/bytecodealliance/wasmtime/pull/7691/commits/2b7a104684f0c45e5969f5ef25b858f8069d69bd I'm reducing the set of triplets in which ONNX will run. I do not plan on pursuing riscv or s390 precompiled onnxruntime bins at this point. Are you good with the matrix-test approach or would you still prefer I break it out into another builder?

Mar 12 '24 21:03 devigned

Nah if that works for you seems fine, just trying to save some CI headache.

Mar 12 '24 21:03 alexcrichton

wasmtime wasmtime copied to clipboard

[WASI-NN] Add support for a ONNXruntime backend using ort

wasmtime
wasmtime copied to clipboard