
Support loading ONNX models directly

Open • robertknight opened this issue 1 year ago • 1 comment

ONNX models currently have to be converted into the FlatBuffers-based .rten format before they can be used.

The .rten format is intended to support efficient loading and to keep the code footprint small (it serves a similar role to ONNX Runtime's ORT format). However, the need to convert models is a barrier to using this library, and it is inconvenient in projects that want to trial or mix different runtimes. It would reduce friction if .onnx models could be loaded directly.

robertknight, May 04 '24 04:05

I had a look at Rust Protocol Buffers runtimes:

  • Prost has the widest adoption, but it has downsides:
    • It allocates Vec<T>s for each repeated field, with an exception for bytes (https://github.com/tokio-rs/prost/pull/449)
    • It requires protoc to be installed at build time if you follow their recommended steps to compile protos as part of a build.rs
  • quick-protobuf has less adoption and hasn't been released in a while, but seems much more aligned with this project's priorities:
    • Minimizes allocations when deserializing
    • Does not require protoc to be installed
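
To make the allocation point concrete, here is a minimal sketch of what "minimizes allocations when deserializing" can look like at the protobuf wire-format level. It is not the API of Prost or quick-protobuf, and it is not code from this project: `FieldReader` and `Value` are hypothetical names made up for illustration. It only handles the top-level fields of a single message, and length-delimited fields (which is how bytes, strings, nested messages and packed repeated fields are encoded) are returned as slices borrowed from the input buffer rather than copied into a new Vec.

```rust
/// A decoded field value. `Bytes` borrows from the input buffer, so no copy
/// or heap allocation is made for length-delimited data.
#[derive(Debug, PartialEq)]
enum Value<'a> {
    Varint(u64),
    Fixed64(u64),
    Bytes(&'a [u8]),
    Fixed32(u32),
}

/// Hypothetical zero-copy iterator over the top-level fields of one message.
struct FieldReader<'a> {
    buf: &'a [u8],
}

impl<'a> FieldReader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        FieldReader { buf }
    }

    /// Read a base-128 varint (at most 10 bytes for a u64).
    fn read_varint(&mut self) -> Option<u64> {
        let mut value = 0u64;
        for i in 0..10 {
            let buf = self.buf;
            let (&byte, rest) = buf.split_first()?;
            self.buf = rest;
            value |= u64::from(byte & 0x7f) << (7 * i);
            if byte & 0x80 == 0 {
                return Some(value);
            }
        }
        None // Varint longer than 10 bytes: malformed input.
    }

    /// Split off the next `n` bytes without copying them.
    fn take_bytes(&mut self, n: u64) -> Option<&'a [u8]> {
        let buf = self.buf;
        if n > buf.len() as u64 {
            return None; // Truncated input.
        }
        let (head, rest) = buf.split_at(n as usize);
        self.buf = rest;
        Some(head)
    }
}

impl<'a> Iterator for FieldReader<'a> {
    /// (field number, value)
    type Item = (u32, Value<'a>);

    /// Returns `None` at end of input and also on malformed input; a real
    /// parser would report those two cases separately.
    fn next(&mut self) -> Option<Self::Item> {
        if self.buf.is_empty() {
            return None;
        }
        // Each field starts with a key: (field_number << 3) | wire_type.
        let key = self.read_varint()?;
        let field = (key >> 3) as u32;
        let value = match key & 7 {
            0 => Value::Varint(self.read_varint()?),
            1 => {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(self.take_bytes(8)?);
                Value::Fixed64(u64::from_le_bytes(raw))
            }
            2 => {
                let len = self.read_varint()?;
                Value::Bytes(self.take_bytes(len)?)
            }
            5 => {
                let mut raw = [0u8; 4];
                raw.copy_from_slice(self.take_bytes(4)?);
                Value::Fixed32(u32::from_le_bytes(raw))
            }
            _ => return None, // Deprecated group wire types are not handled.
        };
        Some((field, value))
    }
}
```

Iterating with `FieldReader::new(&bytes)` yields `(field_number, Value)` pairs, and a nested message can be walked by constructing another `FieldReader` over a `Value::Bytes` slice. For large tensor data, being able to borrow such slices instead of copying them is the main attraction.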

robertknight, May 26 '24 10:05