Support "real" boolean tensors

Open robertknight opened this issue 1 year ago • 1 comments

RTen currently uses i32 tensors everywhere that the original ONNX model used bool tensors. This was originally done to reduce the amount of code generated. This is the same reason that i64 tensors used for indexes in the original model are also converted to i32. Another upside is that model loading becomes cheaper because we don't need to verify that all elements of a bool tensor have a legal bit pattern (all i32 bit patterns are valid, but Rust bool values are bytes that are either 0 or 1).
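To make the load-time cost concrete, here is a minimal sketch of the extra check a real bool tensor would need. The function names are hypothetical and are not taken from RTen's code; the sketch assumes the tensor data arrives as a raw little-endian byte buffer.

```rust
/// Hypothetical helper: returns true if every byte is a valid Rust `bool`
/// bit pattern (0 or 1). This is the per-element check that i32 data avoids.
fn validate_bool_bytes(bytes: &[u8]) -> bool {
    bytes.iter().all(|&b| b <= 1)
}

/// Convert raw bytes to bools, rejecting buffers that contain any byte
/// other than 0 or 1.
fn bytes_to_bools(bytes: &[u8]) -> Option<Vec<bool>> {
    if !validate_bool_bytes(bytes) {
        return None;
    }
    Some(bytes.iter().map(|&b| b != 0).collect())
}

/// Convert raw little-endian bytes to i32 values. Only the buffer length
/// needs checking, since every 4-byte pattern is a valid i32.
fn bytes_to_i32s(bytes: &[u8]) -> Option<Vec<i32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

The bool path has to inspect every element before the data can be used, whereas the i32 path only checks the buffer length.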

There are downsides, however:

  • i32 elements take up 4x more memory than bool elements. This makes e.g. large mask tensors larger.
  • The generated code may be less efficient in some cases (e.g. one can fit more bool than i32 lanes in a SIMD vector)
  • Model inputs that should logically be Tensor<bool> instead need to be passed as Tensor<i32>
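The size and lane-count differences in the first two points can be checked with a little arithmetic. This is only an illustrative sketch: `tensor_bytes` and `lanes` are hypothetical helpers, and the 16-byte vector width is an assumed example (a 128-bit SIMD register).

```rust
use std::mem::size_of;

/// Hypothetical helper: bytes needed to store `len` elements of type `T`.
fn tensor_bytes<T>(len: usize) -> usize {
    len * size_of::<T>()
}

/// Hypothetical helper: how many elements of type `T` fit in a SIMD vector
/// that is `vector_bytes` wide.
fn lanes<T>(vector_bytes: usize) -> usize {
    vector_bytes / size_of::<T>()
}
```

For a mask with 1,000 elements, `tensor_bytes::<bool>(1000)` is 1,000 bytes and `tensor_bytes::<i32>(1000)` is 4,000 bytes. With a 16-byte vector, `lanes::<bool>(16)` is 16 and `lanes::<i32>(16)` is 4.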

The main obstacle to this change is backwards compatibility with existing models. In existing models bool data types may have been converted to i32 in:

  • Constant nodes
  • The target data type for Cast operations

I haven't found a super-pressing need to make this change yet. This issue exists to capture the rationale for the current approach and document issues with making the change.

robertknight avatar Nov 24 '24 10:11 robertknight

From an initial experiment, adding true support for boolean tensors adds about 89KB to the CLI, or 22KB after gzip. Some of that could be reclaimed by de-duplicating operator implementations which only care about value size and not contents.
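As a rough illustration of the de-duplication idea (not the approach actually taken in RTen), an operator that only moves values around could be written once for one-byte elements and reused for bool. `gather_u8` and `gather_bool` are hypothetical names for this sketch.

```rust
/// Hypothetical byte-level operator: select elements of `src` by index.
/// It only cares that each element is one byte wide, not what it means.
fn gather_u8(src: &[u8], indices: &[usize]) -> Vec<u8> {
    indices.iter().map(|&i| src[i]).collect()
}

/// Bool wrapper that reuses the byte-level implementation instead of
/// instantiating a second copy of the operator for `bool`.
fn gather_bool(src: &[bool], indices: &[usize]) -> Vec<bool> {
    // SAFETY: `bool` has size and alignment 1, and every `bool` is a valid
    // `u8` (0 or 1), so viewing the slice as bytes is sound.
    let bytes: &[u8] =
        unsafe { std::slice::from_raw_parts(src.as_ptr().cast::<u8>(), src.len()) };
    // Convert back with a safe comparison rather than a transmute.
    gather_u8(bytes, indices)
        .into_iter()
        .map(|b| b != 0)
        .collect()
}
```

Only `gather_u8` contains the operator logic; the bool entry point is a thin wrapper, which is where the code-size saving would come from under this assumption.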

robertknight avatar May 29 '25 16:05 robertknight