wasm-tools icon indicating copy to clipboard operation
wasm-tools copied to clipboard

wasmparser: `usize` may not be the most appropriate type for offsets

Open nagisa opened this issue 1 year ago • 1 comments

The wasmparser implements an incremental parser for the binary webassembly encoding. In practice this means that tooling can work with files larger the size of the virtual memory. This is not unlike the concept of 32-bit systems still being capable of working with filesystem files larger than 4GiB (i.e. off_t > uint32_t).

While wasmparser does impose a limit on the number of entities within module contents, these limits are not sufficiently low to limit modules to under-4GiB on 32-bit (or 64-kiB on 16-bit) targets. That is, I believe, quite intentional. However it still uses usize to represent the offset while parsing, which seems like a foot gun. We probably should replace the offset with a u64.

This issue has not been written with any particular use-case in mind, so it isn’t something that needs to be fixed ASAP, but it still seems like something we probably should look into fixing when the opportunity presents itself.

nagisa avatar Nov 23 '22 13:11 nagisa

I definitely agree that usize probably isn't the best, and I think that it should switch to using u64. The offsets I think are pretty pervasively used for in-memory indexing though so I think some care will be required to avoid adding lots of as casts all over the place, but it may also not be as bad as I think .

alexcrichton avatar Nov 23 '22 19:11 alexcrichton