simdutf8
SIMD-accelerated UTF-8 validation for Rust.
Currently, only full slices can be validated using the basic API. With a streaming API consisting of `init()`, `update()`, and `finish_validation()` functions, validation could be done on the fly. With the compat...
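To illustrate what such a streaming interface might look like, here is a minimal sketch. It is not simdutf8's API: the `StreamingValidator` type is hypothetical, only the three function names come from the description above, and the scalar `std::str::from_utf8` stands in for the SIMD kernels. The part worth showing is the chunk-boundary handling: a multi-byte code point split across two `update()` calls must be carried over instead of being rejected.

```rust
/// Hypothetical streaming validator (an assumption, not part of simdutf8).
/// Uses `std::str::from_utf8` as a scalar stand-in for the SIMD kernels.
pub struct StreamingValidator {
    pending: [u8; 4],   // bytes of a code point split across chunks
    pending_len: usize, // number of valid bytes in `pending` (0..=3 between calls)
    failed: bool,       // set once invalid UTF-8 has been seen
}

impl StreamingValidator {
    pub fn init() -> Self {
        Self { pending: [0; 4], pending_len: 0, failed: false }
    }

    pub fn update(&mut self, mut chunk: &[u8]) {
        // First, try to complete a code point carried over from the
        // previous chunk, feeding it one byte at a time.
        while !self.failed && self.pending_len > 0 && !chunk.is_empty() {
            self.pending[self.pending_len] = chunk[0];
            self.pending_len += 1;
            chunk = &chunk[1..];
            match std::str::from_utf8(&self.pending[..self.pending_len]) {
                Ok(_) => self.pending_len = 0,           // code point completed
                Err(e) if e.error_len().is_none() => {}  // still incomplete
                Err(_) => self.failed = true,            // definitely invalid
            }
        }
        if self.failed || chunk.is_empty() {
            return;
        }
        // Validate the rest of the chunk in one go.
        if let Err(e) = std::str::from_utf8(chunk) {
            match e.error_len() {
                // An invalid byte sequence inside the chunk.
                Some(_) => self.failed = true,
                // The chunk ends in the middle of a code point: keep the
                // 1 to 3 trailing bytes for the next `update()` call.
                None => {
                    let tail = &chunk[e.valid_up_to()..];
                    self.pending[..tail.len()].copy_from_slice(tail);
                    self.pending_len = tail.len();
                }
            }
        }
    }

    /// Returns `true` if all bytes seen so far form valid, complete UTF-8.
    pub fn finish_validation(self) -> bool {
        !self.failed && self.pending_len == 0
    }
}
```

A caller would create the validator with `StreamingValidator::init()`, pass each incoming buffer to `update()`, and check the result of `finish_validation()` once the input is exhausted. A real implementation would presumably also report the error position, which this sketch omits.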
As part of #56, there is a remaining TODO to integrate with the fuzzer. Based on the README for `rust-fuzz`, x86-64 is required, so we cannot run the fuzzer natively...
Will eventually fix #2. TODO:
- Table lookup should work with two intrinsic calls
- Fastest way to do `is_ascii()`
- `vqshlq_n_u8` + `FPSCR` check?
- Runtime Neon availability detection
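As a rough illustration of the runtime detection item, the sketch below picks a backend using the standard library's feature-detection macros. The `Backend` enum and `detect_backend()` function are hypothetical names introduced for this example and do not reflect how simdutf8 structures its dispatch. Only the macros `std::arch::is_x86_feature_detected!` and `std::arch::is_aarch64_feature_detected!` are real standard-library items.

```rust
/// Hypothetical backend selector (names are assumptions for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Avx2,
    Sse42,
    Neon,
    Scalar,
}

/// Chooses the widest SIMD backend the current CPU reports at runtime,
/// falling back to a scalar implementation otherwise.
pub fn detect_backend() -> Backend {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
        if std::arch::is_x86_feature_detected!("sse4.2") {
            return Backend::Sse42;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return Backend::Neon;
        }
    }
    Backend::Scalar
}
```

Because the check costs a little on every call, a dispatcher would typically run it once and cache the result, for example in a `std::sync::OnceLock` or a function pointer stored in an atomic.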
It appears that [`validate_utf8_basic`](https://docs.rs/simdutf8/0.1.3/src/simdutf8/implementation/x86/mod.rs.html#14) and similar functions trigger [a mislink](https://github.com/rust-lang/rust/issues/81408) on Windows with lld and thinlto. This is not a terribly uncommon combination, so it may be worth exploring alternatives...
- [x] AVX 2
- [x] SSE 4.2
- [x] aarch64 Neon
https://github.com/simdutf/simdutf
- Still some required intrinsics missing.
- Runtime feature detection is required.

WIP draft pull request: #43
Could be implemented reasonably easily, but will only be done if there is real demand.