branchless-utf8
Branchless UTF-8 decoder
In your post, you say: "Adding that !len is actually somewhat costly, though I couldn’t figure out why." My suspicion was that it is because the "!" operator would essentially...
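For readers without the post at hand, here is a minimal sketch of what the `!len` term does. It assumes `len` is 0 for an invalid lead byte and 1 to 4 otherwise. The helper name `utf8_advance` is hypothetical and is not part of the decoder.

```c
/* Hypothetical helper: step past the current sequence.
 * !len is 1 when len == 0 and 0 otherwise, so an invalid lead byte
 * still moves the pointer forward by one byte instead of stalling. */
static const unsigned char *utf8_advance(const unsigned char *s, int len)
{
    return s + len + !len;
}
```

With `len == 0` the pointer moves by one byte. With `len == 3` it moves by three. This sketch only shows the arithmetic and says nothing about why the extra term is costly.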
It's documented behavior, so not strictly a bug. However, you could avoid overreads by computing the offsets from a table; there are only a few.
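One possible reading of that suggestion, sketched below. It assumes the decoder loads four bytes unconditionally and that the sequence length (0 for an invalid lead byte) is already known. The names `utf8_offsets` and `utf8_load4` are hypothetical. Each row clamps the read offsets to the last byte of the sequence, so no load goes past it.

```c
/* Hypothetical offset table: row = sequence length, column = which of
 * the four loads. Offsets past the end of the sequence are clamped to
 * its last byte, so that byte is re-read instead of reading beyond it. */
static const unsigned char utf8_offsets[5][4] = {
    {0, 0, 0, 0},  /* invalid lead byte: touch only s[0] */
    {0, 0, 0, 0},  /* 1-byte sequence */
    {0, 1, 1, 1},  /* 2-byte sequence */
    {0, 1, 2, 2},  /* 3-byte sequence */
    {0, 1, 2, 3},  /* 4-byte sequence */
};

/* Load four bytes without reading beyond s[len - 1]. */
static void utf8_load4(const unsigned char *s, int len, unsigned char out[4])
{
    const unsigned char *off = utf8_offsets[len];
    out[0] = s[off[0]];
    out[1] = s[off[1]];
    out[2] = s[off[2]];
    out[3] = s[off[3]];
}
```

The re-read bytes would still have to be masked out when the code point is assembled. A truncated sequence at the very end of a buffer could still be read past, because the length comes from the lead byte alone. Whether the extra table lookup costs more than the padding requirement it removes is not measured here.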
In case you're interested in a demonstration, I've added one to [my fork](https://github.com/dnbaker/branchless-utf8) which I used to see how to use UTF-8 in C. It does add a zlib dependency...
Hi, I read your http://nullprogram.com/blog/2017/10/06/ - interesting work. For what it's worth, note that at the end of http://bjoern.hoehrmann.de/utf-8/decoder/dfa/#variations there is an improved version of the decoder that saves a...
https://en.m.wikipedia.org/wiki/Restrict GCC's and Clang's `__restrict__` will further let the compiler optimize the code. Can you rerun your tests? It should give at least another 10% performance boost.
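A minimal sketch of how the qualifier is applied, using a hypothetical function `copy_ascii` rather than the decoder itself. `__restrict__` is the GCC/Clang extension spelling (C99 also has the `restrict` keyword). It promises the compiler that `out` and `in` never alias, so stores through `out` need not force `in` to be reloaded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: widen a leading run of ASCII bytes to 32-bit
 * code points. The qualifiers are a promise by the caller that the two
 * buffers do not overlap; breaking that promise is undefined behavior. */
static size_t copy_ascii(uint32_t *__restrict__ out,
                         const unsigned char *__restrict__ in,
                         size_t n)
{
    size_t i = 0;
    while (i < n && in[i] < 0x80) {
        out[i] = in[i];
        i++;
    }
    return i;  /* number of bytes consumed */
}
```

Any speedup depends on the compiler and the surrounding code, so the figure quoted above would need to be confirmed by rerunning the benchmark.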
I found this code referenced inside imgui. I'm not sure why the `lengths` array needs to contain 32 results. The reason is that the...
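For context, a sketch of why such a table ends up with 32 entries, assuming it is indexed by the top five bits of the lead byte (`lead >> 3`), which is how I understand the decoder to work. The names `utf8_lengths` and `utf8_length` are illustrative. Five bits are the minimum needed to tell a valid 4-byte lead (`11110xxx`) apart from an invalid one (`11111xxx`), and 2^5 = 32.

```c
/* Sequence length indexed by the top five bits of the lead byte.
 * 0 marks a byte that cannot start a sequence. */
static const unsigned char utf8_lengths[32] = {
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  /* 0xxxxxxx */
    0, 0, 0, 0, 0, 0, 0, 0,                          /* 10xxxxxx */
    2, 2, 2, 2,                                      /* 110xxxxx */
    3, 3,                                            /* 1110xxxx */
    4,                                               /* 11110xxx */
    0                                                /* 11111xxx */
};

static int utf8_length(unsigned char lead)
{
    return utf8_lengths[lead >> 3];
}
```

A smaller table indexed by four bits could not separate `0xF0` from `0xF8`, so it would need an extra comparison to reject the latter.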