
Faster UTF8 validation

Open jonesmz opened this issue 4 years ago • 1 comments

Just making the project aware of this faster algorithm. https://lemire.me/blog/2020/10/20/ridiculously-fast-unicode-utf-8-validation/

Possible ways to take advantage of this are to provide some kind of hook for user code to supply its own UTF-8 validation, or a compile-time option to specify a UTF-8 validation function as a dependency.
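To make the two options concrete, here is a rough sketch of what such a hook could look like. Everything in it is hypothetical: the `example` namespace, `utf8_validator`, `set_utf8_validator`, `validate_utf8` and the `EXAMPLE_UTF8_VALIDATOR` macro are names made up for illustration and are not part of mqtt_cpp. The default is a plain scalar check for well-formed UTF-8 only (it does not apply any additional MQTT-specific character restrictions), and a faster SIMD implementation could be plugged in through either the macro or the runtime setter. It assumes C++17 for `std::string_view`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

namespace example {  // hypothetical namespace, not mqtt_cpp's

// Hypothetical hook type: any function taking a string_view and returning bool.
using utf8_validator = bool (*)(std::string_view);

// Plain scalar fallback: checks well-formedness only.
inline bool default_utf8_valid(std::string_view s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }  // ASCII fast path
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1Fu; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0Fu; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07u; }
        else return false;                 // stray continuation or invalid lead byte
        if (n - i < len) return false;     // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3Fu);
        }
        if (len == 2 && cp < 0x80) return false;         // overlong encoding
        if (len == 3 && cp < 0x800) return false;        // overlong encoding
        if (len == 4 && cp < 0x10000) return false;      // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += len;
    }
    return true;
}

// Compile-time option (hypothetical macro): define EXAMPLE_UTF8_VALIDATOR to the
// name of a function declared before this point to change the default.
#ifndef EXAMPLE_UTF8_VALIDATOR
#define EXAMPLE_UTF8_VALIDATOR default_utf8_valid
#endif

// Storage for the currently installed validator.
inline utf8_validator& validator_slot() {
    static utf8_validator current = &EXAMPLE_UTF8_VALIDATOR;
    return current;
}

// Runtime hook: user code installs its own (for example SIMD-based) validator.
inline void set_utf8_validator(utf8_validator v) {
    assert(v != nullptr && "validator must not be null");
    validator_slot() = v;
}

// What the library would call internally instead of a hard-coded check.
inline bool validate_utf8(std::string_view s) {
    return validator_slot()(s);
}

}  // namespace example
```

With something like this, user code would call `example::set_utf8_validator(&my_fast_validate)` once at startup, where `my_fast_validate` is a user-supplied wrapper around whichever optimized implementation they prefer. A function pointer keeps the hook cheap; a template parameter or policy class would be an alternative if the indirect call matters.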

jonesmz avatar Oct 22 '20 01:10 jonesmz

It seems to use vectorized (SIMD) instructions. I would say it goes a bit too far to add this kind of optimization: UTF-8 validation is only a tiny fraction of the whole workload, so I am not sure optimizing it would give you any noticeable performance gain. I do wonder why Boost.Locale does not have a validation function that selects an optimized version based on CPU architecture.

kleunen avatar Oct 23 '20 09:10 kleunen