Daniel Lemire
Something seems wrong with the 'from-utf8' code, performance-wise. I may have messed up somewhere.
@clausecker I am currently trying to figure out whether I made a mistake somewhere. The code is correct, but it seems that the performance is disappointing. I am not sure...
> For adding my own changes, I suppose I just make a PR against the avx512feature branch, adding my code to the src/avx512vbmi directory?

That would be great. Do not...
I am going to try to figure out the performance numbers for the 'from-utf8' code. The performance should be better than what I am seeing right now.
> For adding my own changes, I suppose I just make a PR against the avx512feature branch, adding my code to the src/avx512vbmi directory?

Actually... no... please start from the...
Note: the assembly output is not what I expect. Investigating.
Here are dirty benchmarks... I am using an icelake server (AWS) with GCC 11. First transcoding... ``` $ for i in unicode_lipsum/lipsum/*.utf8.txt ; do ./build/benchmarks/benchmark -P convert_utf8_to_utf16+haswell -P convert_utf8_to_utf16+icelake -F...
The new numbers.... (they are better) ``` $ for i in unicode_lipsum/lipsum/*.utf8.txt ; do ./build/benchmarks/benchmark -P convert_utf8_to_utf16+haswell -P convert_utf8_to_utf16+icelake -F $i -I 10000; done We define the number of bytes...
Part of the icelake kernel is 'buggy'. I will patch it up soon.
The silly icelake bug was fixed. (Resulting code is suboptimal.)