Daniel Lemire

Results: 1863 comments by Daniel Lemire

@easyaspi314 Sure!!! Saving 2 kB if it is performance neutral would be huge. Can you try it out? I recommend you run benchmarks too.

We care a lot about ARM.
> Well an initial test on an ARM Cortex-X1 (yes it is my phone) + Clang 16 shows about 5% overhead and probably can...

> is definitely worth a 5% perf loss

Glancing at the code, I am not sure that this should cause a 5% perf loss. It is fairly difficult to measure...

Here are my results (compare with above) on this PR.... Before...
```
./build/benchmarks/benchmark -P convert_utf8_to_utf16+westmere -F unicode_lipsum/lipsum/*.utf8.txt | grep GB
2.252 GB/s (3.6 %) 1.262 Gc/s 1.78 byte/char
3.196 GB/s...
```

My concern is that according to my naive view, your PR should be performance neutral... but it seems that it is not. I see a measurable impact (up to 10%)....

Intriguing. Is there some kind of specification for such outputs?

@victor1234 Do you know why it fails under Visual Studio?

Yes, it looks good: https://docs.oracle.com/en/java/javase/17/docs/api/jdk.incubator.vector/jdk/incubator/vector/ByteVector.html It should be just one short function.

Yes, a pull request to provide this functionality in C is invited.