simd-json
Documentation
This really needs some documentation love ...
We've done quite a bit on this, I'd love to get some feedback of what's missing from users and not just us.
It would be nice if the documentation stated clearly whether I must use exactly target-cpu=native and whether that is the only target-cpu value supported. For example, I would like to build my app for different CPUs, but it is not clear whether I can specify target-cpu=znver2 while building on cascadelake and have it correctly disable AVX-512 instructions.
@Licenser can you confirm what I asked above? Is target-cpu=native required or can I use something like target-cpu=znver2? I can then create a PR for improving the documentation accordingly.
Heya, Sorry for the late reply. I'm terrible with GitHub notifications :sweat_smile:
target-cpu=native means that Rust figures out the most advanced instruction set that the build machine's CPU supports. This is nice if you know that you're going to run the code on exactly this machine, as it will squeeze out as many optimizations as possible. It is also risky, since even minor differences between CPUs can mean that the program won't run anymore. For example, we can't use caching in CI in combination with target-cpu=native, as the GitHub Actions hosts are already different enough that code compiled with native does not work across them.
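To make the compile-time nature of these flags concrete, here is a minimal sketch (not taken from simd-json) that reports which x86 SIMD features a binary was built with. The function name compiled_simd_features is made up for illustration. cfg!(target_feature = "...") is evaluated at compile time, so the result depends on the -C target-cpu or -C target-feature flags used for the build, not on the machine that runs the binary.

```rust
/// Lists the SIMD features that were enabled when this binary was compiled.
/// Hypothetical helper for illustration; only a handful of features are checked.
fn compiled_simd_features() -> Vec<&'static str> {
    let mut features = Vec::new();
    // Each cfg! below is resolved by the compiler, not at runtime.
    if cfg!(target_feature = "sse4.2") {
        features.push("sse4.2");
    }
    if cfg!(target_feature = "avx") {
        features.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        features.push("avx2");
    }
    if cfg!(target_feature = "avx512f") {
        features.push("avx512f");
    }
    features
}

fn main() {
    println!("compiled with: {:?}", compiled_simd_features());
}
```

Building this once with no extra flags and once with RUSTFLAGS="-C target-cpu=native" should typically print different lists on a recent x86_64 machine, which is the same difference that makes cached native builds unsafe to share between hosts.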
target-feature=... is the alternative for portable code, but it reduces the optimizations Rust can make. For example, +avx,+avx2,+sse4.2 gives a good set that produces reasonable results for tremor. That will work on any x86_64 CPU that supports the given instructions, but it's possible to limit it to, say, +sse4.2 and get a lower-performing but more portable build. I'm sure there are better combinations of flags (or additional ones) that could be combined, but this is what "worked for us".
Those things are generally applicable to Rust programs, as the compiler performs quite a few optimizations under the hood if allowed.
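As a rough way to check whether a given machine can run a build made with the +avx,+avx2,+sse4.2 set mentioned above, a small probe program can query the CPU at runtime. This is only a sketch: missing_features is a hypothetical helper, and the probe is assumed to be compiled without those target features enabled, since is_x86_feature_detected! reports a feature as present whenever it was already enabled at compile time.

```rust
/// Returns the features from the example set that the running CPU lacks.
/// Hypothetical helper; meant for a probe binary built with default flags.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn missing_features() -> Vec<&'static str> {
    let mut missing = Vec::new();
    // The macro needs a string literal, so each feature is checked explicitly.
    if !is_x86_feature_detected!("sse4.2") {
        missing.push("sse4.2");
    }
    if !is_x86_feature_detected!("avx") {
        missing.push("avx");
    }
    if !is_x86_feature_detected!("avx2") {
        missing.push("avx2");
    }
    missing
}

/// On non-x86 targets none of these features exist, so all are reported missing.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn missing_features() -> Vec<&'static str> {
    vec!["sse4.2", "avx", "avx2"]
}

fn main() {
    let missing = missing_features();
    if missing.is_empty() {
        println!("CPU supports sse4.2, avx and avx2");
    } else {
        eprintln!("CPU lacks: {:?}", missing);
    }
}
```

If the list is non-empty on a deployment host, a build restricted to a smaller set such as +sse4.2 is likely the safer choice for that machine.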
I hope that answers your question, even if very belatedly o.O
Not entirely, because I'm especially interested in whether using target-cpu=znver2 would work as expected and enable optimizations for AMD's Zen 2 architecture. Based on my experiments so far, it seems to do the trick, but maybe you have better knowledge. SIMD instructions tend to be a bit of a special case.
My use case is that I want to build code that is optimized for Zen 2 based processors, but I'm building the software in an Intel Cascade Lake based environment. It seems that target-cpu=znver2 does the trick, but I'm not sure if I'm missing something.
To my knowledge, that should include all instruction sets for Zen 2 that the compiler understands, but I've never cross-compiled between architectures. But yes, AFAIK what you describe and aim for is what will happen.
Thanks. I'll write something in the docs to help to clarify this and send a PR once I'm done.
I think the documentation has improved enough to close this