Factor 2.5 slower than Julia in microbenchmark
There was some discussion of a microbenchmark on HN yesterday, and I ran it against pythran. Compared to the fully optimized Julia version (which uses LoopVectorization.jl), pythran is about a factor of 2.5 slower (it performs approximately the same as the parallel Julia version). It would be nice to figure out whether it's possible to get that speed back. I've created a repository for the comparison.
It seems much of the speedup comes from the fact that the Julia code optimises np.exp(1j*x) to

    xs, xc = sincos(x)
    y = xc + 1j*xs

because x is real. I thought this would be something the compiler could figure out itself (is there an easy way of looking at the assembler generated by pythran?). Could this somehow be done by pythran?
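To make the rewrite concrete, here is a minimal sketch of the two formulations as pythran-exported kernels (the function names and signatures are just for illustration, not taken from the benchmark):

```python
import numpy as np

#pythran export cis_exp(float64[])
def cis_exp(x):
    # Direct formulation: complex exponential of a real array.
    return np.exp(1j * x)

#pythran export cis_sincos(float64[])
def cis_sincos(x):
    # Equivalent for real x via Euler's formula:
    # exp(1j*x) == cos(x) + 1j*sin(x), i.e. two real
    # transcendental evaluations instead of a complex exp.
    return np.cos(x) + 1j * np.sin(x)
```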
Also, I've tried compiling with and without -DUSE_XSIMD, and I don't see any speed difference, which I find surprising.
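In case it helps reproduce this, here is a minimal sketch of how I compared the two builds side by side; the file names kernel_plain.py / kernel_xsimd.py and the exported function run are hypothetical placeholders for the actual kernel:

```python
# Build the same kernel twice under different names, e.g.:
#   cp kernel.py kernel_plain.py && pythran -O3 -march=native kernel_plain.py
#   cp kernel.py kernel_xsimd.py && pythran -O3 -march=native -DUSE_XSIMD kernel_xsimd.py
import timeit

import numpy as np

import kernel_plain  # built without -DUSE_XSIMD
import kernel_xsimd  # built with -DUSE_XSIMD

x = np.random.rand(1_000_000)  # arbitrary input size for the comparison

for name, mod in [("plain", kernel_plain), ("xsimd", kernel_xsimd)]:
    per_call = timeit.timeit(lambda: mod.run(x), number=100) / 100
    print(f"{name}: {per_call * 1e3:.3f} ms per call")
```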