Jochen Schröder

39 comments of Jochen Schröder

I can further simplify the problem to the following function:

```python
#pythran export index1(float64[:,:,3], bool[:,:])
def index1(x, idx):
    return x[idx]
```

called with

```python
[ins] In [16]: x = np.random.randn(10*10*3).reshape(10, 10, 3)
[ins] In...
```
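
For reference, here is a minimal plain-NumPy reproduction of what that export is meant to compute (the shapes are simply the ones from the truncated call above; the mask is mine):

```python
import numpy as np

# Plain-NumPy reference for the behaviour the Pythran export should match.
x = np.random.randn(10 * 10 * 3).reshape(10, 10, 3)   # float64[:,:,3]
idx = x[:, :, 0] > 0                                  # bool[:,:], illustrative mask

# Boolean indexing over the first two axes selects length-3 rows,
# so the result has shape (idx.sum(), 3).
out = x[idx]
assert out.shape == (idx.sum(), 3)
```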

I suspect that this is the same error as #1664. I can simplify to:

```python
#pythran export draw_line(int64[][], int64[])
def draw_line(grid, x):
    grid[:, x] = 1
    return grid
```

which results...
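
For comparison, the same operation behaves as expected in plain NumPy; a small self-contained check (the grid size and indices here are only illustrative):

```python
import numpy as np

# Plain-NumPy reference for draw_line: set whole columns to 1.
grid = np.zeros((4, 4), dtype=np.int64)   # int64[][]
x = np.array([0, 2], dtype=np.int64)      # int64[]

grid[:, x] = 1   # the slice-plus-fancy-index assignment that trips Pythran
print(grid)
# [[1 0 1 0]
#  [1 0 1 0]
#  [1 0 1 0]
#  [1 0 1 0]]
```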

Are you sure it is SIMD instructions? At least removing `@simd` from the Julia loop does not make a difference (does Julia use SIMD instructions on broadcasting?). Similarly, compiling...

Same here:

```
shape = (1024, 3)
compute     : 1 * norm
norm = 0.0029 s
compute_opt : 1 * norm
compute2    : 6.85 * norm
compute3    : 2.41 * norm...
```
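
For context, a rough sketch of how such "x * norm" numbers could be produced; this is only an assumption about the harness, and the body of `compute` below is a stand-in, not the actual benchmarked kernel:

```python
import numpy as np
from timeit import timeit

x = np.random.randn(1024, 3)

def compute(x):
    # Stand-in kernel; the real variants differ in implementation only.
    return np.sqrt((x ** 2).sum(axis=1))

variants = {"compute": compute}

# Time the baseline, then report each variant relative to it.
norm = timeit(lambda: compute(x), number=100) / 100
for name, fn in variants.items():
    t = timeit(lambda: fn(x), number=100) / 100
    print(f"{name:12s}: {t / norm:.2f} * norm")
print(f"norm = {norm:.4f} s")
```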

@serge-sans-paille yes, `-mfma` did not make a difference (neither with gcc nor clang). Could it be an alignment issue, so that SIMD instructions are not used? Is there a way...
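
One way to probe the alignment hypothesis from the Python side is to inspect the base address of the array buffer; a sketch using standard NumPy attributes:

```python
import numpy as np

x = np.random.randn(1024, 3)

# Address of the data buffer; AVX loads prefer 32-byte alignment
# (AVX-512 prefers 64-byte alignment).
addr = x.ctypes.data
print("16-byte aligned:", addr % 16 == 0)
print("32-byte aligned:", addr % 32 == 0)

# Note: the ALIGNED flag only guarantees alignment to the dtype's
# itemsize (8 bytes for float64), not to a SIMD vector width.
print(x.flags)
```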

Looking at the discussion among the Julia folks about optimising this [here](https://discourse.julialang.org/t/nbabel-nbody-integrator-speed-up/51712/22), should the arrays be transposed for better SIMD access? They discuss getting significant benefit from laying...
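
To make the layout question concrete, here is a small NumPy sketch of the two layouts being discussed (the variable names are mine, not from the benchmark):

```python
import numpy as np

n = 1024
# Array-of-structs layout: the components of one particle are adjacent,
# so a vectorised loop over particles has to stride through memory.
pos_aos = np.random.randn(n, 3)

# Struct-of-arrays layout: each component is one contiguous vector,
# which is what the linked Julia thread exploits for SIMD.
pos_soa = np.ascontiguousarray(pos_aos.T)   # shape (3, n)

x, y, z = pos_soa                           # contiguous, unit-stride views
r = np.sqrt(x * x + y * y + z * z)          # operates on unit-stride data
```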

This is my laptop, so slightly different numbers, but I still don't see a difference with and without `-mfma`:

```
transonic microbench_pythran.py -af "-march=native -Ofast"
1 files created or updated...
```

@paugier sorry, yes, I forgot.

```
julia microbench_ju_tuple.jl
With Tuple{Float64, Float64, Float64}: 2.862 ms (0 allocations: 0 bytes)
With Tuple{Float64, Float64, Float64, Float64}: 2.644 ms (0 allocations: 0 bytes)
```

So...

As a follow-up: on my home desktop (Ryzen 3600), compared to my laptop (i7), Pythran fares slightly better, but there is also no difference with and without `-mfma`.

```
julia microbench_ju_tuple.jl
With...
```