YaoBlocks.jl

Weird Performance of SMatrix and Matrix on instruct!

Open Roger-luo opened this issue 6 years ago • 2 comments

MWE:

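using YaoArrayRegister

function instruct2!(state, U, loc)
    # Column-major destructuring, so U == [a b; c d]
    a, c, b, d = U
    step = 1 << (loc - 1)   # distance between the two paired rows
    step_2 = 1 << loc       # stride between consecutive blocks
    for j in 0:step_2:size(state, 1)-step
        @inbounds for i in j+1:j+step
            YaoArrayRegister.u1rows!(state, i, i+step, a, b, c, d)
        end
    end
    return state
end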

On my machine with Julia 1.1, instruct! performs quite differently with an SMatrix than with a Matrix; unexpectedly, the SMatrix version is even slower. This makes the current QCBM circuit slower than before.
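For readers without YaoArrayRegister installed, the index arithmetic of the loop in the MWE above can be checked with a self-contained sketch. It assumes that u1rows! applies the 2x2 matrix [a b; c d] in place to the amplitude pair at rows i and j; the names u1rows_sketch!, instruct_sketch! and dense_reference are hypothetical stand-ins, not part of the library. The result is compared against a dense Kronecker-product operator, taking qubit 1 as the fastest-varying index.

```julia
using LinearAlgebra

# Hypothetical stand-in for YaoArrayRegister.u1rows!: ASSUMED to apply the
# 2x2 matrix [a b; c d] to the amplitude pair (state[i], state[j]) in place.
@inline function u1rows_sketch!(state::AbstractVector, i::Int, j::Int, a, b, c, d)
    w, v = state[i], state[j]
    state[i] = a * w + b * v
    state[j] = c * w + d * v
    return state
end

# Same loop structure as instruct2!, using the stand-in above.
function instruct_sketch!(state::AbstractVector, U::AbstractMatrix, loc::Int)
    a, c, b, d = U               # column-major iteration: U == [a b; c d]
    step = 1 << (loc - 1)        # distance between the two paired amplitudes
    step_2 = 1 << loc            # stride between consecutive blocks
    for j in 0:step_2:length(state)-step
        @inbounds for i in j+1:j+step
            u1rows_sketch!(state, i, i + step, a, b, c, d)
        end
    end
    return state
end

# Dense reference operator I ⊗ U ⊗ I; in Julia's kron the right-most factor
# is the fastest-varying index, so qubit 1 sits on the right.
function dense_reference(U::AbstractMatrix, loc::Int, nqubits::Int)
    nleft = 1 << (nqubits - loc)
    nright = 1 << (loc - 1)
    left = Matrix{ComplexF64}(I, nleft, nleft)
    right = Matrix{ComplexF64}(I, nright, nright)
    return kron(kron(left, U), right)
end

nqubits, loc = 4, 2
U = rand(ComplexF64, 2, 2)
st = rand(ComplexF64, 1 << nqubits)
expected = dense_reference(U, loc, nqubits) * st
result = instruct_sketch!(copy(st), U, loc)
println(result ≈ expected)
```

The sketch only checks correctness of the pairing; it says nothing about why SMatrix and Matrix would differ in speed.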

Roger-luo, Apr 11 '19 09:04

The difference becomes more obvious when you pack several instruct! calls together, e.g. using chained put blocks.

using StaticArrays, BenchmarkTools
U = @SMatrix rand(ComplexF64, 2, 2)
st = rand(ComplexF64, 1<<20)
julia> @benchmark foreach(k->instruct2!($st, $U, 2), 1:100)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     180.581 ms (0.00% GC)
  median time:      186.595 ms (0.00% GC)
  mean time:        186.496 ms (0.00% GC)
  maximum time:     193.396 ms (0.00% GC)
  --------------
  samples:          27
  evals/sample:     1

julia> @benchmark foreach(k->instruct2!($st, $(Matrix(U)), 2), 1:100)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     169.435 ms (0.00% GC)
  median time:      187.297 ms (0.00% GC)
  mean time:        188.134 ms (0.00% GC)
  maximum time:     208.341 ms (0.00% GC)
  --------------
  samples:          27
  evals/sample:     1

Roger-luo, Apr 11 '19 10:04

But SMatrix should be faster:

julia> @benchmark foreach(k->instruct3!($st, $U, 2), 1:100)
BenchmarkTools.Trial:
  memory estimate:  80 bytes
  allocs estimate:  1
  --------------
  minimum time:     10.900 ns (0.00% GC)
  median time:      13.812 ns (0.00% GC)
  mean time:        24.196 ns (38.99% GC)
  maximum time:     50.778 μs (99.92% GC)
  --------------
  samples:          10000
  evals/sample:     998

julia> @benchmark foreach(k->instruct3!($st, $(Matrix(U)), 2), 1:100)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     63.906 ns (0.00% GC)
  median time:      67.717 ns (0.00% GC)
  mean time:        71.838 ns (0.00% GC)
  maximum time:     212.529 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     976

Roger-luo, Apr 11 '19 10:04