StaticArrays.jl icon indicating copy to clipboard operation
StaticArrays.jl copied to clipboard

Slow in-place broadcasted assignment (.=) of mutable FieldVectors

Open stillyslalom opened this issue 5 years ago • 8 comments

As noted on Discourse (https://discourse.julialang.org/t/julia-style-arrays-matrix-vs-structs/32424/10?u=stillyslalom), broadcasted assignment seems to be many times slower for mutable FieldVectors than for MVectors.

using StaticArrays

struct Point3D{T} <: FieldVector{3, T}
    x::T
    y::T
    z::T
end

mutable struct MPoint3D{T} <: FieldVector{3, T}
    x::T
    y::T
    z::T
end

function f1!(v)
    for i in eachindex(v)
        x, y, z = v[i]
        v[i] = Point3D(2x, x*y, x*z)
    end
end

function f2!(v)
    for i in eachindex(v)
        x, y, z = v[i]
        v[i] .= (2x, x*y, x*z)
    end
end

using BenchmarkTools
v1 = [@SVector(rand(3)) for i = 1:1000]
v2 = [@MVector(rand(3)) for i = 1:1000]
v3 = [Point3D(rand(3))  for i = 1:1000]
v4 = [MPoint3D(rand(3)) for i = 1:1000]

@btime f1!($v1) # 853.985 ns (0 allocations: 0 bytes)
@btime f2!($v2) # 1.775 μs (0 allocations: 0 bytes)

@btime f1!($v3) # 852.917 ns (0 allocations: 0 bytes)
@btime f2!($v4) # 60.091 μs (3000 allocations: 46.88 KiB)

stillyslalom avatar Dec 18 '19 17:12 stillyslalom

How is it for a normal MVector?

KristofferC avatar Dec 18 '19 17:12 KristofferC

That's v2 in the benchmark above:

@btime f2!($v2) # 1.775 μs (0 allocations: 0 bytes)

stillyslalom avatar Dec 18 '19 17:12 stillyslalom

Ah, sorry, missed that.

I think it would be interesting to also add this style of assignment to see how that compares with broadcasting for the different types, https://discourse.julialang.org/t/julia-style-arrays-matrix-vs-structs/32424/10?u=kristoffer.carlsson.

KristofferC avatar Dec 18 '19 17:12 KristofferC

Your field-assignment function is still faster than broadcasting over MVectors, though:

function f3!(v)
   for vv in v
       x, y, z = vv
       vv.x = 2x
       vv.y = x*y
       vv.z = x*z
   end
end

@btime f3!($v4) # 1.158 μs (0 allocations: 0 bytes)

stillyslalom avatar Dec 18 '19 17:12 stillyslalom

As far as I can tell indices used in broadcasting over a FieldVector are not constant-propagated to setfield! somehow. We can fix that by passing the index in type of an argument using Val but it's quite ugly. Do you have a better idea?

mateuszbaran avatar Dec 18 '19 18:12 mateuszbaran

mutable struct MPoint3D{T} <: FieldVector{3, T}
    x::T
    y::T
    z::T
end

It's true that this is a performance bug, but I'm also tempted to say "don't do that" :-) Even though we do allow mutable field vectors I personally can't think of a use for them in numerical work. In practice I find that the immutable versions encourage a more functional style in geometric computation which is much more naturally mathematical, far more memory efficient when stored in a large array, and faster.

@mateuszbaran thanks for investigating. Did you have a more minimal piece of code you were looking at in coming to the conclusion about setfield!?

c42f avatar Dec 19 '19 08:12 c42f

@mateuszbaran thanks for investigating. Did you have a more minimal piece of code you were looking at in coming to the conclusion about setfield!?

I've inspected the code generated for the problematic case and compared it with this:

julia> using StaticArrays

julia> mutable struct MPoint3D{T} <: FieldVector{3, T}
           x::T
           y::T
           z::T
       end

julia> f(a, i) = a[i] = 100
f (generic function with 1 method)

julia> a = MPoint3D(1,2,3)
3-element MPoint3D{Int64} with indices SOneTo(3):
 1
 2
 3

julia> @code_native f(a, 2)
	.text
; ┌ @ REPL[3]:1 within `f'
	pushq	%r14
	pushq	%rbx
	subq	$56, %rsp
	vxorps	%xmm0, %xmm0, %xmm0
	vmovaps	%xmm0, (%rsp)
	movq	$0, 16(%rsp)
	movq	%fs:0, %rbx
; │┌ @ FieldArray.jl:107 within `setindex!'
	movq	$2, (%rsp)
	movq	-15712(%rbx), %rax
	movq	%rax, 8(%rsp)
	movq	%rdi, %r14
	movq	%rsp, %rax
	movq	%rax, -15712(%rbx)
	movabsq	$jl_box_int64, %rax
	movq	%rsi, %rdi
	callq	*%rax
	movq	%rax, 16(%rsp)
	movq	%r14, 32(%rsp)
	movq	%rax, 40(%rsp)
	movabsq	$139887313344800, %rax  # imm = 0x7F3A0D9ED920
	movq	%rax, 48(%rsp)
	movabsq	$jl_f_setfield, %rax
	leaq	32(%rsp), %rsi
	xorl	%edi, %edi
	movl	$3, %edx
	callq	*%rax
	movq	8(%rsp), %rax
	movq	%rax, -15712(%rbx)
; │└
	movl	$100, %eax
	addq	$56, %rsp
	popq	%rbx
	popq	%r14
	retq
	nopw	%cs:(%rax,%rax)
; └

Although:

julia> g(a) = f(a, 2)
g (generic function with 1 method)

julia> @code_native g(a)
	.text
; ┌ @ REPL[6]:1 within `g'
; │┌ @ REPL[3]:1 within `f'
; ││┌ @ REPL[6]:1 within `setindex!'
	movq	$100, 8(%rdi)
; │└└
	movl	$100, %eax
	retq
	nop
; └

so the constant propagation sometimes works and sometimes it doesn't.

mateuszbaran avatar Dec 19 '19 08:12 mateuszbaran

It's true that this is a performance bug, but I'm also tempted to say "don't do that" :-) Even though we do allow mutable field vectors I personally can't think of a use for them in numerical work. In practice I find that the immutable versions encourage a more functional style in geometric computation which is much more naturally mathematical, far more memory efficient when stored in a large array, and faster.

@c42f Solving PDEs with moving boundaries is a use case for me.

kylebeggs avatar Aug 12 '22 14:08 kylebeggs