
ScopedValue is allocating when accessed

Open sgaure opened this issue 1 year ago • 7 comments

I have looked into using scoped values for some temporary arrays to avoid allocations in parallel tasks. However, it seems that scoped values allocate when accessed, whereas with task-local storage (TLS) this can be avoided. This is unfortunate, since GC in parallel tasks can be a performance problem.

using Base.Threads
using Base.ScopedValues  # provides ScopedValue and @with (Julia 1.11+)
using BenchmarkTools

@noinline function tlsfun()
    tlsvec = get!(() -> [0], task_local_storage(), :myvec)::Vector{Int}
    tlsvec[1] += 1
    return nothing
end


const dynvec = ScopedValue([0])

@noinline function dynfun()
    dvec = dynvec[]
    dvec[1] += 1
    return nothing
end


function tlsrun()
    @sync for _ in 1:nthreads()
        @spawn for _ in 1:100000; tlsfun(); end
    end
end


function dynrun()
    @sync for _ in 1:nthreads()
        @with dynvec=>[0] @spawn for _ in 1:100000; dynfun(); end
    end
end


@btime tlsrun()
@btime dynrun()
versioninfo()

output:

  2.326 ms (202 allocations: 21.03 KiB)
  8.238 ms (2400274 allocations: 36.64 MiB)

Julia Version 1.12.0-DEV.121
Commit bc2212cc0e* (2024-03-04 01:20 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 24 default, 0 interactive, 12 GC (on 24 virtual cores)
Environment:
  JULIA_EDITOR = emacs -nw
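
Since the cost shows up on each `dynvec[]` access, one possible mitigation is to read the scoped value once per task and pass the resulting vector into the hot loop. The following is only a sketch under that assumption, not a fix for the underlying allocation; the names `dynvec_hoisted`, `bump!`, and `dynrun_hoisted` are hypothetical, and `using Base.ScopedValues` assumes Julia 1.11 or later.

```julia
using Base.Threads
using Base.ScopedValues  # ScopedValue and @with (assumes Julia 1.11+)

# Hypothetical name, mirroring `dynvec` from the report above.
const dynvec_hoisted = ScopedValue([0])

# The hot loop receives the vector as an argument, so it never
# touches the scoped value itself.
function bump!(v::Vector{Int}, n::Int)
    for _ in 1:n
        v[1] += 1
    end
    return v
end

function dynrun_hoisted(n::Int = 100_000)
    results = Vector{Int}(undef, nthreads())
    @sync for i in 1:nthreads()
        @with dynvec_hoisted => [0] @spawn begin
            v = dynvec_hoisted[]   # single scoped-value access per task
            bump!(v, n)
            results[i] = v[1]      # record the final counter for inspection
        end
    end
    return results
end
```

This trades the convenience of reading the scoped value anywhere in the call stack for an explicit argument, so it only helps where the hot loop's signature can be changed.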

sgaure avatar Mar 04 '24 09:03 sgaure

This is probably caused by 5b2fcb68800 and is not multi-threading related:

julia> using Base.ScopedValues  # needed for ScopedValue and @with on Julia 1.11+
julia> const dynvec = ScopedValue([0])
julia> @noinline function dynfun()
           dvec = dynvec[]
           dvec[1] += 1
           return nothing
       end
julia> foo() = @with dynvec=>[0] for _ in 1:1000_000; dynfun(); end

julia> @allocated foo()
16000208

The problem is this @noinline, which forces a wrapping Tuple{Vector{Int}} to be allocated as a temporary even though it is immediately unwrapped at every call site: https://github.com/JuliaLang/julia/blob/58291db09d18f59223edbdc15592ffcf0eb3dcfa/base/dict.jl#L1004
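
The pattern described here can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the actual code in `base/dict.jl`: `lookup_wrapped` and `sum_first` are made-up names, and whether the measured call allocates depends on the Julia version and compiler, so no expected byte count is given.

```julia
# Hypothetical stand-in for a lookup that returns either `nothing`
# (not found) or the value wrapped in a 1-tuple.
@noinline function lookup_wrapped(v::Vector{Int}, found::Bool)
    return found ? (v,) : nothing
end

function sum_first(v::Vector{Int}, n::Int)
    s = 0
    for _ in 1:n
        r = lookup_wrapped(v, true)
        r === nothing && continue
        s += r[1][1]   # unwrap the tuple right at the call site
    end
    return s
end

v = [1]
sum_first(v, 1)        # warm up so compilation is not measured
bytes = @allocated sum_first(v, 1_000_000)
println("bytes allocated: ", bytes)
```

Removing `@noinline` from `lookup_wrapped` and measuring again is one way to check whether the wrapper tuple is what is being allocated.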

topolarity avatar Mar 05 '24 15:03 topolarity

So is the problem that the API wrongly returns a tuple (leaf.val,) instead of Some{V}(leaf.val)?

vtjnash avatar Mar 05 '24 15:03 vtjnash

Would Some{V}(leaf.val) bypass the need to allocate the temporary here?

topolarity avatar Mar 05 '24 15:03 topolarity

I guess not. Apparently we do not have calling convention support for Union{Struct, Ghost}, even though we very easily could (we have many variations on it already) and probably should (it is the iteration protocol)

vtjnash avatar Mar 05 '24 15:03 vtjnash

Any chance of progress on this?

StefanKarpinski avatar Jun 05 '25 13:06 StefanKarpinski

PR #55045 is merged now. The example by topolarity still allocates on 2264f502756.

nsajko avatar Dec 08 '25 17:12 nsajko

Yes, this is only fixable by reverting 5b2fcb68800875e570d7bb8c78ed00d360b6cfd5, on top of #55045

vtjnash avatar Dec 08 '25 18:12 vtjnash