
`precompile` fails to cache code for calls that are not fully specialized (non-concrete signature)

Open KristofferC opened this issue 3 years ago • 2 comments

In https://github.com/JuliaLang/julia/pull/46690, I want to deserialize an object on a precompile worker. It is important that this path is fully precompiled: many precompile workers are spawned, each in its own process, so without a cached version every worker has to compile the same code over and over.

For arbitrary data types, deserialize has a method that takes a ::DataType argument and does not get specialized on it:

https://github.com/JuliaLang/julia/blob/aae8c484fd4ef9c9d7119e00f5eee679c905e542/stdlib/Serialization/src/Serialization.jl#L1468

As an example workload, we can look at the following code:

using Serialization

struct MyStruct
    x::String
end

s = MyStruct("foo")

io = IOBuffer()
serialize(io, s)
seekstart(io)
@time deserialize(io);

Running that with --trace-compile=stderr, the following precompile statements are emitted from the deserialize call:

precompile(Tuple{typeof(Serialization.deserialize), Serialization.Serializer{Base.GenericIOBuffer{Array{UInt8, 1}}}, DataType})

However, after restarting Julia and executing these statements before the deserialize call, we can see that the precompile call for the DataType signature returns false:

julia> precompile(Tuple{typeof(Serialization.deserialize), Serialization.Serializer{Base.IOStream}, DataType})
false

and the compilation time is still there.
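A minimal script to check this in one go might look like the sketch below. It assumes a fresh Julia session, builds the signature from `typeof(io)` rather than hard-coding the buffer type, and uses `@timed` to capture the first-call time. The variable names `sig`, `ok`, and `stats` are only illustrative, and the value `precompile` returns may differ between Julia versions.

```julia
using Serialization

struct MyStruct
    x::String
end

io = IOBuffer()
serialize(io, MyStruct("foo"))
seekstart(io)

# Signature of the inner deserialize method that takes the DataType argument.
# Serializer is parameterized on the IO type, so reuse typeof(io) here.
sig = Tuple{typeof(Serialization.deserialize),
            Serialization.Serializer{typeof(io)},
            DataType}

# precompile returns a Bool that reports whether the request succeeded.
ok = precompile(sig)
println("precompile returned: ", ok)

# Time the first real call; a large value suggests compilation still happened.
stats = @timed deserialize(io)
println("first deserialize call took ", stats.time, " seconds")
```

If `ok` is `false` and the first call is still slow, the precompile request did not produce code that the later call could reuse.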

However, if we specialize the function on the data type (as done in https://github.com/JuliaLang/julia/pull/46690/commits/6e0b2dad814b7a6603c47a5d0523591109e99dc0#diff-0fbb91f060d958d93fcf0a101ade4d87b365f3664a409276c591d7faf77acb67), the precompile statement we get is:

precompile(Tuple{typeof(Serialization.deserialize), Serialization.Serializer{Base.GenericIOBuffer{Array{UInt8, 1}}}, Type{Main.MyStruct}})

The compilation time is then gone:

julia> @time deserialize(io);
  0.000014 seconds (14 allocations: 1.148 KiB)
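For readers unfamiliar with the two kinds of signature, here is a small sketch with hypothetical stand-in functions (`describe_generic` and `describe_specialized` are not part of Serialization.jl). Using a type parameter in the method body is one common way to force specialization on a type argument; the linked commit may do it differently.

```julia
# Hypothetical stand-ins for illustration; these are not the Serialization.jl methods.
struct MyStruct
    x::String
end

# Argument annotated as ::DataType, similar in shape to the method linked above.
describe_generic(t::DataType) = string(nameof(t))

# Type parameter T is used in the body, which (per the Julia performance tips)
# forces a specialization for each concrete type passed in.
describe_specialized(::Type{T}) where {T} = string(nameof(T))

# Both accept the same argument and give the same answer.
println(describe_generic(MyStruct))
println(describe_specialized(MyStruct))

# Print the declared method signatures to compare them.
println(first(methods(describe_generic)).sig)
println(first(methods(describe_specialized)).sig)
```

Running each call under `--trace-compile=stderr` should show which signature Julia actually compiles for it.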

It would be good if the code that is compiled for non-concrete signatures could be cached. It clearly is kept within a session when you run the code, since you don't have to compile it again for every call.

KristofferC avatar Sep 15 '22 11:09 KristofferC

What happens if you explicitly add @nospecialize(t::DataType)?

timholy avatar Sep 15 '22 14:09 timholy

That doesn't seem to change anything (running with --trace-compile=stderr):

julia> precompile(Tuple{typeof(Serialization.deserialize), Base.GenericIOBuffer{Array{UInt8, 1}}})
precompile(Tuple{typeof(Serialization.deserialize), Base.GenericIOBuffer{Array{UInt8, 1}}})
true

julia> precompile(Tuple{typeof(Serialization.deserialize), Serialization.Serializer{Base.GenericIOBuffer{Array{UInt8, 1}}}, DataType})
false

julia> @time deserialize(io);
precompile(Tuple{typeof(Serialization.deserialize), Serialization.Serializer{Base.GenericIOBuffer{Array{UInt8, 1}}}, DataType})
  0.013337 seconds (15.33 k allocations: 1011.251 KiB, 98.18% compilation time)

As you can see, the precompile statement for the DataType signature is printed again, which shows that the explicit precompile call failed to cache it.

KristofferC avatar Sep 15 '22 14:09 KristofferC