Oceananigans.jl
Adding Metal support
Closes #2618
Great work.
What's the error with TimeInterval?
How will we test code for Metal GPUs? Is there anything available through GitHub Actions, or will we have to hook something up via Buildkite?
Copy-pasting this error from #2618:
model = HydrostaticFreeSurfaceModel(; grid)
ERROR: Metal does not support Float64 values, try using Float32 instead
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] check_eltype(T::Type)
@ Metal ~/.julia/packages/Metal/lnkVP/src/array.jl:32
[3] Metal.MtlArray{Float64, 3, Metal.MTL.MTLResourceStorageModePrivate}(#unused#::UndefInitializer, dims::Tuple{Int64, Int64, Int64})
@ Metal ~/.julia/packages/Metal/lnkVP/src/array.jl:50
[4] (Metal.MtlArray{Float64, 3})(#unused#::UndefInitializer, dims::Tuple{Int64, Int64, Int64})
@ Metal ~/.julia/packages/Metal/lnkVP/src/array.jl:98
[5] MtlArray
@ ~/.julia/packages/Metal/lnkVP/src/array.jl:157 [inlined]
[6] Metal.MtlArray(A::Array{Float64, 3})
@ Metal ~/.julia/packages/Metal/lnkVP/src/array.jl:173
[7] arch_array(#unused#::Oceananigans.Architectures.MetalBackend, a::Array{Float64, 3})
@ Oceananigans.Architectures ~/Documents/Projects/Oceananigans.jl/src/Architectures.jl:75
[8] Oceananigans.Solvers.FFTBasedPoissonSolver(grid::RectilinearGrid{Float64, Periodic, Periodic, Flat, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, Vector{Float64}}, OffsetArrays.OffsetVector{Float64, Vector{Float64}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Oceananigans.Architectures.MetalBackend}, planner_flag::UInt32)
@ Oceananigans.Solvers ~/Documents/Projects/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:61
[9] Oceananigans.Solvers.FFTBasedPoissonSolver(grid::RectilinearGrid{Float64, Periodic, Periodic, Flat, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, Vector{Float64}}, OffsetArrays.OffsetVector{Float64, Vector{Float64}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Oceananigans.Architectures.MetalBackend})
@ Oceananigans.Solvers ~/Documents/Projects/Oceananigans.jl/src/Solvers/fft_based_poisson_solver.jl:51
[10] Oceananigans.Models.HydrostaticFreeSurfaceModels.FFTImplicitFreeSurfaceSolver(grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, settings::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, gravitational_acceleration::Float32)
@ Oceananigans.Models.HydrostaticFreeSurfaceModels ~/Documents/Projects/Oceananigans.jl/src/Models/HydrostaticFreeSurfaceModels/fft_based_implicit_free_surface_solver.jl:67
[11] build_implicit_step_solver
@ ~/Documents/Projects/Oceananigans.jl/src/Models/HydrostaticFreeSurfaceModels/fft_based_implicit_free_surface_solver.jl:73 [inlined]
[12] build_implicit_step_solver(#unused#::Val{:Default}, grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, settings::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, gravitational_acceleration::Float32)
@ Oceananigans.Models.HydrostaticFreeSurfaceModels ~/Documents/Projects/Oceananigans.jl/src/Models/HydrostaticFreeSurfaceModels/implicit_free_surface.jl:111
[13] FreeSurface(free_surface::ImplicitFreeSurface{Nothing, Float64, Nothing, Nothing, Symbol, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, velocities::NamedTuple{(:u, :v, :w), Tuple{Field{Face, Center, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, Tuple{Colon, Colon, Colon}, OffsetArrays.OffsetArray{Float32, 3, Metal.MtlArray{Float32, 3, Metal.MTL.MTLResourceStorageModePrivate}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}}, Nothing, Oceananigans.Fields.FieldBoundaryBuffers{Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}}, Field{Center, Face, Center, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, Tuple{Colon, Colon, Colon}, OffsetArrays.OffsetArray{Float32, 3, Metal.MtlArray{Float32, 3, Metal.MTL.MTLResourceStorageModePrivate}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, 
BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}}, Nothing, Oceananigans.Fields.FieldBoundaryBuffers{Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}}, Field{Center, Center, Face, Nothing, RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, Tuple{Colon, Colon, Colon}, OffsetArrays.OffsetArray{Float32, 3, Metal.MtlArray{Float32, 3, Metal.MTL.MTLResourceStorageModePrivate}}, Float32, FieldBoundaryConditions{BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, BoundaryCondition{Oceananigans.BoundaryConditions.Periodic, Nothing}, Nothing, Nothing, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}}, Nothing, Oceananigans.Fields.FieldBoundaryBuffers{Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}}}}, grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend})
@ Oceananigans.Models.HydrostaticFreeSurfaceModels ~/Documents/Projects/Oceananigans.jl/src/Models/HydrostaticFreeSurfaceModels/implicit_free_surface.jl:95
[14] HydrostaticFreeSurfaceModel(; grid::RectilinearGrid{Float32, Periodic, Periodic, Bounded, Float32, Float32, Float32, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, OffsetArrays.OffsetVector{Float32, Vector{Float32}}, Oceananigans.Architectures.MetalBackend}, clock::Clock{Float32}, momentum_advection::Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}, tracer_advection::Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}, buoyancy::SeawaterBuoyancy{Float32, LinearEquationOfState{Float32}, Nothing, Nothing}, coriolis::Nothing, free_surface::ImplicitFreeSurface{Nothing, Float64, Nothing, Nothing, Symbol, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, forcing::NamedTuple{(), Tuple{}}, closure::Nothing, boundary_conditions::NamedTuple{(), Tuple{}}, tracers::Tuple{Symbol, Symbol}, particles::Nothing, biogeochemistry::Nothing, velocities::Nothing, pressure::Nothing, diffusivity_fields::Nothing, auxiliary_fields::NamedTuple{(), Tuple{}})
@ Oceananigans.Models.HydrostaticFreeSurfaceModels ~/Documents/Projects/Oceananigans.jl/src/Models/HydrostaticFreeSurfaceModels/hydrostatic_free_surface_model.jl:169
[15] top-level scope
@ REPL[5]:1
[16] top-level scope
@ ~/.julia/packages/Metal/lnkVP/src/initialization.jl:57
The eltype of the grid is not being propagated here:
https://github.com/CliMA/Oceananigans.jl/blob/e16cdc6cfb67703df9e29368017868331f68b1c0/src/Models/HydrostaticFreeSurfaceModels/fft_based_implicit_free_surface_solver.jl#L61-L64
So that needs to be fixed.
However, are FFTs available on Metal? If not, then no FFT code can be ported there.
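To illustrate the eltype point above, here is a minimal sketch of how a derived grid can silently fall back to Float64. `ToyGrid`, `horizontal_grid_default`, and `horizontal_grid_propagated` are hypothetical stand-ins invented for this example; they are not Oceananigans types or functions, and the `Float64` default is an assumption about the general pattern rather than a copy of the linked code.

```julia
# Hypothetical stand-in for a grid type; NOT Oceananigans' RectilinearGrid.
struct ToyGrid{FT}
    size::NTuple{3, Int}
    extent::NTuple{3, FT}
end

# Convenience constructor whose element type defaults to Float64 (assumed convention).
ToyGrid(FT::Type=Float64; size, extent) = ToyGrid{FT}(size, FT.(extent))

Base.eltype(::ToyGrid{FT}) where {FT} = FT

# Build a horizontal grid from a 3D one, but forget to pass the element type:
# the Float64 default is used even if the input grid is Float32.
horizontal_grid_default(grid) =
    ToyGrid(size=(grid.size[1], grid.size[2], 1),
            extent=(grid.extent[1], grid.extent[2], 1))

# Same construction, but propagating the element type of the input grid.
horizontal_grid_propagated(grid) =
    ToyGrid(eltype(grid);
            size=(grid.size[1], grid.size[2], 1),
            extent=(grid.extent[1], grid.extent[2], 1))

grid = ToyGrid(Float32; size=(16, 16, 16), extent=(1, 1, 1))

@show eltype(horizontal_grid_default(grid))     # Float64
@show eltype(horizontal_grid_propagated(grid))  # Float32
```

On a backend that rejects Float64 arrays, anything allocated from the first derived grid would fail in the way the stack trace shows, while the second stays in Float32.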
Thanks, I'll go through all the comments tomorrow!
~~As far as I can tell there are no Julia FFT libraries yet based on this discussion https://discourse.julialang.org/t/metal-jl-does-not-speed-up-fft/95528/3 so~~ (see below) I think that is going to be the main barrier at the moment.
I don't think any Metal runners are available for GitHub Actions.
What's the error with TimeInterval?
I think the issue was the interval being stored as a Float64, but I'm not sure why that would ever end up in a kernel.
I've just realised that the AppleAccelerate library is actually for M1 CPU acceleration so isn't the CUDA FFT equivalent we need here.
Perhaps we could convert to this: https://github.com/DTolm/VkFFT which supports hardware-accelerated FFT on CUDA, Metal and lots of others. It looks like that library is more performant than cuFFT as well.
What's the error with TimeInterval? I think the issue was the interval being stored as a Float64, but I'm not sure why that would ever end up in a kernel.
(copy-pasting from #2618)
One possibility is that there is a type promotion occurring within aligned_time_step:
https://github.com/CliMA/Oceananigans.jl/blob/e16cdc6cfb67703df9e29368017868331f68b1c0/src/Simulations/run.jl#L24-L33
and
https://github.com/CliMA/Oceananigans.jl/blob/e16cdc6cfb67703df9e29368017868331f68b1c0/src/Simulations/run.jl#L133-L134
This would cause widespread issues in switching to single precision, so @simone-silvestri may want to take note.
The quick fix is to convert after calculating the aligned time-step:
Δt = aligned_time_step(sim, sim.Δt)
Δt = convert(eltype(sim.model), Δt)
I think we need a better-defined interface for this in the long run though. We really need both an eltype (the floating-point type used by state variables, grid metrics, etc.) and a timetype (the type of model.clock.time).
(Also I'm not sure eltype(model) is defined yet, but it should be...)
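A small self-contained sketch of the suspected promotion, assuming the schedule stores its interval as a Float64. `ToySchedule`, `time_to_next`, and `toy_aligned_time_step` are hypothetical names made up for this illustration and do not reproduce the code in run.jl; only the promotion behaviour of Julia's arithmetic and `min` is relied upon.

```julia
# Hypothetical stand-in for a schedule that stores its interval in double precision.
struct ToySchedule
    interval::Float64
end

# Time remaining until the schedule next actuates.
# Mixing a Float32 time with a Float64 interval promotes the result to Float64.
time_to_next(schedule, t) = schedule.interval - rem(t, schedule.interval)

# Shorten Δt so the step does not overshoot the schedule.
# `min` promotes its arguments, so the result is Float64 here.
toy_aligned_time_step(schedule, t, Δt) = min(Δt, time_to_next(schedule, t))

t  = 0.0f0   # Float32 model time
Δt = 0.1f0   # Float32 time step
schedule = ToySchedule(1.0)

Δt_aligned = toy_aligned_time_step(schedule, t, Δt)
@show typeof(Δt_aligned)   # Float64

# The quick fix: convert back to the working precision after aligning.
Δt_fixed = convert(typeof(t), Δt_aligned)
@show typeof(Δt_fixed)     # Float32
```

Converting after the fact works, but it hides where the Float64 came from, which is part of the argument for separating an element type from a time type.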
Perhaps we could convert to this: https://github.com/DTolm/VkFFT which supports hardware-accelerated FFT on CUDA, Metal and lots of others. It looks like that library is more performant than cuFFT as well.
Why do we have to convert? Can we use that only for Metal?
Yeah we could just do it for Metal. I was just thinking it might be just as much effort as using it for all.
I think it would be more effort if we take into account the need to rebenchmark a lot of cases. I also think it adds some risk...
Yeah true, they do claim it is more performant so maybe something to consider in the future.
I've now had a play trying to wrap VkFFT with https://github.com/JuliaInterop/Clang.jl/tree/master but it is proving difficult given my inexperience with C.
Does anyone working on Oceananigans have experience doing that sort of thing?
Could be worth asking on the Julia Slack! You'll have to ship an independent wrapper package (e.g. VkFFT.jl) and figure out how to precompile the binaries, right (so we can install everything from the REPL)?
Could be good for this PR to focus on getting the explicit free surface to work, then build up the rest of the features after that. Doing this for real will also require figuring out testing, I think.
Weird that all the tests fail: when I run e.g. RectilinearGrid(CPU(), FT, size=(16, 16, 16), extent=(1, 1, 1)) locally, it works fine.
Curious what's the status of this effort to add Metal support to Oceananigans.
It's crazy how easy it is to add this support, but the major limitation is that Metal only supports Float32. There hasn't been much effort to validate anything for Float32, though this is a worthwhile goal...
Also if we do refactor this PR, I think we should probably put the Metal functionality in an extension, much as #3468 does.