Oceananigans.jl
`maximum(abs, v)` doesn't work on GPU in Julia 1.10.0 with grid size larger than (10, 10, 10)
(as discussed with @simone-silvestri)
I encountered this bug when trying to upgrade to Julia 1.10.0. What happens is that `maximum(abs, v)` doesn't work for grids larger than (10, 10, 10). However, `maximum(abs, u)`, `maximum(abs, w)`, `maximum(abs, b)`, `maximum(u)`, `maximum(v)`, `maximum(w)`, and `maximum(b)` all work just fine.
Here's an MWE tested on Supercloud and Tartarus:
using Oceananigans
grid = RectilinearGrid(GPU(),
size = (16, 16, 16),
x = (0, 1),
y = (0, 1),
z = (-1, 0),
topology = (Periodic, Periodic, Bounded))
model = NonhydrostaticModel(; grid)
u, v, w = model.velocities
maximum(u)
maximum(w)
maximum(v)
maximum(abs, u)
maximum(abs, w)
maximum(abs, v)
ERROR: LoadError: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/35NC6/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuLaunchKernel
@ ~/.julia/packages/CUDA/35NC6/lib/utils/call.jl:26 [inlined]
[4] (::CUDA.var"#863#864"{Bool, Int64, CUDA.CuStream, CUDA.CuFunction, CUDA.CuDim3, CUDA.CuDim3})(kernelParams::Vector{Ptr{Nothing}})
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:69
[5] macro expansion
@ ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:33 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] pack_arguments(::CUDA.var"#863#864"{…}, ::CUDA.KernelState, ::CartesianIndices{…}, ::CartesianIndices{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ./none:0
[8] launch(f::CUDA.CuFunction, args::Vararg{…}; blocks::Union{…}, threads::Union{…}, cooperative::Bool, shmem::Integer, stream::CUDA.CuStream) where N
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:62 [inlined]
[9] #868
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:136 [inlined]
[10] macro expansion
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:95 [inlined]
[11] macro expansion
@ CUDA ./none:0 [inlined]
[12] convert_arguments
@ CUDA ./none:0 [inlined]
[13] #cudacall#867
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:135 [inlined]
[14] cudacall
@ CUDA ~/.julia/packages/CUDA/35NC6/lib/cudadrv/execution.jl:134 [inlined]
[15] macro expansion
@ CUDA ~/.julia/packages/CUDA/35NC6/src/compiler/execution.jl:219 [inlined]
[16] macro expansion
@ CUDA ./none:0 [inlined]
[17] call(::CUDA.HostKernel{…}, ::typeof(identity), ::typeof(max), ::Nothing, ::CartesianIndices{…}, ::CartesianIndices{…}, ::Val{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…}; call_kwargs::@Kwargs{…})
@ CUDA ./none:0
[18] (::CUDA.HostKernel{…})(::Function, ::Vararg{…}; threads::Int64, blocks::Int64, kwargs::@Kwargs{…})
@ CUDA ~/.julia/packages/CUDA/35NC6/src/compiler/execution.jl:340
[19] macro expansion
@ ~/.julia/packages/CUDA/35NC6/src/compiler/execution.jl:106 [inlined]
[20] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…}; init::Nothing)
@ CUDA ~/.julia/packages/CUDA/35NC6/src/mapreduce.jl:271
[21] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ~/.julia/packages/CUDA/35NC6/src/mapreduce.jl:169
[22] mapreducedim!(f::Function, op::Function, R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ GPUArrays ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:10
[23] #maximum!#860
@ Base ./reducedim.jl:1034 [inlined]
[24] maximum!(f::Function, r::Field{…}, a::Oceananigans.AbstractOperations.ConditionalOperation{…}; condition::Nothing, mask::Float64, kwargs::@Kwargs{…})
@ Oceananigans.Fields ~/.julia/packages/Oceananigans/r28zw/src/Fields/field.jl:618
[25] maximum(f::Function, c::Field{…}; condition::Nothing, mask::Float64, dims::Function)
@ Oceananigans.Fields ~/.julia/packages/Oceananigans/r28zw/src/Fields/field.jl:648
[26] maximum(f::Function, c::Field{…})
@ Oceananigans.Fields ~/.julia/packages/Oceananigans/r28zw/src/Fields/field.jl:637
[27] top-level scope
@ ~/SaltyOceanParameterizations.jl/CUDA_MWE.jl:20
[28] include(fname::String)
@ Base.MainInclude ./client.jl:489
[29] top-level scope
@ REPL[19]:1
[30] top-level scope
@ ~/.julia/packages/CUDA/35NC6/src/initialization.jl:190
in expression starting at /home/xinkai/SaltyOceanParameterizations.jl/CUDA_MWE.jl:20
Some type information was truncated. Use `show(err)` to see complete types.
Note that line 20 is the last line of the code snippet above (`maximum(abs, v)`).
Here's the Julia version info:
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 48 × Intel(R) Xeon(R) Silver 4214 CPU @ 2.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
Threads: 1 on 48 virtual cores
Here's the CUDA runtime version:
CUDA runtime 11.8, artifact installation
CUDA driver 11.8
NVIDIA driver 520.61.5
CUDA libraries:
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 11.0.0+520.61.5
Julia packages:
- CUDA: 4.4.1
- CUDA_Driver_jll: 0.5.0+1
- CUDA_Runtime_jll: 0.6.0+0
Toolchain:
- Julia: 1.10.0
- LLVM: 15.0.7
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
1 device:
0: NVIDIA TITAN V (sm_70, 9.027 GiB / 12.000 GiB available)
In Julia 1.9 this does not seem to be a problem.
Can you try using the branch `ncc/use-julia-v1.9.4`, which, despite its original name, uses Julia v1.10.0?
On Tartarus with the above-mentioned branch things seem OK:
navidcy:Oceananigans.jl/ |ncc/use-julia-v1.9.4 ✓|$ julia-1.10 --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.10.0 (2023-12-25)
_/ |\__'_|_|_|\__'_| |
|__/ |
julia> using Oceananigans
[ Info: Oceananigans will use 48 threads
julia> grid = RectilinearGrid(GPU(),
size = (16, 16, 16),
x = (0, 1),
y = (0, 1),
z = (-1, 0),
topology = (Periodic, Periodic, Bounded))
16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.0625
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.0625
└── Bounded z ∈ [-1.0, 0.0] regularly spaced with Δz=0.0625
julia> model = NonhydrostaticModel(; grid)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── timestepper: QuasiAdamsBashforth2TimeStepper
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> u, v, w = model.velocities
NamedTuple with 3 Fields on 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo:
├── u: 16×16×16 Field{Face, Center, Center} on RectilinearGrid on GPU
├── v: 16×16×16 Field{Center, Face, Center} on RectilinearGrid on GPU
└── w: 16×16×17 Field{Center, Center, Face} on RectilinearGrid on GPU
julia> maximum(u)
0.0
julia> maximum(w)
0.0
julia> maximum(v)
0.0
julia> maximum(abs, u)
0.0
julia> maximum(abs, w)
0.0
julia> maximum(abs, v)
0.0
While using `main` I can indeed reproduce the error above...
navidcy:Oceananigans.jl/ |main ✓|$ julia-1.10 --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.10.0 (2023-12-25)
_/ |\__'_|_|_|\__'_| |
|__/ |
julia> using Oceananigans
┌ Warning: The active manifest file has dependencies that were resolved with a different julia version (1.9.3). Unexpected behavior may occur.
└ @ ~/Oceananigans.jl/Manifest.toml:0
┌ Warning: The project dependencies or compat requirements have changed since the manifest was last resolved.
│ It is recommended to `Pkg.resolve()` or consider `Pkg.update()` if necessary.
└ @ Pkg.API ~/julia-1.10/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:1800
Precompiling Oceananigans
1 dependency successfully precompiled in 21 seconds. 143 already precompiled.
[ Info: Oceananigans will use 48 threads
julia> grid = RectilinearGrid(GPU(),
size = (16, 16, 16),
x = (0, 1),
y = (0, 1),
z = (-1, 0),
topology = (Periodic, Periodic, Bounded))
16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.0625
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.0625
└── Bounded z ∈ [-1.0, 0.0] regularly spaced with Δz=0.0625
julia> model = NonhydrostaticModel(; grid)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── timestepper: QuasiAdamsBashforth2TimeStepper
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> u, v, w = model.velocities
NamedTuple with 3 Fields on 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo:
├── u: 16×16×16 Field{Face, Center, Center} on RectilinearGrid on GPU
├── v: 16×16×16 Field{Center, Face, Center} on RectilinearGrid on GPU
└── w: 16×16×17 Field{Center, Center, Face} on RectilinearGrid on GPU
julia> maximum(u)
0.0
julia> maximum(w)
0.0
julia> maximum(v)
0.0
julia> maximum(abs, u)
0.0
julia> maximum(abs, w)
ERROR: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuLaunchKernel
@ ~/.julia/packages/CUDA/nbRJk/lib/utils/call.jl:26 [inlined]
[4] (::CUDA.var"#867#868"{Bool, Int64, CUDA.CuStream, CUDA.CuFunction, CUDA.CuDim3, CUDA.CuDim3})(kernelParams::Vector{Ptr{Nothing}})
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:69
[5] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:33 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] pack_arguments(::CUDA.var"#867#868"{…}, ::CUDA.KernelState, ::CartesianIndices{…}, ::CartesianIndices{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ./none:0
[8] launch(f::CUDA.CuFunction, args::Vararg{…}; blocks::Union{…}, threads::Union{…}, cooperative::Bool, shmem::Integer, stream::CUDA.CuStream) where N
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:62 [inlined]
[9] #872
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:136 [inlined]
[10] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:95 [inlined]
[11] macro expansion
@ CUDA ./none:0 [inlined]
[12] convert_arguments
@ CUDA ./none:0 [inlined]
[13] #cudacall#871
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:135 [inlined]
[14] cudacall
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:134 [inlined]
[15] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:223 [inlined]
[16] macro expansion
@ CUDA ./none:0 [inlined]
[17] call(::CUDA.HostKernel{…}, ::typeof(identity), ::typeof(max), ::Nothing, ::CartesianIndices{…}, ::CartesianIndices{…}, ::Val{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…}; call_kwargs::@Kwargs{…})
@ CUDA ./none:0
[18] (::CUDA.HostKernel{…})(::Function, ::Vararg{…}; threads::Int64, blocks::Int64, kwargs::@Kwargs{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:345
[19] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:106 [inlined]
[20] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…}; init::Nothing)
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:271
[21] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:169
[22] mapreducedim!(f::Function, op::Function, R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/host/mapreduce.jl:10
[23] #maximum!#860
@ Base ./reducedim.jl:1034 [inlined]
[24] maximum!(f::Function, r::Field{…}, a::Oceananigans.AbstractOperations.ConditionalOperation{…}; condition::Nothing, mask::Float64, kwargs::@Kwargs{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:618
[25] maximum(f::Function, c::Field{…}; condition::Nothing, mask::Float64, dims::Function)
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:648
[26] maximum(f::Function, c::Field{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:637
[27] top-level scope
@ REPL[9]:1
[28] top-level scope
@ ~/.julia/packages/CUDA/nbRJk/src/initialization.jl:205
Some type information was truncated. Use `show(err)` to see complete types.
julia> maximum(abs, v)
ERROR: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuLaunchKernel
@ ~/.julia/packages/CUDA/nbRJk/lib/utils/call.jl:26 [inlined]
[4] (::CUDA.var"#867#868"{Bool, Int64, CUDA.CuStream, CUDA.CuFunction, CUDA.CuDim3, CUDA.CuDim3})(kernelParams::Vector{Ptr{Nothing}})
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:69
[5] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:33 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] pack_arguments(::CUDA.var"#867#868"{…}, ::CUDA.KernelState, ::CartesianIndices{…}, ::CartesianIndices{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ./none:0
[8] launch(f::CUDA.CuFunction, args::Vararg{…}; blocks::Union{…}, threads::Union{…}, cooperative::Bool, shmem::Integer, stream::CUDA.CuStream) where N
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:62 [inlined]
[9] #872
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:136 [inlined]
[10] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:95 [inlined]
[11] macro expansion
@ CUDA ./none:0 [inlined]
[12] convert_arguments
@ CUDA ./none:0 [inlined]
[13] #cudacall#871
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:135 [inlined]
[14] cudacall
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:134 [inlined]
[15] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:223 [inlined]
[16] macro expansion
@ CUDA ./none:0 [inlined]
[17] call(::CUDA.HostKernel{…}, ::typeof(identity), ::typeof(max), ::Nothing, ::CartesianIndices{…}, ::CartesianIndices{…}, ::Val{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…}; call_kwargs::@Kwargs{…})
@ CUDA ./none:0
[18] (::CUDA.HostKernel{…})(::Function, ::Vararg{…}; threads::Int64, blocks::Int64, kwargs::@Kwargs{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:345
[19] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:106 [inlined]
[20] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…}; init::Nothing)
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:271
[21] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:169
[22] mapreducedim!(f::Function, op::Function, R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/host/mapreduce.jl:10
[23] #maximum!#860
@ Base ./reducedim.jl:1034 [inlined]
[24] maximum!(f::Function, r::Field{…}, a::Oceananigans.AbstractOperations.ConditionalOperation{…}; condition::Nothing, mask::Float64, kwargs::@Kwargs{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:618
[25] maximum(f::Function, c::Field{…}; condition::Nothing, mask::Float64, dims::Function)
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:648
[26] maximum(f::Function, c::Field{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:637
[27] top-level scope
@ REPL[10]:1
[28] top-level scope
@ ~/.julia/packages/CUDA/nbRJk/src/initialization.jl:205
Some type information was truncated. Use `show(err)` to see complete types.
That suggests that it's because the package dependencies on `main` were resolved with Julia v1.9.3:
┌ Warning: The active manifest file has dependencies that were resolved with a different julia version (1.9.3). Unexpected behavior may occur.
This issue will be resolved when #3403 is merged.
It looks like the conditional reduction is too heavy for `mapreduce`. Perhaps @simone-silvestri has ideas to resolve this.
The operation should not be too large since the grid is very small. Probably this is a symptom of a bug that does not affect the results but wastes computational resources somewhere in the conditional operation. I'll have a look.
I think the size dependence has to do with how `mapreduce` works; it breaks the reduction into chunks, and (10, 10, 10) might be just one chunk.
See here: https://github.com/JuliaGPU/CuArrays.jl/blob/284142de673572fc90578e15c8dce04e5589a17b/src/mapreduce.jl#L221
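For illustration, the chunked (two-stage) strategy that GPU `mapreduce` implementations use can be sketched on the CPU roughly like this. This is a simplified sketch, not the actual CUDA.jl code, and the `chunk` size of 1000 is chosen only to match the (10, 10, 10) observation:

```julia
# Simplified CPU sketch of a two-stage (chunked) mapreduce, loosely mimicking
# how GPU backends split a reduction into per-block partial reductions.
# NOT the actual CUDA.jl implementation.
function chunked_mapreduce(f, op, A; chunk = 1000)
    n = length(A)
    # Small inputs (e.g. a 10×10×10 grid = 1000 elements) fit in a single
    # chunk, so only the one-stage path runs — possibly why small grids
    # dodge the failing kernel launch.
    n <= chunk && return mapreduce(f, op, A)
    # Stage 1: reduce each chunk to a partial result.
    partials = [mapreduce(f, op, view(A, i:min(i + chunk - 1, n)))
                for i in 1:chunk:n]
    # Stage 2: reduce the partial results.
    return reduce(op, partials)
end
```

On a GPU, stage 1 corresponds to per-block reductions and stage 2 to a second kernel launch over the partials, which is where extra register/resource pressure can show up.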
I also had this issue. As someone new to running on GPUs, I was super confused by this error. If this issue is not fixable, it would be helpful to at least point it out in the documentation.
I encountered this error by running a simulation based on the Langmuir turbulence tutorial on GPUs. Note that the print function prints `maximum(abs, u)`, `maximum(abs, v)`, and `maximum(abs, w)`:
msg = @sprintf("i: %04d, t: %s, Δt: %s, umax = (%.1e, %.1e, %.1e) ms⁻¹, wall time: %s\n",
iteration(simulation),
prettytime(time(simulation)),
prettytime(simulation.Δt),
maximum(abs, u), maximum(abs, v), maximum(abs, w),
prettytime(simulation.run_wall_time))
thus resulting in the error:
LoadError: CUDA error: too many resources requested for launch
For reference, the code works once the `maximum` functions are removed:
msg = @sprintf("i: %04d, t: %s, Δt: %s, wall time: %s\n",
iteration(simulation),
prettytime(time(simulation)),
prettytime(simulation.Δt),
prettytime(simulation.run_wall_time))
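Until the kernel launch failure is fixed, one possible stopgap is to bring the field data back to the host and reduce there. This is only a sketch (using the `u`, `v`, `w` from the MWE above); it copies data to CPU memory on every call, so it is slow, but it avoids the failing GPU reduction kernel entirely:

```julia
# Workaround sketch: copy the interior (non-halo) region of each field to
# host memory with Array, then reduce on the CPU, bypassing the GPU
# mapreduce kernel. `interior` is Oceananigans' accessor for the
# non-halo region of a Field.
u_max = maximum(abs, Array(interior(u)))
v_max = maximum(abs, Array(interior(v)))
w_max = maximum(abs, Array(interior(w)))
```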
reopening this
@simone-silvestri has declared an interest in fixing this
Can you try `maximum` without `abs`? I think it's the `abs` (probably any function) that's the main issue.
@simone-silvestri, indeed, if I try `maximum` without `abs` the printing function works well. @glwagner is right: any function within `maximum` creates the same issue (I tested with `sum`).
Well, `sum` definitely won't work (it has to be a simple single-argument transformation), but you could try a function like `square(x) = x * x`, or `log` if you want to be adventurous.
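For instance, with the `v` from the MWE above (`square` here is just the ad-hoc helper suggested, not an Oceananigans function):

```julia
# Ad-hoc single-argument transformations to try in place of abs;
# these should exercise the same conditional-reduction code path.
square(x) = x * x
maximum(square, v)  # presumably hits the same launch failure as maximum(abs, v)
maximum(log, v)
```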
Is this still an issue? @xkykai's MWE runs fine for me (I went up to 256×256×256), and I've been doing `maximum(abs, u)` on the GPU for a few versions.
Out of curiosity, @josuemtzmo, are you able to reproduce the error on the latest versions of Julia, CUDA.jl, and Oceananigans.jl?
I'm using Oceananigans v0.91.7 with
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd4843 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 24 × AMD Ryzen 9 5900X 12-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 24 virtual cores)
and
julia> Oceananigans.CUDA.versioninfo()
CUDA runtime 12.5, artifact installation
CUDA driver 12.5
NVIDIA driver 556.12.0
CUDA libraries:
- CUBLAS: 12.5.3
- CURAND: 10.3.6
- CUFFT: 11.2.3
- CUSOLVER: 11.6.3
- CUSPARSE: 12.5.1
- CUPTI: 2024.2.1 (API 23.0.0)
- NVML: 12.0.0+556.12
Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0
Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7
1 device:
0: NVIDIA GeForce RTX 3080 (sm_86, 5.794 GiB / 10.000 GiB available)
Hello,
I've tested it in Oceananigans v0.91.8 with:
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 64 × Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake-avx512)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
Environment:
JULIA_CUDA_MEMORY_POOL = none
julia> Oceananigans.CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.1
NVIDIA driver 530.30.2
CUDA libraries:
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 2023.1.1 (API 18.0.0)
- NVML: 12.0.0+530.30.2
Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0
Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7
Environment:
- JULIA_CUDA_MEMORY_POOL: none
Preferences:
- CUDA_Runtime_jll.version: 12.1
1 device:
0: Tesla V100-PCIE-32GB (sm_70, 30.884 GiB / 32.000 GiB available)
and the issue seems solved.
I agree with @ali-ramadhan: it seems this issue was fixed at some point. Although I haven't managed to pinpoint the version, I think I still had the issue when I was using CUDA v5.1.2.