CUDA.jl

Adding two cuarrays of different sizes throws no error

Open jacob-m-wilson-42 opened this issue 6 months ago • 0 comments

Describe the bug

When two CuArrays of different sizes are added and the first array is larger than the second, no error is thrown and an unspecified result is returned. If the second array is larger than the first, an error is thrown, as it should be. I am seeing this for arrays of size N×1.

To reproduce

Silently returns a result:

using CUDA

CuArray(rand(10,1)) + CuArray(rand(3,1))

Throws an error (as it should):

using CUDA

CuArray(rand(3,1)) + CuArray(rand(10,1))
Manifest.toml

[[deps.CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "Crayons", "DataFrames", "ExprTools", "GPUArrays", "GPUCompiler", "KernelAbstractions", "LLVM", "LLVMLoopInfo", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "NVTX", "Preferences", "PrettyTables", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "StaticArrays", "Statistics"]
git-tree-sha1 = "fdd9dfb67dfefd548f51000cc400bb51003de247"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "5.4.3"

    [deps.CUDA.extensions]
    ChainRulesCoreExt = "ChainRulesCore"
    EnzymeCoreExt = "EnzymeCore"
    SpecialFunctionsExt = "SpecialFunctions"

    [deps.CUDA.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    EnzymeCore = "f151be2c-9106-41f4-ab19-57ee4f262869"
    SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"

[[deps.CUDA_Driver_jll]]
deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
git-tree-sha1 = "325058b426c2b421e3d2df3d5fa646d72d2e3e7e"
uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc"
version = "0.9.2+0"

[[deps.CUDA_Runtime_Discovery]]
deps = ["Libdl"]
git-tree-sha1 = "33576c7c1b2500f8e7e6baa082e04563203b3a45"
uuid = "1af6417a-86b4-443c-805f-a4643ffb695f"
version = "0.3.5"

[[deps.CUDA_Runtime_jll]]
deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "afea94249b821dc754a8ca6695d3daed851e1f5a"
uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
version = "0.14.1+0"

[[deps.CUDNN_jll]]
deps = ["Artifacts", "CUDA_Runtime_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "cbf7d75f8c58b147bdf6acea2e5bc96cececa6d4"
uuid = "62b44479-cb7b-5706-934f-f13b2eb2e645"
version = "9.0.0+1"

Expected behavior

A dimension mismatch error should be thrown.
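
For reference, the expected behavior matches what Base Julia does for plain `Array`s on the CPU, where both operand orders fail. Below is a minimal sketch that runs without a GPU. The helper name `throws_dimension_mismatch` is hypothetical and exists only for this illustration.

```julia
# CPU reference: adding plain Arrays of mismatched size should throw
# a DimensionMismatch regardless of operand order.
# `throws_dimension_mismatch` is a hypothetical helper, not part of CUDA.jl.
function throws_dimension_mismatch(f)
    try
        f()
        return false          # no error was raised
    catch err
        return err isa DimensionMismatch
    end
end

a = rand(10, 1)
b = rand(3, 1)

larger_first  = throws_dimension_mismatch(() -> a + b)
smaller_first = throws_dimension_mismatch(() -> b + a)

println((larger_first, smaller_first))
```

The same symmetry is what this report expects from `CuArray` operands.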

Version info

Details on Julia:

Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 20 × 13th Gen Intel(R) Core(TM) i9-13900H
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
  Threads: 24 on 20 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_VSCODE_REPL = 1
  JULIA_NUM_THREADS = 14

Details on CUDA

CUDA runtime 12.5, artifact installation
CUDA driver 12.5
NVIDIA driver 555.97.0

CUDA libraries:

  • CUBLAS: 12.5.3
  • CURAND: 10.3.6
  • CUFFT: 11.2.3
  • CUSOLVER: 11.6.3
  • CUSPARSE: 12.5.1
  • CUPTI: 2024.2.1 (API 23.0.0)
  • NVML: 12.0.0+555.97

Julia packages:

  • CUDA: 5.4.3
  • CUDA_Driver_jll: 0.9.2+0
  • CUDA_Runtime_jll: 0.14.1+0

Toolchain:

  • Julia: 1.10.0
  • LLVM: 15.0.7

1 device:
  0: NVIDIA GeForce RTX 4070 Laptop GPU (sm_89, 3.203 GiB / 7.996 GiB available)

Additional context

I discovered this while debugging a Flux model whose output dimension was not what I expected. After digging deeper, I traced it to this bug.
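
Until the behavior is fixed, one possible stopgap is an explicit size check before the addition. This is a minimal sketch, not an API of CUDA.jl or Flux: the function `checked_add` is a hypothetical name. It is written against `AbstractArray`, on the assumption that `size` and `+` work the same way for `CuArray` arguments, and is shown here with plain CPU arrays so that it runs without a GPU.

```julia
# Hypothetical guard (not part of CUDA.jl or Flux): verify that both
# operands have identical sizes before delegating to `+`.
function checked_add(a::AbstractArray, b::AbstractArray)
    if size(a) != size(b)
        throw(DimensionMismatch(
            "cannot add arrays of size $(size(a)) and $(size(b))"))
    end
    return a + b
end

# Matching sizes: behaves like ordinary addition.
same = checked_add(ones(3, 1), ones(3, 1))

# Mismatched sizes: the guard throws before any addition is attempted.
caught = try
    checked_add(ones(10, 1), ones(3, 1))
    false
catch err
    err isa DimensionMismatch
end
```

Because the check runs on array sizes only, it does not touch the array contents.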

jacob-m-wilson-42, Jul 17 '25 01:07