GPUArrays.jl

Reusable array functionality for Julia's various GPU backends.

Results 74 GPUArrays.jl issues

Related to https://github.com/JuliaLang/julia/issues/54546 and https://github.com/JuliaLang/julia/pull/54587. cc @dkarrasch

This is part of ongoing work to make Enzyme + CUDA.jl work nicely from a user point of view. cc @vchuravy The implementation of `map!` (https://github.com/JuliaGPU/GPUArrays.jl/blob/ec9fe5b6f7522902e444c95a0c9248a4bc55d602/src/host/broadcast.jl#L120C46-L120C59) creates a broadcasted object...

Tested on main. Seems to be a Metal-specific issue as the test passes with JLArrays. ``` using Metal, GPUArrays, Random, Test begin AT = MtlArray a = AT(zeros(Float32, 1000,1000)) b...

bug

For example, `AbstractGPUVecOrMat`: ```julia julia> LinearAlgebra.Adjoint{Float64, Matrix{Float64}} LinearAlgebra.Adjoint{Float64, CuMatrix{Float64}}

On main branch. Noticed while working on JuliaGPU/Metal.jl#321. Will still be relevant once JuliaGPU/Metal.jl#321 is merged as the MPSMatrixRandom generation is not always used. Unsigned integers and 32/64 bit variants...

bug

The switch to KA.jl significantly slowed down several operations. --- CUDA.jl: `permutedims`, `broadcast`, and many others https://speed.juliagpu.org/changes/?tre=10&rev=6221589f5befec8f6f157a5a5271667dba09d0b6&exe=11&env=1 --- Metal.jl: `permutedims` ``` private array/permutedims/4d 2911500 ns 860084 ns 3.39 private array/permutedims/2d...

Ported from oneAPI.jl - [ ] Currently limited to a static workgroupsize

As @maleadt mentioned in https://github.com/JuliaGPU/Metal.jl/issues/422, I am opening a new issue here. The current `LinearAlgebra.kron` only supports `CuArray`; the other `GPUArray` types use scalar indexing. Also, the methods for Kronecker...

Introduce `GPUNumber` to store the result of `mapreduce` across all `dims` (i.e. `dims = :`) instead of immediately transferring it to host. `GPUNumber` copies its value to host when it...

enhancement
performance

https://github.com/JuliaGPU/AMDGPU.jl/pull/669