ArrayLayouts.jl icon indicating copy to clipboard operation
ArrayLayouts.jl copied to clipboard

Extremely slow BLAS muladd when α,β are Int

Open sztal opened this issue 3 weeks ago • 5 comments

Hi, perhaps I do not understand something about how ArrayLayouts are supposed to work and be used, but it seems that BLAS matrix multiplication (both default_blasmul! and tiled_blasmul!) is very slow in comparison to 5-argument LinearAlgebra.mul!.

Below is a reproducible example.

using LinearAlgebra
import ArrayLayouts: default_blasmul!, tiled_blasmul!, tile_size

A, B, C = (rand(1000, 1000) for _ in 1:3)

# mul! is fast
mul!(copy(C), A, B, 1, 1) ≈ A*B + C  # true
@btime mul!(copy(C), A, B, 1, 1);
# 6.448 ms (4 allocations: 7.63 MiB)

# default_blasmul! is super slow!!
default_blasmul!(1, A, B, 1, copy(C)) ≈ A*B + C  # true
@btime default_blasmul!(1, A, B, 1, copy(C));
# 852.139 ms (3 allocations: 7.63 MiB)

# tiled_blasmul! is also very slow!
ts = tile_size(eltype(A), eltype(B), eltype(C))
tiled_blasmul!(ts, 1, A, B, 1, copy(C)) ≈ A*B + C  # true
@btime tiled_blasmul!(ts, 1, A, B, 1, copy(C));
# 402.555 ms (12 allocations: 7.66 MiB)

And this of course affects operations using MulAdd wrapper. What am I doing wrong? Or is it really a bug?

Version info

Julia Version 1.11.3
Commit d63adeda50d (2025-01-21 19:42 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 9 5900HX with Radeon Graphics
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
Environment:
  JULIA_PROJECT = @.
  JULIA_PKG_PRESERVE_TIERED_INSTALLED = true
  JULIA_REVISE = manual

PKG status

  [dce04be8] ArgCheck v2.4.0
  [7d9fca2a] Arpack v0.5.4
  [4c555306] ArrayLayouts v1.11.0
  [86223c79] Graphs v1.12.0
⌃ [0b1a1467] KrylovKit v0.9.3
⌃ [5078a376] LazyArrays v2.4.0
  [2ab3a3ac] LogExpFunctions v0.3.29
  [5a0628fe] Supposition v0.3.5
  [37e2e46d] LinearAlgebra v1.11.0
  [2f01184e] SparseArrays v1.11.0
  [f489334b] StyledStrings v1.11.0
  [8dfed614] Test v1.11.0

sztal avatar Feb 07 '25 17:02 sztal