
Allow taking Matrix slices without an extra allocation

Open vtjnash opened this issue 1 year ago • 4 comments

Since changing Array to use Memory as the backing, we have had the option of making non-Vector arrays more flexible, but instead preserved the restriction that they must be zero-offset and equal in length to the Memory. This restriction adds extra complexity, constraints, and allocations, however, without gaining any known benefit. Indeed, it makes it harder to allow resizing or reshaping Matrices completely flexibly. This PR aims to test whether nanosoldier detects any benefit, or whether this restriction has outlived its usefulness.
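For context, a minimal sketch of the behavior the PR targets (using only standard Base functions; the copying behavior described is the status quo before this PR): under the zero-offset restriction, slicing a Matrix must allocate a fresh array and copy the data, whereas a view shares the parent's storage through a SubArray wrapper.

```julia
A = rand(4, 4)

# A column slice allocates a new array and copies the elements:
col = A[:, 2]

# A view shares the parent's storage but adds a SubArray wrapper:
vcol = @view A[:, 2]

@assert col == vcol
col[1] = -1.0
@assert A[1, 2] != -1.0  # mutating the copy does not affect A
@assert vcol[1] == A[1, 2]  # the view still aliases A
```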

vtjnash avatar Oct 18 '24 15:10 vtjnash

@nanosoldier runbenchmarks(ALL, vs=":master")

vtjnash avatar Oct 18 '24 15:10 vtjnash

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

nanosoldier avatar Oct 19 '24 00:10 nanosoldier

Looks like it gave LLVM a little bit of trouble on a few benchmarks, but overall it seems potentially worth doing for the simplification of everything. I am not sure it matters much overall whether LLVM removes bounds checking of a Matrix when indexing with eachindex? It is too bad, though, that IRCE might not know of this pattern already.

| ID | time ratio | memory ratio |
|----|------------|--------------|
| `["array", "index", ("sumelt_boundscheck", "Matrix{Int32}")]` | 4.00 (50%) ❌ | 1.00 (1%) |
| `["array", "index", ("sumelt_boundscheck", "Matrix{Int64}")]` | 2.12 (50%) ❌ | 1.00 (1%) |
```julia
function perf_sumelt_boundscheck(A)
    s = zero(eltype(A))
    for i = 1:length(A)
        s += A[i]
    end
    return s
end
```
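For comparison, a sketch of the same reduction written with `eachindex` (the hypothetical name `perf_sumelt_eachindex` is mine, not from the benchmark suite). `eachindex` yields indices that are known to be in bounds, which is the idiomatic way to let the compiler elide the bounds checks that regressed above:

```julia
function perf_sumelt_eachindex(A)
    s = zero(eltype(A))
    # eachindex produces only valid indices for A, so bounds checks
    # can be removed without relying on LLVM's range analysis.
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

@assert perf_sumelt_eachindex([1 2 3; 4 5 6]) == 21
```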

vtjnash avatar Oct 19 '24 01:10 vtjnash

Looks like more negatives than positives, but many of the negatives are only ~5%.

oscardssmith avatar Oct 19 '24 01:10 oscardssmith

Probably mostly noise, though. It is worth noting this only affects code that uses Matrix (or rather, any Array that is not a Vector, i.e. Array !<: Vector).
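A quick illustration of which types that distinction covers: Vector is the one-dimensional alias of Array, so only higher-dimensional Arrays are touched by the change.

```julia
# Vector{T} is just an alias for Array{T,1}:
@assert Vector{Int} === Array{Int,1}

# Matrix{T} is Array{T,2}, which is not a Vector, so it is affected:
@assert !(Matrix{Int} <: Vector)

# One-dimensional arrays keep the old behavior:
@assert Vector{Int} <: Vector
```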

vtjnash avatar Oct 21 '24 19:10 vtjnash

It's a bit scary that such a common for loop pattern gets a regression IMO

gbaraldi avatar Oct 29 '24 19:10 gbaraldi

Is it common to compute the sum of the elements of a matrix?

vtjnash avatar Oct 29 '24 19:10 vtjnash

I think doing something to all elements of a matrix is not uncommon?

gbaraldi avatar Oct 29 '24 20:10 gbaraldi

@nanosoldier runbenchmarks(!"scalar", vs=":master")

vtjnash avatar Nov 05 '24 14:11 vtjnash

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

nanosoldier avatar Nov 06 '24 01:11 nanosoldier

Seems like vectors of bignums are 10% faster now, as expected, but let's check: @nanosoldier runbenchmarks(!"scalar", vs=":master")

vtjnash avatar Nov 06 '24 03:11 vtjnash

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

nanosoldier avatar Nov 06 '24 14:11 nanosoldier