Mill.jl
Mill.jl copied to clipboard
Enhancement: Various speedups
Noting down some areas where significant speedups may be achieved:
-
vcat
inProductNode
s leads to a lot of copying - data deduplication in leaves may lead to lower memory requirements and also to saving some compute (computing ngrams only once for identical strings in
NGramMatrix
multiplication) - deduplicating instances in
BagNode
s in a similar fashion
PoC for deduplication in NGramMatrices
julia> function Mill._mul(A::AbstractMatrix, S::PooledVector, n, b, m)
C = zeros(eltype(A), size(A, 1), length(S))
iz = Mill._init_z(n, b)
idcs = Dict(r => Queue{Int}() for r in values(S.invpool))
for (i,r) in S.refs |> enumerate
enqueue!(idcs[r], i)
end
for (s, r) in S.invpool
z = iz
for l in 1:Mill._len(s, n)
z = Mill._next_ngram(z, l, codeunits(s), n, b)
zm = z%m + 1
for k in idcs[r]
for i in 1:size(C, 1)
@inbounds C[i, k] += A[i, zm]
end
end
end
end
C
end
julia> ss = [randstring(50), randstring(50)];
julia> S = [rand(ss) for _ in 1:100];
julia> n1 = NGramMatrix(S);
julia> n2 = NGramMatrix(PooledArray(S));
julia> x = randn(100, 2053);
julia> x*n1; @btime x*n1;
181.145 μs (2 allocations: 78.20 KiB)
julia> x*n2; @btime x*n2;
108.265 μs (14 allocations: 95.23 KiB)
I guess that dedup in NGramMatrices will offer the highest benefit. Multiplication of OneHot is essentially a copying, and Dense matrices should not contain that many duplicates (although they might).
Sure, and deduplication of instances in BagNodes
as well. That said, it is possible that in some cases the vanilla version will still be faster