ITensors.jl icon indicating copy to clipboard operation
ITensors.jl copied to clipboard

[ITensors] [ENHANCEMENT] In-place addition of a product A .+= B .* C.

Open ArtemStrashko opened this issue 2 years ago • 10 comments

Is your feature request related to a problem? Please describe.

For improving performance, reducing memory allocations is one of a key techniques. This often requires pre-allocating memory and then doing updates in-place. ITensors already provides functionality like A .= B .* C, while A .+= B .* C, which would allow to add (rather than to write) the output of B * C to A in-place, is missing. This would be useful for performant ML applications, in particular for gradient accumulation.

Describe the solution you'd like

indices = [Index(2), Index(3)]
A = ITensor(indices)
B = randomITensor(indices[1])
C = randomITensor(indices[2])
A .+= B .* C # (similarly to A .= B .* C)

Describe alternatives you've considered

A simple alternative would be to introduce a buffer tensor, which would double the amount of pre-allocated memory though:

indices = [Index(2), Index(3)]
A = ITensor(indices)
A_buffer = ITensor(indices)
B = randomITensor(indices[1])
C = randomITensor(indices[2])
A_buffer .= B .* C
A += A_buffer

ArtemStrashko avatar Jul 20 '23 16:07 ArtemStrashko

I would have assumed this was already defined, but should be easy enough to add. Note that is just special syntax for ITensors.contract!, i.e. A .+= B .* C would just call ITensors.contract!(A, B, C, 1, 1).

mtfishman avatar Jul 20 '23 16:07 mtfishman

I see, thanks. By the way, when I try it with a more relevant for me example,

indices = [Index(i) for i in 2:4]
A = ITensor(indices)
B = randomITensor(indices[1])
C = randomITensor(indices[2:3])
ITensors.contract!(A, B, C, 1, 1)

returns

ERROR: contract!! not yet implemented for outer product tensor contraction with non-trivial α and β

ArtemStrashko avatar Jul 20 '23 16:07 ArtemStrashko

I guess that is a different issue particular to outer products. A workaround for that could be to add dummy (dimension-1) indices shared between tensors B and C.

mtfishman avatar Jul 20 '23 16:07 mtfishman

OK, I see, thank you, this should work, though it seems a bit inconvenient in practice.

ArtemStrashko avatar Jul 20 '23 16:07 ArtemStrashko

Yeah, ideally we would fix that limitation, of course.

mtfishman avatar Jul 20 '23 16:07 mtfishman

Presumably, I am doing something wrong, but I am still getting an error with this trick:

indices = [Index(2), Index(3), Index(4)]
extra_index = Index(1)
A = ITensor(indices)
B = randomITensor([indices[1], extra_index])
C = randomITensor([indices[2:3], extra_index])
ITensors.contract!(A, B, C, 1, 1)

even though inds(B*C) = inds(A) and A .= B .* C works. By the way, when I try A .+= B .* C I get a weird error saying that I am trying to add a tensor with three indices into a tensor with two indices.

ArtemStrashko avatar Jul 21 '23 14:07 ArtemStrashko

In the above, the data of tensor A is not allocated yet. Looks like it works if you initialize it with:

A = ITensor(0.0, indices)

Though the original way you did it should work, that looks like another bug that is particular to using unallocated tensors (ITensors with an EmptyStorage storage type).

However, it then make me wonder why can't just use:

A = ITensor(indices)
B = randomITensor([indices[1], extra_index])
C = randomITensor([indices[2:3], extra_index])
ITensors.contract!(A, B, C)

if A is zero anyway.

mtfishman avatar Jul 21 '23 15:07 mtfishman

I see, thank you. Well, in general A is non-zero. By the way, maybe I missed it, could you please point me to the documentation of ITensors.contract!(...) if available?

ArtemStrashko avatar Jul 21 '23 20:07 ArtemStrashko

In particular, what are alpha and beta in ITensors.contract!(A, B, C, alpha, beta)?

ArtemStrashko avatar Jul 21 '23 20:07 ArtemStrashko

It is the same convention as Julia's LinearAlgebra.mul! function: https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#Low-level-matrix-operations.

mtfishman avatar Jul 21 '23 20:07 mtfishman