ITensors.jl icon indicating copy to clipboard operation
ITensors.jl copied to clipboard

`oneITensor` to rule them all

Open mtfishman opened this issue 3 years ago • 2 comments

A common code pattern is:

A = ITensor(1)
A *= B
A *= C

which is useful when initializing a loop, for example. However, it has bothered me that it makes an unnecessary copy of B. We decided, I think correctly, that ITensor(1) * B should make a copy of B, since it could lead to very subtle bugs if ITensor(1) * B didn't copy but ITensor(1.000001) * B did copy (say you were doing floating point operations to compute a constant a, and then did ITensor(a) * B, and then one time a happened to be equal to 1 but you were relying on the copying behavior).

My proposal for this would be to introduce yet another storage type as well as a special number type called One. The proposal would be the following:

using ITensors
using ITensors.NDTensors

# Multiplicative identity
struct One <: Number
end

struct Scalar{ElT} <: TensorStorage{ElT}
  data::Base.RefValue{ElT}
end
Scalar(n::Number) = Scalar(Ref(n))

oneITensor() = itensor(Scalar(One()), IndexSet())

The Base.RefValue storage allows the value to be modified in-place, i.e. would allow T[] = 2.0. That's not totally necessary but I think is helpful for generic code, since a lot of code in NDTensors assumes the tensors are mutable.

Then, we can define contraction such that (oneITensor() * B) = B (without copying). Scalar(One()) storage could be interpreted as a universal multiplicative identity. Note that if someone did 1.0 * oneITensor(), it would create a tensor with storage Scalar(1.0), which would then act as ITensor(1.0) with copying behavior with multiplication. This allows the nice property that (1.0 * oneITensor()) * B and oneITensor() * (1.0 * B) both return copies of B.

Another note that the Scalar storage isn't strictly necessary here, and we could use a length 1 Dense storage with element type One, but I've been meaning to add a Scalar type anyway since I think it will help clean up some code logic with scalar ITensor contractions.

A bonus from this is getting a behavior that I've wanted for delta tensors, which is that A * delta(i, j) would not copy A if the contraction is equivalent to an index replacement. This faced a similar issue to trying to make ITensor(1) * A non-copying, which is that A * (1 * delta(i, j)) would not copy but (A * 1) * delta(i, j) would copy (and similarly, if 1 was replaced by a general floating point constant a, the behavior would subtly change of a happened to be equal to 1).

With the new One type, a delta tensor could have storage with a value of One instead of 1, which would be clearly documented. Then, if you did 1 * delta(i, j), it would return a tensor with storage of 1.0 and have the regular copying behavior, which would have the property that A * (1 * delta(i, j)) and (A * 1) * delta(i, j) always return a copy of A.

mtfishman avatar Apr 05 '21 15:04 mtfishman

This is definitely interesting. So to make sure I understand, is the key to the different behavior here mainly that these tensors would have a distinct storage type? I.e. that you can overload on the type Scalar{One} versus Scalar{Float64} say but not on the value inside of the Scalar storage, of course.

emstoudenmire avatar Apr 06 '21 02:04 emstoudenmire

Sort of, but I think it is better to think about it in terms of the properties that we want to satisfy. I think ultimately the properties we want for any ITensor A are:

  1. oneITensor() * A returns A (without a copy).
  2. 1.0 * oneITensor() is the same as ITensor(1.0) (more generally, a * oneITensor() == ITensor(a) for all a isa Number).
  3. ITensor(1.0) * A and 1.0 * A returns copy(A) (more generally, ITensor(a) acts the same as a for all a isa Number).

So from the properties above it follows that (oneITensor() * 1.0) * A and oneITensor() * (1.0 * A) both return copy(A).

Contrast this with the case where we chose that ITensor(1.0) * A returned A without a copy. In that case, (ITensor(1.0) * 1.0) * A would return A without a copy and ITensor(1.0) * (1.0 * A) would return copy(A) (actually copy(copy(A))), which would be very confusing and break associativity of ITensor contraction. So ultimately it follows that it requires introducing a new version of 1.

In principle we could do all of this without dispatch and without literally introducing a new type. For example, we could have some runtime value that flagged what kind of 1 you had in the storage, so oneITensor() could create ITensor(1.0; is_special_one = true) with a runtime flag that said it was a special kind of 1.0 that satisfied all of the properties above. So the essential idea is more about having two versions of identity which have the behavior defined above. It kind of distinguishes between being in floating point world vs. exact arithmetic world, where we choose different behaviors in these two worlds. In floating point world, we want to treat the multiplication of ITensor(1.0) * A and ITensor(1.0 + eps()) * A as essentially the same (it would be weird if one copied and the other didn't, since you may accidentally get one or the other through floating point arithmetic), so we're introducing a different version of 1 that is more strictly defined (either through a type or a runtime flag). Having it as a different type instead of having it as a runtime flag will allow for better code organization through dispatch, so that is the main reason to have it as a special type One, but that is more an implementation detail.

mtfishman avatar Apr 06 '21 14:04 mtfishman