ITensors.jl
`oneITensor` to rule them all
A common code pattern is:

```julia
A = ITensor(1)
A *= B
A *= C
```
which is useful when initializing a loop, for example. However, it has bothered me that it makes an unnecessary copy of `B`. We decided, I think correctly, that `ITensor(1) * B` should make a copy of `B`, since it could lead to very subtle bugs if `ITensor(1) * B` didn't copy but `ITensor(1.000001) * B` did copy (say you were doing floating point operations to compute a constant `a`, then did `ITensor(a) * B`, and one time `a` happened to be exactly equal to `1` but you were relying on the copying behavior).
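To make the pitfall concrete, here is a rough sketch (the index `i`, the tensor `B`, and the computation of `a` are just placeholders) of why value-dependent copying would be fragile:

```julia
using ITensors

i = Index(2, "i")
B = randomITensor(i)

# `a` comes from some runtime floating point computation; here it happens
# to land exactly on 1.0, but in general it might not.
a = sum(fill(0.25, 4))
C = ITensor(a) * B     # always a copy, no matter what value `a` ends up being
C[i => 1] = 0.0        # safe: B is never mutated through C
```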
My proposal for this would be to introduce yet another storage type, as well as a special number type called `One`. The proposal would be the following:
```julia
using ITensors
using ITensors.NDTensors

# Multiplicative identity
struct One <: Number end

# Scalar storage holding a single value in a Ref so it can be mutated in place
struct Scalar{ElT} <: TensorStorage{ElT}
  data::Base.RefValue{ElT}
end
Scalar(n::Number) = Scalar(Ref(n))

# An index-less ITensor whose storage is the exact multiplicative identity
oneITensor() = itensor(Scalar(One()), IndexSet())
```
The `Base.RefValue` storage allows the value to be modified in place, i.e. it would allow `T[] = 2.0`. That's not totally necessary, but I think it is helpful for generic code, since a lot of code in `NDTensors` assumes the tensors are mutable.
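For instance, continuing the sketch above, element access could go through the `Ref` (the method definitions below are purely illustrative, not the actual `NDTensors` interface):

```julia
# Illustrative scalar element access for the sketched Scalar storage
Base.getindex(s::Scalar) = s.data[]
Base.setindex!(s::Scalar, x::Number) = (s.data[] = x; s)

s = Scalar(2.0)
s[] = 3.0   # mutates the stored value in place
s[]         # 3.0
```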
Then, we can define contraction such that `oneITensor() * B == B` (without copying). `Scalar(One())` storage could be interpreted as a universal multiplicative identity. Note that if someone did `1.0 * oneITensor()`, it would create a tensor with storage `Scalar(1.0)`, which would then act like `ITensor(1.0)`, with the copying behavior under multiplication. This gives the nice property that `(1.0 * oneITensor()) * B` and `oneITensor() * (1.0 * B)` both return copies of `B`.
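Concretely, the intended semantics would look something like this (hypothetical behavior under the proposal, not existing functionality):

```julia
i = Index(2, "i")
B = randomITensor(i)

B1 = oneITensor() * B           # identity contraction: no copy of B's data
B2 = (1.0 * oneITensor()) * B   # storage is now Scalar(1.0), so this copies B
B3 = oneITensor() * (1.0 * B)   # 1.0 * B already copies, so B3 is a copy too
```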
Another note: the `Scalar` storage isn't strictly necessary here, and we could use a length-1 `Dense` storage with element type `One`, but I've been meaning to add a `Scalar` type anyway, since I think it will help clean up some code logic around scalar ITensor contractions.
A bonus from this is getting a behavior that I've wanted for `delta` tensors, which is that `A * delta(i, j)` would not copy `A` if the contraction is equivalent to an index replacement. This faced a similar issue to trying to make `ITensor(1) * A` non-copying, which is that `A * (1 * delta(i, j))` would not copy but `(A * 1) * delta(i, j)` would copy (and similarly, if `1` were replaced by a general floating point constant `a`, the behavior would subtly change if `a` happened to be equal to `1`).
With the new `One` type, a `delta` tensor could have storage with a value of `One` instead of `1`, which would be clearly documented. Then, if you did `1 * delta(i, j)`, it would return a tensor with storage of `1.0` and have the regular copying behavior, which would have the property that `A * (1 * delta(i, j))` and `(A * 1) * delta(i, j)` always return a copy of `A`.
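Putting the two `delta` cases together, the intended behavior would be roughly (again, a sketch of the proposed semantics rather than current behavior):

```julia
i = Index(2, "i")
j = Index(2, "j")
A = randomITensor(i)

B = A * delta(i, j)          # delta stores One(): pure index replacement, no copy of A
C = A * (1 * delta(i, j))    # storage becomes 1.0: C is a copy of A
D = (A * 1) * delta(i, j)    # also a copy, consistent with the line above
```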
This is definitely interesting. So to make sure I understand, is the key to the different behavior here mainly that these tensors would have a distinct storage type? I.e., that you can overload on the type `Scalar{One}` versus `Scalar{Float64}`, say, but not on the value inside of the `Scalar` storage, of course.
Sort of, but I think it is better to think about it in terms of the properties that we want to satisfy. I think ultimately the properties we want for any ITensor `A` are:

- `oneITensor() * A` returns `A` (without a copy).
- `1.0 * oneITensor()` is the same as `ITensor(1.0)` (more generally, `a * oneITensor() == ITensor(a)` for all `a isa Number`).
- `ITensor(1.0) * A` and `1.0 * A` return `copy(A)` (more generally, `ITensor(a)` acts the same as `a` for all `a isa Number`).

So from the properties above it follows that `(oneITensor() * 1.0) * A` and `oneITensor() * (1.0 * A)` both return `copy(A)`.
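Written out in code, the properties and the corollary read (hypothetical behavior under the proposal):

```julia
i = Index(2, "i")
A = randomITensor(i)
a = 2.5

oneITensor() * A                 # property 1: returns A itself, no copy
a * oneITensor() == ITensor(a)   # property 2: scaling the identity gives ITensor(a)
ITensor(a) * A                   # property 3: returns copy(A), same as a * A

(oneITensor() * 1.0) * A         # corollary: both of these return copy(A),
oneITensor() * (1.0 * A)         # equal to A in value but not aliasing it
```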
Contrast this with the case where we chose that `ITensor(1.0) * A` returned `A` without a copy. In that case, `(ITensor(1.0) * 1.0) * A` would return `A` without a copy and `ITensor(1.0) * (1.0 * A)` would return `copy(A)` (actually `copy(copy(A))`), which would be very confusing and break associativity of ITensor contraction. So ultimately it follows that it requires introducing a new version of `1`.
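For contrast, here is a sketch of the rejected alternative, with the inconsistency spelled out in the comments (the setup names are placeholders):

```julia
i = Index(2, "i")
A = randomITensor(i)

# Counterfactual: suppose ITensor(1.0) * A returned A without copying.
lhs = (ITensor(1.0) * 1.0) * A   # ITensor(1.0) * 1.0 is still a scalar ITensor
                                 # with value 1.0, so lhs would alias A (no copy)
rhs = ITensor(1.0) * (1.0 * A)   # 1.0 * A already copies, so rhs would be copy(A)
# Same values either way, but whether the result shares storage with A would
# depend on where the parentheses fall, breaking associativity of contraction.
```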
In principle we could do all of this without dispatch and without literally introducing a new type. For example, we could have some runtime value that flagged what kind of `1` you had in the storage, so `oneITensor()` could create `ITensor(1.0; is_special_one = true)` with a runtime flag that said it was a special kind of `1.0` that satisfied all of the properties above. So the essential idea is more about having two versions of the identity which have the behavior defined above.

It kind of distinguishes between being in floating point world vs. exact arithmetic world, where we choose different behaviors in these two worlds. In floating point world, we want to treat the multiplications `ITensor(1.0) * A` and `ITensor(1.0 + eps()) * A` as essentially the same (it would be weird if one copied and the other didn't, since you may accidentally get one or the other through floating point arithmetic), so we're introducing a different version of `1` that is more strictly defined (either through a type or a runtime flag). Having it as a different type instead of a runtime flag will allow for better code organization through dispatch, so that is the main reason to have it as a special type `One`, but that is more of an implementation detail.
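As a rough sketch of that dispatch point (the function name and signatures below are made up for illustration, not actual `NDTensors` internals):

```julia
# With One as a type, the identity case is selected by dispatch:
scalar_apply(A::ITensor, ::Scalar{One}) = A                  # pass through, no copy
scalar_apply(A::ITensor, s::Scalar{<:Number}) = s.data[] * A # generic scalar: copies

# With a runtime flag instead, every scalar contraction would carry a branch, e.g.:
# scalar_apply(A, s) = s.is_special_one ? A : s.value * A
```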