ITensorTDVP.jl icon indicating copy to clipboard operation
ITensorTDVP.jl copied to clipboard

Basis extension method

Open emstoudenmire opened this issue 3 years ago • 4 comments

Adds the Yang and White basis extension method [arxiv:2005.06104]. Currently only uses Krylov vectors instead of vectors defined by (1-tau*H)|psi> which is one minor difference from their paper.

Also includes a test code that (intentionally) isn't yet included in the test suite.

Here are some possible improvements:

  • allow a maxdim argument to be passed to extend and through basis_extend
  • current behavior is letting bond dimension get too big when used in imaginary time evolution
  • come up with better names:

    should basis_extend be called krylov_extend? should extend be called basis_extend?

  • Use (1-tau*H)|psi> to generate "Krylov" vectors instead of H|psi>. Needed?

emstoudenmire avatar May 03 '22 21:05 emstoudenmire

@emstoudenmire https://github.com/mtfishman/ITensorNetworks.jl/pull/43 is now merged so this can be moved to that repository.

mtfishman avatar Jan 09 '23 21:01 mtfishman

@emstoudenmire seems like people are itching to use this, but that it became outdated since it is relying on an internal ITensors.jl function (ITensors.directsum_itensors) that was removed in https://github.com/ITensor/ITensors.jl/pull/1185.

Should we try to update it and merge it, since it seems critical for allowing some people to use TDVP?

mtfishman avatar Oct 27 '23 12:10 mtfishman

Something I see that is missing is expanding by a sum of Hamiltonians, I guess that would be easy enough to add on later.

mtfishman avatar Oct 27 '23 12:10 mtfishman

@emstoudenmire is there a way to organize the extend method such that it first creates an MPS psi_extend that stores the expansion on each site, and then uses +(psi, psi_extend; alg="directsum") to extend the basis of the original MPS?

mtfishman avatar Oct 31 '23 15:10 mtfishman

@emstoudenmire I've done a pretty extensive rewrite, though it is mostly aesthetic changes and the basic algorithmic structure remains the same. I've also proposed a new interface, let me know what you think.

I may go ahead and refactor by first making an MPS containing the expansion basis (expansion_state) and then calling +(state, expansion_state; alg="directsum") as a final step and see how that looks. Beyond that if we like this new interface I think this is good to go.

mtfishman avatar May 15 '24 22:05 mtfishman

I think it looks really good. Definitely the names and the interface are much improved (and of course things like keyword handling are better).

Yes, so the main other things left for this PR in my mind were:

  • try implementing it using sum or + as you had mentioned previously
  • possible explore some defaults to see what really helps with practical calculations (like what default krylovdim). I did do that a little bit but I think It was starting to explore more systematic tests, then it got away from me.

Probably we can be more practical and just give it to users to try and learn from them whether it's working for their projects, or whether they need to e.g. use a very different krylov dim or other parameters. I think it ought to work in most cases as is.

emstoudenmire avatar May 15 '24 22:05 emstoudenmire

@emstoudenmire in the latest I changed expand_basis(...) to expand(...). I think it is more "punchy", hopefully it is not too terse but I think it should be clear what it means and it has a nice symmetry with the function truncate(...) for truncating the bond dimension of an MPS. My main concern would be finding another better meaning for a function called expand(...) or just that users would confuse it with another operation (maybe expand(...) could mean expanding the site basis, or expanding the number of sites of the MPS, ...).

I'm going back and forth about whether the MPO/MPS that are used to determine the expansion basis should be arguments or keyword arguments but I think arguments make more sense since then they can be dispatched on.

I was also going back and forth about what the meaning of krylovdim should be. My first reaction when I read krylovdim is that it is the size of the Krylov space, so for an initial vector v and an operator A my guess would be that krylovdim=3 would make a space spanning {v, Av, A²v}, i.e. involve 2 applications of A. However, the previous code interpreted krylovdim as the size of the space beyond v itself, so krylovdim=3 builds the space {v, Av, A²v, A³v}. That also appears to be the design in [KrylovKit.jl(https://jutho.github.io/KrylovKit.jl/stable):

using KrylovKit
using LinearAlgebra
n = 10
A = hermitianpart(randn(n, n))
krylovdim = 3
vals, vecs, info = eigsolve(A, 1; krylovdim, maxiter=1)
@show krylovdim, info.numops

outputs:

(krylovdim, info.numops) = (3, 3)

In the latest I went with the design that krylovdim=3 builds the space {v, Av, A²v, A³v}, curious about your thoughts on that. I still set a default of 2 which seems to be the recommendation from https://arxiv.org/abs/2005.06104 for getting a good balance between effectiveness of the expansion and performance.

mtfishman avatar May 16 '24 13:05 mtfishman

Yes, so the main other things left for this PR in my mind were:

* try implementing it using sum or `+` as you had mentioned previously

I tried that but gave up, I think it is trickier than I imagined because of how the basis of the state get modified during the algorithm (which I think you had found as well). We can investigate it more in future work. I was thinking it could be a good forward-looking design for implementing global basis expansion for trees and general tensor networks in ITensorNetworks.jl so we can investigate it there.

* possible explore some defaults to see what really helps with practical calculations (like what default krylovdim). I did do that a little bit but I think It was starting to explore more systematic tests, then it got away from me.

I kept the default of krylovdim=2 for now and kept a default of cutoff of about 1e-8 for Float64 inputs for now. I added some docstrings which include warnings that the keyword arguments and defaults are experimental and subject to change.

Probably we can be more practical and just give it to users to try and learn from them whether it's working for their projects, or whether they need to e.g. use a very different krylov dim or other parameters. I think it ought to work in most cases as is.

Agreed.

mtfishman avatar May 16 '24 13:05 mtfishman

Support for this will also be added to ITensorMPS.jl with this PR: https://github.com/ITensor/ITensorMPS.jl/pull/17.

mtfishman avatar May 16 '24 13:05 mtfishman

Great. I think I like just plain "expand" also for the reason you said. Hopefully it is clear from doc strings and arguments what exactly it does in case of any confusion. Like if later there is an expand which increases the number of sites it would take different arguments like an index array.

Makes sense about delaying the sum or + design until we can figure out the right algorithm to use there. And about giving it to users now.

For the krylovdim I don't have too strong an opinion. Maybe the best thing is to think about what krylovdim of 0 or 1 means. So in your convention it means the basis would not be expanded. In Juthos it would mean one non-trivial Krylov vector Hv gets generated. So in his convention 0 is the case where nothing happens. That doesn't say which one is better but I thought it might clarify the decision.

emstoudenmire avatar May 16 '24 13:05 emstoudenmire

For the krylovdim I don't have too strong an opinion. Maybe the best thing is to think about what krylovdim of 0 or 1 means. So in your convention it means the basis would not be expanded. In Juthos it would mean one non-trivial Krylov vector Hv gets generated. So in his convention 0 is the case where nothing happens. That doesn't say which one is better but I thought it might clarify the decision.

My feeling is that the convention where krylovdim=1 performs an expansion of the Krylov space by Av is a bit of a misnomer, since in my mind a Krylov subspace of size/dimension 1 is the original vector v, so it should really be called krylov_expansion_dim or something like that (both here and in KrylovKit.jl) which would indicate that it is the dimension the Krylov space will be expanded by beyond the trivial space composed of v. But anyway, I think it is better to follow the convention of KrylovKit.jl as a way to break the tie.

mtfishman avatar May 16 '24 13:05 mtfishman

Great. I think I like just plain "expand" also for the reason you said. Hopefully it is clear from doc strings and arguments what exactly it does in case of any confusion. Like if later there is an expand which increases the number of sites it would take different arguments like an index array.

I think expand(::MPS, ...) (or in ITensorNetworks.jl, expand(::ITensorNetwork, ...)) should have a single conceptual meaning and not be overloaded for other meanings like that in the future, so I would say this decision is voiding using it for other kinds of operations that don't involve expanding the MPS bond dimensions. The reason is both simplicity for users and also not relying on subtle dispatch behavior, i.e. we should leave open the possibility of allowing the current expand(::MPS, ...) function to accept very generic (even fully unspecified) arguments, say if we want to handle more generic inputs of basis expansion references, and I can imagine cases where that could lead to ambiguity both in the code and for users if we try to overload the terminology expand(...) too much for conceptually different operations.

But anyway, I can't think of an alternative meaning for expand(::MPS, ...) that wouldn't have a reasonable or better alternative name, in the example of expanding the number of sites that should probably be called something else, either expand_sites(::MPS, ...) or better yet not use the terminology expand* at all and use terminology like cat in the case of MPS (for concatenate) or union in the case of ITensorNetworks (for graph union). If you are expanding the number of sites of an MPS you should be specifying the state of the new sites anyway, otherwise it is an underdefined operation, so really that is the concept of unioning, appending, or concatenating two tensor networks, not expansion.

Additionally, expanding the site dimensions of an MPS may be a reasonable operation but that is much more specialized so should probably be called something like expand_sites(::MPS, ...). The fact that we haven't hit that ambiguity with truncate(::MPS, ...) (i.e. it implicitly truncates the link indices and not the sites) makes me think that won't lead to ambiguities in the case of expand(...).

mtfishman avatar May 16 '24 13:05 mtfishman