qadence
qadence copied to clipboard
[Feature, Performance] Optimize HamEvo for commuting Hamiltonians
Currently HamEvo of some generator is always exponentiated fully, without checking for commutation relations. However, generators composed of several commuting parts can be exponentiated separately. We can make some of those checks automatic in the instantiation of HamEvo and then optimize the calculation in the backend. Below is an example script doing it manually to showcase the potential:
from qadence import X, Y, Z, N, kron, add, CNOT
from qadence import HamEvo, run, chain, random_state
from qadence import equivalent_state, hamiltonian_factory
import time
n_qubits = 10
# Consider a generator composed of commuting terms, e.g. a neutral atom interaction on a line register
op = N
gen_full = add(N(i) @ N(i+1) for i in range(n_qubits - 1))
# Let's split it in two parts, such that gen_ful = gen_part_1 + gen_part_2
gen_part_1 = add(op(i) @ op(i+1) for i in range(n_qubits // 2))
gen_part_2 = add(op(i) @ op(i+1) for i in range(n_qubits // 2, n_qubits-1))
# Because the two generators commute, they can be exponentiated separately as smaller generators
op_split = chain(HamEvo(gen_part_1, 1.0), HamEvo(gen_part_2, 1.0))
# Here we exponentiate the two together
op_full = HamEvo(gen_full, 1.0)
# Some random initial state
state_init = random_state(n_qubits)
start = time.time()
state_split = run(n_qubits, op_split, state = state_init)
mid = time.time()
state_full = run(n_qubits, op_full, state = state_init)
end = time.time()
# Check that indeed we get the same result
assert equivalent_state(state_split, state_full, atol = 1e-12)
# Check the time taken for both calculations
print("Split exponentiation: ", mid - start)
print("Full exponentiation: ", end - mid)
Split exponentiation: 0.010470867156982422
Full exponentiation: 0.6789381504058838
To implement this, there is already a block_is_commuting_hamiltonian
function in qadence.blocks.utils
that would be useful. It seem efficient, but should be reviewed. This function just returns True or False, so a first implementation could be based on that: essentially, if True
, every term in the AddBlock
is separated into its own matrix to be exponentiated.
A next level would be to look for optimizations even if the function returns False, where each non-commuting term would be aggregated into its own group. However I suspect this would not be straightforward.
Related to https://github.com/pasqal-io/qadence/issues/134 since this would reduce to calling block to tensor on each of the smaller commuting terms.
@vytautas-a like we discussed, I suspect a first implementation of this may only require dealing with the HamEvo
class in qadence.operations
, but I'm not sure about a clean way to do it. The digital_decomposition
method is relevant.
The HamEvo
instantiates a TimeEvolutionBlock
which then is handled by the convert_ops.py
in the pyqtorch
backend (also horqrux
backend), so all the code needed should be in these files.
HamEvo checks if the qubit hamiltonian is made of commuting terms and generates a gate representation of it from what I can see. So why does it take different execution times if the generator is specified as one chunk or separate commuting chunks when both have identical digital decomposition to be implemented in the end
HamEvo checks if the qubit hamiltonian is made of commuting terms and generates a gate representation of it from what I can see. So why does it take different execution times if the generator is specified as one chunk or separate commuting chunks when both have identical digital decomposition to be implemented in the end
Actually it currently only does that if you call HamEvo.digital_decomposition()
. Otherwise it builds the full matrix for the Hamiltonian and then exponentiates it, which is why it's faster in my example: there it is exponentiating two smaller matrices instead of a large one.
But the suggestion here is exactly to make the commutation check be done automatically and optimize based on that.
Cool, understood. I would like to add a few points with designing the algorithm that I thought about
- If the commuting operators act on the same set of qubits, it is good to keep them together, eg Z_1Z_2+X_1X_2
- If there is a generator term with support that encompasses the support of other terms, then they need to included with it eg Z1Z2Z3 + Z2 +Z3
- Start with the Z basis, if 2 generators have only a small overlap keep them separate like Z_1Z_2 separate from Z_2Z_3. Use other noncommuting generators like X_2 as a glue. for these separate groups to come together.
Found a paper about distributing a pauli decomposition into commuting set of collections, https://arxiv.org/pdf/1908.06942.pdf
Apparently there is a lot of work on this mostly in the direction of doing efficient observable estimation with minimal shots, but can be used here as well, i think
Nice @rajaiitp ! Seems relevant yes.
Also came accross this one: https://arxiv.org/pdf/1907.09040.pdf
Closing after opening in PyQTorch: https://github.com/pasqal-io/pyqtorch/issues/177