Functionality for parallel `qml.expval(H)` execution needed
The "VQE with parallel QPUs on Rigetti Forest" demo uses the deprecated qml.ExpvalCost, which internally parallelizes executions with Dask. Currently, nothing similar is available with qml.expval(H) and, more importantly, it cannot be done manually with user-facing functions and Dask.
@antalszava and I concluded that there is no point in rewriting the ExpvalCost logic with non-user-facing functions now; instead, we should find a way to support parallelization with qml.expval(H) in the near future.
For now, we leave the demo as is, with a warning about the deprecation; see https://github.com/PennyLaneAI/qml/pull/506
@Qottmann I imagine the closest approach would be to do something like:
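import dask
import pennylane as qml

# Device added so the snippet runs on its own; two wires cover H below.
dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(x, h):
    qml.RX(x, wires=0)
    qml.RY(x * 2, wires=0)
    return qml.expval(h)

H = qml.PauliZ(0) + qml.PauliX(1)
# One delayed task per term of H, then a coefficient-weighted sum.
results = [dask.delayed(circuit)(0.2, h) for h in H.ops]
results = H.coeffs @ dask.compute(*results, scheduler="threads")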
Would this be sufficient in the tutorial? It avoids ExpvalCost, while making the Dask usage explicit.
In principle this would work, but do you see a way to incorporate the measurement optimization as well? Even in this small example, 2 expvals are executed where only 1 is necessary. Many Hamiltonians contain large groups of commuting Pauli terms, so executing every individual expval in parallel is most likely slower (and wastes hardware resources) than the default execution:
import numpy as np

Zs = [qml.PauliZ(i) for i in range(10)]
Xs = [qml.PauliX(i) for i in range(10)]
H = qml.Hamiltonian(coeffs=np.arange(20), observables=Zs + Xs, grouping_type="qwc")
results = [dask.delayed(circuit)(0.2, h) for h in H.ops]
>>> %timeit result = H.coeffs @ dask.compute(*results, scheduler="threads")
73.9 ms ± 5.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit circuit(0.2, H)
11.2 ms ± 289 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
As far as I understand it, this tutorial was intended for parallel hardware execution, so I would make sure to incorporate measurement optimization.
Small note: the above example also does not distribute the computation across different devices, but I think this can be done with the following modification:
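To make the measurement-optimization point concrete, here is a minimal, library-independent sketch of qubit-wise commuting (QWC) grouping. It is not PennyLane's implementation: the `{wire: letter}` dict representation and the helper names `qubit_wise_commute` and `greedy_qwc_groups` are assumptions made up for illustration, and the greedy strategy is only one possible way to form groups.

```python
# Pauli words are modelled as {wire: letter} dicts. This representation is
# illustrative only and is not PennyLane's own data structure.

def qubit_wise_commute(a, b):
    """Return True if the two Pauli words agree on every wire they share."""
    return all(a[w] == b[w] for w in a.keys() & b.keys())

def greedy_qwc_groups(words):
    """Put each word into the first group whose members all QWC with it."""
    groups = []
    for word in words:
        for group in groups:
            if all(qubit_wise_commute(word, other) for other in group):
                group.append(word)
                break
        else:
            # No compatible group found: start a new one.
            groups.append([word])
    return groups

# H = Z(0) + X(1): the terms act on different wires.
small = [{0: "Z"}, {1: "X"}]

# Ten Z terms and ten X terms on wires 0..9.
zs = [{i: "Z"} for i in range(10)]
xs = [{i: "X"} for i in range(10)]

print(len(greedy_qwc_groups(small)))    # 1
print(len(greedy_qwc_groups(zs + xs)))  # 2
```

Under these assumptions, the two-term Hamiltonian needs a single measurement setting and the twenty-term one needs two, which is why dispatching one task per term does more work than necessary.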
devs = [qml.device("default.qubit", wires=3) for _ in range(2)]
def circuit(x, h):
    qml.RX(x, wires=0)
    qml.RY(x * 2, wires=0)
    return qml.expval(h)
H = qml.PauliZ(0) + qml.PauliX(1)
results = [dask.delayed(qml.QNode(circuit, dev))(0.2, h) for h, dev in zip(H.ops, devs)]
results = H.coeffs @ dask.compute(*results, scheduler="threads")
Oops! Yes, perfect @Qottmann :)
I created a PR for this here https://github.com/PennyLaneAI/qml/pull/510
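For reference, the two ideas raised in this thread (one task per group of commuting terms rather than per term, and spreading tasks over several devices) can be combined. The sketch below shows only the dispatch pattern, using the standard library instead of Dask so that it runs without a backend. `run_group` is a hypothetical stand-in for a QNode bound to one device, the round-robin assignment is an assumption, and the expectation values are made-up placeholders rather than simulation results.

```python
from concurrent.futures import ThreadPoolExecutor

def run_group(device_id, group):
    """Hypothetical stand-in for one circuit execution on one device.

    A real version would call a QNode bound to device `device_id` and return
    one expectation value per observable in the group. Here each term carries
    a placeholder value so the sketch runs on its own.
    """
    return [value for _, value in group]

def parallel_expval(groups, coeff_groups, n_devices):
    """Run one task per group, then form the coefficient-weighted sum."""
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        futures = [
            pool.submit(run_group, i % n_devices, group)  # round-robin
            for i, group in enumerate(groups)
        ]
        results = [f.result() for f in futures]
    return sum(
        c * v
        for coeffs, values in zip(coeff_groups, results)
        for c, v in zip(coeffs, values)
    )

# Two groups of (label, placeholder expectation value) pairs.
groups = [[("Z0", 1.0), ("Z1", 0.5)], [("X0", 0.0), ("X1", -0.5)]]
coeff_groups = [[2.0, 4.0], [3.0, 2.0]]

total = parallel_expval(groups, coeff_groups, n_devices=2)
print(total)  # 2*1.0 + 4*0.5 + 3*0.0 + 2*(-0.5) = 3.0
```

The number of tasks equals the number of groups, not the number of terms, so each device executes at most one circuit per measurement setting.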