qml Functionality for parallel `qml.expval(H)` execution needed

The VQE with parallel QPUs on Rigetti Forest demo uses the deprecated qml.ExpvalCost, which internally parallelizes executions with dask. Nothing comparable is currently available with qml.expval(H) and, more importantly, it cannot be done manually with user-facing functions and dask.

@antalszava and I concluded that there is no point in rewriting the ExpvalCost logic with non-user-facing functions now; instead, we should find a way to support parallelization with qml.expval(H) in the near future.

For now, we leave the demo as is, with a warning about the deprecation; see https://github.com/PennyLaneAI/qml/pull/506

Jun 14 '22 19:06 Qottmann

@Qottmann I imagine the closest approach would be to do something like:

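import dask
import pennylane as qml

# Assumed device (not given in the original snippet); two wires cover PauliZ(0) and PauliX(1).
dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)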
def circuit(x, h):
    qml.RX(x, wires=0)
    qml.RY(x * 2, wires=0)
    return qml.expval(h)

H = qml.PauliZ(0) + qml.PauliX(1)

results = [dask.delayed(circuit)(0.2, h) for h in H.ops]
results = H.coeffs @ dask.compute(*results, scheduler="threads")

Would this be sufficient in the tutorial? It avoids ExpvalCost, while making the Dask usage explicit.

Jun 15 '22 02:06 josh146

In principle this would work, but do you see a way to incorporate measurement optimization as well? Even in this example it executes 2 expvals where only 1 is necessary. Many Hamiltonians contain large groups of commuting Pauli terms, so executing every individual expval in parallel is most likely slower (and wastes hardware resources) than the default execution:

import numpy as np

Zs = [qml.PauliZ(i) for i in range(10)]
Xs = [qml.PauliX(i) for i in range(10)]
H = qml.Hamiltonian(coeffs=np.arange(20), observables=Zs + Xs, grouping_type="qwc")

results = [dask.delayed(circuit)(0.2, h) for h in H.ops]

>>> %timeit result = H.coeffs @ dask.compute(*results, scheduler="threads")
73.9 ms ± 5.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit circuit(0.2, H)
11.2 ms ± 289 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

As far as I understand it, this tutorial was intended for parallel hardware execution, so I would make sure to incorporate measurement optimization.
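
To make the grouping argument concrete, here is a minimal, library-free sketch of greedy qubit-wise-commuting (QWC) grouping. It is not PennyLane's implementation; the dict representation of a Pauli word and the helper names `qubit_wise_commute` and `group_qwc` are assumptions made for illustration only. The point is that the number of circuit executions worth parallelizing is the number of groups, not the number of terms.

```python
# Hypothetical representation: a Pauli word is a dict {wire: "X" | "Y" | "Z"}.

def qubit_wise_commute(p, q):
    """Return True if the two words apply the same Pauli on every shared wire."""
    return all(q[wire] == op for wire, op in p.items() if wire in q)


def group_qwc(words):
    """Greedily place each word into the first group it is compatible with."""
    groups = []
    for word in words:
        for group in groups:
            if all(qubit_wise_commute(word, other) for other in group):
                group.append(word)
                break
        else:
            # No existing group can hold this word, so start a new one.
            groups.append([word])
    return groups


# Z(0) + X(1): the terms act on different wires, so one measurement setting suffices.
small_groups = group_qwc([{0: "Z"}, {1: "X"}])

# Ten Z terms and ten X terms on wires 0..9.
zs = [{i: "Z"} for i in range(10)]
xs = [{i: "X"} for i in range(10)]
large_groups = group_qwc(zs + xs)

print(len(small_groups), len(large_groups))
```

Under these assumptions, a parallel version would submit one task per group rather than one per term, so the 20-term example would need 2 tasks instead of 20.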

Jun 15 '22 13:06 Qottmann

Small note: the above example also does not distribute the computation across different devices, but I think this can be done with the following modification:

devs = [qml.device("default.qubit", wires=3) for _ in range(2)]

def circuit(x, h):
    qml.RX(x, wires=0)
    qml.RY(x * 2, wires=0)
    return qml.expval(h)

H = qml.PauliZ(0) + qml.PauliX(1)

results = [dask.delayed(qml.QNode(circuit, dev))(0.2, h) for h, dev in zip(H.ops, devs)]
results = H.coeffs @ dask.compute(*results, scheduler="threads")
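
One caveat with `zip(H.ops, devs)`: `zip` stops at the shorter input, so a Hamiltonian with more terms than devices would silently lose terms. Below is a minimal, library-free sketch of a round-robin assignment that avoids this. The function `run_on_device`, the term labels, and the device names are hypothetical stand-ins for building `qml.QNode(circuit, dev)` and calling it; the sketch also does not prevent two tasks from using the same device at once, which real hardware would likely require.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def run_on_device(device_name, term):
    """Hypothetical stand-in for executing one term on one device."""
    return (device_name, term)


terms = ["Z0", "X1", "Z2", "X3", "Z4"]  # placeholder labels for H.ops
devices = ["dev_a", "dev_b"]            # placeholder labels for devs

# Plain zip truncates to the shorter list: only two of the five terms survive.
truncated = list(zip(terms, devices))

# cycle() repeats the device list, so every term is assigned to some device.
with ThreadPoolExecutor(max_workers=len(devices)) as pool:
    futures = [
        pool.submit(run_on_device, device, term)
        for term, device in zip(terms, cycle(devices))
    ]
    assignments = [future.result() for future in futures]

print(len(truncated), len(assignments))
```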

Jun 15 '22 17:06 Qottmann

Oops! Yes, perfect @Qottmann :)

Jun 15 '22 18:06 josh146

I created a PR for this here https://github.com/PennyLaneAI/qml/pull/510

Jun 15 '22 19:06 Qottmann
