qiskit-addon-cutting Add support for generating subexperiments with LO's translated to a native gate set

Fixes https://github.com/Qiskit-Extensions/circuit-knitting-toolbox/issues/492

This PR introduces an EagleEquivalenceLibrary and HeronEquivalenceLibrary for optionally translating all the gates we use in cutting decompositions to the corresponding native gate set during creation of subexperiments. This allows users to transpile their "Cut Circuit" one time, and when they generate_cutting_experiments all of their subexperiments will already be translated to the native gate set for the processor type they specified. If they don't specify, the "standard" gate set will be used. Heron single qubit native gates are the same as Eagle, so we get that equivalence library for free.

TODO:

[X] Decide how to allow users to specify a native gate set. Probably a string from {"standard", "eagle", "heron"}, which defaults to "standard".
[X] Clean up logic in qpd around parsing the target device and getting the right equivalence library
[X] Tests for the library and tests for the new option in generate_cutting_experiments
[X] Release note

Mar 28 '24 22:03 caleb-johnson

Pull Request Test Coverage Report for Build 9084999560

Details

73 of 73 (100.0%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.09%) to 95.582%

Totals
Change from base Build 9084779766:	0.09%
Covered Lines:	3570
Relevant Lines:	3735

💛 - Coveralls

Mar 28 '24 22:03 coveralls

I was seeing some failures when I tried to use this in the roundtrip tests. For every roundtrip test, I would run the standard gate set and also the eagle gate set. 4 tests were failing. I suppose those tests should've been passing if all of our decompositions are logically equivalent.

I'd like to integrate the different equivalences into the roundtrip tests elegantly somehow. Would love to hear any suggestions. I also need to figure out why the tests are failing before merging this

Mar 29 '24 00:03 caleb-johnson

I was seeing some failures when I tried to use this in the roundtrip tests. For every roundtrip test, I would run the standard gate set and also the eagle gate set. 4 tests were failing. I suppose those tests should've been passing if all of our decompositions are logically equivalent.

I'd like to integrate the different equivalences into the roundtrip tests elegantly somehow. Would love to hear any suggestions. I also need to figure out why the tests are failing before merging this

All the roundtrip tests pass with Eagle translated LO's except those including crx or csx gates. csx is just a specific instance of crx decomposition, so the bug can be traced back to a problem with translating the decompositions from crx gates.

Mar 29 '24 00:03 caleb-johnson

@garrison @Ibrahim-Shehzad this is ready for final review now.

Mar 30 '24 17:03 caleb-johnson

All the roundtrip tests pass with Eagle translated LO's except those including crx or csx gates. csx is just a specific instance of crx decomposition, so the bug can be traced back to a problem with translating the decompositions from crx gates.

I had left off an SXGate from the translation, so the RXGate translation was simply incorrect. This is fixed now.

Mar 30 '24 17:03 caleb-johnson

For QuantumCircuit <-> DAGCircuit, just as a quick check: are you using copy_operations=False as appropriate when you are sure about your ownership (or not) of the inputs? You can almost certainly do dag_to_circuit(dag, copy_operations=False) on output, and if you have assumed ownership of the input, you can do it in circuit_to_dag too, which avoids all copies.

edit: Qiskit's PassManager.run copies on input but not on output, that is it assumes that it's borrowing the input data, but after that, any mutations are local, so at the exit point of run, it knows it owns its data so there's no need to copy out. I don't know what (if anything) BasePass.__call__ does on copy - I suspect nothing, since 95% of the time we forget that you can call an individual pass on a QuantumCircuit directly (which we probably ought to look at).

Apr 01 '24 16:04 jakelishman

and from the API perspective, the user has already chosen a backend once

I'm not sure what you mean here. They've potentially transpiled their cut circuit using their backend in their local environment, but they haven't given a backend to CKT at this point in the workflow. They're using this new kwarg to communicate to CKT how their cut circuit has been transpiled so we can prepare their gate decompositions correctly

Apr 01 '24 17:04 caleb-johnson

Instead of {"heron", "eagle"}, we could take the Backend directly and key on its sorted(backend.basis_gates) to find the right equivalence library.

EDIT: This is kind of a bad idea. Using these two strings makes everything very straightforward. The info we need is being passed explicitly from the user. Passing in a backend means we have to account for all the different backend classes and make sure we are extracting their basis gate sets correctly and making sure we have accounted for all the permutations of basis gates across Runtime, qiskit provider, and Qiskit generic backends.

Leaving this the way it is

Apr 01 '24 18:04 caleb-johnson

@garrison I know we run some transpilation during creation of the subexperiments (e.g. RemoveFinalResets), so we already do the copies for each subexperiment. If you think it is more appropriate to handle the sampled gates at that time, that would be equally effective.

Totally open to adding these gates during transpilation or the way I have implemented here. I don't think users should have to do any extra copies for translation that can reasonably be avoided though, due to the volume of subexperiments.

Apr 03 '24 18:04 caleb-johnson

@garrison I know we run some transpilation during creation of the subexperiments (e.g. RemoveFinalResets), so we already do the copies for each subexperiment. If you think it is more appropriate to handle the sampled gates at that time, that would be equally effective.

Obviously Caleb knows this already, but for anyone else watching this thread: we actually ended up removing these passes in #556 in order to improve performance, replacing them with functions that operate directly on the QuantumCircuit object instead of the DAGCircuit. The modifications made by these passes are so incredibly minor that the DAGCircuit representation isn't necessary, and removing the round-trip speeds up experiment generation significantly.

May 07 '24 20:05 garrison

I am essentially on board with this.

Indeed, I have always desired for the bulk of transpilation to occur before generating the subexperiments, and my modification and test in #303 were meant to the be first step toward supporting that kind of workflow. (My recent PR, #573, creates a how-to guide for demonstrating this.)

However, obviously there are some steps which must be run after the subexperiments are generated. Previously, I've imagined that this will involve a pass manager that has a reduced set of passes, including some final basis translation together with some simple optimization passes (e.g., if two Hadamard gates appear on a qubit consecutively, they can be removed because $H^2 = I$). So, from my prior perspective, we should just bite the bullet with the DAGCircuit conversion, because translation might not be the only thing we need to do on the subexperiments.

I now think that there is a time where such passes on a DAGCircuit are appropriate and a time where they are not appropriate. The optimization passes I am imagining will only make minor changes to the subexperiments, and this might indeed make sense in a workflow when one is targeting the absolute best circuiits (analogous to setting optimization_level=3).

However, there are plenty of occasions where one doesn't want to wait to optimize the circuits to the best of the transpiler's ability, and the current PR is valuable for this reason. After all, we should aim to have the simple case be quick.

Note there are some limitations to this PR's approach, times at which conversion to DAGCircuit and back is probably going to be necessary, even if one does not want to optimize at the highest level. Ones I can think of right now:

if ancilla qubits are added during the QPD decomposition
if we support QPDs where some of the subexperiments can have gates with more than one qubit

Right now, we don't support either case, but maybe we will have to this about this in the future, e.g. if we want to provide direct support for cutting Toffoli gates (https://github.com/Qiskit-Extensions/circuit-knitting-toolbox/issues/258#issuecomment-2061648344).

Instead of {"heron", "eagle"}, we could take the Backend directly and key on its sorted(backend.basis_gates) to find the right equivalence library.

EDIT: This is kind of a bad idea. Using these two strings makes everything very straightforward. The info we need is being passed explicitly from the user. Passing in a backend means we have to account for all the different backend classes and make sure we are extracting their basis gate sets correctly and making sure we have accounted for all the permutations of basis gates across Runtime, qiskit provider, and Qiskit generic backends.

Actually, IMO taking the Backend directly would actually be preferable. Needing to pass the backend type as a string violates the don't-repeat-yourself philosophy and forces the user to know that ibm_sherbrooke is an "eagle" backend and ibm_torino is a "heron" backend, among others. And if they switch backends, they will now have two places to update in the code. (I could see this leading to very annoying errors when one forgets to modify both places, or they otherwise go out-of-sync.)

If the target instruction set of the backend matches something we recognize, then we can just use the equivalence library in this PR. If it does not match something we recognize, we can warn the user that the slow code path is being used, then allow Qiskit to do the basis translation. This way, the user will always get something that works, but will benefit from the improved performance on backends that we explicitly have written optimized equivalence libraries for.

And maybe eventually some of this work can influence Qiskit, such that we don't even have to have our own equivalences here and can just rely on the ones in Qiskit to be as good as ours. Some day.

May 09 '24 01:05 garrison

In a conversation with @mtreinish, he suggested to me that we should be able to use the standard equivalence library provided by Qiskit. If it is lacking some desired equivalences, then we can always copy the object and add our own equivalances to our library's copy, then use that as we see fit. Or, if there are equivalences that are not in Qiskit, they are likely to be welcome upstream too. Either way, this allows us to avoid using the BasisTranslator pass.

May 13 '24 20:05 garrison

In a conversation with @mtreinish, he suggested to me that we should be able to use the standard equivalence library provided by Qiskit. If it is lacking some desired equivalences, then we can always copy the object and add our own equivalances to our library's copy, then use that as we see fit. Or, if there are equivalences that are not in Qiskit, they are likely to be welcome upstream too. Either way, this allows us to avoid using the BasisTranslator pass.

How can we use the sel if the returned gates are in the standard gate set? That is the problem we are trying to correct. Our QPD gates are also in that gate set. This PR defines a new equivalence library -- one for each architecture we support. We don't need to build any optimization on top of this library because we just define one translation per gate we use in QPD, so we just have O(1) translation of our gates as they come out of the QPDBasis objects and into our subexperiments

May 13 '24 21:05 caleb-johnson

In a conversation with @mtreinish, he suggested to me that we should be able to use the standard equivalence library provided by Qiskit. If it is lacking some desired equivalences, then we can always copy the object and add our own equivalances to our library's copy, then use that as we see fit. Or, if there are equivalences that are not in Qiskit, they are likely to be welcome upstream too. Either way, this allows us to avoid using the BasisTranslator pass.

How can we use the sel if the returned gates are in the standard gate set? That is the problem we are trying to correct. Our QPD gates are also in that gate set. This PR defines a new equivalence library -- one for each architecture we support. We don't need to build any optimization on top of this library because we just define one translation per gate we use in QPD, so we just have O(1) translation of our gates as they come out of the QPDBasis objects and into our subexperiments

Sorry, after talking to @mtreinish, I had thought the Dijkstra's algorithm search was in EquivalenceLibrary, but this is wrong; it looks like it is only in BasisTranslator. The idea was that if the direct mapping we want is in the EquivalenceLibrary, then it will still take O(1) time because it will avoid the Dijkstra search. This part would still be possible if we were using BasisTranslator with an improved equivalence library, but that would require a DAGCircuit conversion and back.

May 14 '24 13:05 garrison

If you're trying to avoid using the basis translator and the dag then creating an equivalence library is not needed. You should just use a python dictionary. Under the covers the advantage the EquivalenceLibrary class provides you is that it builds a rustworkx graph of all the equivalence relationships in the library which makes things like doing a Dijkstra search to find a translation between arbitrary gates easy. If you're just looking for a static lookup map of 1:1 mappings, then the extra machinery in the EquivalenceLibrary class is unnecessary overhead. You can just do something like:

eagle_y_circ = QuantumCircuit(1)
eagle_y.rz(math.pi, 0)
eagle_y.x(0)
eagle_translation_dict = {
    "y": eagle_y
}

which will be faster and simpler. Also the other thing you can do is use the synthesis library to generate the definitions if you're not confident in the decomposition being good. Something like:

from qiskit.synthesis import OneQubitEulerDecomposer,  TwoQubitBasisDecomposer
from qiskit.circuit.library import ECRGate, CZGate, YGate

one_qubit_decomp = OneQubitEulerDecomposer("ZSXX")
eagle_decomp_two = TwoQubitBasisDecomposer(ECRGate(), euler_basis="ZSXX")
heron_decomp_two = TwoQubitBasisDecomposer(CZGate(), euler_basis="ZSXX")

eagle_translation_dict = {
    "y": one_qubit_decomp(YGate()),
    "cz": eagle_decomp_two(CZGate()),
}
heron_translation_dict = {
    "y": eagle_translation_dict["y"],
   "ecr": heron_decomp_two(ECRGate()),
}

This won't always give the optimal decomposition, but it simplifies it as much as the transpiler can by default. For example, SwapGate -> CZGate can done as:

def_swap_cz = QuantumCircuit(q, global_phase=-pi / 2)
def_swap_cz.sx(0)
def_swap_cz.sx(1)
def_swap_cz.cz(0, 1)
def_swap_cz.sx(0)
def_swap_cz.sx(1)
def_swap_cz.cz(0, 1)
def_swap_cz.sx(0)
def_swap_cz.sx(1)
def_swap_cz.cz(0, 1)

but the decomposition from TwoQubitBasisDecomposer has a few extra 1q gates over that.

May 18 '24 15:05 mtreinish

If you're trying to avoid using the basis translator and the dag then creating an equivalence library is not needed.

Thanks Matthew, the reason I didn't do that is because the parameters pass through nicely from the input gate to the translated representation. I'm sure that functionality can be recreated w a normal Python dict, but I'd have to think how.

EDIT: I think this is actually really easy. I just hadn't used python dicts like this before :)

May 20 '24 16:05 caleb-johnson

qiskit-addon-cutting qiskit-addon-cutting copied to clipboard

Add support for generating subexperiments with LO's translated to a native gate set

Pull Request Test Coverage Report for Build 9084999560

Details

💛 - Coveralls

qiskit-addon-cutting
qiskit-addon-cutting copied to clipboard